The new word2vec requires total_examples to be specified in the train command, now it

error with the training command about word2vec-sentiments HOT 16 OPEN

linanqiu commented on June 30, 2024

error with the training command

from word2vec-sentiments.

Comments (16)

eriksonJAguiar commented on June 30, 2024 4

Hello, I'm using the following command:

model.train(sentences, total_examples=model.corpus_count, epochs=model.iter)

and it worked! Passing sentences.sentences_perm() did not work.

Thanks for help


dalamar66 commented on June 30, 2024 2

This one worked for me:

model.train(sentences.sentences_perm(), total_examples=model.corpus_count, epochs=model.iter)


linanqiu commented on June 30, 2024

try corpus_count instead of corpus_count()

I'll update this during the weekend


linanqiu commented on June 30, 2024

Let me know if it works!


RenatoPerotti commented on June 30, 2024

First I changed the line to:
model.train(sentences.sentences_perm, total_examples=model.corpus_count)
After that it asked me to add epochs, so I changed it to:
model.train(sentences.sentences_perm, total_examples=model.corpus_count, epochs=model.iter)
but now I get a more complex error message:

Exception in thread Thread-13:
Traceback (most recent call last):
  File "E:\Python3\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "E:\Python3\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "E:\Python3\lib\site-packages\gensim\models\word2vec.py", line 854, in job_producer
    for sent_idx, sentence in enumerate(sentences):
  File "E:\Python3\lib\site-packages\gensim\utils.py", line 687, in __iter__
    for document in self.corpus:
TypeError: 'method' object is not iterable
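The traceback above is what Python raises when a bound method is passed where an iterable is expected. The sketch below reproduces that without gensim. `ShuffledCorpus` is a hypothetical stand-in for the tutorial's sentence container, not its actual class; only the method name `sentences_perm` comes from this thread.

```python
import random


class ShuffledCorpus:
    """Hypothetical stand-in for the tutorial's sentence container."""

    def __init__(self, sentences):
        self.sentences = list(sentences)

    def sentences_perm(self):
        # Return a shuffled copy so every call gives a fresh ordering.
        shuffled = list(self.sentences)
        random.shuffle(shuffled)
        return shuffled


corpus = ShuffledCorpus([["good", "movie"], ["bad", "plot"], ["great", "cast"]])

# Without parentheses, sentences_perm is a bound method, not a list.
method_error = None
try:
    for sentence in corpus.sentences_perm:
        pass
except TypeError as err:
    method_error = str(err)

# With parentheses, the method runs and returns an iterable list.
shuffled_sentences = corpus.sentences_perm()
print(method_error)
print(len(shuffled_sentences))
```

So the first argument to train probably needs to be sentences.sentences_perm() with parentheses, as in the earlier comments that reported success.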


linanqiu commented on June 30, 2024

Try model.iter()


RenatoPerotti commented on June 30, 2024

Nope, it says it is not callable again (like the first issue we had with corpus_count):

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in ()
----> 1 model.train(sentences.sentences_perm, total_examples=model.corpus_count, epochs=model.iter())

TypeError: 'int' object is not callable

I hope you have another idea :)
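For anyone landing here with the same message: "'int' object is not callable" appears whenever parentheses are put after a plain integer attribute. The sketch below shows that with a toy object. `ToyModel` is hypothetical and only mimics the fact that, judging by the errors in this thread, corpus_count and iter hold integers.

```python
class ToyModel:
    """Hypothetical stand-in: the two attributes are plain integers."""

    def __init__(self):
        self.corpus_count = 3
        self.iter = 5


model = ToyModel()

# Calling an integer attribute fails in the same way as model.iter() above.
call_error = None
try:
    model.iter()
except TypeError as err:
    call_error = str(err)

# Reading the attribute without parentheses gives the value.
epochs = model.iter
total_examples = model.corpus_count
print(call_error)
print(epochs, total_examples)
```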


linanqiu commented on June 30, 2024

Sorry I have no time to update this right now. @eriksonJAguiar can you let me know the versions of python and gensim you're using? If it's the latest, I'll just make the change you mentioned. Thanks!


eriksonJAguiar commented on June 30, 2024

Hi @linanqiu, I'm using Python 3.5 and gensim 2.3.0 !!


Christings commented on June 30, 2024

Hello, I'm using the following command:
model.train(sentences, total_examples=model.corpus_count, epochs=model.iter) and it worked ! Thanks.


shashankboosi commented on June 30, 2024

Hello there,

I tried the command but I am getting the error

raise ValueError("You must specify an explict epochs count. The usual value is epochs=model.epochs.")
ValueError: You must specify an explict epochs count. The usual value is epochs=model.epochs.

when I tried :

model_dm.train(perm_sentences, total_examples=model_dm.corpus_count,epochs=model_dm.epochs)

but I mentioned the epochs as mentioned, but I still got the error.

Python version : 3.5
Gensim version : 3.4

Can you tell me what the problem is ?

Regards,
Shashank Reddy Boosi.


gsbnair commented on June 30, 2024

Hi Shashank,
As a few comments above said, you need to make the following modification:
Change model.train(sentences.sentences_perm())
to
model.train(sentences.sentences_perm(), total_examples=model.corpus_count, epochs=model.iter)

It works, and I got almost the same result: 0.86464.
I am using Python 3.6 and Gensim 3.4.
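Since this thread uses both model.iter and model.epochs, a small helper that picks whichever attribute exists may save some trial and error. This is only a sketch: `training_kwargs` is a hypothetical helper name, and it assumes that a given gensim release exposes the epoch count under one of those two names, as the commands and error messages in this thread suggest.

```python
from types import SimpleNamespace


def training_kwargs(model):
    """Build the keyword arguments the train call asks for in this thread."""
    # Assumption: newer releases use `epochs`, older ones use `iter`.
    epochs = getattr(model, "epochs", None)
    if epochs is None:
        epochs = getattr(model, "iter", None)
    if epochs is None:
        raise ValueError("model has neither an 'epochs' nor an 'iter' value")
    return {"total_examples": model.corpus_count, "epochs": epochs}


# Toy objects standing in for models from an older and a newer release.
older_style = SimpleNamespace(corpus_count=25000, iter=5)
newer_style = SimpleNamespace(corpus_count=25000, epochs=10)

print(training_kwargs(older_style))
print(training_kwargs(newer_style))
```

With a real model the call would then look like model.train(sentences.sentences_perm(), **training_kwargs(model)).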


NileshBharti2 commented on June 30, 2024

from __future__ import absolute_import, division, print_function
import codecs
import glob
import logging
import multiprocessing
import os
import pprint
import re

import nltk
import gensim.models.word2vec as w2v
import sklearn.manifold
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')

import gensim
%pylab inline
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
nltk.download("punkt")
nltk.download("stopwords")
book_filenames = sorted(glob.glob(r"C:\Users\Nilesh\Desktop\Machine.txt"))
print("Found books:")
book_filenames

corpus_raw = u""
for book_filename in book_filenames:
    print("Reading '{0}'...".format(book_filename))
    with codecs.open(book_filename, "r", "utf-8") as book_file:
        corpus_raw += book_file.read()
    print("Corpus is now {0} characters long".format(len(corpus_raw)))
    print()
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
raw_sentences = tokenizer.tokenize(corpus_raw)

# convert into a list of words
# remove unnecessary characters, split into words, no hyphens
# list of words
def sentence_to_wordlist(raw):
    clean = re.sub("[^a-zA-Z]", " ", raw)
    words = clean.split()
    return words

# sentences where each word is tokenized
sentences = []
for raw_sentence in raw_sentences:
    if len(raw_sentence) > 0:
        sentences.append(sentence_to_wordlist(raw_sentence))

print(raw_sentences[5])
print(sentence_to_wordlist(raw_sentences[5]))
token_count = sum([len(sentence) for sentence in sentences])
print("The book corpus contains {0:,} tokens".format(token_count))

# ONCE we have vectors
# step 3 - build model
# 3 main tasks that vectors help with
# DISTANCE, SIMILARITY, RANKING

# Dimensionality of the resulting word vectors.
# more dimensions, more computationally expensive to train
# but also more accurate
# more dimensions = more generalized
num_features = 300

# Minimum word count threshold.
min_word_count = 3

# Number of threads to run in parallel.
# more workers, faster we train
num_workers = multiprocessing.cpu_count()

# Context window length.
context_size = 7

# Downsample setting for frequent words.
# 0 - 1e-5 is good for this
downsampling = 1e-3

# Seed for the RNG, to make the results reproducible.
# random number generator
# deterministic, good for debugging
seed = 1
word2vec = w2v.Word2Vec(
    sg=1,
    seed=seed,
    workers=num_workers,
    size=num_features,
    min_count=min_word_count,
    window=context_size,
    sample=downsampling
)
word2vec.build_vocab(sentences)

print("Word2Vec vocabulary length:", len(word2vec.wv.vocab))
word2vec.train(sentences)

ValueError Traceback (most recent call last)
in ()
----> 1 word2vec.train(sentences)

~\Anaconda3\lib\site-packages\gensim\models\word2vec.py in train(self, sentences, total_examples, total_words, epochs, start_alpha, end_alpha, word_count, queue_factor, report_delay, compute_loss, callbacks)
609 sentences, total_examples=total_examples, total_words=total_words,
610 epochs=epochs, start_alpha=start_alpha, end_alpha=end_alpha, word_count=word_count,
--> 611 queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
612
613 def score(self, sentences, total_sentences=int(1e6), chunksize=100, queue_factor=2, report_delay=1):

~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in train(self, sentences, total_examples, total_words, epochs, start_alpha, end_alpha, word_count, queue_factor, report_delay, compute_loss, callbacks)
567 sentences, total_examples=total_examples, total_words=total_words,
568 epochs=epochs, start_alpha=start_alpha, end_alpha=end_alpha, word_count=word_count,
--> 569 queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
570
571 def _get_job_params(self, cur_epoch):

~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in train(self, data_iterable, epochs, total_examples, total_words, queue_factor, report_delay, callbacks, **kwargs)
239 epochs=epochs,
240 total_examples=total_examples,
--> 241 total_words=total_words, **kwargs)
242
243 for callback in self.callbacks:

~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in _check_training_sanity(self, epochs, total_examples, total_words, **kwargs)
612 if total_words is None and total_examples is None:
613 raise ValueError(
--> 614 "You must specify either total_examples or total_words, for proper job parameters updation"
615 "and progress calculations. "
616 "The usual value is total_examples=model.corpus_count."

ValueError: You must specify either total_examples or total_words, for proper job parameters updationand progress calculations. The usual value is total_examples=model.corpus_count.
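The two ValueError messages in this thread both come from argument checks that run before training starts. The sketch below is a toy re-creation of those checks, not gensim's code. `check_training_args` and `try_call` are hypothetical names, and the order of the checks is an assumption. It only shows which arguments each message is asking for.

```python
def check_training_args(epochs=None, total_examples=None, total_words=None):
    """Toy version of the two checks quoted in this thread (not gensim code)."""
    if total_examples is None and total_words is None:
        raise ValueError("You must specify either total_examples or total_words.")
    if epochs is None:
        raise ValueError("You must specify an explicit epochs count.")
    return True


def try_call(**kwargs):
    # Return "ok" when the checks pass, otherwise the error text.
    try:
        check_training_args(**kwargs)
        return "ok"
    except ValueError as err:
        return str(err)


bare_call = try_call()                                   # like train(sentences)
without_epochs = try_call(total_examples=1000)           # epochs still missing
complete_call = try_call(total_examples=1000, epochs=5)  # both supplied

print(bare_call)
print(without_epochs)
print(complete_call)
```

For the script above, that would suggest a call along the lines of word2vec.train(sentences, total_examples=word2vec.corpus_count, epochs=word2vec.epochs), with epochs=word2vec.iter on releases that use the older attribute name.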


AnshikaAgrawal commented on June 30, 2024

> Hello there,
>
> I tried the command but I am getting the error
>
> raise ValueError("You must specify an explict epochs count. The usual value is epochs=model.epochs.")
> ValueError: You must specify an explict epochs count. The usual value is epochs=model.epochs.
>
> when I tried :
>
> model_dm.train(perm_sentences, total_examples=model_dm.corpus_count,epochs=model_dm.epochs)
>
> but I mentioned the epochs as mentioned, but I still got the error.
> Python version : 3.5
> Gensim version : 3.4
> Can you tell me what the problem is ?
>
> Regards,
> Shashank Reddy Boosi.

You can give explicit epochs value like 20, just as stated in the error. It worked for me!


gsbnair commented on June 30, 2024

I may be able to help you if you send me the code in full to check.


shashankboosi commented on June 30, 2024

> Hello there,
> I tried the command but I am getting the error
>
> raise ValueError("You must specify an explict epochs count. The usual value is epochs=model.epochs.")
> ValueError: You must specify an explict epochs count. The usual value is epochs=model.epochs.
>
> when I tried :
>
> model_dm.train(perm_sentences, total_examples=model_dm.corpus_count,epochs=model_dm.epochs)
>
> but I mentioned the epochs as mentioned, but I still got the error.
> Python version : 3.5
> Gensim version : 3.4
> Can you tell me what the problem is ?
> Regards,
> Shashank Reddy Boosi.
>
> You can give explicit epochs value like 20, just as stated in the error. It worked for me!

Yep. That is how it works I guess.
Thank you :)
