
Comments (16)

eriksonJAguiar commented on June 30, 2024

Hello guys, I'm using the following command and it worked:

model.train(sentences, total_examples=model.corpus_count, epochs=model.iter)

Passing sentences.sentences_perm() instead of sentences did not work for me.

Thanks for the help.

from word2vec-sentiments.

dalamar66 commented on June 30, 2024

This one worked for me:

model.train(sentences.sentences_perm(), total_examples=model.corpus_count, epochs=model.iter)


linanqiu commented on June 30, 2024

Try corpus_count instead of corpus_count().

I'll update this during the weekend.


linanqiu commented on June 30, 2024

Let me know if it works!


RenatoPerotti commented on June 30, 2024

First I changed the line to:
model.train(sentences.sentences_perm, total_examples=model.corpus_count)
After that it asked me to add epochs, so I changed it to:
model.train(sentences.sentences_perm, total_examples=model.corpus_count, epochs=model.iter)
But now I get a more complex error message:

Exception in thread Thread-13:
Traceback (most recent call last):
  File "E:\Python3\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "E:\Python3\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "E:\Python3\lib\site-packages\gensim\models\word2vec.py", line 854, in job_producer
    for sent_idx, sentence in enumerate(sentences):
  File "E:\Python3\lib\site-packages\gensim\utils.py", line 687, in __iter__
    for document in self.corpus:
TypeError: 'method' object is not iterable
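
Editor's note: the TypeError above is what Python raises whenever a bound method is iterated without first being called, which matches the missing parentheses after sentences_perm in the command. Below is a minimal sketch of that behavior. ToyCorpus is a hypothetical stand-in, not the repository's class; only the method name sentences_perm comes from the thread.

```python
class ToyCorpus:
    """Hypothetical stand-in for the corpus object used in the thread."""

    def __init__(self, sentences):
        self._sentences = list(sentences)

    def sentences_perm(self):
        # Return a copy of the sentences. The real method presumably
        # shuffles them; that detail is an assumption and is skipped here.
        return list(self._sentences)


corpus = ToyCorpus([["good", "movie"], ["bad", "movie"]])

# Without parentheses we hand over the method object itself,
# and a method object cannot be iterated.
try:
    iter(corpus.sentences_perm)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With parentheses we hand over the list that the method returns.
iterated = [sentence for sentence in corpus.sentences_perm()]

print(error_message)
print(iterated)
```

Under this reading, writing sentences.sentences_perm() with parentheses is what makes the argument iterable, as in dalamar66's command.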


linanqiu commented on June 30, 2024

RenatoPerotti commented on June 30, 2024

Nope, I get that it is not callable again (like the first issue we had with corpus_count):

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
in ()
----> 1 model.train(sentences.sentences_perm, total_examples=model.corpus_count, epochs=model.iter())

TypeError: 'int' object is not callable

I hope you have another idea :)
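
Editor's note: this second TypeError is the mirror image of the earlier one. The message says that model.iter holds a plain integer, so adding parentheses tries to call a number. A minimal sketch with a hypothetical stand-in class follows; the value 5 is an assumption chosen for illustration, and only the attribute name iter comes from the thread.

```python
class ToyModel:
    """Hypothetical stand-in for the model; not a gensim class."""

    # A plain integer attribute, not a method (the value is an assumption).
    iter = 5


model = ToyModel()

# Calling an integer attribute fails.
try:
    model.iter()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Reading the attribute without parentheses gives the number directly.
epochs = model.iter

print(error_message)
print(epochs)
```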


linanqiu commented on June 30, 2024

Sorry, I have no time to update this right now. @eriksonJAguiar, can you let me know which versions of Python and gensim you're using? If they're the latest, I'll just make the change you mentioned. Thanks!


eriksonJAguiar commented on June 30, 2024

Hi @linanqiu, I'm using Python 3.5 and gensim 2.3.0.


Christings commented on June 30, 2024

Hello, I'm using the following command and it worked:
model.train(sentences, total_examples=model.corpus_count, epochs=model.iter)
Thanks.


shashankboosi commented on June 30, 2024

Hello there,

I tried the command, but I am getting this error:

raise ValueError("You must specify an explict epochs count. The usual value is epochs=model.epochs.")
ValueError: You must specify an explict epochs count. The usual value is epochs=model.epochs.

This happened when I tried:

model_dm.train(perm_sentences, total_examples=model_dm.corpus_count, epochs=model_dm.epochs)

I passed epochs as the message suggests, but I still got the error.

Python version: 3.5
Gensim version: 3.4

Can you tell me what the problem is?

Regards,
Shashank Reddy Boosi.


gsbnair commented on June 30, 2024

Hi Shashank,
As a few comments above said, you need to make the following modification.
Change
model.train(sentences.sentences_perm())
to
model.train(sentences.sentences_perm(), total_examples=model.corpus_count, epochs=model.iter)

It works, and I got almost the same result: 0.86464.
I am using Python 3.6 and gensim 3.4.
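
Editor's note: the thread mixes two attribute names for the epoch count, model.iter (reported working on gensim 2.3.0 and 3.4) and model.epochs (the name in the 3.4 error message). Below is a minimal sketch of a helper that uses whichever one the model exposes. resolve_epochs and train_compat are hypothetical names introduced here, not gensim functions; the only gensim call assumed is model.train with the total_examples and epochs keywords shown in the thread.

```python
def resolve_epochs(model, default=None):
    """Return the model's epoch count, whichever attribute name it uses."""
    # Prefer `epochs`, then fall back to the older `iter` attribute.
    for name in ("epochs", "iter"):
        value = getattr(model, name, None)
        if isinstance(value, int) and not isinstance(value, bool):
            return value
    if default is None:
        raise ValueError(
            "model exposes neither `epochs` nor `iter`; "
            "pass an explicit epoch count"
        )
    return default


def train_compat(model, sentences, default_epochs=None):
    """Call model.train with the two arguments the thread says are required."""
    return model.train(
        sentences,
        total_examples=model.corpus_count,
        epochs=resolve_epochs(model, default_epochs),
    )
```

With the thread's objects this would be called as train_compat(model, sentences.sentences_perm()). Passing default_epochs=20 corresponds to supplying an explicit number by hand when the model exposes neither attribute.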


NileshBharti2 commented on June 30, 2024

from __future__ import absolute_import, division, print_function
import codecs
import glob
import logging
import multiprocessing
import os
import pprint
import re

import nltk
import gensim.models.word2vec as w2v
import sklearn.manifold
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')

import gensim
%pylab inline
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
nltk.download("punkt")
nltk.download("stopwords")
book_filenames = sorted(glob.glob(r"C:\Users\Nilesh\Desktop\Machine.txt"))
print("Found books:")
book_filenames

corpus_raw = u""
for book_filename in book_filenames:
    print("Reading '{0}'...".format(book_filename))
    with codecs.open(book_filename, "r", "utf-8") as book_file:
        corpus_raw += book_file.read()
    print("Corpus is now {0} characters long".format(len(corpus_raw)))
    print()
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
raw_sentences = tokenizer.tokenize(corpus_raw)

# convert into a list of words
# remove unnecessary characters, split into words, no hyphens
def sentence_to_wordlist(raw):
    clean = re.sub("[^a-zA-Z]", " ", raw)
    words = clean.split()
    return words

# sentences where each word is tokenized
sentences = []
for raw_sentence in raw_sentences:
    if len(raw_sentence) > 0:
        sentences.append(sentence_to_wordlist(raw_sentence))

print(raw_sentences[5])
print(sentence_to_wordlist(raw_sentences[5]))
token_count = sum([len(sentence) for sentence in sentences])
print("The book corpus contains {0:,} tokens".format(token_count))

# ONCE we have vectors
# step 3 - build model
# 3 main tasks that vectors help with:
# DISTANCE, SIMILARITY, RANKING

# Dimensionality of the resulting word vectors.
# more dimensions, more computationally expensive to train
# but also more accurate
# more dimensions = more generalized
num_features = 300

# Minimum word count threshold.
min_word_count = 3

# Number of threads to run in parallel.
# more workers, faster we train
num_workers = multiprocessing.cpu_count()

# Context window length.
context_size = 7

# Downsample setting for frequent words.
# 0 - 1e-5 is good for this
downsampling = 1e-3

# Seed for the RNG, to make the results reproducible.
# random number generator
# deterministic, good for debugging
seed = 1
word2vec = w2v.Word2Vec(
    sg=1,
    seed=seed,
    workers=num_workers,
    size=num_features,
    min_count=min_word_count,
    window=context_size,
    sample=downsampling
)
word2vec.build_vocab(sentences)

print("Word2Vec vocabulary length:", len(word2vec.wv.vocab))
word2vec.train(sentences)

ValueError Traceback (most recent call last)
in ()
----> 1 word2vec.train(sentences)

~\Anaconda3\lib\site-packages\gensim\models\word2vec.py in train(self, sentences, total_examples, total_words, epochs, start_alpha, end_alpha, word_count, queue_factor, report_delay, compute_loss, callbacks)
609 sentences, total_examples=total_examples, total_words=total_words,
610 epochs=epochs, start_alpha=start_alpha, end_alpha=end_alpha, word_count=word_count,
--> 611 queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
612
613 def score(self, sentences, total_sentences=int(1e6), chunksize=100, queue_factor=2, report_delay=1):

~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in train(self, sentences, total_examples, total_words, epochs, start_alpha, end_alpha, word_count, queue_factor, report_delay, compute_loss, callbacks)
567 sentences, total_examples=total_examples, total_words=total_words,
568 epochs=epochs, start_alpha=start_alpha, end_alpha=end_alpha, word_count=word_count,
--> 569 queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
570
571 def _get_job_params(self, cur_epoch):

~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in train(self, data_iterable, epochs, total_examples, total_words, queue_factor, report_delay, callbacks, **kwargs)
239 epochs=epochs,
240 total_examples=total_examples,
--> 241 total_words=total_words, **kwargs)
242
243 for callback in self.callbacks:

~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in _check_training_sanity(self, epochs, total_examples, total_words, **kwargs)
612 if total_words is None and total_examples is None:
613 raise ValueError(
--> 614 "You must specify either total_examples or total_words, for proper job parameters updation"
615 "and progress calculations. "
616 "The usual value is total_examples=model.corpus_count."

ValueError: You must specify either total_examples or total_words, for proper job parameters updationand progress calculations. The usual value is total_examples=model.corpus_count.
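
Editor's note: the traceback above shows the condition gensim checks (total_words and total_examples both None), and an earlier comment quotes a second message about a missing epochs count. The sketch below turns those two messages into a small pre-flight check. missing_train_args is a hypothetical helper, not part of gensim, and the "epochs is None" condition is an assumption inferred from the quoted message rather than from gensim's source.

```python
def missing_train_args(total_examples=None, total_words=None, epochs=None):
    """List the train() arguments that the quoted errors complain about."""
    problems = []
    # Condition shown in the traceback: both counts are missing.
    if total_examples is None and total_words is None:
        problems.append("pass total_examples=model.corpus_count (or total_words)")
    # Assumed condition behind the "explicit epochs count" message.
    if epochs is None:
        problems.append("pass an explicit epochs count")
    return problems


# The failing call in the script passes neither argument.
print(missing_train_args())

# Supplying both, as suggested earlier in the thread, clears the list.
print(missing_train_args(total_examples=100, epochs=5))
```

For the script above, that would mean replacing the last line with a call that also passes total_examples=word2vec.corpus_count and an epochs value.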


AnshikaAgrawal commented on June 30, 2024

@shashankboosi You can give an explicit epochs value such as 20, just as the error message says. It worked for me!


gsbnair commented on June 30, 2024

shashankboosi commented on June 30, 2024

Yep, I guess that is how it works.
Thank you :)

