Comments (16)
Hello guys, I'm using the following command:
model.train(sentences, total_examples=model.corpus_count, epochs=model.iter)
and it worked!
Passing sentences.sentences_perm() did not work for me.
Thanks for the help.
from word2vec-sentiments.
This one worked for me:
model.train(sentences.sentences_perm(), total_examples=model.corpus_count, epochs=model.iter)
from word2vec-sentiments.
Try corpus_count instead of corpus_count().
I'll update this during the weekend
from word2vec-sentiments.
Let me know if it works!
from word2vec-sentiments.
First I changed the line to:
model.train(sentences.sentences_perm, total_examples=model.corpus_count)
After that it asked me to add epochs, so I changed it to:
model.train(sentences.sentences_perm, total_examples=model.corpus_count, epochs=model.iter)
but now I get a more complex error message:
Exception in thread Thread-13:
Traceback (most recent call last):
  File "E:\Python3\lib\threading.py", line 916, in _bootstrap_inner
    self.run()
  File "E:\Python3\lib\threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "E:\Python3\lib\site-packages\gensim\models\word2vec.py", line 854, in job_producer
    for sent_idx, sentence in enumerate(sentences):
  File "E:\Python3\lib\site-packages\gensim\utils.py", line 687, in __iter__
    for document in self.corpus:
TypeError: 'method' object is not iterable
from word2vec-sentiments.
Nope, I get that it is not callable again (like the first issue we had with corpus_count):
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
in ()
----> 1 model.train(sentences.sentences_perm, total_examples=model.corpus_count, epochs=model.iter())
TypeError: 'int' object is not callable
I hope you got another idea :)
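For anyone hitting the same two TypeErrors, here is a minimal sketch of what seems to be going on. It uses a hypothetical stand-in class (PermutableCorpus is made up for illustration and is not from the notebook or from gensim): sentences_perm is a method, so it has to be called to produce something iterable, while an integer attribute such as the iter value must not be called.

```python
import random


class PermutableCorpus:
    """Hypothetical stand-in for the notebook's sentence container."""

    def __init__(self, sentences):
        self.sentences = list(sentences)

    def sentences_perm(self):
        # Return a shuffled copy; the method must be *called* to get the list.
        shuffled = self.sentences[:]
        random.shuffle(shuffled)
        return shuffled


def is_iterable(obj):
    """Return True if iter(obj) succeeds, False if it raises TypeError."""
    try:
        iter(obj)
        return True
    except TypeError:
        return False


corpus = PermutableCorpus([["good", "movie"], ["bad", "plot"]])

# A bound method is not iterable: this mirrors "'method' object is not iterable".
method_is_iterable = is_iterable(corpus.sentences_perm)

# Calling the method returns a list, which is iterable.
result_is_iterable = is_iterable(corpus.sentences_perm())

# A plain int cannot be called: this mirrors "'int' object is not callable".
epochs = 5
epochs_is_callable = callable(epochs)
```

So the combination that avoids both errors is sentences.sentences_perm() with parentheses and model.iter without them.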
from word2vec-sentiments.
Sorry I have no time to update this right now. @eriksonJAguiar can you let me know the versions of python and gensim you're using? If it's the latest, I'll just make the change you mentioned. Thanks!
from word2vec-sentiments.
Hi @linanqiu, I'm using Python 3.5 and gensim 2.3.0 !!
from word2vec-sentiments.
Hello, I'm using the following command:
model.train(sentences, total_examples=model.corpus_count, epochs=model.iter) and it worked ! Thanks.
from word2vec-sentiments.
Hello there,
I tried the command, but I am getting this error:
raise ValueError("You must specify an explict epochs count. The usual value is epochs=model.epochs.")
ValueError: You must specify an explict epochs count. The usual value is epochs=model.epochs.
when I tried:
model_dm.train(perm_sentences, total_examples=model_dm.corpus_count, epochs=model_dm.epochs)
I specified epochs as suggested, but I still got the error.
Python version : 3.5
Gensim version : 3.4
Can you tell me what the problem is ?
Regards,
Shashank Reddy Boosi.
from word2vec-sentiments.
Hi Shashank,
As a few comments above said, you need to make the following modification:
Change model.train(sentences.sentences_perm())
to
model.train(sentences.sentences_perm(), total_examples=model.corpus_count, epochs=model.iter)
It works, and I got almost the same result: 0.86464
I am using Python 3.6 and gensim 3.4.
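A small sketch of the same fix that tolerates both attribute names seen in this thread. The helper name train_kwargs and the SimpleNamespace stand-ins are made up for illustration. The assumption, based only on the comments and error messages here, is that some gensim releases expose the epoch count as model.iter and others as model.epochs.

```python
from types import SimpleNamespace


def train_kwargs(model):
    """Build the extra keyword arguments that train() asks for (hypothetical helper)."""
    # Prefer `epochs`; fall back to `iter` when `epochs` is absent or None.
    epochs = getattr(model, "epochs", None)
    if epochs is None:
        epochs = getattr(model, "iter", None)
    if epochs is None:
        raise ValueError("model exposes neither `epochs` nor `iter`")
    return {"total_examples": model.corpus_count, "epochs": epochs}


# Stand-ins that only mimic the attributes used above (not real gensim models).
old_style = SimpleNamespace(corpus_count=100, iter=5)
new_style = SimpleNamespace(corpus_count=100, epochs=10)

old_kwargs = train_kwargs(old_style)
new_kwargs = train_kwargs(new_style)
```

With a real model this would presumably be used as model.train(sentences.sentences_perm(), **train_kwargs(model)).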
from word2vec-sentiments.
from __future__ import absolute_import, division, print_function
import codecs
import glob
import logging
import multiprocessing
import os
import pprint
import re
import nltk
import gensim.models.word2vec as w2v
import sklearn.manifold
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import warnings
warnings.filterwarnings(action='ignore', category=UserWarning, module='gensim')
import gensim
%pylab inline
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
nltk.download("punkt")
nltk.download("stopwords")
book_filenames = sorted(glob.glob(r"C:\Users\Nilesh\Desktop\Machine.txt"))
print("Found books:")
book_filenames
corpus_raw = u""
for book_filename in book_filenames:
    print("Reading '{0}'...".format(book_filename))
    with codecs.open(book_filename, "r", "utf-8") as book_file:
        corpus_raw += book_file.read()
    print("Corpus is now {0} characters long".format(len(corpus_raw)))
    print()
tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
raw_sentences = tokenizer.tokenize(corpus_raw)
#convert into a list of words
#remove unnecessary characters, split into words, no hyphens
#list of words
def sentence_to_wordlist(raw):
    clean = re.sub("[^a-zA-Z]"," ", raw)
    words = clean.split()
    return words
#sentence where each word is tokenized
sentences = []
for raw_sentence in raw_sentences:
    if len(raw_sentence) > 0:
        sentences.append(sentence_to_wordlist(raw_sentence))
print(raw_sentences[5])
print(sentence_to_wordlist(raw_sentences[5]))
token_count = sum([len(sentence) for sentence in sentences])
print("The book corpus contains {0:,} tokens".format(token_count))
#ONCE we have vectors
#step 3 - build model
#3 main tasks that vectors help with
#DISTANCE, SIMILARITY, RANKING
# Dimensionality of the resulting word vectors.
#more dimensions, more computationally expensive to train
#but also more accurate
#more dimensions = more generalized
num_features = 300
# Minimum word count threshold.
min_word_count = 3
# Number of threads to run in parallel.
#more workers, faster we train
num_workers = multiprocessing.cpu_count()
# Context window length.
context_size = 7
# Downsample setting for frequent words.
#0 - 1e-5 is good for this
downsampling = 1e-3
# Seed for the RNG, to make the results reproducible.
#random number generator
#deterministic, good for debugging
seed = 1
word2vec = w2v.Word2Vec(
    sg=1,
    seed=seed,
    workers=num_workers,
    size=num_features,
    min_count=min_word_count,
    window=context_size,
    sample=downsampling
)
word2vec.build_vocab(sentences)
print("Word2Vec vocabulary length:", len(word2vec.wv.vocab))
word2vec.train(sentences)
ValueError Traceback (most recent call last)
in ()
----> 1 word2vec.train(sentences)
~\Anaconda3\lib\site-packages\gensim\models\word2vec.py in train(self, sentences, total_examples, total_words, epochs, start_alpha, end_alpha, word_count, queue_factor, report_delay, compute_loss, callbacks)
609 sentences, total_examples=total_examples, total_words=total_words,
610 epochs=epochs, start_alpha=start_alpha, end_alpha=end_alpha, word_count=word_count,
--> 611 queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
612
613 def score(self, sentences, total_sentences=int(1e6), chunksize=100, queue_factor=2, report_delay=1):
~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in train(self, sentences, total_examples, total_words, epochs, start_alpha, end_alpha, word_count, queue_factor, report_delay, compute_loss, callbacks)
567 sentences, total_examples=total_examples, total_words=total_words,
568 epochs=epochs, start_alpha=start_alpha, end_alpha=end_alpha, word_count=word_count,
--> 569 queue_factor=queue_factor, report_delay=report_delay, compute_loss=compute_loss, callbacks=callbacks)
570
571 def _get_job_params(self, cur_epoch):
~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in train(self, data_iterable, epochs, total_examples, total_words, queue_factor, report_delay, callbacks, **kwargs)
239 epochs=epochs,
240 total_examples=total_examples,
--> 241 total_words=total_words, **kwargs)
242
243 for callback in self.callbacks:
~\Anaconda3\lib\site-packages\gensim\models\base_any2vec.py in _check_training_sanity(self, epochs, total_examples, total_words, **kwargs)
612 if total_words is None and total_examples is None:
613 raise ValueError(
--> 614 "You must specify either total_examples or total_words, for proper job parameters updation"
615 "and progress calculations. "
616 "The usual value is total_examples=model.corpus_count."
ValueError: You must specify either total_examples or total_words, for proper job parameters updationand progress calculations. The usual value is total_examples=model.corpus_count.
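To make the traceback above easier to read, here is a simplified stand-in for the argument checks it shows. The function check_training_args is made up for illustration and is not gensim's code; the order of the two checks is an assumption. It only mirrors the two conditions quoted in this thread: one of total_examples or total_words must be given, and epochs must be given.

```python
def check_training_args(epochs=None, total_examples=None, total_words=None):
    """Simplified, hypothetical version of the checks visible in the traceback."""
    if total_examples is None and total_words is None:
        raise ValueError("You must specify either total_examples or total_words.")
    if epochs is None:
        raise ValueError("You must specify an explicit epochs count.")
    return True


sentences = [["machine", "learning"], ["word", "vectors"]]

# Equivalent of calling train(sentences) with no extra arguments: rejected.
try:
    check_training_args()
    bare_call_error = None
except ValueError as err:
    bare_call_error = str(err)

# Supplying both the example count and the epoch count passes the checks.
full_call_ok = check_training_args(epochs=5, total_examples=len(sentences))
```

Going by the error text itself, the corresponding call in the snippet above would presumably be word2vec.train(sentences, total_examples=word2vec.corpus_count, epochs=word2vec.epochs).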
from word2vec-sentiments.
You can give an explicit epochs value, such as 20, just as stated in the error. It worked for me!
from word2vec-sentiments.
Yep. That is how it works I guess.
Thank you :)
from word2vec-sentiments.