Comments (6)
demo-word-accuracy.sh also crashes.
The other demos run great.
Original comment by [email protected]
on 19 Aug 2013 at 7:45
from word2vec.
I'm on OS X Lion, compiled with clang.
Using valgrind the issue appears to be on line 102 of compute-accuracy.c
vec[a] = M[a + b2 * size] - M[a + b1 * size] + M[a + b3 * size];
With 30k passed on the command line as the word limit, M is 24,000,000 bytes,
i.e. an array of 6M floats, but an if statement I added shows that the program
regularly accesses memory outside of this range.
Adding the if statement with a printf message stops the seg fault.
I have:
if (a + b3 * size >= 6000000) printf("Memory overflow\n");
Adding this statement prints a bunch of "Memory overflow" messages, but aside
from that the program seems to keep trucking along, and I get a final output
of
ACCURACY TOP1: 18.77 % (122 / 650)
Total accuracy 26.19% Semantic accuracy: 24.76% Syntactic accuracy: 26.91%
Questions seen / total: 12268 19544 62.77%
This is obviously not a fix. It looks like something to do with buffers, but
I'm not a C expert by any means.
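For anyone who wants to check this more systematically, here is a minimal sketch of a bounds check done once per question rather than per element. It assumes M holds words rows of size floats each, as described above; analogy_vector is a hypothetical helper and is not part of compute-accuracy.c, and this is not claimed to be the fix that was eventually committed.

```c
/* Hypothetical helper, not part of compute-accuracy.c.
   Computes vec = M[b2] - M[b1] + M[b3] row-wise, but only when all three
   row indices fall inside the words rows that M actually holds.
   Returns 1 on success, 0 if any row index is out of range. */
int analogy_vector(const float *M, long long words, long long size,
                   long long b1, long long b2, long long b3, float *vec) {
  long long a;
  if (b1 < 0 || b1 >= words) return 0;
  if (b2 < 0 || b2 >= words) return 0;
  if (b3 < 0 || b3 >= words) return 0;
  for (a = 0; a < size; a++)
    vec[a] = M[a + b2 * size] - M[a + b1 * size] + M[a + b3 * size];
  return 1;
}
```

A caller would skip the question whenever the function returns 0, instead of reading past the end of M.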
Original comment by [email protected]
on 22 Aug 2013 at 5:55
Thanks for reporting this bug, it should be fixed now.
Original comment by [email protected]
on 23 Aug 2013 at 6:08
- Changed state: Fixed
Seems still broken.
Deleted all data files. Updated to latest. Re-applied the OS X fix (#include
<malloc.h> becomes #include <stdlib.h>).
make clean
make
Re-ran the script.
Starting training using file text8
Words processed: 17000K Vocab size: 4399K
Vocab size (unigrams + bigrams): 2586139
Words in train file: 17005206
Words written: 17000K
real 0m20.452s
user 0m19.601s
sys 0m0.816s
Starting training using file text8-phrase
Vocab size: 123636
Words in train file: 16337523
Alpha: 0.000119 Progress: 99.59% Words/thread/sec: 22.46k
real 1m37.069s
user 12m8.130s
sys 0m1.240s
newspapers:
./demo-phrase-accuracy.sh: line 12: 1189 Segmentation fault: 11 ./compute-accuracy vectors-phrase.bin < questions-phrases.txt
Original comment by [email protected]
on 23 Aug 2013 at 6:30
No idea what I'm doing, but if it helps:
(gdb) run vectors-phrase.bin <questions-phrases.txt
Starting program: /Users/benjamin/Documents/code/word2vec/compute-accuracy vectors-phrase.bin <questions-phrases.txt
Reading symbols for shared libraries +.............................. done
newspapers:
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x000004b4b1e6d740
0x00000001000019fd in main ()
Original comment by [email protected]
on 23 Aug 2013 at 6:37
Removing -Ofast from the makefile seems to have helped. But wow, is it slower:
maybe a 90% speed reduction?
output:
newspapers:
ACCURACY TOP1: 8.33 % (1 / 12)
Total accuracy: 8.33 % Semantic accuracy: 8.33 % Syntactic accuracy: nan %
ice_hockey:
ACCURACY TOP1: 0.00 % (0 / 56)
Total accuracy: 1.47 % Semantic accuracy: 1.47 % Syntactic accuracy: nan %
basketball:
ACCURACY TOP1: 0.00 % (0 / 30)
Total accuracy: 1.02 % Semantic accuracy: 1.02 % Syntactic accuracy: nan %
airlines:
ACCURACY TOP1: 14.29 % (6 / 42)
Total accuracy: 5.00 % Semantic accuracy: 5.00 % Syntactic accuracy: nan %
people-companies:
ACCURACY TOP1: 25.00 % (1 / 4)
Total accuracy: 5.56 % Semantic accuracy: 5.56 % Syntactic accuracy: nan %
Questions seen / total: 144 3218 4.47 %
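The "nan %" for syntactic accuracy is presumably a 0 / 0 float division, since none of the sections in this run seem to count as syntactic. A minimal sketch of a guarded percentage, where percent is a hypothetical helper and not a function in compute-accuracy.c:

```c
/* Hypothetical helper, not part of compute-accuracy.c.
   Returns correct / total as a percentage, or 0 when no questions were
   counted, instead of the nan that a 0 / 0 float division produces. */
float percent(int correct, int total) {
  if (total == 0) return 0.0f;
  return (float)correct / (float)total * 100.0f;
}
```

Returning 0 for an empty category is just one choice; printing "n/a" would be arguably clearer.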
Original comment by [email protected]
on 23 Aug 2013 at 7:05