sunbelbd / song
SONG: Approximate Nearest Neighbor Search on GPU. SONG is a graph-based approximate nearest neighbor search toolbox.
License: MIT License
Hi,
I'm using SONG to run the datasets from your paper.
Can you tell me the pq_size setting you used for each dataset? The README gives an example with pq_size = 100.
Can I use 100 for all of them? Does this parameter strongly affect the quality and efficiency of graph building and querying?
I also tested the nytimes dataset from https://github.com/erikbern/ann-benchmarks, and SONG took around 10 minutes to finish graph construction. Is that expected? All the commands I used follow your README:
./build_graph.sh /path2/nytimes-256-angular_base_libsvm 290000 256 cos
Also, can you share the code you use to calculate recall? I could not find where your system reads a ground-truth file, so I added the recall computation at the end of your main.cu file.
Is this how you calculated recall in your paper?
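For reference, recall@k is commonly computed as the fraction of the true k nearest neighbors that show up in the k returned results, averaged over all queries. Here is a minimal Python sketch of that definition. It is not taken from the SONG code or paper; the function name `recall_at_k` and the list-of-ID-lists layout are assumptions made for illustration.

```python
def recall_at_k(results, ground_truth, k):
    """Average fraction of the true top-k neighbor IDs found in the returned top-k.

    results and ground_truth are lists with one list of neighbor IDs per query
    (hypothetical layout, not SONG's output format).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if not results or len(results) != len(ground_truth):
        raise ValueError("results and ground_truth must cover the same queries")
    hits = 0
    for found, truth in zip(results, ground_truth):
        # Count the IDs that appear in both top-k lists, ignoring their order.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(results))


# Toy data: two queries, three returned neighbors each.
results = [[3, 7, 1], [4, 2, 9]]
ground_truth = [[3, 1, 5], [2, 4, 9]]
print(recall_at_k(results, ground_truth, 3))
```

Order inside the top-k is ignored here; some benchmarks instead compare distances against a threshold, so the numbers may not match the paper's if it used a different definition.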
Hi, I tried running song with various datasets in libsvm format.
letter (dim=16) and poker (dim=10) worked well, but acoustic (dim=50) and higher-dimensional datasets did not.
Since poker has more data points than acoustic, I think the failure is related to the dimension rather than the dataset size.
It raises a GPU error: illegal memory access was encountered.
I used CUDA 10.0/10.1 on TITAN V and GV100 GPUs.
Would you kindly check if the acoustic/mnist dataset works in your environment?
I did not change anything in config.h and ran:
./generate_template.sh
./fill_parameters.sh 100 50 cos
./build_graph.sh data/acoustic_scale 78823 50 cos
./test_query.sh data/acoustic_scale.t 19705 50 cos 5
Hi,
I have been trying to generate a graph for out-of-memory datasets using the 1-bit random projections mentioned in the SONG paper. I looked through the config file, the makefile, and the paper but could not figure out how to enable them. Can you point me in the right direction?
Even when I uncomment the #define __ENABLE_HASH in the config.h file, I do not get the 1-bit random projection graph.
Hello, can I ask how you calculate recall? Do you have a ground-truth file, or do you create the ground truth yourself?
Your inputs will be greatly appreciated.
Thanks
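If no ground-truth file is available, one common approach is to build it yourself with an exact brute-force search over the base set. Below is a rough Python sketch for the cosine measure. It is not SONG's own procedure; `cosine_similarity` and `brute_force_ground_truth` are hypothetical helper names, and pure Python like this is only practical for small samples.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def brute_force_ground_truth(base, queries, k):
    """Return, for each query, the indices of its k most similar base vectors."""
    truth = []
    for q in queries:
        # Pair each score with the negated index so ties prefer the lower index.
        scored = ((cosine_similarity(q, v), -i) for i, v in enumerate(base))
        top = heapq.nlargest(k, scored)
        truth.append([-neg_i for _, neg_i in top])
    return truth


base = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[1.0, 0.1]]
print(brute_force_ground_truth(base, queries, 2))
```

The resulting ID lists can then be compared against the IDs returned by the approximate search to get recall.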
Hi,
I'd like to reproduce the results in the SONG paper. It seems the parameter <pq_size> for fill_parameters.sh not only controls the recall of the search, but also the quality of the graph. Can you share the parameters you used to build the graphs for NYTimes, SIFT, GloVe200, UQ_V, GIST, and MNIST8m?
Thanks
Hi, I have noticed that you used sift1M dataset in your paper.
How can I use a dataset in fvecs format with your code?
Are you planning to add an fvecs-to-libsvm conversion script, or to provide sift1M in libsvm format?
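In the meantime, a conversion along these lines might work. An fvecs file stores each vector as a little-endian 32-bit integer dimension followed by that many 32-bit floats, and libsvm text lines look like `label index:value ...`. The sketch below is not part of SONG; the function names, the constant label 0, the 1-based feature indices, and writing every component (including zeros) are assumptions that may need adjusting to what SONG's loader expects.

```python
import struct


def read_fvecs(path):
    """Yield each vector in an .fvecs file as a list of floats."""
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if not header:
                break  # clean end of file
            if len(header) < 4:
                raise ValueError("truncated fvecs header")
            (dim,) = struct.unpack("<i", header)
            if dim <= 0:
                raise ValueError("invalid vector dimension: %d" % dim)
            payload = f.read(4 * dim)
            if len(payload) != 4 * dim:
                raise ValueError("truncated fvecs record")
            yield list(struct.unpack("<%df" % dim, payload))


def fvecs_to_libsvm(src, dst, label=0, index_base=1):
    """Write every vector from src as one libsvm line in dst; return the count.

    label and index_base are assumptions, not values confirmed by SONG.
    """
    count = 0
    with open(dst, "w") as out:
        for vec in read_fvecs(src):
            feats = " ".join(
                "%d:%s" % (i + index_base, format(v, ".9g"))
                for i, v in enumerate(vec)
            )
            out.write("%s %s\n" % (label, feats))
            count += 1
    return count
```

Calling `fvecs_to_libsvm("sift_base.fvecs", "sift_base_libsvm")` (file names are placeholders) would then produce a text file to pass to build_graph.sh, provided the index base and label match what the loader expects.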
Does the CUDA version affect the results?
In the warp_independent_search_kernel, new and delete are used to allocate memory. Could you please advise a workaround for when dynamic memory allocation is not supported by a programming model? Thank you.