u-alberta / prima
Personal Research Management for the Internet Archive.
License: GNU Affero General Public License v3.0
shingles.db is created the first time min_hash.py is run and is then reused on repeated calls to that function; it isn't recreated unless the user deletes it.
inverted_index.db used to be created by tfidf.py so that my function k_means_clusterer.py would work, but I changed that code so it no longer needs a database.
Right now I'm thinking of keeping shingles.db the way it is, to save computation time on repeated calls to min_hash.py, and getting rid of inverted_index.db altogether.
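The create-once, reuse-afterwards behaviour described for shingles.db could be sketched roughly as below. This is only an illustration, not the actual min_hash.py code: the function names `k_shingles` and `load_or_build_shingles`, the table layout, and the use of character shingles are all assumptions.

```python
import os
import sqlite3


def k_shingles(text, k=5):
    """Return the set of character k-shingles of text (assumed shingle type)."""
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def load_or_build_shingles(db_path, documents, k=5):
    """Reuse db_path if the file exists; otherwise compute and store shingles.

    documents maps a document id to its text (hypothetical input shape).
    Returns (shingles_by_doc, was_built).
    """
    already_there = os.path.exists(db_path)
    conn = sqlite3.connect(db_path)
    try:
        if not already_there:
            # First run: build the table and fill it.
            conn.execute("CREATE TABLE shingles (doc_id TEXT, shingle TEXT)")
            rows = [
                (doc_id, shingle)
                for doc_id, text in documents.items()
                for shingle in sorted(k_shingles(text, k))
            ]
            conn.executemany("INSERT INTO shingles VALUES (?, ?)", rows)
            conn.commit()
        # Every run: read the shingles back from the database.
        result = {}
        for doc_id, shingle in conn.execute("SELECT doc_id, shingle FROM shingles"):
            result.setdefault(doc_id, set()).add(shingle)
        return result, not already_there
    finally:
        conn.close()
```

One consequence of this design is that the stored shingles go stale if the documents or k change; as with shingles.db, the user has to delete the file to force a rebuild.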
Should the user be able to specify the file type they want saved in the command-line arguments?
In min_hash, should they be able to specify k (the shingle size) and the number of hash functions?
In bm25, should they be able to specify the number of documents to be returned, or a minimum score for the documents returned?
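If the answer to these questions is yes, the options could be exposed with argparse along the following lines. This is a sketch only: the subcommand layout, every flag name (`--k`, `--num-hashes`, `--top-n`, `--min-score`, `--format`), the format choices, and all default values are assumptions, not the project's existing interface.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line parser for min_hash and bm25."""
    parser = argparse.ArgumentParser(
        description="Hypothetical CLI sketch for min_hash and bm25."
    )
    sub = parser.add_subparsers(dest="command", required=True)

    # min_hash: shingle size and number of hash functions (defaults are guesses).
    mh = sub.add_parser("min_hash")
    mh.add_argument("--k", type=int, default=5, help="shingle size")
    mh.add_argument("--num-hashes", type=int, default=100,
                    help="number of hash functions")
    mh.add_argument("--format", choices=["csv", "json"], default="csv",
                    help="file type to save")

    # bm25: limit results by count, by score, or not at all.
    bm = sub.add_parser("bm25")
    bm.add_argument("query")
    bm.add_argument("--top-n", type=int, default=None,
                    help="maximum number of documents to return")
    bm.add_argument("--min-score", type=float, default=None,
                    help="minimum score for returned documents")
    bm.add_argument("--format", choices=["csv", "json"], default="csv",
                    help="file type to save")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Leaving `--top-n` and `--min-score` unset by default keeps the current behaviour for users who pass no options.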