ssumukh / search-engine-for-wikipedia Goto Github PK
View Code? Open in Web Editor NEWUsed Block-Sort-Based-Indexing to create the inverted index of the entire WikiPedia dump (73.3 GB), then queries on the index and retrieves top 10 results via relevance ranking of the documents, implemented using tf-idf scoring.