This is a Python implementation of a search engine. The index file is built from the LISA document collection. Search uses the Ranked Retrieval Model by default; a Boolean Retrieval Model is also implemented, and its queries support the operators AND, OR, NOT, ( and ).
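To illustrate how Boolean queries with AND, OR, NOT and parentheses can be evaluated, here is a minimal sketch. It is not taken from the repository: the function names, the toy inverted index, the precedence order (NOT, then AND, then OR), and the requirement that operators be written in uppercase are all assumptions made for the example. It converts the query to postfix with the shunting-yard algorithm and then combines sets of document ids.

```python
import re

# Assumed precedence: NOT binds tightest, then AND, then OR.
PRECEDENCE = {"NOT": 3, "AND": 2, "OR": 1}


def tokenize(query):
    """Split a query into parentheses, uppercase operators, and lowercase terms."""
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    return [t if t in PRECEDENCE or t in ("(", ")") else t.lower() for t in tokens]


def to_postfix(tokens):
    """Convert infix tokens to postfix order (shunting-yard algorithm)."""
    output, stack = [], []
    for tok in tokens:
        if tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack and stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the matching "("
        elif tok in PRECEDENCE:
            # AND/OR are left-associative; unary NOT is right-associative,
            # so an incoming NOT never pops another NOT.
            while (stack and stack[-1] != "(" and
                   (PRECEDENCE[stack[-1]] > PRECEDENCE[tok] or
                    (PRECEDENCE[stack[-1]] == PRECEDENCE[tok] and tok != "NOT"))):
                output.append(stack.pop())
            stack.append(tok)
        else:
            output.append(tok)  # a search term
    while stack:
        output.append(stack.pop())
    return output


def evaluate(postfix, index, all_docs):
    """Evaluate a postfix query against an inverted index of document-id sets."""
    stack = []
    for tok in postfix:
        if tok == "NOT":
            stack.append(all_docs - stack.pop())
        elif tok == "AND":
            right, left = stack.pop(), stack.pop()
            stack.append(left & right)
        elif tok == "OR":
            right, left = stack.pop(), stack.pop()
            stack.append(left | right)
        else:
            stack.append(index.get(tok, set()))  # unknown terms match nothing
    return stack.pop()


def boolean_search(query, index, all_docs):
    return evaluate(to_postfix(tokenize(query)), index, all_docs)


# Toy inverted index (hypothetical data): term -> set of document ids
index = {
    "library": {1, 2, 4},
    "catalogue": {2, 3},
    "retrieval": {3, 4},
}
all_docs = {1, 2, 3, 4}

print(boolean_search("library AND (catalogue OR retrieval) AND NOT catalogue",
                     index, all_docs))  # prints {4}
```

The sketch does no error handling: an unbalanced parenthesis or a dangling operator raises an IndexError, which a real query parser would need to report properly.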
- Make sure you have a Python 3 interpreter on your machine. The preferred version is Python 3.6, because the solution was tested on this version.
- Install the nltk library if you don't already have it on your machine. One way to do this is to run in a terminal: sudo pip3 install -U nltk
- Clone the repository by running in a terminal git clone https://github.com/ilya16/search-engine or download the archive with the system by following the link https://goo.gl/m8yT9q
- In a terminal, go to the directory with the unzipped solution using the cd command and then execute cd src
The application supports two modes: GUI and console.
GUI mode is run by executing
python app.py
Console mode is run by executing
python app.py console
Console mode with Boolean Retrieval is run by executing
python app.py console bool
- If the previous steps completed without errors, the application should start in one to two seconds. If the file results/indexfile does not exist, the index will be built from scratch, which takes up to 20 seconds.
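The "load the index if it exists, otherwise build it" behaviour can be sketched as follows. This is an illustration only, not the repository's code: `build_index`, `load_or_build_index`, the whitespace tokenization, and the use of pickle as the on-disk format are assumptions; only the path results/indexfile comes from the description above.

```python
import os
import pickle


def build_index(documents):
    """Hypothetical builder: map each lowercase term to the ids of documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index.setdefault(term, set()).add(doc_id)
    return index


def load_or_build_index(path, documents):
    """Return the cached index if the file exists; otherwise build it and save it."""
    if os.path.isfile(path):
        with open(path, "rb") as f:
            return pickle.load(f)  # fast path: reuse the stored index

    index = build_index(documents)  # slow path: build from scratch
    directory = os.path.dirname(path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(index, f)
    return index
```

Calling `load_or_build_index("results/indexfile", documents)` twice would build and write the index on the first call and read it back from disk on the second, which matches the difference between the slow first start and the fast later starts described above. Deleting the file forces a rebuild.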