eyedex's Introduction

THE EYE SEARCHING TOOL

Search (almost) everything on The-Eye

I created this because a friend of mine wanted books on a particular subject but didn't want to sift through every file on the site.
Requires at least 4 GB of RAM (the database is loaded into RAM).

Before you start

  1. Download the databases
    • Google Drive Link
    • The databases are split into two parts: one for JUST the Piracy/ directory and one for everything else
  2. Format the databases to feed to the program
    • Move the databases to the "database" folder
    • python3 formatter.py [inputfile] [outputfile]
    • Wait for it to finish
  3. Compile the searcher program
    • Navigate to the cpp folder
    • Compile the searcher:
      • g++ fastSearch.cpp thirdParty.{cpp,h} -lz -lpthread -lzmq -o searcher
    • In case of errors, make sure the required libraries are installed (libzmq3-dev and others as needed)
  4. Install required python libraries:
    • pip3 install pyzmq flask rapidfuzz
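
Step 2 has to be repeated for each downloaded database, so it can help to generate the commands up front. The sketch below only builds the `formatter.py` command lines and does not run anything. The helper name `formatter_commands` and the `.formatted.json` output suffix are assumptions made for illustration, not something the project defines.

```python
from pathlib import Path
import sys


def formatter_commands(database_dir, suffix=".formatted.json"):
    """Build one `formatter.py [inputfile] [outputfile]` command per database file."""
    root = Path(database_dir)
    if not root.is_dir():
        return []
    commands = []
    for source in sorted(root.iterdir()):
        # Skip sub-folders and files this helper would have produced earlier.
        if not source.is_file() or source.name.endswith(suffix):
            continue
        # Hypothetical naming: "name.json" becomes "name.formatted.json".
        target = source.with_name(source.stem + suffix)
        commands.append([sys.executable, "formatter.py", str(source), str(target)])
    return commands
```

Calling `formatter_commands("database")` returns a list of argument lists. Each one can be printed for review or passed to `subprocess.run` from the folder that contains formatter.py.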

Usage

  1. Rename the formatted database file to "dbformatted.json" and move it to cpp/
    • You can edit the fastSearch.cpp file to change this file name (I was too lazy to implement something for that)
  2. Run the searcher binary in cpp/ and wait for the database to load
  3. While the database is loading into RAM, run main.py to start Flask
    • You can customize things like the minimum matching score in main.py
  4. Navigate to localhost:5000 in a browser (more info is provided there)
  5. Done
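
To give a feel for what a minimum matching score does, here is a minimal sketch of score-based filtering. It is not the code in main.py: it swaps in `difflib.SequenceMatcher` from the standard library in place of rapidfuzz so it runs without extra installs, and the names `score`, `search` and `min_score` are made up for this example.

```python
from difflib import SequenceMatcher


def score(query, candidate):
    """Return a 0-100 similarity score, ignoring letter case."""
    ratio = SequenceMatcher(None, query.lower(), candidate.lower()).ratio()
    return ratio * 100


def search(query, names, min_score=60):
    """Return (name, score) pairs scoring at least min_score, best first."""
    scored = [(name, score(query, name)) for name in names]
    kept = [pair for pair in scored if pair[1] >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Raising `min_score` trims loosely related file names from the results, while lowering it makes the search more forgiving of typos at the cost of more noise.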

Notes

  • This was originally meant to be hosted on a server; however, the cost of a server is too high for me, so it's open source instead
  • This is NOT a mirror of all files in The-Eye, it is just an index of the files hosted there
  • I'll try to update the database roughly every month. I will not be releasing the scraping code just yet
  • The database is NOT 100% accurate, and things like HTML pages are not scraped correctly
  • I used JSON because it is easy to use, readable and supports the nested format that I want. The format is easy to iterate over line by line (inspect the first few hundred bytes to see what I mean)
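
For the last note, here is a small sketch of how the file could be inspected and walked without loading it whole. The exact layout of the database is not documented here, so `iter_json_lines` simply assumes that some lines hold a complete JSON value (possibly followed by a comma) and skips every line that does not. Both function names are hypothetical.

```python
import json


def peek(path, size=300):
    """Return the first `size` bytes of a file as text, without reading it all."""
    with open(path, "rb") as handle:
        return handle.read(size).decode("utf-8", errors="replace")


def iter_json_lines(path):
    """Yield every line that parses as JSON on its own; skip the rest."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            # Assumption: an entry may end with a comma separating it from the next.
            text = line.strip().rstrip(",")
            if not text:
                continue
            try:
                yield json.loads(text)
            except json.JSONDecodeError:
                continue  # e.g. a lone bracket opening or closing a nested block
```

`peek("dbformatted.json")` shows the first few hundred bytes, which is enough to check whether the one-entry-per-line assumption holds before relying on the iterator.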

eyedex's People

Contributors

xosrov
