Sefaria-SQL

Converts Sefaria-Export to a SQLite database.

Where to get database

How to use

  1. git clone https://github.com/Sefaria/Sefaria-SQL.git
  2. git clone https://github.com/Sefaria/Sefaria-Export.git (into the same dir that Sefaria-SQL is in)
  3. Go to scripts/links and run: python2 createLinks.py
  4. Go to scripts/fileList and run: python2 createFileList.py
  5. (Optional, because the headers are already part of the clone) Go to Sefaria-SQL/scripts/headers and run: python2 createHeaders.py
  6. Open Sefaria-SQL in Eclipse for Java (File -> import -> Existing Projects into Workspace)
  7. In src/SQLite.java, change the variables as needed
  8. Run project
  9. The exported database is in testDBs/ and word counts are saved in wordCounts/
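Once the project has run, a quick way to check the result is to list the tables in the exported file and count their rows. The following is a minimal sketch using Python's standard sqlite3 module; the function name list_tables and the database file name in the usage line are hypothetical, since the README does not state what the file in testDBs/ is called.

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            # Table names come from the database itself, so quoting them is enough here.
            counts[name] = conn.execute(
                'SELECT COUNT(*) FROM "{}"'.format(name)).fetchone()[0]
        return counts
    finally:
        conn.close()
```

For example, list_tables("testDBs/your_database.db") would return a dictionary mapping each table name to its row count, where your_database.db stands in for whatever file the Java project actually wrote.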

Exploring the Code

The Java code is in src/.

SQLite.java is the top-level code (it runs at startup). Book.java contains methods for inserting the data about each book into the database. Similarly, Header, Link, Searching, and Text are each responsible for putting their respective items into the database, each in its own table. Node.java is responsible for inserting Nodes for complex texts and/or alternate structures.

There are some preprocessing Python scripts in scripts/.

scripts/fileList/createFileList.py creates a list of files to be uploaded, based on the index and the exported files.
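To give a rough idea of what building such a file list involves, here is a minimal sketch that walks an export directory and collects every file with a given name. It is not the logic of createFileList.py: the function name create_file_list and the default file name merged.json are assumptions made for illustration, and the cross-check against the index is left out entirely.

```python
import os


def create_file_list(export_root, wanted_name="merged.json"):
    """Collect paths (relative to export_root) of every file named wanted_name."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(export_root):
        if wanted_name in filenames:
            full = os.path.join(dirpath, wanted_name)
            # Store forward-slash relative paths so the list is portable.
            rel = os.path.relpath(full, export_root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)
```

Calling create_file_list on the Sefaria-Export checkout would return a sorted list of relative paths, which could then be written out one per line.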

scripts/links/createLinks.py converts Sefaria-Export/links/links.csv to scripts/links/links0.csv, which has the level numbers split out and is easier to upload.
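The idea of dividing level numbers can be illustrated with a small sketch that splits a citation string into a title and a list of integer levels. This is only an illustration under stated assumptions: the README does not describe the column layout of links.csv or links0.csv, the citation shape "Title 1:2" is assumed, and split_citation is a hypothetical helper rather than a function from createLinks.py.

```python
import re

# Assumed citation shape: a title, whitespace, then colon-separated integers.
_CITATION = re.compile(r"^(.*?)\s+(\d+(?::\d+)*)$")


def split_citation(citation):
    """Split 'Title 1:2' into ('Title', [1, 2]).

    Citations that do not fit the assumed shape (for example ranges or
    folio references such as '2a') are returned unchanged with no levels.
    """
    match = _CITATION.match(citation.strip())
    if match is None:
        return citation, []
    title, numbers = match.groups()
    return title, [int(part) for part in numbers.split(":")]
```

With the levels separated like this, each one could be written to its own CSV column, which is presumably what makes the converted file easier to load into database columns.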

License

GPL

Contributors

herzberg
