learndb-py's Introduction

LearnDB

What I Cannot Create, I Do Not Understand -Richard Feynman

In the spirit of Feynman's immortal words, the goal of this project is to better understand the internals of databases by implementing a relational database management system (RDBMS), a sqlite clone, from scratch.

This project was motivated by a desire to: 1) understand databases more deeply and 2) work on a fun project. These dual goals led to a project with:

  • a relatively simple code base
  • a relatively complete RDBMS implementation
  • pure python source
    • no build step
  • zero configuration
    • the default configuration can be overridden

This makes the learndb codebase great for tinkering with. However, the product has some key limitations that mean it shouldn't be used as an actual storage solution.

Features

Learndb supports the following:

  • a rich SQL dialect (learndb-sql) with support for select, from, where, group by, having, limit, order by
  • a custom lexer and parser built using lark
  • at a high level, an engine that accepts SQL statements; these statements express operations on a database (a collection of tables which contain data)
  • multiple ways for users/agents to connect to the RDBMS:
    • REPL
    • importing python module
    • passing a file of commands to the engine
  • on-disk btree implementation as backing data structure
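
To make the clause list above concrete, here is a minimal sketch of the logical order in which these clauses are conventionally evaluated (from, where, group by, having, select, order by, limit). It is plain Python over a hypothetical in-memory `employees` list, not learndb's engine or its btree storage. The SQL in the docstring is generic SQL for illustration and is not guaranteed to be valid learndb-sql.

```python
from collections import defaultdict

# Hypothetical in-memory table; learndb itself keeps rows in an on-disk btree.
employees = [
    {"name": "ann", "dept": "eng", "salary": 120},
    {"name": "bob", "dept": "eng", "salary": 100},
    {"name": "cat", "dept": "ops", "salary": 90},
    {"name": "dan", "dept": "ops", "salary": 70},
    {"name": "eve", "dept": "hr", "salary": 60},
]


def run_query(rows):
    """Logical evaluation of the generic SQL query:

    select dept, count(name) as n, avg(salary) as avg_salary
    from employees
    where salary > 65
    group by dept
    having count(name) > 1
    order by avg_salary desc
    limit 1
    """
    # from + where: scan the source rows and keep those matching the predicate
    filtered = [r for r in rows if r["salary"] > 65]

    # group by: bucket the surviving rows by the grouping column
    groups = defaultdict(list)
    for r in filtered:
        groups[r["dept"]].append(r)

    # having: filter whole groups using an aggregate condition
    kept = {dept: rs for dept, rs in groups.items() if len(rs) > 1}

    # select: compute one output row per remaining group
    projected = [
        {
            "dept": dept,
            "n": len(rs),
            "avg_salary": sum(x["salary"] for x in rs) / len(rs),
        }
        for dept, rs in kept.items()
    ]

    # order by: sort the output rows
    projected.sort(key=lambda r: r["avg_salary"], reverse=True)

    # limit: truncate the sorted output
    return projected[:1]


result = run_query(employees)
print(result)
```

Note that `having` filters groups after aggregation, while `where` filters individual rows before grouping.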

Limitations

  • Very simplified implementation of floating point number arithmetic (e.g. compared to IEEE 754) [1].
  • No support for common utility features, like wildcard column expansion, e.g. select * ...
  • More limitations
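
The floating point limitation can be illustrated with a short sketch. It contrasts a fixed-delta comparison with a tolerance that scales with the magnitude of the operands, which is one reading of the footnote on float comparison. The function names and the value of `FIXED_DELTA` are assumptions for illustration and are not taken from the learndb source.

```python
import math

FIXED_DELTA = 1e-6  # hypothetical value; learndb's actual delta is not stated here


def equal_fixed(a, b, delta=FIXED_DELTA):
    """Treat a and b as equal if they differ by at most a fixed delta."""
    return abs(a - b) <= delta


def equal_scaled(a, b, rel_tol=1e-9):
    """Treat a and b as equal if they differ by at most rel_tol * max(|a|, |b|)."""
    return math.isclose(a, b, rel_tol=rel_tol)


# Small magnitudes: the fixed delta is too loose, 2e-9 is twice 1e-9.
print(equal_fixed(1e-9, 2e-9), equal_scaled(1e-9, 2e-9))

# Large magnitudes: the fixed delta is too strict for a tiny relative difference.
print(equal_fixed(1e12, 1e12 + 1e-3), equal_scaled(1e12, 1e12 + 1e-3))
```

A fixed delta behaves reasonably only for numbers near a particular magnitude, which is why a scaled epsilon is preferable.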

Getting Started: Tinkering and Beyond

  • To get started with learndb, begin with tutorial.md.
  • Then, to understand the system at a deeper technical level, read reference.md. This is essentially a complete reference manual directed at a user of the system. It outlines the operations and capabilities of the system, and describes what is supported, what is unsupported, and what is undefined behavior.
  • Architecture.md provides a component-level breakdown of the repo and the system.

Hacking

Install

  • System requirements
    • requires a linux/macos system, since it uses fcntl to get exclusive read access on the database file
    • python >= 3.9
  • To install for development, i.e. so that src can be edited without having to reinstall:
    • cd <repo_root>
    • create virtualenv: python3 -m venv venv
    • activate venv: source venv/bin/activate
    • install requirements: python -m pip install -r requirements.txt
    • install Learndb in edit mode: python3 -m pip install -e .
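
The fcntl requirement listed under system requirements is why the project is limited to linux/macos: the `fcntl` module is part of the Python standard library only on Unix-like systems. The following is a minimal sketch of taking an exclusive advisory lock on a file. It assumes `fcntl.flock` is used; learndb may use a different fcntl call. The file path and the helper name `try_exclusive_lock` are hypothetical.

```python
import fcntl
import os
import tempfile


def try_exclusive_lock(fileobj):
    """Try to take an exclusive advisory lock without blocking.

    Returns True if the lock was acquired, False if another open file
    description already holds a conflicting lock.
    """
    try:
        fcntl.flock(fileobj.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        return True
    except BlockingIOError:
        return False


# Demo on a throwaway file (hypothetical path, not learndb's database file).
path = os.path.join(tempfile.mkdtemp(), "demo.db")
with open(path, "wb") as first, open(path, "rb") as second:
    first_got_lock = try_exclusive_lock(first)
    # A second, independent open of the same file cannot take the lock.
    second_got_lock = try_exclusive_lock(second)
    # Releasing the first lock makes the file lockable again.
    fcntl.flock(first.fileno(), fcntl.LOCK_UN)
    second_after_release = try_exclusive_lock(second)

print(first_got_lock, second_got_lock, second_after_release)
```

Locks taken this way are advisory: they only exclude other processes that also request the lock.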

Run REPL

source venv/bin/activate
python run_learndb.py repl

Run Tests

  • Run all tests: python -m pytest tests/*.py
  • Run btree tests:
    • python -m pytest -s tests/btree_tests.py # stdout shown
    • python -m pytest tests/btree_tests.py # stdout suppressed
  • Run end-to-end tests: python -m pytest -s tests/e2e_tests.py
  • Run end-to-end tests (employees): python -m pytest -s tests/e2e_tests_employees.py
    • Run a single employees test: python -m pytest -s tests/e2e_tests_employees.py -k test_equality_select
  • Run serde tests: ... serde_tests.py
  • Run language parser tests: ... lang_tests.py
  • Run specific test: python -m pytest tests.py -k test_name
  • Clear pytest cache: python -m pytest --cache-clear

References consulted

Project Management

  • imminent work/issues are tracked in tasks.md
  • long-term ideas are tracked in docs/future-work.md

Footnotes

  1. When evaluating the difference between two floats, e.g. 3.2 > 4.2, I consider the condition True if the difference between the two is some fixed delta. The accepted epsilon should scale with the magnitude of the number.

Contributors

  • spandanb
