libprz

A distributed bitmap index and query engine that can be embedded in, or overlaid on top of, other databases.

Example Schema

keyspace user_events
row_id -> [product, user]
columns -> event_name : counter

[prod1, user1] : {login : 1}
[prod1, user1] : {landing_page : 2}
[prod1, user1] : {signup : 1}

[prod1, user2] : {login : 1}
[prod1, user2] : {landing_page : 2}
[prod1, user2] : {signup : 1}

[prod1, user2] : {login : 2}
[prod1, user2] : {landing_page : 7}
[prod1, user2] : {signup : 2}
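
As a rough mental model of the schema above, each row id is a (product, user) pair and each column is a per-event counter. The sketch below is in-memory Python only; the names `user_events` and `record` are hypothetical and are not part of libprz.

```python
from collections import Counter, defaultdict

# Keyspace "user_events": row_id (product, user) -> {event_name: counter}
user_events = defaultdict(Counter)

def record(product, user, event_name, count=1):
    """Increment the counter column `event_name` for the row (product, user)."""
    user_events[(product, user)][event_name] += count

record("prod1", "user1", "login")
record("prod1", "user1", "landing_page", 2)
record("prod1", "user1", "signup")
```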

Indexes

  • Indexes are stored on disk in LevelDB
  • All index address spaces are 64-bit
  • Index chunks are 256 bytes
  • Offsets are 16 bytes (64 bits)
  • If the index value can't be transformed to a 64-bit unsigned integer, it must be hashed with CityHash.
  • In LevelDB all keys share a single namespace, so a byte sequence must be prepended to each key to partition keyspaces. The partition prefix is 16 bits (2 bytes).
  • Fields [prod, event] are encoded as [bytes], where the first uint16 is the size of the byte sequence
  • Because of hash collisions we can't rely on hash values alone to find unique IDs. Any time a hash is employed we must keep a dictionary mapping hash values to the actual values.
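
The encoding rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not libprz's code: CityHash is not in the Python standard library, so a truncated BLAKE2b digest stands in for it; big-endian byte order is assumed so that keys sort numerically under LevelDB's default bytewise comparator; and all function and class names are hypothetical.

```python
import hashlib
import struct

def to_u64(value):
    """Return value as a 64-bit unsigned int, hashing it when it is not one already."""
    if isinstance(value, int) and 0 <= value < 2**64:
        return value
    data = value if isinstance(value, bytes) else str(value).encode("utf-8")
    # Stand-in hash (assumption): BLAKE2b truncated to 8 bytes, NOT CityHash.
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

def partition_prefix(partition_id):
    """2-byte (16-bit) prefix that separates keyspaces inside one LevelDB namespace."""
    return struct.pack(">H", partition_id)

def encode_field(raw):
    """Length-prefixed field: a uint16 size followed by the raw bytes."""
    if len(raw) > 0xFFFF:
        raise ValueError("field longer than a uint16 length prefix allows")
    return struct.pack(">H", len(raw)) + raw

class HashDictionary:
    """Maps each hash back to the actual values, so collisions can be resolved."""

    def __init__(self):
        self._by_hash = {}

    def add(self, value):
        h = to_u64(value)
        self._by_hash.setdefault(h, set()).add(value)
        return h

    def lookup(self, h):
        return self._by_hash.get(h, set())
```

Two distinct values that collide end up in the same set under `lookup`, which is where a caller would compare against the actual value instead of trusting the hash.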

hash -> table row_id inverted index

[partition][table][hash_val][row_id] : NULL
[2][bytes][16][bytes] : NULL
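
Because the row id is the last key component and the value is empty, all rows for one hash can be found with a prefix scan. The sketch below imitates LevelDB's ordered key iteration with a sorted Python list. It assumes big-endian packing and an 8-byte encoding of the 64-bit hash (the layout above lists the field width as 16); `inverted_key` and `row_ids_for_hash` are hypothetical names.

```python
import bisect
import struct

def inverted_key(partition, table, hash_val, row_id):
    """Build [partition][table][hash_val][row_id]; the stored value is empty (NULL)."""
    return (struct.pack(">H", partition)             # 2-byte partition prefix
            + struct.pack(">H", len(table)) + table  # length-prefixed table name
            + struct.pack(">Q", hash_val)            # 64-bit hash, 8 bytes (assumption)
            + row_id)                                # row id bytes

def row_ids_for_hash(sorted_keys, partition, table, hash_val):
    """Prefix-scan sorted keys and return the row ids stored under one hash value."""
    prefix = inverted_key(partition, table, hash_val, b"")
    start = bisect.bisect_left(sorted_keys, prefix)
    row_ids = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are ordered, so the first mismatch ends the scan
        row_ids.append(key[len(prefix):])
    return row_ids
```

The row ids returned this way are candidates only; a hash collision can place unrelated values under the same prefix, so each candidate still has to be checked against the hash dictionary.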

Range/equality encoded bitslice index

[partition][field1 bytes][value][offset] : [index chunk 0x00]
[partition][field1 bytes][value][offset] : [index chunk 0x01]

[partition][field2 bytes][value][offset] : [index chunk 0x01]
[partition][field2 bytes][value][offset] : [index chunk 0x03]

[2][bytes][16][16] : [256]

Contributors

mstump
