PersistentCollections.jl

Julia Dict and Set data structures safely persisted to disk.

All collections are backed by LMDB, a very fast B-Tree based embedded key-value database with ACID guarantees. As with other B-Tree based databases, reads are faster than writes. However, write performance is still decent (expect 1k-10k transactions per second).

Care was taken to make the data structures thread-safe. LMDB handles most of the locking well; the package only has to lock the LMDB.Environment exclusively while writing, so that multiple threads cannot open multiple write transactions at once (which would deadlock).
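
The writer-serialization idea described above can be sketched with Julia's Base alone. This is an illustration under stated assumptions, not the package's implementation: `ToyEnv` and `write!` are hypothetical names, and a plain `Dict` stands in for LMDB.

```julia
# Hypothetical stand-in for an LMDB environment: one lock guards all writes.
struct ToyEnv
    wlock::ReentrantLock
    data::Dict{String,String}
end
ToyEnv() = ToyEnv(ReentrantLock(), Dict{String,String}())

# Every write takes the environment-wide lock, so at most one
# "write transaction" is open at any time.
function write!(env::ToyEnv, key::String, val::String)
    lock(env.wlock) do
        env.data[key] = val
    end
    return val
end

toy_env = ToyEnv()

# Many tasks write concurrently; the lock serializes them.
@sync for i in 1:100
    Threads.@spawn write!(toy_env, "key$i", "val$i")
end

@assert length(toy_env.data) == 100
```

Readers are not locked in this sketch, which mirrors the point above that only writers need exclusive access.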

Quick Start

  1. Install this package:
    import Pkg
    Pkg.add(url="https://github.com/blenessy/PersistentCollections.jl.git")  # unregistered package: pass the repository via the url keyword
  2. Create an LMDB.Environment in a directory called data (in your current working directory):
    using PersistentCollections
    env = LMDB.Environment("data")
  3. Create an AbstractDict in your LMDB environment:
    dict = PersistentDict{String,String}(env)
  4. Use it as any other dict:
    dict["foo"] = "bar"
    @assert dict["foo"] == "bar"
    @assert collect(keys(dict)) == ["foo"]
    @assert collect(values(dict)) == ["bar"]
  5. (Optional) Note the asymmetric performance characteristics of an LMDB (B-Tree) based database:
    @time dict["bar"] = "baz";  # Writes to LMDB (B-Tree) are relatively slow
    @time dict["bar"];          # Reads are very fast though :)

User Guide

Dynamic types

It is possible to create a persistent collection of type Any, although some methods will not be able to convert values to the correct type because no type metadata is stored in the database. Most notably, the getindex method (e.g. dict["foo"]) will not return a converted value. To work around this limitation, use the get method, which takes a default value. The type of the default value (if other than nothing) is used to convert the stored value to the desired type.

env = LMDB.Environment("data")
dict = PersistentDict{Any,Any}(env)
dict["foo"] = "bar"
dict["foo"]                  # PersistentCollections.LMDB.MDBValue{Nothing}(0x0000000000000003, Ptr{Nothing} @0x000000012c806ffd, nothing)
get(dict, "foo", "")         # "bar"
convert(String, dict["foo"]) # "bar"
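
How a default value's type can drive the conversion is sketched below using only Julia's Base. It assumes a hypothetical store that keeps raw bytes without type metadata; `store`, `decode` and `typed_get` are illustrative names and not part of PersistentCollections.jl.

```julia
# Hypothetical store that keeps only raw bytes, with no type metadata.
store = Dict{String,Vector{UInt8}}()
store["foo"] = Vector{UInt8}(codeunits("bar"))
store["n"] = collect(reinterpret(UInt8, [42]))

# Decoders selected by the requested type.
decode(::Type{String}, bytes::Vector{UInt8}) = String(copy(bytes))
decode(::Type{T}, bytes::Vector{UInt8}) where {T<:Number} = reinterpret(T, bytes)[1]

# The type of `default` tells us how to decode the stored bytes.
function typed_get(store, key, default::T) where {T}
    haskey(store, key) || return default
    return decode(T, store[key])
end

@assert typed_get(store, "foo", "") == "bar"
@assert typed_get(store, "n", 0) == 42
@assert typed_get(store, "missing", "fallback") == "fallback"
```

With a default of nothing there is no useful type to dispatch on, which is why the text above singles that case out.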

Multiple persistent collections in the same LMDB Environment

Several persistent collections can share one LMDB environment, which is useful if you need transactional consistency between them:

  1. Create your LMDB.Environment with "named database" support by specifying the number of persistent collections you want with the maxdbs keyword argument:
    env = LMDB.Environment("data", maxdbs=2)
  2. Instantiate your persistent collections, giving each an id that is unique within the LMDB environment:
    dict1 = PersistentDict{String,String}(env, id="mydict1")
    dict2 = PersistentDict{String,Int}(env, id="mydict2")

Danger Zone: Manual sync writes to disk

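You can expect a significant increase in write throughput if you are willing to risk losing your last written transactions. Please note that database integrity is not in danger here (no risk of corruption); according to the LMDB documentation, this holds as long as the filesystem preserves write order and MDB_WRITEMAP is not used.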

unsafe_env = LMDB.Environment("data", flags=LMDB.MDB_NOSYNC)
unsafe_dict = PersistentDict{String,String}(unsafe_env)
flush(unsafe_env) do 
    unsafe_dict["foo"] = "bar"
    unsafe_dict["foo"] = "baz"
end # <== data is flushed to disk here

This is equivalent to:

unsafe_env = LMDB.Environment("data", flags=LMDB.MDB_NOSYNC)
unsafe_dict = PersistentDict{String,String}(unsafe_env)
try
    unsafe_dict["foo"] = "bar"
    unsafe_dict["foo"] = "baz"
finally
    flush(unsafe_env)
end
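
The equivalence of the two forms above follows from how a do-block method is typically written. The sketch below shows one way to define it, assuming a hypothetical `SketchEnv` type that buffers writes in memory; it is not the package's actual `flush` method.

```julia
# Hypothetical environment: writes are buffered until flushed.
mutable struct SketchEnv
    pending::Vector{Pair{String,String}}  # writes not yet synced
    synced::Dict{String,String}           # what is "on disk"
end
SketchEnv() = SketchEnv(Pair{String,String}[], Dict{String,String}())

# Plain flush: push all pending writes to "disk".
function Base.flush(env::SketchEnv)
    for (k, v) in env.pending
        env.synced[k] = v
    end
    empty!(env.pending)
    return nothing
end

# do-block form: run `f`, then always flush, even if `f` throws.
function Base.flush(f::Function, env::SketchEnv)
    try
        return f()
    finally
        flush(env)
    end
end

sketch_env = SketchEnv()
flush(sketch_env) do
    push!(sketch_env.pending, "foo" => "bar")
    push!(sketch_env.pending, "foo" => "baz")
end

@assert sketch_env.synced["foo"] == "baz"
@assert isempty(sketch_env.pending)
```

Because the flush sits in a finally clause, writes made before an exception are still synced.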

Running Tests

make test

Analyzing Code Coverage

make coverage

Benchmarks

make bench

Status

CI/CD

  • Travis CI integration
  • Coveralls integration (when public)
  • All platforms supported
  • Part of Julia Registry

PersistentDict

  • Optimised implementation
  • Thread Safe
  • MDB_NOSYNC support
  • Named database support
  • Manual flush (sync) to disk

PersistentSet

  • Implemented
  • Thread Safe
  • MDB_NOSYNC support
  • Named database support
  • Manual flush (sync) to disk

Credits

Lots of the LMDB wrapping magic was pinched from wildart/LMDB.jl, which deserves a lot of credit.
