emdb's Introduction

EMDB

EMDB is an Erlang NIF library for the Memory-Mapped Database, also known as MDB.

The main purpose of this package is to provide a very fast Riak backend.

This module can also be used as a general key-value store.

Requirements

  • Erlang R14B04+
  • GCC 4.2+ or MS Visual Studio 2010+

Build

$ make

API

The following functions are implemented:

  • open/1: equivalent to emdb:open(DirName, 10485760).
  • open/2: equivalent to emdb:open(DirName, MapSize, 0).
  • open/3: creates a new MDB database. This call also re-opens an already existing one. Arguments are:
    • DirName: database directory name
    • MapSize: database map size in bytes (see map.hrl)
    • EnvFlags: database environment flags (see map.hrl). The possible values are defined in emdb.hrl.
  • close/0: closes the database.
  • put/2: inserts Key with value Val into the database. Assumes that the key is not present; 'key_exist' is returned otherwise.
  • get/1: retrieves the value stored with Key in the database.
  • del/1: removes the key-value pair with key Key from the database.
  • update/2: inserts Key with value Val into the database if the key is not present, otherwise updates Key to value Val.
  • drop/0: deletes all key-value pairs in the database.

Usage

$ make

$ ./start.sh

%% create a new database
1> {ok, Handle} = emdb:open("/tmp/emdb1").

%% insert the key <<"a">> with value <<"1">>
2> ok = Handle:put(<<"a">>, <<"1">>).

%% try to re-insert the same key <<"a">>
3> key_exist = Handle:put(<<"a">>, <<"2">>).

%% add a new key-value pair
4> ok = Handle:put(<<"b">>, <<"2">>).

%% look up a non-existent key <<"c">>
5> none = Handle:get(<<"c">>).

%% retrieve the value for key <<"b">>
6> {ok, <<"2">>} = Handle:get(<<"b">>).

%% retrieve the value for key <<"a">>
7> {ok, <<"1">>} = Handle:get(<<"a">>).

%% delete key <<"b">>
8> ok = Handle:del(<<"b">>).

%% look up the deleted key <<"b">>
9> none = Handle:get(<<"b">>).

%% delete a non-existing key <<"z">>
10> none = Handle:del(<<"z">>).

%% ensure key <<"a">>'s value is still <<"1">>
11> {ok, <<"1">>} = Handle:get(<<"a">>).

%% update the value for key <<"a">>
12> ok = Handle:update(<<"a">>, <<"7">>).

%% check the new value for key <<"a">>
13> {ok, <<"7">>} = Handle:get(<<"a">>).

%% delete all key-value pairs in the database
14> ok = Handle:drop().

%% try to retrieve the value for key <<"a">>
15> none = Handle:get(<<"a">>).

%% close the database
16> ok = Handle:close().

...

17> q().  

Note: The code below creates a new database with an 80 GB MapSize, avoids fsync after each commit (for maximum speed), and uses the experimental MDB_FIXEDMAP flag.

{ok, Handle} = emdb:open("/tmp/emdb2", 85899345920, ?MDB_NOSYNC bor ?MDB_FIXEDMAP).
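The MapSize values in this README are plain byte counts: 10485760 is 10 * 1024 * 1024 and 85899345920 is 80 * 1024 * 1024 * 1024. A minimal sketch of two conversion helpers follows. The module name emdb_mapsize and the functions mib/1 and gib/1 are hypothetical and are not part of emdb.

```erlang
-module(emdb_mapsize).
-export([mib/1, gib/1]).

%% Hypothetical helpers, not part of emdb.
%% Both return a size in bytes, using binary multiples (1024-based).
%% Erlang integers are arbitrary precision, so large sizes do not overflow.

%% N mebibytes expressed in bytes.
mib(N) when is_integer(N), N >= 0 ->
    N * 1024 * 1024.

%% N gibibytes expressed in bytes.
gib(N) when is_integer(N), N >= 0 ->
    N * 1024 * 1024 * 1024.
```

With such a module compiled, the call above could be written as emdb:open("/tmp/emdb2", emdb_mapsize:gib(80), ?MDB_NOSYNC bor ?MDB_FIXEDMAP). Macros are not expanded in the Erlang shell, so a line using ?MDB_NOSYNC has to live in a module that includes emdb.hrl.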

Performance

For maximum speed, this library uses only binaries for both keys and values.

See the impressive microbenchmark results against:

  • Google's LevelDB
  • SQLite
  • Kyoto TreeDB
  • BerkeleyDB

MDB performs better on 64-bit architectures.

Supported OSes

Should work on 32/64-bit architectures:

  • Linux
  • OSX
  • FreeBSD
  • Windows

TODO

  • Unit tests
  • PropEr testing
  • Bulk "writing"

Volunteers are always welcome!

Status

Work in progress. Don't use it in production!

LICENSE

EMDB is Copyright (C) 2012 by Aleph Archives, and released under the OpenLDAP License.

emdb's People

Contributors

alepharchives

emdb's Issues

Clustering?

Hello, what are the options for scaling out LMDB across several clustered computers? Could something like Scalaris be used?
Have fun

Is it really safe to read pointers from get after the txn has finished?

Just been reading the code. In your c_src driver, get creates a txn, does the get, and then aborts the txn. The pointers to the data returned from the get are then copied to the ErlNifBinary, which is then sent back to Erlang-land.

It's not clear from the mdb docs, but I'd be surprised if those pointers are safe after the txn has been aborted. Having read various bits about how mdb works, once the txn has been finished, there's nothing to stop someone else coming in and modifying those locations, and given how mdb reuses pages, it could well end up with utterly different key-value pairs in those locations.

I think you really have to expose the whole txn api within erlang and if the values really need to exist outside the scope of the txn then you're going to have to copy. I think - I could be wrong though.

I also have some concerns about memory management of those binaries. The value locations should not be freed by erlang when the binary is GC'd as they're pointers straight into the mmap used by mdb. Instead they should just be forgotten about. I think you'd have to use enif_make_resource_binary for that so you can specify a noop dtor. Again I could be wrong - I'm curious as to whether you've considered these issues and found they are safe as you've written them.

map / fold?

Hello,

Could this driver expose a map or fold interface on top of the native binary put/get/del? Could a pure Erlang mapping or folding function be applied over some key ranges?

Thanks for sharing this module
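
The functions listed in the API section do not include any iteration, so a true range scan would need additional support from the NIF. As a rough sketch only, a fold over a list of keys that the caller already knows can be built in pure Erlang on top of get/1. The module emdb_fold and the function fold_keys/4 are hypothetical and are not part of emdb. The lookup is passed in as a fun so that the helper does not depend on how Handle is implemented.

```erlang
-module(emdb_fold).
-export([fold_keys/4]).

%% Hypothetical helper, not part of emdb.
%% GetFun is assumed to behave like get/1 in the Usage section:
%% it returns {ok, Val} for a stored key and none for a missing one.
%% Fun is called as Fun(Key, Val, Acc) for every key that is found.
fold_keys(GetFun, Fun, Acc0, Keys) ->
    lists:foldl(
      fun(Key, Acc) ->
              case GetFun(Key) of
                  {ok, Val} -> Fun(Key, Val, Acc);
                  none      -> Acc   % skip keys that are not stored
              end
      end,
      Acc0,
      Keys).
```

For example, emdb_fold:fold_keys(fun(K) -> Handle:get(K) end, fun(_K, V, Sum) -> Sum + byte_size(V) end, 0, [<<"a">>, <<"b">>]) would sum the sizes of the values stored under the two keys. Each lookup is a separate get call, so the fold does not see a single consistent snapshot of the database.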
