forestdb's Introduction

ForestDB

ForestDB is a key-value storage engine developed by the Couchbase Caching and Storage Team. Its main index structure, called HB+-Trie, is a trie built from hierarchical B+-Trees. The ForestDB paper has been published in IEEE Transactions on Computers.

Compared with traditional B+-Tree based storage engines, ForestDB shows significantly better read and write performance with less storage overhead. ForestDB has been tested on various server OS environments (CentOS, Ubuntu, Mac OS X, Windows) and mobile OSs (iOS, Android). The test coverage stats for ForestDB are available in the ForestDB Code Coverage Report.

The ForestDB benchmark program is also available for performance comparisons with other key-value storage engines.

Please visit the ForestDB wiki for more details.

Main Features

  • Keys and values are treated as arbitrary binary data.
  • Applications can supply a custom compare function to support a customized key order.
  • A value can be retrieved by its sequence number or disk offset in addition to a key.
  • Write-Ahead Logging (WAL) and its in-memory index are used to reduce the main index lookup / update overhead.
  • Multi-Version Concurrency Control (MVCC) support and append-only storage layer.
  • Multiple snapshot instances can be created from a given ForestDB instance to provide different views of the database.
  • Rollback is supported to revert the database to a specific point.
  • Ranged iteration by keys or sequence numbers is supported for a partial or full range lookup operation.
  • Manual or auto compaction can be configured per ForestDB database file.
  • Transactional support with read_committed or read_uncommitted isolation level.
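
The custom compare function mentioned in the feature list can be illustrated with a small standalone comparator. This is only a sketch: the callback shape used here (two key pointers with their lengths, returning a negative, zero, or positive value) is an assumption about what such a callback typically looks like, and `compare_keys_ci` is a hypothetical name. Check forestdb.h for the exact typedef and for how a comparator is registered.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical comparator: orders keys case-insensitively, byte by byte.
 * Returns <0, 0, or >0, like memcmp. When one key is a prefix of the
 * other, the shorter key sorts first. */
static int compare_keys_ci(void *a, size_t len_a, void *b, size_t len_b)
{
    const unsigned char *ka = (const unsigned char *)a;
    const unsigned char *kb = (const unsigned char *)b;
    size_t min_len = len_a < len_b ? len_a : len_b;

    for (size_t i = 0; i < min_len; i++) {
        int ca = tolower(ka[i]);
        int cb = tolower(kb[i]);
        if (ca != cb) {
            return ca < cb ? -1 : 1;
        }
    }
    if (len_a == len_b) {
        return 0;
    }
    return len_a < len_b ? -1 : 1;
}
```

With a comparator like this, the keys "Apple" and "apple" would be treated as the same key, so it only suits applications that really want case-insensitive ordering.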

How to build

See INSTALL.MD

How to Use

Please refer to Public APIs and tests/fdb_functional_test.cc in the ForestDB source directory.

How to contribute code

  1. Sign the Couchbase Contributor License Agreement
  2. Submit code changes via either a GitHub PR or Gerrit (for Gerrit usage, see Instructions from the couchbase-spark-connector project).

Note regarding master branch

The 'master' git branch of forestdb contains a number of changes which ultimately were not kept for production builds of Couchbase Server. Production builds were kept on an earlier release branch named 'watson' corresponding to Couchbase Server 4.5. Couchbase Server 5.0, 5.1, 5.5, and 6.0 added some bug fixes on branches made from 'watson', namely 'spock' and 'vulcan'. For Couchbase Server 6.5 and forward, a new branch 'cb-master' was created from the then-current 'vulcan' branch.

'cb-master' should be seen as the equivalent of 'master' for all Couchbase Server production build purposes. Any additional production bug fixes will go there, and release-specific branches will be made from there when necessary.

The current 'master' branch is left untouched and unsupported, for use by community users who may depend on the work done there.

forestdb's People

Contributors

abhinavdangeti, avsej, borrrden, ceejatec, chippiewill, chiyoung, daverigby, greensky00, hisundar, jimwwalker, kbhute-ibm, snej, t3rm1n4l, tahmmee, tleyden, trondn, vmx, wurikiji

forestdb's Issues

Some questions and Suggestions about ForestDB

ForestDB is a very good key-value storage component. Some questions and suggestions about ForestDB:

  1. Calling the library from a mix of C and C++ code is not supported on MSVC, because fdb_types.h contains:

    #ifndef __cplusplus
    #pragma once
    #define false (0)
    #define true (1)
    #define bool int
    #endif

     sizeof(bool) == 1 under the C++ compiler, but sizeof(bool) == 4 under the C compiler. Because the byte alignment of the structures then differs, the calling code must be compiled with the C++ compiler just like ForestDB; otherwise the fdb_get_default_config() and fdb_get_default_kvs_config() APIs return incorrect results.

  2. Some API signatures are not appropriate:

    LIBFDB_API
    fdb_config fdb_get_default_config(void)

     should change to

    LIBFDB_API
    fdb_status fdb_get_default_config(fdb_config* fconfig)

     and

    LIBFDB_API
    fdb_kvs_config fdb_get_default_kvs_config(void)

     should change to

    LIBFDB_API
    fdb_status fdb_get_default_kvs_config(fdb_kvs_config* config)

     and

    fdb_status fdb_doc_update(fdb_doc **doc,...

     should change to

    fdb_status fdb_doc_update(fdb_doc *doc,...

     That would be more reasonable.

  3. After enabling stale block reuse, repeating the cycle full-table write -> commit -> full-table update -> commit -> full-table delete -> commit, and so on, leaves stale block reuse ineffective. It does work if the commit granularity is controlled.

  4. With the DOCIO_BLOCK_ALIGN macro enabled in option.h, fdb_get_kv() fails to retrieve some records.

  5. When num_keeping_headers is set to 1 and 1,000,000 records are written (key: int, value: 1 KB), set/get/update work, but deletion fails with:

    [FDB ERR: -61] Read error: BID 550859 in a database file './tag-table.db' is not read correctly: only -61 bytes read
    (hex) 0x88d2400, 4096 (0x1000) bytes
    0000 00 20 00 00 00 04 3d dc ff ff ff ff ff ff ff ff . ....=.........
    ...
    ...
    0ff0 ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 ff ................
    [FDB ERR: -61] Failed to read a database header with block id 550859 in a database file './tag-table.db'
    [FDB ERR: -5] Read error: read offset 14757395258967638016 exceeds the file's current offset 2256322560 in a database file './tag-table.db'
    [FDB ERR: -5] Failed to read a database header with block id 14757395258967641292 in a database file './tag-table.db'

     No error occurs when num_keeping_headers > 1.
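
The layout mismatch described in item 1 can be reproduced without ForestDB. The sketch below is not the real fdb_config: `config_as_c` and `config_as_cpp` are hypothetical stand-ins that model the same two fields as seen by a C translation unit (where the macro turns bool into int) and by a C++ translation unit (where bool is typically one byte, modeled here with unsigned char).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for a config struct as seen by a C translation unit after
 * "#define bool int": the flag occupies sizeof(int) bytes. */
struct config_as_c {
    int flag;            /* "bool" after the macro */
    unsigned char level;
};

/* The same fields as seen by a C++ translation unit, where the built-in
 * bool is typically 1 byte (modeled here with unsigned char). */
struct config_as_cpp {
    unsigned char flag;  /* native C++ bool, usually 1 byte */
    unsigned char level;
};

/* Prints both layouts so the mismatch is visible. */
static void print_layouts(void)
{
    printf("C view:   size=%zu, offsetof(level)=%zu\n",
           sizeof(struct config_as_c),
           offsetof(struct config_as_c, level));
    printf("C++ view: size=%zu, offsetof(level)=%zu\n",
           sizeof(struct config_as_cpp),
           offsetof(struct config_as_cpp, level));
}
```

Calling print_layouts() from a small main() shows that the second field sits at a different offset in each view, which is why a struct returned by value from a library built one way is misread by a caller built the other way.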

Tag a New Release

I know that the project is dormant and development has stagnated, but it'd be awesome if y'all could tag a new release at whatever commit you feel is reasonable. I'd like to include this as a dependency in a project, but I'd hate to have to refer to either HEAD or a specific commit hash directly.

Thanks!

Crash when parsing malformed FDB

Hi folks,

An interesting crash was found while fuzz testing the forestdb_dump binary; it can be triggered by a malformed database file. Although this malformed file only crashes the program as-is, it could potentially be crafted further into a security issue, where such files compromise the process's memory by exploiting the memory corruption. It's recommended to harden the code against these kinds of bugs, as doing so could greatly mitigate this issue and even future bugs.

crash.fdb.txt

(renamed to .txt for github)

$ forestdb_dump crash.fdb
Segmentation fault (core dumped)

$ gdb -q forestdb_dump
Reading symbols from forestdb_dump...

(gdb) r crash.fdb
Starting program: forestdb_dump crash.fdb
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff79f2700 (LWP 1245583)]
[New Thread 0x7ffff71f1700 (LWP 1245584)]
[New Thread 0x7ffff69f0700 (LWP 1245585)]
[New Thread 0x7ffff61ef700 (LWP 1245586)]

Thread 1 "forestdb_dump" received signal SIGSEGV, Segmentation fault.
_sb_read_given_no (file=0x5555555df630, sb_no=<optimized out>, sb=<optimized out>, log_callback=<optimized out>) at forestdb/src/superblock.cc:1520
1520	        sb->bmp_doc_offset[i] = _endian_decode(enc_u64);

(gdb) bt
#0  _sb_read_given_no (file=0x5555555df630, sb_no=<optimized out>, sb=<optimized out>, log_callback=<optimized out>) at forestdb/src/superblock.cc:1520
#1  0x00005555555a7aa7 in sb_read_latest (file=0x5555555df630, sconfig=..., log_callback=<optimized out>) at forestdb/src/superblock.cc:1674
#2  0x000055555557d75e in filemgr_open (filename=filename@entry=0x7fffffffc5c0 "crash.fdb", ops=<optimized out>, config=config@entry=0x7fffffffc270, 
    log_callback=log_callback@entry=0x5555555db048) at forestdb/src/filemgr.cc:1005
#3  0x0000555555584036 in _fdb_open (handle=handle@entry=0x5555555daee0, filename=filename@entry=0x7fffffffe6b0 "crash.fdb", filename_mode=filename_mode@entry=FDB_VFILENAME, 
    config=config@entry=0x7fffffffdf10) at forestdb/src/forestdb.cc:1689
#4  0x0000555555585ae1 in fdb_open (ptr_fhandle=0x7fffffffe160, filename=0x7fffffffe6b0 "crash.fdb", fconfig=0x7fffffffe1a0)
    at forestdb/src/forestdb.cc:833
#5  0x0000555555563654 in process_file (opt=0x7fffffffe2e0) at forestdb/tools/forestdb_dump.cc:254
#6  0x0000555555561f75 in main (argc=2, argv=0x7fffffffe418) at forestdb/tools/forestdb_dump.cc:390

(gdb) i r
rax            0x68                104
rbx            0x5555555dfc00      93824992803840
rcx            0x5555555e0d30      93824992808240
rdx            0x88f46760570fd337  -8578117726758776009
rsi            0x652e23e27000      111248844943360
rdi            0x0                 0
rbp            0x7ffffffd4b20      0x7ffffffd4b20
rsp            0x7ffffffd4850      0x7ffffffd4850
r8             0x0                 0
r9             0x0                 0
r10            0x5555555ea000      93824992845824
r11            0xfffffffffffff000  -4096
r12            0xe744e44a068       15892692115560
r13            0x5555555df630      93824992802352
r14            0x7ffffffd48b8      140737488177336
r15            0xdeadcafebeefc002  -2401039830844719102
rip            0x5555555a7783      0x5555555a7783 <_sb_read_given_no(filemgr*, size_t, superblock*, err_log_callback*)+883>
eflags         0x10202             [ IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0

(gdb) x/i $rip
=> 0x5555555a7783 <_sb_read_given_no(filemgr*, size_t, superblock*, err_log_callback*)+883>:	mov    %rdx,-0x68(%r8,%rax,1)

(gdb) exploitable
Description: Access violation near NULL on destination operand
Short description: DestAvNearNull (15/22)
Hash: d6199a1b37a756f3d37f258a8faaa290.ecc8eda54691748cb17fcce5cae118bb
Exploitability Classification: PROBABLY_EXPLOITABLE
Explanation: The target crashed on an access violation at an address matching the destination operand of the instruction. This likely indicates a write access violation, which means the attacker may control write address and/or value. However, it there is a chance it could be a NULL dereference.
Other tags: AccessViolation (21/22)
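
One general way to harden a decode loop like the one at superblock.cc:1520 is to validate any count taken from the file before writing through the destination pointer. The sketch below is not ForestDB code and does not claim to identify the root cause of this crash: `read_offsets_checked` and `decode_be64` are hypothetical helpers, and big-endian on-disk encoding is assumed for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decodes one big-endian 64-bit value from p. */
static uint64_t decode_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Hypothetical hardened reader: copies `count` encoded offsets from an
 * untrusted buffer into `out`. Returns 0 on success, and -1 if a
 * pointer is NULL or `count` exceeds what the input holds or what the
 * destination can store. Nothing is written on failure. */
static int read_offsets_checked(const uint8_t *buf, size_t buf_len,
                                uint64_t count,
                                uint64_t *out, size_t out_cap)
{
    if (buf == NULL || out == NULL) {
        return -1;
    }
    if (count > out_cap || count > buf_len / sizeof(uint64_t)) {
        return -1;
    }
    for (size_t i = 0; i < (size_t)count; i++) {
        out[i] = decode_be64(buf + i * sizeof(uint64_t));
    }
    return 0;
}
```

The key design choice is that the count is compared against both the remaining input and the destination capacity before the first write, so a corrupted count field produces an error code instead of an out-of-bounds store.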

Error building on OSX -- fatal error: 'snappy-c.h' file not found

Steps to build:

$ brew install cmake
$ brew install snappy # note, I didn't use "sudo", does that make a difference?
$ git clone <repo>
$ cd forestdb && mkdir build && cd build && cmake ../

Cmake output: https://gist.github.com/tleyden/555a6ea565f768c739f3

When I ran make all, I got an error:

[ 27%] Building CXX object CMakeFiles/docio_test.dir/src/docio.cc.o
/Users/tleyden/DevLibraries/forestdb/src/docio.cc:26:10: fatal error: 'snappy-c.h' file not found
#include "snappy-c.h"
         ^
1 error generated.
make[2]: *** [CMakeFiles/docio_test.dir/src/docio.cc.o] Error 1
make[1]: *** [CMakeFiles/docio_test.dir/all] Error 2
make: *** [all] Error 2

Full make all output: https://gist.github.com/tleyden/86c926a85adfd9d175da

I do have snappy-c.h in my /usr/local/Cellar dir:

$ find /usr/local/Cellar -iname "*snappy*"
/usr/local/Cellar/snappy
/usr/local/Cellar/snappy/1.1.1/include/snappy-c.h
....

but it doesn't seem to be found during the compilation. Is my brew misconfigured or do I need to do another step?

Add 'const' qualifiers in API

In the public API in forestdb.h, please add the const qualifier to function parameters that take pointers but do not modify the memory pointed to. For example,

fdb_status fdb_open(fdb_handle *handle, char *filename, fdb_config *config);

should be

fdb_status fdb_open(fdb_handle *handle, const char *filename, const fdb_config *config);
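
The benefit of the const-qualified form can be shown with a standalone sketch. `demo_config`, `demo_open`, and `DEFAULT_CONFIG` are hypothetical stand-ins, not part of the ForestDB API; they only illustrate what const-correct parameters allow a caller to do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a config struct; not the real fdb_config. */
typedef struct {
    size_t buffer_size;
} demo_config;

/* Const-qualified parameters document that neither the filename nor the
 * config is modified, and let callers pass string literals and const
 * objects without casts. Returns 0 on success, -1 on invalid input. */
static int demo_open(const char *filename, const demo_config *config)
{
    if (filename == NULL || config == NULL || strlen(filename) == 0) {
        return -1;
    }
    return config->buffer_size > 0 ? 0 : -1;
}

/* A shared, read-only default. Without the const qualifier on the
 * parameter, passing its address would require a cast. */
static const demo_config DEFAULT_CONFIG = { 4096 };
```

A call such as demo_open("test.fdb", &DEFAULT_CONFIG) then compiles cleanly in both C and C++, whereas the non-const signature rejects the string literal in C++ and the const config in both languages.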

Error building on Master branch

I ran the following steps on Ubuntu 18.04:

mkdir build
cd build
cmake ../
make all

This produces an error:

/forestdb/utils/debug.cc:88:5: error: ‘ucontext’ was not declared in this scope
     ucontext *u = (ucontext *)context;
     ^~~~~~~~
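
This error is commonly reported with newer glibc versions, where the context type is reachable only through the POSIX typedef `ucontext_t` and the bare name `ucontext` no longer resolves. The following is a minimal sketch of a signal handler written against the typedef; it is not the code from utils/debug.cc, and `on_signal`, `install_handler`, and `got_context` are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* for sigaction and ucontext_t */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static volatile sig_atomic_t got_context = 0;

/* SA_SIGINFO handler: the third argument points to a ucontext_t,
 * passed as void *. */
static void on_signal(int sig, siginfo_t *info, void *context)
{
    ucontext_t *u = (ucontext_t *)context;  /* not "ucontext *" */
    (void)sig;
    (void)info;
    got_context = (u != NULL);
}

/* Installs the handler for SIGUSR1; returns 0 on success. */
static int install_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}
```

After install_handler() succeeds, raise(SIGUSR1) runs the handler with a valid context pointer. Applied to the failing line, the equivalent change would be to spell the type as ucontext_t in both the declaration and the cast.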

build error for aarch64

orestdb/utils/debug.cc:93:57: error: no member named 'gregs' in 'mcontext_t'; did you mean 'regs'?
    unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_EIP];
                                                        ^~~~~
                                                        regs
/usr/include/aarch64-linux-gnu/sys/ucontext.h:55:34: note: 'regs' declared here
    unsigned long long int __ctx(regs)[31];
                                 ^
forestdb/utils/debug.cc:93:63: error: use of undeclared identifier 'REG_EIP'
    unsigned char *pc = (unsigned char *)u->uc_mcontext.gregs[REG_EIP];
                                                              ^
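
The failing line indexes gregs[REG_EIP], which exists only on 32-bit x86; the compiler output above shows that the aarch64 mcontext_t exposes regs instead. One possible direction is to select the program counter field per architecture. The sketch below assumes Linux with glibc, where the aarch64 context has a pc member and x86-64 uses gregs[REG_RIP]; `context_pc` is a hypothetical helper, not existing ForestDB code.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* glibc exposes REG_RIP / REG_EIP only with this */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns the program counter stored in a signal
 * context, or NULL when the context is missing or the architecture is
 * not handled here. */
static void *context_pc(void *context)
{
    ucontext_t *u = (ucontext_t *)context;
    if (u == NULL) {
        return NULL;
    }
#if defined(__linux__) && defined(__aarch64__)
    return (void *)(uintptr_t)u->uc_mcontext.pc;
#elif defined(__linux__) && defined(__x86_64__) && defined(REG_RIP)
    return (void *)(uintptr_t)u->uc_mcontext.gregs[REG_RIP];
#elif defined(__linux__) && defined(__i386__) && defined(REG_EIP)
    return (void *)(uintptr_t)u->uc_mcontext.gregs[REG_EIP];
#else
    return NULL;  /* unknown platform: caller must cope without a PC */
#endif
}
```

Returning NULL on unhandled platforms keeps the build working everywhere at the cost of less detailed crash reports there, which seems a reasonable trade-off for a debugging aid.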
