

pmemkv


⚠️ Discontinuation of the project

The pmemkv project will no longer be maintained by Intel.

  • Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.
  • Intel no longer accepts patches to this project.
  • If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.
  • You will find more information here.

Introduction

pmemkv is a local/embedded key-value datastore optimized for persistent memory. Rather than being tied to a single language or backing implementation, pmemkv provides different options for language bindings and storage engines.

For more information, including the C API and C++ API, see https://pmem.io/pmemkv. Documentation is available for every branch/release; the most recent documentation always tracks the master branch.

Latest releases can be found on the "releases" tab.

There is also a small helper library pmemkv_json_config provided. See its manual for details.

Table of contents

  1. Installation
  2. Language Bindings
  3. Storage Engines
  4. Benchmarks
  5. Contact us

Installation

The installation guide provides detailed instructions on how to build and install pmemkv from sources and how to build rpm and deb packages, and it explains the usage of experimental engines and pool sets.

Language Bindings

pmemkv is written in C/C++ and can be used from other languages as well: Java, Node.js, Python, and Ruby.

(pmemkv-bindings architecture diagram)

C/C++ Examples

Examples for C and C++ can be found within this repository in the examples directory.

Other Languages

The above-mentioned bindings are maintained in separate GitHub repositories, but are still kept in sync with the main pmemkv distribution.

Storage Engines

pmemkv provides multiple storage engines that share a common API, so every engine can be used with all language bindings and utilities. Engines are loaded by name at runtime.

| Engine Name | Description                                 | Experimental | Concurrent | Sorted | Persistent |
|-------------|---------------------------------------------|--------------|------------|--------|------------|
| cmap        | Concurrent hash map                         | No           | Yes        | No     | Yes        |
| vsmap       | Volatile sorted hash map                    | No           | No         | Yes    | No         |
| vcmap       | Volatile concurrent hash map                | No           | Yes        | No     | No         |
| csmap       | Concurrent sorted map                       | Yes          | Yes        | Yes    | Yes        |
| radix       | Radix tree                                  | Yes          | No         | Yes    | Yes        |
| tree3       | Persistent B+ tree                          | Yes          | No         | No     | Yes        |
| stree       | Sorted persistent B+ tree                   | Yes          | No         | Yes    | Yes        |
| robinhood   | Persistent hash map with Robin Hood hashing | Yes          | Yes        | No     | Yes        |

The production-quality engines are described in the libpmemkv(7) manual and the experimental ones are described in the ENGINES-experimental.md file.
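For illustration, opening an engine by name from the C++ API looks roughly like this. This is a minimal sketch using libpmemkv.hpp; the pool path and size are placeholders, and the exact creation flag differs between pmemkv releases ("force_create" in older ones, "create_if_missing" in newer ones):

#include <libpmemkv.hpp>
#include <string>

int main() {
    pmem::kv::config cfg;
    cfg.put_string("path", "/daxfs/kvfile");          // placeholder pool path
    cfg.put_uint64("size", 1024ULL * 1024 * 1024);    // 1 GiB, used only when creating the pool
    cfg.put_uint64("force_create", 1);                // "create_if_missing" in newer releases

    pmem::kv::db kv;
    // The engine is selected by name at runtime ("cmap" here).
    if (kv.open("cmap", std::move(cfg)) != pmem::kv::status::OK)
        return 1;

    kv.put("key1", "value1");
    std::string value;
    kv.get("key1", &value);
    kv.close();
    return 0;
}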

pmemkv also provides testing engines, which may be used in unit tests or for benchmarking application overhead:

| Engine Name | Description                                          | Experimental | Concurrent | Sorted | Persistent |
|-------------|------------------------------------------------------|--------------|------------|--------|------------|
| blackhole   | Accepts everything, returns nothing                  | No           | Yes        | No     | No         |
| dram_vcmap  | Volatile concurrent hash map placed entirely in DRAM | Yes          | Yes        | No     | No         |

Contributing a new engine is easy, so feel encouraged!

Benchmarks

Experimental benchmark based on leveldb's db_bench to measure pmemkv's performance is available here: https://github.com/pmem/pmemkv-bench (previously pmemkv-tools).

Contact us

If you have read the blog post and still have questions (especially about the discontinuation of the project), please contact us at the dedicated e-mail address: [email protected].


pmemkv's Issues

Batched updates

RocksDB has a Write operation where a batch can be passed in:

WriteBatch batch;
batch.Delete("key1");
batch.Put("key2", "value2");
s = db->Write(WriteOptions(), &batch);

Should we provide something similar? Where's the transaction boundary?
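For comparison, with the current pmemkv C++ API each operation is applied individually, so there is no single batch/transaction boundary yet. A minimal sketch, assuming an already-open pmem::kv::db:

#include <libpmemkv.hpp>

// Sketch only: each call below is its own atomic operation; nothing groups
// them into one failure-atomic batch, which is what this issue asks about.
pmem::kv::status apply_updates(pmem::kv::db &kv) {
    pmem::kv::status s = kv.remove("key1");
    if (s != pmem::kv::status::OK && s != pmem::kv::status::NOT_FOUND)
        return s;
    return kv.put("key2", "value2");
}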

Fast truncate operation

Efficient way to clear all persisted keys & their values

More important when using device DAX (#60), since formatting the entire device takes a long time. Truncating can be much faster since only the used portion of the device needs to be reset.

Use ASSERT_EQ over ASSERT_TRUE

We should be using ASSERT_EQ whenever it's relevant to see the values when a test fails. (That's not to say we shouldn't use ASSERT_TRUE at all, but be mindful that it doesn't provide much context on failure.)
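A small illustration of the difference (kv and expected are placeholder names), comparing what gtest prints when each form fails:

// On failure this only reports "Actual: false / Expected: true",
// with no hint of what Count() actually returned:
ASSERT_TRUE(kv->Count() == expected);

// On failure this reports both the expected and the actual value:
ASSERT_EQ(expected, kv->Count());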

pmkv_test should use ASSERT_TRUE, not assert

I'm guessing this was a copy/paste error by yours truly, but all gtests should be using ASSERT_TRUE for test assertions. A failed assert causes the entire test program to stop at that point, rather than just reporting that the single test failed and completing the rest of the test suite.
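The distinction, sketched with an illustrative test (kv and OK as used elsewhere in these issues):

TEST(KVTest, PutSmoke) {
    // A failed assert() calls abort() and kills the whole pmemkv_test binary:
    // assert(kv->Put("key1", "value1") == OK);

    // A failed ASSERT_TRUE marks only this test as failed and returns from it,
    // so the rest of the suite still runs:
    ASSERT_TRUE(kv->Put("key1", "value1") == OK);
}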

Expose recover method

Inner node rebalancing is not yet implemented, so is there a temporary workaround that could be made available? One idea is to expose the initial recovery method so that the inner nodes could be periodically rebuilt, but this is basically the same as closing and re-opening the database. Is there a cleaner way to handle this until rebalancing is automatic?

Inner node rebalancing

This should be done incrementally (or by a background thread) in place of whatever temporary mechanism is introduced by #9.

Rename MultiGet to GetList

The MultiGet method is patterned after RocksDB...but our API may grow to support different types of batched reads down the road, so Get, GetList, GetMap, etc. is probably a better convention.

Exists operation

Currently this is done by doing a get where the value is ignored, but this has the overhead of copying the value string. A dedicated operation that works only against the keys would avoid that. It could be implemented with a Bloom filter (which would be KeyMayExist) or by checking the inner/leaf nodes (which would be KeyExists).
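For reference, later pmemkv releases expose exactly this kind of key-only check as db::exists; a minimal sketch (kv is an open pmem::kv::db):

#include <libpmemkv.hpp>

bool key_exists(pmem::kv::db &kv, pmem::kv::string_view key) {
    // status::OK if the key is present, status::NOT_FOUND otherwise;
    // no value string is allocated or copied.
    return kv.exists(key) == pmem::kv::status::OK;
}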

Free key/value memory when marking slot inactive

Currently a delete sets the hash for a slot to zero but the actual key/value strings stored in persistent memory aren't freed. This was intentional for the early stage prototype (to benchmark the smallest transaction, which is flipping a single byte) -- and using an asynchronous thread to free persistent memory might have unintended consequences. Let's start with a simple version that doesn't leak (ie. freeing key/value as an inline part of the delete operation) and go from there.

Rename metadata method to analyze

Feedback on #34 in favor of renaming KVTree::Metadata to KVTree::Analyze:

  • 'Metadata' is not a verb, but all the other KVTree methods are actionable verbs
  • 'Analyze' better informs that this could be a long-running/blocking action

Add metadata method to public API

Currently KVTree has GetPath and GetSize methods to return metadata about the datastore, but this pattern will pollute the API if we add a bunch of these down the road. Instead let's replace those current methods with a single Metadata method that populates a struct with all those details.

struct KVTreeMetadata {                              // add
    string path;
    size_t size;
    size_t leaves;
    size_t nodes;
};

class KVTree {
  public:
    const string& GetPath();                         // remove
    const size_t GetSize();                          // remove
    void Metadata(KVTreeMetadata& metadata);         // add (populates the struct)
};

This will be very helpful for changes like #10 that affect internal structure and are easier to validate if more internal state can be exposed easily.
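A usage sketch of the proposed call (hypothetical, matching the declaration above):

KVTreeMetadata md;
kv->Metadata(md);    // populates all fields in one call instead of GetPath()/GetSize()
std::cout << md.path << ": " << md.size << " bytes, "
          << md.leaves << " leaves, " << md.nodes << " inner nodes\n";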

Power fail safety testing

Set up a long-running test that breaks the database in random places and verifies that all data is recovered properly afterward.

Versioning strategy

What if internal format ever changes? Do we need a version flag stashed somewhere?
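One possible approach, purely as a sketch with made-up names: keep a small header with a magic number and layout version at the start of the pool's root object and verify it whenever a store is opened:

#include <cstdint>
#include <stdexcept>

// Hypothetical layout header kept at the start of the persistent root object.
struct KVRootHeader {
    uint64_t magic;           // constant identifying a pmemkv pool
    uint32_t layout_version;  // bumped whenever the on-media format changes
};

constexpr uint64_t KV_MAGIC = 0x706D656D6B763031ULL;  // "pmemkv01"
constexpr uint32_t KV_LAYOUT_VERSION = 1;

void check_layout(const KVRootHeader &hdr) {
    if (hdr.magic != KV_MAGIC)
        throw std::runtime_error("not a pmemkv pool");
    if (hdr.layout_version != KV_LAYOUT_VERSION)
        throw std::runtime_error("unsupported pmemkv layout version");
}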

Add random insert/update/delete tests

We're testing the Get method for both sequential and random access, so let's do the same with the other operations. It's also worth checking some of our base assumptions (like inner node size) in the context of random operations. Also test missed get (worst-case) performance.

Convert build to produce/use shared library

The current build system compiles pmemkv sources directly into example, stress, and test programs, and links those programs with NVML shared libraries (libpmem and libpmemobj). Here's the current output:

224K pmemkv_example
229K pmemkv_stress
764K pmemkv_test (includes gtest)

Better to emit a shared library (libpmemkv.so) that bundles up our sources along with libpmem and libpmemobj. This will reduce the size of our test programs and mean that a developer using pmemkv only needs to include and manage libpmemkv as a dependency.

Both make and cmake builds should be converted to this style.

Here's the original build output for comparison:

radickin@radware-ubuntu:~/work/pmemkv$ make clean example
rm -rf /dev/shm/pmemkv /tmp/pmemkv pmemkv_example pmemkv_stress pmemkv_test
g++  src/pmemkv.cc src/pmemkv_example.cc -o pmemkv_example \
3rdparty/nvml/lib/libpmemobj.a 3rdparty/nvml/lib/libpmem.a -I3rdparty/nvml/src/include \
-O2 -std=c++11 -ldl -lpthread -lrt -std=c++11 -DOS_LINUX -fno-builtin-memcmp -march=native

radickin@radware-ubuntu:~/work/pmemkv$ make clean stress
rm -rf /dev/shm/pmemkv /tmp/pmemkv pmemkv_example pmemkv_stress pmemkv_test
g++  src/pmemkv.cc src/pmemkv_stress.cc -o pmemkv_stress \
3rdparty/nvml/lib/libpmemobj.a 3rdparty/nvml/lib/libpmem.a -I3rdparty/nvml/src/include \
-DNDEBUG -O2 -std=c++11 -ldl -lpthread -lrt -std=c++11 -DOS_LINUX -fno-builtin-memcmp -march=native

radickin@radware-ubuntu:~/work/pmemkv$ make clean test
rm -rf /dev/shm/pmemkv /tmp/pmemkv pmemkv_example pmemkv_stress pmemkv_test
g++  src/pmemkv.cc src/pmemkv_test.cc -o pmemkv_test \
3rdparty/nvml/lib/libpmemobj.a 3rdparty/nvml/lib/libpmem.a -I3rdparty/nvml/src/include \
3rdparty/gtest/src/gtest-all.cc -I3rdparty/gtest/include -I3rdparty/gtest \
-O2 -std=c++11 -ldl -lpthread -lrt -std=c++11 -DOS_LINUX -fno-builtin-memcmp -march=native

Miscellaneous cleanup

There are a few things held over from the original prototype that I'd hoped to clean up before this goes out to anybody externally.

Prefix compression

There are two levels possible -- at the leaf node, and at the inner nodes.

Process terminates if persistent allocation fails

If we run out of space, a special status code should be returned for the offending operation, rather than crashing the process.

radickin@radware-ubuntu:~/work/pmemkv$ make stress
g++  src/pmemkv.cc src/pmemkv_stress.cc -o pmemkv_stress \
3rdparty/nvml/lib/libpmemobj.a 3rdparty/nvml/lib/libpmem.a -I3rdparty/nvml/src/include \
-DNDEBUG -O2 -std=c++11 -ldl -lpthread -lrt -std=c++11 -DOS_LINUX -fno-builtin-memcmp -march=native
rm -rf /dev/shm/pmemkv
PMEM_IS_PMEM_FORCE=1 ./pmemkv_stress

Opening for writes
   in 18 ms
Inserting 1000000 values
terminate called after throwing an instance of 'nvml::transaction_alloc_error'
  what():  failed to allocate persistent memory array
Aborted (core dumped)
Makefile:24: recipe for target 'stress' failed
make: *** [stress] Error 134
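A rough sketch of the direction this could take: catch the allocation exception shown in the log above at the operation boundary and translate it into a status code. KVStatus/FAILED are illustrative names, and later libpmemobj-cpp releases spell the exception pmem::transaction_alloc_error:

KVStatus KVTree::Put(const string& key, const string& value) {
    try {
        // ... existing transactional insert of key/value ...
        return OK;
    } catch (const nvml::transaction_alloc_error&) {
        // The pool is out of space: report a status code to the caller
        // instead of letting the unhandled exception abort the process.
        return FAILED;
    }
}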

Make the library and tests less dependent on a specific revision of pmemobj

Relying on a specific version of pmemobj results in constructs like this in pmemkv_test:

const int LARGE_LIMIT = 6012299;

Since the cmake build already uses whatever version of pmemobj it finds via pkg-config, this can be a problem. Having the "wrong" version of pmemobj after doing a cmake build can result in errors such as:

$ ./pmemkv_test --gtest_filter=KVTest.LargeAscendingTest
Note: Google Test filter = KVTest.LargeAscendingTest
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from KVTest
[ RUN      ] KVTest.LargeAscendingTest
/home/tej/code/pmemkv/src/pmemkv_test.cc:747: Failure
Value of: kv->Put(istr, (istr + "!")) == OK
  Actual: false
Expected: true
out of memory
[  FAILED  ] KVTest.LargeAscendingTest (26001 ms)

Multi-language API proposal

What would the pmemkv API look like if it were designed for use by high-level languages? (Node, Java, Ruby, Python, etc) Let's do some prototyping to give a sense of what's possible, and end up with a concrete proposal.

Resolve overlap between cmake/make

The build system currently has duplicate logic for downloading 3rd party libraries and building the shared library and test programs. This logic appears in both CMakeLists.txt and the Makefile (specifically the thirdparty and sharedlib targets).

The installation section of the README (https://github.com/pmem/pmemkv#installation) should be updated accordingly to fit the new approach.

Options for caching of leaf attributes

Currently both keys and hashes are cached in the volatile leaf nodes, which gives the fastest performance but also leads to the highest DRAM usage. Options to avoid caching keys, or to avoid caching both keys and hashes (as per FPTree), would be useful to have in addition to the current default policy. Should this be done with a conditional define (to make it easy to seek out the best options to lock in) or with a configuration parameter that leaves more choice to the customer?

Download dependencies at build time

Currently pmemkv has two library dependencies -- gtest and nvml. We'd like to use stable versions of these rather than the latest build, and not have to include any code from those upstream projects in the pmemkv repo. Currently gtest code is included and the build relies on whatever version of NVML is installed on the system (which may not be a stable version).

Persistent pool size should be set via parameter

Size of the persistent pool was hardcoded in the initial prototype, so let's make this configurable.

  • How to specify? (parameter to constructor or other means?)
  • What is the valid range of sizes?
  • What if a different size is specified when opening an existing store?
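One way to answer the first question, sketched with a hypothetical signature (the default size and the note about existing pools are illustrative choices, not decisions recorded in this issue):

class KVTree {
  public:
    // Pool size becomes an explicit constructor parameter with a default,
    // instead of a hardcoded constant; when opening an existing store the
    // pool keeps its original size.
    explicit KVTree(const string& path,
                    size_t pool_size = 8ULL * 1024 * 1024 * 1024);   // 8 GiB
};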

Support binary-safe keys

Although pmemkv doesn't prevent binary data in key strings, we're using strcmp internally to compare keys, which gives incorrect results if a null character appears in the middle of a key.
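A minimal sketch of the kind of change implied here (the function name is illustrative): compare keys by explicit length with memcmp instead of strcmp, so an embedded null byte no longer truncates the comparison:

#include <cstring>
#include <string>

// Returns <0, 0 or >0 like strcmp, but is safe for keys containing '\0'.
int key_compare(const std::string& a, const std::string& b) {
    const size_t min_len = a.size() < b.size() ? a.size() : b.size();
    int r = std::memcmp(a.data(), b.data(), min_len);
    if (r != 0) return r;
    if (a.size() == b.size()) return 0;
    return a.size() < b.size() ? -1 : 1;
}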

Remove p<> wrappers for key/value strings

Since KVString manages memory internally (using pmemobj_tx_add_range_direct), we don't need the p<> wrappers for these fields in KVLeaf. This would eliminate all the get_ro and get_rw calls we're making.

The only tricky point is that we'll need to implement a KVString::swap method, something like this?

void KVString::swap(KVString* target) {
    if (target->str) {                                      // target is long string
        if (str) {                                          // local is long too
            target->str.swap(str);
        } else if (sso[0] == 0) {                           // local is empty
            str = target->str;
            target->str = nullptr;
            pmemobj_tx_add_range_direct(target->sso, 1);
            target->sso[0] = 0;
        } else {                                            // local is short
            str = target->str;
            target->str = nullptr;
            pmemobj_tx_add_range_direct(target->sso, SSO_SIZE);
            strcpy(target->sso, sso);                       // move short string to target
            pmemobj_tx_add_range_direct(sso, 1);
            sso[0] = 0;
        }
    } else {                                                // target is short string
        if (str) {                                          // local is long
            target->str = str;
            str = nullptr;
            pmemobj_tx_add_range_direct(sso, SSO_SIZE);
            strcpy(sso, target->sso);                       // move short string to local
            pmemobj_tx_add_range_direct(target->sso, 1);
            target->sso[0] = 0;
        } else if (sso[0] == 0) {                           // local is empty
            pmemobj_tx_add_range_direct(sso, SSO_SIZE);
            strcpy(sso, target->sso);                       // move short string to local
            pmemobj_tx_add_range_direct(target->sso, 1);
            target->sso[0] = 0;
        } else {                                            // local is short too
            char temp[SSO_SIZE];
            strcpy(temp, sso);                              // save local short string
            pmemobj_tx_add_range_direct(sso, SSO_SIZE);
            strcpy(sso, target->sso);
            pmemobj_tx_add_range_direct(target->sso, SSO_SIZE);
            strcpy(target->sso, temp);
        }
    }
}

Add decent README

See screedb README for inspiration

  • Pre-release software statement
  • Downloading and installing
  • Running tests
  • Related work (FPTree, nvml containers, pmse)
  • Architecture diagram?
  • Contributing (coding standards)

Not freeing nodes during shutdown

This leaks memory, but apparently not enough to crash the unit tests. It should be possible for an application to open/close multiple times without leaking any memory due to orphaned nodes. (Original benchmarks only opened the database once, so this didn't crop up early)

Zero-copy key rename operation

Since pmemkv uses a zero-copy strategy for splitting persistent leaves, can we do the same for key rename? Many other kv-stores implement rename as a remove followed by a put, but this is painful if we're copying large values to do so.

Known issues in CMake build

CMake is partially supported, with a few issues that would have to be resolved to use it officially:

  • The CMake build does not download third-party libraries (it is missing the make thirdparty logic)
  • Binaries and shared libraries from CMake build are larger than those produced by make

Document operational procedures

  • Using pmempool utilities
  • Taking a snapshot
  • Backing up a database / restoring from backup
  • Moving a database to a new machine
  • Migrating between versions

Thread-safe implementation

  • Lock eliding using TSX? (assuming that key collisions will be rare)
  • Background maintenance thread?
  • Lots of new tests, obviously

KVEmptyTest.SizeofTest failure

I hit one unit test failure in the latest pmemkv (actually, I noticed this failure a long time ago).
The size of KVInnerNode on my machine is indeed 112, since the struct is not packed; I'm not sure why the test expects 232.

sizeof(KVInnerNode) = 8 + 8 + 8 + 5*8 + 6*8 = 112

Error message:

[ RUN      ] KVEmptyTest.SizeofTest
pmemkv_test.cc:140: Failure
Value of: 232
Expected: sizeof(KVInnerNode)
Which is: 112
[  FAILED  ] KVEmptyTest.SizeofTest (0 ms)

Setup:

  - i7-4770 CPU
  - CentOS 7 with kernel 3.10.0-514.2.2.el7.x86_64
  - gcc version 4.8.5 20150623 (Red Hat 4.8.5-11) (GCC) 

Are opened/closed counts necessary?

These are being maintained but not used except to log their values. If there isn't any special logic that actually triggers on a count mismatch, then maybe they should be removed?
