Dash: Scalable Hashing on Persistent Memory

A persistent-memory-friendly hashing index, published at VLDB 2020.

More details are described in our VLDB paper below. If you use our work, please cite:

Baotong Lu, Xiangpeng Hao, Tianzheng Wang, Eric Lo:
Dash: Scalable Hashing on Persistent Memory. 
PVLDB 13(8): 1147-1161 (2020)

What's included

  • Dash EH - Proposed Dash extendible hashing
  • Dash LH - Proposed Dash linear hashing
  • CCEH - PMDK patched CCEH variant used in our benchmark
  • Level Hashing - PMDK patched level hashing variant used in our benchmark
  • Mini benchmark framework
  • Example program - how to integrate Dash into your application

Fully open-sourced under MIT license.

Building

Dependencies

We tested our build with Linux kernel 5.5.3 and GCC 9.2. Your Linux kernel version must be >= 4.17, since our customized PMDK uses MAP_FIXED_NOREPLACE.
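The kernel requirement above can be checked programmatically before running anything. The following is a minimal sketch, not part of Dash: the function names are hypothetical, and it assumes the release string reported by uname(2) starts with "major.minor". On kernels older than 4.17 the MAP_FIXED_NOREPLACE flag is not recognized and the address is typically treated as a plain hint, so an early check can be useful.

```cpp
#include <sys/utsname.h>
#include <cstdio>

// Parses "major.minor" from a kernel release string such as "5.5.3-arch1".
// Returns false if the string does not start with two dot-separated integers.
bool parse_kernel_release(const char* release, int* major, int* minor) {
  return std::sscanf(release, "%d.%d", major, minor) == 2;
}

// True if (major, minor) is 4.17 or newer.
bool kernel_at_least_4_17(int major, int minor) {
  return major > 4 || (major == 4 && minor >= 17);
}

// Queries the running kernel via uname(2) and applies the check above.
// Returns false if the version cannot be determined.
bool running_kernel_supports_noreplace() {
  struct utsname u;
  if (uname(&u) != 0) return false;
  int major = 0, minor = 0;
  if (!parse_kernel_release(u.release, &major, &minor)) return false;
  return kernel_at_least_4_17(major, minor);
}
```

Calling running_kernel_supports_noreplace() at startup and aborting when it returns false is one way to fail early instead of hitting an unexpected mapping address later.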

The external dependencies are our customized PMDK and epoch manager, which are also open-sourced.

Compiling

To compile in a separate build directory:

git clone https://github.com/baotonglu/dash.git
cd dash
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release -DUSE_PMEM=ON .. 
make -j

Running benchmark

As stated in our paper, we run the tests on a single NUMA node with 24 physical CPU cores. We pin threads to physical cores compactly, assuming thread ID == core ID (e.g., for a dual-socket system, we assume cores 0-23 are located in socket 0 and cores 24-47 in socket 1). To run benchmarks, use the test_pmem executable in the build directory. It supports the following arguments:

./build/test_pmem --helpshort
Usage: 
    ./build/test_pmem [OPTION...]

-index      the index to evaluate:dash-ex/dash-lh/cceh/level (default: "dash-ex")
-op         the type of operation to execute:insert/pos/neg/delete/mixed (default: "full")
-n          the number of warm-up workload (default: 0)
-p          the number of operations(insert/search/delete) to execute (default: 20000000)
-t          the number of concurrent threads (default: 1)
-r          search ratio for mixed workload: 0.0~1.0 (default: 1.0)
-s          insert ratio for mixed workload: 0.0~1.0 (default: 0.0)
-d          delete ratio for mixed workload: 0.0~1.0 (default: 0.0)
-e          whether to register epoch in application level: 0/1 (default: 0)
-k          the type of stored keys: fixed/variable (default: "fixed")
-vl         the length of the variable length key (default: 16)

Also check out the run.sh script for example benchmarks and easy testing of the hash tables.
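The compact pinning scheme described at the start of this section (thread ID == core ID) can be sketched as follows. This is an illustration under stated assumptions, not the code used by test_pmem: pin_to_core and run_pinned are hypothetical names, and the sketch relies on the Linux-specific pthread_setaffinity_np call.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for CPU_SET and pthread_setaffinity_np
#endif
#include <pthread.h>
#include <sched.h>
#include <atomic>
#include <thread>
#include <vector>

// Pins the calling thread to one logical CPU. Returns true on success.
bool pin_to_core(int core_id) {
  if (core_id < 0 || core_id >= CPU_SETSIZE) return false;
  cpu_set_t set;
  CPU_ZERO(&set);
  CPU_SET(core_id, &set);
  return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}

// Spawns num_threads workers; worker i pins itself to core i.
// Returns how many workers were pinned successfully.
int run_pinned(int num_threads) {
  std::atomic<int> pinned{0};
  std::vector<std::thread> workers;
  for (int tid = 0; tid < num_threads; ++tid) {
    workers.emplace_back([tid, &pinned] {
      if (pin_to_core(tid)) pinned.fetch_add(1);
      // Benchmark work for this thread would run here.
    });
  }
  for (auto& w : workers) w.join();
  return pinned.load();
}
```

With this mapping, run_pinned(24) would keep all workers on cores 0-23, which matches the single-socket layout assumed above only if the machine numbers its cores that way. Checking the actual topology (for example with lscpu) before relying on it is advisable.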

Example program

To see how to integrate Dash into your application, check out example.cpp under src. The resulting executable is example in your build directory. Also check CMakeLists.txt to see how to link against the dependencies (customized PMDK and epoch manager) for a correct build.

Miscellaneous

We noticed a possible mmap bug in our testing environment: MAP_SHARED_VALIDATE is incompatible with MAP_FIXED_NOREPLACE (since Linux 4.17). To ensure safe memory mapping, we modified the original PMDK to use MAP_SHARED instead of MAP_SHARED_VALIDATE. The two flags provide the same functionality, except that MAP_SHARED_VALIDATE performs extra flag validation. For a more detailed explanation and minimal reproducible code, please check out our blog post about this issue.

Contact

For any questions, please contact us at [email protected] and [email protected].
