Testbed for Learned Indexes (TLI)

TLI is a testbed for comparing (learned) indexes across various datasets and workloads. It has three components: workload generation, hyper-parameter tuning, and performance evaluation. The system is built on the well-known SOSD framework, and it uses perf and pmu-tools to measure micro-architectural metrics.

Dependencies

One dependency worth emphasizing is Intel MKL, which is needed when testing the performance of XIndex and SIndex. The detailed installation steps can be found here.

Generally, the dependencies can be installed in the following steps.

$ cd /tmp
$ wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
$ apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB
$ rm GPG-PUB-KEY-INTEL-SW-PRODUCTS-2019.PUB

$ sh -c 'echo deb https://apt.repos.intel.com/mkl all main > /etc/apt/sources.list.d/intel-mkl.list'
$ apt-get update
$ apt-get install -y intel-mkl-2019.0-045

$ apt -y install zstd python3-pip m4 cmake clang libboost-all-dev
$ pip3 install --user numpy scipy
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
$ source $HOME/.cargo/env

After the installation, the following two lines in CMakeLists.txt may require modification.

set(MKL_LINK_DIRECTORY "/opt/intel/mkl/lib/intel64")
set(MKL_INCLUDE_DIRECTORY "/opt/intel/mkl/include")

Running the testbed

We provide a number of scripts to automate things. Each is located in the scripts directory, but should be executed from the repository root.

  • ./scripts/download.sh downloads and stores required data from the Internet
  • ./scripts/build_rmis.sh compiles and builds the RMIs for each dataset. If you run into the error message error: no override and no default toolchain set, try running rustup install stable.
  • ./scripts/download_rmis.sh will download pre-built RMIs instead, which may be faster. You'll need to run build_rmis.sh if you want to measure build times on your platform.
  • ./scripts/prepare.sh constructs the single-thread workloads and compiles the testbed; ./scripts/prepare_multithread.sh is the counterpart for concurrency workloads.
  • ./scripts/execute.sh, execute_latency.sh, execute_errors.sh, and execute_perf.sh execute the testbed on single-thread workloads and store the results in results; ./scripts/execute_multithread.sh is the counterpart for concurrency workloads.
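
The scripts above can be chained from the repository root. The following is a minimal sketch of such a driver, not part of the testbed itself. It assumes that the scripts are meant to run in the order in which they are listed, and that the multithread variants still need the downloaded data and the RMIs; neither assumption is stated explicitly above. The function names pipeline_steps and run_pipeline are hypothetical.

```python
import subprocess
from pathlib import Path


def pipeline_steps(prebuilt_rmis=False, multithread=False):
    """Return the script paths in the (assumed) order they should run."""
    # download_rmis.sh fetches pre-built RMIs instead of compiling them.
    rmi = "download_rmis.sh" if prebuilt_rmis else "build_rmis.sh"
    prepare = "prepare_multithread.sh" if multithread else "prepare.sh"
    execute = "execute_multithread.sh" if multithread else "execute.sh"
    names = ("download.sh", rmi, prepare, execute)
    return [f"./scripts/{name}" for name in names]


def run_pipeline(repo_root=".", dry_run=True, **options):
    """Print the steps, or run them from the repository root if dry_run is False."""
    root = Path(repo_root)
    for step in pipeline_steps(**options):
        if dry_run:
            print(step)
        else:
            # check=True stops the pipeline at the first failing script.
            subprocess.run([step], cwd=root, check=True)
```

Calling run_pipeline() only prints the steps; pass dry_run=False to actually execute them. Keep in mind that building RMIs from source is required if build times on your platform are of interest.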

Build times can be long, as we make aggressive use of templates to ensure we do not accidentally measure vtable lookup time.

Results

The results in results/through-results are obtained on single-thread workloads, those in results/multithread-results on concurrency workloads, and those in results/string-results for string indexes. They are shown in the following format.

(index name) (bulk loading time) (index size) (throughput) (hyper-parameters)
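
As an illustration of how a row in this format could be read, here is a minimal sketch. It assumes that the fields are comma-separated, that the three measurements are numeric, and that every field after the throughput is a hyper-parameter; the delimiter, the units, and the helper name parse_throughput_row are assumptions, not something the testbed guarantees.

```python
def parse_throughput_row(line, delimiter=","):
    """Split one result row into named fields (assumed layout, see above)."""
    fields = [field.strip() for field in line.strip().split(delimiter)]
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields, got {len(fields)}")
    name, build_time, size, throughput, *hyper_parameters = fields
    return {
        "index_name": name,
        "bulk_loading_time": float(build_time),
        "index_size": float(size),
        "throughput": float(throughput),
        # Hyper-parameters are kept as raw strings because their types
        # differ from index to index.
        "hyper_parameters": hyper_parameters,
    }
```

If the actual files use a different delimiter, pass it through the delimiter argument.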

The results in results/latency-results are obtained by measuring latencies on single-thread workloads, and are shown in the following format.

(index name) (bulk loading time) (index size) (average, P50, P99, P99.9, max, standard deviation of latency) (hyper-parameters)
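
To make the six latency statistics concrete, the sketch below computes them from a list of per-operation latencies. It is an illustration only: it assumes nearest-rank percentiles and the population standard deviation, and the testbed may use different definitions. Percentiles are passed in per mille (500, 990, 999) so that the rank is computed with exact integer arithmetic.

```python
import statistics


def percentile(sorted_values, per_mille):
    """Nearest-rank percentile; per_mille=999 corresponds to P99.9."""
    n = len(sorted_values)
    # Integer ceiling of per_mille * n / 1000, at least rank 1.
    rank = max(1, -(-per_mille * n // 1000))
    return sorted_values[rank - 1]


def summarize_latencies(latencies):
    """Return average, P50, P99, P99.9, max, and standard deviation."""
    values = sorted(latencies)
    if not values:
        raise ValueError("no latencies recorded")
    return {
        "average": statistics.mean(values),
        "p50": percentile(values, 500),
        "p99": percentile(values, 990),
        "p99.9": percentile(values, 999),
        "max": values[-1],
        "stddev": statistics.pstdev(values),
    }
```

P99.9 only becomes distinct from the maximum once more than a thousand operations are recorded, which is one reason tail latencies need large operation counts.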

The results in results/errors-results are obtained by measuring position searches, and are shown in the following format.

(index name) (bulk loading time) (index size) (average, P50, P99, P99.9, max, standard deviation of latency) (average position search overhead) (position search latency per operation) (average prediction error) (hyper-parameters)

The filenames of the CSV files in results mostly follow this rule.

{dataset}_ops_{operation count}_{range query ratio}_{negative lookup ratio}_{insert ratio}_({insert pattern}_)({hotspot ratio}_)({thread number}_)(mix_)({loaded block number}_)({bulk-loaded data size}_)results_table.csv
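
The rule can be read as a fixed prefix followed by optional components, each of which appears only when it applies. The sketch below assembles a filename from that reading. The function name results_filename and all argument values are hypothetical; in particular, how the testbed formats counts and ratios is not specified here, so they are passed in as already-formatted strings.

```python
def results_filename(dataset, op_count, range_ratio, negative_ratio, insert_ratio,
                     insert_pattern=None, hotspot_ratio=None, threads=None,
                     mix=False, loaded_blocks=None, bulk_loaded_size=None):
    """Assemble a results filename following the rule above."""
    parts = [dataset, "ops", op_count, range_ratio, negative_ratio, insert_ratio]
    # Optional components keep the order in which the rule lists them.
    for optional in (insert_pattern, hotspot_ratio, threads):
        if optional is not None:
            parts.append(optional)
    if mix:
        parts.append("mix")
    for optional in (loaded_blocks, bulk_loaded_size):
        if optional is not None:
            parts.append(optional)
    parts.append("results_table.csv")
    return "_".join(str(part) for part in parts)
```

Because the optional components are not labeled in the filename, going the other way (recovering the fields from a filename) is ambiguous unless you already know which components a given experiment uses.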

The results in results/perf-results are obtained by measuring micro-architectural metrics.
