Comments (5)

erez-speedb commented on May 28, 2024

Performance tests pass, the results are the same as current main.

from speedb.

isaac-io commented on May 28, 2024

An experimental implementation is available at https://github.com/speedb-io/speedb/tree/exp/snapshot-reuse
Note that this implementation is incomplete and has at least the following two bugs (rephrased from Jira):

last_snapshot_ is incorrectly cached

Caching of the last taken snapshot was implemented to speed up snapshot taking in read-mostly workloads, such as pure select MySQL benchmarks. In those workloads MySQL takes a snapshot for each query, and the global DB mutex serializes the reads, which effectively limits the amount of parallelism possible.

By reusing the last snapshot if it’s still valid, that work avoids taking the mutex in read-only workloads and holds the mutex for much shorter periods in workloads that include writes. However, to ensure that we always hold a valid reference, the DB instance also takes a reference on that snapshot and keeps it alive until the next snapshot is taken. It does so even if multiple writes happened in between, which renders that snapshot useless (because a snapshot is based on the DB’s sequence number, and every write increments it). Moreover, keeping an obsolete snapshot alive after everyone who needed it has released it prevents compaction from discarding old keys, increasing space and write amplification.

We need to release the cached snapshot on writes (at least some part of the write is done under the mutex, so it’s valid to release the snapshot during that phase). However, we need to be careful about when to release the snapshot: in the case of transactions, LastPublishedSequenceNumber() is used rather than LastSequenceNumber(), so we only want to release the cached snapshot after the former is incremented. Because multiple write flows are currently implemented in RocksDB (single queue, two queues, unordered writes, etc.), with our own addition from #23 thrown into the mix, this task potentially involves touching many parts of the code.
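The reuse-and-release behaviour described above can be sketched with a toy model. This is only an illustration under simplifying assumptions, not the speedb implementation: ToySnapshot, ToyDB, GetSnapshot, Write and ReleasedAfterWrite are hypothetical names, a plain mutex guards everything (the real optimisation aims to avoid the mutex on reads), and there is a single sequence number, so the LastPublishedSequenceNumber() subtlety is not modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>

// Hypothetical toy model; none of these names come from speedb or RocksDB.
struct ToySnapshot {
  explicit ToySnapshot(uint64_t s) : seq(s) {}
  const uint64_t seq;  // sequence number the snapshot was taken at
};

class ToyDB {
 public:
  // Reuses the cached snapshot while it still matches the current sequence
  // number; otherwise creates a new snapshot and caches it.
  std::shared_ptr<const ToySnapshot> GetSnapshot() {
    std::lock_guard<std::mutex> lock(mu_);
    if (!cached_ || cached_->seq != last_seq_) {
      cached_ = std::make_shared<ToySnapshot>(last_seq_);
    }
    return cached_;
  }

  // Every write advances the sequence number, so the cached snapshot can no
  // longer be reused. Dropping the DB's own reference here lets the snapshot
  // die as soon as the last reader releases it.
  void Write() {
    std::lock_guard<std::mutex> lock(mu_);
    ++last_seq_;
    cached_.reset();
  }

 private:
  std::mutex mu_;
  uint64_t last_seq_ = 0;
  std::shared_ptr<const ToySnapshot> cached_;
};

// Walk-through: two readers share one snapshot, a write happens, and the
// snapshot is destroyed once both readers are done with it.
bool ReleasedAfterWrite() {
  ToyDB db;
  auto a = db.GetSnapshot();
  auto b = db.GetSnapshot();
  assert(a == b);  // no write in between: the same instance is reused
  std::weak_ptr<const ToySnapshot> watcher = a;
  db.Write();  // the DB drops its cached reference
  a.reset();
  b.reset();  // the last reader is gone
  return watcher.expired();
}
```

Without the reset in Write(), the DB's own reference would keep the obsolete snapshot alive until the next GetSnapshot() call, which is the behaviour the first bug describes.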

snapshots: min_uncommitted_ may be overwritten by other transactions when providing last_snapshot_

With the snapshot optimisation, when the last snapshot matches the current sequence number we return the same SnapshotImpl instance that we provided to other callers. However, WritePrepared and WriteUnprepared transactions also use the min_uncommitted_ field in SnapshotImpl to store their own information about the state of the transaction. Different transactions may therefore overwrite each other’s value, leading to unexpected results.

We need to check all usages of min_uncommitted_ to see if that’s indeed an issue, and if so, we need to fix it before merging, as it may cause correctness issues in those cases.
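The suspected hazard can be illustrated with a toy example. It does not confirm that the problem occurs in the real code; it only shows what goes wrong if two transactions store per-transaction state on one shared object. ToySnapshot, ToyTxn, SetSnapshot and FirstValueSurvives are hypothetical names, and the values 10 and 20 are arbitrary.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical stand-in for a snapshot object; not the real SnapshotImpl.
struct ToySnapshot {
  uint64_t seq = 0;
  uint64_t min_uncommitted = 0;  // per-transaction state kept on the snapshot
};

// Hypothetical transaction that records its own min_uncommitted value on the
// snapshot it was handed.
struct ToyTxn {
  std::shared_ptr<ToySnapshot> snap;

  void SetSnapshot(std::shared_ptr<ToySnapshot> s, uint64_t min_uncommitted) {
    snap = std::move(s);
    snap->min_uncommitted = min_uncommitted;
  }
};

// Returns true if the value written by the first transaction is still visible
// to it after the second transaction has set its own snapshot.
bool FirstValueSurvives(bool share_instance) {
  auto s1 = std::make_shared<ToySnapshot>();
  s1->seq = 42;
  // Either hand out the very same instance or a private copy of it.
  auto s2 = share_instance ? s1 : std::make_shared<ToySnapshot>(*s1);

  ToyTxn t1, t2;
  t1.SetSnapshot(s1, 10);
  t2.SetSnapshot(s2, 20);  // overwrites t1's value when the instance is shared
  assert(t1.snap && t2.snap);
  return t1.snap->min_uncommitted == 10;
}
```

In this model, FirstValueSurvives(true) returns false and FirstValueSurvives(false) returns true, so one possible direction is to keep such state outside the shared instance. Whether that applies to SnapshotImpl depends on the audit of min_uncommitted_ usages mentioned above.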

isaac-io commented on May 28, 2024

There is a new WIP branch, isaac/snapshot, that tries to tackle these issues. It is currently incomplete and has data races between reference subtraction and SnapshotRecord release.

Guyme commented on May 28, 2024

@erez-speedb - Please run standard tests.

ofriedma commented on May 28, 2024

This feature requires the folly library to be installed.

How to install folly on Ubuntu:

sudo apt install libssl-dev libfmt-dev
git clone https://github.com/facebook/folly
cd folly
sudo ./build/fbcode_builder/getdeps.py install-system-deps --recursive
mkdir build_
cd build_
cmake .. -DBUILD_SHARED_LIBS=ON
make -j $(nproc) install

If the build above fails with an error related to fmt, please run:

 sed -i "s/format_to/fmt::format_to/g" /usr/include/fmt/chrono.h

To build speedb with this enhancement, compile it using cmake:

git clone https://github.com/speedb-io/speedb
cd speedb
mkdir build
cd build
cmake .. -DWITH_SNAP_OPTIMIZATION=ON -DBUILD_SHARED_LIBS=ON
make -j $(nproc)
