rust-lib-project / calibur Goto Github PK


102.0 102.0 12.0 716 KB

A port of RocksDB

License: Apache License 2.0

Rust 100.00%

calibur's People

Contributors

Stargazers

Watchers

Forkers

aylei w41ter jmpotato tennyzhuang ariesdevil hawkingrei isgasho hunterlxt imotai bornchanger hw-standalonecomplex rohankumardubey

calibur's Issues

Feature: Add CI action for this project

Feature: do not acquire mutex for snapshot

Every time we create an iterator, we acquire the lock on version_set, and the call may be blocked by a background thread that holds it.
So we need a thread-local snapshot pointer or an atomic pointer to avoid acquiring the mutex.
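One way to get the thread-local variant using only the standard library is sketched below. It is not the project's code: Version, VersionSet, LocalSnapshot and all method names are hypothetical. Each reader thread keeps its own LocalSnapshot, and a generation counter tells it whether the cached version is still current, so the mutex is taken only after a background job has installed a new version.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the engine's version (the set of live SST files).
#[derive(Debug, PartialEq)]
pub struct Version {
    pub files: Vec<u64>,
}

/// Hypothetical version set: the mutex guards the current version, and the
/// generation counter lets readers detect changes without locking.
pub struct VersionSet {
    current: Mutex<Arc<Version>>,
    generation: AtomicU64,
}

/// Per-thread handle that remembers the last version it saw.
pub struct LocalSnapshot {
    generation: u64,
    version: Arc<Version>,
}

impl VersionSet {
    pub fn new(v: Version) -> Self {
        VersionSet {
            current: Mutex::new(Arc::new(v)),
            generation: AtomicU64::new(0),
        }
    }

    /// Called by background jobs (flush, compaction) to publish a new version.
    pub fn install(&self, v: Version) {
        let mut cur = self.current.lock().unwrap();
        *cur = Arc::new(v);
        // Bump while still holding the lock so version and generation stay in step.
        self.generation.fetch_add(1, Ordering::Release);
    }

    /// Creates a per-thread handle; takes the mutex once.
    pub fn local_snapshot(&self) -> LocalSnapshot {
        let cur = self.current.lock().unwrap();
        LocalSnapshot {
            generation: self.generation.load(Ordering::Acquire),
            version: Arc::clone(&*cur),
        }
    }
}

impl LocalSnapshot {
    /// Returns the snapshot and whether the mutex had to be taken.
    pub fn get(&mut self, set: &VersionSet) -> (Arc<Version>, bool) {
        if set.generation.load(Ordering::Acquire) == self.generation {
            // Fast path: nothing changed, so no lock is needed.
            return (Arc::clone(&self.version), false);
        }
        // Slow path: refresh the cached version under the mutex.
        let cur = set.current.lock().unwrap();
        self.generation = set.generation.load(Ordering::Acquire);
        self.version = Arc::clone(&*cur);
        (Arc::clone(&self.version), true)
    }
}
```

A reader on the fast path may briefly see the previous version while a new one is being installed, which is acceptable for a snapshot. A true atomic pointer swap would need more machinery (for example hazard pointers or epoch-based reclamation) and is not shown here.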

Feature: support compressing data with LZ4

Part of #4.
The feature may be too large, so I think it's better to split it into multiple tasks, one per algorithm.

Feature: Support compression algorithm for block-based table

Description

To save disk space, RocksDB compresses data block by block with a popular algorithm such as LZ4 or ZSTD. To keep the code simple to understand, I think we only need to implement these two algorithms at first, so that we stay compatible with the current data format of RocksDB. We may find a better file format and a better compression algorithm in the future, but that is not a goal for now.
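A rough sketch of how per-block compression could be wired in is shown below. Everything in it is an assumption rather than the project's design: the Compressor trait, seal_block, and the tag values, which are meant to mirror RocksDB's on-disk compression tags and should be checked against the RocksDB source. The block checksum is left out. ToyRle is a toy run-length encoder that only exercises the code path; a real implementation would wrap an LZ4 or ZSTD crate.

```rust
/// Compression tag stored after each block. The numeric values are an
/// assumption meant to mirror RocksDB's format; verify before relying on them.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
pub enum CompressionType {
    None = 0x0,
    Lz4 = 0x4,
    Zstd = 0x7,
}

/// Hypothetical hook: a real implementation would wrap an LZ4 or ZSTD crate.
pub trait Compressor {
    fn kind(&self) -> CompressionType;
    fn compress(&self, raw: &[u8]) -> Vec<u8>;
}

/// Compresses one block and appends the one-byte type tag.
/// Falls back to storing the raw bytes when compression does not shrink them.
pub fn seal_block(raw: &[u8], compressor: &dyn Compressor) -> Vec<u8> {
    let packed = compressor.compress(raw);
    let (mut out, kind) = if packed.len() < raw.len() {
        (packed, compressor.kind())
    } else {
        (raw.to_vec(), CompressionType::None)
    };
    out.push(kind as u8);
    out
}

/// Toy run-length encoder used only to exercise the sketch. It is NOT LZ4;
/// it merely borrows the LZ4 tag so the tagging path can be tested.
pub struct ToyRle;

impl Compressor for ToyRle {
    fn kind(&self) -> CompressionType {
        CompressionType::Lz4
    }

    fn compress(&self, raw: &[u8]) -> Vec<u8> {
        let mut out = Vec::new();
        let mut i = 0;
        while i < raw.len() {
            let byte = raw[i];
            let mut run = 1usize;
            while i + run < raw.len() && raw[i + run] == byte && run < 255 {
                run += 1;
            }
            // Each run is stored as (length, byte).
            out.push(run as u8);
            out.push(byte);
            i += run;
        }
        out
    }
}
```

Keeping the algorithm behind a trait means the LZ4 and ZSTD tasks can each add one implementation without touching the table writer.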

pread returns EINVAL in linux

Run the simple example from PR #6 with cargo run --example simple_example, and the program fails in Engine::open. The errno is EINVAL.

According to man 2 read:

EINVAL fd is attached to an object which is unsuitable for reading; or the file was opened with the O_DIRECT flag, and either the address specified in buf, the value specified in count, or the file offset is not suitably aligned.

I noticed that get_current_manifest_path invokes FileSystem::read_file_content, which eventually invokes AsyncRandomAccessFile::open. O_DIRECT is added to the open flags, but there doesn't seem to be any alignment handling in read_file_content.
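For reference, the offset and length part of the alignment requirement can be handled with a small helper like the one below. aligned_range is a hypothetical name, and the right block size depends on the device and filesystem (512 and 4096 are common values). The buffer address must also be aligned for O_DIRECT, which this sketch does not cover.

```rust
/// Rounds `offset` down and the end of the range up to multiples of `block`,
/// returning the aligned (offset, length) to request from the file.
/// `block` must be a power of two.
pub fn aligned_range(offset: u64, len: u64, block: u64) -> (u64, u64) {
    assert!(block.is_power_of_two());
    let mask = block - 1;
    let start = offset & !mask;
    let end = (offset + len + mask) & !mask;
    (start, end - start)
}
```

The caller reads the aligned range and then slices out the bytes it actually asked for, starting at offset - start within the returned data.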

Feature: Support prefix-seek and seek bound

Description

RocksDB can build a Bloom filter over the prefixes of keys. When a user seeks to a key, RocksDB can use the prefix of the seek key to tell whether a matching key may exist in the DB.

RocksDB can also give an iterator a bound, so that the iterator does not have to skip over too many tombstones.
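Both ideas can be illustrated with a toy model. This is only a sketch under assumptions: the table is a BTreeMap in which None marks a tombstone, PrefixFilter uses an exact HashSet where a real engine would use a Bloom filter (which can also return false positives), and all names are hypothetical.

```rust
use std::collections::{BTreeMap, HashSet};
use std::ops::Bound;

/// `None` marks a tombstone (a deleted key).
pub type Table = BTreeMap<Vec<u8>, Option<Vec<u8>>>;

/// Exact stand-in for a prefix Bloom filter over fixed-length prefixes.
pub struct PrefixFilter {
    prefix_len: usize,
    prefixes: HashSet<Vec<u8>>,
}

impl PrefixFilter {
    pub fn new(prefix_len: usize) -> Self {
        PrefixFilter { prefix_len, prefixes: HashSet::new() }
    }

    pub fn add(&mut self, key: &[u8]) {
        let n = self.prefix_len.min(key.len());
        self.prefixes.insert(key[..n].to_vec());
    }

    /// False means no key with this prefix was ever added, so the seek can be skipped.
    pub fn may_contain(&self, key: &[u8]) -> bool {
        let n = self.prefix_len.min(key.len());
        self.prefixes.contains(&key[..n])
    }
}

/// Seeks to the first live key >= `seek` and < `upper` (if a bound is given).
/// Also returns how many tombstones were skipped on the way.
pub fn seek_with_bound(
    table: &Table,
    seek: &[u8],
    upper: Option<&[u8]>,
) -> (Option<Vec<u8>>, usize) {
    let hi = match upper {
        Some(u) => Bound::Excluded(u),
        None => Bound::Unbounded,
    };
    let mut skipped = 0;
    for (key, value) in table.range::<[u8], _>((Bound::Included(seek), hi)) {
        if value.is_some() {
            return (Some(key.clone()), skipped);
        }
        skipped += 1;
    }
    (None, skipped)
}
```

With an upper bound the scan stops at the bound. Without one, it keeps walking past tombstones until it reaches a live key or the end of the table.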

Feature: refactor compaction picker

Description

Currently we sort the files of every level by priority and only consider the level with the highest score. But the highest-scoring level may already be part of a running compaction job, in which case we cannot pick any file from it. So we need to skip that level and find another level to compact.
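The intended behaviour could look roughly like the sketch below. LevelState and pick_level are hypothetical names, the busy flag is tracked per level for brevity (a real picker would track it per file), and the threshold of 1.0 for "needs compaction" is an assumption borrowed from RocksDB's convention.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct LevelState {
    pub level: usize,
    pub score: f64,
    pub being_compacted: bool,
}

/// Returns the level to compact next: the highest-scoring level that needs
/// compaction (score >= 1.0, an assumed threshold) and is not already busy.
pub fn pick_level(levels: &[LevelState]) -> Option<usize> {
    let mut order: Vec<&LevelState> = levels.iter().collect();
    // Highest score first.
    order.sort_by(|a, b| b.score.total_cmp(&a.score));
    order
        .into_iter()
        .find(|l| l.score >= 1.0 && !l.being_compacted)
        .map(|l| l.level)
}
```

If every level that needs compaction is busy, the picker returns None and the caller can retry once a running job finishes.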

Feature: Support multi thread to finish L0 compaction

Feature Description

Currently we use only one thread to compact files from L0 to the base level. In level-style compaction, the job that merges multiple files from L0 into the base level must run as a single job, so with only one thread this job is slow. RocksDB therefore splits the job into multiple key ranges and lets each thread process one range to speed it up.
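The range-splitting idea can be sketched on a toy model where keys are plain u32 values and each sorted run stands for one input file. parallel_merge and its parameters are hypothetical, boundaries must be sorted in ascending order, and how to choose good boundaries is not covered.

```rust
use std::thread;

/// Merges sorted runs into one sorted run, splitting the key space at
/// `boundaries` so that each key range is merged on its own thread.
pub fn parallel_merge(runs: &[Vec<u32>], boundaries: &[u32]) -> Vec<u32> {
    // Range i covers keys in [lo, hi); the first and last ranges are open-ended.
    let mut ranges: Vec<(Option<u32>, Option<u32>)> = Vec::new();
    let mut lo = None;
    for &b in boundaries {
        ranges.push((lo, Some(b)));
        lo = Some(b);
    }
    ranges.push((lo, None));

    let parts: Vec<Vec<u32>> = thread::scope(|s| {
        let handles: Vec<_> = ranges
            .iter()
            .map(|&(lo, hi)| {
                s.spawn(move || {
                    let mut out: Vec<u32> = runs
                        .iter()
                        .flatten()
                        .copied()
                        .filter(|&k| lo.map_or(true, |l| k >= l) && hi.map_or(true, |h| k < h))
                        .collect();
                    out.sort_unstable();
                    // Duplicate keys collapse to one entry in this toy model.
                    out.dedup();
                    out
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    });

    // Ranges are disjoint and ordered, so concatenation keeps the output sorted.
    parts.into_iter().flatten().collect()
}
```

Because the ranges do not overlap, the threads never touch the same key and their outputs can simply be concatenated. In a real engine each range would produce its own output files.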

module

compaction
db.rs run_compaction_job

Skills

You need to know LSM-tree compaction well.

Feature: Support block-cache to avoid frequent IO requests.

Feature Description

In order to complete the prototype of the database as quickly as possible, I did not design a cache, so every request reads its data directly from disk. I would like a RocksDB-like block cache, but if anyone has a better idea, I'll gladly accept it.

Module

cache: add a new module.
table: create the cache object in table_factory and pass it to every TableReader.
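As a starting point for discussion, here is a minimal thread-safe LRU block cache built only on the standard library. It is a sketch, not a proposal for the final design: BlockCache, BlockKey and the (file number, block offset) key layout are assumptions, and recency is tracked in a VecDeque, which makes every access O(n). A production cache would use a linked list or sharding.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Mutex};

/// Cache key: (file number, block offset). The layout is an assumption.
pub type BlockKey = (u64, u64);

struct Inner {
    map: HashMap<BlockKey, Arc<Vec<u8>>>,
    order: VecDeque<BlockKey>, // front = least recently used
    used: usize,               // total bytes currently cached
}

pub struct BlockCache {
    capacity: usize, // capacity in bytes
    inner: Mutex<Inner>,
}

impl BlockCache {
    pub fn new(capacity: usize) -> Self {
        BlockCache {
            capacity,
            inner: Mutex::new(Inner { map: HashMap::new(), order: VecDeque::new(), used: 0 }),
        }
    }

    /// Returns the cached block and marks it as most recently used.
    pub fn get(&self, key: &BlockKey) -> Option<Arc<Vec<u8>>> {
        let mut guard = self.inner.lock().unwrap();
        let g = &mut *guard;
        let block = g.map.get(key).cloned()?;
        g.order.retain(|k| k != key);
        g.order.push_back(*key);
        Some(block)
    }

    /// Inserts a block, evicting least recently used blocks while over capacity.
    pub fn insert(&self, key: BlockKey, block: Vec<u8>) {
        let mut guard = self.inner.lock().unwrap();
        let g = &mut *guard;
        if let Some(old) = g.map.remove(&key) {
            g.used -= old.len();
            g.order.retain(|k| *k != key);
        }
        g.used += block.len();
        g.map.insert(key, Arc::new(block));
        g.order.push_back(key);
        while g.used > self.capacity {
            let Some(victim) = g.order.pop_front() else { break };
            if let Some(old) = g.map.remove(&victim) {
                g.used -= old.len();
            }
        }
    }
}
```

Blocks are handed out as Arc<Vec<u8>>, so a reader can keep using a block after it has been evicted. The memory is freed when the last reader drops it.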

Feature: Support direct IO and a read buffer on Linux

Problem

As reported in issue #7, rocksdb-rs returns an error because reads are not aligned to the block size.
For direct IO we may need to issue reads with an aligned size and cache the data in the reader itself.
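A reader that widens every request to an aligned range and keeps the last chunk around might look like the sketch below. AlignedSource, BufferedReader and MemSource are hypothetical: MemSource is an in-memory test double that rejects unaligned requests the way an O_DIRECT file would, and the buffer here is an ordinary Vec, whereas real direct IO also needs an address-aligned buffer.

```rust
use std::cell::Cell;

/// Hypothetical abstraction over a file opened with O_DIRECT: every call must
/// use an offset and buffer length that are multiples of the block size.
pub trait AlignedSource {
    /// Fills `buf` starting at `offset` and returns the number of bytes read.
    fn read_aligned(&self, offset: u64, buf: &mut [u8]) -> usize;
}

pub struct BufferedReader<S> {
    pub source: S,
    block: u64,
    buf_offset: u64, // file offset of the first cached byte
    buf: Vec<u8>,    // last aligned chunk that was read (valid bytes only)
}

impl<S: AlignedSource> BufferedReader<S> {
    pub fn new(source: S, block: u64) -> Self {
        BufferedReader { source, block, buf_offset: 0, buf: Vec::new() }
    }

    /// Reads up to `len` bytes at `offset`, serving from the cached chunk when possible.
    pub fn read(&mut self, offset: u64, len: usize) -> Vec<u8> {
        let end = offset + len as u64;
        let cached_end = self.buf_offset + self.buf.len() as u64;
        if offset < self.buf_offset || end > cached_end {
            // Miss: widen the request to block boundaries and fetch it.
            let start = offset / self.block * self.block;
            let aligned_end = (end + self.block - 1) / self.block * self.block;
            let mut chunk = vec![0u8; (aligned_end - start) as usize];
            let n = self.source.read_aligned(start, &mut chunk);
            chunk.truncate(n);
            self.buf_offset = start;
            self.buf = chunk;
        }
        let from = ((offset - self.buf_offset) as usize).min(self.buf.len());
        let to = (from + len).min(self.buf.len());
        self.buf[from..to].to_vec()
    }
}

/// In-memory test double that panics on unaligned requests.
pub struct MemSource {
    pub data: Vec<u8>,
    pub block: u64,
    pub reads: Cell<usize>,
}

impl AlignedSource for MemSource {
    fn read_aligned(&self, offset: u64, buf: &mut [u8]) -> usize {
        assert!(offset % self.block == 0 && buf.len() as u64 % self.block == 0, "EINVAL");
        self.reads.set(self.reads.get() + 1);
        let start = (offset as usize).min(self.data.len());
        let n = buf.len().min(self.data.len() - start);
        buf[..n].copy_from_slice(&self.data[start..start + n]);
        n
    }
}
```

Small sequential reads, such as reading the manifest, then hit the cached chunk instead of issuing one IO per call.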

Feature: Remove WAL files once flush has finished

Description

In an LSM-tree engine, all data is first persisted to write-ahead-log (WAL) files by append-only IO, and then applied to an in-memory structure called the memtable. Once the data in a memtable has been flushed to disk, we can remove the corresponding WAL files to release disk space.

Design

How do we decide whether a log file can be removed? We keep a log number for each column family: the largest log number whose data has been persisted on disk. The minimum of these numbers across all column families is the last log whose data has been flushed to SST files on disk.
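The rule can be written down as a small function. obsolete_wals and its parameters are hypothetical. The sketch conservatively keeps the log whose number equals the minimum, because whether that log itself may be deleted depends on whether the stored number is inclusive, which the description leaves open.

```rust
use std::collections::HashMap;

/// Returns the WAL file numbers that are safe to delete.
/// `cf_log_numbers` maps each column family name to its persisted log number.
pub fn obsolete_wals(cf_log_numbers: &HashMap<String, u64>, live_wals: &[u64]) -> Vec<u64> {
    // With no column families known, nothing can be proven obsolete.
    let Some(&min_log) = cf_log_numbers.values().min() else {
        return Vec::new();
    };
    // Conservative choice (an assumption): keep the log equal to the minimum.
    live_wals.iter().copied().filter(|&n| n < min_log).collect()
}
```

A single column family that has not flushed for a long time holds back deletion for every column family, since the minimum is taken across all of them.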

Is this a wrapper of rocksdb?

What's the difference between this and rust-rocksdb?

Feature: use a table cache to avoid reading all SST files when opening the DB

Problem Description

To finish the engine quickly, we currently open every file when it is generated, and open all files when the DB is opened.
But most of the files may never be accessed because they hold cold data, so keeping their filter blocks in memory is not a good idea.

Design && Work Item

Add buffered reads to avoid multiple IOs when opening a file.
Add a thread-safe LRU cache structure shared by the table cache and the block cache.
Refactor filemeta and the SST read interface, because we need to look up the table id first and then get the file from the cache.
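The lazy-open part of the work items could be shaped like the sketch below. TableCache, TableReader and get_or_open are hypothetical names, and LRU eviction is left out to keep the example short. The file is opened while the lock is held, which is simple but blocks other lookups during the open.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for an opened SST reader (index and filter blocks loaded).
pub struct TableReader {
    pub file_number: u64,
}

pub struct TableCache {
    readers: Mutex<HashMap<u64, Arc<TableReader>>>,
}

impl TableCache {
    pub fn new() -> Self {
        TableCache { readers: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached reader, calling `open` only on the first access.
    pub fn get_or_open<F>(&self, file_number: u64, open: F) -> Arc<TableReader>
    where
        F: FnOnce(u64) -> TableReader,
    {
        let mut readers = self.readers.lock().unwrap();
        readers
            .entry(file_number)
            .or_insert_with(|| Arc::new(open(file_number)))
            .clone()
    }

    /// Drops the cached reader, for example after the file is deleted by compaction.
    pub fn evict(&self, file_number: u64) {
        self.readers.lock().unwrap().remove(&file_number);
    }
}
```

With this shape, opening the DB only needs the file metadata. A reader for a cold file is never created unless a query actually touches it.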
