mozilla / rkv
A simple, humane, typed key-value storage solution.
Home Page: https://crates.io/crates/rkv
License: Apache License 2.0
Ran into this when I was testing #58; the following snippet can reproduce the deadlock.
let store = rkv.open_or_create("store");
let writer = store.write();
let another_store = rkv.open_or_create("another_store");
The deadlock was caused by the fact that LMDB doesn't allow two write transactions to run at the same time. In this case, writer is a wrapper around RwTransaction. Under the hood, open_or_create creates another write transaction to open a store, which hangs forever, since writer never gets a chance to abort or commit. So the solution is either to open all the stores before spawning any writer, or to commit the writer before opening other stores.
We should at least document this for Store to avoid this kind of deadlock.
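LMDB's single-writer rule can be modeled with a plain std::sync::Mutex; this sketch (not rkv's actual locking) shows why the second write transaction blocks until the first one ends:

```rust
use std::sync::Mutex;

fn main() {
    // Model LMDB's constraint: at most one write transaction at a time.
    let write_lock = Mutex::new(());

    let writer = write_lock.lock().unwrap(); // like store.write()
    // The write transaction that open_or_create starts internally
    // cannot begin while `writer` is alive:
    assert!(write_lock.try_lock().is_err());

    drop(writer); // like writer.commit() or abort
    // Now a second write transaction can start:
    assert!(write_lock.try_lock().is_ok());
}
```

This is why committing (or dropping) the writer before calling open_or_create again resolves the hang.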
We should write some helpful words about why you might use this system, why you might not, and point into the tuning docs (#4) to illuminate the space of the former.
Use the C FFI (#6) to expose an idiomatic interface to Swift code.
We might want to document various limitations of LMDB in order to offer the "least surprise" for rkv users. Off the top of my head, LMDB has the following limitations:
The maximum key size is 511 bytes (MDB_MAXKEYSIZE = 511); this also applies to the value in a dupsort store. Note that it's a compile-time configuration and can't be changed at runtime.

Presently there are 2 transaction types (plus the proposed Multi*) for both Read and Write transactions.
There doesn't have to be, though. In fact, it's causing problems when I want to execute a transaction over different types of Stores.
I see two options:
The easiest option is to pull the functions from Integer* and Multi* into Reader and Writer, suffixing them with _int and _multi. They will take IntStore and MultiStore as parameters, respectively.
The harder, but perhaps cleaner option is a significant chunk of refactoring:
Move the accessor functions into the *Store structs themselves. Make the functions accept either a Reader or a Writer. This way we'll have more ergonomic and intuitive methods like:
IntStore::put<I: PrimitiveInt>(txn: Writer, id: I, val: Value) -> Result<(), StoreError>
MultiStore::get(txn: Reader, id: K) -> Result<Iter<Value>, StoreError>
For maximum modularity, I would make a new trait, ReadTransaction, which can be implemented by both Reader and Writer, since you can fetch values in write transactions as well.
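A minimal sketch of that trait idea, with stub Reader and Writer types standing in for rkv's real ones:

```rust
use std::collections::HashMap;

// One trait for reads, implemented by both transaction types,
// since write transactions can also fetch values.
trait ReadTransaction {
    fn get(&self, key: &str) -> Option<&str>;
}

struct Reader { data: HashMap<String, String> }
struct Writer { data: HashMap<String, String> }

impl ReadTransaction for Reader {
    fn get(&self, key: &str) -> Option<&str> {
        self.data.get(key).map(String::as_str)
    }
}

impl ReadTransaction for Writer {
    fn get(&self, key: &str) -> Option<&str> {
        self.data.get(key).map(String::as_str)
    }
}

// Accessor functions can then accept any read-capable transaction.
fn fetch(txn: &dyn ReadTransaction, key: &str) -> Option<String> {
    txn.get(key).map(str::to_owned)
}

fn main() {
    let mut data = HashMap::new();
    data.insert("k".to_string(), "v".to_string());
    let writer = Writer { data: data.clone() };
    let reader = Reader { data };
    assert_eq!(fetch(&writer, "k"), Some("v".to_string()));
    assert_eq!(fetch(&reader, "k"), Some("v".to_string()));
}
```

The same shape would let *Store accessor methods take a single generic parameter bound by ReadTransaction instead of overloading for Reader and Writer separately.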
LMDB supports a safe atomic backup operation. We should expose this functionality.
Noticed this when I was investigating this TODO item. The current serialization mechanism (serializing a two-element tuple, i.e. (type, value)) seems to introduce a significant amount of overhead on String-type Values.
Here are some examples:
serialize(&(1u8, true)).len() -> 2 // actual size: 2
serialize(&(2u8, 1e+9)).len() -> 9 // actual size: 9 (1 + 8)
serialize(&(3u8, "hello world".to_string())).len() -> 20 // actual size: 12 (1 + 11)
serialize(&(4u8, "4dd69e99-07e7-c040-a514-ccde0cfd4781".to_string())).len() -> 45 // actual: 37 (1 + 36)
Unsure if it was caused by the padding, or by the serialization. But I think it's worth further investigation.
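For reference, a self-contained sketch of the arithmetic: bincode's default encoding writes a u64 length prefix before a string's bytes, which accounts for the 8 extra bytes in the measurements above, while tagging the bytes manually costs only the one type byte:

```rust
// Hand-rolled encoding: 1 type byte followed by the raw value bytes.
fn encode_tagged_str(tag: u8, s: &str) -> Vec<u8> {
    let mut buf = Vec::with_capacity(1 + s.len());
    buf.push(tag);
    buf.extend_from_slice(s.as_bytes());
    buf
}

fn main() {
    let s = "hello world";
    // Manual: 1 (tag) + 11 (UTF-8 bytes) = 12, the "actual size" above.
    assert_eq!(encode_tagged_str(3, s).len(), 12);
    // bincode additionally writes a u64 length prefix for the string:
    // 1 + 8 + 11 = 20, matching the measured size.
    assert_eq!(1 + 8 + s.len(), 20);
}
```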
Alternatively, we can just write the Type and Value directly to a buffer, then pass the result to the put function. For big Values, we can avoid the double allocation by leveraging the MDB_RESERVE feature, which reserves enough space for the value and returns the buffer so that the user can populate it afterwards. The following snippet illustrates the basic idea:
fn put(&self, key: K, value: &[u8], value_type: u8) {
    // say BIG_VALUE_THRESHOLD = 32
    let length = value.len() + 1; // value size + type size
    if length < BIG_VALUE_THRESHOLD {
        // small value: assemble type + value in a stack buffer
        let mut buf = [0u8; BIG_VALUE_THRESHOLD];
        buf[0] = value_type;
        buf[1..length].copy_from_slice(value);
        self.txn.put(&key, &buf[..length]);
    } else {
        // big value: MDB_RESERVE hands us lmdb's own buffer, so we
        // write the type and value directly with no extra allocation
        let mut reserved_buf = self.txn.reserve(&key, length);
        reserved_buf.write_u8(value_type);
        reserved_buf.write_all(value);
    }
}
(Placeholder because this eventually belongs in Bugzilla.)
It would be useful for developers to be able to interact with rkv files via the Firefox developer tools, as well as via the JS API in the console.
The recent changes in #101 directly expose the lmdb::RoTransaction and lmdb::RwTransaction types instead of wrapping them in Reader and Writer types when calling Rkv.read() and Rkv.write().
This has the side effect of also exposing the lmdb::Error type when calling R[o|w]Transaction.commit(), which was previously converted into a StoreError by the Reader.commit() and Writer.commit() functions.
And that's inconsistent with most of the other functions in the public API, including Environment.begin_r[o|w]_transaction() and the various [Single|Multi|etc.]Store functions, which all wrap an lmdb::Error in a StoreError::LmdbError.
It's also obviously inconsistent with any other function that returns another type of StoreError, and it means that consumers of rkv need to handle both the StoreError type and the underlying lmdb::Error type.
We should ensure that the public API returns StoreError consistently to indicate failure.
In order to be able to iterate keys, and do so from an arbitrary point in the key space, rkv should expose LMDB's support for iterators and ranged lookups (behind humane abstractions as appropriate).
When retrieving or creating a new environment handle, the name of the Manager method is get_or_create (the word "create" appears second); but when retrieving or creating a new store handle, the name of the Rkv method is create_or_open (the word "create" appears first).
We should make these functions (and others that either get or create a thing) use a consistent ordering of those words.
(We might also want to use get/open consistently, although I'm open to arguments that environments and stores are different, and it makes sense to get the former and open the latter.)
Lmdb supports a wide range of write flags to change the default behavior when issuing writes to the store. Currently, rkv::readwrite::Writer passes the default write flag to its put function, which simply overwrites the value if the key is already in the store.
One solution could be simply exposing all the write flags from lmdb and letting developers decide which one to use. The upside is that this offers a great deal of flexibility; the downside is that developers will need to know all the store types and their corresponding write flags in lmdb. Misusing them may cause undesired behavior, or even worse, corrupt the store.
The other way to handle the write flags is to abstract them away by providing a few stores instead, with each store having its own semantics for put/get/delete/cursor. Such as:
The advantage is that developers do not need to know the underlying details of lmdb and can just treat them as persistent k/v stores. Obviously, they lose fine-grained control over the store, and perhaps some performance.
Some design decisions need to be made before implementing this. Given that one of rkv's design goals is to smooth out lmdb's rough edges, I am more inclined to the second plan.
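The second plan could look something like this sketch, which hides raw lmdb flags behind a small set of write modes. The enum and mapping are hypothetical; the flag constants follow lmdb.h:

```rust
// Flag values as defined in lmdb.h (assumed here for illustration).
const MDB_NOOVERWRITE: u32 = 0x10;
const MDB_APPEND: u32 = 0x2_0000;

#[derive(Clone, Copy)]
enum WriteMode {
    Overwrite,    // default: replace an existing value
    InsertOnly,   // fail if the key already exists
    AppendSorted, // fast path for keys written in sorted order
}

// Each rkv-provided store semantic maps to the lmdb flags it needs,
// so consumers never touch the raw flag bits.
fn to_lmdb_flags(mode: WriteMode) -> u32 {
    match mode {
        WriteMode::Overwrite => 0,
        WriteMode::InsertOnly => MDB_NOOVERWRITE,
        WriteMode::AppendSorted => MDB_APPEND,
    }
}

fn main() {
    assert_eq!(to_lmdb_flags(WriteMode::Overwrite), 0);
    assert_eq!(to_lmdb_flags(WriteMode::InsertOnly), 0x10);
    assert_eq!(to_lmdb_flags(WriteMode::AppendSorted), 0x2_0000);
}
```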
@mykmelez thoughts?
Over in #65, @ncloudioj fixed some clippy nits. We should consider integrating clippy in some fashion to reduce the risk of introducing more such nits.
Per https://github.com/rust-lang-nursery/rust-clippy, "Since this is a tool for helping the developer of a library or application write better code, it is recommended not to include Clippy as a hard dependency. Options include using it as an optional dependency, as a cargo subcommand, or as an included feature during build. These options are detailed below."
I'm unsure which of these options is best.
A write transaction also supports reading, but the version of the store that it reads doesn't include changes it has made.
This seems incorrect; at least, running the code snippet shows the opposite result. Within the same write transaction, all writes should be immediately visible regardless of the commit state of that transaction.
Thanks @piatra for pointing this out by noticing the discrepancy between the documentation and the actual code behavior!
FYI: The following changes were made to this repository's wiki:
defacing spam has been removed
Restricting write access to contributors is strongly encouraged. Please make that change (documentation).
These were made as the result of a recent automated defacement of publicly writeable wikis.
Going to the link from the README of https://docs.rs/rkv/ says "The requested resource does not exist."
Bringing up the crates page at https://docs.rs/crate/rkv/0.5.1 says they failed to build, linking to the log at https://docs.rs/crate/rkv/0.5.1/builds/117580
Hi, quick question.
Why are Read and Write locks taken on the Rkv environment instead of the store? It seems that although there can be multiple stores within the same environment, the locks are global. Does this mean that I cannot have two threads writing to different databases?
Each lmdb store has a predefined size (10MB by default). If a store runs out of free space, all following inserts will be rejected by lmdb with a MDB_MAP_FULL error.
We will have to provide some bailout mechanism for this particular issue to avoid write downtime.
Likely drawing on #2, it would be useful for developers to be able to interact with rkv files directly.
In order for rkv to be usable within Gecko and from Swift and Android, it needs a C FFI on which C++/Swift/Java APIs can be built. This issue tracks defining those FFIs.
As a relatively flexible piece of infrastructure, rkv/LMDB will benefit from a tuning/usage guide. This will cover the following (and more):
… &[u8].)
WRITEMAP, fsyncs, and other durability/performance/safety tradeoffs.

As of January 1, 2019, Mozilla requires that all GitHub projects include this CODE_OF_CONDUCT.md file in the project root. The file has two parts:
If you have any questions about this file, or Code of Conduct policies and procedures, please see Mozilla-GitHub-Standards or email [email protected].
(Message COC001)
In order to maintain LMDB's requirement that each database is opened only once at a time in each process, we have a manager that canonicalizes paths and maintains a set of open databases.
This interface needs to be finished: we shouldn't hand out an Arc from the map, as that would allow duplicate opens. It might be enough to use Weak instead of Arc, which will automatically close a database if it isn't referenced by any consumer.

We don't currently have tests for the put_with_flags() functions in multi.rs and integermulti.rs. We should add those.
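Returning to the Manager discussion: the Weak-instead-of-Arc idea can be sketched with the standard library alone (the map value here is a stand-in for an open environment, not rkv's real type):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Weak};

fn main() {
    // The Manager keeps Weak handles, so a database effectively closes
    // once no consumer holds the Arc, while duplicate opens are still
    // prevented as long as one consumer is alive.
    let mut map: HashMap<String, Weak<&str>> = HashMap::new();

    let env = Arc::new("open environment");
    map.insert("/path/to/db".into(), Arc::downgrade(&env));

    // While a consumer holds the Arc, we can hand out the same instance.
    assert!(map["/path/to/db"].upgrade().is_some());

    drop(env);
    // After the last consumer drops it, upgrade fails, and the Manager
    // knows it is safe to open the path afresh.
    assert!(map["/path/to/db"].upgrade().is_none());
}
```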
We should design and implement a system for watching particular keys, or stores as a whole. Most likely we should not notify values — the recipient can read directly from the store.
See mozilla/mentat#551 for guidance.
Over in #42, I noted that "it's unclear if/how it's possible to open multiple stores within a single transaction, which LMDB itself supports," and @ncloudioj responded:
The current rkv::Store abstraction doesn't support that because it wraps the transaction into the Reader/Writer. To support multi-store reads/writes, it needs to take the transaction out from store, perhaps something like,
let txn = rkv.write();
let store_foo = rkv.create_or_open("foo");
let store_bar = rkv.create_or_open("bar");
store_foo.write(txn, "key0", "value0");
store_bar.write(txn, "key1", "value1");
txn.commit();
The downside is that users can't use the Reader/Writer any more. Another potential approach, which reuses the design of Writer/Reader, is to introduce a MultiStore so that multiple stores can be get_or_created at the same time in a single transaction. Its read/write API will be slightly different:
let store_names = vec!["foo", "bar", "baz"];
let mega_store = rkv.create_or_open(&store_names);
let writer = mega_store.write();
writer.write("foo", "key0", "value0"); // it takes a store name here
writer.write("bar", "key1", "value1");
writer.commit();
This is a tough problem. The latter approach feels a bit more intuitive and is also likely to be more compact, provided store names are short; whereas the former grows a line for each store involved in the transaction.
On the other hand, the former approach has the advantage of being more strongly typed, because stores are referenced by handle after creation, so it isn't possible to compile code that opens the "foo" and "bar" stores and then writes to the "baz" store; whereas the latter approach will happily compile that code (and then fail at runtime).
Also note #29, although #28 (comment) suggests that I didn't actually understand LMDB database handles when I filed it, and it's the wrong thing to do.
I'm also puzzling over the requirement that LMDB database handles be opened with reference to a specific transaction but can then be reused by any other transaction, as described in the docs for mdb_dbi_open, which additionally notes:
The database handle will be private to the current transaction until the transaction is successfully committed. If the transaction is aborted the handle will be closed automatically. After a successful commit the handle will reside in the shared environment, and may be used by other transactions.
This function must not be called from multiple concurrent transactions in the same process. A transaction that uses this function must finish (either commit or abort) before any other transaction in the process may use this function.
However, lmdb-rs appears to manage those constraints by acquiring a mutex and creating/committing a throwaway transaction in Environment::create_db, so that shouldn't be an issue.
Regardless, from browsing the LMDB docs, it seems like the intent is for handles to stores to be long-lived, so perhaps the former approach is better, even though it doesn't let you use Reader/Writer, as it requires you to explicitly create the handles, which also enables you to reuse them.
Or perhaps even better is a related approach in which Rkv::write returns a non-store-specific Writer rather than an lmdb::RwTransaction (ditto for Rkv::read, which returns a Reader), and it has a put method that takes a store handle rather than a store name, i.e. something like:
let store_foo = rkv.create_or_open("foo");
let store_bar = rkv.create_or_open("bar");
let writer = rkv.write();
writer.put(store_foo, "key0", "value0");
writer.put(store_bar, "key1", "value1");
writer.commit();
// store_foo and store_bar can be reused to read
let reader = rkv.read();
reader.get(store_foo, "key0");
reader.get(store_bar, "key1");
(This has the added advantage that we no longer return the low-level RoTransaction and RwTransaction lmdb-rs structs from rkv methods.)
@ncloudioj What do you think?
Now that we've forked lmdb-rs to https://github.com/mozilla/lmdb-rs to take fixes for rkv, we should switch our lmdb-rs dependency to it.
We could do so via an lmdb = { git = "https://github.com/mozilla/lmdb-rs" } entry in the [dependencies] section of Cargo.toml, or via an entry in the [patch] section. Unsure which is better.
LMDB > Caveats notes:
A broken lockfile can cause sync issues. Stale reader transactions left behind by an aborted program cause further writes to grow the database quickly…
Fix: Check for stale readers periodically, using the mdb_reader_check function or the mdb_stat tool.
We should figure out what to do about stale readers: whether to clear them out periodically ourselves or make this the responsibility of the consumer (exposing an API for them to do so).
We will have at least three different kinds of version:
… MDB_VERSION_MISMATCH.
… PRAGMA user_version. Typical uses will be to document and alter the assumptions of consuming code, to track migrations, and to lock out buggy clients. One can imagine multiple consumers using the same database file, each with their own key space and version number.
All three of these will be present in the API and in documentation.
The Rkv static constructor functions new, with_capacity, and from_env all take a path parameter, but they don't order it consistently: it's the first parameter for new and with_capacity, and the second parameter for from_env. We should make its order consistent across all three functions, which presumably means making it the first parameter for from_env as well.
LMDB > Caveats notes:
A broken lockfile can cause sync issues… stale locks can block further operation.
…
Stale writers will be cleared automatically on some systems:
- Windows - automatic
- Linux, systems using POSIX mutexes with Robust option - automatic
- not on BSD, systems using POSIX semaphores. Otherwise just make all programs using the database close it; the lockfile is always reset on first open of the environment.
We should figure out what to do about a stale writer on systems that don't clear it automatically.
Just about anything that we can get as u8s is something we can store…
This would be an incompatible version bump to add the type signature.
When deleting a single record in a DUP_SORT db, the result is always LmdbError::NotFound.
Upon tracing the lmdb-rs-sys put and delete code, the key and value passed into both have an identical set of bytes, so I'm not sure what the hangup is. I've created a unit test that demonstrates this:
https://github.com/mozilla/rkv/pull/93/files#diff-06aeb1dbfcf2eb1e00a4f7fe0edab250R488
Still not yet sure why it's failing. I guess I need to read up on LMDB docs.
It would be useful for debugging and testing to be able to dump an entire database as JSON.
This should be relatively easy to implement: all of our types work with Serde, so we would 'just' need to make an iterator over keys and values serialize as a container.
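A minimal model of the idea, using a BTreeMap in place of an LMDB cursor. A real implementation would serialize rkv's Value types with serde_json and handle string escaping; this sketch only shows the ordered walk-and-emit shape:

```rust
use std::collections::BTreeMap;

// Walk key/value pairs in key order (as an LMDB cursor would) and
// emit them as one JSON object. Escaping is deliberately omitted.
fn dump_as_json(pairs: &BTreeMap<String, String>) -> String {
    let body: Vec<String> = pairs
        .iter()
        .map(|(k, v)| format!("\"{}\":\"{}\"", k, v))
        .collect();
    format!("{{{}}}", body.join(","))
}

fn main() {
    let mut db = BTreeMap::new();
    db.insert("foo".to_string(), "bar".to_string());
    db.insert("baz".to_string(), "qux".to_string());
    // BTreeMap iterates in key order, like an LMDB cursor.
    assert_eq!(dump_as_json(&db), r#"{"baz":"qux","foo":"bar"}"#);
}
```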
The Readable trait in readwrite.rs currently leaks the lmdb::RoCursor type, and we generally don't want to show consumers the types from the lmdb crate dependency (preferring to encapsulate them in higher-level rkv types). We should figure out a way to hide that type from consumers of the Reader/Writer structs that implement the Readable trait.
PR #62 is a breaking change, so we should rev the minor version for it and publish a new release.
Currently Rkv provides 4 types of Stores: Single, Multi, Integer and MultiInteger. All of them require the value to be of type Value, which imposes certain overhead, since the values must be encoded and decoded (and copied). This can be undesirable if the user only uses the Blob type for values.
In general, it feels like this type (Value) and its encoding logic (or compression) should be specific to each user, and I don't quite understand why it is here. I could use lmdb-rkv directly, but it would be nice not to deal with UNC paths on Windows and restrictions like one environment per process per path. Consider adding methods to deal with &[u8] values instead of Value.
On Windows, when using Windows Subsystem for Linux (with the Ubuntu distro), I see a bunch of test failures:
$ cargo test
Finished dev [unoptimized + debuginfo] target(s) in 5.96s
Running target/debug/deps/rkv-5955ce6badfae226
running 22 tests
test env::tests::test_concurrent_read_transactions_prohibited ... ok
test env::tests::test_blob ... FAILED
test env::tests::test_delete_value ... FAILED
test env::tests::test_isolation ... FAILED
test env::tests::test_iter ... FAILED
test env::tests::test_iter_from_key_greater_than_existing ... FAILED
test env::tests::test_multiple_store_iter ... FAILED
test env::tests::test_multiple_store_read_write ... FAILED
test env::tests::test_open ... FAILED
test env::tests::test_open_a_missing_store ... ok
test env::tests::test_open_fail_with_badrslot ... ok
test env::tests::test_open_fails ... ok
test env::tests::test_open_from_env ... FAILED
test env::tests::test_open_store_for_read ... FAILED
test env::tests::test_open_with_capacity ... FAILED
test env::tests::test_read_before_write_num ... FAILED
test env::tests::test_read_before_write_str ... FAILED
test env::tests::test_round_trip_and_transactions ... FAILED
test integer::tests::test_integer_keys ... FAILED
test manager::tests::test_same ... ok
test manager::tests::test_same_with_capacity ... ok
test env::tests::test_store_multiple_thread ... FAILED
thread '<unnamed>' panicked at 'written: LmdbError(Corrupted)', libcore/result.rs:945:5
thread '<unnamed>' panicked at 'rkv: "PoisonError { inner: .. }"', libcore/result.rs:945:5
thread '<unnamed>' panicked at 'rkv: "PoisonError { inner: .. }"', libcore/result.rs:945:5
thread '<unnamed>' panicked at 'rkv: "PoisonError { inner: .. }"', libcore/result.rs:945:5
thread '<unnamed>' panicked at 'rkv: "PoisonError { inner: .. }"', libcore/result.rs:945:5
thread '<unnamed>' panicked at 'rkv: "PoisonError { inner: .. }"', libcore/result.rs:945:5
thread '<unnamed>' panicked at 'rkv: "PoisonError { inner: .. }"', libcore/result.rs:945:5
thread '<unnamed>' panicked at 'rkv: "PoisonError { inner: .. }"', libcore/result.rs:945:5
thread '<unnamed>' panicked at 'rkv: "PoisonError { inner: .. }"', libcore/result.rs:945:5

---- env::tests::test_blob stdout ----
thread 'env::tests::test_blob' panicked at 'read: LmdbError(BadTxn)', libcore/result.rs:945:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
---- env::tests::test_delete_value stdout ----
thread 'env::tests::test_delete_value' panicked at 'wrote: LmdbError(BadTxn)', libcore/result.rs:945:5
note: Panic did not include expected string 'not yet implemented'
---- env::tests::test_isolation stdout ----
thread 'env::tests::test_isolation' panicked at 'wrote: LmdbError(BadTxn)', libcore/result.rs:945:5
---- env::tests::test_iter stdout ----
thread 'env::tests::test_iter' panicked at 'Unexpected LMDB error BadTxn.', /home/myk/.cargo/registry/src/github.com-1ecc6299db9ec823/lmdb-rkv-0.8.2/src/cursor.rs:263:17
---- env::tests::test_iter_from_key_greater_than_existing stdout ----
thread 'env::tests::test_iter_from_key_greater_than_existing' panicked at 'wrote: LmdbError(BadTxn)', libcore/result.rs:945:5
---- env::tests::test_multiple_store_iter stdout ----
thread 'env::tests::test_multiple_store_iter' panicked at 'opened: LmdbError(Corrupted)', libcore/result.rs:945:5
---- env::tests::test_multiple_store_read_write stdout ----
thread 'env::tests::test_multiple_store_read_write' panicked at 'opened: LmdbError(Corrupted)', libcore/result.rs:945:5
---- env::tests::test_open stdout ----
Root path: "/tmp/test_openBkPsg4"
thread 'env::tests::test_open' panicked at 'success but no value: LmdbError(BadTxn)', libcore/result.rs:945:5
---- env::tests::test_open_from_env stdout ----
Root path: "/tmp/test_open_from_envuAftlz"
thread 'env::tests::test_open_from_env' panicked at 'success but no value: LmdbError(BadTxn)', libcore/result.rs:945:5
---- env::tests::test_open_store_for_read stdout ----
thread 'env::tests::test_open_store_for_read' panicked at 'write: LmdbError(BadTxn)', libcore/result.rs:945:5
---- env::tests::test_open_with_capacity stdout ----
Root path: "/tmp/test_open_with_capacityvY2MVw"
thread 'env::tests::test_open_with_capacity' panicked at 'success but no value: LmdbError(BadTxn)', libcore/result.rs:945:5
note: Panic did not include expected string 'opened: LmdbError(DbsFull)'
---- env::tests::test_read_before_write_num stdout ----
thread 'env::tests::test_read_before_write_num' panicked at 'read: LmdbError(BadTxn)', libcore/result.rs:945:5
---- env::tests::test_read_before_write_str stdout ----
thread 'env::tests::test_read_before_write_str' panicked at 'read: LmdbError(BadTxn)', libcore/result.rs:945:5
---- env::tests::test_round_trip_and_transactions stdout ----
thread 'env::tests::test_round_trip_and_transactions' panicked at 'wrote: LmdbError(BadTxn)', libcore/result.rs:945:5
---- integer::tests::test_integer_keys stdout ----
thread 'integer::tests::test_integer_keys' panicked at 'write: LmdbError(BadTxn)', libcore/result.rs:945:5
---- env::tests::test_store_multiple_thread stdout ----
thread 'env::tests::test_store_multiple_thread' panicked at 'joined: Any', libcore/result.rs:945:5
failures:
env::tests::test_blob
env::tests::test_delete_value
env::tests::test_isolation
env::tests::test_iter
env::tests::test_iter_from_key_greater_than_existing
env::tests::test_multiple_store_iter
env::tests::test_multiple_store_read_write
env::tests::test_open
env::tests::test_open_from_env
env::tests::test_open_store_for_read
env::tests::test_open_with_capacity
env::tests::test_read_before_write_num
env::tests::test_read_before_write_str
env::tests::test_round_trip_and_transactions
env::tests::test_store_multiple_thread
integer::tests::test_integer_keys
test result: FAILED. 6 passed; 16 failed; 0 ignored; 0 measured; 0 filtered out
I don't see these on Windows outside of WSL, however; nor on Ubuntu running outside of Windows (in a virtual machine on a macOS host).
Use JNA and the C FFI (#6) to build an idiomatic Android storage library.
Since the Store type is essentially Copy-able, all its occurrences in function calls could be passed by value instead of by reference.
This would be consistent with the API definitions in LMDB. Clippy also suggests this change for efficiency reasons.
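A small illustration of the point, with a stub Store rather than rkv's real type: once the handle is Copy, passing it by value is a trivial copy and leaves the caller's binding usable, mirroring LMDB's own integer database handle (MDB_dbi):

```rust
// A Copy handle type can be passed by value cheaply; in LMDB itself
// a database handle is just an integer (MDB_dbi).
#[derive(Clone, Copy)]
struct Store {
    dbi: u32, // models lmdb's MDB_dbi
}

fn read_from(store: Store) -> u32 {
    // Taking `store` by value copies the handle; the caller keeps its own.
    store.dbi
}

fn main() {
    let s = Store { dbi: 7 };
    assert_eq!(read_from(s), 7);
    assert_eq!(read_from(s), 7); // s is still usable: copied, not moved
}
```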
We should run the example programs in examples/ in automation to ensure we catch bustage.
We don't currently have tests for the delete_all() functions in multi.rs and integermulti.rs. We should add those.
We should check that our error hierarchy is clear and understandable, and make sure it's documented well.
LMDB stores key-value pairs for named databases in the default database, which makes it dangerous for an application to open both named databases and the default database within the same environment using rkv, as the default database will contain pairs they didn't add, and those pairs cannot be read by rkv (because they aren't formatted the way rkv expects, i.e. by using bincode to serialize Rust values to bytes).
Thus we should prevent applications from opening both named databases and the default database within the same environment via a compile time (ideally) or runtime error.
Per the conversation in #28, the Manager should grow to conserve and hand out Arc<Store> instances and ensure that Store itself is Sync (or otherwise help the consumer satisfy the constraint from http://www.lmdb.tech/doc/ to "not have open an LMDB database twice in the same process at the same time").
There are currently three Rkv methods that create environments: new, with_capacity, and from_env. If one consumer opens an environment using Rkv::new (which uses the default capacity of 5 databases), and then a second consumer tries to open the same environment with Rkv::with_capacity(10), then the Manager's RwLock will serialize those calls, but what should the result of the second call be? If we return the cached environment, then we're returning an environment with a different capacity than the consumer requested.
At the moment, we actually avoid this problem, since Manager::get_or_create only accepts an Rkv::* callback that takes a single parameter, which means it only accepts Rkv::new, since Rkv::with_capacity and Rkv::from_env both take two parameters. But that raises a new issue: how do you use the Manager to protect an environment with a non-default configuration?
We should rethink the way we manage environments to enable managing environments with non-default configurations. In the process, we'll need to figure out what to return when a consumer tries to get_or_create
an environment with a different configuration than an already-cached environment for the same path.
rkv should provide some mechanism for consumers to migrate database files between 32-bit and 64-bit builds, since LMDB 0.9 files are bit-width-dependent, and users sometimes switch between 32-bit and 64-bit builds of software that uses rkv or copy database files from a 32-bit to a 64-bit system.
This blog post about The LMDB file format notes that it's possible to compile a 32-bit build of mdb_dump and use that on a 64-bit system to dump a 32-bit database file to a portable format that can be reloaded into a new file. It also references this Lua reimplementation of the mdb_dump utility that can read 32-bit database files on a 64-bit system (and presumably vice-versa).
Note that the database format on the LMDB master branch (i.e. the development version that is slated to become LMDB 1.0) is bit-width-independent, so this issue won't exist there. But if we upgrade rkv to LMDB master/1.0 in the future, then we'll have to deal with database migration from the 0.9 format to the 1.0 format. So we can't avoid this issue by upgrading the version of LMDB we embed.
LMDB provides mdb_drop
with two flavors:
lmdb-rs has two separate functions for this, txn.clear_db and txn.drop_db, respectively. The latter is marked as unsafe, because the underlying store will be unsafe to use after the call. Rkv can't enforce that; it's all up to the consumers.
@mykmelez For kvstore, shall we focus on the common case "clear" for now?
I have a function where I'd like to fetch a value from one key, then use that retrieved value to delete another key. I'm sure there is a way around this, but I can't find it.
The problem seems to be that the value I retrieve has a narrower lifetime than the writer, and I can't seem to convince it otherwise:
pub fn del_by_txn(&self,
writer: &mut Writer<&str>,
store: Store,
name: &str,
key: &str) -> Result<(), MegadexError> {
let idstore = self.indices.get(name).ok_or(MegadexError::IndexUndefined(name.into()))?;
match writer.get(idstore, key)? {
Some(Value::Str(ref id)) => writer.delete(&self.main, id).map_err(|e| e.into()),
None => return Ok(()),
e => return Err(MegadexError::InvalidType("Str".into(), format!("{:?}", e))),
}
}
results in:
error[E0623]: lifetime mismatch
--> megadex/src/lib.rs:142:67
|
136 | writer: &mut Writer<&str>,
| ------------ these two types are declared with different lifetimes...
...
142 | Some(Value::Str(ref id)) => writer.delete(&self.main, id).map_err(|e| e.into()),
| ^^ ...but data from `writer` flows into `writer` here
I've tried making the function generic over K: AsRef<[u8]>, like the prototype of Writer, and setting key to type K; however, that fails because I can't make the retrieved id equal type K. And if I'm explicit about the types, per above, I can't seem to make data from writer flow into writer.
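One workaround, shown here against a stub Writer rather than rkv's real type, is to copy the retrieved id into an owned String before calling delete; the copy ends the immutable borrow of the writer, so the later mutable borrow is legal:

```rust
// Model of the borrow problem: `get` borrows the writer immutably,
// so the returned &str keeps the writer borrowed, and a later
// `delete` (which needs &mut self) is rejected. Copying the value
// out with `to_owned()` ends the borrow.
struct Writer {
    data: Vec<(String, String)>,
}

impl Writer {
    fn get(&self, key: &str) -> Option<&str> {
        self.data
            .iter()
            .find(|(k, _)| k.as_str() == key)
            .map(|(_, v)| v.as_str())
    }
    fn delete(&mut self, key: &str) {
        self.data.retain(|(k, _)| k.as_str() != key);
    }
}

fn main() {
    let mut w = Writer {
        data: vec![
            ("name".to_string(), "id42".to_string()),
            ("id42".to_string(), "payload".to_string()),
        ],
    };
    // Copy the id out so the immutable borrow of `w` ends here…
    let id: String = w.get("name").unwrap().to_owned();
    // …which makes the mutable borrow for delete legal.
    w.delete(&id);
    assert!(w.get("id42").is_none());
}
```

The extra allocation is the price of decoupling the retrieved value's lifetime from the writer's.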