Giter VIP home page Giter VIP logo

Comments (12)

ordian avatar ordian commented on August 23, 2024

cc #4

Avoiding copies where necessary. (Rather than decoding the value, return &[u8].)

from rkv.

mykmelez avatar mykmelez commented on August 23, 2024

The Value type is designed to prevent an accidental "cast" from one type to another when storing and then retrieving a value.

I understand how Value might feel extraneous for a use case that only ever deals with Blob values. And I'm open to ideas for mitigating the overhead. However, I'm also wary of adding more complexity to the interface, and a parallel set of methods that take &[u8] values seems like it would double the cognitive overhead of the *Store types.

from rkv.

ordian avatar ordian commented on August 23, 2024

@mykmelez why not just this (a breaking change)?

pub fn put<K: AsRef<[u8]>, V: AsRef<[u8]>>(self, txn: &mut RwTransaction, k: K, v: V) -> Result<(), StoreError>;
pub fn get<T: Transaction, K: AsRef<[u8]>>(self, txn: &T, k: K) -> Result<Option<&[u8]>, StoreError> 

from rkv.

ordian avatar ordian commented on August 23, 2024

or if you want to keep the Value type

pub fn put<'a, K: AsRef<[u8]>, V: Cow<'a, [u8]>>(self, txn: &mut RwTransaction, k: K, v: V) -> Result<(), StoreError>;

EDIT: V: AsRef<[u8]> should work for Value as well.

And for a value the calls would be

// serialization to vec in `to_bytes` can't fail, right?
store.put(&mut writer, key, value.to_bytes()?);
store.get(&reader, key).and_then(Value::parse);

// Expose a helper function to transform the result.
impl Value {
    /// Tries to parse the value from bytes.
    pub fn parse(value: Option<&[u8]>) -> Result<Option<Value>, StoreError> {
        Ok(match value {
            Some(bytes) => Value::from_tagged_slice(bytes).map(Some).map_err(StoreError::DataError)?,
            None => None,
        })
    }
}

and for bytes it would simply be

store.put(&mut writer, key, bytes);
store.get(&reader, key);

from rkv.

mykmelez avatar mykmelez commented on August 23, 2024
store.put(&mut writer, key, value.to_bytes()?);
store.get(&reader, key).and_then(Value::parse);

This feels harder than it should be to read and write typed values.

I wonder if it would be sufficient to expose a new Store type (RawStore?) that takes and returns only &[u8] values. I'm unsure how to generalize that across the types of keys (AsRef<[u8]>, PrimitiveInt) and single vs. multi-stores, however. I don't want to end up with four new store types!

from rkv.

devinrsmith avatar devinrsmith commented on August 23, 2024

First of all - rkv looks like a great project. I like the level of documentation.

I was going to open an issue that was almost exactly the same as this one. For my use case, I'd always be using Blobs - and the necessity of always wrapping and unwrapping against that type is a bit against the rust mantra of "zero cost abstractions".

My guess is that most use cases will always choose a single value type, and in the current implementation would always end up wrapping and unwrapping against that. I think it would be good to abstract a little bit, and pull the value type up to the db layer - allow the user to choose any generic value type they want (even Value), as long as it implemented the required traits.

from rkv.

devinrsmith avatar devinrsmith commented on August 23, 2024

I did a quick survey of some other rust LMDB libraries -

lmdb - uses trait AsRef<[u8]> for values
lmdb-zero:

pub trait AsLmdbBytes {
    fn as_lmdb_bytes(&self) -> &[u8];
}

lmdb-rs-m:

pub trait ToMdbValue {
    fn to_mdb_value<'a>(&'a self) -> MdbValue<'a>;
}

from rkv.

ordian avatar ordian commented on August 23, 2024

I wonder if it would be sufficient to expose a new Store type (RawStore?) that takes and returns only &[u8] values. I'm unsure how to generalize that across the types of keys (AsRef<[u8]>, PrimitiveInt) and single vs. multi-stores, however. I don't want to end up with four new store types!

@mykmelez What do you think of making store types generic over a value type?
Something like this:

pub struct SingleStore<V: Value> {
    /// ...,
    phantom: std::marker::PhantomData<V>,
}

impl<V: Value> for SingleStore<V> {
    pub fn get<T: Readable, K: AsRef<[u8]>>(
        self,
        reader: &T,
        k: K
    ) -> Result<Option<V>, StoreError> {...}
    pub fn put<K: AsRef<[u8]>>(
        self,
        writer: &mut Writer,
        k: K,
        v: V
    ) -> Result<(), StoreError> {...}
}


// FIXME: inherit from std::convert::TryFrom<Error=Error, [u8]> when it's stabilized
extern crate try_from;
use try_from::TryFrom;

pub trait Value<'a>: TryFrom<&'a [u8], Err = Error> + AsRef<[u8]> {}   

impl Value for rkv::value::Value {
    /// ...
}

Note, that trait Value inherits AsRef<[u8]>, I think value.to_bytes() should always succeed (for the struct Value, it allocates a vector a writes bytes to it, which should never fail).

from rkv.

mykmelez avatar mykmelez commented on August 23, 2024
pub struct SingleStore<V: Value> {
    /// ...,
    phantom: std::marker::PhantomData<V>,
}

This would require structs that own a SingleStore to specify the lifetime of its Value type:

pub struct OwnsSingleStore<'s> {
    store: SingleStore<Value<'s>>,
}

Which limits usage in Firefox, where rkv is being exposed to JavaScript code via an FFI that cannot wrap generic types.

I wonder if it would be possible to make store types enums, with a variant that accepts the value::Value type and another that generalizes across types implementing the Value trait (here called ValueTrait to distinguish it from the value::Value type and the SingleStore::Value variant ):

#[derive(Copy, Clone)]
pub enum SingleStore<V: ValueTrait> {
    Value { db: Database, },
    ValueTrait {
        db: Database,
        phantom: PhantomData<V>,
    },
}

Then, assuming one can implement ValueTrait for some type that doesn't require specifying its lifetime, perhaps even the unit type, it should be possible for a struct to own a SingleStore<()> of the Value variant.

That would still be ergonomically suboptimal, as consumers would have to specify the type of ValueTrait even if they use the SingleStore::Value variant, which ignores it. And I suppose we could simplify via a newtype like pub struct ValueSimpleStore(SingleStore<()>). But if each store type has such a newtype, then we're back to doubling the number of store types, and we might as well implement a separate store type that generalizes over ValueTrait.

I wonder if the approach taken by the lmdb crate would be preferable: provide access to the wrapped type, so consumers who aren't satisfied by the higher-level type can access the lower-level type directly. For the lmdb crate, that means you can get an ffi::MDB_dbi from an lmdb::Database. For rkv, it would mean you can get an lmdb::Database from an rkv::SingleStore.

So you could create a SingleStore, taking advantage of rkv's higher-level affordances, and then retrieve its Database to use &[u8] values directly.

from rkv.

ordian avatar ordian commented on August 23, 2024

I wonder if it would be possible to make store types enums, with a variant that accepts the value::Value type and another that generalizes across types implementing the Value trait

I'm not sure it's possible to specify ValueTrait w/o a lifetime specifier. For &'a [u8] values, 'a should be bound by the reader lifetime. Alternatively, one could make get and put methods generic over Value (instead of store types), but that would require a user to specify a value type for each method call, which is not nice.

For rkv, it would mean you can get an lmdb::Database from an rkv::SingleStore.
So you could create a SingleStore, taking advantage of rkv's higher-level affordances, and then retrieve its Database to use &[u8] values directly.

This seems like a good approach to me.

from rkv.

rrichardson avatar rrichardson commented on August 23, 2024

Is there anything stopping us from simply adding a put_raw and get_raw (or put_bytes get_bytes whatever) to each of the Store APIs?

Failing that, I like the RawStore or ByteStore idea. I mean, Rkv is about type-safe serialized object storage. It makes sense to support optimizations for a &[u8] key/value, but I don't think that the data model should change for it.

from rkv.

rnewman avatar rnewman commented on August 23, 2024

One could conceivably implement SingleStore in terms of a RawStore, which adds to its appeal.

I intended rkv to expose most of LMDB's capabilities with way, way fewer sharp edges, and introducing a rudimentary type system of tagged values was part of that. But I had also intended to facilitate storing an arbitrary set of bytes.

I haven't thought it through in detail, but conceivably the Blob bincode calls to deserialize and serialize can be omitted, which should preserve the benefits of tagging with very little overhead — after all, they're just serializing/deserializing [u8] to [u8].

from rkv.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.