Comments (5)
I'm not sure if this means fully hierarchical keys, or just moving from bucket+key to potentially bucket+key+subkey. Everything before "Caveats" seems to imply the former, but the "Only support single-level" phrase conflicts with that. Personally, I don't see a single, fixed, extra level of depth worth the effort. Complex uses will end up having to construct a compound key in a []byte anyway.
Fully hierarchical keys would be nice, assuming the API and internals can still remain clean. What if it's just Buckets all the way down?
// variadic so you can go multiple levels deep in one call.
// returns nil if bucket does not exist.
func (db *DB) Bucket(keys ...[]byte) *Bucket
func (db *DB) CreateBucket(key []byte) error
func (db *DB) CreateBucketIfNotExists(key []byte) error
// just like DB.Bucket, goes deeper, returns nil for non-existing buckets.
// calling Bucket on nil receivers is valid and returns nil.
func (b *Bucket) Bucket(keys ...[]byte) *Bucket
// gives error if b==nil.
// buckets and keys directly under this bucket are in separate namespaces; that is,
// a bucket can have both a sub-bucket and a data item with the same key.
func (b *Bucket) CreateBucket(key []byte) error
func (b *Bucket) CreateBucketIfNotExists(key []byte) error
// get a key inside this bucket; returns nil if b==nil or key is not found
func (b *Bucket) Get(key []byte) []byte
// walk the bucket; returns nil if b==nil
func (b *Bucket) Cursor() *Cursor
The variadic versions make it easier to use, and guaranteeing behavior for b==nil helps write error checks just at the end. For example, if I have a compound key <timestamp, userid, itemid, mumble>, these two would be interchangeable:
a := db.Bucket([]byte("widgets")).Bucket(timestamp).Bucket(userid).Bucket(itemid).Get(mumble)
b := db.Bucket([]byte("widgets"), timestamp, userid, itemid).Get(mumble)
So .Bucket() becomes a lot like cd
, it's a reference deeper into the tree, and using it avoids repeating & re-walking the path there. I don't know much about the internals, if this is feasible at all.
And, well, then you could also support direct Get and Put with hierarchical keys. Using a Bucket is means of 1. avoiding the walking cost 2. often, staying within one transaction.
func (db *DB) Get(keys ...[]byte) []byte
func (b *Bucket) Get(keys ...[]byte) []byte
Also, I switched bucket keys to []byte in the examples. I feel like using both string and []byte for keys in one API misleading.
Anyway, appreciate all the work and please be careful with api growth.
P.S. I've been looking for a way to encode sql-style compound keys in a []byte while preserving sort order. Everything I've been able to find assumes that all variable length fields are nul-terminated, and thus do not contain all possible binary values; and quoting nuls is hard if you're trying to preserve sort order. Hierarchical keys would eliminate the need.
from bolt.
@tv42 I think Buckets all the way down is probably the way to go. I like the idea of variadic args too.
One of my concerns with infinitely-nestable buckets is cursor allocations. I wanted to be able to allocate a main cursor and a subcursor initially and just reuse them. Although as long as we can reuse the same cursor at each depth in the hierarchy then I'm fine with n-levels.
You bring up a good point about bucket names being bytes instead of strings. I'll probably have to change that.
from bolt.
I think []byte is more appropriate for keys due to the cost of copying []byte to string.
from bolt.
I wrote a library that might be relevant: https://github.com/tv42/compound
not as in BoltDB using it, but because of the underlying trick.
It can combine multiple []byte into a single key, in a way that does not confuse "foo"+"bar" with "foob"+"ar", while preserving sort order.
The trick is: quote 0x00 as 0x00 0xFF, separate items with 0x00 0xNN where NN is any value other than FF. I use that space to store the data type, for extra safety.
So for Bolt, you could make the bucket (or key) space be nested, and join the []byte names with this logic. The API could look like the varargs stuff above.
It seems lmdb internals have a big separation between buckets and keys, not sure if blowing up the number of buckets would hurt too much. Basically, logically you have hierarchical keys:
val := tx.Get(a, b, c, d, e)
but where do you draw the boundary between "buckets" and "keys"? In my mind, the bucket separation is largely an artifice of not having this hierarchy in the first place, so it might not make sense to keep that separation, if the API has this power.. but then again, I don't know much that would change the implementation details.
from bolt.
@tv42 Thanks for the link to the library. I'll check it out tomorrow. Just as a heads up, I'm going to be implementing nested keys this week because we're looking to do Bolt integration and production-level testing next week and that's the last missing piece.
My plan right now is to utilize the leafPageElement.flags
field to create two new leaf element types: an embedded bucket type and a regular bucket type. (The names of those types will probably change)
The embedded bucket will be used if the subbucket contents are small enough to fit inline on the leaf page. The regular bucket type will be used if the subbucket contents are large and then a root page pointer will be used just like the current bucket
type.
I'll add some more details tomorrow. I gotta run.
from bolt.
Related Issues (20)
- C/C++ binding for key/value storage HOT 3
- Security at rest HOT 1
- Can I use the same key for nested bucket and filed in a bucket/ HOT 1
- Document for New Contributors?
- page already freed HOT 5
- How to check if a file is a valid boltdb database without panicing HOT 2
- Check is a database open HOT 1
- what kind fo key can enhance write speed HOT 1
- concurrent writes and deadlocks HOT 2
- [RFE] change sequence number HOT 2
- [RFE] Database Generation ID HOT 7
- Database file size not updating after reaching 1GB HOT 3
- Meta2 make DBFile invalid HOT 1
- how to get the value that the current cursor points to HOT 3
- how to use boltdb for multiple files HOT 2
- page already freed on certain builds
- Not able to create subbucket inside loop
- Tons of compilation errors HOT 1
- permission denied in user home directory when open boltdb path HOT 1
- Cursor.Last() returns nil for non-empty bucket
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from bolt.