khonsulabs / bonsaidb Goto Github PK

View Code? Open in Web Editor NEW

993.0 993.0 37.0 5.5 MB

A developer-friendly document database that grows with you, written in Rust

Home Page: https://bonsaidb.io/

License: Apache License 2.0

Rust 99.60% JavaScript 0.01% HTML 0.40%

database rust

bonsaidb's People

Stargazers

Watchers

bonsaidb's Issues

Add better range types for queries

Right now the Query API forces using Range, but ideally it would allow any RangeBound. This means the API needs to change, however.

I did some initial searching for a quick solution but didn't find any other range types that supported serde out of the box. It's not a tough problem, just something that seems low-priority for now.

Create Transaction Builder methods

Currently, push() and update() are implemented by creating single-entry transactions and issuing apply_transaction() with the created transaction.

We should have a way to build a transaction with more than one entry.

Add pagination to view queries

Unsure of how this impacts local queries. For iteration in sled, each result is returned, so "skipping" to catch up in a view would still scan those items. I'm not sure if pagination makes sense in the traditional sense. If sled, however, is returning handles to data that can be loaded, then we could support simple pagination by just skipping along the iterator.

If traditional pagination will make it too difficult to keep things performant, we should think if we care to expose pagination -- we could expose a "result limit", and you're responsible for incrementing your key request range.

Figure out TLS strategy for WebSockets

The initial implementation in #28 was a quick-and-dirty way to get a secondary transport to help test what bugs were in PliantDB vs the new QUIC transport layer.

Ideally, we would support a layer of routing to eventually support REST APIs and more. This would move the WebSocket endpoint to a URL.

I'm uncertain if using warp is the best for this or not. It's what I currently am most familiar with, and it seems like it would support the composability I would hope to offer someday.

Add View version checking to the integrity scanner

Add a schema map to the database to define the view versions. If a version change is detected, the integrity scanner should just invalidate the entire tree.

Add better hostname resolution

For Fabruic, we've opted for connect_with to resolve a hostname using the CloudFlare resolver with very secure settings. This is perfect for a game client, but not great for hosting a database server on a private network without hostnames or with private DNS.

Right now, PliantDB doesn't use connect_with and instead resolves the hostname using Tokio's ToSocketAddrs and uses connect with the resolved address. This means PliantDB currently works great for hosting a database, but for secure, trustworthy DNS resolution, we don't have a solution.

I see a couple of options:

Add an alternate path to Fabruic for using the Tokio resolver instead of trust-dns, and have a PliantDB feature flag to switch resolvers used.
Have PliantDB optionally use connect_with only if a non-IP-address hostname is detected, but keeping the current behavior by default.

Given the goals of fabruic, I'm leaning towards solving it completely in PliantDB with something like option 2.

Optimize how reduce is executed

Currently, reduce is implemented by default on the View trait with an Err that returns that the method isn't implemented. When executing the view code, we always call reduce, which means we always do a little extra work even if the view hasn't actually provided an implementation.

Either we should test if it's actually implemented when registering the view and note it somewhere and optimize this flow, or we should refactor how Reduce is implemented -- can we leverage the type system better by splitting traits? Or at the end of the day, is that more complicated?

Implement View reduction

Initial implementation should reduce values into the MapEntry, which means that for unique key outputs, we keep the values minimized.

When querying, we re-reduce using all resulting values, using the cached entries from each of the nodes.

When a document is removed, we have to re-reduce the remaining values.

Add tests for integrity checker

To test this properly, we need to be able to query with stale results. Blocked by #13.

Test configuring to run on launch and verify it automatically scans without running queries.
Test that running a stale query after adding a new view (should return no restults)

Add option to specify data update policy on view queries

The default behavior is to ensure the view is up-to-date before returning any data.

Keep this behavior by default, but add the ability to:

Return stale data (don't wait for the view to update, pull whatever it can immediately with as little delay as possible)
Same as above, except kick off a view update job to run in the background.

This allows ultimate flexibility: If you want eventually-consistent data accesss, use "update_after". If you know another process is updating the view regularly, you can just request stale data always allowing the other process to control the "caching".

Add Collection Versioning (allows basic migration logic)

We need to have a way to allow collections to upgrade themselves between versions.

Add an CollectionSchema trait with a version() function, like View. Also add a a function like async fn upgrade_from_previous_version<C: Connection>(connection: &C, stored_version: u64). Maybe blocked by #113.
Store a mapping of all known schema names and their versions when creating a database.
Upon opening an existing database, check the list of schemas against the ones stored. If any versions don't match, call upgrade_from_previous_version before allowing any operations on that collection.
A list of all views should be stored for each collection.
If a collection or view is missing after upgrade, the files should be removed.
- This sounds dangerous, but having data that must be cleaned up manually is also bad. Maybe there should be a setting that prevents this behavior for Collections? Views are ephemeral so this is fine for views no matter what.
Schema should have a callback that is invoked when any set of collections are upgraded (with the list of upgraded collections).

Add WASM support to Client

We should transparently support using the PliantDb client within WASM. The client will only be able to support using the WebSocket protocol.

Add options for fetching multiple documents

Add unit test for cross-plane `PubSub`

In working on creating a more extensive demo, I found server-side generated PubSub messages never reached the clients.

As I finished fixing it today, I realized that also websockets and pliant clients wouldn't have heard each other either. This is all fixed, but we should have a unit test to test these use cases.

Refactor client workers to share more code

The process and structure of websocket_worker.rs and worker.rs are very similar. It should be possible to abstract most of the logic away and keep the transport-specific code minimized to small chunks of glue code.

Design View Query API

CouchDB View API for reference

Allow querying by:

Single key
Range of keys
Set of keys (multi-get)
-Option to include documents in the response-
-Descending order for reverse iteration-
-Allow skipping updating the view by specifying either an "update after" or simply "allow stale data" option-
-Pagination-
-Sorting?-

Pick what we should do for V1 and split the rest into other tasks.

Add ability to query views with documents optionally returned

Register an IANA port

Right now, BonsaiDb uses UDP Port 5645, unless otherwise specified, and is not registered with IANA. I've taken initial steps to attempt to register a port with IANA, but the reality is that this project is early in development. Because there have been deployments of BonsaiDb in non-experimental environments, we are using a currently-unassigned user port.

The only experimental UDP ports available are 1021 and 1022, both of which require superuser privileges to bind to on Linux.

This ticket is to serve as a reminder that there is no guarantee or expectation that the port used by default will be available at this time. Technically even registering a port doesn't give you that guarantee, but it at least gives us more of a right to use that port by default in deployments.

Add reading of transaction logs and unit tests

While we're not working on replication yet, we should add the necessary methods to read the transaction logs so that we can unit test the transactions.

Handle graceful shutdown properly

For both websocket and fabruic connections, we should have a shutdown handle that can be "selected" with each of the payload receivers, so that when a shutdown is requested, any existing requests for connections are handled.

Introduce a shared shutdown signaling mechanism
#32
Update all connection types to reject requests with a ShuttingDown error once a graceful shutdown phase has begun.
Update all connection types to close their individual connections once a GracefulShutdown has begun and all outstanding requests have been serviced.

Write a book

Originally, I thought it would be a long time before a book would be useful, preferring to put more documentation into the docs themselves. However, the more I think about it, a step-by-step guide to adopting, using, and administrating PliantDB would be incredibly useful for adoption.

I think the general goal should be for the book to be focused on practical use cases and "guide"-style documentation. The docs in the code should be focused on the functionality of the code, and reference the book when sections are available that are helpful.

An idea for the book would be to build an app from start to finish with sections highlighting the migration from a single-user executable all the way to a fully deployed cluster. Perhaps using Kludgine as the UI to tie all the projects together.

Handle graceful websocket disconnects correctly

Right now the server and client do not do any of the recommended WebSocket closing procedures. It doesn't really impact the protocol used in PliantDB, but my understanding from doing some research is that if we want to support interacting with WebSockets on the browser, implementing graceful closing will prevent errors from popping up in the browser consoles for expected disconnections.

Implement periodically-saved Key/Value Store

At the local database level, this should be implemented as a lightweight atomic-operation focused key-value store looking to replicate these features in redis:

Atomic locks, which should be expandable to using a leader-leader synchronization scheme a la Redlock
Time-expiring keys
Atomic operations, including most of the related commands on this page.

We can go above and beyond the default redis configuration and use Sled to allow the data set to be larger than what can fit in memory: Use a Sled tree to store each entry, but keep an in-memory cache of recently used keys. When keys are modified, keep track and only update each changed entry when flushing to Sled.

The last trick of the puzzle is to try to enforce memory limitations on the amount of data loaded in memory and evicting keys based on last usage.

When exposed in this fashion, the API should fit into the existing DatabaseRequest/DatabaseResponse structures, and make exposing them to the Client fairly straightforward as those things go.

Create Example App

While this project is being developed to be the core architecture of Cosmic Verge, it's a huge project, and it's nice to have more achievable goals.

For Cosmic Verge, there's a unified vision of using PliantDb as an app platform. We are developing Gooey with the goal of being able to write functional PliantDb-backed applications using mostly cross-platform code.

The idea of the example application is still nebulous. A typical example is a todo app, and honestly it's tempting because I'm between todo apps myself right now. But, for now, this is a placeholder issue.

To track this "milestone", refer to this project

Add ability to emit multiple values from view maps

Use case for this feature: storing a tags array in a document. A view to list all documents by tag would want to emit one entry per tag in the array.

Two approaches:

Return a Vec as an option.
Switch to a collection that's passed in that can emitted against.

I slightly prefer the latter, but the former feels more functional in design.

Prevent memory exhaustion attacks via serialization

We shouldn't ever use bincode::deserialize directly. The preferred method is to use the Options trait. The DefaultOptions struct talks about what options are set up by default, and the important part for us is the byte limit.

Technically since PliantDb relies on Fabruic, to truly be preventing it, this issue should also be fixed before closing this one out fully:

For cbor, the situation is more complicated. Here's a devlog describing my experiment of writing my own serialization format. As of this edit (July 13), my thoughts are now that we should:

Adopt pbor. And... rename it for goodness sake. (done, now named Pot)
Add max_size configuration options to pot
Switch wire protocol to Pot (Blocked by khonsulabs/fabruic#28)
Add a configuration option to the database, and use this single setting for bincode and pot. CBOR isn't an attack vector when it's export-only.

Add document deletion

Add Unique Views, where Keys are guaranteed to be unique.

Add caching layer for view mapping status

As views are updated, the returned transaction ID should be able to be cached to allow for RwLock-level blocking (not potentially IO blocking like accessing sled`) when accessing a view that is updated.

Add ability to paginate view queries

The view query API needs to:

be able to return fixed numbers of elements
be able to start at the first element or the last element

Convert git-pre-commit-hook.sh to a Rust executable

While many developers have bash available to them, there's no reason for it to require bash. We should install a rust executable as the pre-commit that does the same commands, but doesn't require bash. This will enable the pre-commit hook to work on Windows.

It appears the xtask repo also has some tools for this exact task.

Add configuration to Server to limit number of open databases.

With the initial implementation of the server (#19), a choice was made to delay this work.

The problems this setting is aiming to solve:

Working within limits of the number of open files
Sled will be keeping an in-memory cache for each open database. We may want to expose this cache size for configuration, but regardless: each open database will eat memory for cache.

The logic implemented should keep track of the last time a database was accessed, and when a database needs to be unloaded, the oldest database should be evicted first.

One question that needs to be answered is, under high load, if a server is configured to only have, let's say 20 databases open, should we allow temporary bursting if a queue of requests comes in that exceeds that? Or should requests block until the limit is exceeded? Maybe two settings -- a target and a maximum?

Add automated ACME certificate generation

Fabruic now supports using a certificate store to authenticate the QUIC connection instead of only using pinned certificates, and once #40 is updated, we can use the same TLS certificate for HTTP as we can for QUIC.

To make deploying easier, having built-in functionality to generate certificates using ACME would be incredibly useful.

Create a command line tool for Local that can dump a database and restore it

Should create a folder structure a la:

export-path\collection-name\document-id.cbor

Ability to cross-convert to JSON or other output formats would be nice too.

This tool should be written in such a way that it can operate without the schema -- for now needing a list of collection IDs.

Reconnecting client doesn't disconnect exisiting subscibers

In discussing some of the PubSub details yesterday, I reminded myself that the Client is "dumb" relative to its knowledge of what existing subscribers are subscribed to. The point of the conversation was pointing out how the API for pubsub gives an Arc<Message>, and I mistakenly thought I handled a cool optimization in the client: if two subscribers subscribe to a single topic, the server would only send the message once.

This isn't true, and at the time, it seemed like just an optimization that could be done. However, in working on the reconnecting logic for #61, I realized that the reconnecting logic for all clients retained the same SubscriberMap. The effect is that if a client disconnect occurs, existing subscribers will never receive an error nor will they receive any messages once a reconnect happens.

This could be fixed by implementing the optimization mentioned in the first paragraph. The client would keep track of all topics for all local subscribers. Upon connecting/reconnect, the client would create a single remote subscriber and subscribe to all of the topics. From the subscriber's perspective, the disconnect would be transparent.

While that sounds amazing for many use cases, it also prevents the ability for a subscriber to know of a disconnect. Another approach would be to clear the subscribers map upon reconnect, thus forcibly disconnecting existing subscribers. The pubsub loops would just need an outer loop to manage recreating the subscriber upon an error.

I don't think these approaches are mutually exclusive, but it might be reasonable to only implement one of these approaches to solve this bug.

Implement protocol version negotiation

Right now, we're not using the QUIC protocol exchange at all. We should have some logic for major protocol version negotiation.

Implement a schema-id system to make storage and transmission more efficent.

Updated after #44.

We now have properly namespaced strings in use everywhere. One issue with using strings for collection IDs is that they're sent across the network. This, at most, should only need 4 bytes to represent if we had a reliable collision-free way to convert these strings to u32s. Originally, the idea was to use a hash. However, there's a more correct way to do it that ensures there will be no collisions:

Create a system in which connected clients cache what IDs are known and what are not, and send the full names as needed to the client, otherwise, send the IDs.
- In server:
  - for requests, names should take an Either<Name,ID>, allowing the client to specify a name when it doesn't have a cached value yet. In the response, the server should track if a client has received a given ID yet, and if not, send the missing mappings before sending the response that uses those IDs.
- In client:
  - a lookup table is established using the values sent from the server. Responses are translated through this lookup table.
  - When sending a request, see if a name has a cached ID, and if so, use it instead of the name.

Add serialization support to View results. Currently, keys and values are still serialized.

Implement background job service

To implement the view system properly, we need to have a background job service. For now, the service can be simple: The ability to launch or check to see if a job with a given ID is running or in queue, and if so, wait for it to complete.

Redo `Client::new` into methods based on whether a pinned certificate is used.

Update fabruic and use the new methods that take Uris when no pinned certificate is passed. Use the existing functionality otherwise.

Add unit test for document update failures

Conflict
Not found
No change (not a failure but still test it)

Begin implementing a Server

Initial server requirements:

Ability to register one or more schemas as available to use.
APIs exposed over QUIC for:
- Add/Delete/List databases
- List available schemas
- Executing all Connection related queries
Creating a database should initialize a new subfolder in the data folder.
Server, for now, will keep all databases open. The use case for Cosmic Verge is a fixed number of databases per server. For large numbers of databases, #27 will address that.
Expose a command line executable that's easy to wrap. Maybe a macro_rules macro to create a main entrypoint for you.
Command line executable needs to have:
- Serve
- Server data folder initialization (cert generation)

Add View updating logic

When a document is updated, add it to the invalidated entries tree.
When an existing document_map entry is found in the mapper, remove the existing entries within the old key.

Want to contribute

Project looks interesting to me. How can I start contributing to a project ? Where do I start from ?

Implement PubSub

Add PubSub module to core.
- Goal of this module is to expose a PubSub API to end-users. Eventually, it will also be used for internal notifications between cluster nodes.
- Core serializable types for topics, payloads
- async topic registrar that keeps track of subscribers and relays payloads when messages are received
- Very thoroughly unit-tested
Expose PubSub methods.
- Needs to support allowing multiple different PubSub clients over a single Connection -- best method will probably be to create a PubSub client from a Connection.
  - For Local, it can interact directly with a shared PubSub registrar on the Storage type.
  - For Server, the registrar will need to namespace topics with the database name, and each Storage instance delegates any PubSub to the Server.
  - To expose to the network, the clients will need to be able to register multiple unique IDs to individual topics corresponding to a specific database. These IDs will be included in the network Request/Response structures.

Add Key-Value storage to backup

Revert hardcoded nightly version

Code coverage started failing recently, and it appears that it's due to an ICE that dates back a little bit, but only recently started cropping up most likely due to rustfmt being broken on nightly for a while: . I tried to narrow it down, but I can't seem to get the ICE to occur outside of the project.

The earliest version of nightly that installed all the default components that doesn't cause an ICE is 2021-03-25.

The commit to revert is: cd668bf

Add ability to get a reduced list back -- skip the rereduce and return the array.

Implement Basic App Platform

The goal of this issue is to create a basic app platform that allows defining restful API endpoints, simple HTTP serving, a custom over-the-wire (WebSocket/PliantDb) API. A developer could use this platform to create an app that uses PliantDb to manage users and permissions, and optionally an HTTP layer with WebSocket support serving requests. The HTTP layer's main purposes are writing RESTful APIs and serving a single-page app that is powered by WASM + PliantDb over WebSockets.

Prevent malicious pubsub clients

After #43, we should have a location to store client-specific information on the server. Once we do, we should track which subscriber ids each client created, and return errors if a client tries to unsubscribe or subscribe on an ID that doesn't belong to the client.

khonsulabs / bonsaidb Goto Github PK

bonsaidb's People

Stargazers

Watchers

Forkers

bonsaidb's Issues

Recommend Projects

Recommend Topics

Recommend Org