
exocore's People

Contributors

appaquet, dependabot-preview[bot], dependabot[bot], github-actions[bot], jgagnon1, mrene


Forkers

mrene scaevola

exocore's Issues

Support for streaming in transport

Right now, the transport only supports message passing. In order to optimize chain bootstrapping, we need to be able to support streaming.

The way it could be implemented is to add an optional stream / sink field in InMessage / OutMessage.

In the libp2p implementation, we need a new ProtocolHandler to support this. We currently use the "OneShot" handler, which works for message passing but not for streaming.
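
As a minimal sketch of the optional stream/sink field idea, assuming hypothetical names (`OutMessage`, `payload`, `stream` are illustrative, not the actual exocore API; a std `mpsc` channel stands in for whatever async stream type the transport would use):

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical outgoing message that can optionally carry a stream of
/// data chunks in addition to its one-shot payload.
pub struct OutMessage {
    pub payload: Vec<u8>,
    /// `None` for plain message passing; `Some` when the transport should
    /// open a streaming substream (e.g. for chain bootstrapping).
    pub stream: Option<Receiver<Vec<u8>>>,
}

impl OutMessage {
    /// Plain one-shot message, as the transport supports today.
    pub fn new(payload: Vec<u8>) -> OutMessage {
        OutMessage { payload, stream: None }
    }

    /// Message with an attached stream; the caller keeps the sender side
    /// and pushes chunks into it.
    pub fn with_stream(payload: Vec<u8>) -> (OutMessage, Sender<Vec<u8>>) {
        let (tx, rx) = channel();
        (OutMessage { payload, stream: Some(rx) }, tx)
    }
}
```

The point of the `Option` is that existing message-passing call sites stay unchanged, while streaming-aware handlers can check for the extra field.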

Pending store operations cleanup

Right now, the pending store is just indefinitely collecting operations, even after they have been committed to the chain store.

In order to clean up, we need to implement logic based on how deep (configurable) an operation now is in the chain, so that we are pretty confident that all nodes have that operation committed.

This logic will happen in the commit manager, since it's the piece that is responsible for merging the pending and chain stores.

Exceptions:

  • If a node has been partitioned for a while and didn't see the blocks being created for an operation, it may try to sync that operation back to the other nodes. This means that we should ALWAYS wait for chain synchronization to happen first, and for the commit manager to clean up the pending store, before sending or receiving pending sync requests. But first, we need to implement #33, since we need to check whether operations are in the chain before allowing pending sync to proceed.
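
The depth rule described above can be sketched as a single predicate (names are hypothetical; heights are block counts from the genesis block):

```rust
/// An operation is safe to drop from the pending store once the block that
/// committed it is buried at least `min_depth` blocks deep in the chain,
/// at which point all nodes can be assumed to have it committed.
pub fn is_cleanable(operation_block_height: u64, chain_height: u64, min_depth: u64) -> bool {
    chain_height >= operation_block_height
        && chain_height - operation_block_height >= min_depth
}
```

The commit manager would run this over pending operations after each chain synchronization pass, with `min_depth` taken from configuration.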

Chain synchronization completion / improvements

We need to:

  • Handle node timeouts
  • Handle local progress vs. other nodes and keep track of leadership
  • If a majority of nodes are out of sync or offline, we are out of sync; reset the leader
  • If we're synchronized, we need to make sure we're still in sync with the leader, or make sure we still follow a good leader
  • Implement exponential back-off for synchronization with nodes
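
The exponential back-off item could look like this minimal sketch (base delay and cap are illustrative values, not decided in the issue):

```rust
use std::time::Duration;

/// Capped exponential back-off for retrying synchronization with a node:
/// 500ms, 1s, 2s, ... capped at 30s.
pub fn sync_backoff(failed_attempts: u32) -> Duration {
    let base_ms: u64 = 500;
    let cap_ms: u64 = 30_000;
    let exp = failed_attempts.min(16); // avoid shift overflow
    let delay = base_ms.saturating_mul(1u64 << exp);
    Duration::from_millis(delay.min(cap_ms))
}
```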

Research on Tantivy use in exomind-index

Questions

  • Can we index our segments like we want?
    • 1:1 data chain segment <-> tantivy segment
    • Since data chain segments are immutable, we won't touch those tantivy segments anymore, in theory
  • How fast is it to re-index the "pending" segment?
    • This segment will be constantly re-indexed
    • Its results will be marked as "pending"
  • Can we do all the queries that we had in Lucene?
    • Complex sorting by date + score + entity type
  • Can we do ranking on multiple objects like in Exomind?
    • What we indexed in Exomind were Traits, not Entities
    • If multiple traits matched, we summed the scores (which is not right, because a thread may get 20-30 emails, which increases its score too fast)
    • Exomind then took the top N results and fetched the entities for those results
  • Can we do paging like we were doing in Exomind?
    • Paging "key" that represented the sorting field's value (ex: timestamp)
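
The value-based paging question can be illustrated without tantivy at all. In this sketch (all names hypothetical), each result carries a `(sort_value, entity_id)` pair sorted descending, and the paging "key" is simply the pair of the last returned result:

```rust
/// Returns the next page of results strictly after `after_key` in
/// descending `(sort_value, entity_id)` order. `sorted` is assumed to
/// already be sorted descending by that pair.
pub fn next_page(
    sorted: &[(u64, u64)],
    after_key: Option<(u64, u64)>,
    page_size: usize,
) -> Vec<(u64, u64)> {
    sorted
        .iter()
        .filter(|r| match after_key {
            // Strictly smaller pair == strictly after the key in desc order.
            Some(key) => **r < key,
            None => true,
        })
        .take(page_size)
        .cloned()
        .collect()
}
```

Because the key encodes the sort value rather than an offset, paging stays stable even when new results are inserted ahead of the cursor.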

TODO

  • Test the segment indexation hypothesis we had (pending and immutable segment)
  • Implement a mock data layer (or a similar API for now, since it's not yet stabilized)

To research: Improve chain directory storage

  • Snappy compression? Is it even worth it, since the data is encrypted?
  • Segments could be opened faster by passing the last known offset
  • Caching of segments' metadata
  • Hash segments & sign the hashes using an in-memory key ==> makes sure that nobody changed the files while we were offline

Implement data engine events stream

This stream will be used by the engine's users to be notified of changes in the different stores. The index layer will use this stream to know when it needs to reindex certain segments. The stream should probably be throttled at the user level depending on the amplification (ex: we don't want to reindex on every change if changes are happening at a high rate).

A handle should also be notified if its stream has been discontinued, since the channel used to back the stream will be bounded to prevent a potential memory blow-up.
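
A minimal sketch of the bounded channel with discontinuity tracking, assuming hypothetical names (`EventStream`, a `u64` standing in for the real event type):

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Events are pushed through a bounded channel; if the consumer falls
/// behind and the channel is full, the event is dropped and the handle is
/// marked discontinued so it knows it must resynchronize from the stores.
pub struct EventStream {
    sender: SyncSender<u64>,
    pub discontinued: bool,
}

impl EventStream {
    pub fn new(capacity: usize) -> (EventStream, Receiver<u64>) {
        let (tx, rx) = sync_channel(capacity);
        (EventStream { sender: tx, discontinued: false }, rx)
    }

    pub fn push(&mut self, event: u64) {
        // try_send never blocks: a full channel marks the stream as
        // discontinued instead of growing memory unboundedly.
        if let Err(TrySendError::Full(_)) = self.sender.try_send(event) {
            self.discontinued = true;
        }
    }
}
```

A slow consumer that sees `discontinued` set would then fall back to a full reindex rather than trusting the (now incomplete) event sequence.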

Make chain thread safe and segments referenceable

TODO:

  • Split DirectorySegment into an immutable and mutable version
  • Mutable version is under a RwLock
  • Immutable version is under an Arc so that we can reference it

To think:

  • We'll need to be able to truncate if the chain has diverged. This means that we may need to truncate the underlying file of an immutable segment and end up with a dangling segment. We need to panic if the Arc is still referenced somewhere while we truncate (we could have a global lock for when this happens & a grace period)
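
The proposed split can be sketched as follows (type and field names are illustrative, not the actual DirectorySegment API):

```rust
use std::sync::{Arc, RwLock};

/// Closed segments: immutable, shared via `Arc` so readers can hold
/// references without locking.
pub struct ImmutableSegment {
    pub data: Vec<u8>,
}

/// The single open segment, written under a `RwLock`.
pub struct MutableSegment {
    pub data: Vec<u8>,
}

pub struct Chain {
    pub immutable: Vec<Arc<ImmutableSegment>>,
    pub mutable: RwLock<MutableSegment>,
}

impl Chain {
    /// Freeze the current mutable segment into an immutable one and start
    /// a fresh mutable segment.
    pub fn freeze(&mut self) {
        let old = std::mem::replace(
            self.mutable.get_mut().unwrap(),
            MutableSegment { data: Vec::new() },
        );
        self.immutable.push(Arc::new(ImmutableSegment { data: old.data }));
    }
}
```

On truncation after a divergence, `Arc::strong_count` on the affected segments could back the "panic if still referenced" check mentioned above.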

Chain synchronizer should maintain proper synchronization status

As mentioned in #41, the commit manager and the pending synchronizer need to rely on the chain synchronization status to know if they can proceed with their operations. This is crucial to prevent old operations from reviving if a node had been partitioned for a while.

This issue is about implementing the logic to detect when the local chain status is deemed uncertain because we're partitioned, we've diverged, we're lagging behind other nodes, etc.

Implement pending storage persistence

Right now, the pending store is kept in memory. In order to increase its resilience in case of a crash, we need to persist it.

The ideal persistence would be a WAL, since the pending store is continuously being appended to and cleaned up. We could directly mmap the operations' data from it instead of storing it in memory.

Operations would still be in memory, but not their frame data.

TODO:

  • Find a WAL crate that we can reference bytes from
  • Think of a way to clean up old segments of the log

Make chain indexed by operation ID

Depends on #34

Notes:

  • Use extindex
  • This is needed to prevent an old operation from being re-added to the chain if a node still had an old pending store operation.
  • This will require splitting the current live segment from the immutable segments.
  • Use the JSON simple storage to save the available indices

Questions

  • Should it be separated from the chain?

Operation in a block should be binary searched

The commit manager stores operations sorted by operation id. In Block's get_operation method, we should do a binary lookup within the block instead of iterating through all operations to find one.
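
A minimal sketch of the binary lookup, assuming operations in a block are stored sorted by operation id (struct and field names are illustrative):

```rust
pub struct BlockOperation {
    pub operation_id: u64,
    pub data: Vec<u8>,
}

/// O(log n) lookup instead of the current O(n) iteration; relies on `ops`
/// being sorted ascending by `operation_id`.
pub fn get_operation(ops: &[BlockOperation], operation_id: u64) -> Option<&BlockOperation> {
    ops.binary_search_by_key(&operation_id, |op| op.operation_id)
        .ok()
        .map(|idx| &ops[idx])
}
```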

Implement a better block proposal selection logic

We need proper logic to propose blocks to be added to the chain so that we minimize the odds of having multiple blocks generated at the same time by multiple nodes, which would result in splitting the consensus.

We also need logic to select the best block to be signed when multiple blocks are proposed.

We also need logic to make proposed blocks time out if no consensus has happened after a while because multiple blocks are splitting the consensus.

We also need logic to detect stale commits and propose another block (time based?)

A block should also have a limited number of operations.
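
One possible shape for the "best block" selection, sketched with hypothetical names (the actual selection criteria are still to be decided): prefer the proposal with the most signatures, tie-breaking on the lowest proposer node id so that all nodes converge on the same choice.

```rust
pub struct Proposal {
    pub proposer_node_id: u64,
    pub signatures: usize,
}

/// Deterministic selection among competing proposals: most signatures
/// wins; on a tie, the lowest proposer node id wins.
pub fn select_best(proposals: &[Proposal]) -> Option<&Proposal> {
    proposals.iter().max_by(|a, b| {
        a.signatures
            .cmp(&b.signatures)
            // Reversed id comparison so a *lower* id ranks higher.
            .then(b.proposer_node_id.cmp(&a.proposer_node_id))
    })
}
```

Determinism is the key property here: any rule works as long as every node, given the same set of proposals, picks the same block.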

Convert remaining enum Error to Failure's derived errors

We should use the failure crate's Error type for better backtrace tracking and to enforce proper display messages.

See implementation in the framed.rs and the pending store Error for examples.

To convert:
Engine, Transport, Chain, extindex, extsort

Make sure we have #[fail(cause)] where we need it

Move Operation and Block to own packages

TODO

  • PendingOperation should be a trait like Block is in chain, and should be named Operation
  • Engine should have an EnginePendingOperation impl, and Block should have the same
  • Extract Blocks related code from chain/mod.rs to its own module
  • Extract Operations related code from pending/mod.rs
  • Create impl for EngineOperation, PendingOperation

Create consistent clock to generate monotonically / unique time and IDs

The first place it will be used is operation ID generation. IDs need to be unique in the cluster and monotonically increasing. The clock doesn't need to be super precise, but precise enough to prevent blocking in the commit manager and to prevent ID collisions.

Time should be:

  • Millisecond-precision timestamp (47 bits, up to year 6427)
  • Unique node identifier (9 bits, 512 nodes)
  • Sub-millisecond operation counter to prevent collisions (8 bits, 256 per millisecond... up to 256k ops/s per node)
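
The 47 + 9 + 8 = 64-bit layout above packs directly into a `u64`; a sketch (function names hypothetical):

```rust
/// Pack (timestamp_ms, node_id, counter) into a single 64-bit operation
/// id: 47-bit millisecond timestamp | 9-bit node id | 8-bit counter.
/// Because the timestamp occupies the high bits, ids sort by time.
pub fn pack_operation_id(timestamp_ms: u64, node_id: u16, counter: u8) -> u64 {
    assert!(timestamp_ms < (1 << 47));
    assert!(node_id < 512);
    (timestamp_ms << 17) | ((node_id as u64) << 8) | counter as u64
}

pub fn unpack_operation_id(id: u64) -> (u64, u16, u8) {
    (id >> 17, ((id >> 8) & 0x1FF) as u16, (id & 0xFF) as u8)
}
```

Uniqueness follows from the node id bits, and monotonicity per node follows from the timestamp plus the per-millisecond counter.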

Convert tests to return failure::Error

Prevent having to use unwrap() everywhere.
The failure crate has a catch-all Error type that should be returned by all tests. The only downside is that we need to add Ok(()) at the end of each test.

Implement message signature logic

This is needed to sign frames, blocks, messages, etc.
Framing supports signatures, but it only hashes the message using SHA3-256 right now.

Possible signature source:

  • secp256k1 to sign the 32-byte SHA3-256 hash. This will result in a 64-byte compact signature.

Problems:

  • If we sign every single operation, block and message, this may result in a bloated chain and a performance hit. Think about how we could have a hybrid approach: blocks are important to sign, but messages a bit less.

Research & implement schema definition

Notes

  • Probably same structure as Exomind to reuse web & ios clients, but could be under a different format (json? yaml?)
  • Implement traits + entities structs

Create a globally referenceable "Cell" struct

  • A Cell should contain:

    • Nodes of the cell
    • The local node ref
    • Information to generate consistent clock (see #6)
    • Signature & encryption related data
  • NodeID & CellID should be structs and should be renamed to NodeId and CellId (Id, not ID)
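
A minimal sketch of such a Cell struct under the above requirements (all field types are illustrative assumptions, not the actual exocore definitions):

```rust
/// Newtype ids per the naming convention above (Id, not ID).
#[derive(Clone, Debug, PartialEq)]
pub struct NodeId(pub String);

#[derive(Clone, Debug, PartialEq)]
pub struct CellId(pub String);

pub struct Node {
    pub id: NodeId,
}

pub struct Cell {
    pub id: CellId,
    /// All nodes of the cell, including the local one.
    pub nodes: Vec<Node>,
    /// Index into `nodes` for the local node ref.
    pub local_node: usize,
    /// Node identifier used by the consistent clock (see #6).
    pub clock_node_id: u16,
    /// Placeholder for signature & encryption related material.
    pub keys: Vec<u8>,
}
```

In practice the struct would likely be wrapped in an `Arc` to make it globally referenceable across the engine, transport and index layers.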
