hldb / welo

peer-to-peer, collaborative states using Merkle-CRDTs

License: Other

TypeScript 100.00%

ipfs ipld merkle-crdt p2p peer-to-peer

welo's Introduction

HLDB

A peer-to-peer database protocol

Summary

HLDB can be used to build local-first applications. It is best suited for social/collaborative applications that do not require consensus.

Each peer has their own copy of the database, called a replica. The peer's local replica is used as the source of truth. Updated remote replicas are merged with the local replica to see the new state.

In this way, the applications are edge-computed by the participating peers. Applications designed this way give users more control, with the potential to make large-scale database breaches a thing of the past.

Encryption?

There is no encryption built into the protocol yet.

Access Control

Currently only write access can be controlled, and it cannot be updated for now. Access is controlled and enforced by correct peers on their own replicas.

Papers

At the core of the database replica is a Merkle-CRDT. This type of CRDT satisfies Byzantine eventual consistency (BEC). This property ensures strong eventual consistency (SEC) and that any number of faulty replicas cannot affect correct ones.

The foundation of the protocol is built on these two papers:

Specification

The protocol specification can be found in hldb/specs.

Implementations

Name	Language
welo	typescript

welo's Issues

Test script runs Tests inside Browsers

Add scripts to run tests in the browser (multiple browsers preferably)
Only install test browser binaries before running browser tests

add live replicator tests

generate docs with typedoc

Use typedoc to generate api documentation.

The default setting renders html, I would prefer markdown added to /docs.

shared channel

Add implementation of Live Replicators Shared Channel from the Live Replicator Version 1 draft spec.

Prereq:

#23

add benchmark for state calculation

Replace Keychain with libp2p keychain api

Allows removing the KeyChain import hack 🎉

Refactor to use Libp2p directly

Have been avoiding using the libp2p provided by ipfs because it is not available when using ipfs via the ipfs-http-client. Recently have cared less about this use-case.

The two pieces to be refactored to use libp2p directly are:

The peer monitor will benefit from being able to listen for updates instead of polling.

Replacing the Keychain with libp2p.keychain will nullify headaches around the KeyChain not being exported in the first place.

Further in the future, will be able to use libp2p protocol handlers to build the BEC push replicator.

design and spec of live replicator

Spec:

hldb/specs#3

design overview:

join libp2p pubsub channel for finding live replicator peers for the database
- on peer join attempt to join a one-to-one purpose pubsub channel with them
- on peer join one-to-one purpose channel
on received heads advertisement
- if heads advert has unknown CIDs
  - request CIDs and traverse valid entries
- if heads advert CIDs are all known
  - implementation may choose to advertise local heads or not on that one-to-one channel
on change to replica heads:
- advertise to all open one-to-one pubsub channels

not sure if one-to-one channels should be used by multiple databases.
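
The "on received heads advertisement" branch of the overview above can be sketched roughly as follows. This is only an illustration, not the spec or welo's code: CIDs are modelled as plain strings, and `HeadsAdvert`, `AdvertAction`, `onHeadsAdvert` and the `replyWhenKnown` flag are hypothetical names.

```typescript
// Hypothetical sketch: CIDs are plain strings here, not real CID objects.
type Cid = string

interface HeadsAdvert {
  database: string
  heads: Cid[]
}

type AdvertAction =
  | { type: 'request', cids: Cid[] } // fetch unknown CIDs, then traverse valid entries
  | { type: 'advertise-local' } // optionally answer with local heads
  | { type: 'ignore' }

// Decide what to do with a heads advertisement received on a one-to-one channel.
function onHeadsAdvert (
  advert: HeadsAdvert,
  known: ReadonlySet<Cid>,
  replyWhenKnown: boolean = false // assumption: left to the implementation
): AdvertAction {
  const missing = advert.heads.filter((cid) => !known.has(cid))
  if (missing.length > 0) {
    return { type: 'request', cids: missing }
  }
  return replyWhenKnown ? { type: 'advertise-local' } : { type: 'ignore' }
}
```

Keeping the decision as a pure function makes it testable without a pubsub network; the caller would perform the actual requests and advertisements.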

Blog basic usage tutorial

direct channel

Add implementation of Live Replicators Direct Channel from the Live Replicator Version 1 draft spec.

Prereq:

#26

Improve Tests

Separate unit and integration tests
Export test utils

Opal class makes ipfs.libp2p directly available

The IPFS api type says ipfs.libp2p is undefined so will need to check for libp2p, throw if actually undefined, then directly expose libp2p from opal with the correct type.

add benchmark for replication

Automated API docs

Add a script to automatically build API documentation.

Browser Support

Browser support has not been added yet.
It should be straightforward.
Not many Node APIs are in use currently.
Most of the work will be running tests in the browser and adding them to CI.

add live replicator (Libp2p pubsub + IPFS)

Add implementation of #17 to Opal replicators. Also requires building out the replicator handler.

add benchmark for writes

chore: cleanup scripts, filenames, types

Focus here is on cleaning up the package scripts, file structure and names, their defined types, and setting up linting again.

A lot of the scripts were copied from ipjs when the plan was to write everything in js with types in jsdoc. This has not worked, especially after the move to typescript.

remove unused package scripts
clean up src and test
- refine the types
- make type interfaces for manifest components
- ~~rename manifest component files to match their component.type names~~
- fix linter errors

add live replicator module (untested)

Add implementation of the Live Replicator protocol.

Status - Oct 2022

This month is focused on adding a live replicator, benchmarks, and the first published release (alpha).

Still need to finish local persistence for database from last month. Will try to finish before Monday.
I anticipate this area to be reworked a few times before finding a good, general solution for databases.

Going to try to have a replication demo by the 10th.

From last month:

#12

This month:

🎃

Add starter FAQ.md document

Status - Nov 2022

November focus is on building a Zzzync replicator, planned Opal tasks are a bit sparse.

IPFS Camp 22 was at the end of last month and has set back development a bit. There are a few things to catch up on in November. Most things are small and can be done together in a day. Other things are larger but mostly finished, like the draft spec. The largest thing to catch up on will be the live replicator.

Tracklist:

🥧

Usage References

More usage references are necessary so people who want to can test out the project easily.

add pubsub peer monitor

The Shared Channel for the Live Replicator needs to be able to see peers joining and leaving. Since this is not provided by the libp2p API directly, pubsub.getSubscribers will need to be polled.
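
A minimal sketch of such a polling monitor, assuming the subscriber list is provided by a callback (in practice something wrapping `pubsub.getSubscribers` for the topic). `PollingPeerMonitor`, `diffPeers` and the one second default interval are hypothetical, and peer ids are modelled as strings.

```typescript
type PeerId = string

interface PeerChanges {
  joined: PeerId[]
  left: PeerId[]
}

// Compare two snapshots of the subscriber list.
function diffPeers (previous: ReadonlySet<PeerId>, current: ReadonlySet<PeerId>): PeerChanges {
  return {
    joined: Array.from(current).filter((peer) => !previous.has(peer)),
    left: Array.from(previous).filter((peer) => !current.has(peer))
  }
}

class PollingPeerMonitor {
  private peers: Set<PeerId> = new Set<PeerId>()
  private timer: ReturnType<typeof setInterval> | undefined
  private readonly getSubscribers: () => PeerId[]
  private readonly onChange: (changes: PeerChanges) => void
  private readonly intervalMs: number

  constructor (
    getSubscribers: () => PeerId[], // assumption: wraps pubsub.getSubscribers(topic)
    onChange: (changes: PeerChanges) => void,
    intervalMs: number = 1000
  ) {
    this.getSubscribers = getSubscribers
    this.onChange = onChange
    this.intervalMs = intervalMs
  }

  // Take one snapshot and report joins/leaves since the previous one.
  poll (): void {
    const current = new Set<PeerId>(this.getSubscribers())
    const changes = diffPeers(this.peers, current)
    this.peers = current
    if (changes.joined.length > 0 || changes.left.length > 0) {
      this.onChange(changes)
    }
  }

  start (): void {
    if (this.timer === undefined) {
      this.timer = setInterval(() => { this.poll() }, this.intervalMs)
    }
  }

  stop (): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer)
      this.timer = undefined
    }
  }
}
```

Peers that join and leave between two polls are missed, which is the usual trade-off of polling compared with listening for events.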

Blocks abstraction from ipfs.block api

Will be creating an abstraction for working with ipfs.block api. This api sounds like it would take and return Block instances but instead it takes byte arrays. The abstraction is needed to simplify specifying the cid version and byte encoding used so that the CID returned is correct. It will also make encoding and decoding easier by providing default codec and hasher options.

Status - Dec 2022

December focus is on heavy testing, perf and reliability, and then documentation and a beta release.

Tracklist:

🎁

add benchmark for traversal

Ease keeping version parity between IPFS/Libp2p packages

Sometimes there are version -> interface mismatches between installed ipfs and libp2p packages. Would be great to make keeping this parity easier.

Cleanup testing utilities

Right now the tests are kind of thrown together and pull in and use things like IPFS and Identity in a very messy way.

Plans:

provide preset IPFS configs
identity fixtures
storage/keychain fixtures
remove ipfs repo data, no reason to commit it.
remove identities/keychain saved data after testing

Design use of Storage Abstraction

When creating/opening a database a storage api will be provided. The api can be used by the database and its components to read/write persisted data. Each component will have control over its own datastore.

Benchmark Welo

It's important to write benchmarks to understand the capability of the system and to track the effect of changes.

Will add benchmarks with benchmark.js and track them with github-action-benchmark.

add benchmark for reads

NodeJS examples for read/write/replication

CI Github Action workflow runs Browser Tests

Keyvalue updates read and write to storage

Now that the replica/graph state is kept on disk, the state of the database's replica is available immediately. The same needs to be done for the keyvalue store's index. When completed, the index should be able to be queried immediately, as opposed to loading and processing the entries again.

Behavior:

read the store index on start without loading entries into memory
compute new index and write root hash to storage (not doing perf improvements yet)

rewrite manifest module registry

The registry contains components. The components are referenced by a key in a database manifest. If the components referenced by the manifest are registered, the database can be opened and SEC is ensured.

The plan is to use something like protocol ids for registration and resolution so that components can be upgraded since versioning is included. Also the prefix will make sure that components of different categories are not mixed up, e.g. a store component is not registered as an access component.
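
One way this could look, as a rough sketch only: the `/hldb/<category>/<name>/<version>` id layout, the `Registry` class and the category list are assumptions made for illustration, not the planned API.

```typescript
// Hypothetical categories; only "store" and "access" are named in the text above.
type Category = 'store' | 'access'

interface Component {
  protocol: string // e.g. '/hldb/store/keyvalue/1.0.0' (assumed layout)
}

class Registry {
  private readonly components = new Map<string, Component>()

  private static prefix (category: Category): string {
    return `/hldb/${category}/`
  }

  // Register a component under its protocol id, rejecting category mix-ups.
  register (category: Category, component: Component): void {
    if (!component.protocol.startsWith(Registry.prefix(category))) {
      throw new Error(`${component.protocol} is not a ${category} component`)
    }
    this.components.set(component.protocol, component)
  }

  // Resolve the protocol id found in a manifest to a registered component.
  resolve (category: Category, protocol: string): Component {
    if (!protocol.startsWith(Registry.prefix(category))) {
      throw new Error(`${protocol} is not a ${category} component`)
    }
    const component = this.components.get(protocol)
    if (component === undefined) {
      throw new Error(`${protocol} is not registered`)
    }
    return component
  }
}
```

Because the version is part of the id, two versions of the same component can be registered side by side and a manifest keeps resolving to the one it was created with.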

Browser tests

Set up testing for the browser

locally persisted databases

Add local persistence for databases. Allows for reopening and saving progress of databases on a machine.

Base Feature Set

Tracking base feature set for Opal's 1.0-beta release in December 2022.

Locally persisted Database replicas
Easy Custom Database States
Pubsub Heads Exchange Replicator
#65
~~IPLD Schema Validation (if ready in javascript)~~
Automated Release: ci, change log, and api docs

Opal Milestones:

Peer Monitor uses Libp2p api

Use Libp2p Pubsub api events to listen for peer joins.

Status - Sept 2022

Many of the project files have just been committed. Almost every source file has a unit test for it.

Bi-directional iterative traversal of the merkle-dag has been implemented and tested thoroughly (although there is always more room for testing; especially crucial components like this). This is done by keeping all known entry cids inside of an adjacency list with incoming and outgoing links being tracked. Source files with respective unit tests:

graph -> test
traversal -> test
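
As a rough illustration of the adjacency list idea (not the actual graph module), the sketch below tracks outgoing and incoming links per entry so the DAG can be walked in both directions. `Graph`, `Links` and the string CIDs are hypothetical.

```typescript
type Cid = string

interface Links {
  out: Set<Cid> // entries this entry links to
  in: Set<Cid> // entries that link to this entry (reverse resolution)
}

class Graph {
  private readonly nodes = new Map<Cid, Links>()

  private links (cid: Cid): Links {
    let links = this.nodes.get(cid)
    if (links === undefined) {
      links = { out: new Set<Cid>(), in: new Set<Cid>() }
      this.nodes.set(cid, links)
    }
    return links
  }

  // Record an entry and the CIDs it links to, updating both directions.
  add (cid: Cid, next: Cid[]): void {
    const links = this.links(cid)
    for (const target of next) {
      links.out.add(target)
      this.links(target).in.add(cid)
    }
  }

  outgoing (cid: Cid): Cid[] {
    const links = this.nodes.get(cid)
    return links === undefined ? [] : Array.from(links.out)
  }

  incoming (cid: Cid): Cid[] {
    const links = this.nodes.get(cid)
    return links === undefined ? [] : Array.from(links.in)
  }

  // Heads are the entries that no other known entry links to.
  heads (): Cid[] {
    const heads: Cid[] = []
    this.nodes.forEach((links, cid) => {
      if (links.in.size === 0) heads.push(cid)
    })
    return heads
  }
}
```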

Two key features still need to be added: persistence and replication. This means that the utility of Opal is currently limited to mutating local states in memory. These two features are key and are the priority this month.

On adding persistence, the goal will be to also allow for opening databases in O(1) time; this includes the time before being able to edit the database. To do this, the states of a few different components will need to be persisted, specifically:

index: the reduced state to allow for immediate reading of the current state of the database.
graph: an adjacency list which includes reverse resolutions for all links in the merkle-dag.

Unfortunately, this kind of usability will rely on more statefulness, at least as I understand the problem now. What is nice is that these states may be able to be verified as correct/up-to-date, potentially by reference hashes. In that case the components generating and editing those states can be sure that everything is working.

Focus this month is building a solid project foundation:

live replicator message 1.0.0

Add implementation of Live Replicator Message from the Live Replicator Version 1 draft spec.

prep for adding features

Rework and polish some things before adding features like local database persistence.

Graph updates read and write to disk

Using https://github.com/rvagg/js-ipld-hashmap it is possible to create a persistent graph for the replica. With a persistent graph, databases can be opened immediately without needing to load data into memory.

refactors and cleanup

Replace src/storage with a util that just makes handling Datastores easier (does the same thing, but as a util instead of a class).
Swap out node's EventEmitter with the standardized EventTarget
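
For the EventTarget swap, a minimal sketch of the pattern using the standard `EventTarget` and `Event` globals (available in browsers and recent Node versions). The `Replica` and `HeadsUpdate` names and the `update` event type are hypothetical.

```typescript
// A typed event carrying the new heads (hypothetical payload).
class HeadsUpdate extends Event {
  readonly heads: string[]

  constructor (heads: string[]) {
    super('update')
    this.heads = heads
  }
}

// Extending EventTarget replaces extending node's EventEmitter.
class Replica extends EventTarget {
  setHeads (heads: string[]): void {
    // dispatchEvent calls listeners synchronously, like emitter.emit
    this.dispatchEvent(new HeadsUpdate(heads))
  }
}

const replica = new Replica()
const received: string[][] = []

// addEventListener replaces emitter.on; listeners receive an Event object
replica.addEventListener('update', (event) => {
  if (event instanceof HeadsUpdate) {
    received.push(event.heads)
  }
})

replica.setHeads(['cid-1'])
```

The main difference from EventEmitter is that listeners receive a single Event object instead of arbitrary arguments, so any payload has to live on an Event subclass or on `CustomEvent.detail`.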

style: get linting working again

As a result of #3 linting has been broken.

Will look to find a solution that works with prettier

design and add replicator handler logic and api

A Replicator is a mod like Storage and Keychain. A Replicator will be handed to every database. Databases will start and stop the replicator instance.

code-coverage with c8

Use c8 and improve code coverage in tests

move to Typescript

All cypsela projects from now on will probably use typescript. It's becoming widely adopted and improves the process of writing javascript. Maybe types in javascript will be standardized in the future.

Iterative and Concurrent Traversal

The traverser needs to be turned into an async iterator.
The reason this hasn't been done yet is mainly time and difficulty.
The traverser needs to be able to traverse efficiently for ordered and unordered traversal
so there is no duplicate code.
Fetching and caching entries ahead of when they are needed would be nice.

IIRC the difficulty is doing this with the yield statement.
May need to make a custom AsyncGenerator

Traverser Function returns Async-Iterator
Replica.traverse returns an Async-Iterator
Store API consumes Async-Iterator
IPFS/Pubsub Heads Exchange Replicator consumes Async-Iterator
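
A rough sketch of what an async-iterator traverser could look like, using an async generator for a breadth-first, unordered traversal that loads each level concurrently. It is not the project's traverser: `Entry`, `Load` and `traverse` are hypothetical, CIDs are strings, and ordered traversal and prefetching are left out.

```typescript
type Cid = string

interface Entry {
  cid: Cid
  next: Cid[] // links to earlier entries
}

// Assumption: some function that loads an entry by CID, e.g. from IPFS.
type Load = (cid: Cid) => Promise<Entry>

async function * traverse (heads: Cid[], load: Load): AsyncGenerator<Entry> {
  const seen = new Set<Cid>(heads)
  let frontier: Cid[] = Array.from(seen)

  while (frontier.length > 0) {
    // Load the whole frontier concurrently before yielding any of it.
    const entries = await Promise.all(frontier.map(async (cid) => await load(cid)))
    const nextFrontier: Cid[] = []

    for (const entry of entries) {
      yield entry
      for (const cid of entry.next) {
        if (!seen.has(cid)) {
          seen.add(cid)
          nextFrontier.push(cid)
        }
      }
    }

    frontier = nextFrontier
  }
}
```

A consumer such as a store or replicator would then use `for await (const entry of traverse(heads, load))` and can stop early by breaking out of the loop.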
