
hub-monorepo's Introduction

Hubble Monorepo

This monorepo contains Hubble, an official Farcaster Hub implementation, and other packages used to communicate with Hubble.

Getting Started

  1. To run Hubble, see the Hubble docs.
  2. To use Hubble, see the hub-nodejs docs.
  3. To use the HTTP API to read Hubble data, see the HTTP API docs.

Packages

Package Name          | Description
--------------------- | -----------------------------------
@farcaster/hubble     | A Farcaster Hub implementation
@farcaster/hub-nodejs | A Node.js client library for Hubble
@farcaster/hub-web    | A browser client library for Hubble
@farcaster/core       | Shared code between all packages

Contributing

Please see CONTRIBUTING.md

hub-monorepo's People

Contributors

aditiharini, adityapk00, akshaan, alvesjtiago, blankerl, blinkystitt, cassonmars, cmlad, davidfurlong, dawsbot, deodad, dependabot[bot], horsefacts, kcchu, manan19, pangzhi, pfletcherhill, philcockfield, rohitjha941, saez0pub, sagar-a16z, sanjayprabhu, sds, stephancill, theldb, varunsrin, vinliao, vrypan, wazzymandias, zachterrell57


hub-monorepo's Issues

SignedMessages should have a network identifier

Each signed message should contain an identifier to indicate which network it was intended for. Roughly, we might have a mainnet, testnet and devnet though these names can change and more may be added later.

Update hashing to Blake2b, signing to Ed25519

The new specification requires Blake2b for hashing and Ed25519 for signing, while we currently use Keccak and ECDSA. The goal of this ticket is to:

  • Update the hashing and signing schemes in the playground
  • If any special configurations are required for the scheme, specify them in the protocol

v2.1 Refactor

Refactor the current implementation to match the new v2.1 design doc.

Track follows in an LWW set

What is the feature you would like to implement?
Hubs should track "follows" which is an action that users should be able to perform

Why is this feature important?
We need this in v2 so that all applications can create and read follow relationships established in other applications.

Will the protocol spec need to be updated?
Section 4.2 will need to be updated with the specifics of the set and reconciliation

How should this feature be built? (optional)
TBD

feat: one custody address per fid

What is the feature you would like to implement?

  • Implement signer set data structure that is consistent with ID Registry (only one active custody address)
  • Update SignerAdd message to be one-directional (custody address authorizes delegate, delegate does not need to authorize custody address)
  • Delete CustodyRemoveAll message type
  • Implement "optimistic signer set" and/or "signer garbage collection" from this doc

Why is this feature important?

  • Hub's signer set should be consistent with ID Registry
  • Having multiple legitimate custody addresses complicates the signer set and is hard to explain

Handle merge conflict with identical roots

When two roots are received from the same user pointing to the same ethereum block in the rootBlock value, the design doc says that we should keep the root with the lowest lexicographical order. This method was chosen because it allows a deterministic way to resolve conflicts that can be performed independently by each node.

A good solution would:

  1. Choose an algorithm for lexicographical ordering of hash values
  2. Implement the logic to order it in the addRoot function of the engine
  3. Write tests to validate the behavior.

The important part here is that we have defined lexicographical in a formal way such that anyone implementing this in another language knows exactly how the comparison function should work, and even knows what tests to run against such an algorithm to ensure that it is working correctly.
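As a starting point for that formal definition, here is a minimal sketch of byte-wise lexicographical comparison over equal-format hex hash strings; the function names are illustrative, not from the codebase:

```typescript
// Sketch: lexicographical ordering of hash values, assuming lowercase hex
// strings of the same format. ASCII comparison of hex characters is
// equivalent to byte-wise comparison of the underlying digests.
const compareHashes = (a: string, b: string): number => {
  if (a.length !== b.length) return a.length < b.length ? -1 : 1; // shorter sorts first
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
};

// Resolving a root conflict: the lowest-ordered root wins deterministically.
const winningRoot = (roots: string[]): string => [...roots].sort(compareHashes)[0];
```

Because the comparison depends only on the hash bytes, any implementation in any language that compares digests byte-by-byte will pick the same winner.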

Update signing and message validation to support personal_sign for custody signers

Custody signers are Ethereum addresses, so we should treat them as Ethereum wallets that sign messages using EIP 191 version 0x45 rather than raw secp256k1 private keys.

This change includes:

  • Add enum types for the signature algorithms
  • Add signatureType to Message type
  • Set signatureType whenever we create a message, using the new enum type
  • Update factories to set signatureType
  • Add signatureType for EIP 191 personal_sign
  • Whenever a custody address signs a message, use ethers signMessage to be compatible with other wallets
  • Update validateMessage method to use ethers verifyMessage when signatureType is eip-191-0x45 (or whatever we want to call it)
  • Update externalSignatureType for verifications to be consistent with whatever signatureType label we use
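For reference, this is a sketch of the EIP-191 version 0x45 envelope that ethers' signMessage/verifyMessage apply before hashing and signing; the function name is illustrative:

```typescript
// Sketch: the EIP-191 version 0x45 ("personal_sign") envelope. ethers'
// signMessage keccak-hashes exactly this byte sequence before signing,
// which is why verifyMessage can recover the signer address.
const eip191Envelope = (message: string): Buffer => {
  const body = Buffer.from(message, "utf8");
  // Prefix: 0x19 byte, the literal string, a newline, then the byte length.
  const prefix = Buffer.from(`\x19Ethereum Signed Message:\n${body.length}`, "utf8");
  return Buffer.concat([prefix, body]);
};
```

Validating with ethers' verifyMessage (rather than raw secp256k1 recovery) keeps us compatible with any wallet that implements personal_sign.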

bug: consistency issues with messages in cast threads

Problem

Take a cast thread that has 4 layers: OP, child, gchild, ggchild. Each child cast is linked to its parent which eventually connects back to the OP. This creates a couple of problems:

  1. If the child cast goes missing from the network then the gchild and the ggchild are left in a dangling state and can never be associated back to the thread.

  2. If the child was written by bob, and the node has chosen not to subscribe to bob, it could not construct the thread without fetching bob's child message from his server.

Ideally, clients would always be able to reconstruct the thread and show missing messages accordingly.

Proposed Solution

We introduce a new cast sub type (reply cast) that must contain:

  1. A URI reference to the parent (present today)
  2. A URI reference to the thread OP
  3. A generation clock which increments with every generation (child = 1, gchild = 2 etc.)

Nodes currently merge a thread sequentially from OP to ggchild; with this change they can do it in parallel after ingesting the OP. They can also skip messages from users they don't subscribe to. Our data storage requirements go up 48 bytes per cast and we have a few more checks in our merge set.
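The proposed reply-cast shape could be sketched as follows; the field names are illustrative, not final protocol names:

```typescript
// Sketch of the proposed reply-cast fields (names are illustrative).
interface ReplyCast {
  parentUri: string;  // 1. URI reference to the direct parent (present today)
  threadUri: string;  // 2. URI reference to the thread OP (new)
  generation: number; // 3. generation clock: child = 1, gchild = 2, ... (new)
}

// After ingesting the OP, a node can merge replies in any order and still
// reconstruct thread depth; sorting by generation restores the layering.
const byGeneration = (casts: ReplyCast[]): ReplyCast[] =>
  [...casts].sort((a, b) => a.generation - b.generation);
```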

Counter Proposal A

The alternative is to stick with the current approach, where the gchild cannot be provided if the child is missing. Messages in a thread will always propagate in a consistent state or not at all, at the expense of making the sync process less parallelizable within threads. This solves (1) and possibly even reduces the occurrence of inconsistency issues, which would be helpful for clients.

It does not solve (2) since bob's message would never be ingested. In this case, the node sync would have to specially fetch bob's message using the URI provided in order to construct the thread. It does not need to retain the whole message once verified, just enough to construct the thread.

The main argument against this approach is that it forces us to choose between no sharding (i.e. you can't ignore bob), which is undesirable for scaling, and having dependencies on messages from users nodes don't care about, which can create more choke points when syncing data.

feat: GraphQL API

What is the feature you would like to implement?
Update existing query API for engine and add GraphQL API

Why is this feature important?
The engine right now exposes arbitrary lookup methods for retrieving data from sets. These should be standardized, updated to query data across sets, and also be compatible with GraphQL.

Will the protocol spec need to be updated?
We should either update the protocol spec to include the query API spec or add documentation to the hub repo.

Additional context
This notion doc has more context: https://www.notion.so/Hub-Client-API-s-1e0ce692d08c40fbb234c6ec4b5a88aa

Discussion: Network of Hubs with a Simple Sync Mechanism

This is a high level overview of how a network of Hubs is expected to work with a Simple Sync mechanism.

Simple Sync here refers to #106, i.e. a bulk download of all data from another Hub.

Anatomy of a Hub

A Hub is a collection of services that together perform the work of synchronizing and maintaining Farcaster data.
For now, the services we're interested in are:

  1. Engine: Hubs will have an Engine that can process and store FC messages
  2. JSON RPC Server: An RPC server that can share data about the Hub or what's stored in its Engine
  3. Libp2p Node: A Libp2p node that sends and receives Messages over the Gossipsub network. Messages received from a client are sent to the network, while Messages received from the network are replayed into the Engine.

Network Protocols and Transports

A quick overview of the network protocols available and what they'll be used for.

Libp2p & GossipSub

Hubs will use Libp2p to establish a pubsub mesh between each other.
This becomes the primary mechanism through which all Messages that originate from Hub clients are propagated between Hubs.

Example - A newly created Cast from a user is submitted from the FC Client they use to their configured Hub, which will then publish that Message to the network. Libp2p is then responsible for making sure that message is delivered to all Hubs.

JSON RPC

Hubs can request data from each other more explicitly using the JSON RPC.
Simple Sync will be implemented using a series of JSON RPC calls.

Running a Hub

Bootstrapping

Bootstrapping a Hub to an existing network requires 2 things.

  1. Simple Sync to set up the Engine from another Hub
    a. This is the first thing a Hub does on boot up
    b. The Engine is set up by replaying all the messages received from the other Hub
  2. A Libp2p node to receive new Messages from other Hubs on the network
    a. While the Engine is being synced, Messages received are buffered
    b. Once Simple Sync is complete, the buffered Messages are replayed into the Engine

From this point we expect the Hub to be in sync with the rest of the Hubs on the network.
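The buffer-then-replay behavior during bootstrapping can be sketched like this; the class and message shapes are placeholders, not the real hub APIs:

```typescript
// Minimal sketch of boot-time message buffering (types are placeholders).
type Message = { hash: string };

class BootstrappingHub {
  private buffer: Message[] = [];
  private synced = false;

  constructor(private merge: (m: Message) => void) {}

  // 2a. While the Engine is being synced, gossip messages are buffered.
  onGossipMessage(m: Message): void {
    if (this.synced) this.merge(m);
    else this.buffer.push(m);
  }

  // 2b. Once Simple Sync completes, buffered messages are replayed in order.
  completeSync(): void {
    this.synced = true;
    for (const m of this.buffer) this.merge(m);
    this.buffer = [];
  }
}
```

Because set merges are designed to be order-independent, replaying the buffer after the bulk download converges to the same state as receiving the messages live.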

Runtime

During normal operation, Hubs only rely on Libp2p to deliver new messages and keep them in sync.

Liveness

  1. If a Hub process is restarted, the whole Bootstrapping process must be restarted as well.
    This is not great and a huge limitation of this Sync mechanism.

  2. If a Hub loses network connectivity entirely for:
    a. A short duration (5s): GossipSub will republish messages to the Hub. This is configurable.
    b. A long duration: We will need to rerun bootstrap.

Gaps

Divergence in state due to Network loss or GossipSub failure

This is the primary limitation of the current proposal. Hubs have no way to know that they're going out of sync from each other.

Some things we could do in the near term:

  1. To Gossip, we could add the total number of messages or total number of messages of this type (Casts). This would give Hubs some indication of ongoing divergence of state. After some amount of delta, we can trigger a full re-sync.
  2. We could periodically download entire sets from another Hub. (expensive)
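Idea (1) amounts to a cheap divergence check; a minimal sketch, with an arbitrary illustrative threshold:

```typescript
// Sketch: trigger a full re-sync once the message-count delta reported via
// gossip exceeds some threshold. The threshold value is illustrative.
const shouldResync = (
  localCount: number,
  peerCount: number,
  maxDelta = 10,
): boolean => Math.abs(localCount - peerCount) > maxDelta;
```

This only detects divergence in aggregate; it cannot say which messages differ, which is why it pairs with a full re-sync rather than a targeted repair.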

feat(utils): Generate a large number of mock events to populate an Engine

What is the feature you would like to implement?
A utility function that populates an engine with many users' casts, reactions, follows etc

Why is this feature important?
This is needed to build a good test for the simple Sync mechanism described here #102

Will the protocol spec need to be updated?
No

How should this feature be built? (optional)
Need to mock many users and generate events for each type, per user.

feat: denoting fid mentions clearly in cast text

What is the feature you would like to implement?
A distinct way to format a fid mention within a cast body text so that it can be parsed by clients

Why is this feature important?
When a user mentions someone, the protocol specifies that their fid should be tagged (e.g. 91293). But we need a special way to mark fids so that they do not collide with numbers or with fnames, which can technically be numbers as well.

Will the protocol spec need to be updated?
Yes, we will need to update the short cast part of the specification

How should this feature be built? (optional)
TBD


Use fid instead of username

Right now, the codebase uses username to refer to an account and includes usernames in messages. We should use fid: number instead and remove any reference to usernames.

Refactor sets to have idempotent merges

One attribute of CRDTs is that they have idempotent merge functions. Right now, in the cast set, reaction set, and verification set, we throw an error in cases where we should no-op (i.e. when a message already exists or has already been overwritten).

TODO for this issue:

  • Refactor add and remove methods (or equivalent methods) to no-op when valid messages cause no side effects
  • [Bonus] Update our return logic to provide result statuses that differentiate between no-ops and successful updates
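A minimal sketch of the idempotent merge with result statuses (the "bonus" item); the set and status names are illustrative:

```typescript
// Sketch: an idempotent merge that no-ops instead of throwing, and reports
// whether it actually changed state. Names are illustrative.
type MergeResult = "merged" | "noop";

class IdempotentCastSet {
  private adds = new Map<string, string>(); // hash -> cast body

  merge(hash: string, cast: string): MergeResult {
    if (this.adds.has(hash)) return "noop"; // duplicate: no side effects, no error
    this.adds.set(hash, cast);
    return "merged";
  }
}
```

With this shape, `merge(m); merge(m)` leaves the set identical to a single `merge(m)`, which is the idempotence property CRDTs require.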

Update yarn packages (Aug 2022)

Many of our packages are now a major version behind - we should try to get on the latest versions of all of them

➜  hub git:(main) yarn outdated

Package                          Current Wanted  Latest Package Type    URL
@noble/ed25519                   1.6.0   1.6.1   1.6.1  dependencies    https://paulmillr.com/noble/
@types/faker                     5.5.9   5.5.9   6.6.9  devDependencies https://github.com/DefinitelyTyped/DefinitelyTyped/tree/master/types/faker
@types/jest                      27.5.2  27.5.2  28.1.7 devDependencies https://github.com/DefinitelyTyped/DefinitelyTyped/tree/master/types/jest
@types/node                      17.0.45 17.0.45 18.7.6 devDependencies https://github.com/DefinitelyTyped/DefinitelyTyped/tree/master/types/node
@typescript-eslint/eslint-plugin 5.30.0  5.33.1  5.33.1 devDependencies https://github.com/typescript-eslint/typescript-eslint#readme
@typescript-eslint/parser        5.30.0  5.33.1  5.33.1 devDependencies https://github.com/typescript-eslint/typescript-eslint#readme
eslint                           8.18.0  8.22.0  8.22.0 devDependencies https://eslint.org
ethereum-cryptography            1.1.0   1.1.2   1.1.2  dependencies    https://github.com/ethereum/js-ethereum-cryptography#readme
faker                            5.5.3   5.5.3   6.6.6  dependencies    https://github.com/Marak/Faker.js#readme
husky                            7.0.4   7.0.4   8.0.1  devDependencies https://typicode.github.io/husky
jest                             27.5.1  27.5.1  28.1.3 devDependencies https://jestjs.io/
lint-staged                      12.5.0  12.5.0  13.0.3 devDependencies https://github.com/okonet/lint-staged#readme
log-update                       4.0.0   4.0.0   5.0.1  dependencies    https://github.com/sindresorhus/log-update#readme
neverthrow                       4.3.1   4.4.2   5.0.0  dependencies    https://github.com/supermacro/neverthrow#readme
nodemon                          2.0.18  2.0.19  2.0.19 devDependencies https://nodemon.io
ts-jest                          27.1.5  27.1.5  28.0.8 devDependencies https://kulshekhar.github.io/ts-jest
ts-node                          10.8.1  10.9.1  10.9.1 devDependencies https://typestrong.org/ts-node

feat: hubs events API for tracking set mutations

Overview and background

Add an API to the hubs to enable clients to stream events and replicate hub state. A client should be able to connect to a hub and receive updates when new messages are merged into the hub's sets (collection of CRDTs that represent the hub's state). The goal is to enable state replication without clients having to implement their own CRDTs.

This feature is important for launching the hub's v2.0.0 milestone, because Merkle will need to run a hub and its backend simultaneously and will need this events API to keep its database in sync with the hub. We expect most hub operators to use this feature to index hub data for easier querying.

Hub state

Hub state is represented by a collection of CRDTs that we call sets:

  • CastSet - short casts and recasts
  • FollowSet - user-to-user follow actions
  • ReactionSet - reactions to casts (i.e. likes)
  • SignerSet - custody addresses (from IDRegistry events) and delegate signers (Eddsa keys authorized to sign messages on behalf of a user)
  • VerificationSet - two-way authorizations between users and external entities (i.e. an Ethereum address)

Each set accepts particular message types that mutate its state when merged:

  • CastSet - CastShort, CastRecast, CastRemove
  • FollowSet - FollowAdd, FollowRemove
  • ReactionSet - ReactionAdd, ReactionRemove
  • SignerSet - IDRegistryEvent (from IDRegistry contract), SignerAdd, SignerRemove
  • VerificationSet - VerificationEthereumAddress, VerificationRemove

Sets accept messages in any order and multiple times, but we only want to publish events when state changes. Here are examples of expected behavior:

  • A CastShort message is received that the CastSet has not seen before and hasn't already been removed. The CastSet merges in the new message and publishes a "new cast" event with the message.
  • A FollowAdd message is received that the FollowSet has already removed via a FollowRemove message (the FollowSet is a LWW set, so the remove message has to have a timestamp after the add message). The FollowSet discards the new message and does not publish any events.
  • A ReactionAdd message is received that the ReactionSet has already merged in. The ReactionSet discards the duplicate message and does not publish any events.
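The publish-only-on-state-change rule can be sketched by gating event emission on the merge outcome; the event name and shapes are illustrative:

```typescript
import { EventEmitter } from "node:events";

// Sketch: emit an event only when a merge actually changes set state.
// The event name and message shape are illustrative, not the real hub API.
const hubEvents = new EventEmitter();
const mergedHashes = new Set<string>();

function mergeAndPublish(hash: string, type: string): boolean {
  if (mergedHashes.has(hash)) return false; // duplicate: state unchanged, no event
  mergedHashes.add(hash);
  hubEvents.emit("messageMerged", { type, hash });
  return true;
}
```

A subscriber replaying these events sees exactly one event per state change, which is the invariant the deliverable below tests for.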

Work included in this ticket

  • Gather a list of events from all hub sets that could be used to replicate hub state (i.e. cast added, cast removed, reaction added, signer added, etc.)
  • Design our own spec for subscriptions (via RPC or websocket) or integrate something like graphql-ws
  • Investigate whether/how the events API design is influenced by the choice of a persistence layer for hubs (i.e. leveldb/rocksdb vs postgres)
  • Decide how the events API will relate to the JSON-RPC API (i.e. should we add a JSON-RPC endpoint for retrieving all historical events?)
  • Decide how the events API will relate to the hub-to-hub sync protocol
  • Gather list of edge cases that could cause a service watching these events to get out of sync (i.e. dropped events, etc) with the hubs and address them

Deliverable

When Merkle runs a hub (we will have to implement #74 beforehand), they should be able to implement a service that opens a websocket connection (or another protocol) via a new endpoint and receives events from the hub. After the hub merges some list of messages (we can agree on a test case), the service should have received a single event for each message that was merged.

Refactor verification tests to be faster

Currently engine.verification.test.ts and verificationSet.test.ts are the slowest test suites (both ~7s). Speed up those test suites to be equivalent time to others by moving async code into beforeAll block and sharing generated messages between tests.

feat: replace schema with message type

What is the feature you would like to implement?
Replace schema inside message body with a type field inside message data. The type should be a 16-bit number.

Why is this feature important?
Right now message type is inferred from the format of the message and the schema field. We should be able to quickly and more directly parse a message and know what set to route it to for merging.

Will the protocol spec need to be updated?
Yes, the protocol spec should include the table of supported message types with each version.

How should this feature be built? (optional)
Here is an initial list of message types:

  • CastShort
  • CastRecast
  • CastRemove
  • ReactionAdd
  • ReactionRemove
  • FollowAdd
  • FollowRemove
  • VerificationEthereumAddress
  • VerificationRemove
  • SignerAdd
  • SignerRemove

And here are potential future message types:

  • Metadata
  • CastLong
  • CastPoll
  • CastEvent
  • CastRSVP
  • GroupInviteAdd
  • GroupInviteGrant
  • GroupInviteAccept
  • VerificationBitcoinAddress
  • VerificationTwitter
  • VerificationDomain
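The list above could map onto a 16-bit enum like the following sketch; the numeric assignments are illustrative, not a proposed wire format:

```typescript
// Sketch: a 16-bit message type enum (values are illustrative).
enum MessageType {
  CastShort = 1,
  CastRecast = 2,
  CastRemove = 3,
  ReactionAdd = 4,
  ReactionRemove = 5,
  FollowAdd = 6,
  FollowRemove = 7,
  VerificationEthereumAddress = 8,
  VerificationRemove = 9,
  SignerAdd = 10,
  SignerRemove = 11,
}

// With an explicit type field, routing to the right set is a direct switch
// instead of inferring the type from the message's shape and schema string.
const setFor = (t: MessageType): string => {
  switch (t) {
    case MessageType.CastShort:
    case MessageType.CastRecast:
    case MessageType.CastRemove:
      return "CastSet";
    case MessageType.ReactionAdd:
    case MessageType.ReactionRemove:
      return "ReactionSet";
    case MessageType.FollowAdd:
    case MessageType.FollowRemove:
      return "FollowSet";
    case MessageType.VerificationEthereumAddress:
    case MessageType.VerificationRemove:
      return "VerificationSet";
    default:
      return "SignerSet";
  }
};
```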

Add remove custody address ability to signer set

Right now the signer set only accepts new custody addresses (via addCustody). There needs to be a way to remove custody addresses as well.

When a custody address is removed, all delegate signers of that custody address should be removed as well.

Scenario Simulator

Write a scenario simulator that can put the nodes through the following situations and assert that the consensus algorithm behaves as expected in every case.

Simple Consensus
Set up 5 clients and 5 nodes. Have each client generate 5 sets with 5 messages each, where each set has a new root block and a new signer. For this phase, we assume that all signers are always valid. Then:

  1. Broadcast all messages serially
  2. Broadcast messages in random order
  3. Broadcast messages to multiple nodes simultaneously
  4. Broadcast messages in random order with randomized delays

Split Brains
Separate two nodes into a “split brain” and then:

  1. Have 2 clients send messages to one brain and 3 clients to the other, then heal the partition
  2. Have each client send messages alternately to each split and then heal the partition.

Separate Registry
Now introduce the registry contract which progressively moves signers forward on command and then:

  1. Broadcast messages, but don’t move the signers forward
  2. Broadcast messages, but only send signer events to some nodes, and then to other nodes after a long delay.
  3. Broadcast messages, but skip some signer events
  4. Simulate a rollback of a signer change

Fuzz Testing
Finally, let’s take a fuzz testing approach where we throw the kitchen sink at the system. Develop a client that will generate messages in random order (root + n messages of arbitrary type, and repeat) and that will call the registry and change signers at arbitrary periods.

  1. Keep track of all messages being generated in a log file
  2. After running the script for an hour, compare the log with each node's state and see if they line up.
  3. Randomize temporary network partitions

feat: gRPC API

What is the feature you would like to implement?
Add methods to JSON-RPC API for clients to query the hubs in a standardized way.

Why is this feature important?
Right now our sets expose various lookup functions (i.e. getAllHashes or lookup). These should be standardized and implemented in the JSON-RPC server.

Additional context
See this notion doc for some extra context: https://www.notion.so/Hub-Client-API-s-1e0ce692d08c40fbb234c6ec4b5a88aa

Handling text when receiving CastDelete messages

One of the requirements of the delete operation is that the text from the original cast be removed from the node. Since the delete operation will often be received before the cast due to our reverse chronological sync, it is important that we check the insertion of every cast efficiently and ensure that its text is removed if a delete operation was applied later in the chain. This might require us to maintain a list of "unmatched deletes" which every new cast is checked against.
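The "unmatched deletes" idea could be sketched like this; the class and method names are illustrative:

```typescript
// Sketch of the "unmatched deletes" list: remember deletes whose target cast
// hasn't arrived yet, and strip text from matching casts on insert.
// All names are illustrative, not the real hub API.
class CastStore {
  private unmatchedDeletes = new Set<string>();       // target hashes seen before the cast
  private casts = new Map<string, string | null>();   // hash -> text (null once deleted)

  onDelete(targetHash: string): void {
    if (this.casts.has(targetHash)) this.casts.set(targetHash, null); // cast known: drop text
    else this.unmatchedDeletes.add(targetHash);       // cast not yet seen: remember the delete
  }

  onCast(hash: string, text: string): void {
    if (this.unmatchedDeletes.has(hash)) {
      this.casts.set(hash, null);                     // delete arrived first: never store text
      this.unmatchedDeletes.delete(hash);
    } else {
      this.casts.set(hash, text);
    }
  }

  textOf(hash: string): string | null | undefined {
    return this.casts.get(hash);
  }
}
```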

Add new signer set to engine

  • Store map of signer sets under _signers attribute
  • Add methods to pass messages to signer set merge from engine

Strict Mode for Clients

When receiving messages from a client, nodes can be much stricter with what they accept. For instance, they can reject conflicting messages without generating a conflict proof, because they can prevent it from entering the network. This is not urgent for Phase I, but should be implemented before a launch.

Use enums for hash algorithm and use them in all messages

Currently the Message type has hash as a field. According to the spec, messages should also have hashType in order to eventually allow us to change hashing algorithms gracefully.

For this ticket:

  • Add enum type for the hash algorithms (see this thread from the signer set PR)
  • Add hashType to Message type
  • Set hashType whenever we create a message using new enum types
  • Update factories to set hashType
  • Update validateMessage in engine to fail when hash type is invalid
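The ticket items above could be sketched as follows; the enum values and type names are illustrative:

```typescript
// Sketch: a hash algorithm enum plus the validation check (values illustrative).
enum HashAlgorithm {
  Blake2b = 1,
  Keccak256 = 2,
}

interface HashedMessage {
  hash: string;
  hashType: HashAlgorithm; // set by factories whenever a message is created
}

// validateMessage would fail when the hash type is not a known algorithm.
const isValidHashType = (t: number): t is HashAlgorithm =>
  t === HashAlgorithm.Blake2b || t === HashAlgorithm.Keccak256;
```

Carrying `hashType` on every message is what lets the network migrate hashing algorithms later without ambiguity about how an old message was hashed.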

Add blockhash to Verification Claim

All Ethereum address verification claims should include a blockHash from the most recent Ethereum block.

The blockhash helps order verifications, which is useful when the same address is found to be verified to multiple Farcaster IDs. Ordering allows applications to resolve conflicts easily, and lets users nullify older verifications by issuing new ones with a more recent blockhash. The protocol itself does not verify or validate the blockHash; that's entirely up to client applications.
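A client-side conflict resolution over such claims could look like the sketch below; the field names are illustrative, and the block number stands in for the ordering that resolving the blockHash against the chain provides:

```typescript
// Sketch: a client resolving one address verified to multiple fids by
// preferring the claim anchored to the most recent block. Field names are
// illustrative; blockNumber represents the order recovered from blockHash.
interface VerificationClaim {
  fid: number;
  address: string;
  blockHash: string;
  blockNumber: number;
}

const resolveConflict = (claims: VerificationClaim[]): VerificationClaim =>
  claims.reduce((a, b) => (b.blockNumber > a.blockNumber ? b : a));
```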

feat: farcaster npm package

What is the feature you would like to implement?

  • Include types and message validation into sharable npm package for clients to use
  • Integrate package into hub codebase to avoid duplicating code
  • Use typescript, don't need to support other languages

Why is this feature important?

  • Clients shouldn't have to write their own wrappers to interact with hubs

Will the protocol spec need to be updated?

  • No

bug: dropped blockchain events can lead to divergent network state

There is a race condition in the current implementation of the Engine which can lead to a divergent network state among nodes, breaking the principle of eventual consistency. The problem goes like this:

  1. At block 100, @alice changes her signer to 0xf00
  2. At block 110, @alice changes her signer to 0xbar
  3. Node A receives the signer change 0xf00 but not the one to 0xbar
  4. Node B receives both signer changes
  5. Alice broadcasts a new chain with a root set to 120, but accidentally signs it with 0xf00 (invalid root)
  6. Node A accepts this message from Alice, but Node B rejects it
  7. Later, Node A receives the signer change

At this point, Node A has one valid chain more than Node B and the nodes will never converge. This is a likely scenario since most blockchain event subscriptions are done over websockets, which do not have strong delivery guarantees and can drop events.

Proposed Solution

Since we cannot rely on websockets, we have to implement a polling mechanism to ensure that we collect events even if the websocket fails. A pseudo-code implementation of the polling algorithm would be:

  1. Set lastPolledBlock to 0
  2. Ask the ETH node for lastKnownBlock
  3. Ask the ETH node for all events between lastPolledBlock and (lastKnownBlock - 3) (for safety)
  4. Set lastPolledBlock to lastKnownBlock - 3
  5. Repeat 2 - 4 as needed
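The five steps above can be sketched as a small runnable loop; `getLastKnownBlock` and `getEvents` are placeholder hooks for the ETH node RPC, and the 3-block margin follows the text:

```typescript
// Runnable sketch of the polling algorithm above. The fetch hooks are
// placeholders standing in for ETH node RPC calls.
type EventFetcher = (fromBlock: number, toBlock: number) => string[];

class BlockPoller {
  private lastPolledBlock = 0; // step 1

  constructor(
    private getLastKnownBlock: () => number,
    private getEvents: EventFetcher,
    private readonly safetyMargin = 3, // stay 3 blocks behind the head for safety
  ) {}

  // Steps 2-4; callers repeat this as needed (step 5).
  poll(): string[] {
    const lastKnownBlock = this.getLastKnownBlock();       // step 2
    const toBlock = lastKnownBlock - this.safetyMargin;
    if (toBlock <= this.lastPolledBlock) return [];        // nothing new yet
    const events = this.getEvents(this.lastPolledBlock, toBlock); // step 3
    this.lastPolledBlock = toBlock;                        // step 4
    return events;
  }
}
```

Because each poll covers every block since the last one, events missed by the websocket are still collected, at the cost of a small confirmation delay.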

We don't need to simulate this exactly, but we should assume that the engine may receive events out of order and so we should be able to handle that gracefully. There are two approaches:

  1. Accept First, Reject later -- in this approach, the engine will accept any signed chain that appears valid, but when it receives a new event that might change the validity of a previous chain it re-examines all of them and drops invalid changes. This is simple to implement, but leads to more state reversal which can cause further complexity for clients and entities interacting with nodes.

  2. Reject First, Accept Later -- in this approach, the engine keeps track of the lastPolledBlock and only accepts chains whose rootBlock is at or below it. This means that nodes will reject some number of messages if they are received too early. This slows down eventual consistency but avoids state reversal, meaning that it is very unlikely that a chain will be accepted and later rejected. The other downside of this approach is that it exposes the engine to more state.

Next Steps:

  • Decide which approach to take
  • Implement the logic in the addRoot and addSignerChange methods and write unit tests to cover these cases

Add SignerAdd and SignerRemove validation to engine

Custom validation for SignerAdd

  • childSignatureType is valid
  • edgeHash is a valid hash of the reconstructed SignerEdge object
  • childSignature is valid signature of edgeHash by childKey
  • All normal message validation of hash, signature, etc

chore: test coverage

  • Implement coveralls/codecov for monitoring test coverage for every PR
  • Add test coverage requirements to PR merge checklist
  • Do a final review of all uncovered lines before releasing 2.0.0

Containerize the hubs

Hubs should be executable as a docker container that anyone can download

This is important to make it easy for users to run this as a service. The right long term solution is to turn this into a self-contained binary but for now we're ok cutting corners if it means shipping a release faster.

feat: handle conflict between two CastRemove messages with the same target

What is the feature you would like to implement?
Use timestamp or message hash to pick a winner between two CastRemove messages that reference the same target. Right now CastSets could get out of sync because each will store whichever remove it received first.

Why is this feature important?
Set convergence
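One deterministic rule the issue leaves open could be: later timestamp wins, with the lexicographically lower hash as the tie-breaker. A sketch, with illustrative field names:

```typescript
// Sketch: a deterministic winner between two CastRemoves for the same target.
// The rule (later timestamp, then lower hash) is one plausible choice; the
// source leaves the exact rule open.
interface CastRemove {
  targetHash: string;
  timestamp: number;
  hash: string;
}

const pickWinner = (a: CastRemove, b: CastRemove): CastRemove => {
  if (a.timestamp !== b.timestamp) return a.timestamp > b.timestamp ? a : b;
  return a.hash < b.hash ? a : b; // tie-break on hash so every node agrees
};
```

Any rule that depends only on the messages' own contents gives convergence: two sets that have seen both removes keep the same one regardless of arrival order.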

feat: persistent storage for hub data

Hub data should be persisted to a reliable data store like Postgres, Redis or equivalent.

This is important for v2 because the Hub needs to run as a service in the cloud that can be restarted. The correct long term design might be to use a low-level kv-store like rocksdb or leveldb, but we're ok cutting corners a bit to ship quickly and refactor later. It's important that we use a simple querying and table layout so that we can migrate later when necessary.

feat(sync): Implement a simple Engine sync over RPC

What is the feature you would like to implement?
Need to be able to sync data between Engines over RPC

Why is this feature important?
This is a simple Sync mechanism that copies everything from another Hub to the local Engine.

Will the protocol spec need to be updated?
No, this is an early checkpoint in the Sync protocol

How should this feature be built? (optional)
We already have RPC APIs for each of the sets.

Simple Sync will do the following:

  1. RPC all the Fids known by the peer
  2. RPC all the Messages per Set (CastSet/ReactionSet/FollowSet etc)
  3. Merge all the messages into the local Engine
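The three steps could be sketched over hypothetical RPC stubs; the interface and function names are illustrative:

```typescript
// Sketch of the three Simple Sync steps over placeholder RPC stubs.
interface PeerRpc {
  getFids(): number[];              // stands in for the fids RPC
  getMessages(fid: number): string[]; // stands in for the per-set message RPCs
}

const simpleSync = (peer: PeerRpc, merge: (m: string) => void): number => {
  let mergedCount = 0;
  for (const fid of peer.getFids()) {          // 1. RPC all the fids
    for (const m of peer.getMessages(fid)) {   // 2. RPC all the messages per set
      merge(m);                                // 3. merge into the local Engine
      mergedCount++;
    }
  }
  return mergedCount;
};
```

Since merges are order-independent, the local Engine converges to the peer's state regardless of the iteration order over fids and sets.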

Deprecate simulator

Remove simulator from the codebase to speed up phase 1 and 2 development. A network simulator will be re-introduced in phase 3, either in this repo or another simulation-focused one.

feat: Implement a JSON RPC server and client

What is the feature you would like to implement?
An RPC server/client that will allow Hubs to share data.

Why is this feature important?
Hubs need to serve data to each other and to clients

Will the protocol spec need to be updated?
No

How should this feature be built? (optional)
Using the jayson package

Prototype Identity System

Goal

Build a working prototype of the Identity contracts in Solidity to estimate gas costs and identify any obvious shortcomings in the design. The design for this proposal is being discussed in this Notion doc.

In-place console updates for Simulator

When the simulator is run, we want to track two things -- the current state of each node (the network table) and the stream of events happening (the log lines). In the diagram below, you'll see that we're printing both sequentially, which creates a lot of thrash.

(Screenshot: simulator console output, 2022-05-04)

It would be a huge improvement if the network state kept updating at the top of the screen in place, and we had a few lines below where we could watch the logs scrolling by.

Implement Reaction Syncing

Implement reactions as a top level data type according to the Farcaster v2 protocol specification.

  • Replace emoji field in Reaction with types (and write tests)
  • ReactionSet CRDT to sync reactions and write tests.
  • Engine has methods to get and set reactions.
  • Node syncs reactions between engines when sync function is called.
  • Update the simulator to broadcast and visualize reactions.
  • Update the protocol documentation with any changes.
