celestiaorg / celestia-node
Celestia Data Availability Nodes
License: Apache License 2.0
During sync discussions, we've touched upon the need for nodes to differentiate peers by their mode/type. To accomplish this, we need to concretely define those types.
// node.Type defines a type for Nodes to be differentiated.
type Type uint8
const (
// node.Full is a full-featured Celestia Node.
Full Type = iota + 1
// node.Light is a stripped-down Celestia Node which aims to be lightweight while preserving highest possible
// security guarantees.
Light
)
// String converts Type to its string representation.
func (t Type) String() string {
if !t.IsValid() {
return "unknown"
}
return typeToString[t]
}
// IsValid reports whether the Type is valid.
func (t Type) IsValid() bool {
_, ok := typeToString[t]
return ok
}
// typeToString keeps string representations of all valid Types.
var typeToString = map[Type]string{
Full: "Full",
Light: "Light",
}
NOTE: Instead of Type there could be Mode
Move providing outside of the PutBlock function and into its own.
Only the data availability header and access to the content-routing IPFS API are needed in order to provide to the IPFS DHT, so it is possible to decouple providing from PutBlock. This gives us the advantage of having more control over when we provide and when we do not. This has already been done in the PR that inspired this issue, celestiaorg/celestia-core#427, but should likely be decoupled from that PR into its own.
Another interesting suggestion brought up by @Wondertan is to use ipld-prime. In particular, the IPLD selectors look extremely cool and relevant to what we are doing. E.g. they could be used to download a range of leaves that belong to a particular namespace at once (#221).
Note: If it turns out to be much simpler to use ipld-selectors instead of manually adding logic to traverse the NMT for #5, we should prioritize this issue. Let's keep in mind though that ipld-prime is a relatively young project and it might be better to wait until it matures and stabilizes a bit further.
We should have software-level support for different network types, e.g. celestia-devnet or celestia-mainnet.
In upcoming #57 this is solved by simply extending Config with a network type field.
The network type should be hardcoded in the binary, as different network types usually run over different software versions, and allowing users to manually change the network type is a source of bugs from users joining a network with a software version we were not expecting.
We should be prepared for the flood of issues opened by us and others as development progresses.
I like the approach Tendermint takes in its main repo, and it will help us automate the triaging process for future tickets.
Here is the list of initial templates (more could be added in the future if needed):
Detailed information for each point:
No. 1 - Similar to Tendermint's bug report template
No. 2 - If users see that a new feature would benefit the product, they can propose it here
No. 3 - StackOverflow-like questions specific to the repo (can be moved elsewhere if necessary)
No. 4 - If users find that an existing feature should be reworked/changed/improved with a different lib/architecture, they can propose it here
No. 5 - If none of the above fits the user's need
For now, it seems like No. 2 and No. 4 can be merged into one template. Nevertheless, as the product grows, it will become hard and labour-intensive for the team/contributors to triage which requests are brand-new features and which are updates to existing ones.
Implement SharesService in package shares.
Implement:
Share
SharesService
The SharesService contains the following methods / functionality:
Start
Stop
For ShareExchange, a light node only requires the ability to request shares, perform sampling (random share requests across random peers), and store the shares it receives, while a full node should be able to serve those share requests as well as perform them. ShareStore only applies to light nodes, as full nodes store full blocks; so if the node is a full node, ShareStore should be disabled.
type Service interface {
	GetShare(ctx context.Context, dah header.DataAvailabilityHeader, row, col int) (Share, error)
	GetShares(context.Context, header.DataAvailabilityHeader) ([][]Share, error)
	GetSharesByNamespace(context.Context, header.DataAvailabilityHeader, namespace.ID) ([]Share, error)
	Start(context.Context) error
	Stop(context.Context) error
}
We need a way to embed a Core node process into the Full node so we don't have to start a Core node separately and pass in its RPC endpoint to the celestia node on initialisation. Ideally, via either the Node or the rpc component, we can control the lifecycle of the Core node (Start/Stop).
This would not only help with testing, but is the desired end-goal for devnet (to run a "trusted" Core node alongside a Full node by default). This will of course be optional, as we also have the goal to implement Full <> Full communication instead of relying only on a trusted Core node. But for now, our source of new blocks is Core nodes, and we must have some way of spinning both up simultaneously with as little configuration on the user side as possible.
Implement the ability for a Celestia light or full node to request account information from a Celestia Core node via gRPC for the purposes of submitting a transaction.
Optional: implement SubmitTx throttling for DoS protection of the trusted Celestia Core node.
Implement an RPC server in package rpc that will handle inbound requests from users to submit transactions. This feature should be available for both light and full nodes.
/submit_tx -- should trigger an AccountQuery in the node to get the state for the given account in order to submit a tx
/submit_tx_sync -- which would be blocking, vs /submit_tx_async
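A minimal sketch of what the /submit_tx endpoint could look like, assuming a JSON body with a base64-encoded tx. The submitTxRequest wire format, the handler name, and the stubbed-out AccountQuery step are all assumptions for illustration, not the decided API.

```go
package main

import (
	"encoding/json"
	"io"
	"net/http"
)

// submitTxRequest is a hypothetical wire format; the real payload shape is TBD.
type submitTxRequest struct {
	Tx []byte `json:"tx"` // base64-encoded in JSON
}

func decodeSubmitTx(body []byte) (submitTxRequest, error) {
	var req submitTxRequest
	err := json.Unmarshal(body, &req)
	return req, err
}

// submitTxHandler decodes the tx and would then run an AccountQuery against
// the trusted Core node before signing and forwarding; both steps are stubbed.
func submitTxHandler(w http.ResponseWriter, r *http.Request) {
	body, err := io.ReadAll(r.Body)
	if err != nil {
		http.Error(w, "cannot read body", http.StatusBadRequest)
		return
	}
	if _, err := decodeSubmitTx(body); err != nil {
		http.Error(w, "malformed request", http.StatusBadRequest)
		return
	}
	// AccountQuery + tx submission to Core would happen here.
	w.WriteHeader(http.StatusOK)
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/submit_tx", submitTxHandler)
	// mux could also expose /submit_tx_sync and /submit_tx_async variants.
	_ = mux
}
```

The sync variant would block on the Core response inside the handler; the async variant would return as soon as the tx is accepted into a local queue.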
Implement a basic Node structure that contains node-specific information, for example:
The following should be implemented:
Basic go linting / testing pipeline. Will add to it later when we need.
Application Clients (e.g. from ORUs) need a way to simply download all data specific to their namespace (application sovereignty).
Add a simple library that traverses the NMT and returns all data of the requested namespace ID.
It might be valuable to consider adding an RPC endpoint for this too. E.g. for block explorers who want to serve Txs of certain applications (or ORU chains), too.
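The namespace-traversal library could look roughly like this. The nmtNode view, its fields, and the test tree are hypothetical stand-ins for illustration; the real tree lives in the nmt library and is addressed through IPLD, where each visit is a network fetch.

```go
package main

import (
	"bytes"
	"fmt"
)

// nmtNode is a hypothetical view over an NMT node.
type nmtNode struct {
	minNs, maxNs []byte // namespace range covered by this subtree
	data         []byte // leaf data (nil for inner nodes)
	children     []*nmtNode
}

// leavesByNamespace walks the tree, pruning any subtree whose namespace range
// cannot contain nsID, and collects matching leaf data in order.
func leavesByNamespace(n *nmtNode, nsID []byte) [][]byte {
	if n == nil || bytes.Compare(n.maxNs, nsID) < 0 || bytes.Compare(n.minNs, nsID) > 0 {
		return nil // namespace not in this subtree: prune
	}
	if len(n.children) == 0 {
		return [][]byte{n.data}
	}
	var out [][]byte
	for _, c := range n.children {
		out = append(out, leavesByNamespace(c, nsID)...)
	}
	return out
}

func main() {
	// Tiny hand-built tree: namespaces 1, 2, 2, 3 across four leaves.
	leafA := &nmtNode{minNs: []byte{1}, maxNs: []byte{1}, data: []byte("a")}
	leafB := &nmtNode{minNs: []byte{2}, maxNs: []byte{2}, data: []byte("b")}
	leafC := &nmtNode{minNs: []byte{2}, maxNs: []byte{2}, data: []byte("c")}
	leafD := &nmtNode{minNs: []byte{3}, maxNs: []byte{3}, data: []byte("d")}
	left := &nmtNode{minNs: []byte{1}, maxNs: []byte{2}, children: []*nmtNode{leafA, leafB}}
	right := &nmtNode{minNs: []byte{2}, maxNs: []byte{3}, children: []*nmtNode{leafC, leafD}}
	root := &nmtNode{minNs: []byte{1}, maxNs: []byte{3}, children: []*nmtNode{left, right}}
	fmt.Println(len(leavesByNamespace(root, []byte{2}))) // leaves "b" and "c"
}
```

The pruning step is what makes namespaced retrieval cheap: whole subtrees outside the requested namespace range are never fetched.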
We allocate plain byte slices extensively throughout the project repos, and after some work with them we discard the allocated slices, causing the GC to clean them up. In some hot paths, like the whole block processing flow, this surely causes additional pressure on both the allocator and the GC. Fortunately, there is a relatively simple trick to avoid that and reduce stress for all data allocations in general: reusing fixed-size allocated buffers through sync.Pool as the basic primitive, though we can rely on existing libs with simple APIs so as not to reinvent the wheel.
I would like to provide real numbers, but this is very application-specific, so for us they could be very different. For my previous project, htop showed almost a 2x RAM usage reduction after applying this in multiple places.
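The sync.Pool trick itself can be sketched as below. The 256-byte buffer size is a placeholder, and processShare is a hypothetical hot-path function for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// sharePool hands out reusable 256-byte buffers (the size is a placeholder
// value here) instead of allocating a fresh slice per operation.
var sharePool = sync.Pool{
	New: func() interface{} { return make([]byte, 256) },
}

func getBuf() []byte  { return sharePool.Get().([]byte) }
func putBuf(b []byte) { sharePool.Put(b) } // caller must not use b afterwards

// processShare copies the raw data into a pooled buffer, works on it, and
// returns the buffer to the pool instead of leaving it for the GC.
func processShare(raw []byte) int {
	buf := getBuf()
	defer putBuf(buf)
	n := copy(buf, raw)
	// ... decode/verify using buf[:n] ...
	return n
}

func main() {
	fmt.Println(processShare([]byte("share data")))
}
```

The contract is the usual pool one: once Put is called, the buffer may be handed to another goroutine, so no references to it may escape processShare.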
Has fewer stars and is less used within the Go community, but has a smarter API. Actually, we are already using it as we rely on IPFS/libp2p, so using it directly would allow us to share allocated bytes between IPFS and LL code, which is generally good. I would just stick with this one.
From the gopher who created fasthttp. BTW, he is from Kyiv and I know him personally. He is obsessed with optimizations.
pls add more
Implement node configuration functionality in a separate config package or under the node package (I have no preference):
./node init
Config structure (we will be using TOML as that is what Tendermint uses)

Another kind of fraud proof that we need to implement is data availability fraud proofs: nodes need to be able to generate proofs in case they observe an invalid erasure coding.
Details are laid out in:
Ideally, the implementation would be accompanied with a brief ADR.
Tasks
Make sure GitHub Actions does not have admin rights over this repo (or any celestiaorg repos, for that matter).
The current RetrieveBlockData implementation spawns a goroutine per share in the ExtendedDataSquare. In the worst case, with the max 128x128 block, that is 16384 goroutines. This causes the race detector to stop working in tests, as it has a limit of 8k routines. Furthermore, this is a lot of routines for a single operation, and in fact most of the time they are idle.
Why the routines are idle: shares in a block are addressed and committed to with multiple Merkle trees, so when we request a whole block we need to walk through all the trees. The walking step is: request a blob of data by its hash (1), unpack the block (2), check whether we unpacked more hashes or a share (3), and proceed with the unpacked hashes onto the next step recursively (4). Those steps are executed until no more hashes are left and we have all the shares. From this, we can see that every walking step is a full roundtrip - network request, unpack, request again, and so on. Now, imagine those 16384 goroutines walking down the same trees. In practice, those routines mostly wait for every single roundtrip to finish and then compete to initiate the next roundtrip. Competing, in this case, is useless, and we need to avoid it.
Instead of spawning a goroutine per share, we should spawn a routine per tree we need to walk through, specifically per DAHeader root. This way, we don't have any competing routines, and only one routine initiates all the roundtrips. In numbers, for the largest block that would be 128 routines for the Row DAHeader roots, and 256 routines if we want to fetch and store all the inner nodes of the trees.
NOTE: All the shares are addressed and committed to twice, with both Row and Column Merkle trees, thus it is not required to traverse both trees to fetch the shares; but again, if we need to store all inner nodes of both trees (cc @adlerjohn), then both trees should be traversed.
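The goroutine-per-root shape can be sketched as below. fetchTree here is a hypothetical stand-in that returns fake share indices; in the real code each step inside it would be a network roundtrip down one Merkle tree.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchTree is a stand-in for walking one Merkle tree down from a DAHeader
// root; it returns fake share indices instead of performing roundtrips.
func fetchTree(root int, sharesPerTree int) []int {
	out := make([]int, 0, sharesPerTree)
	for i := 0; i < sharesPerTree; i++ {
		out = append(out, root*sharesPerTree+i)
	}
	return out
}

// retrieveShares spawns one goroutine per row root instead of one per share,
// so only numRoots goroutines drive roundtrips concurrently.
func retrieveShares(numRoots, sharesPerTree int) [][]int {
	res := make([][]int, numRoots)
	var wg sync.WaitGroup
	for r := 0; r < numRoots; r++ {
		wg.Add(1)
		go func(r int) {
			defer wg.Done()
			res[r] = fetchTree(r, sharesPerTree) // each routine owns one tree walk
		}(r)
	}
	wg.Wait()
	return res
}

func main() {
	rows := retrieveShares(128, 128) // largest block: 128 row roots
	fmt.Println(len(rows), len(rows[0]))
}
```

Because each goroutine writes only to its own res[r] slot, no locking is needed, and the routine count is bounded by the number of roots rather than the number of shares.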
As already mentioned, the original spark of this issue was the race detector complaining about exceeding the limit of 8k routines. The solution for that was to detect if the race detector is running and simply skip the test 😬. Later, after some discussions, celestiaorg/celestia-core#357 was created, and the solution for it is being implemented in celestiaorg/celestia-core#424.
Also, supersede celestiaorg/celestia-core#278
There is a plan to remove DAHeader and use only one root hash to address and commit to all the shares. In such a case, we would not need to spawn any routines at all. The routine calling the RetrieveBlockData needs to be blocking anyway and it can then take care of initiating all the round trips.
CI often fails due to the lint timeout being exceeded, like here: https://github.com/celestiaorg/celestia-node/pull/52/checks?check_run_id=3554301525#step:3:73
We already merged a PR (#49) to increase the timeout by 1 minute, but that didn't fix it.
Broadcast
/broadcast_tx_async (non-blocking)
/broadcast_tx_sync (blocking)

We haven't decided which DB to use in order to store erasure-coded blocks in Full nodes. I think this warrants a small ADR of its own (which can also be combined with our decision on how to store blocks, how many to cache, etc.)
From what I remember at the offsite, we discussed using badgerdb. I'd like to start a thread here and then have either me or @Wondertan write up the final decision in an ADR.
For obvious reasons, we have to choose a way of logging things. Instead of arguing about which library to use, let's first agree on the most convenient way for us to use a logger. Currently, I see only two options, but feel free to suggest others (NOTE: each option would produce identical output):
package foo
// def
type FooService struct {
...
log log.Logger
}
func (fs *FooService) Foo() {
fs.log.Debug("Foo fooed!")
}
// usage
func logic() {
toplvlLogger := log.Logger("")
fs := &FooService{..., toplvlLogger.With("foo")}
fs.Foo()
}
// def
package foo
var log = logging.Logger("foo")
type FooService struct {
...
}
func (fs *FooService) Foo() {
log.Debug("Foo fooed!")
}
// usage
func logic() {
fs := &FooService{...}
fs.Foo()
}
Personally, the second way seems much cleaner to me and is my favorite, as it does not require us to pass the logger around everywhere, all the time.
WDYT?
Ideally, we want to connect only to nodes that run some lazyledger node, too.
IPFS nodes connect to a large set of nodes by default. That's cool but we want our nodes to only connect to a set of nodes that 1) load the LL ipld plugin and 2) run some other portion of the lazyledger software (light client, (full) node, archival node etc).
The project is intended to grow with a variety of components and services in it. To remove the time and mental overhead of writing and updating node initialization logic, where we build and order all the components ourselves, we should delegate that to a DI container. There are multiple options, but Uber's is solely based on reflection and does not involve code generation. Even though reflection is considered slow, the slowness manifests only at node build time, not at runtime.
celestiaorg/celestia-core#375 disables providing in Bitswap here. Unfortunately, this option also disables reproviding.
Reproviding is basically re-execution of providing. Another important conceptual aspect of the DHT is that its entries are not persistent and are cleaned up over time; thus there should be logic that automatically renews entries on the DHT.
As celestiaorg/celestia-core#375 disables it automatically, we need to enable it manually. Luckily, we can implement our own reproviding strategy which reprovides only roots, and not all the CIDs in the blockstore as IPFS excessively does.
Importantly, this is required for the network to work properly; otherwise recently proposed blocks won't be available after ~12 hours. Thus, this needs to be part of celestiaorg/celestia-core#381
Implement some kind of interface for the Node so that any other application using the Node object and/or the local CLI could remotely interact with the running daemon using the same standardised interface.
Components:
Node
Sub-issue of #25
Implement NewBlockEventSubscription such that requesting the following information from the following Celestia Core endpoints is supported:
Request:
/block endpoint

Currently, we are following the pattern of celestia-core and tendermint by including docs in the repo here, but ideally docs/specs/ADRs should be extracted into a separate repo.
Before incentivised testnet, we should have a set of test plans that contain such scenarios:
In order to put the tests above onto a regular run cadence/schedule, we need to do the following:
There are multiple things that Celestia Node needs to store:
With the purpose of encapsulation - sealing all on-disk footprint away from the Node - let's introduce a Repository. It should manage the root directory, versioning, categorizing, and grouping of any generated or user data to be stored on disk.
// Note that other things can be done later
type core.Repository interface {
Config() (Config, error)
PutConfig(*Config) error
}
type node.Repository interface {
Keystore() (keystore.Keystore, error)
Datastore() (datastore.Batching, error)
Core() (core.Repository, error)
Config() (*Config, error)
PutConfig(*Config) error
Path() string
Close() error
// Optional
DiskUsage() (uint64, error)
SetAPI(string) error
GetAPI() (string, error)
}
Implement BlockService in a separate block package:
RawBlock -- a "raw" block received from Celestia Core (that is not erasure coded)
ErasuredBlock -- an erasure coded block
BlockService
Start
Stop
NewBlockEventSubscription -- ability for full nodes to "subscribe" to new RawBlocks from Celestia Core via RPC
BlockExchange -- ability for full nodes to request / send RawBlocks to their other full node peers
ErasureCodedBlockStore
Note that full nodes will be able to ask for RawBlocks from either a trusted Celestia Core node (that is running simultaneously with the full node), or from other full node peers. Full nodes will learn of new blocks via a header announcement, either through ExtendedHeaderSub (in which case they will be notified by their other full node peers), or via announcement from their trusted Celestia Core node via RPC.
Describe the high-level architecture of the Celestia node and its different components (and then start the implementation accordingly).
This could be several ADRs:
And more granular descriptions of:
@renaynay @Wondertan feel free to amend / edit.
I recommend using cobra, as the maintainer is directly affiliated with Golang and is generally more active in maintenance than urfave/cli.
Requirements:
celestia light init
celestia light start
celestia full init
celestia full start
--config flag on init
Flags for light mode are grouped under command ./celestia light --help (as well as other issue areas beyond the node type, like metrics, utils, etc.)
Nice to haves:
/ ____/__ / /__ _____/ /_(_)___ _
/ / / _ \/ / _ \/ ___/ __/ / __ `/
/ /___/ __/ / __(__ ) /_/ / /_/ /
\____/\___/_/\___/____/\__/_/\__,_/
Currently, we spin up a full ipfs node which is kinda bloated for what we are trying to achieve.
@Wondertan brought up some very good alternatives:
We should carefully understand the pros/cons before we jump into either of these. I'm leaning towards the first approach as it is the most light-weight and gives us more freedom on how to interact with ipfs. It would especially be the most reasonable approach if we ever go all-in and replace the tendermint p2p stack with something libp2p based. This is also related to the work over at optimint: https://github.com/lazyledger/optimint/labels/C%3Ap2p
Consider using graphsync instead of bitswap
Note that this currently has low priority. Only if bitswap performs unacceptably for our case (which it does not seem to: celestiaorg/ipld-plugin-experiments#9 (comment)) would we need to reprioritize this before launch.
Celestia-node is p2p-centric and requires integration of the libp2p services/components listed below:
Config fields for each component should also be added.
The node types section only states who (which node type) is able to generate proofs of invalid erasure coding. Nowhere is it explained what happens after generating them. This leaves a lot of room for interpretation. Who should care about those proofs, and how will they be propagated (and to whom)?
Consumers of fraud proofs would be anyone that just does DAS. I guess that could be made more explicit.
Also, do they trigger slashable events? If yes, who will be slashed?
In terms of what penalties fraud proofs are associated with, that's more of a consensus/evidence concern.
These should be separate issues that should be handled successively:
related issue about evidence types: celestiaorg/celestia-specs#23
also related: celestiaorg/celestia-specs#110
This is out of scope for devnet, but we should think about how we want Share pruning to work for light nodes. Our goal is to keep light nodes truly light, so we shouldn't impose large storage requirements on them.
Questions:
(e.g. via SharesByNamespace) or via sampling? If so, why?

For ShareExchange and DASing to work, we need to use the IPLD implementation of the NMT, which is currently located in Core. The Node needs to use it as well, so we need to extract the plugin with supporting functions into a separate repo to be used by both the Core and Node repos.
ipld package from celestiaorg/celestia-core#427

Partial storage nodes are nodes that only store some of the blocks in the blockchain, and can be queried by any other nodes (including light clients and other partial nodes) to download data from parts of the chain.
There are two main questions:
I propose a method of answering the above, with a scalable "tree-based" approach.
Let's assume a network-wide constant MIN_GRANULARITY = 1000 blocks, where MIN_GRANULARITY is the minimum number of consecutive blocks you can advertise that you are storing to the network (which we call a "blockset"), and a constant BASE = 10. We call a range of blocksets a "blockrange" (e.g. blockrange 0-10K consists of blocksets 0-1K, 1K-2K, ..., 9K-10K). We can organise the blocksets into a directory structure, where each directory has BASE subdirectories (blockranges) or files (blocksets). Let's say there are 10 million blocks in the chain; the directory would look as follows:
0-10M/
├─ 0-1M/
│ ├─ 0-100K/
│ │ ├─ 0-10K/
│ │ │ ├─ 0-1K
│ │ │ ├─ 1K-2K
│ │ │ ├─ ...
│ │ │ ├─ 9K-10K
│ ├─ 100K-200K/
│ ├─ .../
│ ├─ 900K-1M/
├─ 1M-2M/
├─ .../
├─ 9M-10M/
Each subdirectory (blockrange) or file (blockset) would be its own network topic. For example, a topic could be 0-10K (blockrange) or 0-1K (blockset). The network has the following interfaces:
GetPeers(topic) returns some IP addresses of peers that have advertised that they are serving the blockrange/set for topic.
Advertise(topic, ip) advertises that a node with IP address ip is serving the blockrange/set for topic.
The above operations might be expensive or time-consuming. Therefore, depending on how many blocks and blockranges there are in the network, partial storage nodes may only advertise up to a certain height of blockranges, and likewise clients querying the nodes might only try to get peers from a certain height of blockranges. Let's assume a client-side variable GRANULARITY, where GRANULARITY >= MIN_GRANULARITY, on both partial storage nodes and client nodes.
When a partial storage node wants to call Advertise() on blockranges that it's serving, it will only do so on blockranges whose granularity is at least GRANULARITY. For example, if a partial storage node is serving blocks 0-1M and GRANULARITY = 100,000, then it will call Advertise() on 0-1M, 0-100K, ..., 900K-1M, but not on 0-10K, ..., 9K-10K, etc.
Similarly, if a client wants to download data in block 1500, for example, the deepest blockrange it would try to GetPeers() for is 0-100K. One can also construct different algorithms to find peers using a top-to-bottom approach. For example, the client can first call GetPeers() on blocks 0-10M, but if no node is storing 10M blocks, it could then try calling GetPeers() on blocks 0-1M, and so on.
This would allow the network to self-adjust the acceptable data in each shard, depending on how big blocks are or how much storage resources partial nodes have.
Note: GRANULARITY is a client-side variable that can be adjusted automatically by the client itself based on its success in downloading blocks at different granularities. On the other hand, MIN_GRANULARITY and BASE are network-wide variables that have to be agreed network-wide as part of the p2p protocol.
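The advertisement rule above can be sketched as a small function. This is a sketch under stated assumptions: the half-open [start, end) range convention, the topic naming as "lo-hi" strings, and the alignment rule (only ranges fully contained in the served interval) are my interpretation of the scheme, not part of the proposal.

```go
package main

import "fmt"

// advertiseTopics returns the blockrange/blockset topics a partial storage node
// serving blocks [start, end) should Advertise, given the client-side
// GRANULARITY and the network-wide MIN_GRANULARITY and BASE from the text.
func advertiseTopics(start, end, minGranularity, base, granularity int) []string {
	var topics []string
	// Walk range sizes from one blockset upward: 1K, 10K, 100K, ...
	for size := minGranularity; size <= end-start; size *= base {
		if size < granularity {
			continue // too fine-grained to advertise
		}
		// Advertise every aligned range of this size fully inside [start, end).
		for lo := (start / size) * size; lo+size <= end; lo += size {
			if lo >= start {
				topics = append(topics, fmt.Sprintf("%d-%d", lo, lo+size))
			}
		}
	}
	return topics
}

func main() {
	// Serving 0-1M with GRANULARITY = 100K: ten 100K ranges plus 0-1M itself.
	fmt.Println(len(advertiseTopics(0, 1_000_000, 1000, 10, 100_000)))
}
```

This reproduces the worked example from the text: a node serving 0-1M at GRANULARITY = 100K advertises the eleven topics 0-100K through 900K-1M plus 0-1M, skipping everything at 10K and 1K granularity.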
An alternative to a subnet-based peer discovery approach is one where there is only one network of partial storage nodes, which have status messages representing which blocks they have. Partial storage nodes would have the following interface:
GetStatus(GRANULARITY), where GRANULARITY >= MIN_GRANULARITY, returns a bit field where the index of each bit in the field is a blockrange corresponding to GRANULARITY, and an on-bit means that the node has the blocks in that blockrange.
For example, if GetStatus(1M) is called in a chain with 10M blocks, and the partial storage node is only storing blocks 1M-2M, the bit field would be as follows:
0100000000
^
|
blockrange 1M-2M
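The bit-field construction can be sketched as below. Representing the field as a string of '0'/'1' characters and requiring a node to hold the entire blockrange before setting its bit are assumptions for illustration; a real implementation would use a packed bit field.

```go
package main

import (
	"fmt"
	"strings"
)

// statusBits returns a bit string for a node storing blocks [storeLo, storeHi)
// in a chain of chainLen blocks, at the requested granularity.
func statusBits(storeLo, storeHi, chainLen, granularity int) string {
	var b strings.Builder
	for lo := 0; lo < chainLen; lo += granularity {
		hi := lo + granularity
		// The bit is set only when the node has the whole blockrange.
		if storeLo <= lo && hi <= storeHi {
			b.WriteByte('1')
		} else {
			b.WriteByte('0')
		}
	}
	return b.String()
}

func main() {
	// Node storing blocks 1M-2M in a 10M-block chain, GRANULARITY = 1M.
	fmt.Println(statusBits(1_000_000, 2_000_000, 10_000_000, 1_000_000)) // 0100000000
}
```

This matches the diagram above: only the bit for blockrange 1M-2M is set.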
In the fraud proofs paper, the client picks the (x, y) co-ordinates, but the node decides whether to return a response from the row or column root.
In the current implementation, the client also decides whether the response should be from a row or column root.
We should consider the security implications of this, if any.
Implement a Service interface that will represent all of the "services" that will be constructed and registered on the node via DI (such as dig). I recommend this be implemented in the node package rather than a separate package, but I am open to hearing arguments for separating it from node.
The interface should contain at a bare minimum the following behaviours:
Implement a simple RPC client on the Node that can be started and stopped, and that can dial specific endpoints on the Celestia Core node, like /block, to get the raw block from Celestia Core.
We can either import the RPC interface from tendermint or implement our own.
The Data Availability model we use requires data discovery. We rely on IPFS's Kademlia DHT, which basically allows any network participant to find a host for a certain piece of data by its hash.
To describe the way we use it, let's introduce a simple pseudo-code interface for it:
interface DHT {
// Find the nearest peers to the hash and ask them to keep a record of us hosting the data.
// By default, records are stored for 24h.
Provide(hash)
// Find peers hosting the data by its hash.
FindProviders(hash) []peer
// Periodically execute Provide for a given hash to keep the record around.
Reprovide(hash)
}
When a block producer creates a block, it saves it and calls Provide for every Data Availability root of the block, making it discoverable and, in turn, available. Afterwards, any other node that wants to get the block's data or validate its availability can call FindProviders, detect the block producer, and finally access the block data through Bitswap. The block producer and block requester also call Reprovide. Overall, with the described flow, we aim for maximum confidence that the data of any particular block is always discoverable from peers storing it.
The current state of the implementation does not conform to the flow above, and these things are left to be done:
Records of someone hosting data are stored on peers selected not by their qualities but by the simple XOR metric. Unfortunately, this eventually makes different light clients store those records unreliably, as they are not meant to be full-featured daemons. Therefore, some data may become undiscoverable for some period of time.
We need to ensure providing takes less time than the time between two subsequent block proposals by a node. Otherwise, DHT providing wouldn't keep up with block production, creating an ever-growing providing queue. Unfortunately, for the standard DHT client, providing can take up to 3 mins on a large-scale network.
From this also comes a rule: the bigger the committee is, the more time the node has to proceed with providing. So naturally, the larger the network, the larger the committee, and the longer the allowed providing time; altogether, these can overlap organically without causing any issues. But if we still observe slow providing being an issue, a full-routing-table DHT client for block producers would be a solution, as it significantly reduces providing time.
To validate a block's availability, we need to randomly sample some of its parts by requesting them from the network. Network requests must have a timeout so that we don't wait for a response infinitely. Also, the IPFS software does not specify any timeout for its users' data requests, so, to avoid waiting endlessly, we have to specify an adequate limit for sample response time.
I defined an arbitrary timeout of 1 min for the request, which is not backed by any rationale.
I am not entirely sure about the proper way to find ideal timings here, but I think we need to benchmark median numbers over a real-world environment and add slightly more time on top for inaccuracies.
A build command that can be executed with various kinds of params.
ValidateAvailability currently panics if the number of samples is greater than squareWidth**2.
This behavior should either be changed (implicitly use min(numSamples, squareWidth**2) for the actual number of samples), or the documentation should be improved to state that the caller is responsible for ensuring that the number of samples meaningfully depends on the block size.
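The first suggested fix is a one-line clamp. A minimal sketch, with the function name clampSamples chosen for illustration:

```go
package main

import "fmt"

// clampSamples implements the suggested fix: never request more samples than
// there are shares in the extended square.
func clampSamples(numSamples, squareWidth int) int {
	max := squareWidth * squareWidth
	if numSamples > max {
		return max
	}
	return numSamples
}

func main() {
	fmt.Println(clampSamples(100, 4)) // capped at the 16 shares of a 4x4 square
}
```

With this in place, ValidateAvailability could accept any sample count without panicking, while callers that care about sampling strength still get a meaningful number.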