
chainweb-node's Introduction

Kadena Public Blockchain

Kadena is a fast, secure, and scalable blockchain using the Chainweb consensus protocol. Chainweb is a braided, parallelized Proof-of-Work consensus mechanism that improves throughput and scalability in executing transactions on the blockchain while maintaining the security and integrity found in Bitcoin.

Read our whitepapers:

For additional information, press, and development inquiries, please refer to the Kadena website.

Docs

The Kadena Docs site, which can be found here, serves as a source of information about Kadena. There you can find information about how to interact with the public chain, including how to get keys, view network activity, explore blocks, and more.

If you have additions or comments, please submit a pull request or raise an issue - the GitHub project can be found here

Installing Chainweb

Minimal recommended hardware requirements for nodes are:

  • 2 CPU cores
  • 4 GB of RAM
  • 250 GB SSD or fast HDD
  • Public IP address

If the node is also used as an API server for Pact, mining, Rosetta, or chainweb-data: 4 CPU cores and 8 GB of RAM.

Docker (all batteries included)

A docker image is available from here and can be used with the following commands:

# Initialize the database (optional, but avoids several hours of initial db synchronization)
docker run -ti --rm -v chainweb-db:/root/.local/share/chainweb-node/mainnet01/0/ kadena/chainweb-node /chainweb/initialize-db.sh
# Run a chainweb-node in Kadena's mainnet
docker run -d -p 443:443 -v chainweb-db:/root/.local/share/chainweb-node/mainnet01/0/ kadena/chainweb-node
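
Once the container is up, a quick way to confirm it is running and making progress is to watch its logs. This is a generic Docker check, not specific to the Kadena image; the container id comes from docker ps:

# List running containers and follow the node's log output
docker ps
docker logs -f <container-id>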

Further details can be found in the README of the docker repository.

Docker (bare metal)

A docker image with just a bare chainweb-node binary and its dependencies is available at ghcr.io/kadena-io/chainweb-node/ubuntu:latest. It is up to the user to setup and manage the database and configure the node to their needs.

docker run -p 1789:1789 -p 80:80 --entrypoint=/chainweb/chainweb-node ghcr.io/kadena-io/chainweb-node/ubuntu:latest --help
docker run -p 1789:1789 -p 80:80 --entrypoint=/chainweb/chainweb-node ghcr.io/kadena-io/chainweb-node/ubuntu:latest --print-config
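
As a sketch of a bare-metal run that keeps the database on the host (the host path is a placeholder, and the in-container database location is an assumption based on the path used by the kadena/chainweb-node image above; the port mapping follows the examples above):

docker run -d -p 1789:1789 -p 80:80 \
  -v /path/to/chainweb-db:/root/.local/share/chainweb-node/mainnet01/0/ \
  --entrypoint=/chainweb/chainweb-node \
  ghcr.io/kadena-io/chainweb-node/ubuntu:latest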

Examples of docker compose setups for chainweb-node covering different usage scenarios can be found in this repository.

Ubuntu Linux

The following packages must be installed on the host system:

  • ubuntu-20.04:

    apt-get install ca-certificates libgmp10 libssl1.1 libsnappy1v5 zlib1g liblz4-1 libbz2-1.0 libgflags2.2 zstd
  • ubuntu-22.04:

    apt-get install ca-certificates libgmp10 libssl3 libsnappy1v5 zlib1g liblz4-1 libbz2-1.0 libgflags2.2 zstd

Chainweb-node binaries for ubuntu-20.04 and ubuntu-22.04 can be found here.

Download the archive for your system, extract the binaries, and place them in a directory from which they can be executed.
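
A minimal sketch of that workflow, assuming a hypothetical archive name (substitute the actual release artifact for your Ubuntu version):

# Extract the release archive and install the binary somewhere on your PATH
tar -xzf chainweb-node-<version>.ubuntu-22.04.tar.gz
sudo install -m 755 chainweb-node /usr/local/bin/
# Check that all required shared libraries are present
ldd /usr/local/bin/chainweb-node | grep "not found" || echo "all shared libraries found"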

At this point, you are ready to run a Chainweb node.

Building from Source

IMPORTANT NOTE: We recommend the use of officially released chainweb-node binaries or docker images, which can be found in the release section of this repository. If you decide to build your own binaries, please make sure to only use officially released and tagged versions of the code. Those versions are extensively tested to ensure that they are compatible with all other nodes in the chainweb network. It is generally not safe to run arbitrary builds of the master branch in the Kadena mainnet.

Chainweb is a Haskell project. After cloning the code with git from this GitHub repository, the chainweb-node application can be built as follows.

Building with Cabal

In order to build with cabal, you have to install ghc-8.10.7 (the Haskell compiler) and cabal >= 3.4 (the Haskell build tool).

You need to install the development versions of the following libraries: gflags, snappy, zlib, lz4, bz2, zstd.

On apt-based distributions these can be installed as follows:

apt-get install ca-certificates libssl-dev libgmp-dev libsnappy-dev zlib1g-dev liblz4-dev libbz2-dev libgflags-dev libzstd-dev

To build a chainweb-node binary:

# Only necessary if you haven't done this recently.
cabal update

# Build the project.
#
# After this, a runnable binary can be found by running `cabal list-bin chainweb-node`.
cabal build
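
To locate the resulting binary and confirm that it runs:

# Print the path of the built binary and show its help text
cabal list-bin chainweb-node
"$(cabal list-bin chainweb-node)" --help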

Building with Nix

Another way to build and run chainweb is to use the Nix package manager which has binary caching capabilities that allow you to download pre-built binaries for everything needed by Chainweb. For detailed instructions see our wiki.

When the build is finished, you can run chainweb with the following command:

./result/ghc/chainweb/bin/chainweb-node

Bootstrap Nodes

Bootstrap nodes are used by chainweb-nodes on startup in order to discover other nodes in the network. At least one of the bootstrap nodes must be trusted.

Chainweb node operators can configure additional bootstrap nodes by using the --known-peer-info command line option or in a configuration file. It is also possible to ignore the builtin bootstrap nodes by using the --enable-ignore-bootstrap-nodes option or the respective configuration file setting.
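
As an illustration, an additional bootstrap peer could be configured as follows (the value format shown is an assumption; consult chainweb-node --help for the exact syntax), using one of the mainnet bootstrap nodes listed below:

chainweb-node --known-peer-info=us-e1.chainweb.com:443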

Bootstrap nodes must have public DNS names and a corresponding TLS certificate that is issued by a widely accepted CA (a minimum requirement is acceptance by the OpenSSL library).

Operators of bootstrap nodes are expected to guarantee long-term availability of the nodes. The list of builtin bootstrap nodes should be kept up-to-date and concise for each chainweb-node release.

If you would like to have your node included as a bootstrap node, please make a pull request that adds your node to the P2P.BootstrapNodes module.

Current Testnet Bootstrap Nodes

  • us1.testnet.chainweb.com
  • us2.testnet.chainweb.com
  • eu1.testnet.chainweb.com
  • eu2.testnet.chainweb.com
  • ap1.testnet.chainweb.com
  • ap2.testnet.chainweb.com

Current Mainnet Bootstrap Nodes

All bootstrap nodes are running on port 443.

  • us-e1.chainweb.com
  • us-e2.chainweb.com
  • us-e3.chainweb.com
  • us-w1.chainweb.com
  • us-w2.chainweb.com
  • us-w3.chainweb.com
  • jp1.chainweb.com
  • jp2.chainweb.com
  • jp3.chainweb.com
  • fr1.chainweb.com
  • fr2.chainweb.com
  • fr3.chainweb.com

Configuring, running, and monitoring the health of a Chainweb Node

This section assumes you've installed the chainweb-node binary somewhere sensible, or otherwise have a simple way to refer to it. For running chainweb-node via docker, please see the instructions above in this document or visit our docker repository.

Note: Your node needs to be reachable from the public internet. You will have to set up port forwarding if your machine is behind a router (by default the node uses port 1789).

NOTE: When you start chainweb-node for the first time it creates a new empty database and starts to synchronize and catch up with other nodes in the Kadena network. This process takes a long time -- several days. It is much faster (one to a few hours, depending on hardware) to synchronize only the chain database, or obtain a snapshot of it, and rebuild just the Pact databases from the chain database. Please consult the documentation of the docker images for chainweb-node for details on how to obtain an initial chain database.

Run your node:

chainweb-node

The node will communicate with other nodes in a P2P network. By default it uses port 1789 for the P2P communication.

Node services are exposed via the service API, by default on port 1848. The service API includes /info, /health-check, Pact endpoints, Rosetta endpoints, the mining API endpoints, GET endpoints for on-chain data (headers, payloads, cuts), and an HTTP event stream of block header updates. Some of these are disabled by default (e.g. mining API, Rosetta, and header updates).
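
For example, assuming the node runs locally with default settings, the informational endpoints of the service API can be queried like this:

# Query the service API (default port 1848) on the local node
curl -s "http://localhost:1848/health-check"
curl -s "http://localhost:1848/info"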

While the P2P endpoint must be directly available from the public internet, it is highly recommended to expose the service API only on a private network. When service API endpoints are made available publicly, it is recommended to use a reverse proxy to provide rate limiting, authentication, and CORS.

Configuration

No particular configuration is needed for running a Chainweb node on the Kadena mainnet.

Use chainweb-node --help to show a help message that includes a brief description of all available command line options.

A complete configuration file with the default settings can be created with

chainweb-node --print-config > config.yaml

This file can then be edited in order to change configuration values.

The command chainweb-node --help also provides descriptions of these configuration values.

Given a configuration file or a set of command line options it is possible to print out only those configuration values that are different from their respective default:

chainweb-node --config-file=config.yaml --some-command-line-options --print-config-as=minimal
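
Putting these pieces together, a typical configuration workflow looks like this (all options shown are described by chainweb-node --help):

# Generate a default configuration file
chainweb-node --print-config > config.yaml
# ... edit config.yaml as needed ...
# Start the node with the edited configuration
chainweb-node --config-file=config.yaml
# Show only the values that differ from the defaults
chainweb-node --config-file=config.yaml --print-config-as=minimal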

Monitoring the health of a Chainweb Node

The following outlines how you can check that your chainweb-node is healthy.

chainweb-node should be reachable at a public IP address on a port that is open to the other Chainweb nodes.

If you're behind a NAT, it is VERY IMPORTANT that your network allows external nodes to connect to the node you are running.

$ chainweb-node --log-level <desired-log-level>

For production scenarios we recommend that you use log-level warn or error. For troubleshooting or improved monitoring you can also use info.

Once your node is running, go through the following checks to verify that you have a healthy node:

  • run the command in your terminal:
    $ curl -sk "https://<public-ip>:<port>/chainweb/0.0/mainnet01/cut"
    
  • navigate to the following URL in your browser: https://<public-ip>:<port>/chainweb/0.0/mainnet01/cut
  • check logs for whether services are started
  • check if the node is receiving cuts
  • look for errors in the logs
  • look for warnings in the logs

Usually, when a node is receiving and publishing cuts (i.e. block heights at every chain), it's working correctly.

The /cut endpoint will return the latest cut that your node has. It's possible that your node is falling behind, so make sure to compare its cut height with the cut heights of the bootstrap nodes. It's also possible that you are mining to a node that is catching up to the rest of the network. Before you start mining to a node, you SHOULD verify that this node has the most up-to-date cut.

You can get the cut height of any node by running the following:

$ curl -sk https://<bootstrap-node-url>/chainweb/0.0/mainnet01/cut | jq '.height'
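
For example, to compare your node's cut height with that of one of the mainnet bootstrap nodes listed above (your node's address is a placeholder; all mainnet bootstrap nodes listen on port 443):

# Your node's cut height
curl -sk "https://<public-ip>:<port>/chainweb/0.0/mainnet01/cut" | jq '.height'
# A bootstrap node's cut height
curl -sk "https://us-e1.chainweb.com/chainweb/0.0/mainnet01/cut" | jq '.height'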

Mine for a Chainweb Network

Successful mining on mainnet requires specialized hardware (ASIC). The setup for solo mining involves running a chainweb-node with a configuration that enables mining and a chainweb-mining-client that connects to the mining API of a chainweb-node and provides a Stratum API for the mining hardware (ASIC).

Detailed instructions for setting up all the infrastructure needed to start mining using docker compose can be found in the documentation of docker-compose-chainweb-node/mining-node.

For example, to set up a chainweb node for mining, see this section of the docker-compose file.

Detailed mining client instructions can be found in the documentation of chainweb-mining-client

Chainweb Design

Component Structure

The Chainweb package contains the following buildable components:

  • chainweb library: It provides the implementation for the different components of a chainweb-node.

  • chainweb-node: An application that runs a Chainweb node. It maintains copies of a number of chains from a given Chainweb instance. It provides interfaces (command-line and RPC) for directly interacting with the Chainweb or for implementing applications such as miners and transaction management tools.

  • chainweb-tests: A test suite for the Chainweb library and chainweb-node.

  • cwtool: A collection of tools that are helpful for maintaining, testing, and debugging Chainweb.

  • bench: a collection of benchmarks

Architecture Overview

(Architecture overview diagram)

chainweb-node's Issues

Build chainweb with nix and running a chainweb node

Following the instructions in the README, I should be able to run a chainweb node after building it with nix, but I couldn't. The reason is that the chainweb-node executable is not there.

To run chainweb-node successfully:

./result/ghc/chainweb/bin/chainweb-node --node-id=0 --config-file=./scripts/test-bootstrap-node.config

I am not sure if this is the right executable.

Profiling and benchmark suite

Usually, every newly created Haskell code base has some performance issues. These are often due to laziness bugs. Among those are often some low-hanging fruits that can be quickly identified through performance and heap profiling.

Using ThreadScope can also provide insight into scheduling and garbage collection overhead.

There should also be a benchmark suite for critical operations like serializing and deserializing block headers, looking up block headers in the chain db, synchronizing db snapshots, adding new block headers, validating block headers, and difficulty adjustment computations.

Megaparsec 7 Compatibility

It has been about 8 months since the beginning of the Megaparsec 7 series. 7 has better error handling, but broke their API in doing so. In particular, the Stream class has changed and the instance of it for Cursor in Pact.Types.ExpParser will need to be updated.

Limit concurrency when fetching dependencies from "origin"

When pulling dependencies of a Cut, it is first attempted to request those from the origin node of the cut. Dependencies are queried in parallel as they are discovered.

Due to task sharing, concurrency for block headers is (big-O) bounded by the width of the chainweb. However, block headers are small and can be queried quickly. Payloads are potentially large, and fetching them could pile up parallel tasks.

This is only an issue for fetching from "origin". If the origin query fails, dependencies are queried through the P2P network, which has bounded concurrency.

Safe `BlockHeader` Decoding

#322 removes recently added logic that detects when a genesis block is being decoded. This caused problems for code reorganization, as described here.

Some logic of similar spirit ought to be reinstated somewhere. The primary consumer of the function in question (decodeBlockHeader) is the FromJSON instance of BlockHeader. Currently, the dependency graph between these types and functions is as follows:

(Diagram: genesisBlockHeader dependency graph)

To put the check back where it was would require drawing an arrow between decodeBlockHeader and genesisBlockHeader, which is clearly not possible. So, some further consideration is required.

ChainId incorporates network/version

ChainId needs to incorporate the version into itself so that you can never type a chain ID for a network that is invalid. I propose we move to a chain ID that incorporates a network ID (version) in its representation, and use that everywhere, including in endpoint URLs.

This eliminates magic and allows trivial creation of chain IDs as validation is automatic (assuming we solve this problem with non-serializable versions). You can never just write 0, you have to say Testnet01-0; the smart constructor can now be used safely as it will do all the validation right there, by first recovering the version, which yields the graph, which validates the index, or boom.

chainweb-node seems to open the same log file twice for writing

which causes scripts/run-nodes.sh to fail with errors like this:

bash ./scripts/run-nodes.sh ./result/ghc/chainweb/bin/chainweb-node 10 /tmp/run-nodes-logs
starting 10 chainweb nodes
started bootstrap node 0
started node 1
chainweb-node: /tmp/run-nodes-logs/telemetry.node0.log: openFile: resource busy (file is locked)
started node 2
chainweb-node: /tmp/run-nodes-logs/telemetry.node1.log: openFile: resource busy (file is locked)
started node 3
chainweb-node: /tmp/run-nodes-logs/telemetry.node2.log: openFile: resource busy (file is locked)
started node 4
chainweb-node: /tmp/run-nodes-logs/telemetry.node3.log: openFile: resource busy (file is locked)
chainweb-node: /tmp/run-nodes-logs/telemetry.node4.log: openFile: resource busy (file is locked)
started node 5
started node 6
chainweb-node: /tmp/run-nodes-logs/telemetry.nod

When run with strace, by means of a little wrapper script (mynode.sh):

#!/usr/bin/env bash

strace ./result/ghc/chainweb/bin/chainweb-node "$@"

Run this way:

bash ./scripts/run-nodes.sh ./mynode.sh 10 /tmp/run-nodes-logs 2> /tmp/trace-log

I get the following strace (only the interesting part):

openat(AT_FDCWD, "/tmp/run-nodes-logs/telemetry.node0.log", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 56
fstat(56, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
ftruncate(56, 0)                        = 0
ioctl(56, TCGETS, 0x7ffe2fcc2f00)       = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(56, TCGETS, 0x7ffe2fcc2f00)       = -1 ENOTTY (Inappropriate ioctl for device)
futex(0x7ffbc0000ba8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbc0000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x39e3ae8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffb84000ba8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffb84000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x39f4088, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbb0000ba8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbb0000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3a14bc8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbbc000ba8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbbc000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3a25168, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbc8000ba8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbc8000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3a35708, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbc4000bac, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbc4000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3a45ca8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbdc000bac, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbdc000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3a56248, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbd4000bac, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbd4000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3a667e8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbd8000bac, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffbd8000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3a76d88, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffba8000ba8, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x7ffba8000bb0, FUTEX_WAKE_PRIVATE, 1) = 1
futex(0x3a87328, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=380220146}) = 0
getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=344055}, ru_stime={tv_sec=0, tv_usec=36322}, ...}) = 0
mmap(0x4200e00000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4200e00000
madvise(0x4200e00000, 1048576, MADV_WILLNEED) = 0
madvise(0x4200e00000, 1048576, MADV_DODUMP) = 0
sched_yield()                           = 0
clock_gettime(CLOCK_PROCESS_CPUTIME_ID, {tv_sec=0, tv_nsec=391705730}) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
openat(AT_FDCWD, "/tmp/run-nodes-logs/telemetry.node0.log", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 57

Notice the openat in the last line and the one in the first line - same file, but no close in between.

I was not able to find the reason for this yet - still investigating.

Test suite for concurrency bugs

At least the P2P layer and the Sync layer involve non-trivial concurrency. We need good testing for that. Dejafu seems well suited for that task.

Distinct "sync phase" upon startup

Currently, all our components start at the same time. Further, the sync components assume that a node joining the network is already fairly close to the top of the Chainweb. This is an issue for a fresh node, which may have no data from the main "consensus line" at all. While syncing to try and catch up, this node will also be mining on the lowest block heights. This is completely wasted work.

Instead, we should add a preliminary phase where a node will do nothing but sync and verify Pact transactions. Once it gets "close enough" to the head of the Chainweb, it can resume normal operation as currently defined.

This preliminary phase can be its own module that just calls existing functions. "New code" doesn't need to be written, per se. The call to this new module should occur before [this block of code]. Notice the "FIXME" ;)

[this block of code]
https://github.com/kadena-io/chainweb-node/blob/master/src/Chainweb/Chainweb.hs#L424-L434

GHC 8.6-based nixpkgs / stack.yaml

For both Pact and Chainweb. This will allow us to unpin a great number of dependencies in our default.nix files.

Requires the MonadFail and Megaparsec issues to be cleared first.

BlockHeaders and Payloads in RocksDB

A serialized block header requires about 250-500 bytes. At a block rate of 15 seconds and 100 different chains, a year of block headers amounts to about 40-50 GB of data. Eventually, we need to back the block header database by a proper persisted database.

Candidates include sqlite, leveldb, rocksdb. The latter seems to be the most modern one and may be a good fit for our needs.

  Phase 1
  • Store Payloads in RocksDb
  • Implement "Tables" for RocksDB for storing different Haskell types
  • Add RocksDb Backend for BlockHeaderDb
  • Store cut history/log in RocksDb

  Phase 2
  • Simplify TreeDB API (or remove TreeDB altogether)
  • Performance improvements for RocksDB backend

  Phase 3
  • Add Haskell bindings for newer RocksDB features
    • Column families (use this for implementing different tables)
    • Tail iterator
    • Prefix iterator

Improve FloydWarshall

We use the Floyd-Warshall algorithm to calculate the diameter of chain graphs. This is mostly useful for testing, since in production, we would use known, fixed-diameter graphs.

massiv is employed in implementing the Data.DiGraph.FloydWarshall module, and its usage can be improved in a few ways.

Maximize Laziness

type DenseAdjMatrix = Array U Ix2 Int

fromAdjacencySets :: AdjacencySets -> DenseAdjMatrix
fromAdjacencySets g = makeArray Seq (n :. n) go
  where go = ...

As much as possible, we want our Array types to have the internal representation of D or DW. This avoids memory allocation until the last possible moment. When we do force them, asking for U, P, or S are generally equivalent (at least for our purposes).

floydWarshall :: Array U Ix2 Double -> Array U Ix2 Double

shortestPaths :: Array U Ix2 Int -> Array U Ix2 Double
shortestPaths = floydWarshall . computeAs U . distMatrix

Since everything is being forced as U here, a lot of extra memory is being allocated. computeAs should only be used:

  1. At the very end of a composition of matrix operations, or;
  2. Before doing a Stencil-based operation (like convolution, etc.)

Generally, for non-Stencil ops, there should always be a way to do everything lazily (via D) until the very end.

Parallelism

At least in test environments, we can assume graph sizes of > 1000 nodes. This means a Matrix of at least 1000x1000. In my experience with Massiv, for any Matrix over 256x256, using the Par evaluation strategy quickly becomes worthwhile.

fromAdjacencySets :: AdjacencySets -> DenseAdjMatrix
fromAdjacencySets g = makeArray Seq (n :. n) go  -- should be `Par`!

Then, when compiled and run with the correct RTS options, we get performance boosts for free.

Stencils

The implementation of the floydWarshall function in particular allocates a new Array for each iteration, which means n unnecessary allocations are occurring, where n is the width/length of the Array. Glancing at the code, it seems like this could just be rewritten as a Stencil operation and done in a single pass.

IntMap / IntSet

type AdjacencySets = HM.HashMap Int (HS.HashSet Int)

Could this be rewritten in terms of IntMap and IntSet? fromAdjacencySets uses a lot of lookup and member calls, and the Int-based structures can perform these operations about twice as fast as the hash-based ones.

Integrate Pact API changes

Upgrade Pact to incorporate #451 and related changes, and basically get it compiling:

  • Mempool/Pact RestAPI stuff
  • Pact return values in PactService -> RPC -> Mempool

Avoid RecordWildCards

The RecordWildCards pragma performs code-gen in order to function. Removing all instances of its use should improve compile times.

defpact-based SPV in Chainweb

Integrate Pact changes in #108, #456 and support automatic SPV

  • upgrade Pact and get it compiling
  • start Pacts from > step 1, using continuation values in SPV proof
  • do SPV in Haskell code before launching pact second step

Coin Faucet contract

Write a faucet contract to be installed in Testnet Genesis with some allocated amount of coins owned by a ModuleGuard. Upon request the faucet will transfer some of its coins to the requester, with some maximum and maybe a daily limit. "Refunds" (where users return coins to the faucet) should simply be a matter of a transfer in the coin contract back to the faucet account.

Data structure for local BlockHeader metadata

Some operations require storing local metadata about block headers.

For instance, for efficient synchronization it is necessary to track the origin of block headers that have missing dependencies. Those can be stored temporarily in a "staging" area while missing dependencies are queried from their origin. This information can also be used to track sources of block headers that fail payload validation. Other useful information may include when a block header was first received, added to the database, and passed payload validation.

Use and validate client certificates in P2P connections

Each p2p network peer has an x509 server certificate. Currently there is no authentication of TLS clients. Not authenticating clients makes the network vulnerable to several attacks whenever a peer makes use of the origin of a request. See issue #82 for an example.

  • pass peer certificates to HTTP connection manager for usage in new connections.

  • require client authentication on the server side

  • extract peer-id and host address from client certificate.

  • the currently implemented protocol allows Chainweb-nodes to use public DNS names with "official" X509 certificates. Those certificates may not support usage for client authentication. If that turns out to be a common case, we either wouldn't accept connections from these nodes, or we would reject the "origin" information on BlockHeaders and Cuts that we get from these nodes, so that we won't pull data from them.

    The main use case for public DNS names is bootstrap nodes. These nodes may just offer two endpoints: a read-only endpoint with the public name for bootstrapping, and a second one with a self-signed certificate that is used for querying other nodes as a client.

This issue is a prerequisite for #266

Include HTTP HEAD calls to REST API

It seems that servant doesn't generate HEAD endpoints by default.

  • implement generation of HEAD endpoints for all existing GET APIs.
  • implement clients for HEAD endpoints
  • include pagination information into the HTTP headers and document this
  • add test cases

Propagation of ChainwebVersion

This issue serves two related purposes:

  1. Derive the ChainGraph from the ChainwebVersion.
  2. Propagate the ChainwebVersion reified as type parameter.

The ChainwebVersion and the ChainGraph are static values for a given chainweb instance. They are both represented as terms at runtime but are morally types. Therefore it makes sense to reify the ChainwebVersion value at the topmost layer, let the constraint solver propagate it, and inject it via reflection where it is used. By tagging data types that depend on the ChainwebVersion and in particular the ChainGraph with the reified ChainwebVersion we also get static consistency properties. For instance, we are guaranteed that the ChainIds of a BlockHash actually belong to the graph that is in scope.

Currently, we use given in several places for the ChainGraph. This has the benefit of not having to propagate the ChainGraph as an explicit parameter to each use site, but instead lets the constraint solver propagate it for us (which is convenient since dictionaries are propagated backward and forward through the control graph). Another major benefit is that it supports keeping code pure that would otherwise require some monadic context to provide the ChainGraph in the environment. That is, for instance, convenient for defining type class instances for classes that don’t expect a parameter for a monadic environment, e.g. class Given ChainGraph => Arbitrary chainId where … instead of class HasChainGraph m => Arbitrary (m ChainId) where ….

Investigate custom "Ingress" and "Egress"

Currently, EC2 instance HTTP traffic ingress/egress is left "wide open". If you knew the IP addresses of any of our nodes, you could connect random Chainweb instances to our network.

At least for Testnet, we'd like to have control over who is in our network. Approaches:

  1. [Terraform] Better define ingress/egress fields in our Security Group definitions?
  2. [Terraform] Actually define a non-default AWS internal VPC?
  3. [Haskell] Support special lists of IP address that can be read at runtime, indicating special HTTP traffic to allow?

My personal vote is for (1).

Staging area for blocks with missing dependencies

When a block header with missing dependencies is received, it is likely that those dependencies will be received later on and the block can then be added to the database.

For that, instead of discarding the block because of a validation failure, the block should be stored temporarily in a staging area.

In addition, a synchronization session should be triggered to query the missing dependencies for that block header. For that it is helpful if the origin of the block header is recorded (cf. #79).

If the block header is part of an active cut, synchronization should be done with high priority.

Mempool tx re-introduction on forks

When forks occur, transactions on the losing fork need to be put back into the mempool so that they can be added to future blocks on the winning fork.

Ensure proper database opening/closing within on-disk checkpointer

This issue is based upon @gregorycollins comments on an earlier PR (Ask me for information if you need the reference). In particular, he noted that there was an issue in the restore' function in SQLiteCheckpointer.hs. Here are his comments verbatim:

Start of Greg's Comments

AUGH I had a huge response written here and github ate it. Let me try to recreate it. Let's go over the sequence of operations on what should be happening here. Assume you have existing state at $v1 = (height1,hash1):

restore @ $v1 -> copies $data/$v1 to $tmp and loads it into sqlite, returns as PactDbState. $tmp contents currently identical to $v1 (we just copied it). Ideally we will have run this as the initialization portion of a bracket so that we set up rm -rf $tmp upon exit of the whole transaction.
pact runs transactions. $tmp now contains contents of $v1 plus modifications made by pact transactions
save @ $v2 -> closes sqlite db and atomically renames $tmp to $v2. Restores for DB state $v2 will now find our updated DB.
What's currently happening, as far as I can tell:

restore @ v1. withTmpFile creates $tmp, and reinitDbEnv calls SQLite to create it empty in-place (because there are no contents yet). We create a PactDbState pointing to $tmp but withTmpFile unlinks it on exit from restore.
Pact runs transactions. Probably SQLite recreates the empty db at this point. $tmp contents are either empty or contain just the modifications run by the transactions run on the empty DB.
save @ v2. We do copyFile tmp tmp2 >> rename tmp2 v2. $v2 is missing most of the historical db contents, and $tmp is leaked?

End of Greg's comments

Greg and I had a conversation over Slack over how to address this problem. I am going to create a PR that should fix this problem.

SPV support

(blocked by transaction block format, chain header oracle)

Support SPV via a built-in function that recognizes a JSON payload of a particular format.

The JSON payload contains an SPV merkle proof of some receipt on another chain plus intermediate/connecting merkle roots leading to the executing chain.

NAT and Proxy traversal for P2P connections

What network environments are supported (local IP, public IP, dynamic IP addresses, public DNS names, NAT traversal, VPN, Windows proxies with/without NTLM authentication, IPoAC, etc.)?

Do we care at all, or do we consider this something that users should take care of?

Note that Haskell's HTTP clients, unlike many standard libraries from other languages, don't support Windows HTTP proxies. A workaround (common in Haskell network applications) is to use curl as the HTTP client, which has reasonable support for Windows proxies.

SQLite-based checkpointer for Pact

As Pact has distinct user tables and a structure that is already very amenable to versioned history, we propose to implement checkpoints directly in SQLite as opposed to (a) copying SQLite DB files, (b) using BTRFS or some filesystem solution, or (c) using RocksDB.

Why not RocksDB

RocksDB is suggested as inferior for Pact mainly because of the need to have so many user schemas, which requires extensive keyspace operations. Also, SQLite actually outperforms RocksDB for single-threaded indexed queries, see https://github.com/facebook/rocksdb/wiki/Performance-Benchmarks at ~50us, where Pact has seen db updates in the <20us zone.

As for SQLite size limitations, they are actually not small and will probably suffice for many years, see https://www.sqlite.org/limits.html. However this design proposes that multiple connections can be used, where connections own some count of user tables. This can be introduced later as long as there is design support for it now.

Note lastly that RocksDB snapshots are not persisted to disk so that solution is not usable.

Lastly, this design leverages SQLite's performant indexing to use relational SQL as the versioning mechanism, as opposed to previous SQLite Pact usage, which looks more like a key-value store. Indeed, the Pact language just wants a journaled key-value store, but this solution will handle reorgs "relationally".

Overview

The main notion is of a version corresponding to a fork that will be used along with block height to determine the latest version of a key, and to label entries as "forked" (or simply delete them) when a reorg occurs.

Example user table use

The following example will be used to illustrate the design. A user table will go through the following history. Row data will be represented with an arbitrary number. The "version" concept is detailed elsewhere but corresponds to a reorg history.

key | block | version | data | notes
----|-------|---------|------|------
a   | 10    | 1       | 123  | Version 1 represents current reorg/fork
b   | 10    | 1       | 234  |
c   | 11    | 1       | 456  | Reorg below replaces from here
a   | 11    | 1       | 124  |
d   | 12    | 1       | 567  |
c   | 12    | 1       | 457  |
c   | 11    | 2       | 460  | Fork/reorg to Version 2
b   | 12    | 2       | 240  |

Thus, a table scan at block 12 version 1 just before the reorg should return:

key | block | version | data
----|-------|---------|-----
a   | 11    | 1       | 124
b   | 10    | 1       | 234
c   | 12    | 1       | 457
d   | 12    | 1       | 567

A table scan at the end, post-reorg, should return:

key | block | version | data
----|-------|---------|-----
a   | 10    | 1       | 123
b   | 12    | 2       | 240
c   | 11    | 2       | 460

Version detection

Block validation supplies a stream of (block height [B],parent hash [P]) pairs. A version indicates the non-forked behavior of this stream, such that receiving a monotonically-increasing dense stream of block heights indicates a single version.

When a block height arrives that is less than the expected next value, the version (V) will increment and version maintenance operations will occur.

System will need to have a central version history table ordering all (B,P) pairs and associating a version. Re-orged pairs can be discarded. The "HEAD" version is maintained in memory and can be recovered from the version history table.

The "block version" is the pair of (B,V) as seen in the examples above.

Table management

System will track all versioned system and user tables -- versioned system tables would include the Pact continuation table and refstore; user tables includes coin contract table. Tables should be associated with a database connection/file with some limit on how many tables should be in a file/conn. This initially can be a single connection but we want the design to support multiple connections. The central system tables can be in their own connection or share the first connection.

Tables will also be associated with the block version when they were created. Tables can optionally be dropped when reorged, or marked as old/invalid; table name in database can include the block version if desired. See "Deleting vs Marking" below.

Per-table versioning and history

Table operations will support reorgs by tracking versions directly in the relational schema. Queries will leverage indexes and "SELECT TOP 1" queries to find the latest key.

Pact history requests will no longer need dedicated "transaction tables". 'txid' will be stored as a relational column. Note that Pact will no longer track multiple updates in a given txid for a key as this is (a) not a good practice and (b) not relevant to larger history.

Deleting vs Marking on version changes

This solution supports either (a) marking deactivated block versions by updating the version column with a FORKED code, or (b) simply deleting the row. Likewise, forked table creations can either mark the table as FORKED or simply drop the table.

Deletion has the advantage of space compaction; marking can help with troubleshooting. While supporting both as a configuration option might be nice, it will also slow development.

Decision: use deletion. Descriptions below will nonetheless use FORKED to indicate marking option.

Versioned table schema

All versioned tables will have the following schema:

Name    | Type   | Note                       | Index
--------|--------|----------------------------|-----------
KEY     | String | User key                   | Non-unique
BLOCK   | Int    | Block height               | Non-unique
VERSION | Int    | Reorg version              |
TXID    | Int    | Transaction ID for history | Non-unique
DATA    | JSON   | User data                  | No index

The unique constraint is (KEY, BLOCK[, VERSION]). SQLite automatically adds ROWID, which is the actual primary key. TXID will have an index for history queries.

Version maintenance

On a reorg, a new block version (B,V) will be introduced.

  1. Delete/mark all rows where BLOCK >= B .
  2. Drop/mark all tables that were created in BLOCK >= B.

On-demand version maintenance.

Version maintenance will be expensive, requiring operations on all versioned tables in the system. Version maintenance could however be on-demand.

Tables could track when they were last maintained using (B,V). The first in-block operation to occur could test this to see what maintenance needs to be done.

In the example above, consider if the updates/queries happened in block (15,2) instead of right at the reorg. Table would have (10,1) for block version, and see that (11,2) was a fork. Maintenance would occur then.

We need to maintain the fork history in order to apply all forks that might have occurred since the table's block version.

The advantage of on-demand version maintenance is faster block processing, assuming that not all tables are hit in every block.

The disadvantage is unpredictable work; maintaining all tables is possibly a more "even" workload.

DECISION: Attempt on-demand and fall back to "global" maintenance as time permits.

Checkpoint management

Checkpoint begin, with (B,H) supplied:

  • Detect version change; run maintenance on user tables as needed. (Note this could be on-demand; see note above.)
  • Compute block version (B,V) to put in environment for use in queries during this block.

Checkpoint discard

  • Might want to use SAVEPOINT in SQLite to discard. This would require more information in the begin phase, or we could always use a SAVEPOINT and simply COMMIT on save.

Checkpoint save

  • Nothing required here, unless SAVEPOINT is used at which point commit.

In-block operations

Single-key query for key K

select top 1 KEY, DATA from <table> where KEY = K [and VERSION != FORKED] order by BLOCK [, VERSION] desc

Update key K to data D at block version (B,V) with txid T

  • read row data D' for K as above
  • write new row KEY = K, BLOCK = B, VERSION = V, TXID = T, DATA = (merge D' D)

Select all valid keys (keys in Pact)

Same query as single (without DATA) without TOP 1

Row history queries

Simply query ordered by TXID [avoiding FORKED rows].

Coin Faucet

  • Account with 1B coins
  • Ability to write a contract w/max of 100 coins
  • People can take out coins

Use new mining code for TestWithTime miner

The POW miner has recently been rewritten from scratch. The TestWithTime and Test miners should take advantage of this and use the new mining code along with a trivial target and an appropriate thread delay.

Mining reward

Block reward has the following inputs:

  • N - number of chains (i.e. 10)
  • S - total mining coin supply (i.e. 700B)
  • B - current blockheight
  • R - block rate (in some time unit) (i.e. 30s)
  • H - half-life in years (i.e. 20 years)

At a given block height B, the total reward T (which is the sum of all N mining rewards on every chain for height B) is calculated with an exponential decay function for a half-life H, allocating S total coins at block rate R. Thus the individual reward is T/N.

N is hardcoded as a function of B, so we can step chain count at particular blockheights.
S is a hardcoded constant, perhaps as a function of chain version.
B is available already when PactService is making blocks.
R is configured or hardcoded as a function of chain version.
H is hardcoded, perhaps as a function of chain version.
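
A minimal sketch of the decay, assuming a pure exponential with half-life H and requiring that the total rewards over all heights sum to S (this is an illustration of the description above, not the exact implementation):

H_b = H / R                                  (half-life expressed in blocks; use consistent time units for H and R)
T(B) ≈ (S · ln 2 / H_b) · 2^(−B / H_b)        (total reward across all chains at height B)
individual reward per chain = T(B) / N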
