sygmaprotocol / sygma-relayer

License: GNU Lesser General Public License v3.0

Dockerfile 0.27% Makefile 0.42% Go 98.72% Shell 0.59%

sygma-relayer's Issues

Use handler response as amount when handler converts decimal amount

Implementation details

Testing details

Acceptance Criteria

E2E tests that cover different FeeStrategies for Substrate <> EVM deposits

Implementation details

Testing details

Acceptance Criteria

Add substrate message handler for the executor module

Implementation details

Add a message handler that will create proposal data for the execution method on the substrate pallet
Add registration functionality for the message handler

Testing details

Add unit tests

Acceptance Criteria

We are able to register message handler
We are able to handle the message and create proposal data for the execution method on the substrate pallet
Unit tests added

Reliable broadcast

Expand communication layer implementation, so broadcast is reliable.

Implementation details

Based on this reliable broadcast specification

Testing details

unit tests
expanded communication integration tests

Acceptance Criteria

unit tests pass
integration tests pass

Implement deposit(fungibleTransfer) event handler

Implementation details

Parse deposit event and create message
pass the message into the message channel

Testing details

add unit tests

Acceptance Criteria

we are able to parse substrate deposit events(fungibletransfer)

Finish Generic Handler v2 planning

Since we decided that our first production-ready Generic Message Passing should include fee support with scurrying tx's batching, we need to plan the next work and design accordingly.

Implementation details

We need to work on the SoW to understand all possible problems we could face.
Then, based on this SoW, we should create all necessary tasks

Testing details

Acceptance Criteria

Scope of work for GMP created
Issues created

Docker images not tagged

Currently, we only publish images with the stable and latest tags. We also need a version-tagged image for each release, so that we and our partners can roll back and pin fixed versions instead of accidentally pulling new changes.

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce (for bugs)

Versions

Sygma commit (or docker tag):
chainbridge-solidity version:
Go version:

Use solidity docker image for e2e and local env. testing

Implementation details

Replace e2e and local-setup images

Testing details

Acceptance Criteria

[] e2e tests and local setup should now use truffle migrated image from docker
[] Add local setup documentation to the docs

Remove custom Dockerfile and example app

The example app is no longer needed, since the production app and the example app are mostly the same. It would be better to use the production app in both places: the e2e tests currently run against the example app, so running them against the production app would be a better test.

Implementation details

Testing details

Acceptance Criteria

Error in resource IDs for assets in fee oracle service config

The resource.json file has incorrect resource IDs for all the assets.

Implementation details

Testing details

query the rate endpoint and receive dummy data using local setup

Acceptance Criteria

Add EVM <> Phala network E2E tests

Implementation details

Testing details

Acceptance Criteria

Enable deposit and relayer metrics

Export metrics for:

deposit count
deposit error rate
total amount of relayers
available relayers
time between event and execution

Implementation details

Use OpenTelemetry to export the specified metrics.
Deposit count is provided from core, error rate should be added to chain write methods,
total amount of relayers and available relayers should be added from the communication health check method.
Time between event and execution can be calculated if the starting time is added into a map keyed by the deposit nonce and destination, and the difference is then calculated after the execution.
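
The map-based timing described above could be sketched roughly as follows. This is only an illustration: the names LatencyTracker, TrackDeposit and TrackExecution are hypothetical and not taken from the relayer code, and exporting the resulting duration through OpenTelemetry is left out.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// depositKey identifies a deposit by destination domain and deposit nonce.
type depositKey struct {
	Destination uint8
	Nonce       uint64
}

// LatencyTracker (hypothetical name) remembers when a deposit event was seen
// so the time until execution can be measured later.
type LatencyTracker struct {
	mu     sync.Mutex
	starts map[depositKey]time.Time
}

// NewLatencyTracker creates an empty tracker.
func NewLatencyTracker() *LatencyTracker {
	return &LatencyTracker{starts: make(map[depositKey]time.Time)}
}

// TrackDeposit stores the current time for the given deposit.
func (t *LatencyTracker) TrackDeposit(destination uint8, nonce uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.starts[depositKey{Destination: destination, Nonce: nonce}] = time.Now()
}

// TrackExecution returns the time elapsed since the deposit was tracked and
// removes the entry. The boolean is false if the deposit was never tracked.
func (t *LatencyTracker) TrackExecution(destination uint8, nonce uint64) (time.Duration, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	key := depositKey{Destination: destination, Nonce: nonce}
	start, ok := t.starts[key]
	if !ok {
		return 0, false
	}
	delete(t.starts, key)
	return time.Since(start), true
}

func main() {
	tracker := NewLatencyTracker()
	tracker.TrackDeposit(2, 15)
	time.Sleep(10 * time.Millisecond) // stand-in for signing and execution
	if elapsed, ok := tracker.TrackExecution(2, 15); ok {
		fmt.Println("time between event and execution:", elapsed)
	}
}
```

Deleting the entry on execution keeps the map from growing without bound; deposits that are never executed would still need some form of expiry, which this sketch does not cover.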

Testing details

Check metrics after running e2e tests

Acceptance Criteria

metrics exported and working

Remove startBlock from sharedConfig

Keeping startBlock in sharedConfig makes it impossible to change this value for only one service in order to resync it if necessary. This property should be strictly related to particular services and should be removed from sharedConfig.

Implementation details

[] Updated shared config specification
[] Make necessary changes for Relayer code
[] Check whether other services depend on this. If they do, create more issues to handle it

Testing details

Acceptance Criteria

Release pipeline

Add pipeline to the repository that will enable releasing new versions of relayers with generated CHANGELOG file.

Implementation details

We can use the release-please plugin to set up this flow.

Testing details

Test that release PR is created on the new feature

Acceptance Criteria

Working release pipeline

Add Executor module for substrate

Add Executor module for substrate, executor should be able to send extrinsic

Implementation details

Details can be found in the SoW research doc

Testing details

Unit test

Acceptance Criteria

Relayer should be able to sign and send extrinsic to substrate pallet

Remove print from permissionless handler

A Println was left in the permissionless generic deposit handler by mistake.

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce (for bugs)

Versions

Sygma commit (or docker tag):
chainbridge-solidity version:
Go version:

Implement bridgePallet functions

Implementation details

implement bridgePallet functions:

IsProposalExecuted(p *proposal.Proposal) bool
ExecuteProposals(proposals []*proposal.Proposal, signature []byte) (*types.Hash, error)
ProposalsHash(proposals []*proposal.Proposal) ([]byte, error)

Testing details

add unit tests

Acceptance Criteria

bridge pallet functions implemented and tested

Topology file encryption

We need to encrypt/decrypt the topology file that is saved to ChainSafe Storage

Implementation details

Testing details

Acceptance Criteria

Relayers use latest as default option

To avoid spamming RPC endpoints accidentally, we should set indexing from the latest block as a default behavior if no start block is set.

Implementation details

Testing details

Acceptance Criteria

Update EVM defaults to something sane

Our current defaults are really low and are impacting throughput, because transactions are not picked up when the gas is too low.
Update max gas price, gas limit and gas multiplier to some sane values.
Go through the config and check everything.

Implementation details

Testing details

Acceptance Criteria

The Multilocation is hardcoded for evm -> substrate transfer

Expected Behavior

Relayer should parse the multilocation from deposit data and pass it to the substrate execution method

Current Behavior

When transferring tokens from evm -> substrate, the multilocation is hardcoded in message-handler

Possible Solution

Steps to Reproduce (for bugs)

Versions

Sygma commit (or docker tag):
chainbridge-solidity version:
Go version:

Add substrate connection

Implementation details

Implement the substrate connection so we are able to fetch data from substrate
implement connection struct with methods:

GetHeaderLatest() (*types.Header, error)
GetBlockHash(blockNumber uint64) (types.Hash, error)
GetBlockEvents(hash types.Hash, target interface{}) error
UpdateMetatdata() error

Testing details

add unit tests

Acceptance Criteria

we are able to establish connection and pull data from substrate chain

Add CLI for generating Libp2p key pair

To make it easy for relaying partners to generate a libp2p identity private key in protobuf format, we should
add a command that generates a key pair and prints out the peerID and the private key in base64 format.

Implementation details

Testing details

Acceptance Criteria

Support new generic cross-chain message format

Based on changes made to the generic cross-chain message format we need to refactor relayers so they can process this new format. In addition, relayers need to use the information on the maximum fee from the message itself when executing on the destination.

Implementation details

Refactor PermissionlessGenericDepositHandler and PermissionlessGenericMessageHandler to process new message format.
Use maxFee parameter from the cross-chain message when executing a generic request on the destination chain.

Testing details

Expand unit tests for message and deposit handlers according to the new message format.
Add an e2e test to validate that the maxFee parameter is used on execution.

Acceptance Criteria

Passing unit tests.
Passing e2e tests.

Refactor SYG_DOM_X env variables to be just SYG_DOMAINS

Instead of manually specifying each domain we should have one config param for all domains to avoid needing to update terraform scripts each time.

Implementation details

Check if it is easy to make it a map and merge with shared config. If not make it a list.
Update devops task files and standalone script for deployment accordingly.

Testing details

Acceptance Criteria

task file updated
deployment script updated
shared config properly merged
not necessary to update terraform script for each new domain
Make sure it works on Dev and TestNet

Fetch topology from IPFS instead of Storage api

We should avoid being dependent on Storage API for fetching topology as we can just pull it from IPFS which
should be more resistant to failure.

Implementation details

Fetch topology from IPFS (preferably create an IPNS domain for it).

Testing details

Acceptance Criteria

Improve verbosity of the logs

It is pretty difficult to debug some situations while the relayer is running. The following are suggestions to improve the general verbosity of some Relayer actions.

Implementation details

1. [Info] On startup, Relayer should log all SYG_DOM_N env values (except the private key; in this log, convert the private key to the corresponding address).
2. [Info] Event-processing logs should not appear on every iteration, but every 5 minutes, with the range of blocks parsed during that time and the number of events found. When an event is found, all existing logs should remain the same.
3. [Info] Print Network Topology on startup and after Refresh
4. [Info] Print Topology URL on startup and on key Refresh
5. [Info] When Relayer is not part of the MPC group (Relayer has been deployed by partners but has not yet been added to the Topology map, and hence does not have any peers), it should notify about this every 5 minutes.

Testing details

Acceptance Criteria

[] Logs have been added

Move communication check to health endpoint

To get more continuous insight into libp2p communication, we want to move the communication check that is currently happening on application startup to the function being invoked when the /health endpoint is invoked.

Implementation details

Move communication health check to /health endpoint.
Health endpoint should not fail if communication cannot resolve some relayers.
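
A rough sketch of such an endpoint, assuming the communication check can be expressed as a function that returns the peers it could not reach. The CommunicationCheck type, the HealthReport fields and the "degraded" status are all hypothetical choices for illustration, not the relayer's actual API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// CommunicationCheck (hypothetical) returns the peers that could not be reached.
type CommunicationCheck func() []string

// HealthReport is the JSON body returned by the health endpoint in this sketch.
type HealthReport struct {
	Status           string   `json:"status"`
	UnreachablePeers []string `json:"unreachablePeers"`
}

// BuildHealthReport runs the check on every call and reports unreachable
// peers without treating them as a failure of the endpoint itself.
func BuildHealthReport(check CommunicationCheck) HealthReport {
	unreachable := check()
	if len(unreachable) == 0 {
		return HealthReport{Status: "ok", UnreachablePeers: []string{}}
	}
	return HealthReport{Status: "degraded", UnreachablePeers: unreachable}
}

// HealthHandler always answers 200 OK; unreachable peers only show up in the body.
func HealthHandler(check CommunicationCheck) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		_ = json.NewEncoder(w).Encode(BuildHealthReport(check))
	}
}

func main() {
	check := func() []string { return []string{"peer-2"} }
	rec := httptest.NewRecorder()
	HealthHandler(check)(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
	fmt.Println(rec.Code, rec.Body.String())
}
```

Running the check inside the handler means every /health request gives a fresh view of libp2p connectivity, at the cost of making the endpoint as slow as the check itself.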

Acceptance Criteria

Manually test that communication check is being executed when /health endpoint is invoked.

Separate chainID and domainID when signing EIP712 data

Avoid using domainID to sign EIP712 data to fix overflow of chainID into domainID.

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce (for bugs)

Versions

Sygma commit (or docker tag):
chainbridge-solidity version:
Go version:

E2E Test client for Substrate and basic call methods test

We need to start adding Substrate-related E2E tests to our Relayer code.

Implementation details

Add a Substrate node to the docker-compose file
Implement the E2E tests skeleton
Test basic call methods to check that the node is responding fine
A Substrate runtime with pre-setup pallets (all Resources, MPC key) should be built on release (tag). Use a Rust or JS script as a migration (refer to solidity)

Testing details

[] E2E tests are passing

Acceptance Criteria

General skeleton for future E2E tests implemented

Limit number of batched deposits

Problem

We ran a script that executes 10,000 deposit requests simultaneously. This resulted in relayers batching a huge number of bridging requests into one MPC signing or, what is actually problematic, into one executeProposals call. As you can see here, this fails on the destination because the gas limit makes it impossible to execute so many transfers in one transaction.

Implementation details

We should limit the number of requests that can be batched into one MPC signing (execution). We can implement this on relayers. Once the relayer processes more than X requests, it starts a new MPC signing and continues to process requests from this batch of blocks. Currently, we are batching all requests (without generic) that came in the last N blocks (where N is the number of blocks that relayers are processing in batch).
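
The limit could be applied by splitting the collected requests into chunks before signing. The sketch below assumes a simplified Proposal type and a hypothetical SplitIntoBatches helper; the real relayer types and the value of X are not specified here.

```go
package main

import "fmt"

// Proposal is a simplified stand-in for a bridging request.
type Proposal struct {
	DepositNonce uint64
}

// SplitIntoBatches (hypothetical helper) splits proposals into consecutive
// batches of at most maxBatchSize items, preserving order. A non-positive
// maxBatchSize means "no limit" and yields a single batch.
func SplitIntoBatches(proposals []Proposal, maxBatchSize int) [][]Proposal {
	if len(proposals) == 0 {
		return nil
	}
	if maxBatchSize <= 0 {
		return [][]Proposal{proposals}
	}
	batches := make([][]Proposal, 0, (len(proposals)+maxBatchSize-1)/maxBatchSize)
	for start := 0; start < len(proposals); start += maxBatchSize {
		end := start + maxBatchSize
		if end > len(proposals) {
			end = len(proposals)
		}
		batches = append(batches, proposals[start:end])
	}
	return batches
}

func main() {
	proposals := make([]Proposal, 25)
	for i := range proposals {
		proposals[i].DepositNonce = uint64(i + 1)
	}
	// Each batch would get its own MPC signing and executeProposals call.
	for i, batch := range SplitIntoBatches(proposals, 10) {
		fmt.Printf("batch %d: %d proposals\n", i, len(batch))
	}
}
```

Because the split is a pure function of the ordered request list, every relayer that sees the same requests produces the same batches, which matters when the relayers have to agree on what is being signed.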

Testing details

Run a large number of deposits simultaneously and check that all executions are successful.

Acceptance Criteria

Add limit of deposit requests that can be batched in one MPC signing
Successfully execute manual test on devnet environment

Fix flaky unit test

The LoadPeers unit test is flaky and sometimes reorders peers.
Make the test reproducible.
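
One common way to make such a test order-independent is to sort both the expected and the loaded peers before comparing them. The sketch below assumes the flakiness comes from a non-deterministic ordering of otherwise identical results, which the issue does not confirm; SortedPeerIDs is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// SortedPeerIDs (hypothetical helper) returns the peer IDs in lexicographic
// order without modifying the input slice.
func SortedPeerIDs(peers []string) []string {
	sorted := make([]string, len(peers))
	copy(sorted, peers)
	sort.Strings(sorted)
	return sorted
}

func main() {
	loaded := []string{"peer-c", "peer-a", "peer-b"}
	// Compare SortedPeerIDs(loaded) against a sorted expectation in the test.
	fmt.Println(SortedPeerIDs(loaded))
}
```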

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce (for bugs)

Versions

Sygma commit (or docker tag):
chainbridge-solidity version:
Go version:

Create new example of local setup with fee oracle

Create new example of local setup with fee oracle instead of basic fee handler

Implementation details

Testing details

Acceptance Criteria

docker-compose.yml should include a fee oracle server that works for the test token on the geth nodes

Add Substrate chain type support

Add Substrate chain support at the app level along with the corresponding configuration

Implementation details

Details can be found in the SoW research doc

Testing details

Unit tests

Acceptance Criteria

Relayer should support the substrate chain type besides the EVM chain type
Relayer should be able to load substrate chain config from configuration file
Relayer should be able to launch after reading substrate chain configuration file

Add support for shared configuration [relayers]

Relayers need to be refactored so they support the new shared configuration.

Implementation details

Load shared configuration from IPNS URL
Load reduced version of domain configuration from ENV or file
Merge two configurations by domain ID
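
The merge step could look roughly like this. All type and field names are hypothetical, and the sketch assumes that a relayer only runs domains for which it has local configuration and that a local entry without a matching shared entry is an error.

```go
package main

import "fmt"

// SharedDomain holds values assumed to come from the shared configuration.
type SharedDomain struct {
	ID     uint8
	Name   string
	Bridge string
}

// LocalDomain holds the reduced, relayer-specific values from ENV or a file.
type LocalDomain struct {
	ID       uint8
	Endpoint string
}

// DomainConfig is the merged result used to start a domain.
type DomainConfig struct {
	ID       uint8
	Name     string
	Bridge   string
	Endpoint string
}

// MergeByDomainID joins local and shared configuration on the domain ID.
// The result follows the order of the local configuration.
func MergeByDomainID(shared []SharedDomain, local []LocalDomain) ([]DomainConfig, error) {
	byID := make(map[uint8]SharedDomain, len(shared))
	for _, s := range shared {
		byID[s.ID] = s
	}
	merged := make([]DomainConfig, 0, len(local))
	for _, l := range local {
		s, ok := byID[l.ID]
		if !ok {
			return nil, fmt.Errorf("domain %d has local config but no shared config", l.ID)
		}
		merged = append(merged, DomainConfig{ID: l.ID, Name: s.Name, Bridge: s.Bridge, Endpoint: l.Endpoint})
	}
	return merged, nil
}

func main() {
	shared := []SharedDomain{{ID: 1, Name: "evm-1", Bridge: "0xBridge1"}, {ID: 2, Name: "evm-2", Bridge: "0xBridge2"}}
	local := []LocalDomain{{ID: 2, Endpoint: "ws://localhost:8546"}}
	merged, err := MergeByDomainID(shared, local)
	if err != nil {
		fmt.Println("merge failed:", err)
		return
	}
	fmt.Printf("%+v\n", merged)
}
```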

Testing details

Update all unit and e2e tests regarding made changes
Add additional tests for loading shared configuration

Acceptance Criteria

Passing unit and e2e tests
Successful devnet shared configuration loading

Add substrate client

Implementation details

Add Client package that should be able to sign and submit extrinsics

Testing details

Add unit tests if possible

Acceptance Criteria

Substrate client package added
The client is able to submit and sign extrinsics
Unit tests added

Implement general E2E Deposit test between Substrate and EVM

Implement E2E test that makes deposit from Substrate to EVM and from EVM to Substrate.
Check that balances

Implementation details

Testing details

[] E2E tests

Acceptance Criteria

[] E2E Tests are passing including deposits EVM <> Substrate

Start block calculation with latest flag

Bug description

All relayers need to start processing each domain on a specific block (divisible by the block interval), as this is how we make sure that all relayers are processing the same batches of blocks. This works as described, except when relayers are set to start from the latest block with the --latest flag.
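
The alignment rule can be sketched as a small helper that maps the latest block to a block divisible by the interval. Rounding down is an assumption made for this example (the issue only requires divisibility), and AlignedStartBlock is a hypothetical name.

```go
package main

import "fmt"

// AlignedStartBlock (hypothetical helper) rounds latest down to the nearest
// block number divisible by blockInterval. An interval of 0 leaves the block
// number unchanged to avoid a division by zero.
func AlignedStartBlock(latest, blockInterval uint64) uint64 {
	if blockInterval == 0 {
		return latest
	}
	return latest - latest%blockInterval
}

func main() {
	// Two relayers started at slightly different heights end up on the
	// same start block as long as they fall into the same interval.
	fmt.Println(AlignedStartBlock(105, 10), AlignedStartBlock(109, 10))
}
```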

Testing details

Manually test that proper start block is set when relayer is started with latest flag

Acceptance Criteria

Relayer started with latest flag properly calculates start block

RPC endpoints balancing

Redundancy is needed when invoking RPC endpoints for interaction with the chain.

Implementation details

Firstly, expand the configuration so it can accept an array of RPC endpoints. Then design and implement a mechanism that will dial the next endpoint from the array if multiple requests time out.
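
A minimal sketch of the selection logic, assuming a fixed threshold of consecutive timeouts before switching. EndpointRotator and its methods are hypothetical names; the sketch only picks the endpoint, and actually dialing it is left to the caller.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// EndpointRotator (hypothetical) tracks the active RPC endpoint and moves to
// the next one after maxTimeouts consecutive timeouts.
type EndpointRotator struct {
	mu          sync.Mutex
	endpoints   []string
	current     int
	timeouts    int
	maxTimeouts int
}

// NewEndpointRotator requires at least one endpoint; maxTimeouts below 1 is treated as 1.
func NewEndpointRotator(endpoints []string, maxTimeouts int) (*EndpointRotator, error) {
	if len(endpoints) == 0 {
		return nil, errors.New("at least one RPC endpoint is required")
	}
	if maxTimeouts < 1 {
		maxTimeouts = 1
	}
	return &EndpointRotator{endpoints: append([]string(nil), endpoints...), maxTimeouts: maxTimeouts}, nil
}

// Current returns the endpoint that should be used for the next request.
func (r *EndpointRotator) Current() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.endpoints[r.current]
}

// ReportTimeout records a timed-out request and returns the endpoint to use
// next, which wraps around to the first endpoint after the last one.
func (r *EndpointRotator) ReportTimeout() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.timeouts++
	if r.timeouts >= r.maxTimeouts {
		r.current = (r.current + 1) % len(r.endpoints)
		r.timeouts = 0
	}
	return r.endpoints[r.current]
}

// ReportSuccess resets the consecutive timeout counter.
func (r *EndpointRotator) ReportSuccess() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.timeouts = 0
}

func main() {
	rotator, err := NewEndpointRotator([]string{"https://rpc-a.example", "https://rpc-b.example"}, 2)
	if err != nil {
		fmt.Println(err)
		return
	}
	rotator.ReportTimeout()
	fmt.Println("after two timeouts:", rotator.ReportTimeout())
}
```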

Testing details

Add unit tests for this functionality.

Acceptance Criteria

Enable providing multiple RPC endpoints per domain
Passing unit & e2e test

Make docker images public

For relaying partners to be able to run their own relayers we should make the docker images built on release public.

Implementation details

Switch to dockerhub and make them public there.

Testing details

Acceptance Criteria

Add Retry event handler for substrate

Implementation details

Testing details

Acceptance Criteria

Remove unused functions

Implementation details

remove unused functions in /evm/calls/contracts/bridge

Testing details

Acceptance Criteria

all unused functions are removed

Update relayers for GenericHandler v1.0.0

We want to make generic handler permissionless, where each developer can use Sygma infrastructure to execute cross-chain calls without needing to contact the Sygma team to register it beforehand.

As a result, we are implementing v1.0.0 of the generic handler as a starting point - see the issue for solidity changes.

For more details on implementation and more context check this notion page.

Implementation details

Implement a new GenericDepositHandler and register it on relayer initialization (example and app)
- Currently, we are using the old implementation from chainbridge-core, which assumes the old format of depositData
- The new implementation should just parse metadata according to the newly defined format (see more details here)
Implement a new GenericMessageHandler and register it on relayer initialization (example and app)
- Similar to the deposit handler mentioned above, we are currently using an old implementation from chainbridge-core that assumes the old depositData format. We need a new implementation that takes the newly defined depositData format into account.

Testing details

Add unit tests for new GenericDepositHandler and GenericMessageHandler

Acceptance Criteria

Passing unit and e2e tests
Test local setup with new generic handler once solidity changes are finished

Add substrate event listener module

add substrate event-listener

Implementation details

Add event-listener module for listening substrate events

Testing details

add unit tests

Acceptance Criteria

Relayer is able to listen on substrate events

Add e2e tests for dynamic fee calculation for PermissionlessGenericHandler

Add e2e tests for dynamic fee calculation for GMP.

Implementation details

Testing details

Acceptance Criteria

e2e test for dynamic fee calculation for GMP

Add batch event process for substrate

Since relayer now supports batch event process for EVM, it should also support it for substrate

Implementation details

Details can be found in the SoW research doc

Testing details

Unit test
E2E test with local substrate and evm that manually sends a token transfer extrinsic

Acceptance Criteria

Relayer should be able to batch substrate events
Batch event number or block number should be configurable

Add relayer binaries to release

We should have binaries of each version as assets stored in release for us and partners to be able to run relayer and
CLI commands related to relayer.

Implementation details

Add build and binaries to release CI pipeline.

Testing details

Acceptance Criteria

Process generic messages sequentially

For our v2 iteration of the generic handler, we need generic bridge requests not to be batched (one request per MPC signing). More on the reasoning behind this can be found inside technical documentation.

Implementation details

Implement a new deposit handler that will process generic requests one by one.
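
One way to picture the difference between batched and one-by-one processing is a grouping step in front of MPC signing. The sketch below is an assumption about how that could be structured, not the relayer's design: Message and GroupForSigning are hypothetical, and placing the batched group first is an arbitrary choice.

```go
package main

import "fmt"

// Message is a simplified stand-in for a bridging request.
type Message struct {
	DepositNonce uint64
	Generic      bool
}

// GroupForSigning (hypothetical) puts every generic message into its own
// group, so it gets its own MPC signing, and keeps all other messages in a
// single batched group that comes first in the result.
func GroupForSigning(messages []Message) [][]Message {
	var batched []Message
	var groups [][]Message
	for _, m := range messages {
		if m.Generic {
			groups = append(groups, []Message{m})
			continue
		}
		batched = append(batched, m)
	}
	if len(batched) > 0 {
		groups = append([][]Message{batched}, groups...)
	}
	return groups
}

func main() {
	messages := []Message{
		{DepositNonce: 1},
		{DepositNonce: 2, Generic: true},
		{DepositNonce: 3},
		{DepositNonce: 4, Generic: true},
	}
	for i, group := range GroupForSigning(messages) {
		fmt.Printf("signing %d: %d message(s)\n", i, len(group))
	}
}
```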

Testing details

Add unit tests for the new deposit handler. Expand generic handler e2e tests with a case where multiple generic requests are sent in the same block.

Acceptance Criteria

Passing unit tests
Passing e2e tests

Resource limit exceeded on relayers

Current Behavior

We realized that our relayers, after working for some time, get to this state where they are not able to open streams toward other relayers (peers).

I would say it is related to the connection number limit as described in the discussion below:

Error on dial: system: cannot reserve connection: resource limit exceeded

Possible Solution

This needs more investigation, and in general we need to check that all connections are closed once we are no longer using them. In conjunction with this, I realized that if you observe the datadog diagram of memory usage for our relayers (check for a month period), we have some kind of memory leak. This is likely related to connection management.

We first need to validate how connections are being managed by adding some additional logging, and then evaluate what the next step is.

Steps to Reproduce (for bugs)

Unfortunately, it is hard to define the exact steps to reproduce this. It happens in our dev environment after relayers work for some time.