
zkevm-circuits: Issues

[EVM] Allow access to compressed words at witness generation

Sometimes loading just the compressed value itself in the circuit is sufficient to handle certain opcodes, so there is no need to do the compression/decompression in the circuit itself. However, there currently doesn't seem to be a way to access the r value in the assign function for opcodes, which would let us compress the input byte values into a word manually at witness generation time. Making r available in this function somehow would solve this issue.

Another way this could be done would be to calculate the compressed value at a higher level and pass it into the op gadget, but I think that probably makes less sense.
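
As an illustration of what the witness side needs, here is a minimal sketch of the compression, assuming the assign function is handed the randomness r; the helper name and trait bounds are made up for the example and are not the actual API:

// Sketch only: compress a 32-byte word into a single field element as
// b0 + b1*r + b2*r^2 + ... + b31*r^31. `F` stands for whatever scalar field
// type the circuit uses; the trait bounds below are just enough to run it.
fn compress_word<F>(bytes: &[u8; 32], r: F) -> F
where
    F: Copy + std::ops::Add<Output = F> + std::ops::Mul<Output = F> + From<u64>,
{
    // Horner evaluation starting from the most significant byte.
    bytes
        .iter()
        .rev()
        .fold(F::from(0u64), |acc, &b| acc * r + F::from(b as u64))
}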

Implement the 8-bit word encoding

What's wrong

We plan to use the BN254 curve, whose field elements are below 2**254, which is less than a full 256-bit EVM word.

To work with EVM values in the circuit, we need to encode a 256-bit value into 8-bit chunks. Taking word256 = 1 as an example, the corresponding 8-bit words are word8s = [1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0].

Other details are specified here https://hackmd.io/@liangcc/zkvmbook/https%3A%2F%2Fhackmd.io%2FQUiYS3MnTu29s62yg9EtLQ

How are we fixing this

  • Implement the circuit to check if an array of 8-bit words corresponds to a commitment of a 256-bit word.
  • The random number is not finalized yet; we can implement hash(ALL8BITS) for the time being (a witness-side sketch follows below).
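
For the witness side, a minimal sketch of the encoding plus a stand-in commitment, assuming the 256-bit word is held as four little-endian u64 limbs; the function names are hypothetical and the commitment is only a placeholder for hash(ALL8BITS):

// Decompose a 256-bit word (four little-endian u64 limbs) into its 32
// 8-bit chunks; word256 = 1 yields [1, 0, 0, ..., 0].
fn to_word8s(limbs: [u64; 4]) -> [u8; 32] {
    let mut word8s = [0u8; 32];
    for (i, limb) in limbs.iter().enumerate() {
        word8s[i * 8..(i + 1) * 8].copy_from_slice(&limb.to_le_bytes());
    }
    word8s
}

// Placeholder commitment while the random number is not finalized; in the
// circuit this would be replaced by the agreed hash/RLC over all 8-bit chunks.
fn commit_word8s(word8s: &[u8; 32]) -> u64 {
    word8s
        .iter()
        .fold(0u64, |acc, &b| acc.wrapping_mul(31).wrapping_add(b as u64))
}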

Merge the pull requests using squash

I realized that currently we merge pull requests without squashing. I don't think this is a good practice for maintaining a collaborative project. The benefits and drawbacks of merging with and without squashing:

  • If we merge PRs without squashing, the commit history can become quite messy: many small commits such as typo fixes all appear in the history, it's hard to understand what each commit aims to do, and commits from different PRs interleave with each other.
  • On the other hand, if we merge PRs with squashing, each commit has a clear purpose and we can refer back to its original PR and check the comments there. Furthermore, CI can check whether each PR passes after being merged into the main branch. Many well-known open source projects use this mechanism to manage PRs from the community.

I'd like to hear your thoughts on this topic, and I hope you'll consider the suggestion.

Need rust-toolchain

zkevm-circuits uses Rust nightly, so I think we need a rust-toolchain file to standardize the toolchain across the project.
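
A rust-toolchain.toml at the repository root would pin both developers and CI to the same toolchain; the channel date and component list below are only an example:

[toolchain]
channel = "nightly-2021-11-17"
components = ["rustfmt", "clippy"]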

Change internal representation of `MemoryAddress` to `usize`

As said by @han0110 :

was integrating with bus-mapping and found MemoryAddress is implemented with BigUint. I think for an honest prover we probably only need usize for that, because once memory usage goes up to 2**24 the gas cost is already 20x the current block gas limit.

(btw 2**24 is the hard bound for memory address in my mind for now)

This will make the type Copy and might enable some optimizations.
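
A minimal sketch of what the newtype could look like after the change; the derives and the From impl are illustrative, not a final design:

// Thin wrapper over usize; 2**24 is treated as the practical upper bound for
// memory addresses, since anything larger already costs far more gas than the
// current block gas limit.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct MemoryAddress(pub usize);

impl From<usize> for MemoryAddress {
    fn from(addr: usize) -> Self {
        MemoryAddress(addr)
    }
}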

Integrate `EvmCircuit` with crate `bus-mapping`

Currently EvmCircuit uses a lot of self-defined structs that should be provided by the bus-mapping crate, such as:

  • Case
  • CoreStateInstance
  • ExecutionStep
  • Operation

We should remove these once bus-mapping is ready to provide:

  • call_id, stack_pointer, gas_counter and case in ExecutionStep
  • Vec<Operation> from ExecutionTrace

Add Makefile to run CI checks locally

Clippy checks and possibly broken intra-doc link checks (which we should add once documentation work starts) can cause CI failures in some workflows.

It would be much easier to expose a rule in a Makefile that runs the same checks the CI runs locally with a single command.
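
A sketch of what such a Makefile could contain; the target names and flags are assumptions and should mirror whatever the CI workflows actually run (recipe lines must be tab-indented):

.PHONY: fmt clippy test doc check

fmt: ## Check formatting
	cargo fmt --all -- --check

clippy: ## Lint with warnings denied
	cargo clippy --all-targets -- -D warnings

test: ## Run the test suite
	cargo test --all-targets

doc: ## Build docs; broken intra-doc links show up as warnings here
	cargo doc --no-deps

check: fmt clippy test doc ## Everything the CI runs, in one command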

Implement Bus Mapping structure

It should come with, at a minimum:

  • Sorting capabilities for Stack, Storage and Memory operations
  • Compression capabilities via closures
  • Be as generic as possible and allow external logic to be plugged in easily via closures (a rough sketch follows below).
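
A rough sketch of the kind of generic container this could be, with caller-supplied sorting keys and closure-based compression; every name below is made up for illustration:

// Hypothetical bus-mapping container: it stores operations of any type `Op`
// and delegates compression to a closure supplied from outside.
pub struct BusMapping<Op, C> {
    ops: Vec<Op>,
    compress: C,
}

impl<Op, C, Out> BusMapping<Op, C>
where
    C: Fn(&[Op]) -> Out,
{
    pub fn new(ops: Vec<Op>, compress: C) -> Self {
        Self { ops, compress }
    }

    // Sort Stack/Storage/Memory operations by a caller-supplied key.
    pub fn sort_by_key<K: Ord>(&mut self, key: impl Fn(&Op) -> K) {
        self.ops.sort_by_key(key);
    }

    // Run the plugged-in compression over the (sorted) operations.
    pub fn compressed(&self) -> Out {
        (self.compress)(&self.ops)
    }
}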

Create new type for EVM memory layout that allows for unaligned memory ops management

As said by @han0110 :

  • for sha3, we decide the offset and size, and it can also be misaligned.
  • when we run call or calldatacopy, we also decide the offset and size.
  • at the end of contract creation, we also decide the offset and size of the bytecode to store.

The idea would be to be able to swap between a BTreeMap and a Vec<u8> so that we can manage the memory easily for regular Opcodes and for the tricky ones.

Completely changing the current approach is also an option to consider.
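
One possible shape for the type, assuming it can hold either a sparse (BTreeMap) or a dense (Vec<u8>) backing store; all names and methods here are illustrative:

use std::collections::BTreeMap;

// Hypothetical memory type that can expose either a sparse or a dense view,
// so unaligned reads/writes (SHA3, CALL, CALLDATACOPY, contract creation
// returns) can be handled without special-casing the callers.
pub enum EvmMemory {
    // Sparse view: address -> byte, convenient for unaligned accesses.
    Sparse(BTreeMap<usize, u8>),
    // Dense view: plain byte vector, convenient for the regular opcodes.
    Dense(Vec<u8>),
}

impl EvmMemory {
    pub fn read_byte(&self, addr: usize) -> u8 {
        match self {
            EvmMemory::Sparse(map) => map.get(&addr).copied().unwrap_or(0),
            EvmMemory::Dense(bytes) => bytes.get(addr).copied().unwrap_or(0),
        }
    }

    pub fn write_byte(&mut self, addr: usize, value: u8) {
        match self {
            EvmMemory::Sparse(map) => {
                map.insert(addr, value);
            }
            EvmMemory::Dense(bytes) => {
                if addr >= bytes.len() {
                    bytes.resize(addr + 1, 0);
                }
                bytes[addr] = value;
            }
        }
    }
}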

Pre-load fixed constants into lookup tables

In both state and EVM circuits, we have to check that certain inputs are within an allowed range of values. At keygen, these allowed values are loaded as fixed constants into lookup tables. The constants include:

  • opcodes
  • memory addresses
  • storage addresses
  • stack addresses (a 10-bit range, from 0..=1024)
  • global counter (?)
  • program counter (?)
  • TBD: 8-bit lookup tables for encoding and decoding 256-bit words (?)
  • any more?

Benchmark and start to optimise the halo2 prover using the KZG polynomial commitment scheme

In #38 we replaced the halo2 lib with a KZG-based version in the verifier. We still have some work to do to benchmark the prover using KZG with the halo2 lib.

Our current approach is probably going to result in very large circuits, so setting an upper bound on the proving time for such large circuits would be useful to:

  1. De-risk our current architecture choices
  2. Set our expectations around proving times and the cost of running a proving node.

https://github.com/kilic/halo2/tree/kzg is the branch that contains the KZG polynomial commitments.
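
A criterion harness would be an easy way to collect those numbers; the sketch below only shows the shape of the benchmark, with a dummy workload standing in for the actual KZG proof generation call, since that API is not pinned down in this issue:

use criterion::{criterion_group, criterion_main, Criterion};

fn bench_kzg_prover(c: &mut Criterion) {
    let mut group = c.benchmark_group("kzg_prover");
    // Proving is slow, so keep the sample count small.
    group.sample_size(10);
    group.bench_function("dummy_workload_k17", |b| {
        b.iter(|| {
            // Placeholder: replace with the real proof generation call from
            // https://github.com/kilic/halo2/tree/kzg once a circuit is wired up.
            criterion::black_box((0u64..(1u64 << 17)).fold(0u64, |acc, x| acc.wrapping_add(x)))
        })
    });
    group.finish();
}

criterion_group!(benches, bench_kzg_prover);
criterion_main!(benches);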

Easy to use EVM simulator for unit tests

Currently the way to write unit tests is to manually create the execution steps and resulting operations for all the different op codes each time. This is very tedious. It would be great if we could have a very simple EVM simulator (which would also be updated when implementing an opcode) where we could just write something like this (with maybe some extra ways to mock data for testing purposes):

evm.push(1);
evm.push(2);
evm.add();

try_test_circuit!(
    evm.get_execution_steps(),
    evm.get_operations(),
    Ok(())
);

I think this will make writing unit tests much more pleasant and faster.

The closest thing to this may be something @CPerezz is working on. Do you think there could be some overlap here @CPerezz? If not I'd be willing to do this myself.
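
To make the intent concrete, here is a rough sketch of what such a simulator might look like; all types and method names are hypothetical, and the real version would record the actual ExecutionStep and Operation structs instead of strings:

// Hypothetical test-only EVM simulator that records execution steps and bus
// mapping operations as opcodes are "executed".
#[derive(Default)]
pub struct TestEvm {
    stack: Vec<u64>,
    execution_steps: Vec<String>,
    operations: Vec<String>,
}

impl TestEvm {
    pub fn push(&mut self, value: u64) {
        self.stack.push(value);
        self.execution_steps.push(format!("PUSH {}", value));
        self.operations.push(format!("StackWrite {}", value));
    }

    pub fn add(&mut self) {
        let b = self.stack.pop().unwrap();
        let a = self.stack.pop().unwrap();
        let sum = a.wrapping_add(b);
        self.stack.push(sum);
        self.execution_steps.push("ADD".to_string());
        self.operations
            .push(format!("StackRead {} / StackRead {} / StackWrite {}", a, b, sum));
    }

    pub fn get_execution_steps(&self) -> &[String] {
        &self.execution_steps
    }

    pub fn get_operations(&self) -> &[String] {
        &self.operations
    }
}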

Keccak256 tables to build

Problem

We have 4 lookup tables to use in the keccak256 gadget. Listing their Python pseudocode below.

from_binary_converter_table

Purpose: convert a 64-bit value from binary to base 13 (and base 9), in 16-bit chunks

>>> import itertools
>>> list(itertools.product([1, 2, 3], [4, 5, 6]))
[(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]

This table has 2**16 rows

class FromBinaryConverterTable:

    def __init__(self):
        axes = [[0, 1] for i in range(16)]

        for coefs in itertools.product(*axes):
            assert len(coefs) == 16
            # key is the binary evaluation of coefs
            key = sum([coef*(2**i) for i, coef in enumerate(coefs)])
            value0 = sum([coef*(13**i) for i, coef in enumerate(coefs)])
            value1 = sum([coef*(9**i) for i, coef in enumerate(coefs)])
            self.add_row(key, value0, value1)

first_to_second_base_converter_table

Purpose: convert the middle segments, where the chunk size is 4

This table has 13**4 rows

def block_counting_function(n: int) -> int:
    table = [0, 0, 1, 13, 170]
    return table[n]


class FirstToSecondBaseConverterTable:
    def __init__(self):
        axes = [list(range(13)) for i in range(4)]

        for coefs in itertools.product(*axes):
            assert len(coefs) == 4
            # x0, x1, x2, x3 are 0~12
            x0, x1, x2, x3 = coefs
            key = x0 + x1 * 13 + x2 *13**2 + x3*13**3
            fx0 = keccak_u64_first_converter(x0)
            fx1 = keccak_u64_first_converter(x1)
            fx2 = keccak_u64_first_converter(x2)
            fx3 = keccak_u64_first_converter(x3)
            # fx0, fx1, fx2, fx3 are 0~1
            value = fx0 + fx1 * 9 + fx2 *9**2 + fx3*9**3
            
            non_zero_chunk_count = 4 - len([i for i in coefs if i==0])
            block_count = block_counting_function(non_zero_chunk_count)
            
            self.add_row(key, block_count, value)


def keccak_u64_first_converter(n: int) -> int:
    """
    n is the sum of 12 different bits.
    The theta step has 12 xor operations
    If the sum is odd, then the 12 xor operations result in 1.
    If the sum is even, then the 12 xor operations result in 0.
    """
    assert n < 13
    return n & 1


def keccak_u64_second_converter(n: int) -> int:
    """
    n is the output of 2a + b + 3c + 2d, where a, b, c, d are bits
    every possible n can be uniquely mapped to the output of a ^ (~b & c) ^ d
    bit_table is the output of a ^ (~b & c) ^ d
    """
    assert n < 9
    bit_table = [0, 0, 1, 1, 0, 0, 1, 1, 0]
    return bit_table[n]

of_first_to_second_base_converter_table

Purpose: convert the last segment, where the chunk size is less than 4

The table size should be (13 + 1) * 13 / 2 = 91 rows

class OfFirstToSecondBaseConverterTable:

    def __init__(self):
        for i in range(13):
            for j in range(13 - i):
                low = i
                high = j
                key = low + high * 13**64
                value = keccak_u64_first_converter(low + high)
                self.add_row(key, value)

from_second_base_converter_table

Purpose: Convert from base 9 to base 13 or binary

This table has 9**5 rows

class FromSecondBaseConverterTable:

    def __init__(self):
        axes = [list(range(9)) for i in range(5)]

        for coefs in itertools.product(*axes):
            assert len(coefs) == 5
            # key is the base-9 evaluation of coefs
            key = sum([coef*(9**i) for i, coef in enumerate(coefs)])
            value0 = sum([coef*(13**i) for i, coef in enumerate(coefs)])
            value1 = sum([coef*(2**i) for i, coef in enumerate(coefs)])
            self.add_row(key, value0, value1)

Solution

Build the 4 tables

Note

  • See this note for other details
  • Could create an abstraction MultiBaseNormalizationTable

[EVM] `ADD` example

Implement the ADD opcode in the EVM circuit.

This requires a bus mapping as input.
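
On the witness side, ADD is just byte-wise addition with carries over the 32-byte words; a plain (non-circuit) sketch of the values the gadget would have to witness:

// Add two 256-bit words given as 32 little-endian bytes each, returning the
// 32-byte sum (mod 2^256) and the per-byte carries the circuit would witness.
fn add_words(a: &[u8; 32], b: &[u8; 32]) -> ([u8; 32], [u8; 32]) {
    let mut sum = [0u8; 32];
    let mut carries = [0u8; 32];
    let mut carry = 0u16;
    for i in 0..32 {
        let t = a[i] as u16 + b[i] as u16 + carry;
        sum[i] = (t & 0xff) as u8;
        carry = t >> 8;
        carries[i] = carry as u8;
    }
    (sum, carries)
}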

Create a type for EVM Stack memory which manages the indexing behind the scenes

Currently the stack is implemented as a Vec<EvmWord>. The only issue is that we need the stack to allocate TOP -> BOTTOM.

To do so, we hardcode operations in several places in the code that compute 1025 - address in order to attach the correct address to the StackOps that we create.

The idea is that this should not happen: the new type should hide this complexity internally and not expose this duality of positions to the outside.
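
A sketch of such a type; the 1024-slot capacity and the TOP -> BOTTOM convention come from the description above, while the names and method set are purely illustrative:

// Hypothetical stack wrapper that owns the TOP -> BOTTOM address translation,
// so callers never compute `1025 - address` by hand.
pub struct Stack(Vec<u64>); // would hold EvmWord in practice

impl Stack {
    pub fn new() -> Self {
        Stack(Vec::new())
    }

    // Address of the current top of the stack, already in the convention the
    // StackOps expect: 1024 means empty, and it decreases as elements are pushed.
    pub fn stack_pointer(&self) -> usize {
        1024 - self.0.len()
    }

    // Push a word and return the address it was written to.
    pub fn push(&mut self, word: u64) -> usize {
        self.0.push(word);
        self.stack_pointer()
    }

    // Pop the top word together with the address it was read from.
    pub fn pop(&mut self) -> Option<(usize, u64)> {
        let addr = self.stack_pointer();
        self.0.pop().map(|word| (addr, word))
    }
}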

Migrate `EvmWord` to `[0u8;32]`

This will save memory, and there's no need for BigUint if we're not going to do arithmetic with it.

It also makes it easier to adapt the struct to #89.

cargo fmt check with warning

On the main branch:
cargo fmt --all -- --check
Warning: can't set wrap_comments = true, unstable features are only available in nightly channel.
Warning: can't set wrap_comments = true, unstable features are only available in nightly channel.
Warning: can't set wrap_comments = true, unstable features are only available in nightly channel.
Warning: can't set wrap_comments = true, unstable features are only available in nightly channel

I saw that it causes CI failures, as in https://github.com/appliedzkp/zkevm-circuits/pull/78/checks?check_run_id=3728234388

Please help check it!

Thanks
Dream

Implement `BusMappingLookup::Bytecode`

Currently in the EVM circuit the opcode is not verified against the contract's bytecode when it's a contract interaction.

Opcodes like PUSH* that fetch from the contract's bytecode should also be verified.

Bus mapping compression gadget

Each witnessed bus mapping (of variable length) is compressed into a commitment. The committed bus mapping is passed into both circuits as public input. The commitments are indexed by global_counter.

Within the circuit, we constrain a witnessed bus mapping at a certain global_counter to match the public input commitment at that global_counter.

Write a gadget to do this for variable-length bus mappings.

[Gadget] Keccak256 hash function

Problem

In order to be as EVM-compatible as possible, we need to perform Keccak256 in the circuit.

Solution

We plan to implement Keccak256 as a gadget for the codebase to consume.

We have a working document here https://hackmd.io/4Bzz0FzTQLyYI5uzjgP2iA?both.

An existing implementation here https://github.com/matter-labs/franklin-crypto/tree/dev/src/plonk/circuit/hashes_with_tables/keccak

Need to figure out how to do cyclic bit rotation. Plan to use this trick https://hackmd.io/xfgP5_uMTZyaEJJG4EJoRQ?view

Plan to reuse the sponge from orchard poseidon https://github.com/zcash/orchard/blob/main/src/circuit/gadget/poseidon.rs
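
Outside the circuit the rotation itself is trivial; the point of the linked trick is that rot(x, r) can be done as a split-and-recombine, which the sketch below checks on plain u64 values (in the gadget the same split happens on base-13 chunks rather than on bits):

// rot_left(x, r) == (lo << r) | hi, where x = (hi << (64 - r)) | lo.
fn rotate_left_by_split(x: u64, r: u32) -> u64 {
    assert!(r > 0 && r < 64);
    let hi = x >> (64 - r); // the r top bits wrap around to the bottom
    let lo = x & ((1u64 << (64 - r)) - 1); // the remaining 64 - r bits
    (lo << r) | hi
}

// Sanity check against the builtin rotation.
fn main() {
    let x = 0x0123_4567_89ab_cdefu64;
    for r in 1..64 {
        assert_eq!(rotate_left_by_split(x, r), x.rotate_left(r));
    }
}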

Verify gas is indeed sufficient in success case

Currently the EVM circuit doesn't verify that gas is indeed sufficient in the success case. It also tracks gas_counter as accumulated gas usage, which is not friendly to verify, since we would need to read out the gas_available of this call, which we don't actually track step by step.

We should track gas_left instead and verify it is indeed sufficient in every success case, naively by decoding next_state.gas_left into 8 bytes to verify it doesn't underflow.

Also, once gas_left is capped at u64, everything calculated as a gas cost, like quad_memory_cost, can also be capped at 8 bytes in the success case.
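
A witness-side sketch of that check, assuming gas values are capped at u64: decomposing gas_left into 8 bytes is exactly what proves it did not underflow.

// Compute gas_left for a step and return the 8-byte decomposition the circuit
// would range-check; `None` models the out-of-gas (underflow) case.
fn gas_left_bytes(gas_available: u64, step_cost: u64) -> Option<[u8; 8]> {
    gas_available
        .checked_sub(step_cost)
        .map(|gas_left| gas_left.to_le_bytes())
}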

bus-mapping: Implement Opcode trait for all non-storage-related `OpcodeId`s

This is needed so that we can test most of the traces we might consider.

This will probably exclude opcodes like CALL, CALLDATACOPY, etc., as they might require some extra configuration and design.

Add docs for `bus-mapping` workspace member crate

After #20 lands (its only purpose was to provide a basic layout structure for the crate), it would be nice to start documenting the crate, its structures, and how it provides the data to be witnessed by the circuits.

Testing time too long

In the XOR circuit implementation (#75), I added the XOR lookup table with 2^16 rows. In order to accommodate so many rows, I need to increase the table size in Halo2 to 2^17. But after that, the testing time becomes very long. It took me more than 10 minutes to finish the test. What's worse is that after having XOR circuit, all other opcode tests need to bump up their table size. It's very likely that the Github test will timeout. We need to find a way to address this issue.

1559 support

  • In a ZKP it's tricky to support 1559 because the block gas limit adjusts with each block, +/- 50% per block.
  • Inside a zkrollup it makes less sense because most blocks will have a few coordinators who may try to bid to manipulate the block size to reduce the 1559 burn.
  • There are some interesting ideas for leader election similar to https://ethresear.ch/t/spam-resistant-block-creator-selection-via-burn-auction/5851

That said, we will need to support 1559 at least for Ethereum proof of validity.

Perhaps it will be easier to support 1559 if we blur the lines of where a block can start and finish. Instead of having each ZKP contain a full block, we could have a ZKP contain all or just a portion of a block. This would allow us to adjust the block gas limit on the fly according to the rules of 1559.

Port the repo to be a workspace

Once we have the bus-mapping crate here, we should aim to turn the repo into a workspace, to make development of the entire solution easier.
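
The workspace root would then only need a small Cargo.toml along these lines (the member names are an example):

[workspace]
members = [
    "zkevm-circuits",
    "bus-mapping",
]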
