arecibo's Issues

Reduce Duplication Between SatisfyingAssignment and WitnessCS Structs

It has been observed that the SatisfyingAssignment (in this repo) and WitnessCS (in bellpepper) structs have significant overlap in their definitions. This duplication can lead to maintenance challenges and potential inconsistencies.

Similarities:

  1. Purpose: Both structs serve as a ConstraintSystem that calculates witness values for a concrete instance of an R1CS circuit.
  2. Fields: Both structs have fields input_assignment and aux_assignment.
  3. Methods: Most methods in both structs are identical in both functionality and structure. This includes the implementation of the ConstraintSystem trait, as well as auxiliary methods such as scalar_inputs and scalar_aux.

Differences:

  1. Generics:
    • SatisfyingAssignment is parameterized over G: Group and uses G::Scalar.
    • WitnessCS is parameterized over Scalar where Scalar: PrimeField.
  2. Attributes & Derivations:
    • WitnessCS has the #[derive(Debug, Clone, PartialEq, Eq)] attribute, which provides automatic implementations of the Debug, Clone, PartialEq, and Eq traits.
    • SatisfyingAssignment provides its own fmt::Debug and PartialEq implementations and does not derive Clone or Eq.

Refactoring Potential:

  1. Shared Trait or Base Struct: Given the similarity in behavior, a shared trait or a base struct could be created to capture the common functionalities. The specific implementations (SatisfyingAssignment and WitnessCS) can then extend or implement this shared definition.
  2. Consolidate Attributes & Derivations: Determine whether SatisfyingAssignment requires its own Debug and PartialEq implementations. If not, it could use the derived implementations similar to WitnessCS.
  3. Type Alias or Adaptors: If the core difference lies in the type parameters (G::Scalar vs. Scalar), consider type aliasing or using adaptors to harmonize the two.
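
A shared definition along the lines of item 1 could be sketched as follows. This is a hypothetical sketch: the trait name `WitnessAssignment`, the toy `u64` scalar, and the pared-down structs are illustrative, not the real arecibo/bellpepper types (which would be bounded by `ff::PrimeField`):

```rust
// Hypothetical shared trait; default methods capture the duplicated helpers.
trait WitnessAssignment {
    type Scalar: Clone;
    fn input_assignment(&self) -> &[Self::Scalar];
    fn aux_assignment(&self) -> &[Self::Scalar];

    // Shared auxiliary methods get a single default implementation.
    fn scalar_inputs(&self) -> Vec<Self::Scalar> {
        self.input_assignment().to_vec()
    }
    fn scalar_aux(&self) -> Vec<Self::Scalar> {
        self.aux_assignment().to_vec()
    }
}

// Toy stand-ins for the two real structs.
struct SatisfyingAssignment { inputs: Vec<u64>, aux: Vec<u64> }
struct WitnessCS { inputs: Vec<u64>, aux: Vec<u64> }

impl WitnessAssignment for SatisfyingAssignment {
    type Scalar = u64;
    fn input_assignment(&self) -> &[u64] { &self.inputs }
    fn aux_assignment(&self) -> &[u64] { &self.aux }
}

impl WitnessAssignment for WitnessCS {
    type Scalar = u64;
    fn input_assignment(&self) -> &[u64] { &self.inputs }
    fn aux_assignment(&self) -> &[u64] { &self.aux }
}

fn main() {
    let cs = WitnessCS { inputs: vec![1u64, 2], aux: vec![3] };
    assert_eq!(cs.scalar_inputs(), vec![1, 2]);
    assert_eq!(cs.scalar_aux(), vec![3]);
    let sa = SatisfyingAssignment { inputs: vec![7u64], aux: vec![] };
    assert_eq!(sa.scalar_inputs(), vec![7]);
}
```

Each concrete struct then only supplies field access, while the `ConstraintSystem` plumbing and helpers live in one place.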

ZeroMorph: Implement Zeta_x

Zeta_x is defined on page 24 of the ZeroMorph paper. We need to calculate this polynomial so that, in the main protocol (page 34), we can compute q_zeta satisfying the relation Zeta_x = (X - x)·q_zeta.
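
Once Zeta_x is available, obtaining q_zeta is a plain division by the linear factor (X - x). A minimal sketch of that division (standard synthetic/Horner division, over a toy 31-bit prime field; the modulus and function names are illustrative, not arecibo's):

```rust
const P: u64 = 2_147_483_647; // toy prime modulus (2^31 - 1), for illustration only

fn mulmod(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P as u128) as u64 }
fn addmod(a: u64, b: u64) -> u64 { (a + b) % P }

/// Divide p(X) (coefficients low-to-high) by (X - x), assuming p(x) = 0,
/// returning q with p(X) = (X - x) * q(X). Standard synthetic division:
/// q[n-2] = p[n-1], then q[i] = p[i+1] + x * q[i+1], working downward.
fn divide_by_linear(p: &[u64], x: u64) -> Vec<u64> {
    let n = p.len();
    let mut q = vec![0u64; n - 1];
    let mut carry = 0u64;
    for i in (0..n - 1).rev() {
        carry = addmod(p[i + 1], mulmod(carry, x));
        q[i] = carry;
    }
    q
}

fn main() {
    // p(X) = X^2 - 3X + 2 = (X - 1)(X - 2), so p / (X - 1) = X - 2.
    let p = [2u64, P - 3, 1];
    assert_eq!(divide_by_linear(&p, 1), vec![P - 2, 1]);
}
```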

Multilinear Evaluation Space consumption

Ref: http://bgillespie.com/doc/multilinear-evaluation.pdf

The MLE in Nova¹ is here:

/// Evaluates the polynomial at the given point.
/// Returns Z(r) in O(n) time.
///
/// The point must have a value for each variable.
pub fn evaluate(&self, r: &[Scalar]) -> Scalar {
    // r must have a value for each variable
    assert_eq!(r.len(), self.get_num_vars());
    let chis = EqPolynomial::new(r.to_vec()).evals();
    assert_eq!(chis.len(), self.Z.len());
    (0..chis.len())
        .into_par_iter()
        .map(|i| chis[i] * self.Z[i])
        .sum()
}

In the above, this uses Algorithm 3, which seems less interesting than Algorithm 4 in terms of memory consumption. This issue is about answering the following question: would it be worthwhile for us to compare and benchmark Algorithm 4?

Footnotes

  1. incidentally, this shows Nova has a slightly unusual coefficient order in the presentation of its MLE:
    https://github.com/lurk-lab/arecibo/blob/133b8b714a444e0c16971bbe50f806c997c392e3/src/spartan/polys/multilinear.rs#L340-L442
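
For context, the memory trade-off at stake can be illustrated as follows: instead of materializing the O(n) `chis` table, each χ_i(r) can be recomputed on the fly from the bits of i, using O(1) extra space at the cost of an extra log factor of multiplications. This is only an illustration of the trade-off, not necessarily Algorithm 4 itself; the toy modulus and the bit order are assumptions:

```rust
const P: u64 = 2_147_483_647; // toy prime modulus, for illustration only

fn mulmod(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P as u128) as u64 }
fn addmod(a: u64, b: u64) -> u64 { (a + b) % P }
fn submod(a: u64, b: u64) -> u64 { (a + P - b) % P }

/// Evaluate an MLE given its 2^m evaluations `z` (index bits = variable
/// assignment) at point `r`, without materializing the chis table:
/// O(1) extra space, O(n log n) multiplications.
fn evaluate_streaming(z: &[u64], r: &[u64]) -> u64 {
    let m = r.len();
    assert_eq!(z.len(), 1 << m);
    let mut acc = 0u64;
    for (i, zi) in z.iter().enumerate() {
        let mut chi = 1u64;
        for (k, rk) in r.iter().enumerate() {
            // NOTE: the bit order here is an assumption; Nova's actual
            // variable order may be reversed (see the footnote above).
            chi = mulmod(chi, if (i >> k) & 1 == 1 { *rk } else { submod(1, *rk) });
        }
        acc = addmod(acc, mulmod(chi, *zi));
    }
    acc
}

fn main() {
    let z = [10u64, 20, 30, 40];
    // At a Boolean point, the MLE returns the corresponding table entry.
    assert_eq!(evaluate_streaming(&z, &[0, 1]), 30);
    assert_eq!(evaluate_streaming(&z, &[1, 1]), 40);
}
```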

Implement common trait for both hiding and non-hiding KZG

We need to define and implement a common interface that any implementation of KZG (hiding or non-hiding) must implement in order to be used as a building block for ZeroMorph.

Remark: the main protocol described in Page 34 of ZeroMorph paper only requires KZG to have setup, commit and open functions. In particular, it doesn't require the subprotocol (proveEval, verifyEval) from KZG. This is because the subprotocol is reconstructed at ZeroMorph layer, including batching of degree checks.
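
A possible shape for that interface, restricted to the three functions the remark above calls out. All names here (`UnivariatePCS`, `MockScheme`) are hypothetical sketches, not arecibo's actual traits, and the `u64` polynomial coefficients are toy stand-ins for field elements:

```rust
// Hypothetical common interface: only setup, commit and open, per the remark.
trait UnivariatePCS {
    type Params;
    type Commitment: PartialEq + std::fmt::Debug;
    type Proof;

    fn setup(max_degree: usize) -> Self::Params;
    fn commit(pp: &Self::Params, poly: &[u64]) -> Self::Commitment;
    /// Open `poly` at `x`, returning the claimed evaluation and a proof.
    fn open(pp: &Self::Params, poly: &[u64], x: u64) -> (u64, Self::Proof);
}

// A trivially insecure mock, just to show that hiding and non-hiding schemes
// could sit behind the same trait surface.
struct MockScheme;
impl UnivariatePCS for MockScheme {
    type Params = usize;
    type Commitment = u64;
    type Proof = ();
    fn setup(max_degree: usize) -> usize { max_degree }
    fn commit(_pp: &usize, poly: &[u64]) -> u64 { poly.iter().sum() }
    fn open(_pp: &usize, poly: &[u64], x: u64) -> (u64, ()) {
        // Horner evaluation (coefficients low-to-high); the proof is elided.
        (poly.iter().rev().fold(0u64, |acc, c| acc * x + c), ())
    }
}

fn main() {
    let pp = MockScheme::setup(4);
    assert_eq!(MockScheme::commit(&pp, &[1, 2, 3]), 6);
    let (eval, _proof) = MockScheme::open(&pp, &[1, 2, 3], 2);
    assert_eq!(eval, 17); // 1 + 2*2 + 3*4
}
```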

Benchmark a contiguous data representation for Sparse Polynomials

#38 switched the representation of sparse matrices from COO to CSR, for a slight performance benefit in operations such as SpMV. The key insight is that in CSR, data accessed in successive loop iterations is laid out consecutively in memory. This makes it amenable to auto-vectorizers.

Nova uses a representation of a sparse polynomial in (coordinate, coefficient) form, where the representation of data being iterated upon is not contiguous:
https://github.com/lurk-lab/arecibo/blob/dev/src/spartan/polys/multilinear.rs#L137-L174

It's possible that switching to a contiguous memory representation (one vector for coefficients, one vector for coordinates) may bring performance benefits, and probably worth testing.
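
The proposed change amounts to an array-of-structs to struct-of-arrays transformation, which can be sketched as follows (toy `u64` coefficients stand in for field elements; names are illustrative):

```rust
/// Array-of-structs: roughly the current (coordinate, coefficient) form.
struct SparseAos(Vec<(usize, u64)>);

/// Struct-of-arrays: contiguous coordinates and coefficients, as proposed.
struct SparseSoa { indices: Vec<usize>, coeffs: Vec<u64> }

fn eval_aos(p: &SparseAos, dense: &[u64]) -> u64 {
    p.0.iter().map(|&(i, c)| c * dense[i]).sum()
}

fn eval_soa(p: &SparseSoa, dense: &[u64]) -> u64 {
    // The multiplications stream over two contiguous arrays, which is the
    // access pattern auto-vectorizers handle best.
    p.indices.iter().zip(&p.coeffs).map(|(&i, &c)| c * dense[i]).sum()
}

fn main() {
    let dense = [1u64, 2, 3, 4];
    let aos = SparseAos(vec![(0, 5), (3, 7)]);
    let soa = SparseSoa { indices: vec![0, 3], coeffs: vec![5, 7] };
    assert_eq!(eval_aos(&aos, &dense), 33);
    assert_eq!(eval_aos(&aos, &dense), eval_soa(&soa, &dense));
}
```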

refactor/spartan: Generic batched `SumcheckEngine` module

The SumcheckEngine trait provides a useful abstraction for describing Sumcheck claims, and the corresponding prove_helper method in ppsnark.rs provides a useful generic Sumcheck prover.

SumcheckEngine::prove_batch

This method should take as input a heterogeneous list of SumcheckEngine implementations of possibly different sizes, and create a batched Sumcheck proof for all claims.

Dynamic dispatch may be fine in this situation, since all the "hot-loops" are implemented by each SumcheckEngine.

SumcheckEngine for evaluating a polynomial

We use Sumcheck to check the evaluation of a multi-linear polynomial in several places

  • batch_eval_{prove, verify}
  • The second claim in InnerSumcheckInstance of ppsnark.rs

If we split the InnerSumcheckInstance, we can implement the second claim using this instance, which can be reused by both SNARKs.

InnerSumcheckInstance

Both direct and pre-processing SNARKs need to check $Az \circ Bz - (u\cdot Cz + E) = 0$. In the literature, this is often referred to as the "row-check" for R1CS. Both implementations should use the same SumcheckEngine implementation to prove the claim.

SumcheckEngine trait

It would be very practical if the trait exposed num_remaining_variables to indicate how many variables are left to prove. We use this quantity to handle batching.
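
The proposed `prove_batch` surface could be sketched as follows. All names and the round-alignment policy below are hypothetical, not arecibo's actual definitions; the point is only the heterogeneous list, the dynamic dispatch, and `num_remaining_variables` driving padding-free batching:

```rust
// Hypothetical trait surface mirroring the issue.
trait SumcheckEngine {
    /// How many variables are left to prove; used to align batching.
    fn num_remaining_variables(&self) -> usize;
    fn prove_round(&mut self);
}

struct Toy { vars: usize }
impl SumcheckEngine for Toy {
    fn num_remaining_variables(&self) -> usize { self.vars }
    fn prove_round(&mut self) { self.vars -= 1; }
}

/// Batched driver over a heterogeneous list; dynamic dispatch is fine since
/// the hot loops live inside each engine's `prove_round`.
fn prove_batch(engines: &mut [Box<dyn SumcheckEngine>]) -> usize {
    let rounds = engines.iter().map(|e| e.num_remaining_variables()).max().unwrap_or(0);
    for t in 0..rounds {
        let remaining = rounds - t;
        for e in engines.iter_mut() {
            // Smaller instances only join once the sizes line up (no padding).
            if e.num_remaining_variables() == remaining {
                e.prove_round();
            }
        }
    }
    rounds
}

fn main() {
    let mut engines: Vec<Box<dyn SumcheckEngine>> =
        vec![Box::new(Toy { vars: 3 }), Box::new(Toy { vars: 1 })];
    assert_eq!(prove_batch(&mut engines), 3);
    assert!(engines.iter().all(|e| e.num_remaining_variables() == 0));
}
```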

Acceleration of SpMVM in folding

Background:

Both the Nova project and our fork, Arecibo, consider the performance of folding to be critical. During a recent analysis, we noticed that the commitment to cross-terms is a significant component of folding performance. This observation holds true even when the costs are, for large enough operations, primarily dominated by an MSM operation, as predicted by theoretical analysis.

Findings:

The texray graph showcasing our investigation's results is as follows:

RecursiveSNARK::prove_step                       5s     ├───────────────────────────────────────────────────────────────┤
    <MultiFrame as StepCircuit>::synthesize      1s     ├────────────┤
    <_ as Group>::vartime_multiscalar_mul      947ms                  ├─────────┤
    NIFS::prove                                  3s                              ├─────────────────────────────────────┤
        AZ_1, BZ_1, CZ_1                         1s                              ├─────────────┤
        AZ_2, BZ_2, CZ_2                       767ms                                             ├───────┤
        cross terms                            263ms                                                      ├─┤
        T                                      202ms                                                         ├┤
        <_ as Group>::vartime_multiscalar_mul  674ms                                                           ├──────┤
    <_ as Group>::vartime_multiscalar_mul        5ms                                                                    ┆

@winston-h-zhang to provide more information on reproduction here

Challenges:

While the MSM operation can be GPU-accelerated (as seen in the pasta-msm project), the field multiplications involved in the matrix-vector product are currently not.

Proposed Solution:

It's imperative to accelerate these field multiplications to achieve optimal performance for the folding operation.

`supernova::RecursiveSNARK::prove_step()` shouldn't need `augmented_circuit_index` as a parameter

The current setup with the program counter for supernova::RecursiveSNARK makes it so that at each step of the proof the user has to provide an augmented_circuit_index to RecursiveSNARK::prove_step().

RecursiveSNARK already internally contains the program counter data, which has to be extracted and then, immediately afterwards, provided back as an input to RecursiveSNARK in the proving loop, using logic similar to what is done here.

If RecursiveSNARK internally kept track of the next augmented circuit index, it should be possible to change the signature of prove_step to:

  pub fn prove_step<N: NonUniformCircuit<G1, G2, C1, C2>>(
    &mut self,
    pp: &PublicParams<G1, G2, C1, C2>,
    non_uniform_circuit: &N,
    z0_primary: &[G1::Scalar],
    z0_secondary: &[G2::Scalar],
  ) -> Result<(), SuperNovaError>

and add the logic from the above-linked example to the prove_step method.

By making this change, we will probably exclude some unusual use-cases for SuperNova's NIVC, but I don't think it excludes the main use-case, where program_counter directly points to the next expected circuit to evaluate.

Universalize CommitmentKey in Arecibo to save space and streamline implementation

The nature of public params in Arecibo presents a key opportunity for improvement, particularly in relation to how many bases are used in the CommitmentKey; these bases are the same for any proof using the IPA PCS (i.e. before Zeromorph, all of them).

Current Behavior:

The CommitmentKey acts as a universal parameter and ideally, we'd have a single large disk copy stored for big circuits, which can then be trimmed down as needed. The key difference between two Nova proofs (and possibly, the kind of SNARK they use) should be solely based on the number of elements they require.

Desired Behavior:

Ensure a single, universal CommitmentKey is stored and utilized to trim down specifics for each proof.

Advantages:

  • This would be a crucial step towards implementing Zeromorph (arecibo/issues/19). In the PCS system of Zeromorph, which mandates a universal setup, there's a clear demarcation between Public Parameters (universal) and the Prover Key (specific to proof).

  • There's also an opportunity to save substantial space. In Nova, unlike SuperNova which builds the CommitmentKey separately, the CommitmentKey is encapsulated within public parameters. As a result, any serialization duplicates it. Mutualizing it among proofs will yield significant space savings (lurk-rs/issues/387).

refactor: replace direct SNARK with batched SNARK

The implementations of the batched SNARKs should be equivalent to the non-batched versions. We can replace the direct implementation with the batched one using a number of instances = 1.

  • direct: Conditionally squeeze outer_r challenge in cases where number of instances = 1.
  • pre-processing: Move all SumcheckEngine related structs from ppsnark.rs to sumcheck.rs
  • Implement RelaxedR1CSSNARK for batched_ppsnark::RelaxedR1CSSNARK and batched::BatchedRelaxedR1CSSNARK using vectors of size 1.

This should help with maintenance going forward as we will only have to maintain 2 SNARKs rather than 4.

Use common serialisation format for Pasta and Grumpkin curves

Field elements from proofs / public parameters are currently being serialised differently depending on the curve. For example, for the Pasta curves we have:

"digest":"64288c7ef43914fee16e90774bebe8727d734b28c9cee7c0b35e97a0c27efd02"

while for Grumpkin:

"digest":[560067867113498870,9482374276275323153,13618845702275708329,2842217616396256311]

If there is no actual reason for the difference, the first (Pasta) variant of serialisation is preferable, since it doesn't require performing an elaborate Montgomery reduction while importing the field element. Our verifier uses a data loader tailored to the Pasta variant, which, for a correct uint256 import into the Solidity environment, requires just reversing the bytes.

The serialisation that I used for testing was added with the following code (here):

// serialize compressed snark
let serialized_compressed_snark = serde_json::to_string(&compressed_snark).unwrap();
std::fs::write(std::path::Path::new("compressed-snark-grumpkin.json"), serialized_compressed_snark).expect("Unable to write file");

// serialize verifier key
let serialized_vk = serde_json::to_string(&vk).unwrap();
std::fs::write(std::path::Path::new("verifier-key-grumpkin.json"), serialized_vk).expect("Unable to write file");
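
For illustration, converting the limb form above into the Pasta-style hex form is a matter of flattening the limbs to bytes. This sketch ignores Montgomery representation entirely (real code must first convert out of Montgomery form), and the little-endian limb and byte order is an assumption:

```rust
/// Convert four little-endian u64 limbs of a 256-bit field element into the
/// byte-hex form used for the Pasta curves. Montgomery conversion is omitted;
/// the limb/byte order is assumed for illustration.
fn limbs_to_hex(limbs: [u64; 4]) -> String {
    limbs
        .iter()
        .flat_map(|l| l.to_le_bytes())
        .map(|b| format!("{:02x}", b))
        .collect()
}

fn main() {
    let h = limbs_to_hex([1, 0, 0, 0]);
    assert_eq!(h.len(), 64); // 32 bytes, 2 hex chars each
    assert!(h.starts_with("01"));
    assert!(h.ends_with("00"));
}
```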

Supernova completion

This tracks the supplementary work we want to accomplish with SuperNova before we upstream it in a PR to Microsoft/Nova:

  • #94
  • #53
  • rationalize the circuit index passing (#64 #175)
  • Make clear iter_base_step is just a constructor (#63)
  • transform the NIVC examples into tests (see also #52, closed by #92)
  • finish the CompressedSNARK V0 implementation (#80)
  • finish the CompressedSNARK V1 implementation << (@mpenciak / @adr1anh to flesh out what that V1 looks like)
  • #93

Avoid copies of the public parameters in Supernova's Running claims

It seems quite essential to make sure that the params field of RunningClaim is an Rc to a PublicParams singleton; this sort of duplication is quickly going to be unmanageable.

Note that the only reason I could see for manipulating a Vec<RunningClaim<...>> is to be able to index into it. We could have a struct RunningClaim<...> which would host the public parameter reference along with some impl Index and impl IndexMut impls. Shall we open an issue?

Originally posted by @huitseeker in #23 (comment)

Improve NovaError

On the zeromorph branch, we're relying on NovaError::InvalidIPA for the errors in our PCS.
We should have instead:

  • a PCSError that's common to all PCSes,
  • one of its variants should include InvalidLength, as we have ~20 places where we haphazardly compare the lengths of two arguments and error if they're not equal,
  • it should be defined using thiserror,
  • it should be embedded as a variant of NovaError (which already uses thiserror, see the "from" annotation for that crate), probably replacing the IPAError variant of NovaError.

Deduplicate minroot and minroot-serde examples

We're ferrying a minroot_serde example (since 2f544af) which duplicates the entire minroot example to demonstrate the fast (de)serialization of public parameters.

That's a lot of real estate for a relatively simple demonstration. Could the minroot & minroot_serde examples perhaps be merged, possibly switching the behavior with either a CLI parameter or a feature?

SuperNova CompressedSNARK benchmark

This is to track the benchmark introduced in #131 and was run at this commit: 1e1658b

Note that a verification error and/or failure occurred on the largest circuit size below.

➜  arecibo git:(batched_spartan) cargo criterion --bench compressed-snark-supernova
   Compiling nova-snark v0.31.0 (/Users/clwk/fil/arecibo)
    Finished bench [optimized] target(s) in 19.00s
Gnuplot not found, using plotters backend
Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-0/Prove: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 8.1s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-0/Prove                                                                          
                        time:   [804.57 ms 812.42 ms 822.02 ms]
CompressedSNARKSuperNova-1circuit-StepCircuitSize-0/Verify                                                                           
                        time:   [32.799 ms 33.233 ms 33.534 ms]

Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-6565/Prove: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 11.6s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-6565/Prove                                                                          
                        time:   [1.1734 s 1.1825 s 1.1928 s]
CompressedSNARKSuperNova-1circuit-StepCircuitSize-6565/Verify                                                                           
                        time:   [47.941 ms 48.262 ms 48.729 ms]

Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-22949/Prove: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 19.4s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-22949/Prove                                                                          
                        time:   [1.9019 s 1.9187 s 1.9358 s]
CompressedSNARKSuperNova-1circuit-StepCircuitSize-22949/Verify                                                                           
                        time:   [64.626 ms 65.357 ms 66.119 ms]

Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-55717/Prove: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 33.7s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-55717/Prove                                                                          
                        time:   [3.3087 s 3.3360 s 3.3650 s]
Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-55717/Verify: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.7s or enable flat sampling.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-55717/Verify                                                                          
                        time:   [102.73 ms 103.94 ms 104.77 ms]

Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-121253/Prove: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 62.4s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-121253/Prove                                                                          
                        time:   [6.2778 s 6.3212 s 6.3741 s]
CompressedSNARKSuperNova-1circuit-StepCircuitSize-121253/Verify                                                                          
                        time:   [185.32 ms 187.87 ms 190.58 ms]

Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-252325/Prove: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 121.6s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-252325/Prove                                                                          
                        time:   [12.183 s 12.275 s 12.359 s]
CompressedSNARKSuperNova-1circuit-StepCircuitSize-252325/Verify                                                                          
                        time:   [367.57 ms 371.06 ms 376.51 ms]

Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-514469/Prove: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 244.5s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-514469/Prove                                                                          
                        time:   [24.369 s 24.774 s 25.280 s]
Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-514469/Verify: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.5s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-514469/Verify                                                                          
                        time:   [633.70 ms 652.49 ms 672.07 ms]

Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-1038757/Prove: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 464.0s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-1038757/Prove                                                                          
                        time:   [46.263 s 46.423 s 46.674 s]

Benchmarking CompressedSNARKSuperNova-1circuit-StepCircuitSize-1038757/Verify: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 13.0s.
CompressedSNARKSuperNova-1circuit-StepCircuitSize-1038757/Verify                                                                          
                        time:   [1.2770 s 1.2878 s 1.3057 s]

res failed NovaError(ProofVerifyError)
thread 'main' panicked at benches/compressed-snark-supernova.rs:251:9:
assertion failed: res.is_ok()
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error: Criterion.rs benchmark target compressed-snark-supernova exited with exit status: 101

Replace COO SpMVM by a CSR

#36 clearly shows that an SpMVM (Sparse Matrix-Vector Multiplication) is responsible for a significant amount of the folding time:

arecibo/src/r1cs.rs

Lines 169 to 172 in c360828

pub(crate) fn multiply_vec(
    &self,
    z: &[G::Scalar],
) -> Result<(Vec<G::Scalar>, Vec<G::Scalar>, Vec<G::Scalar>), NovaError> {

This operation is done on matrices in COO form, whereas the typical representation for such operations would be CSR form (wiki). These matrices are created in the synthesis of (Test)ShapeCS:

impl_nova_shape!(ShapeCS);
impl_nova_shape!(TestShapeCS);

Since those matrices are used only for SpMVM, we hope to replace all instances of COO by a CSR (unless something unexpected turns up, such as patterns of random access in the A, B, C matrices).

The reason that we care about this representation is that it will make all the matrix elements relevant in a row of matrix-vector multiplication adjacent in memory, which opens a path to more performant code, amenable to LLVM's auto-vectorizer.
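
A minimal sketch of the target CSR layout and its SpMV, with toy `u64` scalars standing in for field elements (names and the COO-to-CSR conversion are illustrative, not arecibo's code):

```rust
/// Minimal CSR sketch: row i's entries live in vals[row_ptr[i]..row_ptr[i+1]].
struct CsrMatrix {
    row_ptr: Vec<usize>,
    col_idx: Vec<usize>,
    vals: Vec<u64>,
}

impl CsrMatrix {
    /// Build CSR from COO (row, col, val) triplets, assumed sorted by row.
    fn from_coo(coo: &[(usize, usize, u64)], rows: usize) -> Self {
        let mut row_ptr = vec![0usize; rows + 1];
        for &(r, _, _) in coo { row_ptr[r + 1] += 1; }
        for i in 0..rows { row_ptr[i + 1] += row_ptr[i]; } // prefix sums
        let col_idx = coo.iter().map(|&(_, c, _)| c).collect();
        let vals = coo.iter().map(|&(_, _, v)| v).collect();
        CsrMatrix { row_ptr, col_idx, vals }
    }

    /// SpMV: each row's entries are adjacent in memory, so the inner loop
    /// streams contiguously, which is the pattern LLVM can auto-vectorize.
    fn multiply_vec(&self, z: &[u64]) -> Vec<u64> {
        (0..self.row_ptr.len() - 1)
            .map(|i| {
                (self.row_ptr[i]..self.row_ptr[i + 1])
                    .map(|k| self.vals[k] * z[self.col_idx[k]])
                    .sum::<u64>()
            })
            .collect()
    }
}

fn main() {
    // 2x3 matrix with entries (0,0)=1, (0,2)=2, (1,1)=3.
    let coo = [(0, 0, 1u64), (0, 2, 2), (1, 1, 3)];
    let csr = CsrMatrix::from_coo(&coo, 2);
    assert_eq!(csr.multiply_vec(&[4, 5, 6]), vec![16, 15]);
}
```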

ZeroMorph: Implement Zx

Zx is defined on page 17 of the ZeroMorph paper. We need to calculate this polynomial so that, in the main protocol (page 34), we can compute q_Z satisfying the relation Zx = (X - x)·q_Z.

Refactor Sumcheck

The ongoing work implementing batched Spartan introduces a lot of code duplication, and made some existing duplication more apparent.

Generalize use of SumcheckEngine

The SumcheckEngine used in ppsnark.rs makes it easier to batch multiple Sumcheck claims into one instantiation.
With the changes introduced by the batched_ppsnark module, it also efficiently handles batching instances of different sizes without padding.

Since the batched SNARK requires handling instances of different sizes, it would be nice to use a common implementation to handle batching of multiple instances.

Moreover, this would allow more code reuse by noting that we are often evaluating the same type of Sumcheck claim (for example, proving a multi-linear evaluation, or proving $0 = \sum_x eq(x)\cdot(Az(x)\cdot Bz(x) - u \cdot Cz(x) - E(x))$).

Implement RelaxedR1CSSNARK through BatchedR1CSSNARK

The implementations of the BatchedR1CSSNARKTrait copy a lot of code from their original non-batched versions. To minimize duplication, it should be possible to implement the non-batched variants by using a batching of 1.

The overhead would mainly consist of having vectors containing only a single element.
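
The delegation pattern can be sketched as a blanket implementation. The trait names and the `u64` "instances" below are toy stand-ins for the real `RelaxedR1CSSNARKTrait` / `BatchedRelaxedR1CSSNARKTrait`:

```rust
// Hypothetical names; only the delegation pattern is the point.
trait BatchedSnark {
    fn prove_batch(instances: &[u64]) -> Vec<u64>;
}

trait Snark {
    fn prove(instance: u64) -> u64;
}

struct Batched;
impl BatchedSnark for Batched {
    fn prove_batch(instances: &[u64]) -> Vec<u64> {
        // Stand-in for the real batched prover.
        instances.iter().map(|x| x * 2).collect()
    }
}

// Blanket impl: any batched SNARK yields a non-batched one via a batch of
// size 1, so only the batched implementation needs maintaining.
impl<T: BatchedSnark> Snark for T {
    fn prove(instance: u64) -> u64 {
        T::prove_batch(&[instance])[0]
    }
}

fn main() {
    assert_eq!(<Batched as Snark>::prove(21), Batched::prove_batch(&[21])[0]);
}
```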

Avoid computing evaluations of pow(tau)

Given the well-structured nature of $pow(\tau)$, we know that its list of evaluations is $[1, \tau, \tau^2, \ldots, \tau^{2^{n}-1}]$ (or rather, given the evaluation order of the variables, each power $\tau^i$ is placed at the index corresponding to the bit-reversed $i$).

If we are computing $\sum_x pow(\tau;x)\cdot F(P_1(x),P_2(x), \ldots)$, then we can more efficiently compute the univariate polynomial in round $j$ (given previous challenges $r_0,\ldots,r_{j-1}$) as

$$ S_j(X_j) = \left[ \prod_{k=0}^{j-1} \left( (1-r_k) + r_k \cdot \tau^{2^{n-k-1}} \right) \right] \left( (1-X_j) + X_j \cdot \tau^{2^{n-j-1}} \right) \left[ \sum_{i=0}^{2^{n-j-1}-1} \tau^i \cdot S'_{j,i}(X_j) \right] $$

where

$$ S'_{j,i}(X_j) = F(r_0, \ldots ,r_{j-1},X_j, i_0,\ldots,i_{n-j-2}), \quad i = (i_0, \ldots, i_{n-j-2}), i = \sum_{k=0}^{n-j-2} i_k \cdot 2^k $$

The last polynomial corresponds to one of the terms in the sum that comes up when computing the evaluation_points of the combination function in Sumcheck.

The neat property here is that we can factor out the term $\left( (1-X_j) + X_j \cdot \tau^{2^{n-j-1}} \right)$ from the polynomial, which means we need to evaluate it at one fewer point.

If we are using Sumcheck to check that $e = \sum_x pow(\tau;x)P(x)$, then we can compute the evaluations using approximately half as many operations.
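
The structured evaluation table itself can be built cheaply via its tensor structure: the table over $m$ variables is the table over $m-1$ variables concatenated with a copy scaled by $\tau^{2^{m-1}}$. A sketch (toy modulus; bit-reversal of the variable order is ignored here):

```rust
const P: u64 = 2_147_483_647; // toy prime modulus, for illustration only
fn mulmod(a: u64, b: u64) -> u64 { ((a as u128 * b as u128) % P as u128) as u64 }

/// Build the 2^m evaluations of pow(tau) = [1, tau, tau^2, ...] by doubling:
/// at round k, append a copy of the current table scaled by tau^(2^k).
fn pow_tau_evals(tau: u64, m: usize) -> Vec<u64> {
    let mut evals = vec![1u64];
    let mut tau_pow = tau; // tau^(2^k) at round k
    for _ in 0..m {
        let upper: Vec<u64> = evals.iter().map(|&e| mulmod(e, tau_pow)).collect();
        evals.extend(upper);
        tau_pow = mulmod(tau_pow, tau_pow);
    }
    evals
}

fn main() {
    // For tau = 3, m = 3 the table is exactly the powers 3^0 .. 3^7.
    assert_eq!(pow_tau_evals(3, 3), vec![1, 3, 9, 27, 81, 243, 729, 2187]);
}
```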

Adjust Supernova APIs for verification

This is the sister issue to lurk-lab/lurk-rs#695, which portends that the SuperNova APIs for verification will need adjustment for Lurk:

/// verify recursive snark
pub fn verify<C1: StepCircuit<G1::Scalar>, C2: StepCircuit<G2::Scalar>>(
    &self,
    pp: &PublicParams<G1, G2, C1, C2>,
    circuit_index: usize,
    z0_primary: &[G1::Scalar],
    z0_secondary: &[G2::Scalar],
) -> Result<(), SuperNovaError> {

Please refer to the discussion in the Lurk issue, which hints at a convenience API that may receive an array of witnesses for each step and aggregate them on the verification side.

`supernova::RecursiveSNARK::iter_base_step()` does not iterate `i`

The current setup for RecursiveSNARK::iter_base_step() does not increment RecursiveSNARK's internal step counter i. This leads to some strange logic for the first invocation of prove_step() that could potentially lead to unexpected behavior and off-by-one errors.

I believe it should be possible to get rid of the above code snippet and retain logical consistency if i is incremented in the first step.

It may also be worth taking the time to rename i to something clearer, like num_steps.

Optimizing the memory pipeline

This is an umbrella issue for optimizing the memory allocations in arecibo. The original all-in-one PR is #118; however, to have a higher degree of confidence in the breakdown of optimizations, we split the changes into several self-contained PRs. For each of these separate PRs, we provide benchmarks to show that they individually contribute to better performance.

Tracking PRs

TODOs

  • Port changes to SuperNova (done in #227)
  • Port changes to ParaNova

Implement a fancier msm for halo2curves-related crates

This task is about reaping the performance improvements of the CPU MSM by using the h2c implementations (which are benchmarked against, and improve upon, the Zcash implementation, i.e. the status quo).

Unify `nova::PublicParams` and `supernova::PublicParams`

supernova::PublicParams will become the only PublicParams of nova_snark. The current nova::PublicParams will be removed, and instead be replaced by the special case of supernova::PublicParams where pp.circuit_shapes.len() == 1.

Add a digest to circuits

From: https://zulip.lurk-lab.com/#narrow/stream/22-inc/topic/Push.20to.20Coprocessors/near/109810

porcuquine:

Unfortunately, once the constraints change, the public params for the coproc will fall out of sync. We don't do any smarter checking as of now.

Now that we are getting into coprocessors, it's fairly essential that we fix this. It's going to eat up an increasing amount of the team's time the longer it goes unaddressed. I suggest we include the digest from the Nova params in the cache key — or something morally equivalent. I haven't thought about this since previously, but in poking at SuperNova, I've seen that the digest does now exist, so…

porcuquine: The most important point is that we need a digest of the circuit(s) so that if a circuit changes in any way the public params will also change.

porcuquine: The easiest way to do this probably involves using Nova's ability to compute a digest over the pp components, including the circuit shapes. It might be that we can/should add some trait that lets us get a 'partial digest' sufficient for our purposes to arecibo, then use that.

[Zeromorph] Avoid decompressing G2 points

Zeromorph verification executes pairing check with the input composed in the following way:

let pi = proof.pi;
let pairing_inputs = [
    (&c, &(-vk.s_offset_h).into()),
    (
        &pi,
        &(E::G2::from(vk.vp.beta_h) - (vk.vp.h * x))
            .to_affine()
            .into(),
    ),
];

where vk.s_offset_h, vk.vp.beta_h and vk.vp.h are G2 points. In serialised form, compressed G2 points are represented as 64 bytes:

. . .
"vk_ee":{
   "vp":{
   "g":"2c6a92f8691642e6f4830e904e7c33b3b08e943e732154eca7897cd354a9bd59",
   "h":"40fe7193ec3e7adb474a338a8b55aefd065fd1d038cb80934ea313ec1c66762f9d6606bfcfb440ce81e306136d092e3e5302b422925129d2b71e80fe39bde614",
   "beta_h":"5fb81e39772b0e58a3c2620c694a22c50a7802245255b751f6896a174ba6852e851461a2fb9e5894d095827dd6b0e424d4622c5a8d2ee3a07c84bbb1e130a665"
   },
   "s_offset_h":"5fb81e39772b0e58a3c2620c694a22c50a7802245255b751f6896a174ba6852e851461a2fb9e5894d095827dd6b0e424d4622c5a8d2ee3a07c84bbb1e130a665"
}
. . .

Pairing check on Solidity side (Ethereum's precompile) requires input in specific form, where G2 points are decompressed. Decompressing points on Solidity side is just wasting gas, since it is not actually a useful computation required to be performed on-chain.

Can we update the serialisation of the CompressedSNARK and VerifierKey structures to store G2 points in uncompressed form (and exclude the implicit decompression from the implementation of the reference verifier)?

Probably related to lurk-lab/solidity-verifier#40

SuperNova: Enforce `SecondaryCircuit == TrivialSecondaryCircuit`

The current implementation does not support multiple circuits on the secondary curve (this limit is hard-coded). With CycleFold, computation will only be done on the primary curve. The idea of this proposal is to simplify the current API to only handle circuits on the primary curve, always instantiating the secondary circuit as the TrivialSecondaryCircuit. Users would no longer need to supply it manually, eliminating boilerplate. The RecursiveSNARK could make this assumption too, and avoid keeping track of z0_secondary and zi_secondary entirely. The NonUniformCircuit trait would be generic over C1 only, and secondary_circuit() could be removed.

This proposal would make the current API match more closely with the future implementation of CycleFold, and prevent some breaking changes later down the road.

On a smaller note, it seems like the TrivialStepCircuit can have arity 0, which should save 4 absorbs in Nova as well since we can avoid hashing z0, zi in the inputs and outputs computations.

Compressed SNARK for SuperNova

#2 introduces SuperNova but does not provide a compressed SNARK implementation.

v0 could require a separate compressed SNARK for each circuit's final witness — with size and verification costs linear in the number of sub-circuits.

v1 should provide a single SNARK that proves all sub-circuits as one larger circuit.

CI: push tests vs merge groups

Our current CI triggers on push to dev as well as merge groups:

on:
  merge_group:
  push:
    branches: [ dev, zeromorph ]
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
    branches: [ dev, zeromorph ]

  • the push tests are for the most part redundant,
  • the check-lurk-compiles job is meant only as a warning, and so is useless on merge_group or push.

Adjust ZeroMorph for EvaluationEngineTrait

KZG is a building block that allows us to commit to univariate polynomials. ZeroMorph uses that building block to construct a commitment to multilinear polynomials, which can be used by Nova.

Part of the interfaces are already prepared for this trait, since polynomials and input points are represented as &[Fr], and the evaluation of this polynomial at this point is &Fr.

However, other adjustments are necessary; namely, take a look at things like CommitmentEngineTrait, CommitmentKey, ProverKey, VerifierKey, etc.

Circuit soundness bug: unchecked program counter

The function get_from_vec_alloc_relaxed_r1cs does not check that program_counter is valid (i.e. in the range $\{0, 1, \ldots, \ell-1\}$). If program_counter is outside this range, the function returns the first RelaxedR1CSInstance by default.

This can be fixed by enforcing that the sum of all equal_bits (including one for the first instance) is 1. This ensures that exactly one of the equal_bits is set, and therefore program_counter is in the correct range.
Another possibility would be to ensure the returned index is equal to the input target_index.

Since the returned index is the same as the input target_index, it is not necessary for the function to return it. Instead, the function could return a vector of the equal_bits, which are already recomputed during the computation of U_next.
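
The proposed fix can be modeled out-of-circuit as follows (the names are illustrative, not the actual gadget code): exactly one selector bit must be set, which forces program_counter into the valid range.

```rust
/// Out-of-circuit model of the selector bits: bit i is set iff i == pc.
fn equal_bits(program_counter: usize, l: usize) -> Vec<u64> {
    (0..l).map(|i| u64::from(i == program_counter)).collect()
}

/// The constraint the circuit should enforce: the sum of all equal_bits is 1,
/// which guarantees program_counter lies in {0, 1, ..., l-1}.
fn selector_is_sound(bits: &[u64]) -> bool {
    bits.iter().sum::<u64>() == 1
}

fn main() {
    assert!(selector_is_sound(&equal_bits(2, 4)));
    // An out-of-range program counter sets no bit, which the sum check catches.
    assert!(!selector_is_sound(&equal_bits(7, 4)));
}
```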
