zkemail / zk-email-verify
Verify any text in any sent or received email, cryptographically and via only trusting the sending mailserver.
Home Page: https://prove.email
License: MIT License
We can generate the witness while we download the zkey, thus saving about 12 seconds. Pretty low priority. This trick should be upstreamed to heyanon and circom-starter.
Since there are other sha256-rsa implementations in circom (like https://github.com/zkp-application/circom-rsa-verify), we should be able to use them to fuzz random inputs and ensure both circuits always give the same answer, or formally verify that they say identical things.
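The fuzzing half of this can be sketched as a small differential harness. Here `verifyA`, `verifyB`, and `genInput` are hypothetical wrappers (e.g. around our circuit's witness generation and circom-rsa-verify's) — the harness only checks that the two always agree:

```javascript
// Differential fuzzing sketch: run both implementations on the same
// generated inputs and report the first input on which they disagree.
// verifyA/verifyB/genInput are placeholders for real circuit wrappers.
function differentialFuzz(verifyA, verifyB, genInput, rounds = 100) {
  for (let i = 0; i < rounds; i++) {
    const input = genInput(i);
    if (verifyA(input) !== verifyB(input)) {
      return { agreed: false, counterexample: input };
    }
  }
  return { agreed: true, counterexample: null };
}
```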
snarkjs versions < 0.6.11 allow double spending. Bump all the forks of snarkjs to avoid double-spend hacks.
Allow anyone to mask any email and reveal only the from: email address plus their chosen subset of the body. You currently can't do something like proving ownership of an email in a domain, because BCCs break the soundness of that membership proof.
There are a few twitter packages with this line because the currently deployed circuits want it. We should recompile the circuits to not use this anymore and update those keys across the frontend and S3 buckets, and remove this signal entirely.
Zephyr reported that we might have a padding bug when the RSA key begins with zeros. Add a test for this in the circom tests for RSA for 1024 bit keys, for websites with poorer security.
Need to add a mitigation for a critical vulnerability: I can pretend to be another email address by crafting an address like <max_len_minus_10>@gmail.commydomain.com. Since the prefix <max_len_minus_10>@gmail.com already reaches max_len, the circuit truncates there and thinks I'm the latter person.
Easy to fix by ensuring the array index via QuinSelector like this pseudocode:
message_id_regex_reveal[message_id_idx + max_message_id_len] === 0
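The constraint above can be mirrored in plain JS to make the fix concrete: after the revealed window of `max_message_id_len` bytes starting at `message_id_idx`, the next byte of the padded array must be 0. In-circuit the variable index lookup would use a QuinSelector; here it is just an array access, and the names mirror the pseudocode:

```javascript
// Returns true iff the byte immediately past the revealed window is 0,
// which rules out the truncation splice described above (a crafted
// <long>@gmail.commydomain.com can't be cut into <long>@gmail.com).
function revealIsZeroTerminated(paddedBytes, messageIdIdx, maxMessageIdLen) {
  return (paddedBytes[messageIdIdx + maxMessageIdLen] ?? 0) === 0;
}
```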
Perhaps useful to get other people using the code here, but also a beginner task for me to read the code
For some reason, in the V0, either the regex or the packing fails: parsing the currency "TEST" from the circuit output fails and skips the last letter.
Create a script to update tokens not in Token registry based on the default uniswap list - https://github.com/Uniswap/default-token-list/tree/main/src/tokens
Needed for users to create new circuits.
This might be easiest as a fork/PR to zkrepl.dev, so that it's one click to start the 3-hour process of generating and making these files. We should be able to target specific architectures for the binaries via, for instance, passing -march=icelake-server -mtune=icelake-server to the C compiler, or target unknown in rust.
Ask me for the docker image for rapidsnark.
The vast majority of sent emails use 2048 bit RSA, but a minority of clients use 1024 bit RSA or 4096 bit RSA. Parameterize the circuits so that we can easily recompile circuits for different length keys. Likely will be done in tandem with issue 16 to streamline compilation of new regexes.
Start with a simple Puppeteer end-to-end test that pastes a valid email and address in, generates a proof, and verifies the proof. Benchmark this test on Browserstack to see where this code does and doesn't work.
Utilize hashing (either just naively hash all the public inputs and check a pre-image, or do Dmitry's new idea of efficient hashing on both sides of the proof) to reduce public input size for larger circuits. Not needed for Twitter verification for the time being, due to the convenience of small SHAs.
Requested by external team for use in Cosmos WASM. We should start by compiling some for proof of twitter and email wallet.
Run the 7-byte unpacking step (the helper function packedNBytesToString in the code) in JavaScript on the public inputs shown to the user, so they can more easily parse what is made public.
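A minimal sketch of what that unpacking looks like (the real packedNBytesToString helper's signature may differ): each field element packs up to 7 ASCII bytes little-endian, so the string is recovered by peeling off one byte at a time from each signal:

```javascript
// Unpack an array of field elements (as strings or BigInts), each packing
// up to bytesPerSignal ASCII bytes little-endian, into a readable string.
// Zero bytes are padding and are skipped.
function unpackSignals(signals, bytesPerSignal = 7) {
  let out = "";
  for (const s of signals) {
    let v = BigInt(s);
    for (let i = 0; i < bytesPerSignal; i++) {
      const byte = Number(v & 0xffn);
      v >>= 8n;
      if (byte !== 0) out += String.fromCharCode(byte);
    }
  }
  return out;
}
```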
Tasks
Add a button that deploys an autoscaled prover on the cloud for a quick circom proof. Add a note that your email address will be revealed to the server side prover as well as the password reset code.
On the sampritipanda fork: https://github.com/sampritipanda/snarkjs/commits/zkemail. The last commit in this fork takes vivek's snarkjs fork, upgrades it to the latest version of snarkjs, fixes merge conflicts randomly until it worked, and rewrites main.cjs by basically deleting and rebuilding the package, I think. This was needed because generate_witness.js was generated using a newer version of snarkjs/circom while vivek's version was an older one, which was causing some frontend issues.
It seems main branch works however, which is strange.
Benchmark the wasm generated by the Rust in ark-circom against the one generated by snarkjs, and see if the former gives significant speed or memory improvements.
For some reason, only email bodies downloaded from gmail clients work. There is probably some dumb string parsing issue when downloading the email body, but detecting this and fixing it would make it a lot more general of a system. Easy for yush to generate an email that can be read on the outlook and gmail clients, so just ask him for an email if you need a copy to get started on this issue!
From PR #90 (https://github.com/zkemail/zk-email-verify/pulls#issuecomment-1653670956) -- the Twitter tests seem to be broken in Docker.
Edit: They should be tested with BOTH the stable npm packages and the latest pushes to main branch here. The former is to verify that the website will still build, and the latter are to verify that changes to the core libraries have an e2e sanity test.
This may not even be possible. There are two routes I can see.
More elegant route. Find an email from Twitter that includes the follower count (this may not exist). Make a ZK circuit to mask just that out, and then convert to an integer and prove that you have an account with at least K followers. To write this body regex, good to quickly tackle issue 16 first to utilize a new regex.
Less elegant route. This will not work in the long term and adds an extra trust assumption on the Merkle tree calculation accuracy. You can make (and periodically re-upload) a huge merkle tree of all Twitter usernames, and the ZK circuit proves membership of your account in that tree, and masks out the follower count to the nearest power of 10. This doesn't really work since you trust that the person constructing the Merkle tree didn't screw with it, which is very hard.
Raw DKIM solidity doesn't work due to calldata blowup. Compare the gas usage of an already existing solidity implementation, with and without calldata, to this implementation.
We made a new version of zk-regex and added circom circuits for common regexes to zk-regex-circom.
We will integrate zk-email-verify with zk-regex.
TODO:
When running the twitter demo, I got this error after witness gen:
twitter-verifier-zkeys.s3.amazonaws.com/e388b82/twitter.zkeyj.gz:1 Failed to load resource: net::ERR_CONNECTION_RESET
Storage of twitter.zkeyh.gz successful!
Storage of twitter.zkeyj.gz unsuccessful, make sure IndexedDB is enabled in your browser.
index-df1a66e7.js:226 TypeError: Failed to fetch
index-df1a66e7.js:454 Circuit inputs: Object
index-df1a66e7.js:454 zk-dl: 909916.0541992188 ms
index-df1a66e7.js:454 Starting proof generation
index-df1a66e7.js:226 generating proof for input
index-df1a66e7.js:160 witness calculation: 49734.8779296875 ms
index-df1a66e7.js:158 Uncaught (in promise) Error: Reading out of bounds
at uZ.readToBuffer (index-df1a66e7.js:158:21140)
at uZ.read (index-df1a66e7.js:158:21359)
at iF (index-df1a66e7.js:158:252217)
at async Rf (index-df1a66e7.js:158:252813)
at async nCe (index-df1a66e7.js:158:281177)
at async eSe (index-df1a66e7.js:160:6829)
at async kSe (index-df1a66e7.js:226:10996)
at async onClick (index-df1a66e7.js:454:11626)
The page then hung on Status: generating-proof. We need to add
Make this an argument, not a hardcoded twitter verifier url.
helpers/src/dkim has a bunch of JS. Replace that with TS.
Use the ENS DNSSEC proveAndCheck contract to automatically update the DNS key. Requires checking that it is the mail record URL, and automatically parsing the bytes out into our 17ish packed signals. Low priority since so few websites use DNSSEC.
The .eml used has a base64-encoded attachment. Running generate_input.ts generates this error:
Error: No public key found on DKIM verification result
Logging result.results[0] from result = await dkimVerify(email); yields this output:
So the bodyHash and bodyHashExpecting do not match, so it's failing here:
Instead of outputting the entire RSA key to check against the solidity contract, output the hash so we can save calldata gas. This requires
Not sure why this happens, but it seems to happen occasionally on fresh computers, and I cannot reproduce the bug. Perhaps a decent 80/20 here: if the files don't start downloading and don't exist for, say, 10 seconds straight, tell the user to refresh the page.
Move twitter example to another repo (fork-able by others to build on top of zk-email) to keep this one cleaner and only for libraries.
In progress here - https://github.com/zkemail/proof-of-twitter
Following these instructions should help us get rid of create-react-app (CRA), which slows down development and load times due to bloat. Vite is also much cleaner than webpack, which has caused me hell in the past, especially with TypeScript and ECMAScript version incompatibilities.
Updating create-react-app broke a bunch of stuff so I think this is easier than upgrading. But it's also a second task here to update as many packages as possible without breaking the app.
We need to cleanup the helpers package etc so these imports can come from the root!
You are importing from @zk-email/helpers/src which is the .ts code - to transpile ts code in node_modules we would need additional config
You can change the import to /dist - import { toCircomBigIntBytes } from "@zk-email/helpers/dist/binaryFormat";
The .tar.gz decompression step doesn't work -- it compresses fine and uploads fine to the s3 bucket (I think), but when downloading and decompressing (see zkp.ts in targz_frontend branch), the files are bigger than when they were uploaded, and the snarkjs fullProve step gives the error 'zkeyb invalid file format'. This details the error.
Solving this will let us add compressed downloads, halving the download size (decompression is very fast). I think we need to use the zlib library in JS to do this: https://nodejs.org/api/zlib.html#zlib_zlib_unzip_buffer_options_callback.
Can use Tunnel (https://tunnel.dev/) for this, or any other collaborative PR-preview provider.
Currently, we load the zkey in chunks to ensure that we can actually fit it into memory. However, there is no verifier.sol generated. So, we either need to fork snarkjs to generate a verifier.sol from the chunked zkeys, or also output a raw zkey that corresponds to the same chunked zkey, that we can call the normal snarkjs verifier generation from.
Edit: This is now WIP at https://github.com/foolo/dkim-lookup!
DKIM is usually a nested DNS record. For instance, for replit, we can see here: https://easydmarc.com/tools/dkim-lookup?domain=replit.com that the DKIM is under the selector "google" and has the value:
Selector: google
Record value:
v=DKIM1; k=rsa; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAk6RNxaxuNyiPhlH6rlgMOXNTaffcVsK+3E6lK1x8c7MO0w7on9zmaiApGE/2hBWQqRpy6EmRdUf6MJH5TmwM++51W4xR0TmTd1JvsbBR/9yjpR++vOahVkrdh0xPaq1zghHYaqNgsOThivw8Hgd8xWQzPPDcw7T+czQS0/Xe/nijU0dVlQX/s+evJpxP7VV/FzlMQvknMj1bCqAgzUFa1mXMO/ZfzHirpGVcJ+h1fMYOIzU4iV3KUIn6i1mg3T+Kw41MFW04F/4nnIQKTTFNGuI+T+6Ss1M1VcjlAxlwYZCJPE0Iy3cOWRBWsgXFZWx2rATlEtkasmf1NFpJu1nATwIDAQAB
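A record like the one above is just semicolon-separated tag=value pairs, so extracting the key fields (v, k, p) is straightforward. A minimal sketch (the function name is illustrative; a robust parser would also handle quoted values and folding):

```javascript
// Parse a DKIM TXT record's tag=value pairs into an object.
// Splitting on the first "=" per segment keeps base64 "=" padding
// in the p value intact.
function parseDkimRecord(record) {
  const tags = {};
  for (const part of record.split(";")) {
    const eq = part.indexOf("=");
    if (eq === -1) continue;
    tags[part.slice(0, eq).trim()] = part.slice(eq + 1).trim();
  }
  return tags;
}
```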
Scrape the Alexa top 1M websites (and a list of 50 websites that we manually add) for their DKIM keys every day, and archive all the answers in a simple UI where someone can just type in a website and see all its past DKIM keys. Note that these DNS records change roughly daily, and we want all selectors, including non-Google ones. Looking for a simple frontend, as well as a script that can be run daily without being ratelimited. I recommend hitting DNS directly.
One way to do this is in python, use something like pydig to query the data, store it in a postgreSQL database, and provide a fastapi webserver for browsing it. Approximately 400 non-compressible bytes per entry times 1M sites changing daily would be a max of 400MB per day of data (thanks npulido for the suggestion).
Eventually, include dynamic checking (i.e. for each site, store the gap between the last n checks, and check more often around the distribution of those times).
Enable the --O1 and --O2 flags in circom and see how much the constraint count goes down. If it's significant, make sure unconstrained public variables used in the circuit aren't optimized out (i.e. any unused variables used to constrain or something), then update the build scripts and built circuits.
Thinking needed for website to create feedback loop
Done
Inside the body are the attachments. Create a circuit that handles the encoding/decoding of that section, and proves validity of some sub-part of that attachment. Note that this is probably blocked on making the circuits way faster via lookups or faster proving systems, because currently SHA is unsustainable at that scale.
This concerns the helpers package on DKIM verification and input generation
X-Message-ID to Message-ID
#90 -- https://github.com/zkemail/zk-email-verify/pulls#issuecomment-1653670956
Improved array indexing may reduce circuit size.
Have the regex match "this email was sent to" or "this email was meant for", and ensure there are enough HTML divs around it that it can't be injected. This will let the user prove Twitter ownership from any email. Also increase the max size of the email body to accommodate this.
This will remove dependence on the password reset email, and any follower notification email can be used.
If DKIM fails, try replacing all TABs in body with spaces: "another weird case, is the email supports TABs (ascii 9) rather than spaces". Note that this is an easy find and replace of tabs > spaces on the frontend + in the body parsing js code.
If the email verification fails, it might be due to forwarding, which can insert labels that convert the subject from "This is the subject" to "[Label] This is the subject". We have to strip the label from the beginning of the line in cases where DKIM verification fails.
Note that we would need to test all 2^n permutations of edge cases (here n = 2, as the emails can be tabbed/not tabbed and labeled/not labeled), adding exponentially more time to verify in the case of a failure.
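The two normalizations described above (this issue and the TABs issue) are simple string transforms to try before re-running DKIM verification. A sketch with illustrative function names:

```javascript
// Replace TABs (ASCII 9) with spaces in the email body, for mailers
// that tab-indent rather than space-indent.
function normalizeBody(body) {
  return body.replace(/\t/g, " ");
}

// Strip a forwarding "[Label] " prefix from the Subject line, e.g.
// "[Label] This is the subject" -> "This is the subject".
function stripForwardingLabel(subject) {
  return subject.replace(/^\[[^\]]*\]\s*/, "");
}
```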
I think the zkp2p fixes are here:
see: https://github.com/zkp2p/zk-p2p/blob/develop/client/src/components/ProofGen/validation/hdfc.tsx#L85
and: https://github.com/zkp2p/zk-p2p/blob/develop/client/src/components/ProofGen/validation/venmo.tsx#L62
We can use identical code and primitives to prove JWTs instead. Write this primitive in a branch. Contact yush for how to do this!