zk-email-verify's People

Contributors

0xsachink, alvinlee001, bisht13, curryrasul, dependabot[bot], dhoni77, divide-by-0, ecnerwala, foolo, isidroamv, javiersuweijie, lermchair, metachaser24, ocolus1, richardliang, riverruby, rmccrear, s-wu, saleel, sampritipanda, sorasuegami, stevenhao, wshino

zk-email-verify's Issues

Write email confessions prototype

Allow anyone to mask any email and reveal only the from: email address and a chosen subset of the body. Proving ownership of an email address in a domain is not currently possible, because BCCs break the soundness of that membership proof.

Remove address_plus_one

There are a few twitter packages with this line because the currently deployed circuits want it. We should recompile the circuits to not use this anymore and update those keys across the frontend and S3 buckets, and remove this signal entirely.

Max len email addresses can be spoofed

We need a mitigation for a critical vulnerability that lets me pretend to be another email address: if my address is <max_len_minus_10>@gmail.commydomain.com, then <max_len_minus_10>@gmail.com reaches max_len, so the circuit truncates the address there and treats me as the latter person.

This is easy to fix by checking, via QuinSelector, that the array is zero just past the maximum length, as in this pseudocode:
message_id_regex_reveal[message_id_idx + max_message_id_len] === 0
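
The check in the pseudocode above can be modeled outside the circuit. The following is a minimal TypeScript sketch, not circom: `assertTerminated`, `maskBytes`, `reveal`, `startIdx`, and `maxLen` are hypothetical names, and the plain array lookup stands in for the QuinSelector a circuit would need for a signal-dependent index. It assumes the reveal array holds the matched bytes at their original positions and 0 everywhere else.

```typescript
// Off-circuit model of the truncation check (hypothetical names throughout).
function assertTerminated(reveal: number[], startIdx: number, maxLen: number): void {
  const end = startIdx + maxLen;
  if (end >= reveal.length) {
    throw new Error("reveal array must extend one slot past the extracted window");
  }
  // In a circuit this lookup would go through a selector such as QuinSelector,
  // because `end` depends on a signal rather than a compile-time constant.
  if (reveal[end] !== 0) {
    throw new Error("match continues past maxLen: extracted value would be truncated");
  }
}

// Builds a reveal array: matched bytes kept in place, everything else zeroed.
function maskBytes(text: string, start: number, len: number): number[] {
  return Array.from(text, (ch, i) => (i >= start && i < start + len ? ch.charCodeAt(0) : 0));
}

// Honest case: the 13-character address ends exactly at the window boundary.
assertTerminated(maskBytes("from:bob@gmail.com\r\n", 5, 13), 5, 13);

// Spoofed case: the regex match keeps going past the 13-character window,
// so the slot just after the window is non-zero and the check throws.
try {
  assertTerminated(maskBytes("from:bob@gmail.commydomain.com\r\n", 5, 25), 5, 13);
} catch (err) {
  console.log("rejected:", String(err));
}
```

The design point is that truncation is only safe when the match has provably ended inside the window, so one extra array slot is checked instead of trusting the cut-off value.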

"TEST" parsed as "TES" or "TE"

For some reason, in V0, either the regex or the packing fails: a currency value of "TEST" is parsed from the circuit output with its last letter or letters missing.

Add automatic C-based witness generation + rapidsnark prover for arbitrary circom

Needed for users to create new circuits.

This might be easiest as a fork/PR to zkrepl.dev, so that it's one click to start the 3 hour process of generating and making these files. We should be able to target specific architectures for the binaries via for instance passing -march=icelake-server -mtune=icelake-server to the C compiler, or target unknown in rust.

Ask me for the docker image for rapidsnark.

Add end to end tests + benchmark across browsers

Start with a simple Puppeteer end-to-end test that pastes a valid email and address in, generates a proof, and verifies the proof. Benchmark this test on Browserstack to see where this code does and doesn't work.

Compress public data signals

Use hashing (either naively hash all the public inputs and check a pre-image, or use Dmitry’s new idea of efficient hashing on both sides of the proof) to reduce public input size for larger circuits. Not needed for Twitter verification due to convenience of small SHAs for the time being.

Export Rust verifier

Requested by external team for use in Cosmos WASM. We should start by compiling some for proof of twitter and email wallet.

Parse 'public info' to make more readable

Run the 7-byte unpacking step in JavaScript (the helper function is called packedNBytesToString in the code) on the public inputs shown to the user, so they can more easily parse what is made public.
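
To illustrate what such an unpacking step involves, here is a minimal sketch. It is not the repository's packedNBytesToString: `unpackBytesToString` and `packString` are hypothetical names, and the byte order (first character in the least-significant byte) is an assumption that would need to be checked against the circuit's packing.

```typescript
// Sketch of unpacking field elements that each carry up to `n` packed bytes.
// Assumption: byte 0 sits in the least-significant position (little-endian).
function unpackBytesToString(packed: bigint[], n: number = 7): string {
  const chars: string[] = [];
  for (const word of packed) {
    let rest = word;
    for (let i = 0; i < n; i++) {
      const byte = Number(rest & 0xffn);
      rest >>= 8n;
      if (byte !== 0) chars.push(String.fromCharCode(byte)); // skip zero padding
    }
  }
  return chars.join("");
}

// Inverse helper, used here only to build example input.
function packString(text: string, n: number = 7): bigint[] {
  const words: bigint[] = [];
  for (let start = 0; start < text.length; start += n) {
    let word = 0n;
    for (let i = 0; i < n && start + i < text.length; i++) {
      word |= BigInt(text.charCodeAt(start + i)) << BigInt(8 * i);
    }
    words.push(word);
  }
  return words;
}

// Public signals usually arrive as decimal strings, so convert with BigInt first.
const signals = packString("zk_email_user").map((w) => w.toString());
console.log(unpackBytesToString(signals.map((s) => BigInt(s))));
```

A round trip like this is also a cheap way to test that no trailing characters are dropped at a word boundary.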

Improve DKIM key update flow

Tasks

  • Add more mailservers
  • Create a tool to automatically fetch the DKIM key for a given URL, split it into the big ints, and hash it
  • Create a tool to automatically parse and archive all DKIM keys for all websites (like this, since existing tools do not archive DKIM keys)
  • Create a cron job that validates the current DKIM keys and pings us if one changes. If DNSSEC can update the key, it does so; otherwise it tells us to update it manually.
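
The "split it into the big ints" step from the task list could look roughly like the sketch below. The limb width (121 bits) and limb count (17) are assumptions for a 2048-bit RSA modulus, since the issue does not state them, and `splitIntoLimbs` and `joinLimbs` are hypothetical names. Fetching the key over DNS and hashing the limbs are left out because the issue does not name the hash function.

```typescript
// Splits a non-negative integer into `limbCount` limbs of `bitsPerLimb` bits,
// least-significant limb first (the ordering is an assumption).
function splitIntoLimbs(value: bigint, bitsPerLimb: number, limbCount: number): bigint[] {
  if (value < 0n) throw new Error("value must be non-negative");
  const shift = BigInt(bitsPerLimb);
  const mask = (1n << shift) - 1n;
  const limbs: bigint[] = [];
  let rest = value;
  for (let i = 0; i < limbCount; i++) {
    limbs.push(rest & mask);
    rest >>= shift;
  }
  if (rest !== 0n) throw new Error("value does not fit in the requested limbs");
  return limbs;
}

// Recombines limbs; handy for checking that the split lost nothing.
function joinLimbs(limbs: bigint[], bitsPerLimb: number): bigint {
  return limbs.reduceRight((acc, limb) => (acc << BigInt(bitsPerLimb)) | limb, 0n);
}

// Stand-in for a 2048-bit modulus; a real one would come from the DKIM record.
const exampleModulus = (1n << 2047n) + 12345n;
const limbs = splitIntoLimbs(exampleModulus, 121, 17);
console.log(limbs.length, joinLimbs(limbs, 121) === exampleModulus);
```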

Add server-side prover option

Add a button that deploys an autoscaled prover on the cloud for a quick circom proof. Add a note that your email address will be revealed to the server side prover as well as the password reset code.

Checksum fails on snarkjs fork

This occurs on sampritipanda's version at https://github.com/sampritipanda/snarkjs/commits/zkemail. The last commit in this fork takes vivek's snarkjs fork, upgrades it to the latest version of snarkjs, fixes merge conflicts more or less at random until it worked, and rewrites main.cjs by basically deleting and rebuilding the package, I think. This was done because generate_witness.js was generated using a newer version of snarkjs/circom while vivek's version was an older one, which was causing some frontend issues.

It seems the main branch works, however, which is strange.

[Bug] Figure out why Outlook body hash verification doesn’t work

For some reason, only email bodies downloaded from gmail clients work. There is probably some dumb string parsing issue when downloading the email body, but detecting this and fixing it would make it a lot more general of a system. Easy for yush to generate an email that can be read on the outlook and gmail clients, so just ask him for an email if you need a copy to get started on this issue!

Do anonymous proof of Twitter follower count

This may not even be possible. There are two routes I can see.

  1. More elegant route. Find an email from Twitter that includes the follower count (this may not exist). Make a ZK circuit to mask just that out, and then convert to an integer and prove that you have an account with at least K followers. To write this body regex, good to quickly tackle issue 16 first to utilize a new regex.

  2. Less elegant route. This will not work in the long term and adds an extra trust assumption on the Merkle tree calculation accuracy. You can make (and periodically re-upload) a huge merkle tree of all Twitter usernames, and the ZK circuit proves membership of your account in that tree, and masks out the follower count to the nearest power of 10. This doesn't really work since you trust that the person constructing the Merkle tree didn't screw with it, which is very hard.

Solidity comparison

Raw DKIM solidity doesn't work due to calldata blowup. Compare the gas usage of an already existing solidity implementation, with and without calldata, to this implementation.

Integration with zk-regex

We made a new version of zk-regex and added circom circuits for common regexes to zk-regex-circom.
We will integrate zk-email-verify with zk-regex.

TODO:

  • Update twitter circom circuits.
  • Update READMEs.

Retry/Fail Proof if zkey Download Fails in Twitter Demo

When running the Twitter demo, I got this error after witness generation:

twitter-verifier-zkeys.s3.amazonaws.com/e388b82/twitter.zkeyj.gz:1     Failed to load resource: net::ERR_CONNECTION_RESET
Storage of twitter.zkeyh.gz successful!
Storage of twitter.zkeyj.gz unsuccessful, make sure IndexedDB is enabled in your browser.
index-df1a66e7.js:226 TypeError: Failed to fetch
index-df1a66e7.js:454 Circuit inputs: Object
index-df1a66e7.js:454 zk-dl: 909916.0541992188 ms
index-df1a66e7.js:454 Starting proof generation
index-df1a66e7.js:226 generating proof for input
index-df1a66e7.js:160 witness calculation: 49734.8779296875 ms
index-df1a66e7.js:158 Uncaught (in promise) Error: Reading out of bounds
    at uZ.readToBuffer (index-df1a66e7.js:158:21140)
    at uZ.read (index-df1a66e7.js:158:21359)
    at iF (index-df1a66e7.js:158:252217)
    at async Rf (index-df1a66e7.js:158:252813)
    at async nCe (index-df1a66e7.js:158:281177)
    at async eSe (index-df1a66e7.js:160:6829)
    at async kSe (index-df1a66e7.js:226:10996)
    at async onClick (index-df1a66e7.js:454:11626)

The page then hung on Status:generating-proof. We need to add two behaviors:

  1. When a zkey part download fails, retry it.
  2. When an error happens, show it on the frontend.
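
Both items can be sketched with a small generic retry wrapper. This is a minimal illustration, not the demo's code: `withRetry` is a hypothetical helper, the attempt count and delay are arbitrary defaults, and the download and status functions mentioned in the comments are placeholders.

```typescript
// Retries an async task with a linearly growing delay, then rethrows a
// descriptive error so the caller can show it in the UI.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * attempt));
      }
    }
  }
  throw new Error(`gave up after ${maxAttempts} attempts: ${String(lastError)}`);
}

// Hypothetical call site: wrap each zkey chunk download, and route the final
// error to whatever sets the on-page status instead of leaving it in the console.
//   withRetry(() => downloadChunk(url))
//     .catch((err) => setStatus(`error: ${String(err)}`));
```

Failing loudly after the last attempt matters as much as the retry itself, since the log above shows proof generation continuing with a missing chunk.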

JS -> TS

helpers/src/dkim has a bunch of JS. Replace that with TS.

Unable to generate circuit inputs for email with base64 encoded attachment

The .eml file used has a base64-encoded attachment. Running generate_input.ts produces this error:

Error: No public key found on DKIM verification result

Logging the result.results[0] from result = await dkimVerify(email); yields this output:

[screenshot of the logged result; image not preserved]

So bodyHash and bodyHashExpecting do not match, and verification fails here:

if (signatureHeader.parsed?.bh?.value !== bodyHash) {

Compress Mailserver Keys

Instead of outputting the entire RSA key to check against the solidity contract, output the hash so we can save calldata gas. This requires

  • Adding the hash to the circom file
  • Updating circom + solidity tests
  • Updating mailserver check with a function that accepts hashes and checks them against the stored keys, and caches valid hashes that invalidate on key upgrade
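
The caching behavior in the last task can be modeled off-chain as follows. This is a sketch of the logic only, written in TypeScript rather than Solidity: `MailserverKeyCache`, `setKey`, and `isValidHash` are hypothetical names, and the hash function is injected because the issue does not say which one the circuit would use.

```typescript
// Model of "accept a hash, check it against the stored key, cache valid hashes,
// and invalidate the cache on key upgrade". All names are hypothetical.
class MailserverKeyCache {
  private keys = new Map<string, string>(); // domain -> stored key
  private validHashes = new Map<string, Set<string>>(); // domain -> hashes already checked
  private hashKey: (key: string) => string;

  constructor(hashKey: (key: string) => string) {
    this.hashKey = hashKey;
  }

  setKey(domain: string, key: string): void {
    this.keys.set(domain, key);
    this.validHashes.delete(domain); // a key upgrade invalidates cached hashes
  }

  isValidHash(domain: string, hash: string): boolean {
    const cached = this.validHashes.get(domain);
    if (cached?.has(hash)) return true;
    const key = this.keys.get(domain);
    if (key === undefined || this.hashKey(key) !== hash) return false;
    if (cached) cached.add(hash);
    else this.validHashes.set(domain, new Set([hash]));
    return true;
  }
}
```

On-chain, the cache would only pay off if recomputing the hash of the stored key costs more gas than a storage read, which would need to be measured.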

Migrate frontend to Vite

Following these instructions can hopefully help us get rid of create-react-app (CRA), which slows down development and load times due to bloat. Vite is also much cleaner than webpack, which has caused me hell in the past, especially with TypeScript and ECMAScript version incompatibilities.

Updating create-react-app broke a bunch of stuff so I think this is easier than upgrading. But it's also a second task here to update as many packages as possible without breaking the app.

Change imports from /dist to index.js in root for npm packages

We need to cleanup the helpers package etc so these imports can come from the root!

You are importing from @zk-email/helpers/src which is the .ts code - to transpile ts code in node_modules we would need additional config

You can change the import to /dist - import { toCircomBigIntBytes } from "@zk-email/helpers/dist/binaryFormat";

Fix decompression

The .tar.gz decompression step doesn't work -- it compresses fine and uploads fine to the s3 bucket (I think), but when downloading and decompressing (see zkp.ts in targz_frontend branch), the files are bigger than when they were uploaded, and the snarkjs fullProve step gives the error 'zkeyb invalid file format'. This details the error.

Solving this will let us add compressed downloads, halving the download size (decompression is very fast). I think we need to use the zlib library in JS to do this: https://nodejs.org/api/zlib.html#zlib_zlib_unzip_buffer_options_callback .
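
One cheap diagnostic before changing libraries is to look at the first bytes of whatever the frontend hands to snarkjs. Because the upload is a .tar.gz, gunzipping alone would leave a tar container, which is larger than the raw file and would not parse as a zkey. Whether that is what happens here is unconfirmed, so the sketch below only identifies the container. `sniffContainer` is a hypothetical helper; the gzip magic bytes (0x1f 0x8b) and the "ustar" marker at offset 257 are the standard signatures for those formats.

```typescript
type Container = "gzip" | "tar" | "unknown";

// Classifies a buffer by its standard magic bytes.
function sniffContainer(bytes: Uint8Array): Container {
  // gzip streams start with 0x1f 0x8b.
  if (bytes.length >= 2 && bytes[0] === 0x1f && bytes[1] === 0x8b) return "gzip";
  // POSIX tar headers carry the ASCII string "ustar" at byte offset 257.
  const magic = "ustar";
  if (bytes.length >= 257 + magic.length) {
    let isTar = true;
    for (let i = 0; i < magic.length; i++) {
      if (bytes[257 + i] !== magic.charCodeAt(i)) isTar = false;
    }
    if (isTar) return "tar";
  }
  return "unknown";
}

// Example: a buffer that is still gzipped.
console.log(sniffContainer(new Uint8Array([0x1f, 0x8b, 0x08, 0x00])));
```

If the buffer reports "gzip" or "tar" at the point where fullProve is called, a decompression or extraction step is still missing.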

Generate a verifier.sol file for the chunked zkey

Currently, we load the zkey in chunks to ensure that we can actually fit it into memory. However, there is no verifier.sol generated. So, we either need to fork snarkjs to generate a verifier.sol from the chunked zkeys, or also output a raw zkey that corresponds to the same chunked zkey, that we can call the normal snarkjs verifier generation from.

Build DKIM archive website

Edit: This is now WIP at https://github.com/foolo/dkim-lookup!

DKIM is usually a nested DNS record. For instance, for replit, we can see here: https://easydmarc.com/tools/dkim-lookup?domain=replit.com that the DKIM is under the selector "google" and has the value:

Selector: google
Record value:
v=DKIM1; k=rsa; p=MIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAk6RNxaxuNyiPhlH6rlgMOXNTaffcVsK+3E6lK1x8c7MO0w7on9zmaiApGE/2hBWQqRpy6EmRdUf6MJH5TmwM++51W4xR0TmTd1JvsbBR/9yjpR++vOahVkrdh0xPaq1zghHYaqNgsOThivw8Hgd8xWQzPPDcw7T+czQS0/Xe/nijU0dVlQX/s+evJpxP7VV/FzlMQvknMj1bCqAgzUFa1mXMO/ZfzHirpGVcJ+h1fMYOIzU4iV3KUIn6i1mg3T+Kw41MFW04F/4nnIQKTTFNGuI+T+6Ss1M1VcjlAxlwYZCJPE0Iy3cOWRBWsgXFZWx2rATlEtkasmf1NFpJu1nATwIDAQAB
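
A record value like the one above is a semicolon-separated list of tag=value pairs, so an archive script has to split it before it can store or compare keys. A minimal sketch, with `parseDkimRecord` as a hypothetical name; it handles only the simple layout shown here and not every form the DKIM specification allows.

```typescript
// Parses "v=DKIM1; k=rsa; p=..." into a tag -> value map.
function parseDkimRecord(record: string): Map<string, string> {
  const tags = new Map<string, string>();
  for (const part of record.split(";")) {
    const eq = part.indexOf("="); // split on the first "=" only: base64 may end in "="
    if (eq === -1) continue;
    const name = part.slice(0, eq).trim();
    let value = part.slice(eq + 1).trim();
    // The public key is base64, so whitespace from split TXT strings can be dropped.
    if (name === "p") value = value.replace(/\s+/g, "");
    if (name) tags.set(name, value);
  }
  return tags;
}

// Shortened, made-up key material for illustration.
const tags = parseDkimRecord("v=DKIM1; k=rsa; p=MIIBIjANBg kqhkiG9w0B==");
console.log(tags.get("k"), tags.get("p"));
```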

Scrape the alexa top 1M websites (and a list of 50 websites that we manually add) for their DKIM key every day, and archive all the answers in a simple UI where someone can just type in a website, and see all the past DKIM keys for it. Note that these DNS records change roughly daily, and we want all selectors, including non-Google ones. Looking for a simple frontend, as well as script that can be run daily without being ratelimited. I recommend hitting DNS directly.

One way to do this in Python is to use something like pydig to query the data, store it in a PostgreSQL database, and provide a FastAPI webserver for browsing it. Approximately 400 non-compressible bytes per entry times 1M sites changing daily would be a max of 400MB per day of data (thanks npulido for the suggestion).

Eventually, include dynamic checking (i.e. for each site, store the gap between the last n checks, and check more often around the distribution of those times).

Support weird emails in the "to" field

Strictly pursuant to the RFC, there might be weird emails, especially in the 'to' field. Supporting such buffoonery is not a high priority (it can just be rejected instead), but it would be interesting to have an explicit to regex that supports all of this.

UI Audit Comments

  • Non-Chrome banner hides connect wallet
  • Unclear when to connect wallet, to send on chain
  • In interviews, ask them what they think links/buttons do
  • Users don't understand they can use email to prove this and don't know why twitter/email are related
  • Mention it's not usable on mobile
  • Need to scroll horizontally on mobile to see output on right side
  • Unclear what email headers are
  • What are the larger project goals -- why does a user want to do this, and are they putting themselves at risk in any way
  • Need to define audience and goals to be able to fix it
  • What is the demo? What is the library? Why is it interesting or important? Introduce the demo later on, but first explain what it even is that they're engaging with.

Thinking needed for website to create feedback loop

  • User persona targeted
  • Goals
  • Set 6 month roadmap

Done

  • Link contrast bad
  • Remove date (march to may), what does it mean to be smaller
  • Why is there a community update here

[Blocked] Prove file attachment validity

Inside the body are the attachments. Create a circuit that handles the encoding/decoding of that section, and proves validity of some sub-part of that attachment. Note that this is probably blocked on making the circuits way faster via lookups or faster proving systems, because currently SHA is unsustainable at that scale.

Refactor DKIM and handle ARC

This concerns the helpers package's DKIM verification and input generation.

  • Use a standard npm package for DKIM fetching
  • Handle emails with ARC signatures where DKIM fails
    • Try and revert the change made to the original email by the forwarder - eg: Put back X-Message-ID to Message-ID

Make Twitter demo work with more emails

Have the regex match "this email was sent to" or "this email was meant for", and make sure it includes enough HTML divs that the match can't be injected. This will let the user prove Twitter ownership from any email. Also increase the max size of the email body to accommodate this.

This will remove dependence on the password reset email, and any follower notification email can be used.

Fix simple DKIM edge cases

If DKIM fails, try replacing all TABs in body with spaces: "another weird case, is the email supports TABs (ascii 9) rather than spaces". Note that this is an easy find and replace of tabs > spaces on the frontend + in the body parsing js code.

If the email verification fails, it might be due to forwarding. This can insert labels that convert the subject from "This is the subject" to "[Label] This is the subject". In cases where DKIM verification fails, we have to strip the label from the beginning of the line.

Note that we would need to test all 2^n permutations of such edge cases (here n = 2, as the emails can be tabbed/not tabbed and labeled/not labeled), which adds exponentially more time to verify in the case of a failure.
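
The 2^n search can be written as a loop over bitmasks, where each bit switches one fix on or off. This is a sketch under assumptions: `Fix`, `fixes`, and `findVerifyingVariant` are hypothetical names, the two transforms are simplified versions of the fixes described above, and `verify` is a synchronous stand-in for the real DKIM check (which would be async).

```typescript
type Fix = { name: string; apply: (raw: string) => string };

const fixes: Fix[] = [
  {
    name: "tabs-to-spaces",
    // Replace tabs only after the blank line that separates headers from body.
    apply: (raw) => {
      const split = raw.search(/\r?\n\r?\n/);
      return split === -1 ? raw : raw.slice(0, split) + raw.slice(split).replace(/\t/g, " ");
    },
  },
  {
    name: "strip-subject-label",
    // Drop a leading "[Label] " that a forwarder may have added to the subject.
    apply: (raw) => raw.replace(/^(Subject:[ \t]*)\[[^\]\r\n]*\][ \t]*/im, "$1"),
  },
];

// Tries every subset of fixes (2^n variants) and returns the first that verifies.
function findVerifyingVariant(
  raw: string,
  verify: (email: string) => boolean,
  candidates: Fix[] = fixes
): { email: string; applied: string[] } | null {
  for (let mask = 0; mask < 1 << candidates.length; mask++) {
    let email = raw;
    const applied: string[] = [];
    candidates.forEach((fix, i) => {
      if (mask & (1 << i)) {
        email = fix.apply(email);
        applied.push(fix.name);
      }
    });
    if (verify(email)) return { email, applied };
  }
  return null;
}
```

Mask 0 is the untouched email, so a message that already verifies costs a single check; only failures pay for the full search.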

I think the zkp2p fixes are here:

see: https://github.com/zkp2p/zk-p2p/blob/develop/client/src/components/ProofGen/validation/hdfc.tsx#L85
and: https://github.com/zkp2p/zk-p2p/blob/develop/client/src/components/ProofGen/validation/venmo.tsx#L62

Make Solidity work

  • We need to add tests to the Solidity code.
  • We also need to fix issue 11 before the zk verification will work.
  • We need to fix the NFT generation to actually work and look good.
  • We should also pay out a small amount of Goerli Eth so it can double as a faucet
