
canonical-proto3's Introduction

proto3 Canonical Encoding Rules (CER)

This defines a set of encoding rules for Protocol Buffers 3 (proto3) for serializing messages deterministically, such that the serialized form is suitable for signing and for encoding in cryptographic attestations (e.g. Merkle trees). Similar to ASN.1 and Cap'n Proto, a set of "canonical encoding rules" (CER) is used to define a canonical encoding where the basic proto3 specification does not do so. In this sense, the default Protocol Buffers specification provides a set of "basic encoding rules" which are not deterministic, and we extend that specification to support deterministic encoding for cryptographic use cases.

Specification

Fields Must be Serialized In Ascending Field Order

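Ascending field number is the most intuitive order in which to serialize fields, and fixing a single order means that two encoders given the same message produce the same bytes.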
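As an illustration, the ordering rule can be sketched in Python for the simplest case, varint (wire type 0) fields. The helper names `encode_varint` and `encode_fields` are hypothetical and not part of any protobuf library; only the key and varint layout come from the standard protobuf wire format.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_fields(fields: dict[int, int]) -> bytes:
    """Serialize varint fields in ascending field-number order."""
    out = bytearray()
    for number in sorted(fields):
        key = (number << 3) | 0  # wire type 0 = varint
        out += encode_varint(key)
        out += encode_varint(fields[number])
    return bytes(out)
```

Because the fields are sorted before writing, `encode_fields({2: 1, 1: 150})` and `encode_fields({1: 150, 2: 1})` yield identical bytes regardless of the order in which the caller supplied them.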

Default/Empty Values Must Not Be Serialized

Requiring default values to be serialized would prevent clients using an older version of a protocol from sending messages to transaction processors which use a later version. Also, in proto3 there is no semantic distinction between empty and default fields, and thus serializing a default value is not intended to communicate any information. Thus the most canonical behavior is to always omit fields with empty or default values from serialization.
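The version-compatibility argument can be made concrete with a small Python sketch. It models a message as a mapping from field number to value; `is_default` and `canonical_fields` are hypothetical helpers for illustration, and floats and nested messages are deliberately left out.

```python
def is_default(value: object) -> bool:
    """Return True for values this sketch treats as proto3 defaults."""
    if value is None or value is False:
        return True
    if isinstance(value, bool):
        return False  # True is never a default
    if isinstance(value, int):
        return value == 0
    if isinstance(value, (str, bytes, list)):
        return len(value) == 0
    return False


def canonical_fields(fields: dict[int, object]) -> list[tuple[int, object]]:
    """List the fields that would be serialized: sorted, defaults omitted."""
    return [(n, fields[n]) for n in sorted(fields) if not is_default(fields[n])]


# A client on version 1 of a schema only knows field 1.
v1_message = {1: "alice"}
# A processor on version 2 also knows field 2, which is left at its default.
v2_message = {2: 0, 1: "alice"}
```

Under these assumptions both messages reduce to the same field list, so a canonical encoder on either side would emit the same bytes.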

No Maps (for now)

While maps could have a canonical encoding, they are too problematic for cryptographically sensitive use cases and are thus excluded for now.

Limitations

A recipient cannot determine whether a message with unknown fields is canonical. Therefore, all transaction processors which receive messages with unknown fields should treat them as not canonical. In spite of this limitation, clients using an earlier version of a protocol can send messages to transaction processors which understand a later version without causing a problem. Transaction processors would also reject messages intended for a later version of the protocol which they do not understand, which is likely the safest and most correct behavior in most cases.

JSON

In addition to the rules below, signable canonical protobuf JSON must follow https://gibson042.github.io/canonicaljson-spec/.

No default/empty values

Remove all fields whose value is 0, false, "", null, [], or {}.
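A minimal Python sketch of this rule, applied to already-parsed JSON data, might look as follows. The names `is_empty` and `strip_defaults` are hypothetical. Two choices here are assumptions rather than something the rule states: nested objects are cleaned bottom-up, so an object that becomes `{}` after cleaning is also removed, and elements of arrays are never dropped because they are not fields.

```python
def is_empty(value: object) -> bool:
    """Return True for 0, false, "", null, [] and {}."""
    if value is None or value is False:
        return True
    if isinstance(value, bool):
        return False  # True must not be confused with the integer 1
    if isinstance(value, (int, float)):
        return value == 0
    if isinstance(value, (str, list, dict)):
        return len(value) == 0
    return False


def strip_defaults(value: object) -> object:
    """Recursively remove object fields holding default/empty values."""
    if isinstance(value, dict):
        cleaned = {k: strip_defaults(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if not is_empty(v)}
    if isinstance(value, list):
        # Array elements are positional, so they are cleaned but kept.
        return [strip_defaults(v) for v in value]
    return value
```

The result still has to be written out according to the canonical JSON specification linked above; this sketch only covers the removal step.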

Timestamp and Duration should use 9 fractional digits

The proto3 JSON specification states that these types can use 0, 3, 6 or 9 fractional digits in JSON output. For a simple deterministic encoding, we specify the most precise of these: 9 digits.
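A sketch of the fixed-width formatting in Python, assuming the `seconds` and `nanos` integer fields of `google.protobuf.Timestamp` and `google.protobuf.Duration` as inputs. The function names are hypothetical, and range checks on `seconds` are omitted.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_timestamp(seconds: int, nanos: int) -> str:
    """Format a Timestamp as RFC 3339 UTC with exactly 9 fractional digits."""
    if not 0 <= nanos < 1_000_000_000:
        raise ValueError("nanos must be in [0, 999999999]")
    dt = EPOCH + timedelta(seconds=seconds)
    date = f"{dt.year:04d}-{dt.month:02d}-{dt.day:02d}"
    time = f"{dt.hour:02d}:{dt.minute:02d}:{dt.second:02d}"
    return f"{date}T{time}.{nanos:09d}Z"


def format_duration(seconds: int, nanos: int) -> str:
    """Format a Duration as seconds with exactly 9 fractional digits."""
    if abs(nanos) >= 1_000_000_000:
        raise ValueError("nanos must be in [-999999999, 999999999]")
    if seconds * nanos < 0:
        raise ValueError("seconds and nanos must have the same sign")
    sign = "-" if seconds < 0 or nanos < 0 else ""
    return f"{sign}{abs(seconds)}.{abs(nanos):09d}s"
```

Padding the nanoseconds to a fixed width, rather than trimming trailing zeros, is what makes the output independent of the encoder's choice among 0, 3, 6 or 9 digits.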

Do not use lowerCamelCase names

This creates unnecessary discrepancies between proto field names and their JSON representation and could lead to weird conflicts (if someone was foolish enough to define both myField and my_field).
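The conflict can be shown with a simplified Python sketch of the snake_case to lowerCamelCase conversion used for proto3 JSON names. `to_json_name` is a hypothetical helper written for this example, not a library function.

```python
def to_json_name(proto_name: str) -> str:
    """Drop each underscore and upper-case the character that follows it."""
    out = []
    upper_next = False
    for ch in proto_name:
        if ch == "_":
            upper_next = True
        elif upper_next:
            out.append(ch.upper())
            upper_next = False
        else:
            out.append(ch)
    return "".join(out)
```

Here `to_json_name("my_field")` and `to_json_name("myField")` both give `"myField"`, so two distinct proto fields would collide on one JSON key. Keeping the original proto field names avoids the mapping entirely.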

Implementations

Please submit a PR if you have implementation details to add to this list.

Implementations should specify one of the following levels of alignment:

  • Level 1: there are clear rules to follow in order to make this implementation follow CER
  • Level 2: this implementation has explicit code generation flags or static linting tools for safely supporting CER
  • Level 3: this implementation provides a zero-allocation "is_canonical" or "unmarshal_canonical" method for checking if a message is canonical

Note that level 1 and 2 implementations can still verify that a message is canonical by re-encoding it canonically and comparing.
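The re-encode-and-compare check can be sketched in Python for messages that contain only varint (wire type 0) fields. All function names are hypothetical, and a real implementation would need to handle the other wire types and nested messages as well.

```python
def read_varint(data: bytes, pos: int) -> tuple[int, int]:
    """Decode a varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def write_varint(value: int) -> bytes:
    """Encode a non-negative integer as a minimal-length varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode(data: bytes) -> dict[int, int]:
    """Decode varint-only fields; a repeated field number keeps the last value."""
    fields: dict[int, int] = {}
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        if key & 0x07 != 0:
            raise ValueError("only varint fields are handled in this sketch")
        value, pos = read_varint(data, pos)
        fields[key >> 3] = value
    return fields


def encode_canonical(fields: dict[int, int]) -> bytes:
    """Encode in ascending field order, omitting default (zero) values."""
    out = bytearray()
    for number in sorted(fields):
        if fields[number] == 0:
            continue
        out += write_varint(number << 3)  # wire type 0 = varint
        out += write_varint(fields[number])
    return bytes(out)


def is_canonical(data: bytes) -> bool:
    """A message is canonical if re-encoding it reproduces the same bytes."""
    return encode_canonical(decode(data)) == data
```

This approach allocates a decoded copy of the message, which is the cost a level 3 implementation avoids.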

gogo protobuf - Level 1

gogo proto mostly follows canonical encoding rules with the caveats listed below.

Don't use gogoproto.nullable = false

This causes default/empty fields to be emitted in binary and JSON encodings.

canonical-proto3's People

Contributors

aaronc, jordaaash


canonical-proto3's Issues

Define supported mappings (proto->JSON or JSON->proto)

The current document speaks about encoding proto->JSON. I think we should get clarity if this is the only supported mapping or not. There are other places that imply a usage of JSON->proto (let's call this "(JSON) decoding" in this context).

For signing and signature verification purposes, only encoding is needed.

  • Sign: put m = hash(serialize(proto)) and privkey into ECDSA secp256k1 sign
  • Verify: put m = hash(serialize(proto)), pubkey and signature into ECDSA secp256k1 verify
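To illustrate why `serialize` has to be deterministic here, the digest step can be sketched in Python. SHA-256 is an assumption, since the hash function is not named above, and `signing_digest` is a hypothetical helper. The ECDSA step itself is left out because it needs a third-party library.

```python
import hashlib


def signing_digest(serialized: bytes) -> bytes:
    """Compute m = hash(serialize(proto)); SHA-256 is assumed here."""
    return hashlib.sha256(serialized).digest()


# Two wire encodings of the same logical message (field 1 = 150, field 2 = 1):
canonical = bytes.fromhex("0896011001")  # ascending field order
reordered = bytes.fromhex("1001089601")  # field 2 written before field 1
```

The two byte strings decode to the same message but produce different digests, so a signature made over one would not verify against the other.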

What cosmos/cosmos-sdk#6031 suggests is that there is a reverse mapping (decode) as well. However, such a decoding would be lossy in the current implementation (i.e. decode(encode(original)) != original), e.g. because JSON Canonical Form changes strings to a normalized form, i.e. manipulates content, not only structure.

I think it would be important to know the goals of those two mappings in order to verify whether the spec achieves them. Obviously, things get much easier by explicitly disallowing the decoding.
