Comments (4)

dthorpe commented on July 28, 2024

Good points.

from open-assets-protocol.

oleganza commented on July 28, 2024

@jeffomatic what is the practical use case where checking/enforcing some sort of canonical encoding is important?

If I understand correctly, it's not only the ordering of fields that matters; the Unicode encoding must also have some canonical form. As of now, OA clients just need to preserve the original binary file, so no one cares how exactly it is formatted (we don't have malleability issues like with bitcoin transactions). This proposal requires every client to do additional checks, or to sort and canonically encode the data, to be sure it matches the hash. Not every language/platform has appropriate Unicode equivalence conversion functions, or a correct implementation of them. Also, sorting the JSON object keys is not trivial. I haven't yet seen a library that has an API for that.

In addition to the above, a good chunk of existing highly important assets use pretty-printed JSON formatting, which probably won't be compatible with any kind of strict encoding rules.
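To make the Unicode concern concrete, here is a minimal Python sketch. The sample string is made up, and SHA-256 is only a stand-in for whatever hash function the protocol actually specifies. It shows two renderings of the same visible text that differ in bytes, and therefore in hash, until a normalization step is applied.

```python
import hashlib
import unicodedata

# Two renderings of the same visible word (hypothetical sample data).
composed = "caf\u00e9"      # 'é' as a single code point (NFC form)
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent (NFD form)


def sha256_hex(text: str) -> str:
    """Hash the UTF-8 bytes of a string (SHA-256 is an assumed stand-in)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The strings are canonically equivalent once normalized to NFC...
same_after_nfc = unicodedata.normalize("NFC", decomposed) == composed
# ...but their byte representations differ, and so do their hashes.
different_bytes = composed.encode("utf-8") != decomposed.encode("utf-8")
different_hashes = sha256_hex(composed) != sha256_hex(decomposed)
```

A client that only ever hashes the original file never hits this problem; a client that re-encodes the text has to agree with every other client on the normalization form.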

jeffomatic commented on July 28, 2024

@oleganza Yeah, good observations!

Even if canonicalizing the JSON is impossible, I think it would be nice to get the following updates to the documentation:

  1. The docs shouldn't claim that the ordering of fields is insignificant, at least in the case where you're re-rendering the definition for hash verification. Not only is the ordering significant, but so are whitespace, floating-point serialization, Unicode, and probably some other things I'm forgetting. Those all have to be exactly the same as when you first generated that hash.
  2. The docs should probably remind both issuers and verifiers that if they care about hash verification, they need to think of the asset definition as a raw byte string, generated non-canonically at a specific moment in time, rather than as pure structured data.
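As a rough illustration of both points, the Python sketch below renders one piece of structured data four ways. The field names and values are hypothetical, and SHA-256 is an assumed stand-in for the protocol's hash. All four renderings parse back to the same data, yet each produces a different digest.

```python
import hashlib
import json

# Hypothetical asset definition; the field names are for illustration only.
definition = {"name": "Example Asset", "issuer": "Example Issuer", "ratio": 1.0}


def sha256_hex(text: str) -> str:
    """Hash the UTF-8 bytes of a rendering (SHA-256 is an assumed stand-in)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same data, different whitespace.
compact = json.dumps(definition, separators=(",", ":"))
pretty = json.dumps(definition, indent=2)
# Same data, different field order.
reordered = json.dumps(dict(reversed(list(definition.items()))), separators=(",", ":"))
# Same data, different floating-point serialization (1.0 written as 1.00).
float_variant = compact.replace("1.0", "1.00")

renderings = [compact, pretty, reordered, float_variant]
parsed_equal = all(json.loads(r) == definition for r in renderings)
distinct_digests = len({sha256_hex(r) for r in renderings})
```

Here `parsed_equal` is true while `distinct_digests` is 4, which is why the hash only makes sense as a statement about one specific byte string.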

The rest of this message might be purely academic. Regarding your concerns about practicality, I think the issues you point out with canonicalizing JSON are very real, and could be deal-breakers. But if we magically had some kind of canonical (if imperfect) serialization method, I'm tempted to think that a canonicalize_then_hash procedure would be easier to specify, and probably less error-prone, than asking people to treat the definition as raw bytes. You could package the procedure as a black-box hashing function that takes the asset definition in any arbitrary format, and call it any time you wanted to know the hash.
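For the sake of discussion, a deliberately imperfect sketch of such a black box in Python might look like the following. Everything here is an assumption rather than anything in the spec: the choice of NFC normalization, sorted keys with compact separators, UTF-8 output, and SHA-256 as the hash. It does nothing about floating-point formatting, duplicate keys, or number precision, which are exactly the kinds of gaps raised above.

```python
import hashlib
import json
import unicodedata


def _normalize(value):
    """Recursively apply NFC normalization to every string, including keys."""
    if isinstance(value, str):
        return unicodedata.normalize("NFC", value)
    if isinstance(value, list):
        return [_normalize(item) for item in value]
    if isinstance(value, dict):
        return {_normalize(key): _normalize(item) for key, item in value.items()}
    return value


def canonicalize_then_hash(raw: bytes) -> str:
    """Parse JSON bytes, re-serialize them in one fixed form, and hash that.

    Assumed canonical form: NFC strings, sorted keys, no insignificant
    whitespace, UTF-8 output. SHA-256 is a stand-in hash.
    """
    data = _normalize(json.loads(raw))
    canonical = json.dumps(
        data, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

With this, a compact document and a pretty-printed, reordered copy of it hash to the same value, but only among implementations that reproduce every one of these choices byte for byte.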

Without such a procedure, we put a burden on every server app and every client app to persist, render, and retrieve the asset definition in a specific way. In particular, regardless of how servers are storing the asset definition (e.g. as multiple columns in a SQL table, or a MongoDB document, etc.), you'd want to make sure they're also persisting the original JSON message used to generate the hash digest at the time of issuance, AND make sure they are delivering the original JSON message instead of rendering the structured data through an arbitrary pretty-printer. Likewise, clients would need to keep track of the raw HTTP body of the asset definition before proceeding with any kind of deserialization, which means that dynamic-language applications need to be careful about using any HTTP request library that performs automatic deserialization (which is a pretty widespread practice).
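On the server side, that burden could be sketched roughly as follows, assuming a hypothetical SQLite table (`asset_definitions`), a hypothetical `name` field, and SHA-256 as a stand-in hash. The point is only that the original bytes are stored next to the parsed columns and are what gets served back.

```python
import hashlib
import json
import sqlite3


def store_definition(conn: sqlite3.Connection, raw: bytes) -> str:
    """Persist parsed fields for querying AND the untouched original bytes."""
    digest = hashlib.sha256(raw).hexdigest()  # assumed stand-in hash
    fields = json.loads(raw)
    conn.execute(
        "INSERT INTO asset_definitions (digest, name, raw_json) VALUES (?, ?, ?)",
        (digest, fields.get("name"), raw),
    )
    return digest


def load_raw(conn: sqlite3.Connection, digest: str) -> bytes:
    """Return the stored bytes as-is; never re-render from the parsed columns."""
    row = conn.execute(
        "SELECT raw_json FROM asset_definitions WHERE digest = ?", (digest,)
    ).fetchone()
    if row is None:
        raise KeyError(digest)
    return bytes(row[0])


# Hypothetical schema: structured columns plus a BLOB holding the original message.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE asset_definitions ("
    "digest TEXT PRIMARY KEY, name TEXT, raw_json BLOB NOT NULL)"
)
```

Serving the result of `load_raw` directly as the HTTP body, rather than passing the parsed columns through a JSON renderer, keeps the digest reproducible.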

I'm not sure if this is such a huge deal, so to reiterate, I think even making some adjustments to the documentation would be beneficial.

NicolasDorier commented on July 28, 2024

The best bug-free way to make it work is to hash the bytes provided by the asset definition URL.
Canonicalization is a minefield that would eventually lead to a lot of problems for every implementer.

I think it is correct to say that the field ordering is not important, and to say explicitly that the hash refers to the raw bytes provided by the asset definition URL.
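In client terms, a minimal Python sketch of that rule could look like this. The function names are hypothetical, and SHA-256 is only a stand-in for the hash the protocol actually specifies. The response body is read as bytes and hashed before any JSON parsing happens.

```python
import hashlib
import json
import urllib.request


def fetch_raw(url: str) -> bytes:
    """Read the response body as bytes, with no automatic deserialization."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def verify_then_parse(raw_body: bytes, expected_hex: str) -> dict:
    """Hash the bytes exactly as received, and only then parse them as JSON."""
    actual_hex = hashlib.sha256(raw_body).hexdigest()  # assumed stand-in hash
    if actual_hex != expected_hex:
        raise ValueError("asset definition bytes do not match the expected hash")
    return json.loads(raw_body)
```

A caller would use `verify_then_parse(fetch_raw(url), expected_hex)`. Field order is irrelevant to the parsed result, but any change to the served bytes, including a mere reordering, makes verification fail.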
