Trusted Data Format (TDF) is an open, interoperable, JSON-encoded data format for implementing data-centric security for objects (such as files or emails) in a zero-trust security world. This repository specifies the protocols and schemas required for TDF operation.
In the current opentdf implementations for C++ and JavaScript, it seems like we are double encoding the signatures. The spec for the root signature and policy binding only mentions a Base64-wrapped HMAC signature.
rootSignature.sig
String
The signature for the entire payload. Example of signature generation: Base64.encode(HMAC(BinaryOfAllHashesCombined, payloadKey))
policyBinding
Object
This contains a keyed hash that will provide cryptographic integrity on the policy object, such that it cannot be modified or copied to another TDF, without invalidating the binding. Specifically, you would have to have access to the key in order to overwrite the policy. This is Base64 encoding of HMAC(POLICY,KEY)
In my opinion the spec either needs to be updated or those clients need to adhere to what the spec says.
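To make the mismatch concrete, here is a minimal sketch of the two encodings side by side. It assumes HMAC-SHA256 and assumes that the "double encoding" means Base64 applied to the hex string of the digest; the function names, the all-zero key, and the sample policy bytes are placeholders for illustration, not taken from any SDK.

```python
import base64
import hashlib
import hmac


def sig_per_spec(key: bytes, data: bytes) -> str:
    """Base64 of the raw HMAC digest, which is how the spec text reads."""
    digest = hmac.new(key, data, hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def sig_double_encoded(key: bytes, data: bytes) -> str:
    """Base64 of the hex string of the digest (the assumed double encoding)."""
    hex_digest = hmac.new(key, data, hashlib.sha256).hexdigest()
    return base64.b64encode(hex_digest.encode("ascii")).decode("ascii")


key = bytes(32)            # placeholder key, all zeros
data = b"example policy"   # placeholder input

spec_sig = sig_per_spec(key, data)
double_sig = sig_double_encoded(key, data)

# Same underlying MAC, but the two strings differ in content and length,
# so a verifier written against one form rejects the other.
print(len(spec_sig), len(double_sig))  # 44 88
```

A client built only from the spec would produce the 44-character form, so it could not verify a TDF written with the 88-character form unless it also tries hex-decoding the Base64 payload.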
I am a little confused about what the difference is between zip and zipstream. Do we have any relevant material on the difference, or an example of how to implement each? Is this talking about stream ciphers vs block ciphers?
Also, does this drive the encryptionInformation.method.isStreamable bool?
This is another spot where we seem to be double encoding with hex and Base64, and where the spec and the current implementations again don't align.
It seems that we are just extracting the GCM authentication tag off the ciphertext and then hex and Base64 encoding it. Is this because it's already a MAC, or should we actually be generating a SHA-256 hash from it?
Either way, this needs some discussion, as it is another place where we need to update either the implementations or the spec.
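As a reference point for that discussion, this is roughly what the observed behaviour looks like. It is a sketch under assumptions: a 16-byte GCM tag appended to the end of the encrypted segment, with random bytes standing in for real AES-GCM output. `segment_hash_from_tag` is a hypothetical name, not an SDK function.

```python
import base64
import os

GCM_TAG_LEN = 16  # bytes; assumed tag length, appended after the ciphertext


def segment_hash_from_tag(encrypted_segment: bytes) -> str:
    """Take the trailing GCM tag, hex-encode it, then Base64 the hex string."""
    if len(encrypted_segment) < GCM_TAG_LEN:
        raise ValueError("segment is shorter than a GCM tag")
    tag = encrypted_segment[-GCM_TAG_LEN:]
    return base64.b64encode(tag.hex().encode("ascii")).decode("ascii")


# Random bytes stand in for ciphertext || tag; no real encryption happens here.
segment = os.urandom(64)
segment_hash = segment_hash_from_tag(segment)
print(segment_hash)
```

No hash is computed at all in this scheme: the value is only a re-encoding of the tag that GCM already produced.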
Today, in the current implementations, EncryptionInformation.Method.IV appears to go unused, which leads me to think it is not needed in EncryptionInformation.Method. I propose that we remove this field, as it is standard practice to generate a random IV for each encrypted chunk and prepend it to the ciphertext.
This will also help reduce confusion for anyone trying to implement the tdf3 spec in something other than C++ or JavaScript.
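For anyone implementing from the spec, the per-chunk layout I am describing could be sketched as below. The 12-byte IV and 16-byte tag lengths are assumptions (the common AES-GCM sizes), the helper names are hypothetical, and the encryption itself is replaced by random bytes so that only the framing is shown.

```python
import os

IV_LEN = 12   # bytes; assumed GCM IV length
TAG_LEN = 16  # bytes; assumed GCM tag length


def frame_chunk(iv: bytes, ciphertext_and_tag: bytes) -> bytes:
    """Prepend the per-chunk IV so the chunk is self-describing."""
    if len(iv) != IV_LEN:
        raise ValueError("unexpected IV length")
    return iv + ciphertext_and_tag


def split_chunk(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Recover (iv, ciphertext, tag) from a framed chunk."""
    if len(blob) < IV_LEN + TAG_LEN:
        raise ValueError("chunk too short to hold an IV and a tag")
    return blob[:IV_LEN], blob[IV_LEN:-TAG_LEN], blob[-TAG_LEN:]


# A fresh random IV per chunk; random bytes stand in for real cipher output.
iv = os.urandom(IV_LEN)
fake_ciphertext = os.urandom(100)
fake_tag = os.urandom(TAG_LEN)

blob = frame_chunk(iv, fake_ciphertext + fake_tag)
parsed_iv, parsed_ct, parsed_tag = split_chunk(blob)
```

With this layout a reader needs nothing from the manifest to decrypt a chunk except the key, which is why a manifest-level IV field looks redundant.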
When trying to add encrypted metadata to the new Go client, I ran into issues posting freeform metadata to KAS, which the spec says is possible.
Metadata associated with the TDF, and the request. The contents of the metadata are freeform, and are used to pass information from the client, and any plugins that may be in use by the KAS. The metadata stored here should not be used for primary access decisions
I feel like this should be added to the spec, because otherwise it's another place where clients built from the spec could end up incompatible.
EC keys are smaller and faster than RSA keys and give more security per bit - TDF should use EC keys, there is no good reason not to.
This would reduce the number of "important keys" in the system that people (and our scripts, and KAS, and hardware modules) have to keep track of.
The biggest practical implementation difference between nanoTDF (which has no public spec) and TDF is the use of EC vs RSA keys - removing this would allow us to simplify our SDK logic, and share more code between the nanoTDF and TDF codepaths, as well as our KAS codepaths.
This would require us to major-version bump the spec, add EC keys, and mark the use of RSA as deprecated - we should not be afraid of doing this.