
scuttlebutt-protocol-guide's Introduction

Scuttlebutt Protocol Guide

Read online

Public Domain

This documentation is dedicated to the public domain. Please use it in any way that you wish!

scuttlebutt-protocol-guide's People

Contributors

arj03, atoulme, balupton, bl0x, boreq, christianbundy, connoropolous, cryptix, dominictarr, duncan255, ergl, mixmix, mplorentz, mycognosist, pdaoust, progval, soapdog, staltz, tomlisankie


scuttlebutt-protocol-guide's Issues

32 bit integer for milliseconds?

From the guide:

A Bendy Butt Message is a Bencoded list containing a collection with the following five elements, in this specific order.

32-bit integer representing the time the message was created. Number of milliseconds since 1 January 1970 00:00 UTC, also known as UNIX epoch.

Is this correct? A 32-bit integer can only hold about ~1.5 months of milliseconds.
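For a quick sanity check (plain arithmetic, not from the guide):

2 ** 32 / (1000 * 60 * 60 * 24)   // ≈ 49.7 days of milliseconds fit in an unsigned 32-bit integer
2 ** 31 / (1000 * 60 * 60 * 24)   // ≈ 24.8 days if the integer is signed

Either way, that is far less than the decades elapsed since the UNIX epoch.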

Extra metafeeds section to other documentation?

Heyo

Figured I'd open up a discussion about the protocol documentation's current section on metafeeds. How do people feel: should it remain part of the protocol docs, considering its lack of adoption in existing clients?

Suggestion: Move the (excellent!) section to some more suitable resource like https://dev.scuttlebutt.nz/ to limit confusion for people looking into SSB now or in the future. If people come to an agreement on this suggestion, I could extract the section in some suitable way and massage it into dev.ssb.nz :)

Information on gossip.ping protocol

The guide does not mention or refer to any details about the protocol used for the gossip.ping messages and the behavior expected by ssb servers.

My current understanding is the following:

  • Client sends gossip.ping RPC message and the current timestamp (milliseconds) in a separate message.

  • Then, the server responds with the same.

  • The client can then calculate the turnaround time for messages.

Is this true?
What should really be going on?
Should this be repeated periodically?
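For comparison, here is a sketch of what that exchange might look like on the wire. The shape is assumed from scuttlebot's ssb-gossip plugin rather than taken from the guide, so treat it as a guess: the ping is opened as a duplex muxrpc stream,

{"name": ["gossip", "ping"], "type": "duplex", "args": [{"timeout": 300000}]}

and each side then writes its current timestamp in milliseconds as a plain JSON number on that stream, which is what lets either end measure round-trip time and clock skew. If that matches reality, the guide could document the duplex stream type, the timeout argument and how often the exchange should be repeated.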

Only one blob is allowed per response?

In a side-note in the want and have section, the Protocol Guide says that only one blob is allowed per response to a blobs.createWants. I see no reason for this limitation, and the implementations I've been interacting with request multiple blobs in a single response, e.g.

{
  "&GKLgK+u/aZP57pNX5/8Bu3M67qJxuHpYxS/b+pwShYw=.sha256":-2,
  "&bzju2bHV5nbuXdPDPF5Sro2acjpJhKlj6+Fwha7JD9s=.sha256":-3
}

Can the guide be changed to describe such wants for multiple blobs or am I missing something?
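For context, my reading of the guide's want-and-have section: the numbers in such a wants object are negative and give the distance, in hops, of the peer that wants the blob (-1 means "I want it myself", -2 means "a peer I am connected to wants it", and so on), while a reply announcing possession of a blob uses a positive number giving the blob's size in bytes, e.g. (size illustrative):

{
  "&GKLgK+u/aZP57pNX5/8Bu3M67qJxuHpYxS/b+pwShYw=.sha256": 161699
}

Nothing in that scheme seems to require one blob per message, so describing multi-blob wants and haves in the guide sounds right.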

Metafeed docs feedback

@soapdog https://soapdog.github.io/scuttlebutt-protocol-guide/#metafeeds

Overall structure (sections, paragraphs, order of sentences, structuring of content, diagrams) is perfect! I only found corrections to grammar, tiny technical details etc. Here they are:

A metafeed is a special kind of feed that holds metadata information about multiple feeds. There can be only one active metafeed per identity, and it is only record metadata information about other feeds. Using metafeeds do not break compatibility with clients that only support the legacy feed format. Clients, for compatibility reasons, will always have a main feed in the format described above. They may support multiple additional feeds beyond that main feed, those additional feeds which will be organised and using a metafeed.

I changed this one the most because it's the introduction and we have to emphasize the tree-structure of metafeeds and shouldn't give the impression that there is only one metafeed.

A metafeed is a special kind of feed that only publishes metadata information 
concerning other feeds, called the <i>subfeeds</i>. A metafeed can also contain 
other metafeeds as subfeeds, so they form a tree structure. There can only be 
one <i>root metafeed</i> at the top of this tree. Using metafeeds does not 
break compatibility with clients that only support the classic feed format. 
Clients, for compatibility reasons, have a <i>main classic feed</i> alongside 
the root metafeed. They may support multiple additional feeds beyond that 
<i>main feed</i>, and those additional feeds will be organised under a 
metafeed.

-Metafeeds use a different format than the legacy feed, this new format is 
+Metafeeds use a different format than the classic feed, this new format is 
-called bendy butt. It still is an append-only log of cryptographically signed 
+called Bendy Butt. It still is an append-only log of cryptographically signed 
 messages, but it is not tied to JSON or V8 engine internals like the old format 
-thus making it easier to implement it using other programming languages.
+thus making it easier to implement in other programming languages.

Would be good to use the term "classic" to keep it consistent with specs and implementations elsewhere, and we're not yet deprecating that format, so let's not call it legacy yet.


-A Bendy Butt Message is a Bencoded payload containing a collection with the 
+A Bendy Butt Message is a Bencoded list containing a collection with the 
 following five elements, in this specific order.

"payload" is not a Bencode term, it's just what we call it internally in the code, the Bencode term is a "list".


-id
+ID

Everywhere applicable.


SSB-BFE

The first time this is mentioned it should be a link inline. See e.g. when "Secret Handshake key exchange" is mentioned for the first time, it's a link to the PDF.

I also think you can rename SSB-BFE to BFE everywhere. And the first time it's mentioned, write Binary Field Encodings (BFE) and subsequent mentions are just BFE.


contentSection:

-If the message is not encrypted, This is a Bencoded dictionary containing 
+If the message is not encrypted, This is a Bencoded list containing 
 SSB-BFE encoded payload and signature. If it is SSB-BFE encrypted data, 
-then upon decryption it becomes a Bencoded dictionary of SSB-BFE encoded 
+then upon decryption it becomes a Bencoded list of SSB-BFE encoded 
 data and the signature for the payload.

There are other places where it suggests it's a Bencode dictionary, but it really is a list:

contentSection = [content, contentSignature]

content is a dictionary


type

-Is a SSB-BFE string. It can only be one of metafeed/add/existing, 
+Is a BFE string. It can only be one of metafeed/add/existing, 
-metafeed/add/derived, metafeed/update, metafeed/seed, or metafeed/tombstone.
+metafeed/add/derived, or metafeed/tombstone.

Very important to note that metafeed/seed is published on the classic main feed, not on the Bendy Butt feeds.


The supported types are:

 * metafeed/add/existing: Used to add an existing feed to the metafeed.
 * metafeed/add/derived: Used to add a derived feed to the metafeed.
-* metafeed/update: Used to update a feed information.
-* metafeed/seed: Used to record the cryptographic seed used to generate the metafeed identity. More about this later.
 * metafeed/tombstone: Used to flag that a feed is no longer in use.

-A feed let other peers know that it supports metafeeds by announcing its own 
+A classic main feed can let other peers know that it supports metafeeds by announcing its own 
-metafeed as a message of type metafeed/announce on its main feed.
+metafeed as a message of type metafeed/announce.

-This message is ignored by clients that do not support metafeeds thus 
+This message is ignored by existing apps that do not support metafeeds,
-guaranteeing that metafeeds do not break legacy clients. Clients that 
+such that adding a metafeed is harmless to existing apps. Clients that 
 support metafeeds will discover new metafeeds and subfeeds by looking for 
 such announcements.

-The sample diagram below shows a root feed, a metafeed and the 
+The sample diagram below shows a classic main feed, a root metafeed and the 
 associated subfeeds.

-Traditionally, SSB replicates feeds by fetching the whole feed...
+Traditionally, Scuttlebutt replicates feeds by fetching the whole feed...

To keep the document consistent, so far it has been referring to the protocol as "Scuttlebutt" and there is actually no mention of "SSB" in upstream master.

Mention Tunnels, Rooms, Tor Onion-Services, ...

Hi!
I didn't find a place to ask questions so I ask here.

I am trying to understand the Scuttlebutt protocol, and first of all, congratulations on this awesome project!!

What I don't understand:
Is there something like an RPC call: "Give me the socket for user @...ed25519"? Or another way to do NAT traversal? Or is all non-local traffic relayed through pubs?

Thanks in advance!
Thomas

Possible mistake in createHistoryStream documentation

I noticed an interesting behaviour exhibited by Patchwork. When receiving createHistoryStream requests for sequence n, Patchwork returns messages with sequence >= n. The protocol guide seems to imply that this is incorrect and that messages with sequence > n should be returned instead.

Only return messages later than this sequence number. If not specified then start from the very beginning of the feed.

https://ssbc.github.io/scuttlebutt-protocol-guide/#createHistoryStream

Example: when Patchwork receives a createHistoryStream request with seq set to 10, it will start returning messages from sequence 10. Therefore, if we have an up-to-date feed and ask Patchwork for newer messages, it will always return at least one message - the last message that we already have.

I wonder if this is some kind of a legacy behaviour?

Originally posted here %ptQutWwkNIIteEn791Ru27DHtOsdnbcEJRgjuxW90Y4=.sha256.
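For concreteness, the request in question looks roughly like this (shape assumed from the guide's RPC examples; I am using seq as in this report, and a placeholder feed ID):

{
  "name": ["createHistoryStream"],
  "type": "source",
  "args": [{"id": "@...=.ed25519", "seq": 10}]
}

Under the guide's wording ("later than this sequence number"), this should yield messages starting at sequence 11, whereas Patchwork apparently starts at 10, so either the guide or the implementations need adjusting.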

Order of JSON keys

According to RFC8259 specifying the JavaScript Object Notation (JSON) Data Interchange Format:

An object is an unordered collection of zero or more name/value
pairs, where a name is a string and a value is a string, number,
boolean, null, object, or array.

The protocol guide seems however to assume that the keys have an order:

  • The canonical formatting instructions for signing/hashing do not specify an ordering
  • Regarding the signature it says:

It must be the last entry in the dictionary

Possible resolution:

  • Do nothing: "we know what we mean by JSON, it's not what RFC says but we don't care"
  • Explicitly say that this is "ordered JSON" and that some additional constraints apply on top of the ones specified in the RFC
  • Specify how the keys are to be ordered for the canonical serialization and remove the constraint regarding the position of the signature. This could be done with a transition phase or exception for old content in which the order provided in the message is used.
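To illustrate the constraint being discussed, a sketch of a message value as implementations serialize it; the field order shown here is only illustrative, except that signature is required to come last:

{
  "previous": "%...=.sha256",
  "author": "@...=.ed25519",
  "sequence": 2,
  "timestamp": 1514517078157,
  "hash": "sha256",
  "content": { "type": "post", "text": "hello" },
  "signature": "...=.sig.ed25519"
}

Whatever resolution is chosen, the guide would need to say explicitly that this ordering is significant for signing and hashing, which plain RFC 8259 JSON does not guarantee.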

Invalid secret key length for crypto_sign_detached

I found a problem while implementing the "4. Server accepts" section of the documentation, when computing detached_signature_B.

I think there is something wrong in the documentation.

To my understanding, server_longterm_sk is 32 bytes long, while crypto_sign_detached expects a key that is 64 bytes long.

When I try to implement the secure handshake myself using the sodium-native package, I get the following error:

                    sodium.crypto_sign_detached(detached_signature_B, msg, server_longterm_sk);
                           ^
Error: "sk" must be crypto_sign_SECRETKEYBYTES bytes long
    at Socket.<anonymous> (~/index.ts:146:28)
    at Socket.emit (node:events:369:20)
    at Socket.emit (node:domain:470:12)
    at addChunk (node:internal/streams/readable:313:12)
    at readableAddChunk (node:internal/streams/readable:288:9)
    at Socket.Readable.push (node:internal/streams/readable:227:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23)

Where server_longterm_sk is 32 bytes long and sodium.crypto_sign_SECRETKEYBYTES's value is 64.

Link to documentation: https://ssbc.github.io/scuttlebutt-protocol-guide/#:~:text=key%3A%20server_longterm_sk
Links to libsodium definition of crypto_sign_SECRETKEYBYTES https://github.com/jedisct1/libsodium/blob/6d566070b48efd2fa099bbe9822914455150aba9/src/libsodium/include/sodium/crypto_sign.h#L40 and https://github.com/jedisct1/libsodium/blob/6d566070b48efd2fa099bbe9822914455150aba9/src/libsodium/include/sodium/crypto_sign_ed25519.h#L34

Is there something wrong with the documentation? Or have I misunderstood something?
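My guess (not something the guide states explicitly): the 32 bytes you have is the ed25519 seed, while libsodium's "secret key" for crypto_sign_detached is the 64-byte expansion of seed and public key. A sketch of the expansion step with sodium-native, using placeholder data:

const sodium = require('sodium-native')

// The 32 bytes stored on disk are the ed25519 *seed*, not the full secret key.
const seed = Buffer.alloc(sodium.crypto_sign_SEEDBYTES)                      // 32 bytes (placeholder value)
sodium.randombytes_buf(seed)

// Expand the seed into the 64-byte secret key (seed || public key) that libsodium expects.
const server_longterm_pk = Buffer.alloc(sodium.crypto_sign_PUBLICKEYBYTES)   // 32 bytes
const server_longterm_sk = Buffer.alloc(sodium.crypto_sign_SECRETKEYBYTES)   // 64 bytes
sodium.crypto_sign_seed_keypair(server_longterm_pk, server_longterm_sk, seed)

const msg = Buffer.from('bytes to sign')
const detached_signature_B = Buffer.alloc(sodium.crypto_sign_BYTES)          // 64 bytes
sodium.crypto_sign_detached(detached_signature_B, msg, server_longterm_sk)   // no length error now

If that is indeed the cause, a note in the guide that the long-term secret key means the 64-byte libsodium secret key (or the 32-byte seed plus an expansion step) would avoid the confusion.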

Follow publicity

I want to ask a question about the ssbc protocol, but it's not clear to me where to do that, so I'm trying this repo.
I'm happy to move the conversation elsewhere if this is not the right place.

From https://ssbc.github.io/scuttlebutt-protocol-guide/ at this time:

Feeds can follow other feeds. Following is a way of saying “I am interested in the messages posted by this feed”.

When a user wants to follow another feed their client will post a message to their own feed that looks like:
(json)
Later, if the user decides to unfollow this feed, their client can post another message with following set to false.
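For readers without the guide open, the elided JSON above is a contact message of roughly this shape (feed ID replaced with a placeholder):

{
  "type": "contact",
  "contact": "@...=.ed25519",
  "following": true
}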

A Scuttlebutt feed is a list of all the messages posted by a particular identity. When a user writes a message in a Scuttlebutt client and posts it, that message is put onto the end of their feed.

The messages in a feed form an append-only log, meaning that once a message is posted it cannot be modified

One implementation, Patchwork, shows messages up to 2 hops out by default. Messages from feeds 3 hops out are replicated to help keep them available for others

From all of these quotes combined, my understanding of the protocol as it currently stands is that who you follow is part of your feed, and stays part of it pretty much forever. This means that someone following you knows everyone else you follow.

I see a potential privacy concern here. In some cases, the social graph of a person can be considered private information that indirectly reveals interests or relationships.

This has happened on Facebook, where the platform outed a person to their parent by disclosing a group membership.
In that case, the person's social graph was privacy-sensitive information she had been careful not to reveal to her parents (in this example it's a group, but it might as well have been the leader of this group or several queer people).

There is also the "finsta" phenomenon, where some people create a secondary, more private Instagram account for many purposes, a common one being to be more discreet about who they (actually) follow.

My question is: does the ssbc team acknowledge keeping a person's social graph private as a problem within ssbc scope?
If so, what are the current and future solutions?

All this relies on my current understanding of the ssbc protocol, and I apologize for the noise if I have misunderstood something.

The function nacl_auth is a blackbox

The function nacl_auth in the code snippet is not documented. It is not clear from the document what exactly it does, yet understanding it is crucial to implementing the protocol correctly.
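For what it's worth, nacl_auth appears to be libsodium's crypto_auth, i.e. HMAC-SHA-512-256 producing a 32-byte tag, keyed here with the network identifier. A sketch with sodium-native, using placeholder values and my own variable names:

const sodium = require('sodium-native')

// nacl_auth(msg, key) == crypto_auth: HMAC-SHA-512-256 with a 32-byte output
const network_identifier = Buffer.alloc(sodium.crypto_auth_KEYBYTES)   // 32 bytes, placeholder
const client_ephemeral_pk = Buffer.alloc(32)                           // 32 bytes, placeholder

const hmac = Buffer.alloc(sodium.crypto_auth_BYTES)                    // 32 bytes
sodium.crypto_auth(hmac, client_ephemeral_pk, network_identifier)
// client hello = concat(hmac, client_ephemeral_pk), 64 bytes in total

Spelling this out in the guide (or linking to libsodium's crypto_auth documentation) would make the snippet implementable without guessing.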

Maximum message size

Mention the maximum message size (which effectively restricts the set of valid content objects). Spec entry:

Ssb places a limit on the size of legacy messages. To compute the length of a legacy value, compute the signing encoding (which is always valid unicode), reencode that unicode as utf16, then count the number of code units.

A message can consist of at most 16385 utf-16 code units.
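If you are implementing in JavaScript, the check is a count of UTF-16 code units, which is exactly what String.prototype.length returns; a sketch (function name mine):

// signingEncoding: the string produced by the canonical formatting step
function withinLegacyLimit (signingEncoding) {
  return signingEncoding.length <= 16385   // .length counts UTF-16 code units, not characters or bytes
}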

move into ssbc?

we should probably move this to the ssbc org, to which I have also sent you an invitation!

Order of content object entries

The guide states that "Fields within content can appear in any order but the order must be remembered for later." This is not correct; see the spec on object entry ordering:

  • The order in which to serialize the entries s_i: v_i is not fully specified, but there are some constraints:
  • intuitively: Natural numbers less than 2^32 - 1 are sorted ascendingly
  • formally:
    • a key is called a more-or-less integer key if it is either the string "0", or a nonzero decimal digit (1 - 9 (0x31 - 0x39)) followed by zero or more arbitrary decimal digits (0 - 9 (0x30 - 0x39)) and consists solely of such digits
    • a key is called an int key if it:
      • is a more-or-less integer key
      • its numeric value is strictly less than 2^32 - 1
        • no, there's no off-by one error here: 0xfffffffe is an int key, whereas 0xffffffff is not
        • Why? We have no idea either
    • all entries with int keys must be serialized before all other entries
    • among themselves, the int keys are sorted:
      • by length first (ascending), using
      • numeric value as a tie breaker (the key whose raw bytes interpreted as a natural number are larger is serialized later)
        • note that this coincides with sorting the decimally encoded numbers by numeric value
  • all other entries may be serialized in an arbitrary order
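To make the quoted rules concrete, a sketch of a comparator implementing them (helper names mine, not from the spec); modern Array.prototype.sort is stable, so entries the comparator treats as equal keep their original, arbitrary order:

// true for keys like "0", "7", "4294967294" whose numeric value is < 2^32 - 1
function isIntKey (key) {
  if (!/^(0|[1-9][0-9]*)$/.test(key)) return false
  return Number(key) < 2 ** 32 - 1
}

function compareKeys (a, b) {
  const aInt = isIntKey(a)
  const bInt = isIntKey(b)
  if (aInt && bInt) {
    if (a.length !== b.length) return a.length - b.length   // shorter (fewer digits) first
    return Number(a) - Number(b)                            // numeric value as tie breaker
  }
  if (aInt) return -1   // int keys are serialized before all other entries
  if (bInt) return 1
  return 0              // all other entries stay in whatever order they came in
}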

misleading scalar mult figure

Since the order of the arguments to a scalar multiplication does in fact matter, it would make sense for the right side of the figure under "Shared secret derivation" to show the private key on the left, as is already the case on the left side of the figure.
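For anyone puzzled by the same thing: the asymmetry is that one argument is a scalar (a secret key) and the other is a point (a public key). Both sides of the handshake put their own secret key in the scalar position and still arrive at the same shared secret. A sketch with sodium-native:

const sodium = require('sodium-native')

// two ephemeral curve25519 key pairs, standing in for client and server
const client_sk = Buffer.alloc(sodium.crypto_scalarmult_SCALARBYTES)
const client_pk = Buffer.alloc(sodium.crypto_scalarmult_BYTES)
sodium.randombytes_buf(client_sk)
sodium.crypto_scalarmult_base(client_pk, client_sk)

const server_sk = Buffer.alloc(sodium.crypto_scalarmult_SCALARBYTES)
const server_pk = Buffer.alloc(sodium.crypto_scalarmult_BYTES)
sodium.randombytes_buf(server_sk)
sodium.crypto_scalarmult_base(server_pk, server_sk)

// the scalar (secret key) is always the first input, the point (public key) the second
const shared_ab = Buffer.alloc(sodium.crypto_scalarmult_BYTES)
const shared_ba = Buffer.alloc(sodium.crypto_scalarmult_BYTES)
sodium.crypto_scalarmult(shared_ab, client_sk, server_pk)
sodium.crypto_scalarmult(shared_ba, server_sk, client_pk)
// shared_ab.equals(shared_ba) === true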

Specify the meaning of "character"

The type field allows applications to filter out message types they don’t understand and must be a string between 3 and 52 characters long (inclusive).

The length of the string is counted in utf-16 code units, which is not what you'd expect without additional clarification. Perhaps the guide could state at some central location that due to js-weirdness, all string handling deals with utf-16.
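A one-liner that shows why the distinction matters (plain JavaScript, not from the guide):

'ssb'.length    // 3  -- three characters, three UTF-16 code units
'💬ssb'.length  // 5  -- the emoji alone is two code units (a surrogate pair)

So a type string containing astral-plane characters is "longer" than it looks when the limit is checked.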

Hash computation utf-16 weirdness

The hash for determining a message id is not created over the utf-8 encoding of the json, but rather does some weird things:

To compute the hash of a legacy value, you can not use the signing encoding directly, but the hash computation is based on it. The signing encoding always results in valid unicode. Represent this unicode in utf-16. This encoding is a sequence of code units, each consisting of two bytes. The data to hash is obtained from these code units by only keeping the less significant byte.

Example: Suppose you want to compute the hash for "ß" (including the quotes); the corresponding utf8 is [0x22, 0xC3, 0x9F, 0x22]. In big-endian utf16 this is [(0x00, 0x22), (0x00, 0xDF), (0x00, 0x22)], and in little-endian utf16 it is [(0x22, 0x00), (0xDF, 0x00), (0x22, 0x00)]. In both cases, the sequence of less significant bytes per code unit is [0x22, 0xDF, 0x22]. That is the byte array over which to compute the hash.
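In JavaScript this low-byte trick is only a couple of lines, since charCodeAt already yields UTF-16 code units; a sketch (function name mine; SHA-256 because that is what message IDs use):

const crypto = require('crypto')

// signingEncoding: the string produced by the canonical formatting step
function legacyHash (signingEncoding) {
  const bytes = Buffer.alloc(signingEncoding.length)
  for (let i = 0; i < signingEncoding.length; i++) {
    bytes[i] = signingEncoding.charCodeAt(i) & 0xff   // keep only the less significant byte
  }
  return crypto.createHash('sha256').update(bytes).digest()
}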

Information on what steps are necessary for replication

From the guide it is not clear which steps are necessary to have a server replicate a client's feed.

From my experience:

  • client does a proper SHS handshake with the server

  • client sends RPC calls:

    • gossip.ping and a timestamp
    • blobs.createWants
    • createHistoryStream with client's ID
  • server replies with:

    • gossip.ping and a timestamp
    • an answer to the 'blobs.createWants', possibly just '{}'
    • an answer to the 'createHistoryStream', possibly just 'true'
  • then the client sends:

    • createHistoryStream requesting any feed the server is interested in

This logic is not properly described in the guide so far, but it is crucial knowledge when implementing a new ssb server.

Edit:
Should the server initiate its own connection in addition to the existing connection initiated by the client?
Are important parts missing?
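For reference, the blobs.createWants call in the lists above is opened like this (shape as in the guide's RPC examples; the ping and createHistoryStream requests are sketched under the earlier issues):

{"name": ["blobs", "createWants"], "type": "source", "args": []}

A short "minimal replication session" walkthrough in the guide, covering exactly this sequence, would make the expected behaviour much clearer.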

Explicit binary values to make implementing easy

Hi, I am going through the guide and am currently at the "shared secret derivation" step.
The biggest problem I am finding is that the code examples are all tied to libsodium terms.

I am currently implementing in erlang using the crypto library and it has been a challenge so far.

It would be really helpful if the guide had an example with values to step through. i.e.

Given client_public_ephemeral_key "abc121..." and network identifier "a1a...", then the client hello would be "999...".


As an additional note: it took me a while to realize the network identifier on the page is in hex and not base64 encoded. https://github.com/ssbc/scuttlebutt-protocol-guide/blob/master/index.html#L238

Broken Links

I just realized that the links to the go implementation still point to my username, before we moved to the cryptoscope org.

I wrote this 3-liner to get a list of links and their status - might be cool to have it as a commit hook or something, but since there were no build targets or makefiles I didn't make a PR yet:

# list every https link in index.html and print the HTTP status code it returns
cat index.html | pup 'a attr{href}' | grep https | while read url
do
    code=$(curl -s -o /dev/null -I -w "%{http_code}" "$url")
    echo "$url: $code"
done

(the pup tool to use css selectors in HTML can be found here)

Here is the output it gave me:

Feed signature JSON stringify

When creating the signature of a feed message, the guide mentions to apply JSON.stringify with a series of guidelines on how to pretty print JSON.

Those guidelines do not indicate which line ending characters to use: \n or \r\n?
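For what it's worth, ECMAScript pins this down: JSON.stringify with an indent argument joins lines with a bare line feed, so the answer should be \n, never \r\n. A quick check:

JSON.stringify({ a: 1 }, null, 2) === '{\n  "a": 1\n}'   // true

Stating that explicitly in the guide would still help implementers working outside JavaScript.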

Timestamp clarification

Time the message was created. Number of milliseconds since 1 January 1970 00:00 UTC.

Perhaps there should be a clarification of the datatype (64-bit float) here. When I first implemented this in Rust, I fully expected timestamps to be integers, but that's not the case. There are a bunch of non-integer timestamps out there, even in ranges where 64-bit floats have enough precision to represent all integers exactly. That's because, I think, places past the decimal point have been used to ensure monotonically increasing timestamps.
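For illustration, the fractional timestamps come from patterns like the one below (a paraphrase of what the monotonic-timestamp npm package does, not something the guide specifies):

let last = 0
function timestamp () {
  let now = Date.now()
  if (now <= last) now = last + 0.001   // bump by a fraction within the same millisecond
  last = now
  return now
}

So the guide should probably say the timestamp is a JSON number that implementations treat as a 64-bit float, not necessarily an integer.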
