There are a lot of open questions about how best to handle encrypted group chat. The basic issue is a tradeoff between simplicity and efficiency. I'm going to outline a couple of proposals below:
Client-side fanout
There are a couple variants on this, and I'm not happy about any of them.
The basic idea is that group chats are a client-side detail; the server is only there for PKI and offline delivery.
This will reliably be terrible for large groups, and it makes even pairwise conversations awkward: our multi-device plans don't involve shared keys between devices, so in practice every chat is a group chat.
There are partial solutions here, such as generating a new key to encrypt your plaintext, letting the server fan out the resulting ciphertext to everyone you want to send to, and re-encrypting the key individually for each of them. But this is awkward and terrible. Let's not do it.
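To put rough numbers on why plain client-side fanout hurts for large groups, here is a back-of-the-envelope sketch in Python. The function names are hypothetical, the 48-byte wrapped-key size (a 32-byte key plus a 16-byte tag) is an assumption, and per-message framing overhead is ignored.

```python
def upload_bytes_pairwise(msg_len: int, n_devices: int) -> int:
    """Naive client-side fanout: one full ciphertext per recipient device."""
    return msg_len * n_devices


def upload_bytes_wrapped_key(msg_len: int, n_devices: int, wrapped_key_len: int = 48) -> int:
    """Partial solution: one ciphertext for the server to fan out,
    plus one individually encrypted copy of the message key per device.

    wrapped_key_len is an assumed size, not something the proposal fixes.
    """
    return msg_len + wrapped_key_len * n_devices


# A 10 kB message sent to a group with 200 recipient devices.
naive = upload_bytes_pairwise(10_000, 200)
wrapped = upload_bytes_wrapped_key(10_000, 200)
```

Under these assumptions the sender's upload still grows linearly with the number of devices in both variants; the wrapped-key variant only shrinks the per-device cost from the message size to the key size.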
Sender keys
Sender keys are a decent solution to this: for every group chat, each device has its own ratchet, known to all other members of the group. This allows full server-side fanout, saving client CPU time and bandwidth. But to maintain forward security, every device has to generate new keys every time any conversation is updated, which is far from ideal. Also, a naive implementation would be vulnerable to spoofing attacks: if Alice, Bob, and Carol are all in a group together, they all know Bob's symmetric key for that group, so Alice could pretend to send messages as Bob.
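To make the per-device ratchet concrete, here is a minimal sketch of a symmetric hash ratchet in Python. The HMAC-SHA256 construction and the 0x01/0x02 labels are assumptions borrowed from common practice, not something this proposal pins down, and ratchet_step is a hypothetical name. The sketch also shows the spoofing problem: anyone who holds Bob's chain key derives the same message keys Bob does.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Advance a symmetric ratchet one step.

    Returns (message_key, next_chain_key). Both are derived one-way from
    chain_key, so a later chain key does not reveal earlier message keys.
    """
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key


# Bob's sender chain key for one group, already shared with Alice and Carol.
# (Derived from a fixed string only so the example is reproducible.)
bob_chain_key = hashlib.sha256(b"example seed, not a real key").digest()

# Bob derives the key for his next message...
bob_mk, bob_next = ratchet_step(bob_chain_key)
# ...and Alice, holding the same chain key, derives exactly the same key.
alice_mk, alice_next = ratchet_step(bob_chain_key)

same_key = hmac.compare_digest(bob_mk, alice_mk)
```

Since symmetric keys alone cannot tell Bob from Alice, a non-naive version would presumably also need each message signed with a per-device signing key.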
Blockchain
Each channel is associated with its own blockchain. Hear me out, this is less stupid than it sounds. Each blockchain starts with some root shared-secret, and proceeds by proof-of-knowledge (I'll explain more below). Each block in this chain consists of:
a) A random N-byte nonce (where N is probably 32, but could be as low as 8)
b) The H-byte hash of a previous block, B (where H is probably 32 and shouldn't be lower than 16)
c) The generation of the current block, equal to B.c
d) The current length of the chain, equal to B.d + 1
e) A signature by a member of the group of (a, b, c, d)
f) Ciphertext of the intended message, encrypted in AEAD fashion, where the associated data is (a, b, c, d) and the key is generated by kdf(a, B.b).
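The block layout above can be sketched as a small Python data structure. This is an illustration under stated assumptions, not a wire format: SHA-256 as the block hash, HMAC-SHA256 standing in for kdf (keyed by B.b, with the nonce as input), 8-byte big-endian integers, and the names Block, message_key, and next_header are all choices made for the example. The signature and ciphertext are treated as opaque bytes.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

NONCE_LEN = 32  # N in the text
HASH_LEN = 32   # H in the text


@dataclass(frozen=True)
class Block:
    nonce: bytes       # (a) random N-byte nonce
    prev_hash: bytes   # (b) H-byte hash of the previous block B
    generation: int    # (c) equal to B.c within a generation
    length: int        # (d) equal to B.d + 1
    signature: bytes   # (e) signature over (a, b, c, d); opaque here
    ciphertext: bytes  # (f) AEAD ciphertext; opaque here

    def header(self) -> bytes:
        """Serialize (a, b, c, d): the signed data and the AEAD associated data."""
        return (self.nonce + self.prev_hash
                + self.generation.to_bytes(8, "big")
                + self.length.to_bytes(8, "big"))

    def block_hash(self) -> bytes:
        # Assumption: the hash covers every field of the block.
        return hashlib.sha256(self.header() + self.signature + self.ciphertext).digest()


def message_key(nonce: bytes, prev: Block) -> bytes:
    """kdf(a, B.b): key for a new block from its nonce and B's own prev_hash."""
    return hmac.new(prev.prev_hash, nonce, hashlib.sha256).digest()


def next_header(prev: Block) -> tuple[bytes, bytes, int, int]:
    """Fields (a, b, c, d) for a block that extends prev in the same generation."""
    return (secrets.token_bytes(NONCE_LEN), prev.block_hash(),
            prev.generation, prev.length + 1)


# A placeholder root block, then the header and key for the block after it.
root = Block(secrets.token_bytes(NONCE_LEN), bytes(HASH_LEN), 0, 0, b"", b"")
a, b, c, d = next_header(root)
key = message_key(a, root)
```

Only a device that knows B.b can recompute the key, which is one reading of the proof-of-knowledge idea; the nonce makes the key differ between sibling blocks that extend the same B.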
Every so often the chain restarts from a new random key that is sent via client-side fanout, to preserve the self-healing properties of the double ratchet. The first block after a restart has its generation incremented and contains the hash of a block in the previous generation.
Devices prefer to add new messages to the longest chain they currently know of, where any chain in a newer generation is longer than a chain in a previous generation.
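The preference rule amounts to comparing chain tips by generation first and length second. A minimal Python sketch, where Tip and preferred are hypothetical names and the tie-break between equal tips is left unspecified, as it is in the text:

```python
from typing import NamedTuple


class Tip(NamedTuple):
    generation: int  # field (c) of the newest block on this chain
    length: int      # field (d) of the newest block on this chain


def preferred(tips: list[Tip]) -> Tip:
    """Pick the chain to extend: newest generation first, then greatest length."""
    return max(tips, key=lambda t: (t.generation, t.length))


# The generation-3 chain has the most blocks, but any generation-4 chain wins.
tips = [Tip(generation=3, length=120),
        Tip(generation=4, length=118),
        Tip(generation=4, length=119)]
best = preferred(tips)
```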
Eventually blocks should be deleted, though exactly when is an open question. It's probably safe to do so once every device has ack'd a downstream block, but I'm not sure of that.
It should definitely be safe to delete all blocks in generation g after everyone has ack'd a block in generation g+1.
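The generation-based rule is easy to state as code. This Python sketch assumes generations are numbered from 0 and that an ack for a block in a later generation counts as an ack for every earlier one; deletable_through and the device names are hypothetical.

```python
def deletable_through(acked: dict[str, int]) -> int:
    """Return g such that every block in generations <= g may be deleted.

    acked maps device id -> newest generation in which that device has
    ack'd a block. Returns -1 when nothing is safe to delete yet.
    """
    if not acked:
        return -1
    # Generation g is safe once every device has ack'd something in g+1 or later.
    return min(acked.values()) - 1


acks = {"alice-phone": 5, "bob-laptop": 4, "carol-phone": 6}
safe = deletable_through(acks)
```

Here the slowest device has ack'd a block in generation 4, so generations 0 through 3 can go, and generation 4 has to stay until that device catches up.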