grumpyoldtroll / draft-jholland-quic-multicast

Work in progress to propose a multicast extension to quic.

License: Other

Makefile 100.00%

draft-jholland-quic-multicast's Introduction

Multicast Extension for QUIC

This is the working area for the individual Internet-Draft, "Multicast Extension for QUIC".

Contributing

See the guidelines for contributions.

Contributions can be made by creating pull requests. The GitHub interface supports creating pull requests using the Edit (✏) button.

Command Line Usage

Formatted text and HTML versions of the draft can be built using make.

$ make

Command line usage requires that you have the necessary software installed. See the instructions.

draft-jholland-quic-multicast's People

Contributors

maxf12 samhurst lpardue momoka0122y louisna squarooticus

draft-jholland-quic-multicast's Issues

Session already has a meaning in QUIC

"Session" already has a meaning in QUIC. It appears to be implicitly imported via TLS, c.f.:

I'm thinking maybe we can search & replace "session" with "channel"?

"Channel" also has a meaning in SSM (c.f. https://www.rfc-editor.org/rfc/rfc4607.html#page-4), but using it here would be complementary instead of divergent. ("Channel" in SSM refers to a network path, and likewise if we used it instead of "Session" in the QUIC extensions it would refer to a network path and associated objects...)

remove MC_PATH_RESPONSE

We don't need the PATH_CHALLENGE/MC_PATH_RESPONSE logic, I think. We're anchored by the unicast connection, and we get acks for received packets.

Michael's review - Add text for graceful degradation

Regarding graceful degradation to unicast:

I’d expect this spec to tell me how... but maybe in a future version.

Clarify that joins are optional in the Channel Management section

From Sam Hurst:
Be more explicit in the "Channel Management" section that the client is not obligated to join the multicast channel once it receives an MC_SESSION_JOIN frame. If it chooses not to, it has the option to send an MC_SESSION_STATE_CHANGE frame with the "declined join" reason, but that isn't actually mentioned in the session management section.

Client behavior when unicast connection is disrupted

How would a client behave if the unicast connection is interrupted but the multicast channels still receive data? It wouldn’t be able to check the integrity of any packets, but it isn’t exactly idling either, so I guess the idle connection timeout would not trigger? Should this be clarified so that all multicast sessions are left and the entire connection is shut down if there is no message over the unicast connection for longer than the idle timeout?

Do bitfields really need to be up to 62 bits long?

In a few places there is Capabilities Flags (i), which is a varint that can provide between 6 and 62 bits. Given that there are only two things currently defined, this is a slight waste of space if implementations don't minimally encode. It also introduces potential interop issues if you don't define MSB or LSB. If we stick with that design, some more text would probably be required.

A different approach here is to just list out the bits e.g. assuming 8-bit alignment is desirable, the smallest thing we could do is

IPv4 capable (1),
IPv6 capable (1),
Reserved(6),

this would give back 2 bits otherwise consumed by the varint encoding, and allow up to 6 more future capabilities.

There's a tradeoff either way, but given that these fields won't be sent often, I don't think we need to over-optimize.
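To make the fixed-layout alternative concrete, here is a minimal sketch of packing the two capability bits into one byte. The bit order (first field in the most significant bit, as in RFC 9000 header diagrams) and the function names are assumptions for illustration; the issue itself leaves MSB/LSB ordering open.

```python
# Assumed layout: fields are packed most significant bit first,
# as in RFC 9000 header diagrams. The issue leaves bit order undefined.
IPV4_CAPABLE = 0x80   # first 1-bit field
IPV6_CAPABLE = 0x40   # second 1-bit field
RESERVED_MASK = 0x3F  # remaining 6 reserved bits


def encode_capabilities(ipv4: bool, ipv6: bool) -> bytes:
    """Pack the two capability bits into a single byte; reserved bits are zero."""
    value = (IPV4_CAPABLE if ipv4 else 0) | (IPV6_CAPABLE if ipv6 else 0)
    return bytes([value])


def decode_capabilities(data: bytes) -> tuple:
    """Return (ipv4, ipv6, reserved) read from the first byte of data."""
    byte = data[0]
    return bool(byte & IPV4_CAPABLE), bool(byte & IPV6_CAPABLE), byte & RESERVED_MASK
```

With this layout all 8 bits of the byte are usable, whereas a one-byte varint spends 2 bits on its length prefix and leaves 6.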

AMT

This might be too much for the initial document and could be an expansion, but it might be useful to add a frame that contains information about available AMT relays for the channel in question. That way you could still distribute load better in cases where e2e native multicast isn't supported by utilizing as much multicast as possible (up until the relays) and reduce the load on the server from having many unicast connections.

I guess this idea is similar to having a fan out, but it might be an interesting alternative. You might even be able to do something clever by determining which relay is best suited by the client's IP or something along those lines and then just including that.

I guess in theory this could be done by simply adding an AMT relay field to the MC_CHANNEL_PROPERTIES.

MC_STATE: consider using MACRO_CASE

Style comment. The state and reason fields are variable-length integers, which makes them quite like RFC 9000 Error Codes. For consistency, my suggestion is to use MACRO_CASE for the values defined in this document. E.g., instead of Property Violation use PROPERTY_VIOLATION. This can make it easier to slot into existing enum-based handling or logging systems.

Michael's review - Give more details about the unicast connection

I have a feeling that more should be said about this unicast connection. Some constraints perhaps… can it migrate? Can it use multiple streams? Can it be used for traffic that’s not bound to the signaling for the multicast stuff here?

Also: here, and in most places, you call it “the unicast connection”, which makes quite clear that there is one only. That’s good! But in some places, it says “a unicast connection” which then confuses me. There IS only one which is associated with this multicast stuff, right?

Be consistent about saying if frames are sent on channels or connections

Most frames just state that they are sent "From server to client", while e.g. MC_SESSION_LEAVE says "from server to client in either the unicast connection or a channel".

Either remove the specifier from everywhere or, and probably better, add it to everywhere.

Retiring a channel

Two things that might need clarification:

- Retiring a channel that is currently joined should probably force an automatic leave
- Should there be a Retired option for the MC_CHANNEL_STATE frame so the server knows if a channel was retired successfully?

ack-eliciting packets

Should the draft state that packets containing any of the frames sent over the multicast stream are not ack-eliciting? I assume the intention is to not send ACK frames on the multicast streams but rather use MC_SESSION_ACK. Should there then also be a non-ack-eliciting MC_DATA frame that carries the multicast data? Otherwise, if STREAM frames are used, they would also be ack-eliciting.

clarify how the secret in the MC_KEY frame is used

Add flow diagrams

Flow diagrams like these should show the important parts for channel use examples and who they're sent by:

move aead algorithm to CHANNEL_ANNOUNCE

Suggested change from @squarooticus :

Announce the payload AEAD algorithm in CHANNEL_ANNOUNCE and just assume it isn't going to change for the lifetime of the channel. This probably works better with existing QUIC stacks that assume only the key, rather than the alg, will change during a connection.

Max streams

RFC 9000, section 4.6: `Only streams with a stream ID less than "(max_streams * 4 + first_stream_id_of_type)" can be opened`

Since each session has its own stream ID space a different restriction might need to be specified to make sure the combination of all streams across all sessions does not exceed the limit set by max_streams.
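As a worked illustration, the RFC 9000 rule is that a stream can be opened only if its ID is less than max_streams * 4 + first_stream_id_of_type. The sketch below checks that rule and then shows one possible reading of an aggregate restriction across sessions. The aggregate rule and the helper names are hypothetical; the draft does not define them.

```python
SERVER_UNI = 0x3  # RFC 9000: first stream ID of server-initiated unidirectional streams


def can_open(stream_id: int, max_streams: int, first_stream_id_of_type: int) -> bool:
    """RFC 9000 section 4.6 check for a single stream ID space."""
    if stream_id % 4 != first_stream_id_of_type:
        raise ValueError("stream ID does not belong to this stream type")
    return stream_id < max_streams * 4 + first_stream_id_of_type


def streams_used(highest_stream_id: int) -> int:
    """Number of streams of one type opened up to and including highest_stream_id."""
    return highest_stream_id // 4 + 1


def within_aggregate_limit(highest_ids: list, max_streams: int) -> bool:
    """Hypothetical rule: streams summed over all sessions must not exceed max_streams."""
    return sum(streams_used(h) for h in highest_ids) <= max_streams
```

For example, with max_streams = 3, two sessions whose highest stream IDs are 7 and 3 use 2 + 1 = 3 streams and fit, while two sessions that both reach stream ID 7 use 4 and do not, even though each session passes the per-space check on its own.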

Remove STREAM_BOUNDARY

Good question on what this is and how it's meant to be used from Sam. Rough answer I sent:

It's there because we have shared channels and they might carry streams, and it’s useful to allow the streams to be long-lived. It’s possible (and I think useful) for a client to start processing an in-progress stream starting at the boundary of an HTTP push for instance, or at a message boundary for another higher-layer protocol, but without this stream boundary the client doesn’t know at what byte offset inside the stream it’s safe to start processing data (if a client started trying to parse the stream in the middle of the HTTP push, it wouldn’t know how to interpret it).

In addition to h3 server push (and maybe webtransport?), I think this is useful for the latest moq-related proposals like RUSH (https://datatracker.ietf.org/doc/draft-kpugin-rush/) and WARP (https://datatracker.ietf.org/doc/draft-lcurley-warp/).

Add a section on recovery

In general, retransmits for the frames with reliability in dropped packets on the multicast channels can happen on any channel that reaches the client, either over the unicast connection or the same or a different multicast channel. Server has responsibility to give the client all the frames with reliability (any of the control frames, as well as STREAM frames).

Also possible to do a STREAM_RESET if a server gives up on some block of data. When this happens, it's possible for the client to pick the stream up again later after a STREAM_BOUNDARY.

A section covering this will probably also need to cover #16 (about how to handle limits on how much data server will buffer before abandoning a client that's not keeping up), so maybe these issues should be merged, or have different subsections. But something reasonably complete is going to be needed on this topic.

Add an Application use cases sub-section?

I thought I'd capture here some iffy text that I wasn't happy enough with to include yet, but might be worth finishing to fill in the "Pool/ Channel management on the server side" part from #61 that I skipped when closing that issue.

The goal really is to make it clear what kinds of channel assignment strategies you'd want for a few different use cases, which I don't think this text quite achieves.

I'm not sure it's worth it, it seems like a lot of background in order to cover some non-normative example stuff that might be better done by an example implementation outside the spec, but I thought I'd leave this here for a bit more discussion instead of just dropping the "pool/Channel management" suggestion.

Maybe it would be better with a slightly different direction I just haven't thought of properly, I'm not sure. But this is what my interpretation started out typing, until it kinda felt like maybe more of a can of worms than it's worth, so I thought I'd dump it in here for possibly a later rev.

Please do suggest direction or text here if you think there's something worth filling out and incorporating, but I'm kind of inclined to drop it at the moment.

Application Use Cases {#application-use-cases}

This section covers some potentially useful applications and some of their deployment considerations if using multicast QUIC for the data delivery.

Live Adaptive Bitrate Media

Adaptive bitrate media (for example using HLS {{RFC8216}} or DASH {{MPEG-DASH}}) that's either live or has high concurrency among end users (for example right after release of a new episode of a popular show) can benefit from leveraging multicast delivery because many end users are trying to consume the same data at roughly the same time.

However, adaptive bitrate systems deliver different media segments to different end users according to the bitrate their network connections can support under the changing network conditions detected by the client.

Useful techniques:

- send each segment in a separate stream, so that there's no dependency on prior content in a stream that might not have been delivered to a client that switched bitrates
- deliver the most popular or highest bitrate content over multicast, and deliver lower bitrates or bitrates with fewer consumers over unicast.

Pre-loading Popular Video Clips

Video clips that will predictably be delivered to many users can usefully be delivered and cached with multicast QUIC, then served from cache when appropriate.

For example, advertisements for a cohort of end users or "trending" video clips for social media applications that have a wide audience might be delivered on a multicast channel that the server asks clients to join according to group memberships they have within the social media application. Then as users scroll their media feed, the advertisements or the popular video clips that are already pre-positioned can be inserted in the feed where appropriate without new network traffic (playing the content from a local cache), even though the users' consumption of these clips may not be synchronized.

File Transfer

Existing unidirectional file transfer protocols based on ALC {{RFC5775}} and FEC {{RFC6363}} can leverage multicast to send the same file to many end users even though they might start at different times and consume the data at different rates.

Add section on what frames are required to be retransmitted

Jake's comment:

Other kinds of frames besides streams also need to be retransmitted (like RESET_STREAM, as well as MC_INTEGRITY, MC_RETIRE, MC_LEAVE). Likewise other kinds of frames besides DATAGRAM frames don't get retransmitted (like PING and PADDING). I tried to leave it generic, but maybe we need a reference to https://www.rfc-editor.org/rfc/rfc9000.html#section-13.3 and https://www.rfc-editor.org/rfc/rfc9221.html#section-5.2 plus a similar section in this doc to list what gets retransmitted?

Move everything other than key to Announce frame

MC_CHANNEL_LEAVE text incomplete

Write something more coherent for the MC_CHANNEL_LEAVE text.

MC_STATE: don't overload connection and application Reason code spaces

The MC_STATE Reason field provides a shared code space for connection and application errors. This is quite different from how RFC 9000 does things, whereby CONNECTION_CLOSE indicates connection or application errors via the Frame type. This separation allows all transport concerns to be managed by one set of experts, while delegating application concerns to a different set of experts. That scales well.

I don't see much strong reason to deviate from the design QUIC already uses. The example in https://www.ietf.org/archive/id/draft-jholland-quic-multicast-01.html#section-12.1.1 highlights that an H3 implementation would need to model another error code 0x1000108 somehow, and it would be awkward to figure out where that actually gets registered.

So in summary, I suggest just defining two types of MC_STATE frame

Initial timeout

The initial time to join a channel can be significantly longer than the value in max idle time (as that is intended as the time to detect a disruption of an already established channel) as the join has to be propagated and the multicast tree constructed.

There could probably be an additional field in MC_CHANNEL_PROPERTIES (something like max establishment time) or just general guidance and a recommendation on the time it might take.

Make at least one server-side state diagram

possibly we need 2, there's a global state of channel existence, plus a client-specific state of joined/left. But maybe it can fit in one diagram, not sure yet.

Delivery of MC_SESSION_PROPERTIES over multicast

Section 9.1:

An MC_SESSION_PROPERTIES frame (type=TBD-01) is sent from server to client, either with the unicast connection or in an existing joined multicast session.

Is there any reason that this couldn't always be delivered just over the unicast stream? As the AEAD key and algorithm are mutable, I think this might expose an unnecessary attack vector. By only having it on the unicast stream it's safer for both the server (as it knows all recipients that will get it) as well as the client (as it knows it comes from the legitimate source).

Mandate the shortest possible encoding of some varint fields?

There are a few fields that are variable-length integers, which are used to communicate only a handful of defined types. I'm guessing this is done to hedge some bets against future growth; that seems ok. But that leaves the door open to implementations doing annoying things like encoding 0x1 in 62 bits.

RFC 9000 includes a requirement that frame types are encoded using the smallest integer encoding https://www.rfc-editor.org/rfc/rfc9000.html#section-12.4-18. We might want to levy a similar requirement in this document.
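A minimal sketch of what such a requirement would mean for a receiver, using the variable-length integer encoding from RFC 9000 section 16. The function names are illustrative, and whether a non-minimal encoding should be rejected is exactly the open question in this issue.

```python
def varint_length(value: int) -> int:
    """Shortest QUIC varint length (1, 2, 4 or 8 bytes) that can hold value."""
    for length in (1, 2, 4, 8):
        if value < 1 << (8 * length - 2):
            return length
    raise ValueError("value does not fit in 62 bits")


def encode_varint(value: int, length: int = 0) -> bytes:
    """Encode value; length=0 picks the shortest form, otherwise force 1, 2, 4 or 8."""
    if value < 0:
        raise ValueError("varints are unsigned")
    minimal = varint_length(value)
    if length == 0:
        length = minimal
    if length not in (1, 2, 4, 8) or length < minimal:
        raise ValueError("invalid length for this value")
    prefix = {1: 0, 2: 1, 4: 2, 8: 3}[length]  # two-bit length prefix
    return (value | (prefix << (8 * length - 2))).to_bytes(length, "big")


def decode_varint(data: bytes) -> tuple:
    """Return (value, bytes consumed) for the varint at the start of data."""
    length = 1 << (data[0] >> 6)
    if len(data) < length:
        raise ValueError("truncated varint")
    value = int.from_bytes(data[:length], "big") & ((1 << (8 * length - 2)) - 1)
    return value, length


def is_minimally_encoded(data: bytes) -> bool:
    """True if the varint at the start of data uses the shortest possible form."""
    value, length = decode_varint(data)
    return length == varint_length(value)
```

Here `encode_varint(1)` yields one byte, while `encode_varint(1, 8)` yields the padded eight-byte form that `is_minimally_encoded` flags.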

Clients should be able to tell if data came from multicast or unicast

Last sentence of section 2 reads:

An application using a multicast-capable QUIC implementation that receives a datagram or stream data has no knowledge at the application layer whether multicast was used or not used for that data, it will only know it has received unidirectional server-to-client application data.

As the security and privacy guarantees for data that has been delivered over multicast are (inherently) lower than over unicast, I feel like this could cause serious issues for applications. I think that any implementation that supports multicast delivery has to at least make it clear to the application that the data has been delivered over multicast, or even better yet only allow multicast delivery if the application opted in for it.

Is it always necessary to acknowledge all STREAM frames?

Following QUIC mechanics, every single STREAM frame has to be acknowledged. This is done so that missing frames can be retransmitted. However, in Multicast there are several use cases where a retransmission of missing frames might not be desired (such as live video streaming) and having to acknowledge every packet creates unnecessary overhead (which at Multicast scales could add up to be quite significant). I think it might be useful to have a mechanic that replaced the acknowledgement of every frame immediately with either a bundled acknowledgment of several frames that only occurs (relatively) rarely or even just an acknowledgment that (any) data is still being received over the Multicast Channel. I guess a high max_ack_delay could be used to bundle acknowledgments, but that is set by the client. There is also not (yet) the inclusion of a mechanic that would allow for different transport properties between the unicast connection and the Multicast channels, so you would be stuck with a high max_ack_delay for unicast frames as well.

Encrypt-then-mac is recommended

From @squarooticus:
§ 7.1 is going to trigger some folks who have ETM (encrypt-then-mac) on the brain to avoid leaking information, so some reassurance that the packet hashes are in the encrypted stream would probably suffice to prevent this reaction.

Response from Jake:

I guess the flow here if the hash is on the encrypted packet is:

1. hash the packet with the channel's hash algorithm
2. decrypt the packet (or at least the header) so you have the packet number
3. check the hash, reject if it doesn't match
4. parse the packet and accept it (provided it doesn't trigger protocol errors, etc.)

Would that work better?
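The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: SHA-256 stands in for the channel's hash algorithm, `expected_hashes` is a hypothetical map from packet number to the hash delivered in integrity frames, and `decrypt` is a caller-supplied placeholder for the real packet protection removal.

```python
import hashlib
import hmac
from typing import Callable, Mapping, Optional, Tuple


def accept_packet(
    raw: bytes,
    expected_hashes: Mapping[int, bytes],
    decrypt: Callable[[bytes], Tuple[int, bytes]],
) -> Optional[bytes]:
    """Return the decrypted payload if the packet hash matches, else None."""
    # 1. Hash the encrypted packet (SHA-256 is an assumed channel hash algorithm).
    digest = hashlib.sha256(raw).digest()
    # 2. Decrypt (placeholder callable) to learn the packet number.
    packet_number, payload = decrypt(raw)
    # 3. Check the hash against the value expected for that packet number.
    expected = expected_hashes.get(packet_number)
    if expected is None or not hmac.compare_digest(expected, digest):
        return None
    # 4. Hand the payload on for frame parsing.
    return payload
```

Because the hash covers the encrypted bytes, a modified packet is rejected in step 3 before any of its frames are parsed.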

Michael's review - Rephrase part of section 11.2

Regarding this part:

From Packet Number and Until Packet Number are used to indicate the packet number (Section 17.1 of [RFC9000]) of the 1-RTT packets received over which these values are applicable.

His comment is:

This is hard to understand, and I think it’s to do with “these values”. Replace with “the fields of the MC_CHANNEL_PROPERTIES frame”, perhaps??

MC_SESSION_INTEGRITY

For type TBD-05, Length is present and is a count of packet hashes. For TBD-04, Length is not present and the packet hashes extend to the end of the packet.

Should this read "[...] extend to the end of the session."?

Channel IDs should be carried in frames like Connection IDs

Connection ID with a required length, e.g. https://www.rfc-editor.org/rfc/rfc9000#name-new_connection_id-frames:

NEW_CONNECTION_ID Frame {
...
  Connection ID (8..160),
...
}

Instead of (i):

MC_CHANNEL_INTEGRITY Frame {
...
  Channel ID (i),
...
}
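A minimal sketch of the Connection-ID-style alternative, assuming a one-byte length followed by 1 to 20 bytes of Channel ID, mirroring the Length (8) and Connection ID (8..160) fields of NEW_CONNECTION_ID. The function names and the exact bounds are assumptions for illustration, not text from the draft.

```python
MAX_CHANNEL_ID_LEN = 20  # 160 bits, matching the Connection ID (8..160) bound


def encode_channel_id(channel_id: bytes) -> bytes:
    """Prefix the Channel ID with a one-byte length."""
    if not 1 <= len(channel_id) <= MAX_CHANNEL_ID_LEN:
        raise ValueError("channel ID must be 1 to 20 bytes long")
    return bytes([len(channel_id)]) + channel_id


def decode_channel_id(data: bytes) -> tuple:
    """Return (channel_id, remaining bytes) from a length-prefixed encoding."""
    if not data:
        raise ValueError("missing length byte")
    length = data[0]
    if not 1 <= length <= MAX_CHANNEL_ID_LEN:
        raise ValueError("invalid channel ID length")
    if len(data) < 1 + length:
        raise ValueError("truncated channel ID")
    return data[1:1 + length], data[1 + length:]
```

Treating the ID as opaque bytes rather than a varint lets it be compared byte for byte with a connection ID of the same length.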

How is a Stateless Reset handled?

https://www.rfc-editor.org/rfc/rfc9000#section-10.3

Replace Max Idle Time with Max Ack Delay

I think we talked about this but forgot to open an issue:

Since the server knows if/ when reception of multicast is interrupted (by no longer getting ACKs) there is no need for clients to leave channels unilaterally if they think the channel timed out.

Add something like a MC_RESERVE_SESSIONIDS frame.

From Max in issue #4 comment:
#8 (comment)

There should probably be a mechanic to force clients to not use some connection IDs for unicast connections. I think only doing it in the MC_SESSION_PROPERTIES frame is too late, it should probably be done immediately after the handshake of the unicast connection. It's basically a frame telling the client "Do not use any of these connection IDs for your unicast streams as we might have a multicast session that uses this ID. If you (by sheer bad luck) already use one of them for a unicast connection (i.e. the initially created one), issue a new connection ID and retire the old one immediately."

Ordering of Announce, properties and Join

Clients have no way to communicate to the server if a properties frame arrives after an announce frame. In this case, it might take them until they receive a join for the server to notice that the properties are missing.

Define the server's responsibility for buffering old data

Session data needs reliability on streams, but also servers need to not be required to buffer unbounded amounts of data. Define what the failure modes look like.

Frame spec cleanup

There are a number of problems with the frames:

1. Channel IDs need a length preceding the 8..160.
2. Some missing reasons in CLIENT_CHANNEL_STATE
3. Better extensibility if we make the bit fields into varints and define their selector bits, I think. Is this useful?

(Note: working thru these in the course of frame implementation (GrumpyOldTroll/quiche#5), proposal forthcoming...)

For point 3 I'm tentatively thinking something like this:

---
MC_CLIENT_LIMITS Frame {
  Type (i) = TBD-09 (experiments use 0xff3e809),
  Client Limits Sequence Number (i),
  Capabilities Flags (i),
  Max Aggregate Rate (i),
  Max Channel IDs (i),
  Max Joined Count (i),
}
---
{: #fig-mc-client-limits-format title="MC_CLIENT_LIMITS Frame Format"}

The sequence number is implicitly 0 before the first MC_CLIENT_LIMITS frame from the client, and increases by 1 each new frame that's sent.
Newer frames override older ones.

Capabilities Flags is a bit field computed as follows:

 - 0x1 is set if IPv4 channels are permitted
 - 0x2 is set if IPv6 channels are permitted
 - 0x4 is set if SSM channels are permitted
 - 0x8 is set if ASM channels are permitted

Plus something similar in CHANNEL_PROPERTIES. Nice part is this doesn't change encoding if we add flags that go beyond the reserved space, it just ends up taking more varint space when they're set.
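A rough sketch of how the proposed frame could be serialized, assuming every field is an RFC 9000 variable-length integer in the order shown in the frame format. The class and helper names are hypothetical, and the override rule is one reading of "newer frames override older ones".

```python
from dataclasses import dataclass
from typing import Optional

MC_CLIENT_LIMITS_TYPE = 0xFF3E809  # experimental type value from the proposal

FLAG_IPV4 = 0x1
FLAG_IPV6 = 0x2
FLAG_SSM = 0x4
FLAG_ASM = 0x8


def encode_varint(value: int) -> bytes:
    """Shortest RFC 9000 variable-length integer encoding of value."""
    if value < 0:
        raise ValueError("varints are unsigned")
    for prefix, length in enumerate((1, 2, 4, 8)):
        if value < 1 << (8 * length - 2):
            return (value | (prefix << (8 * length - 2))).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")


@dataclass
class ClientLimits:
    sequence_number: int
    capabilities_flags: int
    max_aggregate_rate: int
    max_channel_ids: int
    max_joined_count: int

    def encode(self) -> bytes:
        """Serialize the frame as a sequence of varints, type first."""
        fields = (MC_CLIENT_LIMITS_TYPE, self.sequence_number,
                  self.capabilities_flags, self.max_aggregate_rate,
                  self.max_channel_ids, self.max_joined_count)
        return b"".join(encode_varint(f) for f in fields)


def newest(current: Optional[ClientLimits], incoming: ClientLimits) -> ClientLimits:
    """Keep whichever frame has the higher sequence number (assumed override rule)."""
    if current is None or incoming.sequence_number > current.sequence_number:
        return incoming
    return current
```

With the four flags above the flags field fits in one byte. A future flag at 0x40 or higher would simply push the varint to two bytes without changing the meaning of the existing bits.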

Add operational considerations section

Depending on size might also be a separate ID.

- Include reference to CBACC
- Include fan out/ deployment strategies
- Graceful degradation (#44)
- Pool/ Channel management on the server side
- Reusing the same S,G for multiple channels might have implications for the rate on the network
...

Loss of a MC_CHANNEL_PROPERTIES frame (and potentially MC_CHANNEL_LEAVE)

If sent over multicast, a MC_CHANNEL_PROPERTIES frame might get lost. Add text that states that servers MUST retransmit frames lost this way over unicast.

If the split into multiple frames happens, this would probably be only a MUST for the key update frames.

Can the same stream ID be used in multiple channels?

Since the stream ID space is shared between all channels and the connection, would it be possible for two channels to send data on the same stream?

4.4 says:
a server can always avoid stream ID collisions with the stream IDs carried in sessions
Does this mean it MUST avoid collisions?

I guess it depends on where the stream data is processed, if there is a sub process for each channel it might be ok, but if they all end up in the same place you would presumably see different packets with the same ACK number (since each channel uses its own ACK number space, so they have to overlap), which would probably lead to issues.

In either case I think it is something that should be clarified.

Path migration

What happens if the unicast path is being migrated due to a change in NAT/mobility? It would probably mean that all sessions should issue new IGMP/MLD reports, since it’s quite possible that the device is connected to a new router that does not yet have forwarding state, and waiting for a query might cause unnecessary delays.

It might also mean that the idle timeout for the session is (falsely) triggered as it could take some time for the join to propagate far enough upstream to receive Multicast packets once again.

Michael's review - "unspecified termination"

Regarding this part:

A From Packet Number without an Until Packet Number has an unspecified termination.

His comment is:

That also reads strange to me. Why not just say “is not specified”

Add channel state diagrams

We at least need a channel state diagram for client, and probably also for server.
(like https://www.rfc-editor.org/rfc/rfc9000#fig-stream-send-states and https://www.rfc-editor.org/rfc/rfc9000#fig-stream-recv-states and https://www.ietf.org/archive/id/draft-ietf-quic-multipath-01.html#fig-path-states)

MC_CHANNEL_INTEGRITY frames MUST be protected against injection, modification, and replay

"Integrity" as defined in draft-krose-multicast-security is specific to multicast (and necessarily weaker than for unicast). So I think the use of "integrity" in normative language here is probably ambiguous. The key properties are protection against:

Injection: an attacker, including other receivers, must not be able to create new channel integrity frames that will be accepted by a properly-functioning receiver.
Modification: an attacker, including other receivers, must not be able to modify channel integrity frames such that they will be accepted by a properly-functioning receiver.
Replay: an attacker, including other receivers, must not be able to alter the content delivered to a receiver through the replay of previously-seen channel integrity frames.

Right now, all three properties are achieved by sending (and presumably accepting) such frames only over unicast QUIC. In the future, the first two properties will be provided by something like AMBI-over-unicast-or-signed-manifest or ALTA, while the third is (now and in the future) provided by the fact that channel integrity frames are stateless and declarative (i.e., a given frame has an immutable meaning).

There are references to (S,G)s even though IP layer is allowed to be ASM

Max pointed out that there's some places talking about (S,G)s when talking about the IP layer multicast Group or Channel, rather than "(*,G)s or (S,G)s" or something, but the quic channels are not restricted to only SSM (they can include ASM). Some possible solutions:

restrict to SSM
generalize the references
- possibly by adding a term to capture "(*,G) or (S,G)" more succinctly

What is a multicast session?

I am confused about what a multicast session actually is. Section 2:

A multicast session (or just session) is a one-way communication stream

So is it just a (special) QUIC stream? If so, does Session ID in 9.2 just mean stream ID?

But if it is just a stream, what is the point of the Max Streams field in MC_SESSION_PROPERTIES (and why is it mutable)?

Frame names too long, diagrams too wide

Warnings during build about idnits violations from diagram width (both). In large part driven by frame name length.