quicwg / ops-drafts
Applicability and Manageability Statements
Home Page: https://quicwg.org/
I think the applicability draft may benefit from exploring connection migration strategies as described in:
The transport draft says:
"The PING frame can be used to keep a connection alive when an application or application protocol wishes to prevent the connection from timing out. An application protocol SHOULD provide guidance about the conditions under which generating a PING is recommended. This guidance SHOULD indicate whether it is the client or the server that is expected to send the PING. Having both endpoints send PING frames without coordination can produce an excessive number of packets and poor performance."
Probably we should state this in the applicability statement as well...
Any endpoint that identifies connections using a Connection ID requires that Connection ID in packets, including ICMP Error packets. Hence, that endpoint will not only request the peer to send Connection IDs but will also include Connection IDs on outgoing packets (in case they result in an ICMP Error packet). Such ICMP Error packets are required for PMTU discovery, among other things.
Such Connection IDs will be sent even if the peer gave its permission to omit Connection IDs using Transport Parameter omit_connection_id. See quicwg/base-drafts#953 for more.
Moving quicwg/base-drafts#514 to the ops repo.
Of course, update the text on connection IDs first once the PR on symmetric IDs has landed.
The manageability document should illustrate all the various packet sequences in a handshake, preferably with pretty pictures.
This is related to quicwg/base-drafts#2602 and depends on the outcome there.
Current practices in detection and mitigation of Distributed Denial of Service (DDoS) attacks generally involve passive measurement using network flow data {{?RFC7011}}, classification of traffic into "good" (productive) and "bad" (DoS) flows, and filtering of these bad flows in a "scrubbing" environment
This is not how it works. A few examples:
The applicability statement should provide / point to guidelines for application protocol port selection for QUIC. See the related issue on the base drafts.
As noted in discussion on #94.
... in the applicability statement? I mean one example mechanism maybe...
Write a section to provide guidance, maybe in the same section as the guidance on port selection in general.
Someone with ownership rights needs to hit the button on circle (not travis).
From @martinduke in quicwg/base-drafts#1006:
We're pretty close to settling on a wire image, I think. It would be useful for the transport draft (and eventually the RFC) to have an appendix covering issues with simultaneous support of GQUIC versions and QUIC v1.
I believe all the entities actually trying to support GQUIC are heavily involved in the working group. However, I imagine there are quite a lot of middleboxes out there doing some basic ID/classification (if not more) on GQUIC today, and will need some guide on how to simultaneously handle two packet header formats, etc.
If people are opposed to an appendix, I suppose a short-lived internet draft would also get the job done. In any case, I'd like to see a placeholder sooner rather than later.
Related to quicwg/base-drafts#2308 as a comment came up in the discussion that the transport draft does not explain why anybody would use coalesced packets. Do we want to elaborate some use cases and provide more guidance?
Linkability across CID changes is in the common case so trivial that protocol features to defeat linkability through other means risk being useless.
"Find CID y where delay < d between last packet for CID x and first packet for CID y on 2-tuple {a,p}, given {x,a,p}" is an operation which requires zero additional state and a simple O(kn log n) search for any large on-path (passive surveillance) device that's halfway smart about keeping per-flow state -- i.e., it's basically a free operation, and its utility is baked into the physics of CIDs -- after all, this is what CIDs are for.
The ease of this analysis can only be mitigated by increasing the size of the anonymity set: ensuring that for any given delay window d, a minimum number of CIDs x transition on any given {a,p}. This seems like good operational advice for servers with enough traffic to build such anonymity sets (should they have interest in preventing client linkability, of course) -- small servers are probably out of luck though.
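The search described above can be sketched roughly as follows. This is only an illustration of how little state an on-path observer needs; the function name `link_candidates` and the packet tuple layout `(timestamp, addr, port, cid)` are assumptions made for this example, not anything defined in the drafts.

```python
def link_candidates(packets, x, addr, port, d):
    """Return CIDs that may be the successor of CID x on 2-tuple (addr, port).

    packets: iterable of (timestamp, addr, port, cid) tuples as seen on path
             (hypothetical record format).
    d:       maximum delay between the last packet for x and the first
             packet for a candidate CID y.
    """
    first_seen = {}
    last_seen = {}
    for ts, a, p, cid in packets:
        if (a, p) != (addr, port):
            continue  # only flows on the same 2-tuple are considered
        first_seen[cid] = min(ts, first_seen.get(cid, ts))
        last_seen[cid] = max(ts, last_seen.get(cid, ts))

    if x not in last_seen:
        return []

    end_x = last_seen[x]
    # Every CID whose first packet appears within d after x goes quiet
    return sorted(
        cid
        for cid, start in first_seen.items()
        if cid != x and 0 <= start - end_x < d
    )


observed = [
    (0.00, "198.51.100.7", 50000, "x"),
    (1.00, "198.51.100.7", 50000, "x"),
    (1.05, "198.51.100.7", 50000, "y"),  # appears right after x stops
    (5.00, "198.51.100.7", 50000, "z"),  # too late to be linked
    (1.02, "198.51.100.9", 50000, "w"),  # different address
]
candidates = link_candidates(observed, "x", "198.51.100.7", 50000, d=0.1)
```

The length of the returned list is the anonymity set for that transition. When it contains a single CID, as in this toy trace, the observer has linked the two CIDs without any further effort.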
Applicability should call this out as a problem, manageability should suggest a solution space.
Review and update text in -manageability to reflect BKK spin bit consensus:
The manageability draft should talk a little bit about NATs for QUIC.
The laziest thing to do is just continue to have NAT mapping be by 4-tuple, as it is today. This is actually fine.
However, someone might attempt to be clever and use the CID as well. For example, client A and client B could share the same serverside address and port and the NAT can initially distinguish between the two because they have different CIDs. This appears to save address space, but doesn't actually work because the server could switch to a new client connection ID. In this case, the NAT won't be able to determine which client to route to.
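A toy model may make the failure mode concrete. Both classes below are hypothetical simplifications written for this issue (no real NAT works exactly like this): one maps by port as today, the other shares a single external port per server and distinguishes clients by the CID it saw during the handshake.

```python
class FourTupleNat:
    """Maps each (external port, server) pair to one internal client."""

    def __init__(self):
        self.mappings = {}
        self.next_port = 40000  # hypothetical external port pool

    def outbound(self, client, server):
        port = self.next_port
        self.next_port += 1
        self.mappings[(port, server)] = client
        return port

    def inbound(self, port, server, dcid):
        # The CID is ignored entirely, so it may change freely.
        return self.mappings.get((port, server))


class CidKeyedNat:
    """Shares ONE external port per server and tells clients apart by CID."""

    def __init__(self):
        self.mappings = {}

    def outbound(self, client, server, client_cid):
        self.mappings[(server, client_cid)] = client

    def inbound(self, server, dcid):
        return self.mappings.get((server, dcid))


server = ("203.0.113.10", 443)

lazy = FourTupleNat()
port_a = lazy.outbound("client-A", server)
port_b = lazy.outbound("client-B", server)

clever = CidKeyedNat()
clever.outbound("client-A", server, b"\xaa" * 8)
clever.outbound("client-B", server, b"\xbb" * 8)

# The server switches to a new CID for client A. The NAT never saw this
# CID being issued, so it has no entry for it.
new_cid = b"\xcc" * 8
routed_lazy = lazy.inbound(port_a, server, new_cid)
routed_clever = clever.inbound(server, new_cid)
```

In this model the port-based NAT still delivers the packet to client A, while the CID-keyed NAT finds no mapping and cannot tell which client the packet belongs to.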
This is editorial, but the whole introduction is written as if the QUIC drafts are some early concept kicking around in the working group. This will not age well, and the intro should be worded as if it were already a published RFC.
The intro still talks about "HTTP/2 over QUIC" when it's now HTTP/3.
Moving this from quicwg/base-drafts#504 to here.
Add text to the section on tradeoffs for using 0-RTT session resumption rather than sending keep-alives. This was discussed at the interim in Paris, and it turns out that this might not be a great idea; tracking the open editor's note in the doc.
It's useful to talk about the use of streams as messages, since they're designed to be lightweight. This is a feature of QUIC, and it's worth articulating.
Both documents provide limited guidance on use of ports, however, there was an open question if the applicability statement should maybe say more and e.g. also talk about ALPN.
Middlebox vendors (and others as well) used to TCP want to assume that the smallest packets are ACK packets. This is not true in QUIC, and the manageability draft may want to say something about it.
See Toma's email.
The section on "Passive network performance measurement and troubleshooting" is probably a bit too pessimistic.
It's probably worth pointing out that one part of network troubleshooting -- "find out which box to blame" -- is substantially easier. Thanks to integrity protection, most middleboxes can drop, delay/rate limit, and corrupt, and that's about it. These are fairly easy behaviors to diagnose.
I've had some conversations with customers afraid of not having their TCP headers anymore, and I believe this fear is largely misguided.
If people agree this belongs in the draft, I can file a PR.
Probably needs some API for the app to indicate if 0-RTT should be retransmitted or withdrawn after a rejection.
See also section on Streams in transport draft:
"Stream offsets allow for the octets on a stream to be placed in order. An endpoint MUST be capable of delivering data received on a stream in order. Implementations MAY choose to offer the ability to deliver data out of order. There is no means of ensuring ordering between octets on different streams."
While TFO has some 0RTT-like properties, it does not fully replace it, not only because the SYN or option gets scrubbed, but also because you're limited to one packet payload.
Also, this ought to mention TLS stacks that offer replay protection.
I would think a modern QUIC interface would offer a signal to separate 0RTT from 1RTT data, so that both senders and receivers can manage their application data appropriately.
It might be the case that this is true for the whole draft and we should just state this at the beginning but better to double-check!
Note in the section about CIDs that, as the server chooses the CIDs, it could use its own scheme to embed the length of the CID in the CID itself instead of remembering it, e.g. use the first few bits to indicate which of the supported sizes was chosen.
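One possible encoding, purely as a sketch: the set of supported lengths, the choice of two index bits, and the helper names are all assumptions made for this example. The parsing helper further assumes the destination CID immediately follows the first header byte, as in QUIC version 1 short headers.

```python
import os

# Hypothetical set of CID lengths this server supports; two bits are
# enough to index four choices.
SUPPORTED_LENGTHS = (4, 8, 12, 16)


def new_cid(length):
    """Generate a random CID whose top two bits encode its own length."""
    index = SUPPORTED_LENGTHS.index(length)  # ValueError if unsupported
    raw = bytearray(os.urandom(length))
    raw[0] = (index << 6) | (raw[0] & 0x3F)
    return bytes(raw)


def cid_length(first_byte):
    """Recover the CID length from the first byte of the CID."""
    return SUPPORTED_LENGTHS[first_byte >> 6]


def short_header_dcid(packet):
    """Extract the destination CID without any per-connection state.

    Assumes the CID starts at offset 1 of the packet.
    """
    length = cid_length(packet[1])
    return packet[1:1 + length]


cid = new_cid(12)
packet = bytes([0x40]) + cid + b"protected payload"
recovered = short_header_dcid(packet)
```

The cost is two bits of entropy per CID, and the encoding is visible to anyone who learns the scheme, so this only illustrates the mechanism and is not a recommendation for a particular layout.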
Section 2.5 needs content.
For discussion: should the manageability draft advise that idle timeouts should be longer for flows inferred/detected to be QUIC flows?
I think the answer here is "it depends": for NATs, yes, state at the edge is not all that scarce anymore. For firewalls? Maybe not so much. Are there other applications to consider here, and is there any generalized advice this document could give?
If so, is there any generalized advice for how long the timeout should be?
Review the text of the -manageability draft to ensure that changes to headers in the latest version of -transport are correctly tracked.
The ops drafts often reference the core drafts by section number; make sure these section number references are correct for the WGLC drafts.
Feel free to close this if it's out of scope, but creating a proxy that attempts to passively observe QUIC is much more complex and potentially problematic than TLS over TCP.
For example, all major OSes have mature TCP implementations, whereas a MITM would have to pull in its own QUIC implementation. If it does so, it would be important that it's both conformant to IETF specifications and has good performance. This is the full L7 termination case, which is possible, but a lot of work, and we should heavily encourage anyone doing this to use an existing ("good") implementation.
A potential MITM option is to attempt to terminate the handshake, but then only observe ApplicationData packets. However, this could go wrong in many ways, so I believe it should be strongly discouraged. QUIC is designed as an end-to-end protocol and efforts to change that are likely to end very poorly.
For example, if the length of the CID changes, then the offset of the header changes, and a middlebox that doesn't track all CIDs for both peers may be unable to decrypt packets. I'm sure there are more gotchas, particularly with the introduction of extensions. It may also be worth pointing out that QUIC frames are not TLV, so any frames (i.e., extensions) that are not understood cause all other frames to be unparseable.
Thanks to @DavidSchinazi for the specific examples of ways this could fail.
From Spencer on the list:
Before I read this ^ paragraph, I was planning to ask you whether "blocking QUIC" merited a mention in https://tools.ietf.org/html/draft-ietf-quic-manageability-06, because we are in the awkward position with HTTP/3 that operators will face less blowback blocking HTTP/3-QUIC and forcing a fallback to HTTP/2-TCP than they will face blocking almost any non-HTTP/3 QUIC protocol (with less clearly defined fallbacks) (see: MASQUE).
ISTM that at a minimum, the "accidental invariants" might be mentioned in https://tools.ietf.org/html/draft-ietf-quic-manageability-06, which targets operators.
Copying from quicwg/base-drafts#495
There's no requirement that servers use a particular UDP port for HTTP/QUIC or any other QUIC traffic. Using Alt-Svc, the server is able to pick and advertise any port number and a compliant client will handle it just fine. That's already the case, and isn't part of this issue. #424 updates the HTTP draft to highlight this, increasing the odds that implementations will test and support that case.
This issue is to track that it might actually be desirable from a privacy standpoint for servers to pick arbitrary port numbers and perhaps even rotate them periodically (though that requires coordination with their Alt-Svc advertisements and lifetimes, which could be challenging) in order to make it more difficult for a network observer to classify traffic (and therefore more difficult to ossify).
On the other hand, as we're wrestling with in each of these privacy/manageability debates, removing easy network visibility into the likely protocol by using arbitrary port numbers means that middleboxes will probably resort to other means of attempting to identify protocols and potentially doing it badly, which could result in even worse ossification. (E.g. indexing into the TLS ClientHello to find the ALPN list, then panicking on a different handshake protocol.)
There's further discussion on the original issue, but this belongs in the ops drafts, not the protocol drafts.
QUIC could offer an application interface to specify the use of fixed-length packets as input for an algorithm that handles padding, to enable concealment of application characteristics based on packet length.
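As a rough illustration of what such an interface might feed into: the application supplies a set of allowed packet sizes and the stack pads each payload up to the next allowed size. The function names and the bucket values are hypothetical, and the padding is modelled as zero bytes appended to the plaintext payload before protection, in the way QUIC PADDING frames are.

```python
import bisect


def padded_length(n, buckets):
    """Return the smallest allowed size that fits n bytes.

    buckets must be sorted in ascending order.
    """
    i = bisect.bisect_left(buckets, n)
    if i == len(buckets):
        raise ValueError("payload larger than the largest allowed size")
    return buckets[i]


def pad(payload, buckets):
    """Pad payload with zero bytes up to the next allowed size."""
    target = padded_length(len(payload), buckets)
    return payload + b"\x00" * (target - len(payload))


# Hypothetical sizes chosen by the application
ALLOWED_SIZES = (256, 512, 1200)

small = pad(b"a" * 100, ALLOWED_SIZES)
medium = pad(b"b" * 300, ALLOWED_SIZES)
```

With only a few sizes on the wire, an observer learns which bucket a packet fell into rather than its exact length. The tradeoff is the bandwidth spent on padding.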
SNI is currently mentioned in the Manageability document, but you'd have to read a substantial chunk of the transport and tls drafts to understand how to extract it.
I'd suggest a short section that describes how it's presented on the wire with references to the appropriate sections on transport and tls.
Currently, the section on client Initial says: "The Client Hello datagram exposes version number, source and destination connection IDs, and information in the TLS Client Hello message, including any TLS Server Name Indication (SNI) present, in the clear."
I think saying the SNI is in the clear is a bit misleading to a casual observer.
See also issue quicwg/base-drafts#2496
Ensure that the description of redirection via server stateless reject accurately reflects the present state of the transport draft.
Not clear if these documents should use normative language at all. The applicability statement also sometimes uses normative language to recommend the implementation of certain interfaces; I don't think this is the right document for that but as we also don't have another document, we should decide if we want to do that or not!
The guidance recommends that server-generated connection IDs should include a cryptographic MAC when being used for load balancing. This sounds like a good suggestion in principle, but the Connection ID is 64 bits in size. I'm not a cryptographer, but I don't see how this is possible. If it is possible, then it would perhaps be good to include guidance within the documents.
My logic is as follows (hopefully I've made a mistake/misunderstood the concept somewhere)
As a CDN operator I'd probably want to segment the connection ID into 4 spaces:
POP id|machine id|unique connection|MAC
You could combine your POP id and machine ID into a single space but it is likely to be beneficial to have two tiers, the POP routes the request to the right general location and the machine id then identifies a machine within that space. Other ways to split this are also possible.
If 12 bits are used for POP id, that would give 4096 distinct locations. Trends to deploy deeper into networks (particularly within mobile/cell base stations) don't make this number totally infeasible.
If 8 bits are then used for a machine id within a POP that gives 256 possible machines.
This means 20 bits might be used for load balancing purposes. This gives a bit (lot?) of room for flexibility and it is unlikely all million combinations are used but it is probably the right order of magnitude for a large widely distributed CDN (even if we drop down to 16 bits it doesn't significantly impact the analysis below).
The actual unique connection pool needs to be large enough to avoid repeating too quickly - even if you have a lot of clients taking advantage of 0RTT to aggressively close/re-open connections. There also presumably needs to be some randomization of the ids so that they aren't created sequentially. I've assumed 24 bits (which is only 16 million unique ids per server).
This leaves only 20 bits for the MAC, which is about 1 million possible values. As I've said, I'm not a cryptographer, but to me this looks small enough that a brute force attack guided by some valid inputs might be feasible even on relatively low power machines?
For that matter isn't a simple replay attack feasible? The MAC can't be based upon the full 5-tuple as we need to support connection mobility. You could embed a timestamp into the id somewhere to protect against this but that would use up even more bits, otherwise you could slowly gather up valid (but presumably closed) connection ids and then aggressively replay them all in a short time period.
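To make the bit budget above concrete, here is a sketch of the 12 | 8 | 24 | 20 split with a truncated HMAC-SHA-256 as the MAC. The choice of HMAC-SHA-256, the key handling, and the function names are assumptions for this example; the drafts do not specify any of them.

```python
import hashlib
import hmac

POP_BITS, MACHINE_BITS, CONN_BITS, MAC_BITS = 12, 8, 24, 20
PREFIX_BITS = POP_BITS + MACHINE_BITS + CONN_BITS  # 44 bits of routing data


def truncated_mac(key, prefix):
    """Return the top 20 bits of HMAC-SHA-256 over the 44-bit prefix."""
    digest = hmac.new(key, prefix.to_bytes(6, "big"), hashlib.sha256).digest()
    return int.from_bytes(digest[:3], "big") >> 4  # 24 bits -> 20 bits


def make_cid(key, pop, machine, conn):
    """Pack POP id | machine id | unique connection | MAC into 64 bits."""
    if pop >= 1 << POP_BITS or machine >= 1 << MACHINE_BITS or conn >= 1 << CONN_BITS:
        raise ValueError("field does not fit its bit budget")
    prefix = (pop << (MACHINE_BITS + CONN_BITS)) | (machine << CONN_BITS) | conn
    return (prefix << MAC_BITS) | truncated_mac(key, prefix)


def verify_cid(key, cid):
    """Check the 20-bit tag; says nothing about replayed CIDs."""
    prefix = cid >> MAC_BITS
    tag = cid & ((1 << MAC_BITS) - 1)
    expected = truncated_mac(key, prefix)
    return hmac.compare_digest(tag.to_bytes(3, "big"), expected.to_bytes(3, "big"))


key = b"example-key-not-for-production"  # placeholder key for the example
cid = make_cid(key, pop=5, machine=17, conn=123456)
valid = verify_cid(key, cid)
tampered = verify_cid(key, cid ^ 1)  # flip the lowest tag bit
tag_space = 1 << MAC_BITS
```

A blind guess at the tag for a chosen prefix succeeds with probability 1 in 2^20, which matches the "about 1 million possible values" estimate. As noted above, nothing in this construction stops a previously valid CID from being replayed.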
Jana noted at the interim in Paris that we should point out that applications need to be re-thought slightly to get the benefits of zero RTT. Add a little text to discuss this and why it's worth the effort, before we go straight into the dragons.