
ops-drafts's Issues

Provide guidance on use of PING

The transport draft says:

"The PING frame can be used to keep a connection alive when an application or application protocol wishes to prevent the connection from timing out. An application protocol SHOULD provide guidance about the conditions under which generating a PING is recommended. This guidance SHOULD indicate whether it is the client or the server that is expected to send the PING. Having both endpoints send PING frames without coordination can produce an excessive number of packets and poor performance."

Probably we should state this in the applicability statement as well...
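
For illustration, a minimal sketch of what such guidance might boil down to on the client side, assuming a hypothetical connection API (send_ping, is_idle, closed) and an idle timeout known from the transport parameters:

    import asyncio

    # Minimal sketch of client-driven keep-alives. The connection API is
    # hypothetical; the point is that only one endpoint generates PINGs, at a
    # period comfortably below the negotiated idle timeout.
    async def keep_alive(conn, idle_timeout_s: float):
        period = idle_timeout_s / 2              # headroom for loss and delay
        while not conn.closed:
            await asyncio.sleep(period)
            if conn.is_idle():                   # no application traffic recently
                conn.send_ping()                 # elicits an ACK, resetting the peer's idle timer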

Connection IDs and ICMP Error messages (for PMTU discovery)

Any endpoint that identifies connections using a Connection ID needs that Connection ID to be present in the packets it handles, including the packets quoted inside ICMP error messages. Hence, that endpoint will not only request the peer to send Connection IDs but will also include Connection IDs on outgoing packets (in case they are returned in an ICMP error packet). Such ICMP error packets are required for PMTU discovery, among other things.

Such Connection IDs will be sent even if the peer gave its permission to omit Connection IDs using Transport Parameter omit_connection_id. See quicwg/base-drafts#953 for more.
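
For illustration, a sketch of why the CID matters here: when an ICMP error arrives, the quoted original packet is all the endpoint has to map the error back to a connection. This assumes the QUIC v1 long/short header layouts and a fixed locally used CID length; the lookup table is hypothetical.

    # Sketch: map an ICMP error back to a connection via the DCID in the quoted
    # packet. `connections` maps the DCIDs this endpoint puts on outgoing packets
    # to connection state. If outgoing packets omitted the CID, there would be
    # nothing here to match on.
    LOCAL_DCID_LEN = 8

    def connection_for_icmp_error(quoted_udp_payload: bytes, connections: dict):
        if not quoted_udp_payload:
            return None
        if quoted_udp_payload[0] & 0x80:             # long header: DCID length is on the wire
            dcid_len = quoted_udp_payload[5]
            dcid = quoted_udp_payload[6:6 + dcid_len]
        else:                                        # short header: length known only locally
            dcid = quoted_udp_payload[1:1 + LOCAL_DCID_LEN]
        return connections.get(dcid)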

Handshake Illustration

The manageability document should illustrate all the various packet sequences in a handshake, preferably with pretty pictures.
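
As a starting point, a rough, non-normative sketch of the common 1-RTT sequence (ignoring coalescing, loss, Retry, Version Negotiation, and 0-RTT, each of which probably deserves its own picture):

    Client                                        Server
    Initial: CRYPTO[ClientHello]        ->
                                        <-  Initial: ACK, CRYPTO[ServerHello]
                                        <-  Handshake: CRYPTO[EE, CERT, CV, FIN]
    Initial: ACK                        ->
    Handshake: ACK, CRYPTO[FIN]         ->
    1-RTT: STREAM[...]                  ->
                                        <-  1-RTT: HANDSHAKE_DONE, ACK, STREAM[...]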

Improve and generalize text on DoS detection and mitigation

Current practices in detection and mitigation of Distributed Denial of Service (DDoS) attacks generally involve passive measurement using network flow data {{?RFC7011}}, classification of traffic into "good" (productive) and "bad" (DoS) flows, and filtering of these bad flows in a "scrubbing" environment

This is not how it works. A few examples:

Setup CI

Someone with ownership rights needs to hit the button on Circle (not Travis).

GQUIC -> QUIC migration

From @martinduke in quicwg/base-drafts#1006:

We're pretty close to settling on a wire image, I think. It would be useful for the transport draft (and eventually the RFC) to have an appendix covering issues with simultaneous support of GQUIC versions and QUIC v1.

I believe all the entities actually trying to support GQUIC are heavily involved in the working group. However, I imagine there are quite a lot of middleboxes out there doing some basic ID/classification (if not more) on GQUIC today, and they will need some guidance on how to simultaneously handle two packet header formats, etc.

If people are opposed to an appendix, I suppose a short-lived internet draft would also get the job done. In any case, I'd like to see a placeholder sooner rather than later.

Operational guidelines for reducing timing-linkability across CID migration

Linkability across CID changes is, in the common case, so trivial that protocol features meant to defeat linkability through other means risk being useless.

"Find CID y where delay < d between last packet for CID x and first packet for CID y on 2-tuple {a,p}, given {x,a,p}" is an operation which requires zero additional state and a simple O(kn log n) search for any large on-path (passive surveillance) device that's halfway smart about keeping per-flow state -- i.e., it's basically a free operation, and its utility is baked into the physics of CIDs -- after all, this is what CIDs are for.

The ease of this analysis can only be mitigated by increasing the size of the anonymity set: ensuring that for any given delay window d, a minimum number of CIDs x transition on any given {a,p}. This seems like good operational advice for servers with enough traffic to build such anonymity sets (should they have interest in preventing client linkability, of course) -- small servers are probably out of luck though.

Applicability should call this out as a problem, manageability should suggest a solution space.

Update -manageability for BKK spin bit consensus

Review and update text in -manageability to reflect BKK spin bit consensus:

  • single spin bit is in the protocol spec now
  • endpoints may opt out (discuss why; this is for any reason, not just proxies as suggested in #55),
  • endpoints should probabilistically opt out to provide cover for endpoints that opt out deliberately (a rough sketch follows)
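
A rough sketch of what the per-connection logic plus the probabilistic opt-out could look like (the opt-out probability and the reordering handling are illustrative, not recommendations):

    import random

    OPT_OUT_PROBABILITY = 0.1        # illustrative value only

    # Sketch: the client inverts the last spin value it saw from the server; the
    # server echoes the last value it saw from the client (the spec tracks the
    # highest packet number; reordering is ignored here). Endpoints that opt out
    # set the bit to a random or fixed value.
    class SpinState:
        def __init__(self, is_client: bool):
            self.is_client = is_client
            self.enabled = random.random() >= OPT_OUT_PROBABILITY
            self.value = 0

        def on_packet_received(self, peer_spin: int):
            if self.enabled:
                self.value = (1 - peer_spin) if self.is_client else peer_spin

        def spin_bit_to_send(self) -> int:
            return self.value if self.enabled else random.randint(0, 1)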

Considerations for NATs

The manageability draft should talk a little bit about NATs for QUIC.

The laziest thing to do is just continue to have NAT mapping be by 4-tuple, as it is today. This is actually fine.

However, someone might attempt to be clever and use the CID as well. For example, client A and client B could share the same server-side address and port, and the NAT could initially distinguish between the two because they have different CIDs. This appears to save address space, but doesn't actually work, because the server could switch to a new client connection ID. In this case, the NAT won't be able to determine which client to route to.
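
A tiny sketch of the failure mode, with dictionaries standing in for NAT state (everything here is illustrative):

    # Sketch: a NAT demultiplexing inbound packets on a shared external port by
    # Destination Connection ID. It only knows CIDs it has already observed; once
    # the endpoints switch to a CID issued inside the encrypted connection (e.g.
    # via NEW_CONNECTION_ID), the lookup fails and the binding is lost.
    cid_to_internal_client = {}          # dcid bytes -> (internal_ip, internal_port)

    def learn_outbound(dcid_used_toward_client, internal_addr):
        # Populated from CIDs the NAT can see during the handshake.
        cid_to_internal_client[dcid_used_toward_client] = internal_addr

    def route_inbound(dcid: bytes):
        return cid_to_internal_client.get(dcid)   # None for any CID the NAT never saw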

Applicability introduction is outdated

This is editorial, but the whole introduction is written as if the QUIC drafts were some early concept kicking around in the working group. This will not age well, and the intro should be worded as if it were already a published RFC.

The intro still talks about "HTTP/2 over QUIC" when it's now HTTP/3.

Write subsection "Session resumption versus keep-alive"

Add text to the section on tradeoffs for using 0-RTT session resumption rather than sending keep-alives. This was discussed at the interim in Paris, and it turns out that this might not be a great idea; tracking the open editor's note in the doc.

Streams as message-framing

It's useful to talk about the use of streams as messages, since they're designed to be lightweight. This is a feature of QUIC, and it's worth articulating.
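
A sketch of the pattern from the application's point of view, using an entirely hypothetical stream API: one stream per message, with FIN marking the message boundary.

    # Sketch: one QUIC stream per application message. The connection/stream API
    # is hypothetical; the pattern is what matters: open a stream, write the
    # message, finish it, and let the stream boundary do the framing.
    def send_message(conn, payload: bytes):
        stream = conn.open_unidirectional_stream()   # streams are cheap to create
        stream.write(payload)
        stream.finish()                              # FIN = end of this message

    def on_stream_finished(stream):
        handle_message(stream.read_all())            # one complete, ordered message

    def handle_message(message: bytes):
        print(len(message), "byte message received") # placeholder application logic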

Note about small packets

Middlebox vendors (and others as well) accustomed to TCP tend to assume that the smallest packets are ACK packets. This is not true in QUIC, and the manageability draft may want to say something about it.

Point out how troubleshooting gets easier

The section on "Passive network performance measurement and troubleshooting" is probably a bit too pessimistic.

It's probably worth pointing out that one part of network troubleshooting -- "find out which box to blame" -- is substantially easier. Thanks to integrity protection, most middleboxes can drop, delay/rate-limit, or corrupt packets, and that's about it. These are fairly easy behaviors to diagnose.

I've had some conversations with customers afraid of not having their TCP headers anymore, and I believe this fear is largely misguided.

If people agree this belongs in the draft, I can file a PR.

Maybe say something about in-order delivery (in streams)

See also section on Streams in transport draft:
"Stream offsets allow for the octets on a stream to be placed in order. An endpoint MUST be capable of delivering data received on a stream in order. Implementations MAY choose to offer the ability to deliver data out of order. There is no means of ensuring ordering between octets on different streams."

Applicability 0RTT section

While TFO has some 0RTT-like properties, it is not a full replacement, not only because the SYN data or the TFO option often gets scrubbed, but also because you're limited to one packet of payload.

Also, this ought to mention TLS stacks that offer replay protection.

I would think a modern QUIC interface would offer a signal to separate 0RTT from 1RTT data, so that both senders and receivers can manage their application data appropriately.
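
Something like the following, with entirely hypothetical names; the only point is that 0-RTT data is labeled as such on both the sending and the receiving side:

    # Hypothetical interface sketch: the application marks what may go out as
    # (replayable) 0-RTT data, and learns on receipt whether data arrived in
    # early data, so idempotent and non-idempotent operations can be treated
    # differently.
    def send_request(conn, request: bytes, idempotent: bool):
        if idempotent and conn.early_data_accepted():
            conn.send(request, early_data=True)      # may be replayed by an attacker
        else:
            conn.send(request, early_data=False)     # waits for 1-RTT keys

    def on_data(data: bytes, received_in_early_data: bool):
        if received_in_early_data:
            check_replay_policy(data)                # application-level policy (placeholder)
        process(data)                                # normal handling (placeholder)

    def check_replay_policy(data: bytes):
        pass

    def process(data: bytes):
        pass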

CID and length

Note in the section about CIDs that, as the server chooses the CIDs, it could use a scheme of its own to embed the length of the CID in the CID itself instead of remembering it, e.g. use the first few bits to indicate which of the supported sizes was chosen.
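
A minimal sketch of that idea, assuming the server supports a small fixed set of CID lengths and spends the top two bits of every CID it issues on a length code (all values illustrative):

    import os

    SUPPORTED_LENGTHS = [4, 8, 12, 18]          # index 0..3 fits in two bits

    # Sketch: encode which supported length was chosen in the first two bits of
    # the CID, so the server never has to remember a per-connection CID length.
    def new_cid(length_index: int = 1) -> bytes:
        body = bytearray(os.urandom(SUPPORTED_LENGTHS[length_index]))
        body[0] = (length_index << 6) | (body[0] & 0x3F)   # top two bits = length code
        return bytes(body)

    def cid_length_from_first_byte(first_byte: int) -> int:
        return SUPPORTED_LENGTHS[first_byte >> 6]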

Advice on network state idle timeouts

For discussion: should the manageability draft advise that idle timeouts should be longer for flows inferred/detected to be QUIC flows?

I think the answer here is "it depends": for NATs, yes, state at the edge is not all that scarce anymore. For firewalls? Maybe not so much. Are there other applications to consider here, and is there any generalized advice this document could give?

If so, is there any generalized advice for how long the timeout should be?

Document the challenges of creating a 'transparent' QUIC MITM

Feel free to close this if it's out of scope, but creating a proxy that attempts to passively observe QUIC is much more complex and potentially problematic than TLS over TCP.

For example, all major OS's have mature TCP implementations, whereas a MITM would have to pull in its own QUIC implementation. If it does so, it is important that it is both conformant to the IETF specifications and has good performance. This is the full L7 termination case, which is possible, but a lot of work, and we should heavily encourage anyone doing this to use an existing ("good") implementation.

A potential MITM option is to attempt to terminate the handshake, but then only observe ApplicationData packets. However, this could go wrong in many ways, so I believe it should be strongly discouraged. QUIC is designed as an end-to-end protocol and efforts to change that are likely to end very poorly.

For example, if the length of the CID changes, then the offset of the rest of the header changes, and a middlebox that doesn't track all the CIDs issued by both peers may be unable to decrypt packets. I'm sure there are more gotchas, particularly with the introduction of extensions. It may also be worth pointing out that QUIC frames are not TLV, so any frames (i.e., extensions) that are not understood cause all other frames to be unparseable.
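
To make the CID-length pitfall concrete, a sketch of parsing a short-header packet: the DCID length is not carried on the wire, so an observer that misses a length change misreads every field after the first byte from then on (this is a sketch, not a conformant parser):

    # Sketch: short-header parsing depends on out-of-band knowledge of the DCID
    # length. Guess it wrong and the packet number/payload offsets are wrong too
    # (the packet number length itself is hidden by header protection).
    def parse_short_header(packet: bytes, assumed_dcid_len: int):
        if packet[0] & 0x80:
            raise ValueError("long header; CID lengths are explicit there")
        dcid = packet[1:1 + assumed_dcid_len]
        protected_rest = packet[1 + assumed_dcid_len:]
        return dcid, protected_rest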

Thanks to @DavidSchinazi for the specific examples of ways this could fail.

Mention accidental invariants and the calculus of blocking

From Spencer on the list:

Before I read this ^ paragraph, I was planning to ask you whether "blocking QUIC" merited a mention in https://tools.ietf.org/html/draft-ietf-quic-manageability-06, because we are in the awkward position with HTTP/3 that operators will face less blowback blocking HTTP/3-QUIC and forcing a fallback to HTTP/2-TCP than they will face blocking almost any non-HTTP/3 QUIC protocol (with less clearly defined fallbacks) (see: MASQUE).
ISTM that at a minimum, the "accidental invariants" might be mentioned in https://tools.ietf.org/html/draft-ietf-quic-manageability-06, which targets operators.

Guidance for port number use

Copying from quicwg/base-drafts#495

There's no requirement that servers use a particular UDP port for HTTP/QUIC or any other QUIC traffic. Using Alt-Svc, the server is able to pick and advertise any port number and a compliant client will handle it just fine. That's already the case, and isn't part of this issue. #424 updates the HTTP draft to highlight this, increasing the odds that implementations will test and support that case.
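
For example (values made up), an origin can advertise HTTP/3 on an arbitrary UDP port via Alt-Svc, and a compliant client will use it:

    Alt-Svc: h3=":50781"; ma=86400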

This issue is to track that it might actually be desirable from a privacy standpoint for servers to pick arbitrary port numbers and perhaps even rotate them periodically (though that requires coordination with their Alt-Svc advertisements and lifetimes, which could be challenging) in order to make it more difficult for a network observer to classify traffic (and therefore more difficult to ossify).

On the other hand, as we're wrestling with in each of these privacy/manageability debates, removing easy network visibility into the likely protocol by using arbitrary port numbers means that middleboxes will probably resort to other means of attempting to identify protocols and potentially doing it badly, which could result in even worse ossification. (E.g. indexing into the TLS ClientHello to find the ALPN list, then panicking on a different handshake protocol.)

There's further discussion on the original issue, but this belongs in the ops drafts, not the protocol drafts.

Document how SNI from QUIC handshakes is presented

SNI is currently mentioned in the Manageability document, but you'd have to read a substantial chunk of the transport and tls drafts to understand how to extract it.

I'd suggest a short section that describes how it's presented on the wire with references to the appropriate sections on transport and tls.

Currently, the section on client Initial says: "The Client Hello datagram exposes version number, source and destination connection IDs, and information in the TLS Client Hello message, including any TLS Server Name Indication (SNI) present, in the clear."

I think saying the SNI is in the clear is a bit misleading to a casual observer.
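
Roughly, extracting the SNI means: derive the version-specific Initial secrets from the client's Destination Connection ID, remove header protection, decrypt the Initial payload, reassemble the CRYPTO frames, and parse the TLS ClientHello for the server_name extension. A sketch of just the key-derivation step (salt and labels as specified for QUIC v1; the DCID is an arbitrary example):

    import hashlib, hmac

    # Sketch: deriving the client Initial packet-protection secrets from the DCID.
    # Everything needed is public, which is why "in the clear" really means
    # "encrypted under keys any on-path observer can derive".
    INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")

    def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
        return hmac.new(salt, ikm, hashlib.sha256).digest()

    def hkdf_expand_label(secret: bytes, label: bytes, length: int) -> bytes:
        full = b"tls13 " + label
        info = length.to_bytes(2, "big") + bytes([len(full)]) + full + b"\x00"
        out, block, counter = b"", b"", 1
        while len(out) < length:
            block = hmac.new(secret, block + info + bytes([counter]), hashlib.sha256).digest()
            out += block
            counter += 1
        return out[:length]

    dcid = bytes.fromhex("8394c8f03e515708")                  # example client DCID
    initial_secret = hkdf_extract(INITIAL_SALT_V1, dcid)
    client_secret  = hkdf_expand_label(initial_secret, b"client in", 32)
    key = hkdf_expand_label(client_secret, b"quic key", 16)   # AEAD key (AES-128-GCM)
    iv  = hkdf_expand_label(client_secret, b"quic iv", 12)
    hp  = hkdf_expand_label(client_secret, b"quic hp", 16)    # header-protection key
    # Not shown: remove header protection, AEAD-decrypt the payload, reassemble
    # CRYPTO frames, and parse the ClientHello for server_name.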

Double check use of normative language

It's not clear whether these documents should use normative language at all. The applicability statement also sometimes uses normative language to recommend the implementation of certain interfaces; I don't think this is the right document for that, but since we don't have another document, we should decide whether we want to do that or not!

Server-Generated Connection ID should include a cryptographic MAC

The guidance recommends that server-generated connection IDs should include a cryptographic MAC when used for load balancing. This sounds like a good suggestion in principle, but the Connection ID is 64 bits in size; I'm not a cryptographer, but I don't see how this is possible. If it is possible, it would perhaps be good to include guidance within the documents?

My logic is as follows (hopefully I've made a mistake/misunderstood the concept somewhere)

As a CDN operator I'd probably want to segment the connection ID into 4 spaces:
POP id|machine id|unique connection|MAC

You could combine your POP id and machine ID into a single space but it is likely to be beneficial to have two tiers, the POP routes the request to the right general location and the machine id then identifies a machine within that space. Other ways to split this are also possible.

If 12 bits are used for POP id, that would give 4096 distinct locations. Trends toward deploying deeper into networks (particularly within mobile/cell base stations) mean this number is not totally infeasible.
If 8 bits are then used for a machine id within a POP, that gives 256 possible machines.

This means 20 bits might be used for load balancing purposes. This gives a bit (lot?) of room for flexibility, and it is unlikely all one million combinations are used, but it is probably the right order of magnitude for a large, widely distributed CDN (even if we drop down to 16 bits it doesn't significantly impact the analysis below).

The actual unique connection pool needs to be large enough to avoid repeating too quickly - even if you have a lot of clients taking advantage of 0RTT to aggressively close/re-open connections. There also presumably needs to be some randomization of the ids so that they aren't created sequentially. I've assumed 24 bits (which is only 16 million unique ids per server).

This only leaves 20 bits left for the MAC, that is about 1 million possible values. As I've said I'm not a cryptographer but to me this looks small enough that a brute force attack guided by some valid inputs might be feasible even on relatively low power machines?

For that matter isn't a simple replay attack feasible? The MAC can't be based upon the full 5-tuple as we need to support connection mobility. You could embed a timestamp into the id somewhere to protect against this but that would use up even more bits, otherwise you could slowly gather up valid (but presumably closed) connection ids and then aggressively replay them all in a short time period.
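
To make the bit budget concrete, a sketch of packing a 64-bit CID as laid out above, with the 20-bit MAC computed as a truncated HMAC over the routing and connection fields (field widths and key handling are illustrative only):

    import hashlib, hmac

    MAC_KEY = b"example-key"        # illustrative; real deployments rotate and protect keys

    # Sketch: 64-bit CID = 12-bit POP id | 8-bit machine id | 24-bit connection id
    # | 20-bit truncated HMAC. With only 20 MAC bits, an attacker who can test
    # guesses needs on the order of 2^20 tries per forged CID, which is the
    # concern raised above.
    def make_cid(pop_id: int, machine_id: int, conn_id: int) -> bytes:
        routed = (pop_id << 32) | (machine_id << 24) | conn_id            # 44 bits
        mac = hmac.new(MAC_KEY, routed.to_bytes(6, "big"), hashlib.sha256).digest()
        mac20 = int.from_bytes(mac[:3], "big") >> 4                       # keep 20 bits
        return ((routed << 20) | mac20).to_bytes(8, "big")

    def verify_cid(cid: bytes) -> bool:
        value = int.from_bytes(cid, "big")
        routed, mac20 = value >> 20, value & 0xFFFFF
        mac = hmac.new(MAC_KEY, routed.to_bytes(6, "big"), hashlib.sha256).digest()
        return mac20 == int.from_bytes(mac[:3], "big") >> 4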

Write subsection "Thinking in zero RTT"

Jana noted at the interim in Paris that we should point out
that applications need to be re-thought slightly to get the benefits of zero
RTT. Add a little text to discuss this and why it's worth the effort,
before we go straight into the dragons.
