aws / s2n-quic Goto Github PK

View Code? Open in Web Editor NEW

1.1K 24.0 114.0 11.79 MB

An implementation of the IETF QUIC protocol

Home Page: https://crates.io/crates/s2n-quic

License: Apache License 2.0

Rust 99.44% Dockerfile 0.08% Shell 0.41% HTML 0.04% Procfile 0.03% C 0.01%

quic rust cryptography s2n

s2n-quic's Introduction

s2n-quic

s2n-quic is a Rust implementation of the IETF QUIC protocol, featuring:

a simple, easy-to-use API. See an example of an s2n-quic echo server built with just a few API calls
high configurability using providers for granular control of functionality
extensive automated testing, including fuzz testing, integration testing, unit testing, snapshot testing, efficiency testing, performance benchmarking, interoperability testing and more
integration with s2n-tls, AWS's simple, small, fast and secure TLS implementation, as well as rustls
thorough compliance coverage tracking of normative language in relevant standards
and much more, including CUBIC congestion controller support, packet pacing, Generic Segmentation Offload support, Path MTU discovery, and unique connection identifiers detached from the address

See the API documentation and examples to get started with s2n-quic.

Installation

s2n-quic is available on crates.io and can be added to a project like so:

[dependencies]
s2n-quic = "1"

NOTE: On unix-like systems, s2n-tls will be used as the default TLS provider. On linux systems, aws-lc-rs will be used for cryptographic operations. A C compiler and CMake may be required on these systems for installation.

Example

The following implements a basic echo server and client. The client connects to the server and pipes its stdin on a stream. The server listens for new streams and pipes any data it receives back to the client. The client will then pipe all stream data to stdout.

Server

// src/bin/server.rs
use s2n_quic::Server;
use std::{error::Error, path::Path};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let mut server = Server::builder()
        .with_tls((Path::new("cert.pem"), Path::new("key.pem")))?
        .with_io("127.0.0.1:4433")?
        .start()?;

    while let Some(mut connection) = server.accept().await {
        // spawn a new task for the connection
        tokio::spawn(async move {
            while let Ok(Some(mut stream)) = connection.accept_bidirectional_stream().await {
                // spawn a new task for the stream
                tokio::spawn(async move {
                    // echo any data back to the stream
                    while let Ok(Some(data)) = stream.receive().await {
                        stream.send(data).await.expect("stream should be open");
                    }
                });
            }
        });
    }

    Ok(())
}

Client

// src/bin/client.rs
use s2n_quic::{client::Connect, Client};
use std::{error::Error, path::Path, net::SocketAddr};

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let client = Client::builder()
        .with_tls(Path::new("cert.pem"))?
        .with_io("0.0.0.0:0")?
        .start()?;

    let addr: SocketAddr = "127.0.0.1:4433".parse()?;
    let connect = Connect::new(addr).with_server_name("localhost");
    let mut connection = client.connect(connect).await?;

    // ensure the connection doesn't time out with inactivity
    connection.keep_alive(true)?;

    // open a new stream and split the receiving and sending sides
    let stream = connection.open_bidirectional_stream().await?;
    let (mut receive_stream, mut send_stream) = stream.split();

    // spawn a task that copies responses from the server to stdout
    tokio::spawn(async move {
        let mut stdout = tokio::io::stdout();
        let _ = tokio::io::copy(&mut receive_stream, &mut stdout).await;
    });

    // copy data from stdin and send it to the server
    let mut stdin = tokio::io::stdin();
    tokio::io::copy(&mut stdin, &mut send_stream).await?;

    Ok(())
}

Minimum Supported Rust Version (MSRV)

s2n-quic will maintain a rolling MSRV (minimum supported rust version) policy of at least 6 months. The current s2n-quic version is not guaranteed to build on Rust versions earlier than the MSRV.

The current MSRV is 1.71.0.

Security issue notifications

If you discover a potential security issue in s2n-quic we ask that you notify AWS Security via our vulnerability reporting page. Please do not create a public github issue.

If you package or distribute s2n-quic, or use s2n-quic as part of a large multi-user service, you may be eligible for pre-notification of future s2n-quic releases. Please contact [email protected].

License

This project is licensed under the Apache-2.0 License.

s2n-quic's People

Contributors

Stargazers

Watchers

Forkers

ghishadow renaissanxe isgasho everesio cyberflamego jqk6 neko-suki meerd csosto-pk ai-zixun jz5121 czakian stjordanis flyarong rhernandez35 nsdyoshi xiaxiangjun exceloo gdm agnos-ai wesleyrosenblum lundinc2 goatgoose yang282441848 nottirb zhassan-aws swift-quic 51coder mor9x charleschege test-mass-forker-org-1 orrinni bdonlan dougch justsmth wuhx fanweixiao zz85 kenelite re2zero thekuwayama chayleaf hyungjic lokenetwork danielsn songwei163 hansonchar morristai mkxzy skmcgrail dmgolembiowski lmjw mark-simulacrum celinval rklaehn jmayclin x77a1 xaclincoln gg-big-org daiyy rrichardson flub forestofrain binadamu-isiyoonekana tosunkaya jon-chuang cenobites peteaudinate qinheping harrisonkaiser shraddha-1508 genesysgo kckeiks aetheriaxai aditishri18 iq-scm karkhaz kinddevil cg-mitun-akil xucsh dongcarl junkurihara nanofi fxlb clownsw seanpm2001 yodatech1988 feliperodri mambisi mackey0225 brandonsimpson21 ma-etl ma-9999 yixinin getong bitcapybara ifd3f v0idmatr1x mohe2015 zerolh0

s2n-quic's Issues

Ignoring loss of undecryptable packets

https://tools.ietf.org/id/draft-ietf-quic-recovery-31.html#section-7.4

During the handshake, some packet protection keys might not be available when a packet arrives and the receiver can choose to drop the packet. In particular, Handshake and 0-RTT packets cannot be processed until the Initial packets arrive and 1-RTT packets cannot be processed until the handshake completes. Endpoints MAY ignore the loss of Handshake, 0-RTT, and 1-RTT packets that might have arrived before the peer had packet protection keys to process those packets. Endpoints MUST NOT ignore the loss of packets that were sent after the earliest acknowledged packet in a given packet number space.

Implement CUBIC congestion controller with Hybrid Slow Start

The signals QUIC provides for congestion control are generic and are designed to support different algorithms. Endpoints can unilaterally choose a different algorithm to use, such as Cubic ([RFC8312]).

https://tools.ietf.org/id/draft-ietf-quic-recovery-31.html#section-7

This issue encompasses sections 7.0 - 7.3

Send a single STREAM frame with contiguous data

Currently the DataSender component sends a STREAM frame per chunk that was submitted by the application. If the application decides to submit several small chunks, this can result in quite a bit of framing overhead over the wire.

Instead we should send a single STREAM frame for any contiguous data.

Determine Stream drop behavior

From comment: #129 (comment)

Matthias247

Btw: I learned that Quinn implements dropping a stream as closing it instead of resetting it. That seems like an option to get around the UNKNOWN issue. However I am not sure if I prefer it, since it seems like it could close the stream even if not all expected data is sent.

camshaft

Yeah I could go either way on it. I think if we want to be close to what TCP streams do in Rust it would probably be better to finish it rather than reset.

From https://doc.rust-lang.org/std/net/struct.TcpStream.html

The connection will be closed when the value is dropped. The reading and writing portions of the connection can also be shut down individually with the shutdown method.

From tokio-rs/tokio-io#73 (comment)

the intention of shutdown was for things like "shut down the TLS connection" or "flush remaining buffers and then shut down the underlying socket" or things like that, the name shutdown wasn't intended to convey a literal TCP shutdown and was a mistake on our part.

use the std lib's pattern for naming

As I've been figuring out what will be included in the public API, I noticed a few of the naming conventions are inconsistent. I propose we follow what rust's std does:

Simple struct or enum names, namespaced by modules

The entry for a hashmap is called std::collections::hash_map::Entry, not std::collections::hash_map::HashMapEntry. We don't follow this rule in several places. For example: s2n_quic_transport::connection::ConnectionInterests should really just be s2n_quic_transport::connection::Interests, since we already know it's inside the connection module.

Another example is the IO error isn't called IOError, it's just std::io::Error. We've got a few of these, with s2n_quic_core::connection::ConnectionError being one. It should really just be s2n_quic_core::connection::Error.

There's a clippy lint to enforce this as well: https://rust-lang.github.io/rust-clippy/master/#module_name_repetitions

Simple associated type naming

The associated types don't have a Type suffix, as that's implied by the casing. Examples of this include Iterator::Item and Future::Output.

We've got several traits that include Type in type name, which doesn't follow the std convention. See s2n_quic_transport::endpoint::EndpointConfig::ConnectionConfigType for an example. It should just be ConnectionConfig.

Cubic Behavior for Application-Limited Flows

CUBIC does not raise its congestion window size if the flow is currently limited by the application instead of the congestion window. In case of long periods when cwnd has not been updated due to the application rate limit, such as idle periods, t in Eq. 1 MUST NOT include these periods; otherwise, W_cubic(t) might be very high after restarting from these periods.

https://tools.ietf.org/html/rfc8312#section-5.8

Congestion controller must not block probe packets

Probe packets MUST NOT be blocked by the congestion controller.

https://tools.ietf.org/id/draft-ietf-quic-recovery-31.html#section-7.5

ci: cargo bolero

We need to run cargo bolero on PRs and push to the default branch. It should also run nightly to ensure the corpus stays healthy and up-to-date.

Differentiate between local and peer connection close in APIs

Currently when a connection error occurs Error::ConnectionClosed is emitted. This is independent of which side actually closed the connection.

It seems desirable to be able to differentiate between a connection which was closed locally and one which was closed by the peer. E.g. in a HTTP/3 implementation we will want log errors in several places if the peer closed the connection - but don't need an error if we locally closed the connection from another task (because we already logged this in a different place).

Integration with Quinn at draft 29

Problem

Interop tests are failing against Quinn as the draft version has increased. We should update the existing codebase to accept a basic connection with Quinn

Solution

Update Initial salt value, along with several other values in crypto::initial (to pass unit tests).
Add new TransportParameters, and other spec differences, in order to complete basic connection.

Out of scope

It would be great to have automation to read certain values out of the spec, but that is not part of this issue.
We should fully comply with draft-29, but this issue is only about getting the interop test working.

Cubic panics while computing the window_increase_rate

            let window_increase_rate = (target_congestion_window - self.congestion_window) as f32
                / self.congestion_window as f32;

ESC[33mserver          |ESC[0m thread 'tokio-runtime-worker' panicked at 'attempt to subtract with overflow', quic/s2n-quic-
transport/src/recovery/cubic.rs:404:40
ESC[33mserver          |ESC[0m stack backtrace:
ESC[33mserver          |ESC[0m    0: backtrace::backtrace::libunwind::trace
ESC[33mserver          |ESC[0m              at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtra
ce/libunwind.rs:86
ESC[33mserver          |ESC[0m    1: backtrace::backtrace::trace_unsynchronized
ESC[33mserver          |ESC[0m              at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtra
ce/mod.rs:66
ESC[33mserver          |ESC[0m    2: std::sys_common::backtrace::_print_fmt
ESC[33mserver          |ESC[0m              at src/libstd/sys_common/backtrace.rs:78
ESC[33mserver          |ESC[0m    3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
ESC[33mserver          |ESC[0m              at src/libstd/sys_common/backtrace.rs:59
ESC[33mserver          |ESC[0m    4: core::fmt::write
ESC[33mserver          |ESC[0m              at src/libcore/fmt/mod.rs:1076
ESC[33mserver          |ESC[0m    5: std::io::Write::write_fmt
ESC[33mserver          |ESC[0m              at src/libstd/io/mod.rs:1537
ESC[33mserver          |ESC[0m    6: std::sys_common::backtrace::_print
ESC[33mserver          |ESC[0m              at src/libstd/sys_common/backtrace.rs:62
ESC[33mserver          |ESC[0m    7: std::sys_common::backtrace::print
ESC[33mserver          |ESC[0m              at src/libstd/sys_common/backtrace.rs:49
ESC[33mserver          |ESC[0m    8: std::panicking::default_hook::{{closure}}
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:198
ESC[33mserver          |ESC[0m    9: std::panicking::default_hook
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:218
ESC[33mserver          |ESC[0m   10: std::panicking::rust_panic_with_hook
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:486
ESC[33mserver          |ESC[0m   11: rust_begin_unwind
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:388
ESC[33mserver          |ESC[0m   12: core::panicking::panic_fmt
ESC[33mserver          |ESC[0m              at src/libcore/panicking.rs:101
ESC[33mserver          |ESC[0m   13: core::panicking::panic
ESC[33mserver          |ESC[0m              at src/libcore/panicking.rs:56
ESC[33mserver          |ESC[0m   14: s2n_quic_transport::recovery::cubic::CubicCongestionController::congestion_avoidance
ESC[33mserver          |ESC[0m              at quic/s2n-quic-transport/src/recovery/cubic.rs:404
ESC[33mserver          |ESC[0m   15: <s2n_quic_transport::recovery::cubic::CubicCongestionController as s2n_quic_core::recovery::congestion_controller::CongestionController>::on_packet_ack
ESC[33mserver          |ESC[0m              at quic/s2n-quic-transport/src/recovery/cubic.rs:233
ESC[33mserver          |ESC[0m   16: s2n_quic_transport::recovery::manager::Manager::on_ack_frame
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/recovery/manager.rs:340
ESC[33mserver          |ESC[0m   17: <s2n_quic_transport::space::application::ApplicationSpace<Config> as s2n_quic_transport::space::PacketSpace<Config>>::handle_ack_frame
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/space/application.rs:300
ESC[33mserver          |ESC[0m   18: s2n_quic_transport::space::PacketSpace::handle_cleartext_payload
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/space/mod.rs:400

Add padding to datagrams containing Initial packets

https://tools.ietf.org/html/draft-ietf-quic-transport-32#section-14.1

A client MUST expand the payload of all UDP datagrams carrying
Initial packets to at least the smallest allowed maximum datagram
size of 1200 bytes by adding PADDING frames to the Initial packet or
by coalescing the Initial packet; see Section 12.2. Similarly, a
server MUST expand the payload of all UDP datagrams carrying ack-
eliciting Initial packets to at least the smallest allowed maximum
datagram size of 1200 bytes.

Respect congestion window when sending a packet

https://tools.ietf.org/id/draft-ietf-quic-recovery-31.html#section-7

An endpoint MUST NOT send a packet if it would cause bytes_in_flight (see Appendix B.2) to be larger than the congestion window, unless the packet is sent on a PTO timer expiration (see Section 6.2) or when entering recovery (see Section 7.3.2).

Reusing RTT on resumed connections

https://tools.ietf.org/id/draft-ietf-quic-recovery-31.html#section-6.2.2

"Resumed connections over the same network MAY use the previous connection's final smoothed RTT value as the resumed connection's initial RTT. "

Read watermark

With the current design, a client is not able to indicate how many bytes it wants to read on a stream. It would simply do

let chunk = stream.pop().await?;

Now a certain concern here could occur if the peer sends a lot of small packets (which are "normal" in Quic with <= 1200 byte payloads):

For every packet, the payload gets stored in the streams receive buffer
The stream would notify the application code
The application code would read it, but since it requires more data it gets blocked again

This could repeat a couple of times, and in each step we are switching between 2 tasks and are locking/unlocking a Mutex. E.g. to read 16kB - which is a reasonable size for a HTTP DATA frame - we might have 13 transitions between application task and QUIC task.

One way to optimize this would be to make sure we process as many packets for a connection as possible for unlocking the mutex and signaling the application task - which might be worthwhile to do anyway.

However we can also improve here on the API side: The client could be able to set a "Low Watermark" - the minimum amount of data that needs to be stored in the receive buffer in order for the wait handle to be awoken.
For HTTP/3 DATA frames, the HTTP/3 library could set this one to the length of the remaining frame. This should minimize the amount of necessary wakeups.

If we implement this, I think the watermark needs to be a recommendation and not being strictly enforced. E.g. the QUIC library would need to be able to notify the reader if the receive buffer is (full / half full /etc) - even if the watermark hasn't been hit yet. API wise there are probably a couple of ways to do this.

We could add the API to just the watermark, and keep the pop APIs as is (and make them respect the watermark). E.g.

stream.set_receive_low_watermark(16*1024)?;
// Unblocks only if 16kB 
let chunk = stream.pop().await?;

or we could add an even new API, which just allows to "wait until watermark is hit" - and then make the pop API non-awaitable (similar to how the write API works):

stream.set_receive_low_watermark(16*1024)?;
// Unblocks only if 16kB 
stream.wait_ready().await?;
while !reached_16kb {
    let chunk = stream.pop()?; // No .await here
}

I think in order to determine whether this is useful we need to perform some benchmarking on how many of those context-switches we actually observe when reading HTTP/3. But I still determined it would be worthwhile to document it here.

Also @zz85 - since he's working on reading HTTP/3 frames.

ci: quic-interop-runner

We need to execute the quic-interop-runner tests on PR and push to the default branch.

transport: IsAppOrFlowControlLimited

https://www.ietf.org/id/draft-ietf-quic-recovery-31.html#name-under-utilizing-the-congest
When bytes in flight is smaller than the congestion window and sending is not pacing limited, the congestion window is under-utilized. When this occurs, the congestion window SHOULD NOT be increased in either slow start or congestion avoidance. This can happen due to insufficient application data or flow control limits.

We need a way to determine if we are currently app limited or flow control limited. One call out is that it is possible the limited status could change in the middle of processing acks in a single ack frame.

Packet pacing

Packet pacing is a method for smoothing the transmission of outgoing packets over time based on an estimate of available bandwidth. Currently, s2n-quic transmits outgoing packets as soon as it is able, as long as there is sufficient room in the congestion window. A burst of outgoing packets can lead to loss, as shallow buffers in network routers may be overwhelmed with the sudden increase in traffic. This issue is track the implementation of packet pacing.

RFC Requirements:

QUIC Loss Detection and Congestion Control RFC 9002 [3] §7.7
- A sender SHOULD pace sending of all in-flight packets based on input from the congestion controller.
- Senders MUST either use pacing or limit such bursts.
- Senders SHOULD limit bursts to the initial congestion window
- A sender with knowledge that the network path to the receiver can absorb larger bursts MAY use a higher limit.
- To avoid delaying their delivery to the peer, packets containing only ACK frames SHOULD therefore not be paced.
QUIC Loss Detection and Congestion Control RFC 9002 [3] §7.8
- A sender SHOULD NOT consider itself application limited if it would have fully utilized the congestion window without pacing delay.
BBR Congestion Control [4] §4.2.1
- Pacing is the primary mechanism that BBR uses to control its sending behavior; BBR implementations MUST implement pacing.

Tasks:

Implement default pacer for use in CubicCongestionController #1000
Incorporate default pacer into CubicCongestionController#1026
Add method to CongestionController trait to get next earliest departure time for a packet #1026

ci: cargo publish

We should automatically publish to crates.io on tagging a release.

Queue packets after Initial processing

Problem:
Customers need to tell the library when to delay connections: 8.1.2. This can be done through retry packets, and is how the address validation code will be utilized. The library can determine conditions where new connections should be delayed by communicating with a limit provider.

Possible solutions:
While processing an initial packet on an endpoint, we ask a limit provider whether we want to handle additional connections. It is possible that we want to delay new handshakes or close new connection attempts. Reasons for this behavior are defined in #143. Reference section .

There is no way to send packets from this part of the code. We need to implement the ability to send Retry or Close packets with a backoff.

        # s2n-quic-transport/src/endpoint/initial.rs
        let info = endpoint::ConnectionInfo::new(0, &datagram.remote_address);
        match endpoint_context.governor.on_connection_attempt(&info) {
            endpoint::Outcome::Allow => {
                // No action
            }
            endpoint::Outcome::Retry { delay: _ } => {
                //= https://tools.ietf.org/html/draft-ietf-quic-transport-31#section-8.1.3
                //# A server can also use a Retry packet to defer the state and
                //# processing costs of connection establishment.
            }
            endpoint::Outcome::Drop => {
                // Stop processing
                return Ok(());
            }
            endpoint::Outcome::Close { delay: _ } => {
                // Queue close packet
            }
        }

transport: Fast Retransmit in Congestion Controller

https://www.ietf.org/id/draft-ietf-quic-recovery-31.html#name-recovery

Implementations MAY reduce the congestion window immediately upon entering a recovery period or use other mechanisms, such as Proportional Rate Reduction ([PRR]), to reduce the congestion window more gradually. If the congestion window is reduced immediately, a single packet can be sent prior to reduction. This speeds up loss recovery if the data in the lost packet is retransmitted and is similar to TCP as described in Section 5 of [RFC6675].

quicwg/base-drafts#3335

To reduce loss recovery time when first entering a recovery period, we need a way to immediately retransmit a lost packet without being limited by the congestion window. One solution could involve setting a flag on the path to allow transmission of one packet that ignores the congestion window.

Deliver a project plan for the implementation of CUBIC congestion control algorithm

Setting client-dependent configurations in multi-tenant deployments

Conventional transport and QUIC libraries all only allow to configure settings per endpoint. However for multi-tenant systems like CDNs we need to be able to configure certain settings dependent on the Connection that is established.

E.g. the endpoint might serve connections and requests for customers which have different requirements regarding TLS configuration and/or other connection-level settings.

There are typically 2 ways to identify which configuration to apply in a multi-tenant system:

By source/destination IP.
By checking the SNI information during the handshake.

I think for configuration which we think is necessary to be loaded dynamically (based on the client), we need to make sure that the necessary hook for obtaining it is run at a point where we have that information available. This could require API changes.

Now the next questions is which configurations we need to set on a per-connection configuration base. For TLS we loaded the security policy dynamically based on the SNI. With QUIC this requirement will initially not apply - since every connection must use TLS1.3. But there could be requirements in the future.

We definitely need to load certificates dynamically during the handshake. This should also cover the requirement to disable/fail the handshake for endpoints which have no associated configuration.

Are we covered on those so far?
And besides those, is there anything else we are missing?

I can think of settings like

connection-level timeouts
being able to increase/decrease buffer sizes depending on the identified client, to accomodate users requiring high-perf as well as to have some tools to throttle clients which are not very well behaved
enabling Qlog for certain customers, in order to be able to debug issues

I guess those could all be alternatively be implemented by having setters on the Connection that allow to change those things. However injecting and setting the config while creating the Connection might be easier?

The TLS handshake allows to identify the customer

ci: quic-network-simulator

We should run network simulations with quic-network-simulator and track performance over time.

ci: Setup restyled.io

This will automatically run rustfmt on the code and, if there are errors, create a PR for the PR with the fixes.

Create s2n-tls binding crate

Retry Packets

Problem:
Tokens must be sent to clients in Retry Packets. These packets are defined in section 17.2.5. The current code was written for draft 22, and there have been changes since. Retry Packets must be updated to reflect these changes.

Work required:
Update the retry packet module based on Draft 31.

Update comments to reflect current wording
Update packet layout for all correct fields
Implement retry packet integrity algorithm
Tests for integrity and encoding / decoding

Support CID for Address Validation

Problem
Draft 30 introduced the option for servers to use the connection ID to validate clients if the CID contains enough entropy.

This change PR is here.

Possible Solution

Add a field to the token provider and format that allows validation by CID, something like allow_cid_validation: bool which defaults to false.
Verify our default Connection ID provider has at least 64 bits of entropy.
If allow_cid_validation is set, mark the path as verified conforming to the QUIC draft update.
Document the option to let customers know about the 64 bit requirement.

Cons
Customers have the ability to write their own Connection ID provider. If customers write their own provider, they must make sure their CIDs have enough entropy if allow_cid_validation is enabled.

Cubic panics while computing `t`

https://github.com/awslabs/s2n-quic/blob/main/quic/s2n-quic-transport/src/recovery/cubic.rs#L222

let t = ack_receive_time - avoidance_start_time;

ESC[33mserver          |ESC[0m thread 'tokio-runtime-worker' panicked at 'received an ACK at Timestamp(1526192) before the a
voidance start time at Timestamp(1523470)', quic/s2n-quic-transport/src/recovery/cubic.rs:225:21
ESC[33mserver          |ESC[0m stack backtrace:
ESC[33mserver          |ESC[0m    0: backtrace::backtrace::libunwind::trace
ESC[33mserver          |ESC[0m              at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtra
ce/libunwind.rs:86
ESC[33mserver          |ESC[0m    1: backtrace::backtrace::trace_unsynchronized
ESC[33mserver          |ESC[0m              at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtra
ce/mod.rs:66
ESC[33mserver          |ESC[0m    2: std::sys_common::backtrace::_print_fmt
ESC[33mserver          |ESC[0m              at src/libstd/sys_common/backtrace.rs:78
ESC[33mserver          |ESC[0m    3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
ESC[33mserver          |ESC[0m              at src/libstd/sys_common/backtrace.rs:59
ESC[33mserver          |ESC[0m    4: core::fmt::write
ESC[33mserver          |ESC[0m              at src/libcore/fmt/mod.rs:1076
ESC[33mserver          |ESC[0m    5: std::io::Write::write_fmt
ESC[33mserver          |ESC[0m              at src/libstd/io/mod.rs:1537
ESC[33mserver          |ESC[0m    6: std::sys_common::backtrace::_print
ESC[33mserver          |ESC[0m              at src/libstd/sys_common/backtrace.rs:62
ESC[33mserver          |ESC[0m    7: std::sys_common::backtrace::print
ESC[33mserver          |ESC[0m              at src/libstd/sys_common/backtrace.rs:49
ESC[33mserver          |ESC[0m    8: std::panicking::default_hook::{{closure}}
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:198
ESC[33mserver          |ESC[0m    9: std::panicking::default_hook
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:218
ESC[33mserver          |ESC[0m   10: std::panicking::rust_panic_with_hook
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:486
ESC[33mserver          |ESC[0m   11: rust_begin_unwind
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:388
ESC[33mserver          |ESC[0m   12: std::panicking::begin_panic_fmt
ESC[33mserver          |ESC[0m              at src/libstd/panicking.rs:342
ESC[33mserver          |ESC[0m   13: <s2n_quic_transport::recovery::cubic::CubicCongestionController as s2n_quic_core::recovery::congestion_controller::CongestionController>::on_packet_ack
ESC[33mserver          |ESC[0m              at quic/s2n-quic-transport/src/recovery/cubic.rs:225
ESC[33mserver          |ESC[0m   14: s2n_quic_transport::recovery::manager::Manager::on_ack_frame
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/recovery/manager.rs:340
ESC[33mserver          |ESC[0m   15: <s2n_quic_transport::space::application::ApplicationSpace<Config> as s2n_quic_transport::space::PacketSpace<Config>>::handle_ack_frame
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/space/application.rs:300
ESC[33mserver          |ESC[0m   16: s2n_quic_transport::space::PacketSpace::handle_cleartext_payload
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/space/mod.rs:398
ESC[33mserver          |ESC[0m   17: <s2n_quic_transport::connection::connection_impl::ConnectionImpl<Config> as s2n_quic_transport::connection::connection_trait::ConnectionTrait>::handle_short_packet
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/connection/connection_impl.rs:627
ESC[33mserver          |ESC[0m   18: s2n_quic_transport::connection::connection_trait::ConnectionTrait::handle_packet
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/connection/connection_trait.rs:190
ESC[33mserver          |ESC[0m   19: s2n_quic_transport::endpoint::Endpoint<Cfg>::receive_datagram::{{closure}}
ESC[33mserver          |ESC[0m              at ./s2n-quic/quic/s2n-quic-transport/src/endpoint/mod.rs:163

ci: semverver

Execute ~~rust-semverver~~ to ensure we maintain API compatibility.

rust-semverver is deprecated so consider the listed alternatives instead: https://github.com/rust-lang/rust-semverver?tab=readme-ov-file#deprecation-notice

Retry Token tests

These tests are expected to fail at first because the logic required has not been implemented. This task\ includes writing the code required to make the tests pass.

Server closes connection on invalid retry tokens
https://tools.ietf.org/html/draft-ietf-quic-transport-30#section-8.1.2

Instead, the server SHOULD immediately close (Section 10.3) the connection with an INVALID_TOKEN error.

Server detects duplicate tokens and treats them as invalid
https://tools.ietf.org/html/draft-ietf-quic-transport-30#section-8.1.4

To protect against such attacks, servers MUST ensure that replay of tokens is prevented or limited.
Note: This may be more applicable to token from NEW_TOKEN frames. However, we need to make sure our trait and detection is correct to avoid changing the API later.

Server does not send Retry Token unless a limit is met
Our server should not always send a retry token (it adds a round trip). It should only happen when a limit is met (#143)
https://tools.ietf.org/html/draft-ietf-quic-transport-30#section-8.1.2

A server can also use a Retry packet to defer the state and processing costs of connection establishment.

ci: cargo bench

We need to run cargo bench on PRs and push to the default branch.

Note: It's probably better to run benchmarks in a predictable environment, like codebuild, instead of GitHub Actions where we can't control the machine resources as well. Without consistent resources, the report will be inaccurate between each run.

Tasks

Compare results to previous executions (see https://bheisler.github.io/criterion.rs/book/user_guide/command_line_output.html#change)
Publish reports

Cubic panics while tracking BytesInFlight

https://github.com/awslabs/s2n-quic/blob/main/quic/s2n-quic-transport/src/recovery/cubic.rs#L83

self.0 -= rhs;

server          | BYTES IN FLIGHT 33442 - 1200
server          | BYTES IN FLIGHT 32242 - 1200
server          | BYTES IN FLIGHT 31042 - 1200
server          | BYTES IN FLIGHT 29842 - 1200
server          | BYTES IN FLIGHT 28642 - 1200
server          | BYTES IN FLIGHT 27442 - 1200
server          | BYTES IN FLIGHT 26242 - 1200
server          | BYTES IN FLIGHT 25042 - 1200
server          | BYTES IN FLIGHT 23842 - 1200
server          | BYTES IN FLIGHT 22642 - 1200
server          | BYTES IN FLIGHT 21442 - 1200
server          | BYTES IN FLIGHT 20242 - 1200
server          | BYTES IN FLIGHT 19042 - 1200
server          | BYTES IN FLIGHT 17842 - 1200
server          | BYTES IN FLIGHT 16642 - 1200
server          | BYTES IN FLIGHT 15442 - 1200
server          | BYTES IN FLIGHT 14242 - 1200
server          | BYTES IN FLIGHT 13042 - 1200
server          | BYTES IN FLIGHT 11842 - 1200
server          | BYTES IN FLIGHT 10642 - 1200
server          | BYTES IN FLIGHT 9442 - 1190
server          | BYTES IN FLIGHT 8252 - 1200
server          | BYTES IN FLIGHT 7052 - 1200
server          | BYTES IN FLIGHT 5852 - 1200
server          | BYTES IN FLIGHT 4652 + 1200
server          | BYTES IN FLIGHT 5852 + 39
server          | BYTES IN FLIGHT 5891 - 1200
server          | BYTES IN FLIGHT 4691 - 1200
server          | BYTES IN FLIGHT 3491 + 39
server          | BYTES IN FLIGHT 3530 - 1200
server          | BYTES IN FLIGHT 2330 - 1200
server          | BYTES IN FLIGHT 1130 + 625
server          | BYTES IN FLIGHT 1755 + 605
server          | BYTES IN FLIGHT 2360 - 1200
server          | BYTES IN FLIGHT 1160 - 1200
server          | thread 'tokio-runtime-worker' panicked at 'attempt to subtract with overflow', quic/s2n-quic-transport/src/recovery/cubic.rs:84:13
server          | stack backtrace:
server          |    0: backtrace::backtrace::libunwind::trace
server          |              at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
server          |    1: backtrace::backtrace::trace_unsynchronized
server          |              at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
server          |    2: std::sys_common::backtrace::_print_fmt
server          |              at src/libstd/sys_common/backtrace.rs:78
server          |    3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
server          |              at src/libstd/sys_common/backtrace.rs:59
server          |    4: core::fmt::write
server          |              at src/libcore/fmt/mod.rs:1076
server          |    5: std::io::Write::write_fmt
server          |              at src/libstd/io/mod.rs:1537
server          |    6: std::sys_common::backtrace::_print
server          |              at src/libstd/sys_common/backtrace.rs:62
server          |    7: std::sys_common::backtrace::print
server          |              at src/libstd/sys_common/backtrace.rs:49
server          |    8: std::panicking::default_hook::{{closure}}
server          |              at src/libstd/panicking.rs:198
server          |    9: std::panicking::default_hook
server          |              at src/libstd/panicking.rs:218
server          |   10: std::panicking::rust_panic_with_hook
server          |              at src/libstd/panicking.rs:486
server          |   11: rust_begin_unwind
server          |              at src/libstd/panicking.rs:388
server          |   12: core::panicking::panic_fmt
server          |              at src/libcore/panicking.rs:101
server          |   13: core::panicking::panic
server          |              at src/libcore/panicking.rs:56
server          |   14: <s2n_quic_transport::recovery::cubic::BytesInFlight as core::ops::arith::SubAssign<u32>>::sub_assign
server          |              at quic/s2n-quic-transport/src/recovery/cubic.rs:84
server          |   15: <s2n_quic_transport::recovery::cubic::BytesInFlight as core::ops::arith::SubAssign<usize>>::sub_assign
server          |              at quic/s2n-quic-transport/src/recovery/cubic.rs:93
server          |   16: <s2n_quic_transport::recovery::cubic::CubicCongestionController as s2n_quic_core::recovery::congestion_controller::CongestionController>::on_packet_ack
server          |              at quic/s2n-quic-transport/src/recovery/cubic.rs:174
server          |   17: s2n_quic_transport::recovery::manager::Manager::on_ack_frame
server          |              at ./s2n-quic/quic/s2n-quic-transport/src/recovery/manager.rs:340
server          |   18: <s2n_quic_transport::space::application::ApplicationSpace<Config> as s2n_quic_transport::space::PacketSpace<Config>>::handle_ack_frame

Token deduplication

Problem:
Duplicate token detection will be done by our library because the RFC states servers MUST protect against token replay attacks. This lets us help customer protect themselves against replay attacks.

We provide a trait to get the hash of a token:

fn token_hash<'a>(&self, token: &'a [u8]) -> &'a [u8];

We need a module to verify a token is not a duplicate based on this hash.

Possible solutions:
Key rotation allows us to invalidate a token after some period of time. Duplicating invalid tokens won't assist an attack, so we can limit our duplicate detection to a time window. A stable bloom filter could be used to manage an upper bound on false positives while keeping memory requirements low. A downside is possible false negatives, causing valid tokens to be discarded.

We could instead use a cuckoo filter. This filter doesn't have false negatives and we could just re-create it on key rotation to clear out stale information.

Notes:
https://tools.ietf.org/html/draft-ietf-quic-transport-30#section-8.1.4

Attackers could replay tokens to use servers as amplifiers in DDoS
attacks. To protect against such attacks, servers MUST ensure that
replay of tokens is prevented or limited.

RUSTSEC-2020-0016: `net2` crate has been deprecated; use `socket2` instead

net2 crate has been deprecated; use socket2 instead

Details
Status	unmaintained
Package	`net2`
Version	`0.2.34`
URL	deprecrated/net2-rs@`3350e38`
Date	2020-05-01

The net2 crate has been deprecated
and users are encouraged to considered socket2 instead.

See advisory page for additional details.

Deliver the first draft of QUIC public APIs

Deliver a design document for address and path validation and connection migration functionalities

Allow customer configuration of duplicate token filter in default provider

Problem:
The cuckoo filter used to detect duplicate tokens has a default capacity of (1 << 20) - 1. This is certainly too large for tracking tokens between key rotations. We should allow customers to configure their own filter size similar to how they configure the key_rotation_period.

Possible Solutions:

Add a field to the Provider struct
Provide some data on expected connections/sec and key_rotation_period to pick a reasonable value.

Handshake resource exhaustion limit

Problem:
There is a limit provider which customers can use to set limits in the system. We should have a way to set a limit that indicates handshakes should be deferred. This would be used as an indicator that QUIC should send Retry Tokens to clients.

Possible Solutions:
Update the limit provider to allow customers to set a max_concurrent_handshake limit.

transport: Find a more efficient data structure for SentPackets

SentPackets, which tracks packets that are pending acknowledgement, it currently implemented as a BTreeMap. BTreeMap does not have an efficient way of draining items from the map (drain_filter is only implemented in Nightly). Therefore, we currently are allocating a Vec with the packets that need to be removed from the BTreeMap, and then iterating over that. This is not efficient, so we should find a better data structure.

Some ideas:

https://crates.io/crates/cranelift-bforest
https://crates.io/crates/hashbrown (No Std version of HashMap which does have drain, but would not be efficient because the keys are not sorted)
Build our own data structure

RUSTSEC-2019-0031: spin is no longer actively maintained

spin is no longer actively maintained

Details
Status	unmaintained
Package	`spin`
Version	`0.5.2`
URL	mvdnes/spin-rs@`7516c80`
Date	2019-11-21

The author of the spin crate does not have time or interest to maintain it.

Consider the following alternatives (both of which support no_std):

conquer-once
lock_api (a subproject of parking_lot)
- spinning_top spinlock crate built on lock_api

See advisory page for additional details.

ci: cargo bloat

It would be useful to track binary sizes over time to ensure the compiled output stays reasonable and doesn't accumulate bloat without us knowing.

Add tests to verify Recovery works when current time is 0 or very small

Most of the recovery manager assumes that the current time is a value > 0. Since this may not be true, we need to fix areas of the code that make that assumption (#174 contains one fix) and add tests to verify the behavior.

Future connection tokens

Problem:
To prevent a round trip on future connections from known clients the server can provide tokens in a NEW_TOKEN frame. Our default provider should implement the generation and validation of NEW_TOKEN frame tokens.

Possible solutions:
Note a recent update to the spec

It is unlikely that the client port number is the same on two
different connections; validating the port is therefore unlikely to
be successful.

Our default provider could avoid including the port in the generation / validation of these tokens.
Our default provider should consider the time value of these tokens. Retry Tokens have a very short lifetime, but these tokens could live much longer.

Notes
https://tools.ietf.org/html/draft-ietf-quic-transport-30#section-8.1.3

ci: Store/restore fuzz corpus snapshots

We need to store fuzz corpus snapshots in something like S3 so we're not rebuilding it each time.

ci: publish coverage reports

We need to publish coverage reports to an easily-accessible place. https://codecov.io/ seems like a good option.

Connection::poll_accept should return a ConnectionError instead of a StreamError

https://github.com/awslabs/s2n-quic/blob/b1a2c6a069bd26d1112e320cb471eac29cc7f85d/quic/s2n-quic-transport/src/connection/api.rs#L53 seems to use a StreamError when accepting fails. I don't recall the reason for why it is that way. But it seems like this should rather be a connection error => Accepting will mainly fail if something is wrong with the underlying connection. A stream doesn't exist yet at this point.

Endpoint Limits Default Impl

Problem:
The default endpoint::limits implementation (#143) always returns Outcome::Allow. We should allow customers to configure the max handshakes value.

Possible solutions:
Update trait and provider to allow customers to configure their own values. Verify the number of handshakes against self.inflight_handshakes. Also write unit tests to verify behavior.

Track current handshakes

We need to track handshakes in-flight so we can deliver that number to customers. This will allow customers to make decisions about when to send retry packets.

This number will be used by the infrastructure setup in #143.

Build LimitsManager to track endpoint limits
Inc/dec current_peers, inflight requests in manager where appropriate
Handle timeouts

Version negotiation

Problem
Client and server must negotiate a version.
https://tools.ietf.org/html/draft-ietf-quic-transport-31#section-5.2.2
https://tools.ietf.org/html/draft-ietf-quic-transport-31#section-6
https://tools.ietf.org/html/draft-ietf-quic-transport-31#section-17.2.1

Possible solutions:
Our implementation MVP only supports a single Draft of QUIC. If the client tries another version, a packet should be sent to request the version we support.

We should also create follow up issues for items like version downgrading which has not been defined, but should be implemented in the future.

Tests:
To be determined by assignee.

aws / s2n-quic Goto Github PK

s2n-quic's Introduction

s2n-quic

Installation

Example

Server

Client

Minimum Supported Rust Version (MSRV)

Security issue notifications

License

s2n-quic's People

Contributors

Stargazers

Watchers

Forkers

s2n-quic's Issues

Simple struct or enum names, namespaced by modules

Simple associated type naming

Problem

Solution

Out of scope

Links

Links

Tasks

Links

Recommend Projects

Recommend Topics

Recommend Org