hyperium / http Goto Github PK

Rust HTTP types

License: Apache License 2.0

Rust 100.00%

http's Issues

HeaderMap docs show using a string value

The documentation of HeaderMap shows using strings for values, which works in the doc tests because HeaderMap<T> is generic over the value. However, if people try to use the same examples in real code, it won't work, because Request and Response both use HeaderMap<HeaderValue>. The docs should probably be updated to show using HeaderValues.

I also kind of wonder if the generic T will cause problems elsewhere. If someone just needs to view some headers, and writes &HeaderMap, and sees the complaint about a missing generic, will they easily be able to determine it should be &HeaderMap<HeaderValue?

Should we export a type alias instead?

export http::HeaderValue?

this crate exports http::HeaderMap and http::header::HeaderValue. The typical signature of HeaderMap is HeaderMap<HeaderValue>, so it seems natural for both of these to exported together from the root of the module hierarchy.

Please document migration path for "url" crate

Is http::Uri intended as a complete replacement for the url crate? If so, it might make sense to document that, and perhaps work with its author to deprecate it and point users towards the http crate. And, for that matter, I'd suggest adding "uri" and "url" as tags, and mentioning URIs and URLs in the crate description (along with requests and responses), so that it's really obvious this is the crate to use. (And hopefully it'll start showing up prominently in the search results for "uri" and "url".)

Rename uri::FromStrError -> InvalidUri?

Request "always" has a Uri

Request has the following method:

fn uri(&self) -> &Uri
Returns a reference to the associated URI.

But an actual HTTP request doesn't always contain a URI. HTTP's RFC notes that a request has a request-target, which is,

     request-target = origin-form
                    / absolute-form
                    / authority-form
                    / asterisk-form

Only absolute-form is a URI; the remaining three are not. origin-form, I would say, is the form most folks implementing basic CRUD APIs are going to interact with.

My concern is that the API encourages writing origin servers that will respond to non-origin requests poorly, and defeats the point of strong static typing by essentially munging what is an enumeration into a singly string. (i.e., it "stringly types" the data)

E.g., if you receive the following request:

GET http://example.com/ HTTP/1.1
Host: example.com
Connection: close

(assuming the receiving server is not example.com) If your routing does something like request.uri() == "/something", you won't erroneously match a proxy request; however, you're still likely to return 404. If you do request.uri().path(), you're likely going to incorrectly respond with the wrong content.

Ultimately, I feel that the actual type implied by the spec is,

enum RequestTarget {
    Origin(Path),
    Absolute(Uri),
    Authority(Authority),
    Asterisk,
}

This would render some of the method that presume they can always return things (e.g., Request::uri) into Options. (e.g., Request::uri() -> Option<Uri>); but this would also mean Uri would actually represent a URI and wouldn't need to be capable of handling weird things like "*". (As others noted: Uri seems like something that should be more common than it is, and removing HTTP-specific meanings from it makes it more easily extractable.)

Request might benefit from some helper methods to deal with "I just want to write an origin server" — but again, the need for such is already present, hidden behind the fact that all four types are munged into one.

Method type

An initial design uses an opaque struct and constructor functions to create the different variants: https://github.com/seanmonstar/http-rs/blob/dd7deb3b9396ba858b7268ad49cc4409236ff5e7/src/method.rs

pub struct Method(Inner);

// Inner can be anything, current draft has an enum

impl Method {
    // ... constructors for each variant
    pub fn get() -> Method {
        Method(Inner::Get)
    }

    // .. instance methods
    pub fn is_safe(&self) -> bool;
    pub fn is_idempotent(&self) -> bool;
}

Is uri::Parts -> Uri fallible?

There currently is an Into implementation, but I believe that there may be be error cases. If so, the impl should be removed.

Make HTTP version optional

In many cases it is not necessary to specify the HTTP version to be used instead it can be chosen by the underlying HTTP implementation.

A hypothetical requests-like client may allow sending requests like this:

let request = Request::builder().uri("https://www.google.com/about/").body(())?;
let response = requests::fetch(request)?;

Currently the request will be for HTTP_11 but a crate user probably means "use whatever HTTP version you think works best" and in this case would probably prefer if the http client used e.g. HTTP/2, QUIC. But in some other cases users will explicitly set the version to HTTP_11 because they want to test the page or just don't like the overhead of HTTP/2 for a single connection so ignoring the field for sending requests is not an option either.

I think the version field should be optional: either Option<Version> or it could be an extension using the extension mechanism (still part of this crate of course).

Bring back the ContentMD5 standard header

It doesn't have to be exposed as a constant but it is still under heavy enough use that I believe it merits a standard header slot.

Removed as part of #67

Make all "try from" functions consistently named

For instance, there is:

Method::from_bytes
HeaderName::from_bytes
StatusCode::from_u16

HeaderValue::try_from_bytes
Uri::try_from_shared

Both `from_slice` and `from_bytes` are used

For example Method::from_bytes and StatusCode::from_slice. Only one should be used.

@alexcrichton Which is more idiomatic?

Provide zero-copy API for getting URI components needed for H2

HTTP 2.0 needs the following URI components:

Scheme
Authority
Path

These are transferred as pseudo header fields. It would be nice to be able to extract these components from a Uri w/o having to copy / allocate.

Rename OriginForm -> PathAndQuery

It's really not an origin form representation. The type shouldn't be explicitly named often so a slightly less ergonomic name that is more correct seems better to me.

Consider providing as_str / as_ref implementations for `StatusCode`

It could be convenient, though the implementation would be annoying...

Properly handle userinfo in Uri

userinfo has been deprecated, but should still be handled. The various host fns may not properly handle user info as well.

HTTP Version type

Similar to the reasoning in #1, I've an initial design that replaces hyper's enum with an opaque struct, allowing adding new HTTP versions in the future (HTTP2 spec says 3 could happen).

https://github.com/seanmonstar/http-rs/blob/50b7cb555ae3445d37ae20edfddb756d70fdf22d/src/version.rs

pub struct Version(Http);

impl Version {
    // .. constructors for each version
    pub fn http_11() -> Version {
        Version(Http::Http11)
    }

    // .. instance methods
    pub fn major(&self) -> u8 {
        match self.0 {
            Http09 => 0
            Http10 |
            Http11 => 1,
            H2     |
            H2c    => 2,    
        }
    }
}

enum Http {
    Http11,
    // ...version variants
}

Ability to set Content-Encoding: gzip for files like .svgz

How would an API to allow this look?

Use modules vs. not?

I have doubts re: #68, which was merged before I could look at it.

Personally, I enjoy the short type names using modules to namespace them. It makes for less imports.

Instead of having to do:

use http::{RequestBuilder, RequestParts};

fn foo(b: RequestBuilder, p: RequestParts) {
}

use http::request;

fn foo(b: request::Builder, p: request::Parts) {
}

Most of the lib is optimized for the module style, so either way it needs to be uniform.

Make sure HeaderMap adaptive hashing works.

In theory it should, but it hasn't been tested. There should be unit tests to verify it works correctly.

TryFrom Parts for Uri has a Error type that can never occur

The current implementation has type Error = InvalidUri, but this can never occur.

If we decide that #96 should be changed or removed, then this could be updated to correctly return an InvalidUri error.

If not, we might want to consider that a From<T> will likely end up in a TryFrom<T> where Error = ! blanket impl. Can we have some uninhabited type in convert.rs as the Error, and then alias it to bang or something later?

Consider adding Method::from_bytes_unchecked

This function would bypass all checks on the input in favor of the caller guaranteeing that:

The input only contains valid method characters
The input is not a standard HTTP method.

This is a continuation of #8

Fix case sensitivity issues w/ Uri

Scheme characters must be lower case, but implementations should tolerate and normalize upper case characters to lower case.

The host component is case insensitive.

The various Uri types should provide PartialEq / Hash implementations that handle the necessary case sensitivity logic.

HTTP/1.1 Request/Response parsing

This'd be a nifty feature to have! Basically something to relatively easily attempt to parse into a Request and/or Response. @withoutboats I think you were working on this, right? Do you have thoughts to write down?

Types should be able to be statically initialized

This would enable them to be assigned to statics.

URI: Consider removing all mention of `username:password` from authority

This is deprecated and generally a bad idea.

Provide conversions HeaderName/HeaderValue <-> Bytes

This will enable zero-copy for headers.

Depend on the bytes crate

I vote for all byte collections to be stored using the bytes crate. While this adds a dependency, it is lightweight and will help avoid quite a number of copies. I think that Bytes should be hidden from the API except maybe in some conversion functions.

Track HPACK encoding options

http://httpwg.org/specs/rfc7541.html#detailed.format

The most critical aspect is not losing "never indexed" for proxies. This will probably require an additional field on HeaderValue to track the information.

Shouldn't Builder::body take self instead of &self?

The only way to "use" a request/response builder is to call body(some_body_data) on it and create a message. The documentation on body is very explicit that this must be called once. But still the body signature is &mut self and not self as I'd expect. The docs already state that it "consumes" the builder, so why not turn the runtime panic to compile time errors if the method is called multiple times?

First impressions tracking issue

Using my day off to build a little HTTP server on top of the stdlib and http; Here's some issues @ashleygwilliams and I are finding. I'm not suggesting it has to be this way, mind you, just things to consider.

Request body can't be easily restricted to a trait

I'm aware that the title of this issue reads really weird, so I will do my best to explain and start a discussion. My explanation is a little long, so if you're in a hurry here's the TL;DR:

As a server application writer, there's no way I can handle a Request<T> without knowing what T is. As a gateway/framework, there's no way I can enforce what T is without making the application non-interoperable with other gateways/frameworks.

I've been working on a web application interface crate (think Ruby's Rack) for a couple of months, but I heard that this crate was in the works and decided to wait and see what this crate would look like. (I haven't publicly announced said crate, but I guess the cat is out of the bag now...)

I am inspecting it now to see how well this crate will work as an interface for HTTP applications, and I am encountering a problem with Request. The problem is that the request is generic over T, which is the body. There is no way for an application to realistically read a body of unknown type.

Consider the following web application (greatly simplified, but works as an example):

use http::{Request, Response};

pub fn my_application<Body>(request: Request<Body>) -> Response<&'static str> {
    Response::new("Hello World!".into())
}

That works great, but what if we want to do something more interesting, like respond with the number of bytes in the request body? That should be easy:

use http::{Request, Response};
use std::io::Read;

pub fn my_application<Body: Read>(mut request: Request<Body>) -> Response<String> {
    let mut buf = Vec::new();
    let len = request.body_mut().read_to_end(&mut buf).unwrap();

    Response::new(format!("{}", len))
}

That should work, but now we have a problem: we can no longer handle any type of request; we can only handle requests of a specific type. More specifically, we accept a request of type http::Request<T> where T: std::io::Read. If we wanted to support Request<String>, we'd have to re-implement our application function again for this type.

It gets worse. The above examples are stateless, which very few real-world applications are. We'd probably want to instead talk about these applications as traits instead of functions. Something like this comes to mind (more similar to my crate, actually similar to Iron too):

use http::{Request, Response};

pub trait HttpApp: Send + Sync + 'static {
    fn handle<Req, Res>(&self, _: Request<Req>) -> Response<Res>;
}

There's an immediate problem with this; no HTTP server implementation can actually support the above trait. If I write a CGI gateway crate, I might provide a Request<Stdin>, but now the application writers have to explicitly code against Request<Stdin>, and interoperability is lost.

To put it another way: Request<T> doesn't provide application writers a unified request type to code against, because Request is generic over something without meaningful bounds. If application writers do code to a specific body type, then they are no longer coding against a crate-agnostic type. T here is basically leaking the implementation into the interface.

I don't pretend to have an answer to this problem; maybe this crate is too high-level (or is it low-level?) for what I need. I am very interested in comments, disagreements, or suggestions.

Provide request / response builders.

Discussion: #22

Should the `Head` type be public?

There is currently request::Head and response::Head exposed publicly. Should this be part of the public API, or should be the be made private? This was discussed a little in #22, and postponed to get things working. I'd like to at least be sure that the decision we make here has been reasoned about.

Tracking issue: audit of all conversion errors.

The style used for conversion errors has not been consistent across the library. A full audit should be performed and error types should be normalized.

Known issues:

name error after conversion fn (FromStrError) or after error semantics (InvalidUri)
Should error types include the conversion input
Should multiple types share the same error type. (uri::Authority, uri::Path both use FromStrError, yet the potential error cases don't fully overlap).
Other issues not mentioned...

Relates: #48, #23

Should HeaderName impl Display?

StatusCode enum

There's a couple of questions, maybe they should be broken out into separate issues, but I'll start by listing them here:

Should it even be an enum? There is the StatusCode::Unregistered(u16) variant, since really any number between 100 and 599 is a valid status code. The reason for it being an enum originally in hyper is to take advantage of match statements. if let StatusCode::Created = status reads better than if status == 201 in my opinion. The problem with is being an enum is that it's a breaking change any time a new status code is defined, and we want to name it.

I had considered for a bit having a newtype around u16, but the associated constants feature is required to make it complete. It could look like:
```
pub struct StatusCode(u16);

impl StatusCode {
    pub fn new(code: u16) -> StatusCode;

    pub const Ok: StatusCode = StatusCode(200);
}
```
Does the StatusClass pull its own weight? The various specs declare that the first digit of the code determines its category or class. hyper has a StatusClass enum that allows any StatusCode to be check for its class. As mentioned in RFC7321, if a client doesn't recognize a code, it should still be able to recognize its class, such as 471 being treated as a client error, like 400.
Should converting from a primitive depend on the unstable TryFrom feature? Since the specs say that a client should always be able to determine a status class from a code, and there are only classes defined in the specs for 1xx, 2xx, 3xx, 4xx, and 5xx, it can be implied that any other status code is invalid or illegal. Does it make sense then that converting from a u16 to a StatusCode is a fallible action? If so, the best case to do that is probably TryFrom, which is currently unstable, but looks super close to stabilization.

Give the opportunity to comment

I saw #67 merged before there was an opportunity given to comment. I'm opening this issue as a placeholder to review.

Consider implementing `Stream` for Request / Response

Consider implementing Stream for Request<T> / Response<T> where T: Stream.

This could be an optional dependency on the futures crate.

URI: Consider returning the scheme's default port if none is present

For HTTP: 80, HTTPS: 443. Currently, None is returned.

Should `HeaderMap` at the root be a type alias instead of a re-export?

Since the most common usage of HeaderMap<T> is w/ HeaderValue, should the root re-export be a type def instead?

pub type HeaderMap = header::HeaderMap<header::HeaderValue>;

Should all conversion error types include the input?

Conversions From<T> consume the input type. In the case of an error, the input is lost unless the error type returns the input source.

Should conversion errors include the input type? The current pattern is to not, but that is mostly because it was easier.

Consider moving to only one error type

Currently there's a slew of error types exposed throughout this library to provide precise error information, but this may make for unergonomic usage for consumers of this crate. If we assume that most users will typically use ? to "ignore" errors and propagate them upwards then this means that a user's own error type needs to be able to consume all of these errors.

For example users currently would maybe need to do:

error_chain! {
    foreign_links {
        HttpUri(http::uri::FromStrError),
        HttpHeader(http::header::InvalidValueError),
        // ... about 5 more times
    }
}

We may want to consider instead perhaps having one http::Error type which all functions return which would enable:

error_chain! {
    foreign_links {
        Http(http::Error),
    }
}

The downside of this approach is that the error itself is less precise, but I think we can recover any lost information regardless. We should be able to have something like an HttpErrorKind eventually which exposes precise information, and you just won't get all kinds of errors from all sorts of sites.

Builders don't have a fn to set the header map when the user already has a HeaderMap value.

Currently, the only builder fns are for inserting header entries, but if you just want to set the header map, there is no API for that.

HeaderMapKey implementation for String/str panics

The docs show using strings as keys, and with it being the most "ergonomic" way, people will probably default to using it everywhere. Unfortunately, a string is not always a valid header name, and because of this, the HeaderMapKey implementation panics if it is invalid.

This could lead to people uniformly just using strings, and then being surprised and upset when their server panics because of an illegal string. Of course, it's better to have panicked than to allow the damaging header name through, but users certainly would not have expected it. Should we remove the impl for String/str, so that users realize that HeaderName::try_from(str).unwrap() can panic?

XHR status codes

The StatusCode claims to allow all codes in the range [0, 65535] but provides no way to set codes outside the recommended range.

HeaderMap::get_all should not return Option

Currently, HeaderMap::get_all returns an Option<GetAll<T>>, with None indicating the absence of any header with the specified name. That type signature allows encoding "impossible" situations, which code will then have to handle. Specifically, it allows Some of an empty iterator. That's going to force code using this type to have extra logic dismissing that "impossible" case.

Please switch the type to just GetAll<T>, with an empty iterator indicating the absence of any header with the specified name. It's still trivial to detect presence/absence of a header (such as via contains_key), and this will simplify callers, reducing uses of .unwrap() and similar.

Should HeaderValue conversion fns be prefixed w/ try_

Probably not. Also, try_from_slice probably should be from_bytes (#25).

Should HTTP_2 and HTTP_2C be distinguished in this crate?

I'm not exactly sure if it makes sense to distinguish between the two as they are details of protocol negotiation. The actual protocol version is 2.0 and TLS vs. not should not be checked by looking at the HTTP version.

I would suggest to just get rid of HTTP_2C.

InvalidHeaderValue doesn't allow returning the value being converted.

Right now, you can only convert from refs, so it isn't a huge deal. However, I assume we will be adding fns to convert from String and other values passed by ownership. In this case, if the conversion fails, it seems like it would be worth for the error to be able to return the value.

The other option would be add a second error:

InvalidHeaderValueOwned<T>

but that doesn't seem ideal either.

hyperium / http Goto Github PK

http's Issues

Recommend Projects

Recommend Topics

Recommend Org