hyperium / http Goto Github PK
View Code? Open in Web Editor NEWRust HTTP types
License: Apache License 2.0
Rust HTTP types
License: Apache License 2.0
The documentation of HeaderMap
shows using strings for values, which works in the doc tests because HeaderMap<T>
is generic over the value. However, if people try to use the same examples in real code, it won't work, because Request
and Response
both use HeaderMap<HeaderValue>
. The docs should probably be updated to show using HeaderValue
s.
I also kind of wonder if the generic T
will cause problems elsewhere. If someone just needs to view some headers, and writes &HeaderMap
, and sees the complaint about a missing generic, will they easily be able to determine it should be &HeaderMap<HeaderValue
?
Should we export a type alias instead?
this crate exports http::HeaderMap
and http::header::HeaderValue
. The typical signature of HeaderMap is HeaderMap<HeaderValue>
, so it seems natural for both of these to exported together from the root of the module hierarchy.
Is http::Uri
intended as a complete replacement for the url crate? If so, it might make sense to document that, and perhaps work with its author to deprecate it and point users towards the http crate. And, for that matter, I'd suggest adding "uri" and "url" as tags, and mentioning URIs and URLs in the crate description (along with requests and responses), so that it's really obvious this is the crate to use. (And hopefully it'll start showing up prominently in the search results for "uri" and "url".)
Request
has the following method:
fn uri(&self) -> &Uri
Returns a reference to the associated URI.
But an actual HTTP request doesn't always contain a URI. HTTP's RFC notes that a request has a request-target
, which is,
request-target = origin-form
/ absolute-form
/ authority-form
/ asterisk-form
Only absolute-form
is a URI; the remaining three are not. origin-form
, I would say, is the form most folks implementing basic CRUD APIs are going to interact with.
My concern is that the API encourages writing origin servers that will respond to non-origin requests poorly, and defeats the point of strong static typing by essentially munging what is an enumeration into a singly string. (i.e., it "stringly types" the data)
E.g., if you receive the following request:
GET http://example.com/ HTTP/1.1
Host: example.com
Connection: close
(assuming the receiving server is not example.com
) If your routing does something like request.uri() == "/something"
, you won't erroneously match a proxy request; however, you're still likely to return 404. If you do request.uri().path()
, you're likely going to incorrectly respond with the wrong content.
Ultimately, I feel that the actual type implied by the spec is,
enum RequestTarget {
Origin(Path),
Absolute(Uri),
Authority(Authority),
Asterisk,
}
This would render some of the method that presume they can always return things (e.g., Request::uri
) into Option
s. (e.g., Request::uri() -> Option<Uri>
); but this would also mean Uri
would actually represent a URI and wouldn't need to be capable of handling weird things like "*"
. (As others noted: Uri
seems like something that should be more common than it is, and removing HTTP-specific meanings from it makes it more easily extractable.)
Request
might benefit from some helper methods to deal with "I just want to write an origin server" โ but again, the need for such is already present, hidden behind the fact that all four types are munged into one.
An initial design uses an opaque struct and constructor functions to create the different variants: https://github.com/seanmonstar/http-rs/blob/dd7deb3b9396ba858b7268ad49cc4409236ff5e7/src/method.rs
pub struct Method(Inner);
// Inner can be anything, current draft has an enum
impl Method {
// ... constructors for each variant
pub fn get() -> Method {
Method(Inner::Get)
}
// .. instance methods
pub fn is_safe(&self) -> bool;
pub fn is_idempotent(&self) -> bool;
}
There currently is an Into
implementation, but I believe that there may be be error cases. If so, the impl should be removed.
In many cases it is not necessary to specify the HTTP version to be used instead it can be chosen by the underlying HTTP implementation.
A hypothetical requests-like client may allow sending requests like this:
let request = Request::builder().uri("https://www.google.com/about/").body(())?;
let response = requests::fetch(request)?;
Currently the request will be for HTTP_11
but a crate user probably means "use whatever HTTP version you think works best" and in this case would probably prefer if the http client used e.g. HTTP/2, QUIC. But in some other cases users will explicitly set the version to HTTP_11
because they want to test the page or just don't like the overhead of HTTP/2 for a single connection so ignoring the field for sending requests is not an option either.
I think the version field should be optional: either Option<Version>
or it could be an extension using the extension mechanism (still part of this crate of course).
It doesn't have to be exposed as a constant but it is still under heavy enough use that I believe it merits a standard header slot.
Removed as part of #67
For instance, there is:
Method::from_bytes
HeaderName::from_bytes
StatusCode::from_u16
vs
HeaderValue::try_from_bytes
Uri::try_from_shared
For example Method::from_bytes
and StatusCode::from_slice
. Only one should be used.
@alexcrichton Which is more idiomatic?
HTTP 2.0 needs the following URI components:
These are transferred as pseudo header fields. It would be nice to be able to extract these components from a Uri
w/o having to copy / allocate.
It's really not an origin form representation. The type shouldn't be explicitly named often so a slightly less ergonomic name that is more correct seems better to me.
It could be convenient, though the implementation would be annoying...
userinfo has been deprecated, but should still be handled. The various host
fns may not properly handle user info as well.
Similar to the reasoning in #1, I've an initial design that replaces hyper's enum with an opaque struct, allowing adding new HTTP versions in the future (HTTP2 spec says 3 could happen).
https://github.com/seanmonstar/http-rs/blob/50b7cb555ae3445d37ae20edfddb756d70fdf22d/src/version.rs
pub struct Version(Http);
impl Version {
// .. constructors for each version
pub fn http_11() -> Version {
Version(Http::Http11)
}
// .. instance methods
pub fn major(&self) -> u8 {
match self.0 {
Http09 => 0
Http10 |
Http11 => 1,
H2 |
H2c => 2,
}
}
}
enum Http {
Http11,
// ...version variants
}
How would an API to allow this look?
I have doubts re: #68, which was merged before I could look at it.
Personally, I enjoy the short type names using modules to namespace them. It makes for less imports.
Instead of having to do:
use http::{RequestBuilder, RequestParts};
fn foo(b: RequestBuilder, p: RequestParts) {
}
use http::request;
fn foo(b: request::Builder, p: request::Parts) {
}
Most of the lib is optimized for the module style, so either way it needs to be uniform.
In theory it should, but it hasn't been tested. There should be unit tests to verify it works correctly.
The current implementation has type Error = InvalidUri
, but this can never occur.
If we decide that #96 should be changed or removed, then this could be updated to correctly return an InvalidUri
error.
If not, we might want to consider that a From<T>
will likely end up in a TryFrom<T> where Error = !
blanket impl. Can we have some uninhabited type in convert.rs as the Error
, and then alias it to bang or something later?
This function would bypass all checks on the input in favor of the caller guaranteeing that:
This is a continuation of #8
Scheme characters must be lower case, but implementations should tolerate and normalize upper case characters to lower case.
The host component is case insensitive.
The various Uri types should provide PartialEq
/ Hash
implementations that handle the necessary case sensitivity logic.
This'd be a nifty feature to have! Basically something to relatively easily attempt to parse into a Request
and/or Response
. @withoutboats I think you were working on this, right? Do you have thoughts to write down?
This would enable them to be assigned to statics.
This is deprecated and generally a bad idea.
This will enable zero-copy for headers.
I vote for all byte collections to be stored using the bytes crate. While this adds a dependency, it is lightweight and will help avoid quite a number of copies. I think that Bytes
should be hidden from the API except maybe in some conversion functions.
http://httpwg.org/specs/rfc7541.html#detailed.format
The most critical aspect is not losing "never indexed" for proxies. This will probably require an additional field on HeaderValue
to track the information.
The only way to "use" a request/response builder is to call body(some_body_data)
on it and create a message. The documentation on body
is very explicit that this must be called once. But still the body
signature is &mut self
and not self
as I'd expect. The docs already state that it "consumes" the builder, so why not turn the runtime panic to compile time errors if the method is called multiple times?
Using my day off to build a little HTTP server on top of the stdlib and http; Here's some issues @ashleygwilliams and I are finding. I'm not suggesting it has to be this way, mind you, just things to consider.
I'm aware that the title of this issue reads really weird, so I will do my best to explain and start a discussion. My explanation is a little long, so if you're in a hurry here's the TL;DR:
As a server application writer, there's no way I can handle a Request<T>
without knowing what T
is. As a gateway/framework, there's no way I can enforce what T
is without making the application non-interoperable with other gateways/frameworks.
I've been working on a web application interface crate (think Ruby's Rack) for a couple of months, but I heard that this crate was in the works and decided to wait and see what this crate would look like. (I haven't publicly announced said crate, but I guess the cat is out of the bag now...)
I am inspecting it now to see how well this crate will work as an interface for HTTP applications, and I am encountering a problem with Request
. The problem is that the request is generic over T
, which is the body. There is no way for an application to realistically read a body of unknown type.
Consider the following web application (greatly simplified, but works as an example):
use http::{Request, Response};
pub fn my_application<Body>(request: Request<Body>) -> Response<&'static str> {
Response::new("Hello World!".into())
}
That works great, but what if we want to do something more interesting, like respond with the number of bytes in the request body? That should be easy:
use http::{Request, Response};
use std::io::Read;
pub fn my_application<Body: Read>(mut request: Request<Body>) -> Response<String> {
let mut buf = Vec::new();
let len = request.body_mut().read_to_end(&mut buf).unwrap();
Response::new(format!("{}", len))
}
That should work, but now we have a problem: we can no longer handle any type of request; we can only handle requests of a specific type. More specifically, we accept a request of type http::Request<T> where T: std::io::Read
. If we wanted to support Request<String>
, we'd have to re-implement our application function again for this type.
It gets worse. The above examples are stateless, which very few real-world applications are. We'd probably want to instead talk about these applications as traits instead of functions. Something like this comes to mind (more similar to my crate, actually similar to Iron too):
use http::{Request, Response};
pub trait HttpApp: Send + Sync + 'static {
fn handle<Req, Res>(&self, _: Request<Req>) -> Response<Res>;
}
There's an immediate problem with this; no HTTP server implementation can actually support the above trait. If I write a CGI gateway crate, I might provide a Request<Stdin>
, but now the application writers have to explicitly code against Request<Stdin>
, and interoperability is lost.
To put it another way: Request<T>
doesn't provide application writers a unified request type to code against, because Request
is generic over something without meaningful bounds. If application writers do code to a specific body type, then they are no longer coding against a crate-agnostic type. T
here is basically leaking the implementation into the interface.
I don't pretend to have an answer to this problem; maybe this crate is too high-level (or is it low-level?) for what I need. I am very interested in comments, disagreements, or suggestions.
Discussion: #22
There is currently request::Head
and response::Head
exposed publicly. Should this be part of the public API, or should be the be made private? This was discussed a little in #22, and postponed to get things working. I'd like to at least be sure that the decision we make here has been reasoned about.
The style used for conversion errors has not been consistent across the library. A full audit should be performed and error types should be normalized.
Known issues:
FromStrError
) or after error semantics (InvalidUri
)uri::Authority
, uri::Path
both use FromStrError
, yet the potential error cases don't fully overlap).There's a couple of questions, maybe they should be broken out into separate issues, but I'll start by listing them here:
Should it even be an enum
? There is the StatusCode::Unregistered(u16)
variant, since really any number between 100 and 599 is a valid status code. The reason for it being an enum originally in hyper is to take advantage of match
statements. if let StatusCode::Created = status
reads better than if status == 201
in my opinion. The problem with is being an enum is that it's a breaking change any time a new status code is defined, and we want to name it.
I had considered for a bit having a newtype around u16
, but the associated constants feature is required to make it complete. It could look like:
pub struct StatusCode(u16);
impl StatusCode {
pub fn new(code: u16) -> StatusCode;
pub const Ok: StatusCode = StatusCode(200);
}
Does the StatusClass
pull its own weight? The various specs declare that the first digit of the code determines its category or class. hyper has a StatusClass enum that allows any StatusCode
to be check for its class. As mentioned in RFC7321, if a client doesn't recognize a code, it should still be able to recognize its class, such as 471
being treated as a client error, like 400.
Should converting from a primitive depend on the unstable TryFrom
feature? Since the specs say that a client should always be able to determine a status class from a code, and there are only classes defined in the specs for 1xx, 2xx, 3xx, 4xx, and 5xx, it can be implied that any other status code is invalid or illegal. Does it make sense then that converting from a u16
to a StatusCode
is a fallible action? If so, the best case to do that is probably TryFrom
, which is currently unstable, but looks super close to stabilization.
I saw #67 merged before there was an opportunity given to comment. I'm opening this issue as a placeholder to review.
Consider implementing Stream
for Request<T>
/ Response<T>
where T: Stream
.
This could be an optional dependency on the futures
crate.
For HTTP: 80, HTTPS: 443. Currently, None
is returned.
Since the most common usage of HeaderMap<T>
is w/ HeaderValue
, should the root re-export be a type def instead?
pub type HeaderMap = header::HeaderMap<header::HeaderValue>;
Conversions From<T>
consume the input type. In the case of an error, the input is lost unless the error type returns the input source.
Should conversion errors include the input type? The current pattern is to not, but that is mostly because it was easier.
Currently there's a slew of error types exposed throughout this library to provide precise error information, but this may make for unergonomic usage for consumers of this crate. If we assume that most users will typically use ?
to "ignore" errors and propagate them upwards then this means that a user's own error type needs to be able to consume all of these errors.
For example users currently would maybe need to do:
error_chain! {
foreign_links {
HttpUri(http::uri::FromStrError),
HttpHeader(http::header::InvalidValueError),
// ... about 5 more times
}
}
We may want to consider instead perhaps having one http::Error
type which all functions return which would enable:
error_chain! {
foreign_links {
Http(http::Error),
}
}
The downside of this approach is that the error itself is less precise, but I think we can recover any lost information regardless. We should be able to have something like an HttpErrorKind
eventually which exposes precise information, and you just won't get all kinds of errors from all sorts of sites.
Currently, the only builder fns are for inserting header entries, but if you just want to set the header map, there is no API for that.
The docs show using strings as keys, and with it being the most "ergonomic" way, people will probably default to using it everywhere. Unfortunately, a string is not always a valid header name, and because of this, the HeaderMapKey
implementation panics if it is invalid.
This could lead to people uniformly just using strings, and then being surprised and upset when their server panics because of an illegal string. Of course, it's better to have panicked than to allow the damaging header name through, but users certainly would not have expected it. Should we remove the impl for String
/str
, so that users realize that HeaderName::try_from(str).unwrap()
can panic?
The StatusCode claims to allow all codes in the range [0, 65535] but provides no way to set codes outside the recommended range.
Currently, HeaderMap::get_all
returns an Option<GetAll<T>>
, with None
indicating the absence of any header with the specified name. That type signature allows encoding "impossible" situations, which code will then have to handle. Specifically, it allows Some
of an empty iterator. That's going to force code using this type to have extra logic dismissing that "impossible" case.
Please switch the type to just GetAll<T>
, with an empty iterator indicating the absence of any header with the specified name. It's still trivial to detect presence/absence of a header (such as via contains_key
), and this will simplify callers, reducing uses of .unwrap()
and similar.
Probably not. Also, try_from_slice
probably should be from_bytes
(#25).
I'm not exactly sure if it makes sense to distinguish between the two as they are details of protocol negotiation. The actual protocol version is 2.0 and TLS vs. not should not be checked by looking at the HTTP version.
I would suggest to just get rid of HTTP_2C
.
Right now, you can only convert from refs, so it isn't a huge deal. However, I assume we will be adding fns to convert from String
and other values passed by ownership. In this case, if the conversion fails, it seems like it would be worth for the error to be able to return the value.
The other option would be add a second error:
InvalidHeaderValueOwned<T>
but that doesn't seem ideal either.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.