
h3's Introduction

H3

The Fast HTTP header parser library.

(under construction)

H3 does not use a finite state machine or a parser generator to parse the HTTP request header. Instead, a hand-written scanner goes through the whole buffer and saves pointers to each meta field and its value.

H3 uses a pre-built minimal perfect hash table for the defined names of the header fields, to provide a fast field name lookup.

For custom or external header fields (field names that start with X-, or any other name that is not predefined), H3 looks up the field with a simple, quick hashing function.
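As a rough illustration of what a "simple/quick hashing function" for custom field names could look like, here is a minimal sketch that uses a case-insensitive djb2 hash. The function name `field_name_hash` and the choice of djb2 are assumptions made for this example; the hash that H3 actually uses is not described here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of H3: case-insensitive djb2 hash over a
 * length-delimited field name (names in the request buffer are not
 * NUL-terminated, so the length is passed explicitly). */
static uint32_t field_name_hash(const char *name, size_t len)
{
    assert(name != NULL || len == 0);
    uint32_t h = 5381;
    for (size_t i = 0; i < len; i++) {
        /* HTTP field names are case-insensitive, so fold case first. */
        h = h * 33u + (uint32_t)tolower((unsigned char)name[i]);
    }
    return h;
}
```

With this sketch, `X-Foo` and `x-foo` hash to the same value, which is what a header lookup needs.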

An HTTP-based application should be tolerant of the entity-header, so H3 is designed to tolerate it: you decide whether to validate the header field values.

All HTTP header fields are lazily parsed: H3 only parses the details when needed.
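The lazy approach can be sketched as follows: the scanner only records where a value starts and how long it is, and the value is decoded the first time somebody asks for it. The type `LazyNumber` and the function `lazy_number_get` are hypothetical names for this example and are not H3's `Value` type or API. Overflow handling is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type, not H3's Value: raw bytes plus a cached parse result. */
typedef struct {
    const char *raw;    /* points into the request buffer, not NUL-terminated */
    size_t      len;
    bool        parsed; /* set once the value has been decoded */
    bool        valid;  /* true if the bytes were all decimal digits */
    long        number;
} LazyNumber;

/* Decode the digits on first use; later calls reuse the cached result.
 * Returns false if the value is empty or contains a non-digit. */
static bool lazy_number_get(LazyNumber *v, long *out)
{
    assert(v != NULL && out != NULL);
    if (!v->parsed) {
        v->parsed = true;
        v->valid = v->len > 0;
        v->number = 0;
        for (size_t i = 0; i < v->len && v->valid; i++) {
            char c = v->raw[i];
            if (c < '0' || c > '9')
                v->valid = false;
            else
                v->number = v->number * 10 + (c - '0');
        }
    }
    if (v->valid)
        *out = v->number;
    return v->valid;
}
```

A value such as the one in `Max-Forwards: 10` would cost nothing at scan time and would be converted to a number only if the application reads it.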

SYNOPSIS

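#include <stdio.h>
#include <string.h>
#include <h3.h>

int main() {
    /* Sample request used for illustration; any buffer holding a request header works. */
    const char *headerBody = "GET /index.html HTTP/1.1\r\n"
                             "Host: example.com\r\n"
                             "\r\n";
    int len = (int)strlen(headerBody);

    RequestHeader *header;
    header = h3_request_header_new();
    h3_request_header_parse(header, headerBody, len);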


    printf("HEADER\n");
    printf("===========================\n");
    printf("%s", headerBody);
    printf("\n---------------------------\n");
    printf("Method: %.*s\n", header->RequestMethodLen, header->RequestMethod);
    printf("Request-URI: %.*s\n", header->RequestURILen, header->RequestURI);
    printf("HTTP-Version: %.*s\n", header->HTTPVersionLen, header->HTTPVersion);
    printf("===========================\n");

    h3_request_header_free(header);
    return 0;
}

API

High Level API

int h3_request_header_parse(RequestHeader *header, const char *body, int len);


/*
 * Request Header
 * http://tools.ietf.org/html/rfc2616#section-5.3
 */
Value * h3_get_accept(HeaderFields *headers); // Get "Accept"
Value * h3_get_accept_charset(HeaderFields *headers); // Get "Accept-Charset"
Value * h3_get_accept_language(HeaderFields *headers); // Get "Accept-Language"
Value * h3_get_accept_encoding(HeaderFields *headers); // Get "Accept-Encoding"
Value * h3_get_authorization(HeaderFields *headers); // Get "Authorization"
Value * h3_get_expect(HeaderFields *headers); // Get "Expect"
Value * h3_get_from(HeaderFields *headers); // Get "From"
Value * h3_get_host(HeaderFields *headers); // Get "Host"
Value * h3_get_if_match(HeaderFields *headers); // Get "If-Match"

Value * h3_get_if_none_match(HeaderFields *headers); // Get "If-None-Match"
Value * h3_get_if_range(HeaderFields *headers); // Get "If-Range"
Value * h3_get_if_unmodified_since(HeaderFields *headers); // Get "If-Unmodified-Since"

Value * h3_get_range(HeaderFields *headers); // Get "Range"
Value * h3_get_referer(HeaderFields *headers); // Get "Referer"
Value * h3_get_max_forwards(HeaderFields *headers); // Get "Max-Forwards"
Value * h3_get_proxy_authorization(HeaderFields *headers); // Get "Proxy-Authorization"
Value * h3_get_user_agent(HeaderFields *headers); // Get "User-Agent"
Value * h3_get_te(HeaderFields *headers); // Get "TE"


/* 
 * Response Header
 * http://tools.ietf.org/html/rfc2616#section-6
 */
Value * h3_get_accept_encoding(HeaderFields *headers); // Get "Accept-Encoding"
Value * h3_get_accept_language(HeaderFields *headers); // Get "Accept-Language"
Value * h3_get_accept_ranges(HeaderFields *headers); // Get "Accept-Ranges"
Value * h3_get_cache_control(HeaderFields *headers);  // Get "Cache-Control"
Value * h3_get_connection(HeaderFields *headers);  // Get "Connection"
Value * h3_get_date(HeaderFields *headers);        // Get "Date"
Value * h3_get_transfer_encoding(HeaderFields *headers); // Get "Transfer-Encoding"
Value * h3_get_upgrade(HeaderFields *headers);     // Get "Upgrade"
Value * h3_get_via(HeaderFields *headers);         // Get "Via"
Value * h3_get_warning(HeaderFields *headers);     // Get "Warning"

Low Level API

Date/Time parsing

H3DateTime * h3_parse_date_rfc1123(const char *dateStr, int len);

H3DateTime * h3_parse_date_rfc1036(const char *dateStr, int len);

H3DateTime * h3_parse_date_rfc850(const char *dateStr, int len); // rfc850 date format is replaced by rfc1036

H3DateTime * h3_parse_date_ansi(const char *dateStr, int len);

/*
 * Detect & Parse date string automatically
 */
H3DateTime * h3_parse_date(const char *dateStr, int len);
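
To make the date formats concrete, here is a minimal sketch of parsing the RFC 1123 form (`Sun, 06 Nov 1994 08:49:37 GMT`) from a length-delimited string. The names `SketchDate` and `sketch_parse_rfc1123` are hypothetical. The layout of `H3DateTime` and the way H3 implements `h3_parse_date_rfc1123` are not shown in this document, so this is only one possible approach built on `sscanf`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical result type; H3DateTime's real layout is not shown here. */
typedef struct { int year, month, day, hour, minute, second; } SketchDate;

/* Parse "Sun, 06 Nov 1994 08:49:37 GMT". Returns 0 on success, -1 on error. */
static int sketch_parse_rfc1123(const char *s, int len, SketchDate *out)
{
    static const char months[] = "JanFebMarAprMayJunJulAugSepOctNovDec";
    char buf[32], wday[4], mon[4], zone[4];

    assert(s != NULL && out != NULL);
    if (len != 29)
        return -1;                 /* RFC 1123 dates have a fixed width */
    memcpy(buf, s, (size_t)len);   /* the input is not NUL-terminated */
    buf[len] = '\0';

    if (sscanf(buf, "%3[A-Za-z], %2d %3[A-Za-z] %4d %2d:%2d:%2d %3[A-Z]",
               wday, &out->day, mon, &out->year,
               &out->hour, &out->minute, &out->second, zone) != 8)
        return -1;
    if (strcmp(zone, "GMT") != 0 || strlen(mon) != 3)
        return -1;

    /* Map the month name to 1..12 by its offset in the lookup string. */
    const char *m = strstr(months, mon);
    if (m == NULL || (m - months) % 3 != 0)
        return -1;
    out->month = (int)(m - months) / 3 + 1;
    return 0;
}
```

An automatic detector in the spirit of `h3_parse_date` could try a parser like this first and fall back to the RFC 850/1036 and ANSI C forms when it fails.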

h3's People

Contributors

c9s, cindylinz, krishnapg, verpeteren


h3's Issues

Logo design proposal

Hi,
I have read your work, so I decided to create a logo for you. I hope you like it and want to use it. I can give you all the formats of the design for free. If you want changes, please specify them.

I used the colours of your website.

Greetings

cmake results in error

-- Configuring done
CMake Error at tests/CMakeLists.txt:15 (add_executable):
  Cannot find source file:

    bench/bench.c

  Tried extensions .c .C .c++ .cc .cpp .cxx .m .M .mm .h .hh .h++ .hm .hpp
  .hxx .in .txx


CMake Error: CMake can not determine linker language for target: bench_h3
CMake Error: Cannot determine link language for target "bench_h3".
-- Generating done
-- Build files have been written to: /home/abaumann/h3

Should cmake be used or make -f Makefile.dist?

Avoiding memory allocation inside h3_request_header_parse

I have been going through this good work, and it is very nice to see this effort. I am trying to create a benchmark for it, but the memory allocations inside h3_request_header_parse() are a killer.

Just a couple of suggestions to consider for improving performance:

  1. Use an intrusive mechanism: that is, instead of allocating memory inside the parser, accept the header fields as an input array and fill them in. Personally I would not like this, because it demands that the caller know the number of header fields in order to allocate the array beforehand.
    But one technique could be: let the caller pass a header array (of whatever size the caller thinks reasonable), and h3_request_header_parse() parses and fills only as many fields as the array can hold. If more fields are pending but the input array cannot hold them, the parser just returns the number of bytes consumed, so that the caller can compare it with the size of the input and call h3_request_header_parse() in a loop until all fields are parsed. This requires returning to the caller in a consistent state (so that the parser can resume correctly later when called with a brand new field array).

  2. Use a callback mechanism: this way, the parser never needs to worry about memory at all. For example, h3_request_header_parse() would hold a local stack-based header field that gets filled inside the loop and, just before the next iteration, invoke the callback with the filled field as its argument, so that the caller can do whatever is needed with that particular field. The caller might aggregate the fields into an array, or just scan and dump them.
    The callback could also be used by the caller to tell the parser whether to proceed or stop, by returning true or false.
    For example, the prototype of the callback could be: typedef bool (*lpfnCallback)(HeaderField *);

    Then, h3_request_header_parse() would look like:

    int h3_request_header_parse(lpfnCallback fn, const char *body, int bodyLength)
    {
        const char *p = body;
        do
        {
            HeaderField field;
            // fill the field here, advancing p
            if (false == fn(&field)) break;

        } while (notend(p));
        return (int)(p - body); // number of bytes consumed
    }
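
To make suggestion 2 concrete, here is a minimal, self-contained sketch of a callback-driven field scanner in plain C. All names (`FieldView`, `field_cb`, `parse_fields`, `count_fields`) are hypothetical and are not H3's API. It assumes each field has the form `Name: value\r\n` and that the header block ends with an empty line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical field view: pointers into the caller's buffer, no allocation. */
typedef struct {
    const char *name;  size_t name_len;
    const char *value; size_t value_len;
} FieldView;

/* Return false from the callback to stop parsing early. */
typedef bool (*field_cb)(const FieldView *field, void *user);

/* Walks "Name: value\r\n" lines until the empty line that ends the header.
 * Returns bytes consumed, or -1 on malformed or truncated input. */
static int parse_fields(const char *buf, int len, field_cb cb, void *user)
{
    assert(buf != NULL && cb != NULL);
    const char *p = buf, *end = buf + len;
    while (p < end) {
        if (end - p >= 2 && p[0] == '\r' && p[1] == '\n')
            return (int)(p + 2 - buf);            /* end of header block */
        FieldView f;                               /* lives on the stack */
        f.name = p;
        while (p < end && *p != ':' && *p != '\r') p++;
        if (p == end || *p != ':') return -1;
        f.name_len = (size_t)(p - f.name);
        p++;
        while (p < end && (*p == ' ' || *p == '\t')) p++;
        f.value = p;
        while (p < end && *p != '\r') p++;
        if (end - p < 2 || p[1] != '\n') return -1;
        f.value_len = (size_t)(p - f.value);
        p += 2;
        if (!cb(&f, user)) return (int)(p - buf);  /* caller asked to stop */
    }
    return -1;                                     /* no terminating CRLF */
}

/* Example callback: counts the fields it is shown. */
static bool count_fields(const FieldView *field, void *user)
{
    (void)field;
    ++*(int *)user;
    return true;
}
```

The `void *user` argument is an addition to the prototype proposed above. It lets the caller carry state (a counter, an output array) without global variables.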

h3_request_header_parse() should return on end-of-string also

I encountered this while running the test application on Windows. (Some garbage characters are loaded at the end of the Chrome test file on Windows with fread().)

Right now, h3_request_header_parse() does a sanity check after every 'while' loop, but only with 'if (iscrlf(p)) return -1'.

It should also test for end of string in addition to 'iscrlf()'. For example, it should be:

if (end(p) || iscrlf(p)) return -1;

Also, in that case the 'HeaderField *field = h3_header_field_new();' newly allocated at the start of the loop would go to waste. So I suggest using a local stack variable at the start of the loop to hold the values parsed in that iteration, allocating the field at the end of the loop (after everything has completed successfully), and then copying the local stack variable onto the heap-allocated one with memcpy. That way, memory is allocated only for successfully parsed fields, and any failure only touches the stack-based field.
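
The "parse into a stack variable, allocate only on success" idea can be sketched like this. `SketchField` and `parse_one_field` are hypothetical stand-ins for H3's `HeaderField` and its parsing loop. The sketch also checks for end of input at every step, as requested above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for H3's HeaderField. */
typedef struct {
    const char *name;  int name_len;
    const char *value; int value_len;
} SketchField;

/* Parses one "Name: value\r\n" line from [*pp, end). On success returns a
 * heap copy and advances *pp. On any failure returns NULL, leaves *pp
 * untouched and allocates nothing, so there is nothing to leak. */
static SketchField *parse_one_field(const char **pp, const char *end)
{
    assert(pp != NULL && *pp != NULL && end != NULL);
    const char *p = *pp;
    SketchField tmp;                         /* scratch copy on the stack */

    tmp.name = p;
    while (p < end && *p != ':' && *p != '\r') p++;
    if (p >= end || *p != ':') return NULL;  /* end of input or bad line */
    tmp.name_len = (int)(p - tmp.name);
    p++;
    while (p < end && *p == ' ') p++;
    tmp.value = p;
    while (p < end && *p != '\r') p++;
    if (end - p < 2 || p[1] != '\n') return NULL;
    tmp.value_len = (int)(p - tmp.value);

    SketchField *field = malloc(sizeof *field);  /* allocate only now */
    if (field == NULL) return NULL;
    memcpy(field, &tmp, sizeof *field);
    *pp = p + 2;
    return field;
}
```

The caller owns the returned field and releases it with `free()`.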

Warning(s) during compile: statement with no effect

Some warnings occur when compiling h3.

make -f Makefile.dist

src/request_header.c: In function ‘h3_request_header_parse’:
src/request_header.c:86:5: warning: statement with no effect [-Wunused-value]
src/request_header.c:116:9: warning: statement with no effect [-Wunused-value]

This is caused by using the macro that tests for a CR-LF as a bare statement: iscrlf(p); p+=2;

This macro is defined in src/scanner.h:35 as #define iscrlf(p) (*p == '\r' && *(p + 1) == '\n')

I am not familiar enough with the code and the various RFCs to assess this situation.
I can imagine that either:

  • it should be if ( iscrlf( p ) ) { p += 2; }
  • it should be p += 2;

Please advise what to do for both line 86 and line 116.
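
For reference, the first option can be sketched as a small helper that uses the macro's result instead of discarding it. This assumes that a missing CR-LF at that point should be reported to the caller. Whether that is the right behaviour for lines 86 and 116 is exactly the open question of this issue. `skip_crlf` is a hypothetical name, and the macro is repeated here with its argument parenthesised.

```c
#include <assert.h>
#include <stddef.h>

/* Same test as the macro quoted above, with the argument parenthesised. */
#define iscrlf(p) (*(p) == '\r' && *((p) + 1) == '\n')

/* Hypothetical helper: skip one CRLF if present, otherwise return NULL.
 * Using the macro's result in a condition removes the -Wunused-value
 * warning, because the comparison is no longer a statement on its own. */
static const char *skip_crlf(const char *p)
{
    assert(p != NULL);
    if (iscrlf(p))
        return p + 2;
    return NULL;
}
```

The second option (`p += 2;` with no test) also silences the warning, but it skips two bytes even when they are not a CR-LF.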
