elastic-rs / elastic
An Elasticsearch REST API client for Rust
License: Apache License 2.0
Move the parsing for `Format` into a compiler plugin crate so that we can provide a parser plugin to get `Item`s at compile time, instead of parsing at runtime.
Rather than calling `.unwrap()`, which will panic on errors, tank the build and give an unhelpful error message, return `span_err` with a proper message, so users can see what's actually going on.
Use issues (like this one) instead of committing TODOs in code. It's an unnecessary bit of commit history, but it's just more convenient right now.
The date parser needs to provide the current implementation as a default for implementations of the `Format` trait, but also allow them to override it where the situation is unique, such as with `epoch_millis`, where there's no equivalent chrono format.
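A minimal sketch of that default-with-override shape (the trait and type names here are assumptions, and the chrono-backed parser is stubbed out as a plain integer parse):

```rust
// Hypothetical sketch: a Format trait whose parse has a default
// implementation that special-case formats like epoch_millis override.
trait Format {
    fn fmt() -> &'static str;

    // Default: delegate to the shared format-string parser
    // (stubbed out here as a plain integer parse).
    fn parse(date: &str) -> Result<i64, String> {
        date.parse::<i64>().map_err(|e| e.to_string())
    }
}

struct BasicDate;

impl Format for BasicDate {
    fn fmt() -> &'static str { "yyyyMMdd" }
    // uses the default parse
}

struct EpochMillis;

impl Format for EpochMillis {
    fn fmt() -> &'static str { "epoch_millis" }

    // Override: there's no chrono format for epoch_millis, so handle
    // the raw millisecond count directly.
    fn parse(date: &str) -> Result<i64, String> {
        date.parse::<i64>().map_err(|e| e.to_string())
    }
}

fn main() {
    println!("{:?}", BasicDate::parse("20150529"));
    println!("{:?}", EpochMillis::parse("1432896000000"));
}
```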
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-index.html
On `elastic_client`, use an API like:

```rust
enum MultiIndexSelection<'a> {
    Include(&'a str),
    Exclude(&'a str)
}

enum IndexSelection<'a> {
    Single(&'a str),
    Multi(Vec<MultiIndexSelection<'a>>)
}

impl<'a> ToString for IndexSelection<'a> {
    fn to_string(&self) -> String {
        // Follow index formats for multi selection to return a single string
    }
}
```

And take an `Into<IndexSelection>` in the `elastic_client` methods. The `&str` in the selection may also need to be expanded.
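A runnable sketch of what that string conversion might do, assuming Elasticsearch's comma-separated multi-index syntax with a `-` prefix for exclusions (implemented via `Display`, which provides `ToString` for free):

```rust
use std::fmt;

enum MultiIndexSelection<'a> {
    Include(&'a str),
    Exclude(&'a str),
}

enum IndexSelection<'a> {
    Single(&'a str),
    Multi(Vec<MultiIndexSelection<'a>>),
}

// Render the selection using Elasticsearch's comma-separated multi-index
// syntax; excluded indices get a `-` prefix.
impl<'a> fmt::Display for IndexSelection<'a> {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            IndexSelection::Single(idx) => f.write_str(idx),
            IndexSelection::Multi(ref parts) => {
                let formatted: Vec<String> = parts
                    .iter()
                    .map(|p| match *p {
                        MultiIndexSelection::Include(idx) => idx.to_string(),
                        MultiIndexSelection::Exclude(idx) => format!("-{}", idx),
                    })
                    .collect();
                f.write_str(&formatted.join(","))
            }
        }
    }
}

fn main() {
    let sel = IndexSelection::Multi(vec![
        MultiIndexSelection::Include("logs-2016"),
        MultiIndexSelection::Exclude("logs-2015"),
    ]);
    println!("{}", sel.to_string());
}
```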
Currently, the codegen is mostly just rolled straight from the traits. It would be good to use `aster` rather than putting together giant, ungainly syntax trees by hand. Take an `AstBuilder` argument, so an `ExtCtxt` could be passed in too. A `Span` would also be a good thing to pass around, even though it will usually be a `DUMMY_SP`.
From #1. Some notes on the YAML format.

For `do` steps, if a body is supplied, expect it as JSON.

```rust
struct Do<'a> {
    name: &'a str,
    params: Vec<&'a str>
}
```

For `match`, `is_false` and `is_true` steps, parse to `assert_eq`, assuming there aren't any complex matches to deal with. Use the free string on line 2 as either a comment on each test, or append it to the test name. There will need to be a coded response type, which should live in the `types` crate.
Fix the changes to `P` in a number of calls in `elastic_types_codegen`.
See https://github.com/steveklabnik/automatically_update_github_pages_with_travis_example
Will probably need to specify different page roots for each crate doc, since they're all in the same repo.
Test difficult codegen functions of #1 by generating and emitting a test source file, and verifying that it compiles.
It'd be cool if this could be wrapped up in a macro so it's run as a pre-compilation step
Currently, date formats are taken as strings and parsed into tokens, which are then used to parse the actual date. While it's friendlier on the user to use string formats instead of collections of chrono `Item`s, the performance improvement from not having to parse can't be ignored.

Time taken to parse a standard format:

```
test date::parse_date_format ... bench: 275 ns/iter (+/- 19)
```

Time taken to parse a `DateTime` struct from a `&str` format and a `Vec<Item>` respectively:

```
test date::parse_date_from_format ... bench: 642 ns/iter (+/- 17)
test date::parse_date_from_tokens ... bench: 337 ns/iter (+/- 9)
```
It would be good to be able to offer either some kind of caching of parsed formats, or build a macro that parses formats at compile time.
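One possible shape for the caching option, sketched with `std::sync::OnceLock`; the token type here is a stand-in, not the real chrono `Item`:

```rust
use std::sync::OnceLock;

// Stand-in token type; the real implementation would cache Vec<chrono::Item>.
type Token = &'static str;

// Parse the format string once per process; every later call reuses the
// cached tokens instead of re-parsing the format.
fn basic_date_tokens() -> &'static [Token] {
    static TOKENS: OnceLock<Vec<Token>> = OnceLock::new();
    TOKENS.get_or_init(|| {
        // stand-in for parsing "yyyyMMdd" into date tokens
        vec!["%Y", "%m", "%d"]
    })
}

fn main() {
    println!("{:?}", basic_date_tokens());
}
```

The compile-time macro option would remove even the one-off parse, but a cache like this already drops the per-date cost to the tokens-only benchmark above.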
Need to write a test that uses the actual API spec source and gets mod paths; it doesn't seem to be correct for some reason on the actual data.
Changes in the latest nightly for `PatIdent` are breaking the build.
Use Travis to build, test and generate docs. Add the Travis badge of success/shame to the README.
For #1. There need to be helpers in the `codegen::api::gen::rust` mod for taking an AST `Endpoint` and building functions from it. This should also include meta for the module they should live in (e.g. `cluster`) and doc comments etc.
Write some macros that take inline JSON and serialise it to `JsonValue` at compile time. For most cases, there would be two steps. The macro design could look something like this:

```rust
json!(a, b, c, {
    'a': $a,
    'b': {
        'ba': $b
    },
    $c: 'value'
});
```

Where `a`, `b` and `c` are `Serialize`. The `json!` macro could be backed by two macros; one which does the actual parsing and serialisation, and the other that splices in replacement values:

```rust
let json = json_ser!({
    'a': $a,
    'b': {
        'ba': $b
    },
    $c: 'value'
});
```

This would result in:

```rust
JsonPartialResult {
    json: JsonValue,
    replacements: BTreeMap<String, ReplacementPath>
}

impl JsonPartialResult {
    pub fn replace(&self, repls: BTreeMap<String, JsonValue>) -> JsonValue;
}

enum ReplacementPath {
    Key(String),
    Value(String)
}
```

The `replacements` value would look like:

```
"a": Value("a"),
"b": Value("b.ba"),
"c": Key("")
```

We then use a macro to splice the replacements in. This would be done at runtime:

```rust
let mut repls = BTreeMap::with_capacity(3);
repls.insert("a", a);
repls.insert("b", b);
repls.insert("c", c);

let result = json.replace(repls);
```

If the json macro doesn't contain replacement chars, then we just return the result; otherwise, we return the expression calling `json.replace`, so values can be spliced in at runtime.
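A toy, runnable sketch of the runtime splice step. All names are assumptions, and `ReplacementPath` is simplified here to dotted value paths only:

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum JsonValue {
    Str(String),
    Object(BTreeMap<String, JsonValue>),
}

struct JsonPartialResult {
    json: JsonValue,
    // placeholder name -> dotted path to the node it stands in for
    replacements: BTreeMap<String, String>,
}

impl JsonPartialResult {
    fn replace(mut self, repls: BTreeMap<String, JsonValue>) -> JsonValue {
        for (name, value) in repls {
            if let Some(path) = self.replacements.get(&name).cloned() {
                splice(&mut self.json, &path, value);
            }
        }
        self.json
    }
}

// Walk the dotted path and overwrite the leaf with the replacement value.
fn splice(json: &mut JsonValue, path: &str, value: JsonValue) {
    if let JsonValue::Object(ref mut map) = *json {
        match path.split_once('.') {
            None => {
                map.insert(path.to_string(), value);
            }
            Some((head, rest)) => {
                if let Some(child) = map.get_mut(head) {
                    splice(child, rest, value);
                }
            }
        }
    }
}

fn main() {
    // {"b": {"ba": "$b"}} with replacement "b" -> "b.ba"
    let mut inner = BTreeMap::new();
    inner.insert("ba".to_string(), JsonValue::Str("$b".to_string()));
    let mut root = BTreeMap::new();
    root.insert("b".to_string(), JsonValue::Object(inner));

    let mut replacements = BTreeMap::new();
    replacements.insert("b".to_string(), "b.ba".to_string());

    let partial = JsonPartialResult { json: JsonValue::Object(root), replacements };

    let mut repls = BTreeMap::new();
    repls.insert("b".to_string(), JsonValue::Str("spliced".to_string()));
    println!("{:?}", partial.replace(repls));
}
```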
There are lots of messy allocations in the Codegen which can be tidied up. This doesn't need to be a high priority because the codegen isn't expected to be run by end-users, but should still be looked at at some point.
Use the types from `elastic_types` as implementations for function parameters where necessary. There shouldn't be too many of these; otherwise, they could be taken as strings. If we can avoid introducing another dependency, that would be good.
Currently, none of this compiles on the latest nightly channel, or with the latest versions of packages. This will need to change, but for now, setting the versions correctly in the `Cargo.toml` files and listing the latest supported nightly build would be helpful.
Use the Elasticsearch API spec to generate Rust protocols and tests https://github.com/elastic/elasticsearch/tree/master/rest-api-spec
Add a `ParseError` type to date parsing that can be used by both the `elastic_types_parsers` crate and the `elastic_types` crate.
The chrono library has separate structures for `Date` and `DateTime`. This doesn't work with Elasticsearch's default `strict_date_optional_time` format. Can be implemented as a second-chance parse if the pull request in chrono is merged: chronotope/chrono#54
See: https://www.elastic.co/guide/en/elasticsearch/reference/current/date-math-index-names.html for examples of date math. These are resolved on the Elasticsearch server, so they shouldn't need to be handled on our end, but date math is supported in other places too.
The generated code needs to be emitted somewhere, to a `Writer`. There should be a base `emit` trait that can be implemented to output the results as a string to the writer.
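A minimal sketch of that base trait, assuming implementors render themselves to a string and a default method handles the writing (all names here are placeholders):

```rust
use std::io::{self, Write};

// Sketch: anything that can render itself as source code gets a default
// emit() that writes the rendered string to any Write.
trait Emit {
    fn emit_str(&self) -> String;

    fn emit<W: Write>(&self, writer: &mut W) -> io::Result<()> {
        writer.write_all(self.emit_str().as_bytes())
    }
}

// A stand-in generated item: an empty function with a name.
struct EmptyFn {
    name: String,
}

impl Emit for EmptyFn {
    fn emit_str(&self) -> String {
        format!("fn {}() {{ }}", self.name)
    }
}

fn main() -> io::Result<()> {
    let f = EmptyFn { name: "ping".to_string() };
    let mut buf = Vec::new();
    f.emit(&mut buf)?;
    println!("{}", String::from_utf8_lossy(&buf));
    Ok(())
}
```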
See: https://users.rust-lang.org/t/fast-string-concatenation/4425
The `url_push` method is not only the fastest, but also the easiest to codegen. This can be built up using the `url_parse_parts` and `url_parse_params` methods. This should be in the base `codegen` crate.
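A sketch of the `url_push` style of concatenation from that thread; the function shape and parameters are assumptions:

```rust
// Preallocate the exact capacity, then push each part in order; this
// avoids the intermediate allocations of format!-style URL building.
fn url_push(index: &str, ty: &str, id: &str) -> String {
    let mut url = String::with_capacity(3 + index.len() + ty.len() + id.len());
    url.push('/');
    url.push_str(index);
    url.push('/');
    url.push_str(ty);
    url.push('/');
    url.push_str(id);
    url
}

fn main() {
    println!("{}", url_push("bench_index", "docs", "1"));
}
```

This shape is also easy to generate: `url_parse_parts` would emit the `push`/`push_str` calls and `url_parse_params` the capacity expression.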
The current way `chrono` is used is clunky. The easiest solution will probably be to re-export the crate, so users don't specifically need to `extern crate chrono` unless they're already using it elsewhere. Also implement `From<chrono::DateTime<_>>` for `elastic_types::DateTime`.
When no replacements are present, `json!` returns `&str`, but when there are, it returns a `String`. Change both to use `String` and see how the performance is impacted.
From #7. Add CI for the online docs (which don't exist yet) using Travis.
Add a proper `ParseError` struct to handle parsing the API or test specs. Needs to cover such issues as parsing and reading, without being too opinionated about the source of the `Read` (file, url, etc.)
Need to add Rust docs to the types, with links back to the original Elasticsearch docs too where possible.
Test that emitting multiple fns to a single emitter works as expected, and that we can emit `use` statements easily.
For #1. Use `ExtCtxt` to generate AST nodes so the API is more in line with compiler plugins.
Long term, it looks like the new `rotor_http` might be a good base for building the HTTP layer on.
https://github.com/tailhook/rotor-http/blob/master/src/client/request.rs
- `/_search`
- `/_search` with body
- `Notifier`
- `/_search` with various bodies

The current implementation of `epoch_millis` expects that the last 3 digits are millis and the rest are the epoch seconds.
This is correct for 13-digit timestamps, but may not necessarily be for smaller ones. It's difficult to find examples of how this should work; different formatters seem to use different behaviour for millis. Another issue is whether or not the millis on dates before the epoch should count down from 1000, instead of up to 1000. So should a timestamp of `-1` be `31/12/1969 23:59:59.999` or `31/12/1969 23:59:59.001`?
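For reference, the flooring interpretation (millis counting up from the start of the previous second) gives the first answer: `-1` millisecond before the epoch is `23:59:59.999`. A sketch of that split; the function name is an assumption:

```rust
// Split an epoch_millis timestamp into whole seconds and a millisecond
// component in 0..1000. Rust's % truncates toward zero, so negative
// timestamps need an adjustment to floor instead.
fn split_epoch_millis(ts: i64) -> (i64, u32) {
    let mut secs = ts / 1000;
    let mut millis = ts % 1000;
    if millis < 0 {
        // step back to the previous whole second so millis counts up
        secs -= 1;
        millis += 1000;
    }
    (secs, millis as u32)
}

fn main() {
    // -1 ms before the epoch: 1969-12-31 23:59:59.999
    println!("{:?}", split_epoch_millis(-1));
}
```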
Any tests that don't have an `assert` in them should be turned into documentation samples instead.
It'll make them more accessible to users, and not clutter up the tests with blocks of code that aren't really 'tests' anyway.
See: https://github.com/rust-lang-nursery/rustfmt
Files that are codegenned should be formatted properly. This might just be a case of having to set up a build script to generate code, then format it.
The tests for the codegen need to be tidied up.
This test should verify that there is no extra path info on the type. And there should be a corresponding test to show that there can be full path info.
Make sure the alternative body adding fns are tested properly.
Changes in Rust seem to have broken the `RustEmitter`. Add an explicit lifetime requirement to the `Emitter` trait, which will probably mean `PhantomData` on the `ContextFreeEmitter`.
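A sketch of the `PhantomData` approach; the trait shape and names are assumptions:

```rust
use std::marker::PhantomData;

// The trait carries an explicit lifetime so implementors can borrow
// from a context (e.g. an ExtCtxt) for that long.
trait Emitter<'a> {
    fn emit(&self, input: &'a str) -> String;
}

// A context-free emitter stores no borrowed data, so it needs
// PhantomData to tie itself to the lifetime parameter.
struct ContextFreeEmitter<'a> {
    _marker: PhantomData<&'a ()>,
}

impl<'a> ContextFreeEmitter<'a> {
    fn new() -> Self {
        ContextFreeEmitter { _marker: PhantomData }
    }
}

impl<'a> Emitter<'a> for ContextFreeEmitter<'a> {
    fn emit(&self, input: &'a str) -> String {
        input.to_string()
    }
}

fn main() {
    let emitter = ContextFreeEmitter::new();
    println!("{}", emitter.emit("fn main() { }"));
}
```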
From #8. With the crate design sorted, they need to be uploaded to crates.io. This will include:

- `elastic_codegen`: 0.2
- `elastic_types`: 0.2
- `elastic_hyper`: 0.2

The version numbers don't necessarily need to be in sync, but while they're covered by a single milestone it makes sense.
Need to get some thoughts down about the feature requirements for this project and sort them into milestones. Some of this work is already done, but more is needed.
At the moment, primitive types manually implement the `Emit` trait, but they could all be covered with a single blanket impl over `ToString`. This could be a bit of a sledgehammer though, because it may get in the way of people implementing `Emit` themselves. Keep an eye on how the `Emit` API is used, and rework the default impls if necessary.
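The blanket impl in question would look something like this (a sketch; the real `Emit` signature may differ). The sledgehammer aspect is coherence: with this impl in place, no one can write their own `Emit` impl for a type that also implements `ToString`:

```rust
trait Emit {
    fn emit(&self) -> String;
}

// One impl covers every ToString type at once...
impl<T: ToString> Emit for T {
    fn emit(&self) -> String {
        self.to_string()
    }
}

fn main() {
    // ...so primitives get Emit for free.
    println!("{}", 42_i32.emit());
    println!("{}", true.emit());
}
```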
Move the contents of the `api::gen` mod (apart from `url`) to a base module so it can be shared with the `test` module.
Need to get the crate design sorted for the key functionality. Some could be combined into a single crate, but these should be on crates.io once functionality is at a useable state.
Once Hyper has support for async io, use it as the HTTP driver for the client https://github.com/hyperium/hyper/tree/mio
Rework the Rust codegen to be properly compatible with compiler plugins and macros. This means we need to pass an `ExtCtxt` around and use it for creating idents etc. That'll also make it possible to replace some of the homegrown AST building with quotes.
Need to add documentation to the `elastic_types_codegen` crate. This should cover the parsing and the plugin macros.
See: https://github.com/KodrAus/elasticsearch-rs/blob/master/codegen/src/emit/mod.rs#L139
Try to prevent the double `into()` call on errors; it shouldn't be necessary. The names of the generics could be improved. This could be turned into a macro.