druid4net's Introduction

druid4net

A .NET Apache Druid client written in C#

Supports .NET 4.5 and above, .NET Standard 1.6 and 2.0

Getting started

  1. Add a reference to druid4net from NuGet, or download the DLL from releases and reference it
  2. Add your favorite JSON parser (if you don't already have one referenced)
  3. Implement the IJsonSerializer interface
  4. Create a DruidClient and start querying

Querying

To query Druid, create an instance of the DruidClient using code similar to the following:

var options = new ConfigurationOptions()
{
  JsonSerializer = new JilSerializer(),
  QueryApiBaseAddress = new Uri("http://localhost:8082")
};
var druidClient = new DruidClient(options);

Note: the JilSerializer implementation can be found in the integration tests project, along with sample queries for all supported query types.

Timeseries

See Apache Druid Timeseries query documentation for more details on this type of query.

The following example runs a timeseries query against the sample wikiticker datasource. It keeps only rows where the country code is 'US' and the timestamp falls within the specified interval, then returns the sum of the added metric for each hour, in descending time order.

var response = _druidClient.Timeseries<T>(q => q
  .Descending(true)
  .Aggregations(new LongSumAggregator("totalAdded", "added"))
  .Filter(new SelectorFilter("countryIsoCode", "US"))
  .DataSource("wikiticker")
  .Interval(FromDate, ToDate)
  .Granularity(Granularities.Hour)
);

TopN

See Apache Druid TopN query documentation for more details on this type of query.

The following example runs a topN query against the sample wikiticker datasource. It keeps only rows where the country code is 'US', the user was anonymous, and the timestamp falls within the specified interval, then returns the top 5 pages by count.

var response = _druidClient.TopN<T>(q => q
  .Metric("totalCount")
  .Dimension("page")
  .Threshold(5)
  .Aggregations(new LongSumAggregator("totalCount", "count"))
  .Filter(new AndFilter(
    new SelectorFilter("isAnonymous", "true"),
    new SelectorFilter("countryIsoCode", "US")
  ))
  .DataSource("wikiticker")
  .Interval(FromDate, ToDate)
  .Granularity(Granularities.All)
);

GroupBy

See Apache Druid GroupBy query documentation for more details on this type of query.

The following example runs a groupBy query against the sample wikiticker datasource. It returns the sum of the page count grouped by country name, then by city name, and finally by page name.

var response = _druidClient.GroupBy<T>(q => q
  .Dimensions("countryName", "cityName", "page")
  .Aggregations(new LongSumAggregator("totalCount", "count"))
  .DataSource("wikiticker")
  .Interval(FromDate, ToDate)
  .Granularity(Granularities.All)
);

Select

See Apache Druid Select query documentation for more details on this type of query.

The following example runs a select query against the sample wikiticker datasource. It selects the country name, city name, page, added, and deleted values, filtered to anonymous users and limited to 10 records.

var response = _druidClient.Select<T>(q => q
  .Dimensions("countryName", "cityName", "page")
  .Metrics("added", "deleted")
  .Paging(new PagingSpec(10))
  .Filter(new SelectorFilter("isAnonymous", "true"))
  .DataSource("wikiticker")
  .Interval(FromDate, ToDate)
);

Search

See Apache Druid Search query documentation for more details on this type of query.

The following example runs a search query against the sample wikiticker datasource. It searches for pages that contain the term "Dragon" and returns the matching page dimension values, limited to 10 records.

var response = _druidClient.Search(q => q
  .DataSource("wikiticker")
  .Granularity(Granularities.All)
  .SearchDimensions("page")
  .Query(new ContainsSearchQuery("Dragon"))
  .Limit(10)
  .Interval(FromDate, ToDate)
);

TimeBoundary

See Apache Druid TimeBoundary query documentation for more details on this type of query.

The following example runs a timeBoundary query against the sample wikiticker datasource. It finds the earliest and latest data timestamps, filtered to anonymous users.

var response = _druidClient.TimeBoundary(q => q
  .DataSource("wikiticker")
  .Filter(new SelectorFilter("isAnonymous", "true"))
);

Scan

See Apache Druid Scan query documentation for more details on this type of query.

The following example runs a scan query against the sample wikiticker datasource. It returns Druid records in streaming mode, filtered to anonymous users and limited to the first 10 results.

var response = _druidClient.Scan<T>(q => q
  .DataSource("wikiticker")
  .Interval(FromDate, ToDate)
  .Filter(new SelectorFilter("isAnonymous", "true"))
  .Limit(10)
);

Async queries

All query types have both synchronous and asynchronous methods available.

For example:

// Synchronous
var response = _druidClient.Timeseries<T>(q => q...);

// Asynchronous
var response = await _druidClient.TimeseriesAsync<T>(q => q...);
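One reason to prefer the asynchronous methods is that independent queries can run concurrently. The sketch below is not part of druid4net: `AsyncDemo` and `RunBothAsync` are hypothetical names, and the helper only relies on standard `Task` APIs, so it works with any pair of task-returning calls.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // Starts both operations before awaiting either, so they overlap in time.
    public static async Task<(TA First, TB Second)> RunBothAsync<TA, TB>(
        Func<Task<TA>> first,
        Func<Task<TB>> second)
    {
        Task<TA> a = first();   // first query is now in flight
        Task<TB> b = second();  // second query starts without waiting for the first

        // Completes when both tasks have finished; rethrows if either one failed.
        await Task.WhenAll(a, b);

        // Awaiting an already completed task returns its result immediately.
        return (await a, await b);
    }
}
```

With a DruidClient this might be called as `await AsyncDemo.RunBothAsync(() => _druidClient.TimeseriesAsync<T>(q => q...), () => _druidClient.TimeseriesAsync<T>(q => q...))`, assuming the two queries do not depend on each other.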

Notes

Why do I need to implement IJsonSerializer?

The short answer is we wanted no dependencies. We also didn't want to implement our own JSON serialization as there are already so many good libraries out there that do this. Most projects already have a library included in their solution that can be used by implementing the interface in a simple pass-through class.
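As a rough illustration of such a pass-through class, the sketch below forwards every call to System.Text.Json. The `IJsonSerializer` declaration here is only a stand-in so the example compiles on its own: its member names and signatures are an assumption, so check the interface that ships with druid4net and implement that one instead. `SystemTextJsonSerializer` is a hypothetical class name.

```csharp
using System.Text.Json;

// ASSUMPTION: stand-in for druid4net's IJsonSerializer. The real interface
// comes from the library and its members may be named or shaped differently.
public interface IJsonSerializer
{
    string Serialize<T>(T value);
    T Deserialize<T>(string json);
}

// Pass-through class: no logic of its own, it only delegates to System.Text.Json.
public sealed class SystemTextJsonSerializer : IJsonSerializer
{
    // Serializing as object makes System.Text.Json use the runtime type of the
    // value instead of the declared type T.
    public string Serialize<T>(T value) => JsonSerializer.Serialize<object>(value);

    public T Deserialize<T>(string json) => JsonSerializer.Deserialize<T>(json);
}
```

Any other JSON library can be wrapped the same way. The JilSerializer in the integration tests project is the reference to follow for the actual interface.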

Not supported yet

  • Union data source
  • Extraction filter

druid4net's People

Contributors

andrewbridge, pano-skylakis, pzawisza, redj4y

druid4net's Issues

DruidClient methods have `class` requirement on TResponse

Is there a reason that all of the methods have a class requirement for TResponse? As far as I can tell, my IJsonSerializer does not have the same requirement, and I can theoretically have my serializer return different concrete classes depending on what the data is, as long as it follows some interface. I don't actually have any use cases for when I would need to do this, but I'm trying to plug this into a library where I need to be able to pass an unconstrained generic into TResponse, and I'm just not seeing why that constraint exists.

        public async Task<IEnumerable<TResult>> GetManyAsync<TResult>(IQuery query)
        {
            // Following line fails because TResult does not have class constraint
            var response = await _getClient().ScanAsync<TResult>(
                // some code to map query to something ScanAsync can understand
            );
            return response.Data;
        }

I'm not sure if there's a simple workaround to this, other than checking that TResult IsByRef at runtime and using reflection to call ScanAsync.

Any thoughts/ideas?
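For reference, the reflection workaround mentioned above could look roughly like the following sketch. It does not use druid4net: `ConstraintDemo`, `Constrained` and `CallFromUnconstrained` are hypothetical names, with `Constrained` standing in for a library method that carries a `class` constraint. The runtime check uses `Type.IsValueType`; `Type.IsByRef` reports whether a type is a by-reference (`ref`/`out`) type, which is a different property.

```csharp
using System;
using System.Reflection;

public static class ConstraintDemo
{
    // Hypothetical stand-in for a library method with a `class` constraint.
    public static string Constrained<T>() where T : class => typeof(T).Name;

    // An unconstrained caller cannot write Constrained<T>() directly (CS0452),
    // so it checks the type at runtime and closes the generic method via reflection.
    public static string CallFromUnconstrained<T>()
    {
        if (typeof(T).IsValueType)
            throw new NotSupportedException(typeof(T).Name + " is a value type.");

        // Look up the open generic method, then bind it to the runtime type of T.
        MethodInfo open = typeof(ConstraintDemo).GetMethod(nameof(Constrained));
        MethodInfo closed = open.MakeGenericMethod(typeof(T));

        // Static method: no target instance and no arguments.
        return (string)closed.Invoke(null, null);
    }
}
```

For an asynchronous method the value returned by `Invoke` would be a task object that still has to be cast to the expected `Task<...>` type and awaited. Reflection also adds overhead per call, so caching the closed `MethodInfo` per type may be worthwhile.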

SegmentMetadata request without Interval includes empty Interval by default

Hihi

When making a SegmentMetadata request without an Interval call (which is optional in Druid), it includes "intervals":[] in the resulting call to Druid, which invalidates the result.

It's an easy workaround to pass DateTime.MinValue and DateTime.MaxValue, but I thought I'd highlight this here. I'll try to put a PR together to fix it.

Basic Authentication

I'm looking at the DruidClient and Requestor classes and trying to figure out how to support basic authentication on a request. Do you have any recommendations?
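As background for this question, the sketch below shows how an HTTP Basic `Authorization` header is built with the standard .NET HTTP types. `BasicAuth`, `CreateHeader` and `CreateClient` are hypothetical names, and the source does not show whether or how DruidClient or Requestor accept a preconfigured HttpClient or custom headers, so wiring this into druid4net remains an open point.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;

public static class BasicAuth
{
    // Builds the Basic scheme header value: base64("userName:password").
    public static AuthenticationHeaderValue CreateHeader(string userName, string password)
    {
        byte[] raw = Encoding.UTF8.GetBytes(userName + ":" + password);
        return new AuthenticationHeaderValue("Basic", Convert.ToBase64String(raw));
    }

    // Returns an HttpClient that sends the header with every request.
    public static HttpClient CreateClient(Uri baseAddress, string userName, string password)
    {
        var client = new HttpClient { BaseAddress = baseAddress };
        client.DefaultRequestHeaders.Authorization = CreateHeader(userName, password);
        return client;
    }
}
```

Basic credentials are only base64-encoded, not encrypted, so this is normally used over HTTPS.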

Optional Limit and Offset in DefaultLimitSpec

According to the Druid documentation (Limit Spec), the Limit and Offset fields in DefaultLimitSpec are optional integers, but in the current implementation both values are of type int. Is there any reason for this, and is it possible to change them to long, or at least to a nullable int (int?)? Or is there some workaround for sorting in a GroupBy query without setting offset and limit?

BoundFilter with strings

I'm trying to create a new BoundFilter to compare strings. Because there's a struct constraint, it's not letting me pass the string type:

new BoundFilter<string>("srcip", lower: "127", lowerStrict: false, ordering: SortingOrder.alphanumeric)
// Error: CS0453 The type 'string' must be a non-nullable value type in order to use it as parameter 'T' in the generic type or method 'BoundFilter<T>'

Am I doing something wrong? According to the docs, it should be able to work: https://druid.apache.org/docs/latest/querying/filters.html#bound-filter

{
    "type": "bound",
    "dimension": "name",
    "lower": "foo",
    "upper": "hoo"
}

Not setting `Context:useCache` causes an UnderflowError

This might almost be the wrong repo, so I'll try to flesh this out later, but the gist is that a seemingly fresh instance of Druid, if called with an empty context object, will throw an error called JavaUnderflowException.

Given that we set useCache to null, this occurs on some calls. Oddly, the error seems to have vanished now that I have made a single request with the useCache value set.
