calcite-tutorial's People

Contributors

julianhyde, zabetak

calcite-tutorial's Issues

Implement double convention end to end example for data in Lucene

Implement a class similar to EndToEndExampleEnumerable that fetches data from Lucene. This class goes one step further than #2 by focusing on the presence of multiple conventions (BindableConvention, LuceneConvention) and by introducing simple implementation rules, which more closely resemble real use cases of Calcite.

The following classes are added:

  • LuceneTableScan
  • LuceneTableScanRule
  • LuceneToBindableConverter
  • LuceneToBindableConverterRule
  • LuceneRel

and LuceneTable no longer implements ScannableTable (the logic is moved to LuceneToBindableConverter).

Add filter push down optimization in Lucene example

In #3 and #4, the whole table is fetched from Lucene and loaded into memory before any further operations are applied to the data. The goal of this issue is to push simple filter conditions down to Lucene, exploiting the efficient data structures provided by the library to speed up query execution and avoid touching all data pages on disk.
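
The difference between the two strategies can be sketched with a toy example. This is not Lucene or Calcite code: the class `FilterPushDownSketch`, its `TreeMap` "index", and the `entriesTouched` counter are all hypothetical stand-ins, chosen only to show why letting the source evaluate a range predicate can visit fewer entries than scanning everything and filtering afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy illustration of filter push-down; a stand-in, not the Lucene or Calcite API. */
public class FilterPushDownSketch {
  /** Hypothetical index: total price -> order keys, kept sorted by price. */
  private final TreeMap<Double, List<Integer>> priceIndex = new TreeMap<>();

  /** Number of index entries visited by the last strategy that ran. */
  public int entriesTouched = 0;

  public void add(int orderKey, double totalPrice) {
    priceIndex.computeIfAbsent(totalPrice, k -> new ArrayList<>()).add(orderKey);
  }

  /** No push-down: read every entry, then apply the predicate in memory. */
  public List<Integer> scanThenFilter(double minPrice) {
    List<Integer> result = new ArrayList<>();
    for (Map.Entry<Double, List<Integer>> e : priceIndex.entrySet()) {
      entriesTouched++;
      if (e.getKey() > minPrice) {
        result.addAll(e.getValue());
      }
    }
    return result;
  }

  /** Push-down: the source answers the range predicate from its sorted structure. */
  public List<Integer> pushedDownFilter(double minPrice) {
    List<Integer> result = new ArrayList<>();
    // tailMap(minPrice, false) yields only entries with price strictly above minPrice.
    for (Map.Entry<Double, List<Integer>> e : priceIndex.tailMap(minPrice, false).entrySet()) {
      entriesTouched++;
      result.addAll(e.getValue());
    }
    return result;
  }
}
```

Both methods return the same order keys; only the number of entries visited differs. In the real example the range predicate would presumably be translated into a Lucene query rather than a `TreeMap` lookup.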

Implement simple Lucene indexer for the tutorial dataset

The tutorial will be about Lucene, so we need a small program to create the respective indexes and populate them with data. The simplest approach would be to load data from CSV files, although depending on the dataset we choose, the source could also be a JDBC connection.
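
As a rough sketch of the loading half of such a program, the snippet below reads CSV text into one column-to-value map per row using only the Java standard library. The class name `CsvRowsSketch` is hypothetical, and the code assumes a header line and plain comma-separated values without quoting; the actual tutorial files may differ. Turning each row into a Lucene document is deliberately left out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: reads header-prefixed CSV text into row maps. */
public class CsvRowsSketch {
  /** Returns one column -> value map per data line; assumes no quoted fields. */
  public static List<Map<String, String>> readRows(Reader source) {
    List<Map<String, String>> rows = new ArrayList<>();
    try (BufferedReader reader = new BufferedReader(source)) {
      String headerLine = reader.readLine();
      if (headerLine == null) {
        return rows; // empty input, nothing to index
      }
      String[] columns = headerLine.split(",", -1);
      String line;
      while ((line = reader.readLine()) != null) {
        if (line.isEmpty()) {
          continue; // skip blank lines
        }
        String[] values = line.split(",", -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < columns.length && i < values.length; i++) {
          row.put(columns[i].trim(), values[i].trim());
        }
        // An indexer would convert this row into a document for the index here.
        rows.add(row);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return rows;
  }
}
```

Accepting a `Reader` rather than a file path keeps the sketch independent of where the data comes from, which would also leave room for a different source later.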

Generate TPC-H dataset programmatically

We could avoid checking TPC-H data files into the git repo by generating the dataset programmatically using airlift-tpch or a similar library.

Before making the change we should ensure that it is possible to generate tiny datasets (scale factors < 1) and that generation is acceptably fast. Currently we use a dataset with scale factor 0.001.

Implement single convention end to end example for data in Lucene

Create a class similar to EndToEndExampleEnumerable but for data residing in Lucene indexes. The idea is to introduce a LuceneTable class that implements the ScannableTable interface, using exclusively the Bindable convention.

The goal is to get the attendees familiar with the main Calcite APIs without talking a lot about multiple conventions, converters, etc.

To what logical plan are you referring in this section?

I'm following along with the tutorial after the presentation, and it's not clear to me which logical plan this piece refers to.

Is this Exercise 3 from https://www.slideshare.net/julianhyde/apache-calcite-a-tutorial-given-at-boss-21 ?

Q1

SELECT o.o_custkey, COUNT(*)
FROM orders AS o
GROUP BY o.o_custkey

Q2

SELECT o.o_custkey, COUNT(*)
FROM orders AS o
WHERE o.o_totalprice > 220388.06
GROUP BY o.o_custkey

// TODO 2. Use the RelBuilder to create directly a logical plan for execution
