zabetak / calcite-tutorial
License: Apache License 2.0
Implement a class similar to EndToEndExampleEnumerable that fetches data from Lucene. This class goes one step further than #2 by focusing on the presence of multiple conventions (BindableConvention, LuceneConvention) and the introduction of simple implementation rules, which more closely resemble real use cases of Calcite.
The following classes are added:
LuceneTableScan
LuceneTableScanRule
LuceneToBindableConverter
LuceneToBindableConverterRule
In addition, LuceneTable no longer implements ScannableTable (the logic is moved to LuceneToBindableConverter).
In #3 and #4, the whole table is fetched from Lucene and loaded into memory before any further operations are applied to the data. The goal of this issue is to push simple filter conditions down to Lucene, exploiting the efficient data structures provided by the library to speed up query execution and avoid touching all data pages on disk.
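The difference between fetching everything and pushing a filter down can be sketched without any Lucene code. In the minimal example below, a java.util.TreeMap is only a stand-in for Lucene's index structures, and the class and method names (FilterPushdownSketch, scanThenFilter, buildIndex, indexLookup) are hypothetical, not part of the tutorial. It shows that a range condition such as o_totalprice > 220388.06 can be answered from an ordered index by visiting only the matching entries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class FilterPushdownSketch {
  /** Baseline: visit every row, then filter in memory. */
  static List<Integer> scanThenFilter(double[] totalPrices, double threshold) {
    List<Integer> rowIds = new ArrayList<>();
    for (int rowId = 0; rowId < totalPrices.length; rowId++) {
      if (totalPrices[rowId] > threshold) {
        rowIds.add(rowId);
      }
    }
    return rowIds;
  }

  /** Builds an ordered index from price to row ids (stand-in for a Lucene index). */
  static NavigableMap<Double, List<Integer>> buildIndex(double[] totalPrices) {
    NavigableMap<Double, List<Integer>> index = new TreeMap<>();
    for (int rowId = 0; rowId < totalPrices.length; rowId++) {
      index.computeIfAbsent(totalPrices[rowId], k -> new ArrayList<>()).add(rowId);
    }
    return index;
  }

  /** Pushed-down filter: only index entries with price > threshold are visited. */
  static List<Integer> indexLookup(NavigableMap<Double, List<Integer>> index, double threshold) {
    List<Integer> rowIds = new ArrayList<>();
    // tailMap(threshold, false) is an exclusive lower bound, matching the strict ">".
    for (List<Integer> ids : index.tailMap(threshold, false).values()) {
      rowIds.addAll(ids);
    }
    Collections.sort(rowIds); // return row ids in table order, like the baseline
    return rowIds;
  }
}
```

Both methods return the same row ids; the second one never looks at rows whose price is at or below the threshold, which is the effect the pushdown is after.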
It would be nice to show how a rule can exploit metadata in order to perform some transformations.
Currently all tables return the default row count, which is 100. It would be nice to plug in the real row count by probing Lucene.
The tutorial will be about Lucene, so we need a small program to create the respective indexes and populate them with data. The simplest approach would be to load data from CSV files, although depending on the dataset that we choose, the source could also be a JDBC connection.
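As a starting point for the CSV route, the sketch below turns CSV lines into one map per row, keyed by the header names. It assumes a header line and plain comma-separated values without quoting or escaping; the class name CsvRows is hypothetical. Converting each map into a Lucene document is deliberately left out, since that depends on the Lucene version and on the field types chosen for each column.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CsvRows {
  /** Parses CSV lines (first line is the header) into one ordered map per data row. */
  static List<Map<String, String>> parse(List<String> lines) {
    List<Map<String, String>> rows = new ArrayList<>();
    if (lines.isEmpty()) {
      return rows;
    }
    // limit -1 keeps trailing empty fields instead of silently dropping them
    String[] header = lines.get(0).split(",", -1);
    for (String line : lines.subList(1, lines.size())) {
      if (line.isEmpty()) {
        continue; // skip blank lines
      }
      String[] values = line.split(",", -1);
      if (values.length != header.length) {
        throw new IllegalArgumentException("Unexpected number of fields: " + line);
      }
      Map<String, String> row = new LinkedHashMap<>();
      for (int i = 0; i < header.length; i++) {
        row.put(header[i].trim(), values[i].trim());
      }
      rows.add(row);
    }
    return rows;
  }
}
```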
We could avoid checking TPC-H data files into the git repo by generating the dataset programmatically using airlift-tpch or a similar library.
Before making the change we should ensure that it is possible to generate tiny datasets (scale factors < 1) and that it is acceptably fast. Currently we use a dataset with scale factor 0.001.
Create a class similar to EndToEndExampleEnumerable but for data residing in Lucene indexes. The idea is to introduce a LuceneTable class that implements the ScannableTable interface, using exclusively the Bindable convention.
The goal is to get the attendees familiar with the main Calcite APIs without talking a lot about multiple conventions, converters, etc.
I'm following along with the tutorial after the presentation, and it's not clear to me which logical plan this piece is referring to.
Is this Exercise 3 from https://www.slideshare.net/julianhyde/apache-calcite-a-tutorial-given-at-boss-21 ?
Q1
SELECT o.o_custkey, COUNT(*)
FROM orders AS o
GROUP BY o.o_custkey
Q2
SELECT o.o_custkey, COUNT(*)
FROM orders AS o
WHERE o.o_totalprice > 220388.06
GROUP BY o.o_custkey
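To make the shape of the plan behind Q2 concrete, the query can be read as a pipeline of scan, filter, and aggregate. The sketch below expresses that pipeline with plain Java streams rather than Calcite operators; the Order class and the sample rows are hypothetical and serve only as an illustration, not as tutorial data.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class Q2Pipeline {
  /** Hypothetical, simplified stand-in for a row of the orders table. */
  static final class Order {
    final int oCustkey;
    final double oTotalprice;

    Order(int oCustkey, double oTotalprice) {
      this.oCustkey = oCustkey;
      this.oTotalprice = oTotalprice;
    }
  }

  /** Scan -> Filter (o_totalprice > threshold) -> Aggregate (COUNT(*) per o_custkey). */
  static Map<Integer, Long> countByCustomer(List<Order> orders, double threshold) {
    return orders.stream()                                 // scan
        .filter(o -> o.oTotalprice > threshold)            // WHERE
        .collect(Collectors.groupingBy(                    // GROUP BY + COUNT(*)
            o -> o.oCustkey, TreeMap::new, Collectors.counting()));
  }

  public static void main(String[] args) {
    List<Order> orders = List.of(
        new Order(1, 250000.00),
        new Order(1, 100.00),
        new Order(2, 300000.00),
        new Order(1, 220388.07));
    System.out.println(countByCustomer(orders, 220388.06)); // prints {1=2, 2=1}
  }
}
```

Q1 corresponds to the same pipeline without the filter step.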