verdict-project / verdict
Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing
Home Page: http://verdictdb.org
License: Apache License 2.0
Super interesting work. I'm curious to see how it could improve query times in the Druid timeseries database.
One possible reference might be the datasketches extension, though I'm not at all sure whether that's how it would be implemented, knowing little about the technical details of either Druid or Datasketches. I'd be interested to see how Verdict would compare to Datasketches in that context.
User should be able to choose any combination of:
The correct confidence interval is [a, b], where a = 2t' - t''_i, b = 2t' - t''_j, t' is the estimated value, and the t'' values are the bootstrap values.
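The interval above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Verdict's implementation: the function name, the resampling scheme, and in particular the choice of i as the upper quantile index and j as the lower quantile index of the sorted bootstrap values are assumptions, since the note does not say which bootstrap values are meant.

```python
import random


def basic_bootstrap_interval(sample, estimator, n_boot=1000, alpha=0.05, seed=0):
    """Return (a, b) with a = 2t' - t''_i and b = 2t' - t''_j."""
    rng = random.Random(seed)
    t_hat = estimator(sample)  # t': the estimate on the original sample
    # t'': estimates on resamples drawn with replacement, sorted ascending
    boots = sorted(
        estimator([rng.choice(sample) for _ in sample]) for _ in range(n_boot)
    )
    # Assumption: i picks the upper quantile, j the lower quantile.
    i = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    j = int((alpha / 2) * n_boot)
    return 2 * t_hat - boots[i], 2 * t_hat - boots[j]


def mean(xs):
    return sum(xs) / len(xs)


data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0, 12.0]
a, b = basic_bootstrap_interval(data, mean)
print(a, b)
```

Because the larger bootstrap value is subtracted for a and the smaller one for b, the result always satisfies a <= b.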
example:
create sample tss from t with size 10% columns stratified by name;
Don't use a UDF; calculate in the middleware instead.
example:
when the result of count on the sample is a and the confidence interval is [s,t] where s < a, a more accurate confidence interval is [a,t], because the true answer cannot be smaller than the result on the sample.
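A minimal sketch of that adjustment, assuming a is the raw (unscaled) count of matching rows in the sample; the function name and the example numbers are hypothetical:

```python
def tighten_count_interval(sample_count, lower, upper):
    """Raise the lower bound of a count interval to the count seen on the sample.

    Assumption: sample_count is the unscaled count on the sample. The sample
    is a subset of the full table, so the true count is at least that large.
    """
    return max(lower, sample_count), upper


# Sample count a = 120 with estimated interval [s, t] = [95.0, 1400.0]
print(tighten_count_interval(120, 95.0, 1400.0))
```

When s is already at least a, the interval is returned unchanged.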
good for testing and experiments
maybe cache results?
takes a little bit more space but we can avoid the join and maybe extra groupby
Maybe separate their schemas?
Currently, one can see the DBMS password by:
GET hive.password;
Don't show "impala is not supported by Verdict"
Add seed argument to UDAs + add scale factor argument to conf-int
change output type of AVG + add seed argument for all UDAs + add scale factor argument to conf-int
Consider extending the SQL syntax itself to support options such as sample size, etc.
reason: poisson() is not recalculated for each row
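One reading of this note is that a Poisson-weighted bootstrap needs an independent weight for every row. The sketch below illustrates that reading only; the function names are hypothetical, and the Poisson(1) sampler is a plain-Python stand-in rather than Verdict's poisson(). If the weight is drawn once and shared, each resampled sum is just a multiple of the original sum, so the resamples say nothing about how the rows vary.

```python
import math
import random


def poisson1(rng):
    """Draw from Poisson(1) using Knuth's multiplication method."""
    limit = math.exp(-1.0)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def resampled_sums(values, n_resamples, per_row, seed=0):
    """Weighted sums for n_resamples Poisson-bootstrap resamples."""
    rng = random.Random(seed)
    sums = []
    for _ in range(n_resamples):
        if per_row:
            # Intended behavior: a fresh weight for every row
            weights = [poisson1(rng) for _ in values]
        else:
            # Behavior described in the note: one draw reused for all rows
            shared = poisson1(rng)
            weights = [shared] * len(values)
        sums.append(sum(w * v for w, v in zip(weights, values)))
    return sums


values = [1, 2, 3, 4, 5]
shared_sums = resampled_sums(values, 200, per_row=False)
per_row_sums = resampled_sums(values, 200, per_row=True)
print(sorted(set(shared_sums)))
```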
Maybe select the biggest one
Add user manual and tutorial.
Impala and SparkSQL support caching
Example:
2*sum(x)
sum(x)+count(y)