
mimir's People

Contributors

anandsan1, bizentass, codingsage, hube5462, kyunghoj, legacy25, michaelkulbacki, mikebrachmann, mrb24, nickcellino, okennedy, shivang94, snehakrishnamurthy, sophieyoung717, willspoth


mimir's Issues

Bug in join over union

select * from (SELECT * FROM Matched UNION SELECT * FROM typedratings1) ratings, product where product.pid = ratings.pid

Regression: The percolator should handle full-nondeterministic join conflicts

[info] x handle full-nondeterministic join conflicts
[error]    'PROJECT[A1 <= R_A, B1 <= R_B, N <= {{ test_0[] }}, A2 <= R_A, B2 <= R_B, M <= {{ test_1[] }}](
[error]      SELECT[ (R_A=R_A) ](
[error]        JOIN(
[error]          PROJECT[__LHS_ROWID <= ROWID](
[error]            R(ROWID:int)
[error]          ),
[error]          PROJECT[__RHS_ROWID <= ROWID](
[error]            R(ROWID:int)
[error]          )
[error]        )
[error]      )
[error]    )'
[error]     is not equal to 
[error]    'PROJECT[A1 <= __LHS_R_A, B1 <= __LHS_R_B, N <= {{ test_0[] }}, A2 <= __RHS_R_A, B2 <= __RHS_R_B, M <= {{ test_1[] }}](
[error]      SELECT[ (__LHS_R_A=__RHS_R_A) ](
[error]        JOIN(
[error]          PROJECT[__LHS_R_A <= R_A, __LHS_R_B <= R_B, __LHS_R_C <= R_C, __LHS_ROWID <= ROWID](
[error]            R(ROWID:int)
[error]          ),
[error]          PROJECT[__RHS_R_A <= R_A, __RHS_R_B <= R_B, __RHS_R_C <= R_C, __RHS_ROWID <= ROWID](
[error]            R(ROWID:int)
[error]          )
[error]        )
[error]      )
[error]    )' (CompilerSpec.scala:211)
[error] Expected: ...OJECTA1...= [__LHS_]R_...= [__LHS_]R_..._0[] }...= [__]R[HS]_[R_]A,...= [__]R[HS]_[R_]B,..._1[] }}
[error] ...ELECT ([__LHS_]R_A=[__]R[HS]_[R_]A) 
[error] ...JOIN(
[error] ...OJECT__..._R[_A <= R_A, __LHS_R_B <= R_B, __LHS_R_C <= R_C, __LHS_R]OWID ...
[error] ...:int)
[error] ...   ),
[error] ...OJECT__..._R[_A <= R_A, __RHS_R_B <= R_B, __RHS_R_C <= R_C, __RHS_R]OWID ...
[error] ...:int)
[error] ...    )
[error]     )
[error]   )
[error] )
[error] Actual:   ...OJECTA1...= []R_...= []R_..._0[] }...= []R[]_[]A,...= []R[]_[]B,..._1[] }}
[error] ...ELECT ([]R_A=[]R[]_[]A) 
[error] ...JOIN(
[error] ...OJECT__..._R[]OWID ...
[error] ...:int)
[error] ...   ),
[error] ...OJECT__..._R[]OWID ...
[error] ...:int)
[error] ...    )
[error]     )
[error]   )
[error] )
[info] 

Query flow diagram

A GITFlow-style diagram of the query currently being displayed in the web view.

Database.getVGTerms should retrieve row-specific VG Terms

Consider the following expression:

CASE WHEN X IS NULL THEN {{foo}} ELSE X END

ResultIterator.isDeterministic(...) returns false for this expression only when X is in fact null. getVGTerms should follow suit. In fact, this may be better implemented as a method on ResultIterator rather than on Database.

The simple way to implement this would be to use Eval.inline() to assign all of the Column() values and then emit the VGTerms remaining in the reduced expression.
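
The inline-then-collect idea can be sketched roughly as follows. This is only an illustration: the expression classes, `inline`, and `rowVGTerms` below are hypothetical stand-ins written for this example, not Mimir's actual Expression or Eval API.

```scala
// Hypothetical, simplified expression tree; Mimir's real classes differ.
object RowVGTerms {
  sealed trait Expr
  case class Column(name: String) extends Expr
  case class Const(value: Any) extends Expr
  case object NullConst extends Expr
  case class VGTerm(model: String) extends Expr
  case class IsNull(child: Expr) extends Expr
  case class CaseWhen(cond: Expr, thenE: Expr, elseE: Expr) extends Expr

  // Substitute the row's values for columns and fold whatever becomes constant.
  def inline(e: Expr, row: Map[String, Expr]): Expr = e match {
    case Column(n) => row.getOrElse(n, e)
    case IsNull(c) => inline(c, row) match {
      case NullConst => Const(true)
      case Const(_)  => Const(false)
      case other     => IsNull(other)
    }
    case CaseWhen(c, t, el) => inline(c, row) match {
      case Const(true)  => inline(t, row)   // only the THEN branch survives
      case Const(false) => inline(el, row)  // only the ELSE branch survives
      case other        => CaseWhen(other, inline(t, row), inline(el, row))
    }
    case _ => e
  }

  // Collect the names of all VG terms left in an expression.
  def vgTerms(e: Expr): Set[String] = e match {
    case VGTerm(m)          => Set(m)
    case IsNull(c)          => vgTerms(c)
    case CaseWhen(c, t, el) => vgTerms(c) ++ vgTerms(t) ++ vgTerms(el)
    case _                  => Set.empty
  }

  // Row-specific VG terms: reduce first, then collect.
  def rowVGTerms(e: Expr, row: Map[String, Expr]): Set[String] =
    vgTerms(inline(e, row))
}
```

With the CASE expression above, a row where X is null yields the single term foo, while a row with a concrete X yields no terms, which matches the behavior described for isDeterministic.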

Case sensitivity in lens names

Lens type definitions should not be case sensitive. Right now, these two statements behave differently:

create lens x as select * from ratings2 with SCHEMA_MATCHING (PID string, RATING float, REVIEW_COUNT float);
create lens x as select * from ratings2 with schema_matching (PID string, RATING float, REVIEW_COUNT float);
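
One possible fix is to normalize the lens type name before looking it up. The sketch below assumes a simple set of known lens types; the `LensTypes` object and its `normalize` method are hypothetical and do not reflect how LensManager actually resolves types.

```scala
import java.util.Locale

// Hypothetical registry of lens types, for illustration only.
object LensTypes {
  private val known = Set("SCHEMA_MATCHING", "MISSING_VALUE", "TYPE_INFERENCE")

  // Upper-case the user-supplied name so lookups ignore case.
  def normalize(name: String): Option[String] = {
    val upper = name.trim.toUpperCase(Locale.ROOT)
    if (known.contains(upper)) Some(upper) else None
  }
}
```

Using Locale.ROOT keeps the conversion independent of the JVM's default locale.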

Lens builder menu

Add a menu to simplify building lenses

  • build lens 'AS' the current query
  • build lens 'WITH' based on some user-inputs to a dialogue box

Regression: Could not create an instance of SqlLoaderSpec

[error] Could not create an instance of mimir.ctables.SqlLoaderSpec
[error]   caused by java.lang.Exception: Can't find a constructor for class mimir.ctables.SqlLoaderSpec
[error]   org.specs2.reflect.Classes$class.tryToCreateObjectEither(Classes.scala:96)
[error]   org.specs2.reflect.Classes$.tryToCreateObjectEither(Classes.scala:207)
[error]   org.specs2.specification.SpecificationStructure$$anonfun$createSpecificationEither$2.apply(BaseSpecification.scala:119)
[error]   org.specs2.specification.SpecificationStructure$$anonfun$createSpecificationEither$2.apply(BaseSpecification.scala:119)
[error]   scala.Option.getOrElse(Option.scala:120)
[error]   org.specs2.specification.SpecificationStructure$.createSpecificationEither(BaseSpecification.scala:119)
[error]   org.specs2.runner.SbtRunner.org$specs2$runner$SbtRunner$$specificationRun(SbtRunner.scala:73)
[error]   org.specs2.runner.SbtRunner$$anonfun$newTask$1$$anon$5.execute(SbtRunner.scala:59)
[error]   sbt.ForkMain$Run$2.call(ForkMain.java:294)
[error]   sbt.ForkMain$Run$2.call(ForkMain.java:284)
[error]   java.util.concurrent.FutureTask.run(FutureTask.java:266)
[error]   java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[error]   java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[error]   java.lang.Thread.run(Thread.java:744)

Create database directory

The root dir has a bunch of .db files piling up. These should get organized into a Databases directory.

Fuzzing Lens

A simplified form of the archival lens that simply adds user-specified Gaussian noise to any or all of its input columns.
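
As a rough illustration of the idea, the sketch below perturbs selected numeric columns of a row with zero-mean Gaussian noise. The `Fuzz` object, the row representation, and the per-column standard deviation map are assumptions made for this example, not part of Mimir.

```scala
import scala.util.Random

// Hypothetical helper: a row is modeled as column name -> numeric value.
object Fuzz {
  // Add zero-mean Gaussian noise to each column listed in `stddev`;
  // columns without an entry are passed through unchanged.
  def fuzzRow(row: Map[String, Double],
              stddev: Map[String, Double],
              rng: Random): Map[String, Double] =
    row.map { case (col, v) =>
      stddev.get(col) match {
        case Some(sd) => col -> (v + rng.nextGaussian() * sd)
        case None     => col -> v
      }
    }
}
```

Passing the random generator in explicitly makes the output reproducible for a fixed seed, which is convenient for tests.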

OperatorParser breaks with nested VGTerms

The TYPE_INFERENCE lens takes the form:

PROJECT[PID <= {{ TR1CAST_0[ROWID, {{ TR1INFER_0[] }}] }}, RATING <= {{ TR1CAST_1[ROWID, {{ TR1INFER_1[] }}] }}, REVIEW_CT <= {{ TR1CAST_2[ROWID, {{ TR1INFER_2[] }}] }}]( RATINGS1(...) )

It seems that passing a VGTerm as an argument to another VGTerm confuses the operator parser. The lens works on its own, but when you try to compose it with another lens, the lens.load() step fails. To reproduce the error, create a type_inference lens, create a missing_value lens on top of it, and then do any of the following: view a tooltip, create another lens on top of it, or simply run SELECT * FROM MIMIR_LENSES.

File import through Web interface

A CSV file import feature would be helpful, and could form the basis for some later features for log parsing.

The semantics I'd be looking to see are something along the lines of:

SELECT * INTO new_table FROM uploaded_csv_file

CREATE LENS fails after SELECT

Running any SELECT query and following it with a CREATE LENS:

SELECT * FROM sane_r;
CREATE LENS insane_r AS SELECT * FROM r WITH missing_value('C')

results in the following exception

java.sql.SQLException: [SQLITE_BUSY]  The database file is locked (database is locked)
    at org.sqlite.core.DB.newSQLException(DB.java:890)
    at org.sqlite.core.DB.newSQLException(DB.java:901)
    at org.sqlite.core.DB.execute(DB.java:807)
    at org.sqlite.jdbc3.JDBC3PreparedStatement.execute(JDBC3PreparedStatement.java:50)
    at mimir.sql.JDBCBackend.update(JDBCBackend.scala:56)
    at mimir.Database.update(Database.scala:96)
    at mimir.lenses.LensManager.save(LensManager.scala:71)
    at mimir.lenses.LensManager.create(LensManager.scala:67)

Possible bug in SqlToRA and RAToSql conversions

I was playing around with some tables for CSV import + Type Inference when I noticed that, with more than a few columns, the order of a table's columns gets scrambled. For example:

screenshot from 2015-07-22 20 45 11

Name is displayed under Married, Married under Joining, and Joining under Name.

This is because in line 219 of SqlToRA, the toMap is returning a HashMap, which is not preserving the order of columns. Consequently, in RAToSql, the mappings of the SelectItems are wrong.

screenshot from 2015-07-22 20 50 30

screenshot from 2015-07-22 20 56 04

ret has incorrectly ordered mappings above.

Should we correct this?
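
If it helps, here is a small standalone sketch of the ordering problem and one way around it. In Scala's standard library, an immutable Map built with toMap keeps insertion order only for up to four entries and becomes a HashMap beyond that, which fits the "more than a few columns" symptom. ListMap is swapped in here as one order-preserving alternative; keeping the original Seq of pairs would work too. The `ColumnOrder` object and the AGE, SALARY, and DEPT columns are made up for illustration.

```scala
import scala.collection.immutable.ListMap

object ColumnOrder {
  // Column name -> type, in the order the columns were declared.
  val columns: Seq[(String, String)] = Seq(
    "NAME" -> "string", "AGE" -> "int", "MARRIED" -> "string",
    "JOINING" -> "string", "SALARY" -> "float", "DEPT" -> "string"
  )

  // With more than four entries this is hash-based, so the iteration
  // order is not guaranteed to match the declaration order.
  val unordered: Map[String, String] = columns.toMap

  // ListMap iterates in insertion order, so the declaration order survives.
  val ordered: ListMap[String, String] = ListMap(columns: _*)
}
```

Both maps answer lookups identically; only iteration order differs, which is what matters when the result is zipped against the SelectItems.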

Regression: The percolator should handle row-ids correctly

[info] x handle row-ids correctly
[error]    'PROJECT[A <= R_A, C <= R_C, N <= {{ test_0[__LHS_ROWID, R_A] }}, S_C <= S_C, S_D <= S_D](
[error]      SELECT[ (R_C=S_C) ](
[error]        JOIN(
[error]          PROJECT[__LHS_ROWID <= ROWID, __LHS_ROWID <= ROWID, __LHS_ROWID <= ROWID](
[error]            R(ROWID:int // ROWID:rowid, ROWID:rowid)
[error]          ),
[error]          PROJECT[S_C <= S_C, S_D <= S_D](
[error]            S(S_C:int, S_D:decimal)
[error]          )
[error]        )
[error]      )
[error]    )'
[error]     is not equal to 
[error]    'PROJECT[A <= R_A, C <= R_C, N <= {{ test_0[__LHS_ROWID, R_A] }}, S_C <= S_C, S_D <= S_D](
[error]      SELECT[ (R_C=S_C) ](
[error]        JOIN(
[error]          PROJECT[R_A <= R_A, R_B <= R_B, R_C <= R_C, __LHS_ROWID <= ROWID](
[error]            R(ROWID:int)
[error]          ),
[error]          PROJECT[S_C <= S_C, S_D <= S_D](
[error]            S(S_C:int, S_D:decimal)
[error]          )
[error]        )
[error]      )
[error]    )' (CompilerSpec.scala:240)
[error] Expected: ...OJECTA ..._0[__LHS_ROWID, R_A] }}, ...
[error] ...ELECT[ (R_C=S_C) ](
[error] ...JOIN(
[error] ...OJECT[]R[_A] <= R[_A], [R]_[B <= R]_[B, R_C] <= R[_C], __L...
[error] ...D:int[])
[error] ...   ),
[error] ...OJECT[S_C <= S_C, S_D <= S_D](
[error] ...imal)
[error] ...    )
[error]     )
[error]   )
[error] )
[error] Actual:   ...OJECTA ..._0[__LHS_ROWID, R_A] }}, ...
[error] ...ELECT[ (R_C=S_C) ](
[error] ...JOIN(
[error] ...OJECT[__LHS_]R[OWID] <= R[OWID], []_[_LHS]_[ROWID] <= R[OWID], __L...
[error] ...D:int[ // ROWID:rowid, ROWID:rowid])
[error] ...   ),
[error] ...OJECT[S_C <= S_C, S_D <= S_D](
[error] ...imal)
[error] ...    )
[error]     )
[error]   )
[error] )

Add row-level explanation box

The explain box should have a Confidence (probability of the row's presence) and a list of var terms in the __MIMIR_CONDITION column.

Trouble uploading CSV files

[error] - play.core.server.netty.PlayDefaultUpstreamHandler - Cannot invoke the action
java.sql.SQLException: near ".": syntax error
    at org.sqlite.core.NativeDB.throwex(NativeDB.java:397) ~[sqlite-jdbc-3.8.7.jar:na]
    at org.sqlite.core.NativeDB._exec(Native Method) ~[sqlite-jdbc-3.8.7.jar:na]
    at org.sqlite.jdbc3.JDBC3Statement.executeUpdate(JDBC3Statement.java:116) ~[sqlite-jdbc-3.8.7.jar:na]
    at mimir.sql.JDBCBackend.update(JDBCBackend.scala:48) ~[classes/:na]
    at mimir.Database.update(Database.scala:94) ~[classes/:na]
    at mimir.Database.handleLoadTable(Database.scala:291) ~[classes/:na]
    at mimir.WebAPI.configure(WebAPI.scala:50) ~[classes/:na]
    at controllers.Application$$anonfun$loadTable$1$$anonfun$apply$1.apply(Application.scala:123) ~[classes/:na]
    at controllers.Application$$anonfun$loadTable$1$$anonfun$apply$1.apply(Application.scala:117) ~[classes/:na]

This issue occurs when uploading the file https://github.com/UBOdin/mimir/blob/master/test/data/CPUSpeed.csv

screen shot 2015-07-27 at 6 04 02 pm

Sampling

A Sample(Expr) operator that produces a sample from one possible world of evaluating the expression.

Type inference lens

Selects the type of each attribute based on the majority of values in the record. Allows for the possibility of errors in the type selection.
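
A minimal sketch of majority-vote type selection is shown below. The `TypeVote` object, the three candidate types, and the regular expressions are assumptions chosen for this example; the returned fraction is one simple way to expose the possibility of error, not necessarily how the lens models it.

```scala
// Hypothetical majority-vote type inference over one column's raw values.
object TypeVote {
  // Guess the type of a single raw value.
  def guess(v: String): String =
    if (v.matches("-?\\d+")) "int"
    else if (v.matches("-?\\d*\\.\\d+")) "float"
    else "string"

  // Return the most common guessed type and the fraction of values that
  // agree with it. An empty column falls back to ("string", 0.0).
  // Ties are broken arbitrarily.
  def infer(values: Seq[String]): (String, Double) =
    if (values.isEmpty) ("string", 0.0)
    else {
      val counts = values.groupBy(guess).map { case (t, vs) => t -> vs.size }
      val (best, n) = counts.maxBy(_._2)
      (best, n.toDouble / values.size)
    }
}
```

For example, a column holding "1", "2", "x", "3" would be typed as int with 75% agreement, leaving "x" as the value most likely to be an error.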
