uwdb / cosette Goto Github PK
View Code? Open in Web Editor NEWCosette is an automated SQL solver.
License: BSD 2-Clause "Simplified" License
Cosette is an automated SQL solver.
License: BSD 2-Clause "Simplified" License
This generated an "Invalid Coq generated" message and told me to create an issue.
schema sch_flights(fid:int,
year:int,
month_id:string,
day_of_month:int,
day_of_week_id:int,
carrier_id:int,
flight_num:int,
origin_city:string,
origin_state:int,
dest_city:string,
dest_state:int,
departure_delay:int,
taxi_out:int,
arrival_delay:int,
canceled:int,
actual_time:int,
distance:int
);
schema sch_carriers
(
cid:int,
name:int
);
SCHEMA sch_months
(
mid:int,
month:int
);
SCHEMA sch_days
(
did:int,
day_of_week:int
);
table Flights(sch_flights);
table Carriers(sch_carriers);
table Weekdays(sch_days);
table Months(sch_months);
query q1 `SELECT C.name AS name,
F1.flight_num AS f1_flight_num,
F1.origin_city AS f1_origin_city,
F1.dest_city AS f1_dest_city,
F1.actual_time AS f1_actual_time,
F2.flight_num AS f2_flight_num,
F2.origin_city AS f2_origin_city,
F2.dest_city AS f2_dest_city,
F2.actual_time AS f2_actual_time,
(F1.actual_time + F2.actual_time) AS actual_time
FROM FLIGHTS F1, FLIGHTS F2
JOIN MONTHS M
ON F1.month_id = M.mid AND F2.month_id = M.mid
JOIN CARRIERS C
ON F1.carrier_id = C.cid AND F2.carrier_id = C.cid
WHERE M.month = 'July'
AND F1.day_of_month = 15
AND F2.day_of_month = 15
AND F1.year = 2015
AND F2.year = 2015
AND F1.origin_city = 'Seattle WA'
AND F2.dest_city = 'Boston MA'
AND F1.dest_city = F2.origin_city
AND (F1.actual_time + F2.actual_time) < 7 * 60
AND F1.carrier_id = F2.carrier_id`;
query q2 -- define query q2 likewise
`SELECT C.name AS name,
F1.flight_num AS f1_flight_num,
F1.origin_city AS f1_origin_city,
F1.dest_city AS f1_dest_city,
F1.actual_time AS f1_actual_time,
F2.flight_num AS f2_flight_num,
F2.origin_city AS f2_origin_city,
F2.dest_city AS f2_dest_city,
F2.actual_time AS f2_actual_time,
(F1.actual_time + F2.actual_time) AS actual_time
FROM FLIGHTS F1, FLIGHTS F2
JOIN MONTHS M
ON F1.month_id = M.mid AND F2.month_id = M.mid
JOIN CARRIERS C
ON F1.carrier_id = C.cid AND F2.carrier_id = C.cid
WHERE M.month = 'July'
AND F1.day_of_month = 15
AND F2.day_of_month = 15
AND F1.year = 2015
AND F2.year = 2015
AND F1.origin_city = 'Seattle WA'
AND F2.dest_city = 'Boston MA'
AND F1.dest_city = F2.origin_city
AND (F1.actual_time + F2.actual_time) < 7 * 60
AND F1.carrier_id = F2.carrier_id`;
verify q1 q2; -- does q1 equal to q2?
Hey! Could we get these features added? This is a query found when grading CSEP 544, HW3.
select distinct F2.dest_city as city
from Flights F1, Flights F2
where F1.origin_city = 'Seattle WA' and
F1.dest_city = F2.origin_city and F2.dest_city != F1.origin_city and
F2.dest_city not in (select F.dest_city as dest_city
from Flights F
where F.origin_city = 'Seattle WA');
The following queries are not known as equivalent, even though they are exactly the same. Maybe you could try sorting the x = y and y = x, such that they are all x = y or something. Either way, I feel like it should know that these are equivalent statements.
schema sch_flights(fid:int,
year:int,
month_id:int,
day_of_month:int,
day_of_week_id:int,
carrier_id:string,
flight_num:int,
origin_city:string,
origin_state:string,
dest_city:string,
dest_state:string,
departure_delay:int,
taxi_out:int,
arrival_delay:int,
canceled:int,
actual_time:int,
distance:int,
capacity:int,
price:int
);
schema sch_carriers
(
cid:int,
name:string
);
SCHEMA sch_months
(
mid:int,
month:string
);
SCHEMA sch_days
(
did:int,
day_of_week:string
);
table Flights(sch_flights);
table Carriers(sch_carriers);
table Weekdays(sch_days);
table Months(sch_months);
query q1 -- define query q1 on tables a and b
` select C.name as name, sum(F.departure_delay) as delay
from FLIGHTS as F, CARRIERS as C
where C.cid=F.carrier_id
group by F.carrier_id`;
query q2 -- define query q2 likewise
`select c.name as name, sum(f.departure_delay) as delay
from FLIGHTS as f, CARRIERS as c
where f.carrier_id = c.cid
group by f.carrier_id`;
verify q1 q2; -- does q1 equal to q2?
We should be able to find the Coq proof for the following query. Essentially A UNION ALL EMPTY = A. There are various ways of implementing that, though.
schema s1(x:int);
table a(s1); -- define table a using schema s1
query q1 -- define query q1 on tables a and b
`select * from a x UNION ALL
select * from a x where false`;
query q2 -- define query q2 likewise
`select * from a x`;
verify q1 q2; -- does q1 equal to q2?
Change the parser so that SELECT *
will be converted to SELECT (VAL "t1.a" ... ) ...
in rosette AST.
-- define schema r with attributes a and b
schema s(a:int, b:int);
table r(s); -- define table r using schema s
query q1 -- define query q1
`select sum(1+x.b)
from r x`;
query q2 -- define query q2 likewise
`select sum(1+y.b)
from r y`;
verify q1 q2; -- does q1 equal to q2?
generate incorrect coq code.
Can be supported as an uninterpreted function in Coq and interpreted in Rosette.
Other Features and Limitations. The COSETTE constraints generator currently supports most standard SQL features. Besides the features mentioned above, EXISTS, IN, and LEFT OUTER JOIN are also supported. Currently COSETTE does not support string operations (e.g., LIKE) since COSETTE currently does not model strings as character arrays, along with ORDER BY.
The paper says Cosette supports EXISTS and IN, but it doesn't.
For example, Cosette complains an example code in the github repositiory has a syntax error.
example code: https://raw.githubusercontent.com/uwdb/Cosette/master/examples/calcite/testDecorrelateTwoIn.cos
tested under Cosette online demo and Cosette source code.
Scenario:
Expected:
hott module for Coq should complete building successfully
What happens:
make[1]: Entering directory `/pstore/home/adaszews/Cosette/hott/.build'
COQC library/UnivalentSemantics.v
COQC .build_solve/library/UnivalentSemantics.v
File "./library/UnivalentSemantics.v", line 10, characters 2-57:
Error:
Unable to satisfy the following constraints:
In environment:
T : type
?Denotation : "Denotation type Type"
The following query will cause an internal error.
schema s(??);
table r(s);
table s(s);
predicate b(s);
query q1
`(select * from r x where b(x)) union all
(select * from s y where b(y))`;
query q2
`select *
from ( r union all s) x
where b(x)`;
verify q1 q2;
According to SQL standard, table names and names of attributes should name insensitive.
I notice that all test cases in rosette are only checking equivalence of two queries. But the purpose of using rosette is to find counterexamples that can proof the inequality of two queries, am I right?
Could you specify that how to use the given Rosette codes to generate counterexample?
Hey guys, are there any plans to work on CTE (common table expressions) support?
They are a part of SQL standard, supported by most popular RDBM and their usage is widely considered to be best practice when there's a need for multiple (potentially nested) subquieries.
Queries:
Cosette Program:
schema s1(x:int);
table a(s1); -- define table a using schema s1
query q1
`select count(*) as ax from a x`;
query q2
`select * from a x`;
verify q1 q2; -- does q1 equal to q2?
the solver returns
Coq Error
Internal Error (to rosette).
CoqCodeGen: ToHoTTSQL.lhs:(450,5)-(456,66): Non-exhaustive patterns in function toCoq
Cosette Program:
schema s1(x:int);
table a(s1); -- define table a using schema s1
query q1
`select count(*) as ax from
(select x.x as ax from a x) x1`;
query q2
`select * from a x`;
verify q1 q2; -- does q1 equal to q2?
The solver returns
Coq Error
Syntax Error.
ERROR: Error in elimSatr (ToHoTTSQL.lhs).
e.g., it still says 'We will release the Rosette code later.'
We want to enable Cosette to handle the following new forms of aggregation query forms:
An demonic example is shown below.
Select a, max(c), b + 1, count(*) + 2, min(a + b)
From t
Group by a, b
Having min(d - a) > 2
schema s1(id:ty0, ??);
table Products(s1);
table Inventory(s1);
query q1
`
SELECT id as p_id
FROM Products
EXCEPT
SELECT id as i_id
FROM Inventory
`;
verify q1 q1;
The above query will cause a syntax error on the id
in the second SELECT
line.
The test case rosette/cidr-benchmarks/subquery-test.rkt seems running forever on my computer. What are your expected run time and memory limitation for this test case?
to avoid duplicates
Such as:
SELECT t.a, t.b+t.c FROM t WHERE t.b > 10
Does Cosette support such kinds of queries?
With this query,
schema s1(user_id:ty0, parent_id:ty0, ??);
table users(s1);
query q1
`SELECT user.user_id as id
FROM users u
JOIN users AS p
ON p.user_id = u.parent_id
WHERE u.user_id = u.user_id`;
query q2
`SELECT user.user_id as id
FROM users u
JOIN users AS p
ON p.user_id = u.parent_id
WHERE u.user_id = u.user_id`;
verify q1 q2;
Cosette returns a syntax error pointing to users
on the line of the the JOIN
.
Currently the rosette backhand has support for Left Outer Join but the front end does not. We need to add Left-Outer-Join support in the parser.
Automated test Integration is needed.
I would expect the following queries to be comparable, but alas I think NOT EXISTS does not parse.
schema sch_flights(fid:int,
year:int,
month_id:int,
day_of_month:int,
day_of_week_id:int,
carrier_id:int,
flight_num:int,
origin_city:int,
origin_state:int,
dest_city:int,
dest_state:int,
departure_delay:int,
taxi_out:int,
arrival_delay:int,
canceled:int,
actual_time:int,
distance:int
);
schema sch_carriers
(
cid:int,
name:int
);
SCHEMA sch_months
(
mid:int,
month:int
);
SCHEMA sch_days
(
did:int,
day_of_week:int
);
table Flights(sch_flights);
table Carriers(sch_carriers);
table Weekdays(sch_days);
table Months(sch_months);
query q1 `select distinct F1.origin_city as orig from Flights F1
where not exists( select * from Flights F2
where F2.actual_time >= 180
and F2.origin_city = F1.origin_city )`;
query q2 -- define query q2 likewise
`select distinct F1.origin_city as orig
from Flights F1
where not exists( select * from Flights F2
where F2.actual_time >= 180
and F2.origin_city = F1.origin_city )`;
verify q1 q2; -- does q1 equal to q2?
Since Rosette updated to Racket 6.9 with a few bug fixes.
We should be able to support strings as literals. This wouldn't require supporting functions on strings.
Expect faster SMT solving :)
Can you add one? :)
I run the command:
emacs /docker:cosette:/hott/UnivalentSemantics.v
as exactly as being written in your README file, and got the following error:
Host cosette' looks like a remote host,
docker' can only use the local host
3 front end bugs found, all related to handling * and aggregate:
schema s1(a:int, b:int, ??);
schema s2(a:int, b:int, ??); -- define schema s2
table r1(s1); -- define table a using schema s1
table r2(s2); -- define table b using schema s1
query q1 -- define query q1 on tables a and b
`select x.b as xb, y.b as yb
from r1 x, r2 y`;
query q2 -- define query q2 likewise
`select x1.xb as xb, y1.yb as yb
from (select x.b as xb from r1 x) x1,
(select y.b as yb from r2 y) y1`;
verify q1 q2; -- does q1 equal to q2?
The solver returns
Coq Error
ERROR: Error in elimSatr (ToHoTTSQL.lhs).
"prod_heuristic1" and "deductive_proof" aren't very descriptive names. Maybe put the tactic definition file somewhere and put a link to it from the demo results?
To support some rewrites, we want to support symbolic predicate as an uninterpreted function in Rosette explicitly.
Add new syntax in Rosette code
parser support
We need compile constants in user submitted queries to Coq. Currently we can assume all the constants are integers.
This currently returns a racket error:
schema s1(x:int, ya:int);
schema s2(yb:int, yc: int); -- define schema s2
table a(s1); -- define table a using schema s1
table b(s2); -- define table b using schema s1
query q1 -- define query q1 on tables a and b
`select distinct A.x as ax from a A
where NOT EXISTS (select * from b B
where (B.yb < A.ya) or (B.yb>A.ya) )`;
query q2 -- define query q2 likewise
`select distinct A.x as ax from a A
where
((select count(B.yb) from b B where B.yb=A.ya) =
(select count(B.yb) from b B))`;
verify q1 q2; -- does q1 equal to q2?
{"size":[1],"status":"UNSAT"}
$=: expected real? arguments
arguments: '(#<void>)
context...:
/root/.racket/6.8/pkgs/rosette/rosette/base/core/real.rkt:160:10
/root/.racket/6.8/pkgs/rosette/rosette/base/form/control.rkt:31:25
/root/.racket/6.8/pkgs/rosette/rosette/base/adt/list.rkt:119:9: @for-each
/Cosette-Web/backend/Cosette/rosette/cosette.rkt:185:0: solve-queries-symbreak
I thought this should work? Modified from https://www.reddit.com/r/programming/comments/6oektc/introducing_cosette_a_sql_solver_for/dkhc92b/
/* define schema s1,
here s1 can contain any number of attributes,
but it has to at least contain integer attributes
x and y */
schema s1(x:int, ya:int, ??);
schema s2(yb:int, ??); -- define schema s2
table a(s1); -- define table a using schema s1
table b(s2); -- define table b using schema s1
query q1 -- define query q1 on tables a and b
`select distinct x.x as ax from a x, b y
where x.ya = y.yb UNION ALL
select distinct x.x as ax from a x where 1=2`;
query q2 -- define query q2 likewise
`select distinct x.x as ax from a x, a y, b z
where x.x = y.x and x.ya = z.yb`;
verify q1 q2; -- does q1 equal to q2?
Increasing size of symbolic relations in Rosette.
The rosette program that the parser produced only need to provide query and input schemas.
/* define schema s1,
here s1 can contain any number of attributes,
but it has to at least contain integer attributes
x and y */
schema s1(x:int, ya:int, yb:int, ??);
schema s2(yb:int, ??); -- define schema s2
table a(s1); -- define table a using schema s1
table b(s2); -- define table b using schema s1
query q1 -- define query q1 on tables a and b
`select distinct x.x as ax from a x, b y
where x.ya = y.yb`;
query q2 -- define query q2 likewise
`select distinct x.x as ax from a x, a y, b z
where x.x = y.x`;
verify q1 q2; -- does q1 equal to q2?
The above cosette program leads to invalid coq code generation. The reason is that s1
and s2
both have a column named yb
. A better handling of generated column names should be done.
When I ran cosette to check query that uses EXISTS in its WHERE caluse, cosette freezes. A similar query using IN in its WHERE clause works perfectly. Query 1 below works fine but query 2 freezes
schema sch_flights(fid:int,
year:int,
month_id:int,
day_of_month:int,
day_of_week_id:int,
carrier_id:string,
flight_num:int,
origin_city:string,
origin_state:string,
dest_city:string,
dest_state:string,
departure_delay:int,
taxi_out:int,
arrival_delay:int,
canceled:int,
actual_time:int,
distance:int,
capacity:int,
price:int
);
schema sch_carriers
(
cid:int,
name:string
);
SCHEMA sch_months
(
mid:int,
month:string
);
SCHEMA sch_weekdays
(
did:int,
day_of_week:string
);
table Flights(sch_flights);
table Carriers(sch_carriers);
table Weekdays(sch_weekdays);
table Months(sch_months);
query q1 -- define query q1 on tables a and b
select distinct C.name as carrier
from Carriers C
where C.cid in (select F.carrier_id as carrier_id
from Flights F
where F.origin_city = 'Seattle WA' and
F.dest_city = 'San Francisco CA' and
F.carrier_id = C.cid);
query q2 -- define query q2 likewise
SELECT DISTINCT c.name AS carrier
FROM Carriers AS c
WHERE EXISTS ( SELECT *
FROM flights AS f
WHERE f.carrier_id = c.cid
AND f.origin_city = 'Seattle WA'
AND f.dest_city = 'San Francisco CA');
/* define schema s1,
here s1 can contain any number of attributes,
but it has to at least contain integer attributes
x and y */
schema s1(x:int, y:int, ??);
table a(s1); -- table a of schema s1
query q1 -- query 1
select x.x as ax from a x where 1=1
;
query q2 -- query 2
select x.x as ax from a x where 0=0
;
verify q1 q2; -- verify the equivalence
cosette-solve
can only return counter example on specific tables but not symbolic function.
From https://news.ycombinator.com/item?id=14810085#14811708
/* define schema s1, here s1 can contain any number of attributes, but it has to at least contain integer attributes x and y */ schema s1(x:int, ya:int, ??);
schema s2(yb:int, ??); -- define schema s2
table a(s1); -- define table a using schema s1
table b(s2); -- define table b using schema s1
query q1 -- define query q1 on tables a and b
`select distinct x.x as ax
from a x, b y
where x.ya = y.yb`;
query q2 -- define query q2 likewise
`(select x.x as ax
from a x, a y, b z
where x.x = y.x and x.ya = z.yb)
union all (select 1 as ax from a x where 1 = 0)`;
verify q1 q2; -- does q1 equal to q2?
Whenever we generate a counter example for two queries q1 and q2, we should generate a sqlite3 scripts and execute it on a sqlite engine -- to test whether the generated counter example is indeed a counter example.
The purpose is to test our sql interpreter (as it may contain bugs).
Some AST transformation is common to both rosette and coq path, e.g. converting group by to subquery.
They should put into a single place.
Currently, two transformation could be done:
TableRef
in From to a binary tree of XProdA declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.