Comments (6)
Hi @fisher-liquid,
Thank you for filing the report. I took a look at this and I think this is expected behavior, it's just that the result size is extremely large. Here are the logical plans for the two queries you mentioned
original query
┌─────────────────────────────┐
│┌───────────────────────────┐│
││ Physical Plan ││
│└───────────────────────────┘│
└─────────────────────────────┘
┌───────────────────────────┐
│ PROJECTION │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ item_id │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ RIGHT_DELIM_JOIN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ INNER │
│delim_index IS NOT DISTINCT├───────────────────────────────────────────┐
│ FROM delim_index │ │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │
│ EC: 10364 │ │
└─────────────┬─────────────┘ │
┌─────────────┴─────────────┐ ┌─────────────┴─────────────┐
│ PROJECTION │ │ HASH_JOIN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ #0 │ │ INNER │
│ #1 │ │delim_index IS NOT DISTINCT├───────────────────────────────────────────┐
│ #2 │ │ FROM delim_index │ │
│ #3 │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │
│ │ │ EC: 10364 │ │
└─────────────┬─────────────┘ └─────────────┬─────────────┘ │
┌─────────────┴─────────────┐ ┌─────────────┴─────────────┐ ┌─────────────┴─────────────┐
│ STREAMING_WINDOW │ │ HASH_JOIN │ │ DUMMY_SCAN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ │
│ delim_index │ │ INNER │ │ │
│ │ │ item_id = item_id ├──────────────┐ │ │
│ │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ │ │
│ │ │ EC: 1 │ │ │ │
└─────────────┬─────────────┘ └─────────────┬─────────────┘ │ └───────────────────────────┘
┌─────────────┴─────────────┐ ┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│ HASH_JOIN │ │ SEQ_SCAN ││ INOUT_FUNCTION │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ │
│ INNER │ │ t3 ││ │
│ id = id │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ├──────────────┐ │ item_id ││ │
│ Build Min: 0 │ │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ │
│ Build Max: 9999 │ │ │ EC: 750000 ││ │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ │ ││ │
│ EC: 10364 │ │ │ ││ │
└─────────────┬─────────────┘ │ └───────────────────────────┘└─────────────┬─────────────┘
┌─────────────┴─────────────┐┌─────────────┴─────────────┐ ┌─────────────┴─────────────┐
│ SEQ_SCAN ││ SEQ_SCAN │ │ PROJECTION │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ t2 ││ t1 │ │ COALESCE(item_ids, []) │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ delim_index │
│ id ││ id │ │ item_ids │
│ item_ids ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ EC: 10000 │ │ │
│ EC: 10000 ││ │ │ │
└───────────────────────────┘└───────────────────────────┘ └─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ DELIM_SCAN │
└───────────────────────────┘
CTE
┌─────────────────────────────┐
│┌───────────────────────────┐│
││ Physical Plan ││
│└───────────────────────────┘│
└─────────────────────────────┘
┌───────────────────────────┐
│ PROJECTION │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ item_id │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ HASH_JOIN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ INNER │
│ item_id = item_id ├──────────────┐
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │
│ EC: 11229 │ │
└─────────────┬─────────────┘ │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│ SEQ_SCAN ││ HASH_JOIN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ t3 ││ INNER │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ id = id │
│ item_id ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ├──────────────┐
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ Build Min: 0 │ │
│ EC: 750000 ││ Build Max: 9999 │ │
│ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │
│ ││ EC: 10364 │ │
└───────────────────────────┘└─────────────┬─────────────┘ │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│ PROJECTION ││ SEQ_SCAN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ id ││ t1 │
│ item_id ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ ││ id │
│ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ ││ EC: 10000 │
└─────────────┬─────────────┘└───────────────────────────┘
┌─────────────┴─────────────┐
│ UNNEST │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ SEQ_SCAN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ t2 │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ id │
│ item_ids │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ EC: 10000 │
└───────────────────────────┘
Note that the CTE is actually not the same query since there is an extra join condition. This is the main reason the CTE finished so fast (more on that later). An equivalent query with a CTE would looks like the following
explain with id_item as (
select id, unnest(item_ids) as item_id
from t2
)
SELECT *
FROM t1
INNER JOIN t2 on t2.id = t1.id,
id_item
INNER JOIN t3 on t3.item_id = id_item.item_id;
The query plan for this query is
┌─────────────────────────────┐
│┌───────────────────────────┐│
││ Physical Plan ││
│└───────────────────────────┘│
└─────────────────────────────┘
┌───────────────────────────┐
│ PROJECTION │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ id │
│ id │
│ item_ids │
│ id │
│ item_id │
│ item_id │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ CROSS_PRODUCT ├───────────────────────────────────────────┐
└─────────────┬─────────────┘ │
┌─────────────┴─────────────┐ ┌─────────────┴─────────────┐
│ HASH_JOIN │ │ HASH_JOIN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ INNER │ │ INNER │
│ item_id = item_id │ │ id = id │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ├──────────────┐ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ├──────────────┐
│ EC: 10834 │ │ │ Build Min: 0 │ │
│ │ │ │ Build Max: 9999 │ │
│ │ │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │
│ │ │ │ EC: 10364 │ │
└─────────────┬─────────────┘ │ └─────────────┬─────────────┘ │
┌─────────────┴─────────────┐┌─────────────┴─────────────┐┌─────────────┴─────────────┐┌─────────────┴─────────────┐
│ SEQ_SCAN ││ PROJECTION ││ SEQ_SCAN ││ SEQ_SCAN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ t3 ││ id ││ t2 ││ t1 │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ item_id ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ item_id ││ ││ id ││ id │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ ││ item_ids ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ EC: 750000 ││ ││ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ││ EC: 10000 │
│ ││ ││ EC: 10000 ││ │
└───────────────────────────┘└─────────────┬─────────────┘└───────────────────────────┘└───────────────────────────┘
┌─────────────┴─────────────┐
│ UNNEST │
└─────────────┬─────────────┘
┌─────────────┴─────────────┐
│ SEQ_SCAN │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ t2 │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ id │
│ item_ids │
│ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │
│ EC: 10000 │
└───────────────────────────┘
This query also hangs and does not finish (or will get killed by the OS).
I decided to make intermediate tables from the left and right subtrees before the cross produce with the following queries. This way I could estimate the size of the cross product myself.
create table cp_right as (SELECT * FROM t1 INNER JOIN t2 on t2.id = t1.id);
create table cp_left as with id_item as (select id, unnest(item_ids) as item_id from t2) SELECT * from id_item INNER JOIN t3 on t3.item_id = id_item.item_id;
select count(*) from cp_right;
-- 10000
select count(*) from cp_left;
-- 20000000
This means the last cross product will end up producing 200,000,000,000 results. If item_id is just a 32 bit integer, the result set has a size of 800 gigabytes. Using the CTE with the extra join condition, only the size of the left table is returned as cp_right
only has distinct item_id values, so only 20,000,000 results, which comes out to 0.08 Gigabytes.
Let me know if this helps. If you still have issues with a use case surrounding this, let me know 👍
from duckdb.
@Tmonster The query plans should be equivalent though. They might not be equal, but they should produce equal results; they do in PostgreSQL at least.
See
https://www.db-fiddle.com/f/4xiYbrbGgbCsbGT4dJzHmB/0
where the two queries from the OP produce equal results, and only your CTE produces the "big" result.
In summary, it seems to me the issue reported by the OP is indeed not a performance issue but, worse, a parsing/compilation issue.
from duckdb.
Ok yea, so this is an issue with our Parser most likely. For some reason DuckDB detects a correlated subquery when in reality this is not a correlated subquery.
It will be fixed, but not before v1.0.0 I'm afraid. If you decide to stick to CTE's instead of the subquery solution mentioned in (#11820 (comment)), you can actually run the CTE as a materialized CTE. This way it is a run once, use multiple times
I would leave this issue open though, that way I can remember to come back to it.
from duckdb.
Thanks for the detailed write up! I'll stick to unnesting arrays in CTEs in the future instead of cross joining them.
from duckdb.
TL;DR Ah yes, technically this isn't a cross product, sorry for the confusion. My explanation for why isn't great, but it has to do with correlated subqueries and the unnest call. I can spend some more time refining it, but my advice is to stick with CTE's in this case, or the following query
SELECT t3.item_id
FROM t1
INNER JOIN t2 ON t2.id = t1.id
INNER JOIN
(select id, UNNEST(COALESCE(item_ids, [])) item_id from t2) u ON u.id = t1.id
INNER JOIN t3 on t3.item_id = u.item_id;
Ah ok yea, technically the two should be equivalent, sorry for the confusion. In the query
SELECT t3.item_id
FROM t1
INNER JOIN t2 ON t2.id = t1.id,
UNNEST(COALESCE(t2.item_ids, [])) u(item_id)
INNER JOIN t3 on t3.item_id = u.item_id;
the unnest(coalesce(t2.item_ids,[])) u(item_id)
refers to the same read of table t2
from the first inner join, so it isn't really a cross product. For every value in the join between t1 and t2, another column needs to get added that unnests t2.item_ids
. While this is easy to explain, our planner/optimizer has no statistics about this unnest, so it duplicates the intermediate table, unnests t2.item_ids
in the duplicate and joins it with t3. Then the planner joins the result back to the original join of t1&t2 using an implicit join on the rowid-values of the unnested values of t2.item_ids. This plan is very confusion/bad and is why the query takes so long to execute. The CTE is much easier to optimize for since t2 is read twice, so there is no confusion about when/where to unnest the item_ids in the query plan.
from duckdb.
The problem with "recommend sticking to CTEs" is that (1) it's hard to know in what situations that recommendation applies (2) CTEs come with their own performance problems (e.g. they can't be "run once and use results many times" afaik).
Without having understood the details in your answer, do you agree there is a bug in the current parser/optimiser and do you think that'll be fixed?
Or do you think duckdb will define its interpretation of the OP's query as an accepted deviation from the PostgreSQL dialect?
from duckdb.
Related Issues (20)
- cannot define a macro inside a cursor - Cross catalog dependencies are not supported. HOT 5
- Why is the db.file size still large after deleting data or dropping tables? HOT 19
- Inconsistent/ unexpected error when using strptime() in combination with format-list option HOT 1
- `get_table_names` erroring on query using `generate_series` in CTE, but executes it without issue
- Aggregate Regression function SLOPE and INTERCEPT seem to produce wrong results HOT 8
- read_csv invalidates DuckDB instance HOT 8
- Polars example raises InvalidInputException error HOT 2
- Parquet hive partition write error with OVERWRITE_OR_IGNORE - Deleting other partition data HOT 3
- read_parquet defaults hive_partitioning to auto, not false HOT 2
- ATTACH does not show new table in attached DB on DBeaver HOT 2
- Segmentation fault during PIVOTING table (CLI) using struct field HOT 3
- Entry type not supported in PhysicalCopyDatabase HOT 3
- BM25 matching scores seems to be invalid HOT 1
- read_csv with with store_rejects=true does not work as intented with constraints HOT 4
- DuckDB goes into fatal mode when it fails to write parquet data in a mounted file system HOT 2
- Issue with Python duckdb.read_csv not working with 0.10.2+ HOT 10
- Treatment of interval type in Python HOT 1
- cli: Stuck at the intialization HOT 14
- No implicit cast to `TIMESTAMP_MS` from `DATE`
- wrong value reported in last_modified (read_text, read_blob), probably TZ issue
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from duckdb.