Comments (9)
@ritchie46 See attached - the dataframe from df = pl.read_database_uri(...) contains chunks of size 155, which matches the error message. I wrote out 2 columns using write_parquet("tasks.parquet", row_group_size=155),
which I assume preserves the chunks.
To reproduce
pl.read_parquet("tasks.parquet").filter(~pl.col("task_id").is_in([1]))
ShapeError: filter's length: 155 differs from that of the series: 0
(I think the JSON encoding was a red herring, but I included that column anyway)
from polars.
Thanks for the report. Would love to get a repro on this.
Sorry, saving the dataframe (to parquet or json) and then reloading it "fixes" the issue, otherwise I'd be happy to privately share the data. I can reproduce at will on both my Ubuntu 22.04 workstation and my M3 MacBook Pro.
If I have time I'll try spinning up a postgres docker image and see if I can create a simpler reproduction.
I also couldn't work out which specific version introduced the issue, but it's not present in 0.20.26.
So I just discovered that
df = pl.read_database_uri(...)
df.with_columns(pl.col("status").str.json_decode()).filter(pl.col("id") == 1)
fails with the error above, while
df = pl.read_database_uri(...)
df.rechunk().with_columns(pl.col("status").str.json_decode()).filter(pl.col("id") == 1)
works - I'm assuming that's why saving/loading also "fixes" it.
Although I found I had to add rechunk at multiple points in the query to make every query work (in general before any filter operation, so when I filtered twice I had to rechunk twice?)
What is the schema? I think if you create a chunked dataframe with the same schema you should be able to create a repro.
I think I have an issue that may share similarities with this one: after concatenating dataframes, some operations raise a ShapeError. My issue is also "fixed" by writing the dataframe to parquet and loading it back, or by using rechunk.
I'm not sure if it is the same underlying issue, but it provides an easy repro: #16516
I can reproduce the error with the attached file.
I'm not sure if it is the same issue, but I also get a panic when trying to write it back to disk:
>>> pl.read_parquet("Downloads/tasks.parquet").write_parquet("1.parquet")
thread 'polars-0' panicked at crates/polars-arrow/src/array/struct_/mod.rs:117:52:
called `Result::unwrap()` on an `Err` value: ComputeError(ErrString("The children must have an equal number of values.\n However, the values at index 7 have a length of 161, which is different from values at index 0, 0."))
Note that join on chunked dataframes also raises errors, i.e.
thread 'polars-9' panicked at crates/polars-ops/src/chunked_array/gather/chunked.rs:84:5:
assertion `left == right` failed: implementation error
left: 1
right: 12
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Edit: I'm seeing this on v0.30.26 as well
Got a complete query, @david-waterworth?