Comments (4)
This is a problem; Delta tables don't like it. In any case, I reported it earlier:
#11931
from duckdb.
It seems that fastparquet doesn't detect the issue, whereas pyarrow raises an exception:
import pandas as pd
df = pd.read_parquet("sample.parquet", engine="pyarrow")  # raises an exception
df = pd.read_parquet("sample.parquet", engine="fastparquet")  # runs fine
Thanks for reporting. I can confirm this is an issue.
Just to add another use case where I believe the same issue manifests:
AWS Glue Crawlers create invalid tables with duplicate fields (one regular, one partition) when crawling Parquet files produced and partitioned with DuckDB. Such tables cause errors when we try to read them with AWS Athena.
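For anyone who wants to check whether their files are affected, here is a minimal sketch that flags columns stored in a file that are also encoded as `key=value` directories in its path. It assumes a Hive-style layout; the helper names, the example path, and the column list are hypothetical and not taken from DuckDB or this report.

```python
from pathlib import PurePosixPath


def hive_partition_keys(path: str) -> list[str]:
    """Return the partition keys encoded as key=value directories in a path."""
    keys = []
    # Only directory components count; the last component is the file name.
    for part in PurePosixPath(path).parts[:-1]:
        key, sep, _value = part.partition("=")
        if sep and key:
            keys.append(key)
    return keys


def duplicated_partition_columns(path: str, file_columns: list[str]) -> list[str]:
    """Return columns present in the file that also appear as partition keys."""
    keys = set(hive_partition_keys(path))
    return [column for column in file_columns if column in keys]


# Hypothetical layout and column names, for illustration only.
example_path = "out/year=2024/month=5/data_0.parquet"
example_columns = ["id", "value", "year", "month"]
print(duplicated_partition_columns(example_path, example_columns))
```

With a real file, the column list could come from `pyarrow.parquet.read_schema(path).names`. A non-empty result means the same field exists both in the file and in the directory structure, which is the kind of duplication described above.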