Comments (4)
The file is not corrupted, it's simply not in parquet format because how could duckdb possibly know that you wanted it in parquet format? You need to add `(FORMAT PARQUET)` to your `COPY` statement if you want to write a parquet file but don't want to use a parquet file ending.
As to duckdb getting this right automatically: personally I'd be more surprised if it didn't, but feel free to make a PR at the documentation repository (note that each page of the documentation website has a link at the bottom to the corresponding source in GitHub, e.g. https://duckdb.org/docs/sql/statements/copy.html links to https://github.com/duckdb/duckdb-web/edit/main/docs/sql/statements/copy.md)
from duckdb.
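To illustrate the point above, here is a minimal Python sketch. It builds a `COPY` statement with an explicit `(FORMAT PARQUET)` option and checks whether a file on disk really is Parquet by looking for the `PAR1` magic bytes at its start and end. The helper names `copy_to_parquet_sql` and `looks_like_parquet` and the file name `result.dat` are made up for this example and are not part of duckdb.

```python
PARQUET_MAGIC = b"PAR1"  # a Parquet file starts and ends with these four bytes


def copy_to_parquet_sql(query: str, target: str) -> str:
    """Build a COPY statement that requests Parquet output explicitly.

    Hypothetical helper: the format is stated in the statement itself,
    so it does not depend on the file ending of `target`.
    """
    escaped = target.replace("'", "''")  # escape quotes inside the SQL string literal
    return f"COPY ({query}) TO '{escaped}' (FORMAT PARQUET)"


def looks_like_parquet(path: str) -> bool:
    """Return True if the file has the Parquet magic bytes at both ends."""
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(0, 2)  # jump to the end to learn the file size
        if f.tell() < 8:
            return False  # too short to hold both magic markers
        f.seek(-4, 2)  # last four bytes of the file
        tail = f.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


print(copy_to_parquet_sql("SELECT 42 AS answer", "result.dat"))
```

Assuming the `duckdb` Python package is installed, the generated string could be run with something like `duckdb.connect().execute(sql)`, after which `looks_like_parquet("result.dat")` should report whether the output was actually written as Parquet.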
> The file is not corrupted, it's simply not in parquet format because how could duckdb possibly know that you wanted it in parquet format? You need to add `(FORMAT PARQUET)` to your `COPY` statement if you want to write a parquet file but don't want to use a parquet file ending.
Thanks, I had forgotten about this FORMAT statement, feeling stupid now...
> As to duckdb getting this right automatically: personally I'd be more surprised if it didn't, but feel free to make a PR at the documentation repository (note that each page of the documentation website has a link at the bottom to the corresponding source in GitHub, e.g. https://duckdb.org/docs/sql/statements/copy.html links to https://github.com/duckdb/duckdb-web/edit/main/docs/sql/statements/copy.md)
Thanks, I like this feature, but it feels like this kind of guard code should always be the user's responsibility. But now I know it's not required when working with duckdb :)
It's not required in this instance. It might fail in more complex problems, and I agree with you that it's bad practice to overwrite an input file, especially when using a declarative system like duckdb.
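For readers who prefer to keep such a guard on their own side, a minimal Python sketch follows. It assumes the concern is an output path that resolves to the same file as the input. The function name `check_not_same_file` is hypothetical, and this is only one possible guard, not something duckdb provides or requires.

```python
import os
from pathlib import Path


def check_not_same_file(input_path: str, output_path: str) -> None:
    """Raise ValueError if writing to output_path would overwrite input_path.

    Hypothetical guard: paths are compared after resolving relative
    segments and symlinks, and os.path.samefile additionally catches
    hard links when both files already exist.
    """
    src = Path(input_path).resolve()
    dst = Path(output_path).resolve()
    same = src == dst
    if not same and src.exists() and dst.exists():
        same = os.path.samefile(src, dst)
    if same:
        raise ValueError(f"refusing to overwrite input file: {src}")
```

A caller would run `check_not_same_file(source, target)` before issuing the statement that writes `target`, and pick a different output path (or a temporary file) if it raises.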
Oooof, so I guess I should remove my pull request...
Really thought it was an actual feature