Comments (1)
Hmm, good question!
So in geopandas, we discussed this on the PR and in the end decided (for now) to raise an error when you specify a column selection without the geometry column (so e.g. geopandas.read_parquet(.., columns=['pop_est']), which would otherwise result in a normal pandas DataFrame and not a geopandas GeoDataFrame).
So that is what is happening here. The
ddf = dask_geopandas.read_parquet("test/test_*.pq")
ddf['pop_est']
gets translated under the hood to
ddf = dask_geopandas.read_parquet("test/test_*.pq", columns=['pop_est'])
to make use of the fact that we can read columns selectively with the parquet file format.
Of course, the user should not have to be aware of this. So indeed, with dask_geopandas, we should somehow be able to fall back to the plain dask reader when the "pushed down" column selection doesn't include a geometry column.
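To make the fallback idea a bit more concrete, here is a rough sketch of the decision it would need. Note that `pick_reader` and the `"geo"` / `"plain"` labels are made up for illustration and are not part of dask_geopandas; a real fix would have to hook into how dask pushes the column selection down to the reader.

```python
from typing import Optional, Sequence


def pick_reader(columns: Optional[Sequence[str]], geometry_columns: Sequence[str]) -> str:
    """Hypothetical helper: choose a reader for a pushed-down column selection.

    Returns "geo" when the selection keeps at least one geometry column,
    and "plain" when the plain dask reader would be enough.
    """
    # No selection means every column is read, so the geometry is included.
    if columns is None:
        return "geo"
    # A bare string selects a single column, as in ddf['pop_est'].
    if isinstance(columns, str):
        columns = [columns]
    # Keep the geo-aware reader only if a geometry column survives the selection.
    if any(col in geometry_columns for col in columns):
        return "geo"
    return "plain"


print(pick_reader(["pop_est"], ["geometry"]))              # plain
print(pick_reader(["pop_est", "geometry"], ["geometry"]))  # geo
print(pick_reader(None, ["geometry"]))                     # geo
```

With something like this, ddf['pop_est'] would go through the plain reader and give back an ordinary dask series instead of raising an error, while any selection that still contains the geometry column keeps the GeoDataFrame path.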
from dask-geopandas.