Comments (4)
Similarly to underflows/overflows, we could set the values to null if the strict parameter in cast is set to false.
from polars.
To clarify, you mean that if you cast a series s to an enum, and s contains values that are not in the enum's set of categories, then those values should be converted to null? I like this idea. I think a default parameter or something of the sort would be nice, which would also allow outside values to take on a default value.
Yes, that is what I meant, sorry if it wasn't clear. Something like this:
pl.Series(
    ['a', 'b', 'd'],
    dtype=pl.Enum(['a', 'b', 'c'])
)
>>> ComputeError
pl.Series(
    ['a', 'b', 'd'],
    dtype=pl.Enum(['a', 'b', 'c'], missing_policy='set_to_null')  # proposed parameter; could also be called default or whatever
)
>>> shape: (3,)
>>>
>>> --
>>> enum
>>> "a"
>>> "b"
>>> null
Maybe what I wrote above isn't ideal, because this 'missing policy' probably shouldn't be a property of the enum, but of the computation (the cast or otherwise).