Comments (3)
Came across this after looking for info on type hints in Polars (via the issue you referenced). Off the top of my head, I can't think of cases where the result type of an operation would be unknown when the argument types are known (apart from maybe some uses of map_elements or operations on Object types, but I haven't thought very far on this).
I think your enhancement idea is a worthwhile endeavor (though I admit I am not very familiar with the Rust code base myself). It did, however, get me thinking more about typing in Polars and made me wonder: why are all expressions the same "type" anyway (i.e., Expr)?
It seems it would actually be useful to have multiple subclasses of Expr that are specific to each of the native types (e.g., something like StringExpr, Float64Expr, and so forth). This way, a chain of methods like:
    pl.col('some_floats').cast(pl.String)
would have a type that explicitly describes the types of its elements (e.g., StringExpr), which would allow direct introspection without the need for a meta accessor. What's more, you wouldn't be permitted to do something like pl.col('some_floats').cast(pl.String).round(2) because no such method would exist on a StringExpr, which might reduce run-time errors.
And of course, this would be a major improvement for type hinting with functions such as this:
    def foo(a: pl.Expr, b: pl.Expr) -> pl.Expr:
        return pl.when(a.round(0) == 4, b.str.strip_chars()).then(200).otherwise(900)
Where it would be helpful to have a signature more like:
    def foo(a: pl.Float64Expr, b: pl.StringExpr) -> pl.Int32Expr:
It may be that I am missing some critical piece that would show why this cannot be done, but from what I can tell, it could help to clarify types (which fits well with Polars' already excellent treatment of types and type consistency), and I would be interested in seeing something that at least takes things in this direction. Curious if you have any thoughts.
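A rough sketch of the idea in plain Python, without Polars, may make the proposal more concrete. Everything here is hypothetical: TypedExpr, Float64Expr, StringExpr, float_col, cast_string and cast_float64 are illustrative names and not part of the Polars API. The sketch only shows how dtype-specific subclasses would let a cast change the static type of an expression, so that a method like round is simply absent on a string expression.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TypedExpr:
    """Hypothetical base class; `plan` just records how the expression was built."""

    plan: str

    def cast_string(self) -> StringExpr:
        # Casting returns a different subclass, so the dtype is visible in the type.
        return StringExpr(f"{self.plan}.cast(String)")

    def cast_float64(self) -> Float64Expr:
        return Float64Expr(f"{self.plan}.cast(Float64)")


class Float64Expr(TypedExpr):
    """Hypothetical float expression: numeric methods live only here."""

    def round(self, decimals: int = 0) -> Float64Expr:
        return Float64Expr(f"{self.plan}.round({decimals})")


class StringExpr(TypedExpr):
    """Hypothetical string expression: string methods live only here."""

    def strip_chars(self) -> StringExpr:
        return StringExpr(f"{self.plan}.str.strip_chars()")


def float_col(name: str) -> Float64Expr:
    # Assumption for illustration: the column's dtype is known up front.
    return Float64Expr(f"col({name!r})")


floats = float_col("some_floats")
as_text = floats.cast_string()

# The element type can be read directly from the expression's class.
print(type(as_text).__name__)
# `round` exists on the float expression but not on the string expression.
print(hasattr(floats, "round"), hasattr(as_text, "round"))
```

With classes like these, a static type checker would flag float_col("some_floats").cast_string().round(2) before the query ever runs. The sketch sidesteps the hard part, namely that pl.col('name') alone does not know the column's dtype until a schema is available.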
Perhaps this is best raised in a separate issue? Though type hinting is a very important tool and this kind of feature would be great, having runtime access to column dtypes unlocks new capabilities that are otherwise impossible/impractical with the current Polars API.
Thanks @butterlyn. I decided to write this out in more detail in #18082 and am looking for feedback.