Comments (6)
Such as:
.mean()
.quantile()
.median()
.first()
.last()
.min()
.max()
.sum()
.nunique()
from polars.
Status:
- mean
- median
- quantile
- first
- last
- min
- max
- sum
- nunique
- agg to series
- return group indexes
I just took this for a spin and wondered: is there a reason for the .select().sum() syntax as opposed to the .agg() in pandas?
No specific reason. The Python API closely resembles the Rust API and isn't per se meant to be a copy of pandas (though familiarity is beneficial).
We could select all columns by default and aggregate only the ones that are summable. That would be more ergonomic: groupby("foo").sum()
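A minimal sketch of what that default could mean, written in plain Python rather than polars: group rows by a key and sum every other column whose values are all numeric, skipping the rest. The helper name groupby_sum and the sample rows are hypothetical and not part of the polars API.

```python
from collections import defaultdict
from numbers import Number


def groupby_sum(rows, key):
    """Group dict rows by `key` and sum every all-numeric column.

    Hypothetical helper for illustration only; not a polars function.
    """
    columns = [c for c in rows[0] if c != key]
    # A column counts as "summable" here if every value is a number (bools excluded).
    summable = [
        c for c in columns
        if all(isinstance(r[c], Number) and not isinstance(r[c], bool) for r in rows)
    ]
    # Each new group starts with a zero total for every summable column.
    totals = defaultdict(lambda: dict.fromkeys(summable, 0))
    for row in rows:
        for c in summable:
            totals[row[key]][c] += row[c]
    return dict(totals)


rows = [
    {"foo": "a", "bar": 1, "name": "x"},
    {"foo": "a", "bar": 2, "name": "y"},
    {"foo": "b", "bar": 5, "name": "z"},
]
result = groupby_sum(rows, "foo")
print(result)
```

The string column "name" is dropped from the result instead of raising an error, which is the behaviour the "only aggregate the columns that are summable" idea describes.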
In pandas right now I can do:
df.groupby('user_id').agg(n_sess=('session', 'nunique'), avg_session_len=('session_time', 'mean'))
After grouping I can calculate multiple aggregations, and I can be specific about each one. Here I specify that I want the number of unique sessions and the average session time. How might that work here?
Right! We cannot yet. 😄 This was already requested in #75. It's coming. I think we can also support the same syntax on the Python side.
Edit: agg syntax and multiple aggregations added in cb85513
python
gb_df = df.groupby("a").agg({"b": ["sum", "min"], "c": "count"})
rust
df.groupby("a")?.agg(&[("b", &["n_unique", "sum", "min"]), ("c", &["count"])])
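To make the shape of the Python agg call above concrete, here is a rough pure-Python sketch of what such a spec asks for: each column maps to one aggregation name or a list of names, and every name is applied per group. The helper groupby_agg, the AGGS table, the sample rows, and the column_agg output naming are all assumptions of this sketch, not the polars implementation or its actual output column names.

```python
from collections import defaultdict

# Aggregation names mapped to plain Python callables (illustrative subset).
AGGS = {
    "sum": sum,
    "min": min,
    "max": max,
    "count": len,
    "n_unique": lambda values: len(set(values)),
}


def groupby_agg(rows, key, spec):
    """Apply the aggregations in `spec` to each group of dict rows.

    `spec` maps a column name to one aggregation name or a list of names.
    Hypothetical helper; it mimics the call shape, not polars itself.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)

    out = {}
    for group_key, members in groups.items():
        result = {}
        for column, names in spec.items():
            if isinstance(names, str):
                names = [names]  # allow "count" as shorthand for ["count"]
            values = [m[column] for m in members]
            for name in names:
                # Output naming (column_agg) is an assumption of this sketch.
                result[f"{column}_{name}"] = AGGS[name](values)
        out[group_key] = result
    return out


rows = [
    {"a": 1, "b": 10, "c": "x"},
    {"a": 1, "b": 4, "c": "y"},
    {"a": 2, "b": 7, "c": "z"},
]
gb = groupby_agg(rows, "a", {"b": ["sum", "min"], "c": "count"})
print(gb)
```

Accepting either a single name or a list per column keeps the spec short for the common case while still allowing several aggregations on the same column.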