Comments (2)
A join_asof should accomplish this for you as a LEFT ANY JOIN. You may have to play with some of the parameters depending on your use case.
import polars as pl
df1 = pl.DataFrame({
    "key": [1, 2, 3],
    "a": [1, 2, 3],
}).set_sorted("key")
df2 = pl.DataFrame({
    "key": [1, 2, 2, 3, 3],
    "a": [1, 2, 3, 4, 5],
}).set_sorted("key")
print(df1.join_asof(df2, on="key"))
shape: (3, 3)
┌─────┬─────┬─────────┐
│ key ┆ a   ┆ a_right │
│ --- ┆ --- ┆ ---     │
│ i64 ┆ i64 ┆ i64     │
╞═════╪═════╪═════════╡
│ 1   ┆ 1   ┆ 1       │
│ 2   ┆ 2   ┆ 3       │
│ 3   ┆ 3   ┆ 5       │
└─────┴─────┴─────────┘
For a RIGHT ANY JOIN you can swap the dataframes, as polars does not support a right join right now, although there is an open issue for this feature.
In the general case, you can always call df1.join(df2.unique(subset=join_key), ...) to accomplish your task.
I don't think join_asof is a valid workaround for my question, namely because it matches on the nearest key rather than on exact equality, and it requires sorted dataframes.
For example, the following would return a non-empty dataframe, while under the proposal:
- ANY INNER should return an empty dataframe
- ANY LEFT should return an empty (all-null) a_right
import polars as pl
df1 = pl.DataFrame({
    "key": [1, 10, 20],
    "a": [1, 2, 3],
}).set_sorted("key")
df2 = pl.DataFrame({
    "key": [2, 8, 9, 13, 18],
    "a": [1, 2, 3, 4, 5],
}).set_sorted("key")
print(df1.join_asof(df2, on="key"))
shape: (3, 3)
┌─────┬─────┬─────────┐
│ key ┆ a   ┆ a_right │
│ --- ┆ --- ┆ ---     │
│ i64 ┆ i64 ┆ i64     │
╞═════╪═════╪═════════╡
│ 1   ┆ 1   ┆ null    │
│ 10  ┆ 2   ┆ 3       │
│ 20  ┆ 3   ┆ 5       │
└─────┴─────┴─────────┘