Comments (9)

stinodego commented on June 20, 2024

We will rename the parameter to reverse in top_k and bottom_k.

from polars.

mcrumiller commented on June 20, 2024

Confusing in general. I would expect the descending parameter to change the sorting of the output, not what the "top" values are. We have a bottom_k() function for that already.

MarcoGorelli commented on June 20, 2024

not what the "top" values are. We have a bottom_k() function for that already.

you can pass multiple values to descending though, so bottom_k isn't a full replacement:

In [44]: df = pl.DataFrame({'a': [1, 2, 3], 'b': [6, 5, 4], 'c': [7,8,9]})

In [45]: df.top_k(k=2, by=['a', 'b'], descending=[True, False])
Out[45]:
shape: (2, 3)
┌─────┬─────┬─────┐
│ a   ┆ b   ┆ c   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ i64 ┆ i64 │
╞═════╪═════╪═════╡
│ 1   ┆ 6   ┆ 7   │
│ 2   ┆ 5   ┆ 8   │
└─────┴─────┴─────┘
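
The output above can be reproduced with a rough pure-Python model, which may help show why the flag changes which rows count as "top" rather than how the result is ordered. This is only a sketch of the observed behaviour, not the polars implementation; `top_k_model` and its `flip` argument are hypothetical names introduced for illustration.

```python
from typing import Sequence

Row = dict[str, int]


def top_k_model(rows: list[Row], k: int, by: Sequence[str], flip: Sequence[bool]) -> list[Row]:
    """Hypothetical model: largest-first per column, unless that column's flag flips it."""
    ordered = list(rows)
    # Apply stable sorts from the least significant key to the most significant one.
    for col, flipped in reversed(list(zip(by, flip))):
        ordered.sort(key=lambda r, c=col: r[c], reverse=not flipped)
    return ordered[:k]


# Same data as the session above.
rows = [{"a": a, "b": b, "c": c} for a, b, c in zip([1, 2, 3], [6, 5, 4], [7, 8, 9])]

# flip=[True, False] plays the role of descending=[True, False] in the session.
result = top_k_model(rows, k=2, by=["a", "b"], flip=[True, False])
print(result)
```

With the flag set on `a`, the model keeps the rows with the smallest `a` values, which matches the two rows shown in the session.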

mcrumiller commented on June 20, 2024

I see. Seems like a single function called extrema would work better, but people will undoubtedly be searching for top_k and bottom_k since those are pretty prevalent in other libraries.

MarcoGorelli commented on June 20, 2024

Yup...which leaves open the question of "what to do with descending?". I don't know 😄

mcrumiller commented on June 20, 2024

A somewhat more explicit, if more verbose, option would be a sort_method parameter that takes a list of "ascending" and "descending" values:

import polars as pl

df = pl.DataFrame({
    "a": [1, 1, 2, 2, 3, 3],
    "b": [2, 1, 2, 1, 2, 1],
})

# (a=3, b=2)
df.top_k(k=1, by=pl.all())

# (a=3, b=1)
df.top_k(k=1, by=pl.all(), sort_method=["ascending", "descending"])

# (a=1, b=1)
df.bottom_k(k=1, by=pl.all())

# (a=1, b=2)
df.bottom_k(k=1, by=pl.all(), sort_method=["descending", "ascending"])
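
One reading that reproduces all four commented results is "sort each column as `sort_method` says, then keep the last k rows", with top_k defaulting to all-ascending and bottom_k to all-descending. That reading, the defaults, and the helper name `last_k_after_sort` are assumptions made for this sketch; `sort_method` itself is only the parameter proposed above, not an existing polars argument.

```python
from typing import Sequence

Row = tuple[int, int]  # (a, b)


def last_k_after_sort(rows: list[Row], k: int, sort_method: Sequence[str]) -> list[Row]:
    """Hypothetical model: sort column by column per `sort_method`, keep the final k rows."""
    ordered = list(rows)
    # Stable sorts, least significant column first.
    for idx in reversed(range(len(sort_method))):
        descending = sort_method[idx] == "descending"
        ordered.sort(key=lambda r, i=idx: r[i], reverse=descending)
    return ordered[-k:]


rows = list(zip([1, 1, 2, 2, 3, 3], [2, 1, 2, 1, 2, 1]))

# Assumed default for top_k: every column ascending.
top_default = last_k_after_sort(rows, 1, ["ascending", "ascending"])
top_mixed = last_k_after_sort(rows, 1, ["ascending", "descending"])

# Assumed default for bottom_k: every column descending.
bottom_default = last_k_after_sort(rows, 1, ["descending", "descending"])
bottom_mixed = last_k_after_sort(rows, 1, ["descending", "ascending"])

print(top_default, top_mixed, bottom_default, bottom_mixed)
```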

mcrumiller commented on June 20, 2024

And actually, just renaming descending to sort_descending would be more obvious. And then the bottom_k parameter could be sort_ascending.

MarcoGorelli commented on June 20, 2024

I don't think it can just be renamed, because then the output of sort_descending=False would actually be sorted descending.

I feel like suggesting find_smallest, so then code can just be updated without other changes?

Current behaviour:

In [46]: import polars as pl
    ...: df = pl.DataFrame({'a': [1,3,2]})
    ...: df.top_k(k=3, by='a', descending=False)
Out[46]:
shape: (3, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 3   │
│ 2   │
│ 1   │
└─────┘

Proposed new api:

In [46]: import polars as pl
    ...: df = pl.DataFrame({'a': [1,3,2]})
    ...: df.top_k(k=3, by='a', find_smallest=False)
Out[46]:
shape: (3, 1)
┌─────┐
│ a   │
│ --- │
│ i64 │
╞═════╡
│ 3   │
│ 2   │
│ 1   │
└─────┘
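
The intent of a flag like find_smallest can be illustrated on a plain list with the standard-library `heapq` helpers: the flag picks which extreme is selected, and the output order follows from that choice. This is a minimal sketch, not polars code; `top_k_values` is a hypothetical helper, and `find_smallest` is only the name suggested above.

```python
import heapq


def top_k_values(values: list[int], k: int, find_smallest: bool = False) -> list[int]:
    """Hypothetical helper: choose which extreme to keep, not how to order the output."""
    if find_smallest:
        # Smallest k values, returned in ascending order.
        return heapq.nsmallest(k, values)
    # Largest k values, returned in descending order.
    return heapq.nlargest(k, values)


largest = top_k_values([1, 3, 2], k=2)
smallest = top_k_values([1, 3, 2], k=2, find_smallest=True)
print(largest, smallest)
```

With k=2 the two calls return different elements, which is the distinction the name is meant to make obvious; with k=3, as in the session above, both calls return every value and only the order differs.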

mtomko commented on June 20, 2024

I ran across this yesterday in a different context - it looks to me like the underlying Rust code is doing it backwards. Here's where the Python code calls into the Rust, if I understand it correctly:

self._ldf.top_k(k, by, descending, nulls_last, maintain_order)

To me it seems like the Python code reflects the Rust API and is faithfully passing descending through.

I don't like this behavior, but it feels like the Python and Rust code should be kept consistent, so my temptation would be to look into addressing the problem upstream of the Python bindings.

I'm personally not actually using Python at all except as an example (we're looking at the Scala JNI bindings to the Rust backend) but my coworkers would want to use the Python bindings. It would be excellent for us to be able to be on the same page with respect to the API shape.
