Comments (4)

Jolanrensen commented on August 15, 2024

This is a very interesting bug!

@Test
fun `describe twice 1`() {
    val df = dataFrameOf("a", "b")(1, 2, 3, 4)
    val desc1 = df.describe()
    val desc2 = desc1.describe()
    desc2::class shouldBe DataFrameImpl::class
}

works fine, but

@Test
fun `describe twice 2`() {
    val df = dataFrameOf("a", "b")(1, "foo", 3, "bar")
    val desc1 = df.describe()
    val desc2 = desc1.describe()
    desc2::class shouldBe DataFrameImpl::class
}

breaks.

I suspect this is due to columns being created with type Comparable<*>/Comparable<Nothing> after running describe():

   name:String type:Any count:Int unique:Int nulls:Int top:Comparable<*> freq:Int mean:Double? std:Double? min:Comparable<*> median:Comparable<*> max:Comparable<*>
 0           a      Int         2          2         0                 1        1          2.0    1.414214                 1                    2                 3
 1           b   String         2          2         0               foo        1         null        null               bar                  bar               foo

If you now run another describe() on this table, it will try to find the min of columns like top, which means comparing an Int with a String. These two are not comparable with each other, which is what the exception tells us.
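The failure can be reproduced without DataFrame at all. The following is a minimal sketch, assuming Kotlin/JVM; `compareUnchecked` is a hypothetical helper written for this illustration, not part of the library:

```kotlin
// Hypothetical helper: compares two values that are only known to be Comparable<*>.
// The unchecked cast mirrors what happens when a column typed Comparable<*> is
// treated as if all of its values were mutually comparable.
@Suppress("UNCHECKED_CAST")
fun compareUnchecked(a: Comparable<*>, b: Comparable<*>): Result<Int> =
    runCatching { (a as Comparable<Any?>).compareTo(b) }

fun main() {
    // Int vs Int: fine.
    println(compareUnchecked(1, 3).isSuccess)
    // Int vs String: Int.compareTo receives a String at runtime and throws.
    println(compareUnchecked(1, "foo").exceptionOrNull() is ClassCastException)
}
```

Both lines print `true` on the JVM: the first comparison succeeds, the second fails with a ClassCastException.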

Our current implementation only checks whether a column is comparable via AnyCol.isComparable() = isSubtypeOf<Comparable<*>?>(); it does not check that the type argument T is not Nothing. I'm not sure we actually can.

from dataframe.

koperagen commented on August 15, 2024

> Our current implementation only checks whether a column is comparable via AnyCol.isComparable() = isSubtypeOf<Comparable<*>?>(); it does not check that the type argument T is not Nothing. I'm not sure we actually can.

Maybe typeOf<Comparable<Any?>?>() will work as expected here. I suspect this code was written under the assumption that * means Any?, but Comparable declares its type parameter with in variance, so Comparable<*> is equivalent to Comparable<in Nothing>, and from the type system's perspective you can't compare two Comparable<Nothing> values.
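A small compile-time sketch of that variance point, using only the standard library. `AlwaysEqual` and `acceptsInt` are hypothetical names introduced for this illustration:

```kotlin
// Hypothetical Comparable<Any?>: it can be compared against anything.
object AlwaysEqual : Comparable<Any?> {
    override fun compareTo(other: Any?): Int = 0
}

// Hypothetical consumer that only needs "something comparable to Int".
fun acceptsInt(c: Comparable<Int>): Int = c.compareTo(42)

fun main() {
    val intComparable: Comparable<Int> = 1

    // Any Comparable<T> can be viewed as Comparable<*> ...
    val star: Comparable<*> = intComparable
    // ... but nothing can be passed to it, because its parameter type is Nothing:
    // star.compareTo(2)                          // does not compile

    // Comparable<Int> is NOT a Comparable<Any?>:
    // val wide: Comparable<Any?> = intComparable // does not compile

    // The opposite direction works because of `in` variance (contravariance).
    println(acceptsInt(AlwaysEqual))   // 0
    println(acceptsInt(intComparable)) // negative, since 1 < 42
    println(star === intComparable)    // true, same object viewed through a wider type
}
```

This is why checking against Comparable<*> accepts every comparable column, while checking against Comparable<Any?> rejects ordinary types such as Int.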

Jolanrensen commented on August 15, 2024

@koperagen thanks for the tip! But unfortunately:

typeOf<Int>().isSubtypeOf(typeOf<Comparable<Any?>>()) == false
typeOf<Int>().isSubtypeOf(typeOf<Comparable<Any>>()) == false

variance is fun :)

Jolanrensen commented on August 15, 2024

It can be fixed like this:

/**
 * Returns `true` if [this] column is comparable, i.e. its type is a subtype of [Comparable] and its
 * type argument is not [Nothing].
 */
public fun AnyCol.isComparable(): Boolean = isSubtypeOf<Comparable<*>?>()
    && type().projectTo(Comparable::class).arguments[0].let {
        it != KTypeProjection.STAR &&
            it.type?.isNothing != true
    }
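
The star-projection part of that check can be tried in isolation. The following is a simplified sketch, not the DataFrame implementation: `isStarProjectedComparable` is a hypothetical function, it uses only the standard library's `typeOf` and `KTypeProjection`, and it only looks at types declared directly as Comparable<...> (it does not project subtypes such as Int onto Comparable the way projectTo does above):

```kotlin
import kotlin.reflect.KType
import kotlin.reflect.KTypeProjection
import kotlin.reflect.typeOf

// Hypothetical helper: true when the type is literally Comparable<*>,
// i.e. the case where values in the column may not be mutually comparable.
fun isStarProjectedComparable(type: KType): Boolean =
    type.classifier == Comparable::class &&
        type.arguments.singleOrNull() == KTypeProjection.STAR

fun main() {
    // The column type that describe() produces for top/min/median/max.
    println(isStarProjectedComparable(typeOf<Comparable<*>>()))
    // A Comparable with a concrete type argument.
    println(isStarProjectedComparable(typeOf<Comparable<Int>>()))
    // An ordinary type whose classifier is not Comparable itself.
    println(isStarProjectedComparable(typeOf<Int>()))
}
```

Under these assumptions only the first call reports a star projection, which is the case the fix above excludes from being treated as comparable.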

I'll probably make a PR later :)