Giter VIP home page Giter VIP logo

Comments (6)

thuiop avatar thuiop commented on August 16, 2024 2

Hm, indeed I have no obvious solution here (besides doing the statistical work outside of seaborn). I could probably whip up something for the objects interface though (which is probably the way to go; I believe that the changes to the old interface would need to be somewhat heavy for this to work).

from seaborn.

mwaskom avatar mwaskom commented on August 16, 2024 1

I've made plots like these before, IIRC the upstream data transform can be done in a pandas one-liner; I'm not sure why the linked stackoverflow question makes it seem so complicated.

But I am 👎 on adding this in seaborn; even though the data transform is straightforward, exposing a specification API that is sufficiently general for all experimental designs where you would want to use it is not.

Agree it could probably be supported through a plugin stat object.

Thanks for the suggestion.

from seaborn.

thuiop avatar thuiop commented on August 16, 2024 1

@henrymj Ok, so I built this for the object interface (it is a bit ugly, sorry) :

import numpy as np
import pandas as pd
import seaborn.objects as so

from seaborn._stats.aggregation import Est

class CMEst(Est):
    def __call__(
        self, data, groupby, orient, scales,
    ):
        var = {"x": "y", "y": "x"}[orient]
        means_indiv = data.groupby([orient,"id_var"]).agg(np.mean)[var]
        means_all = data.groupby(orient).agg(np.mean)[var]
        num_cond = None
        def normalize_df(df):
            nonlocal num_cond
            if num_cond is None:
                num_cond = len(df)
            new_df = df.copy()
            i = df.reset_index().loc[0,[orient,"id_var"]]
            partial_mean = means_indiv[i[orient],i["id_var"]]
            full_mean = means_all[i[orient]]
            new_df[var] = new_df[var] - partial_mean  + full_mean
            return new_df
        data = data.groupby([orient,"id_var"],group_keys=False).apply(normalize_df)
        def adjust_variance(df):
            df[var] = df[var].mean() + np.sqrt(num_cond/(num_cond-1)) * (df[var] - df[var].mean())
            return df
        data = groupby.apply(data,adjust_variance)
        return super().__call__(data, groupby, orient, scales)

This is close to a drop-in replacement of so.Est(), except that you need to specify an id_var column which represents the individual you are considering. The example from your stackexchange post would look like :

df = pd.read_csv("DemoWS-30x2.csv")
# Busy work for converting to long-form
df["index"] = df.index
df = pd.wide_to_long(df,[f"activation.{i}" for i in range(1,31)],i="index",j="condition",sep=".").reset_index()
df["index"] = df.index
df = pd.wide_to_long(df,"activation",i="index",j="timepoints",sep=".").reset_index()
p = (
    so.Plot(data=df, x="timepoints", y="activation", color="condition")
    .add(so.Band(), CMEst(),id_var="id")
    #.add(so.Band(),so.Est())
    .add(so.Line(marker="o"), so.Agg())
    .scale(color=so.Nominal(["red","blue"]))
)
p.show()

giving (there are some differences with the provided example though, I believe it might be due to the way the CI is computed, but don't hesitate to check the code as I may have made a mistake)
Figure_1

from seaborn.

thuiop avatar thuiop commented on August 16, 2024

This seems a bit specific to your domain. Is there anything preventing you from passing a callable to the errorbar parameter, which allows you to define how the error is computed ?

from seaborn.

henrymj avatar henrymj commented on August 16, 2024

My impression is that this would be a beneficial feature to almost also social science domains (and any clinical work that compares post-treatments to baselines), but I might be wrong.

I looked into customizable error bar callables using the old API (I haven't migrated yet), but at least in the documentation it appears that the callables should only expect a 1D vector of data, and thus wouldn't be able to make adjustments that require richer vectors with information like a individual identifier. However, it's possible I've just been looking in the wrong place and/or haven't understood how flexible the callables can be.

Here's an example that illustrates the stroop example above - if someone were able to provide a flexible solution to it, I'd be very happy to withdraw the issue!

import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(0)

# getting group average RTs for the 2 conditions
congruent_rts = np.random.normal(250, 50, 30)
incongruent_shift = np.random.normal(25, 10, 30)
incongruent_rts = congruent_rts + incongruent_shift

df = pd.DataFrame({
    'Condition': ['Congruent'] * 30 + ['Incongruent'] * 30,
    'subject': np.repeat(range(30), 2),
    'RT': np.concatenate([congruent_rts, incongruent_rts])
})

_ = sns.barplot(x='Condition', y='RT', data=df)  # barplots will mask the difference between conditions
Screenshot 2024-07-10 at 1 47 58 PM

from seaborn.

henrymj avatar henrymj commented on August 16, 2024

Thank you so much @thuiop! I'll test out this approach. I really appreciate you taking the time to generate this - hopefully other social scientists will also benefit from this example.

from seaborn.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.