Comments (6)
Hm, indeed I have no obvious solution here (besides doing the statistical work outside of seaborn). I could probably whip up something for the objects interface though (which is probably the way to go; I believe that the changes to the old interface would need to be somewhat heavy for this to work).
from seaborn.
I've made plots like these before. IIRC, the upstream data transform can be done in a pandas one-liner; I'm not sure why the linked stackoverflow question makes it seem so complicated.
But I am 👎 on adding this in seaborn; even though the data transform is straightforward, exposing a specification API that is sufficiently general for all experimental designs where you would want to use it is not.
Agree it could probably be supported through a plugin stat object.
Thanks for the suggestion.
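For reference, the pandas transform mentioned above could look something like the following. This is a minimal sketch and not seaborn functionality: the column names `subject`, `condition`, and `value` and the helper name `cousineau_morey` are made up for illustration, and it assumes a single within-subject factor with one observation per subject and condition.

```python
import numpy as np
import pandas as pd


def cousineau_morey(df, subject="subject", condition="condition", value="value"):
    """Return a copy of df with `value` normalized for within-subject error bars.

    Hypothetical helper: assumes one within-subject factor and a balanced design.
    """
    out = df.copy()
    # Cousineau step: remove each subject's own mean, add back the grand mean.
    subject_mean = out.groupby(subject)[value].transform("mean")
    out[value] = out[value] - subject_mean + out[value].mean()
    # Morey step: scale deviations from each condition mean by sqrt(M / (M - 1)),
    # where M is the number of within-subject conditions.
    n_cond = out[condition].nunique()
    cond_mean = out.groupby(condition)[value].transform("mean")
    out[value] = cond_mean + np.sqrt(n_cond / (n_cond - 1)) * (out[value] - cond_mean)
    return out


demo = pd.DataFrame({
    "subject": [0, 0, 1, 1, 2, 2],
    "condition": ["a", "b", "a", "b", "a", "b"],
    "value": [10.0, 12.0, 20.0, 23.0, 30.0, 31.0],
})
normalized = cousineau_morey(demo)
print(normalized.groupby("condition")["value"].agg(["mean", "std"]))
```

The condition means are unchanged by the transform; only the spread around them shrinks, so error bars computed from the normalized values reflect within-subject rather than between-subject variability.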
@henrymj Ok, so I built this for the objects interface (it is a bit ugly, sorry):
import numpy as np
import pandas as pd
import seaborn.objects as so
from seaborn._stats.aggregation import Est


class CMEst(Est):
    def __call__(
        self, data, groupby, orient, scales,
    ):
        var = {"x": "y", "y": "x"}[orient]
        # Mean per (position, individual) and mean per position
        means_indiv = data.groupby([orient, "id_var"])[var].mean()
        means_all = data.groupby(orient)[var].mean()
        num_cond = None

        def normalize_df(df):
            # Subtract the individual's mean and add back the overall mean
            nonlocal num_cond
            if num_cond is None:
                num_cond = len(df)
            new_df = df.copy()
            i = df.reset_index().loc[0, [orient, "id_var"]]
            partial_mean = means_indiv[i[orient], i["id_var"]]
            full_mean = means_all[i[orient]]
            new_df[var] = new_df[var] - partial_mean + full_mean
            return new_df

        data = data.groupby([orient, "id_var"], group_keys=False).apply(normalize_df)

        def adjust_variance(df):
            # Scale deviations from the group mean by sqrt(M / (M - 1))
            df[var] = df[var].mean() + np.sqrt(num_cond / (num_cond - 1)) * (df[var] - df[var].mean())
            return df

        data = groupby.apply(data, adjust_variance)
        return super().__call__(data, groupby, orient, scales)
This is close to a drop-in replacement for so.Est(), except that you need to specify an id_var column that identifies the individual each observation belongs to. The example from your stackexchange post would look like:
df = pd.read_csv("DemoWS-30x2.csv")
# Busy work for converting to long-form
df["index"] = df.index
df = pd.wide_to_long(df,[f"activation.{i}" for i in range(1,31)],i="index",j="condition",sep=".").reset_index()
df["index"] = df.index
df = pd.wide_to_long(df,"activation",i="index",j="timepoints",sep=".").reset_index()
p = (
    so.Plot(data=df, x="timepoints", y="activation", color="condition")
    .add(so.Band(), CMEst(), id_var="id")
    # .add(so.Band(), so.Est())
    .add(so.Line(marker="o"), so.Agg())
    .scale(color=so.Nominal(["red", "blue"]))
)
p.show()
The resulting plot differs somewhat from the provided example. I believe this might be due to the way the CI is computed, but don't hesitate to check the code, as I may have made a mistake.
This seems a bit specific to your domain. Is there anything preventing you from passing a callable to the errorbar parameter, which allows you to define how the error is computed?
My impression is that this would be a beneficial feature for almost all social science domains (and any clinical work that compares post-treatment measurements to baselines), but I might be wrong.
I looked into customizable error bar callables using the old API (I haven't migrated yet), but at least in the documentation it appears that the callables should only expect a 1D vector of data, and thus wouldn't be able to make adjustments that require richer inputs with information like an individual identifier. However, it's possible I've just been looking in the wrong place and/or haven't understood how flexible the callables can be.
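As a sketch of that limitation, assuming the documented behavior that an `errorbar` callable maps a 1D vector to a `(min, max)` pair, such a callable could look like the following. The name `sem_interval` is made up for illustration.

```python
import numpy as np


def sem_interval(values):
    """Return (low, high) as mean +/- one standard error of a 1D vector."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    sem = values.std(ddof=1) / np.sqrt(len(values))
    return mean - sem, mean + sem


# The callable only ever sees the numbers for a single group; there is no
# subject identifier available to pair observations across conditions.
low, high = sem_interval([250.0, 260.0, 240.0, 270.0])
print(low, high)
```

It would be passed as `errorbar=sem_interval` to a function such as `sns.barplot`. Because the interval is computed from each group's values in isolation, a within-subject correction has to be applied to the data before plotting.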
Here's an example that illustrates the Stroop example above - if someone were able to provide a flexible solution to it, I'd be very happy to withdraw the issue!
import numpy as np
import pandas as pd
import seaborn as sns
np.random.seed(0)
# simulating per-subject RTs for the 2 conditions
congruent_rts = np.random.normal(250, 50, 30)
incongruent_shift = np.random.normal(25, 10, 30)
incongruent_rts = congruent_rts + incongruent_shift
df = pd.DataFrame({
    'Condition': ['Congruent'] * 30 + ['Incongruent'] * 30,
    # tile (not repeat) so that subject i is paired across both conditions
    'subject': np.tile(range(30), 2),
    'RT': np.concatenate([congruent_rts, incongruent_rts])
})
_ = sns.barplot(x='Condition', y='RT', data=df)  # barplots will mask the difference between conditions
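As a rough numerical check of why the bars overlap, the sketch below regenerates the same simulated RTs and compares the between-subject standard error within each condition against the standard error of the per-subject differences. It uses only NumPy, and the helper name `sem` is illustrative.

```python
import numpy as np

np.random.seed(0)
congruent_rts = np.random.normal(250, 50, 30)
incongruent_rts = congruent_rts + np.random.normal(25, 10, 30)


def sem(values):
    """Standard error of the mean of a 1D array."""
    return values.std(ddof=1) / np.sqrt(len(values))


# Between-subject spread dominates the per-condition error bars ...
print("SEM congruent:  ", sem(congruent_rts))
print("SEM incongruent:", sem(incongruent_rts))
# ... while the within-subject effect is estimated much more precisely.
diff = incongruent_rts - congruent_rts
print("mean difference:", diff.mean(), "SEM of difference:", sem(diff))
```

The standard error of the paired differences is the quantity that within-subject error bars are meant to convey, which is why the default per-condition bars understate how reliable the effect is.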
Thank you so much @thuiop! I'll test out this approach. I really appreciate you taking the time to generate this - hopefully other social scientists will also benefit from this example.