Comments (4)
Hi @paul-hawkins ,
It seems like DABEST only allows paired tests between two sets of data, while paired comparisons can be carried out on three or more sets of data.
This is half-correct: paired comparisons can only be done on pairs of data. For instance, using the data you posted in #98,
import pandas as pd
import dabest

new_df = pd.read_csv('all_ringrmsd_data_only.txt', sep='\t')

# Need to have an ID column so DABEST knows which observations go together.
new_df.rename(columns={"Unnamed: 0": "id"}, inplace=True)

multi_paired = dabest.load(
    new_df,
    # Here, we assume OMEGA and MOE are a set of repeated measures,
    # while Macromodel and Desmond are a second, unrelated set of repeated measures.
    idx=(('OMEGA', 'MOE'),
         ('Macromodel', 'Desmond')),
    id_col="id", paired=True)

multi_paired.cohens_d.plot();
DABEST v0.3.0
=============
Good afternoon!
The current time is Fri Apr 24 16:12:45 2020.
The paired Cohen's d between OMEGA and MOE is 0.182 [95%CI 0.0781, 0.285].
The p-value of the two-sided permutation t-test is 0.0014.
The paired Cohen's d between Macromodel and Desmond is 0.314 [95%CI 0.179, 0.448].
The p-value of the two-sided permutation t-test is 0.0.
5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
For each p-value, 5000 reshuffles of the control and test labels were performed.
To get the results of all valid statistical tests, use `.cohens_d.statistical_tests`
DABEST's paired analysis design requires that no group appears more than once in idx. For example, the following raises an error:
multi_paired_neg = dabest.load(new_df,
                               idx=(('OMEGA', 'MOE'),
                                    ('OMEGA', 'Desmond')),
                               id_col="id", paired=True)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-16-53d5efeaf2d7> in <module>()
5 ('OMEGA', 'Desmond')),
6
----> 7 id_col="id", paired=True)
~/anaconda3/envs/dabest-dev-py3.6/lib/python3.6/site-packages/dabest/_api.py in load(data, idx, x, y, paired, id_col, ci, resamples, random_seed)
63 from ._classes import Dabest
64
---> 65 return Dabest(data, idx, x, y, paired, id_col, ci, resamples, random_seed)
~/anaconda3/envs/dabest-dev-py3.6/lib/python3.6/site-packages/dabest/_classes.py in __init__(self, data, idx, x, y, paired, id_col, ci, resamples, random_seed)
60 err1 = ' or a tuple has repeated groups in it.'
61 err2 = ' Please remove any duplicates and try again.'
---> 62 raise ValueError(err0 + err1 + err2)
63
64 else: # mix of string and tuple?
ValueError: Groups are repeated across tuples, or a tuple has repeated groups in it. Please remove any duplicates and try again.
This is deliberate, to reduce confusion. A paired comparison, by definition, has only a before measure and an after measure. Setting up the comparison as in multi_paired_neg implies that this is not a strict paired comparison.
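If you want to catch this before calling dabest.load, a quick pre-check along these lines might help. This is only a sketch: find_repeated_groups is a hypothetical helper of my own, not part of DABEST, and it simply flattens idx and counts the group names.

```python
from collections import Counter


def find_repeated_groups(idx):
    """Return the group names that appear more than once in a DABEST-style idx.

    Hypothetical helper, not part of DABEST. Accepts either a flat tuple of
    names or a tuple of tuples.
    """
    flat = []
    for item in idx:
        if isinstance(item, str):
            flat.append(item)   # a bare group name
        else:
            flat.extend(item)   # a tuple of group names
    counts = Counter(flat)
    return sorted(name for name, n in counts.items() if n > 1)


ok_idx = (("OMEGA", "MOE"), ("Macromodel", "Desmond"))
bad_idx = (("OMEGA", "MOE"), ("OMEGA", "Desmond"))

print(find_repeated_groups(ok_idx))
print(find_repeated_groups(bad_idx))
```

An empty list means every group appears only once, so idx should pass the duplicate check that raised the ValueError above.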
If you are doing a successive repeated-measures experiment (i.e. OMEGA is t=0, MOE is t=1, and then Macromodel is t=2), the way to do this is:
first = dabest.load(new_df,
                    idx=('OMEGA', 'MOE'),
                    id_col="id", paired=True)

second = dabest.load(new_df,
                     idx=('OMEGA', 'Macromodel'),
                     id_col="id", paired=True)
first.cohens_d
DABEST v0.3.0
=============
Good afternoon!
The current time is Fri Apr 24 16:34:10 2020.
The paired Cohen's d between OMEGA and MOE is 0.182 [95%CI 0.0781, 0.285].
The p-value of the two-sided permutation t-test is 0.0014.
5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
For each p-value, 5000 reshuffles of the control and test labels were performed.
To get the results of all valid statistical tests, use `.cohens_d.statistical_tests`
second.cohens_d
DABEST v0.3.0
=============
Good afternoon!
The current time is Fri Apr 24 16:34:12 2020.
The paired Cohen's d between OMEGA and Macromodel is 0.0295 [95%CI -0.0672, 0.122].
The p-value of the two-sided permutation t-test is 0.541.
5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
For each p-value, 5000 reshuffles of the control and test labels were performed.
To get the results of all valid statistical tests, use `.cohens_d.statistical_tests`
To plot them alongside each other:
import matplotlib.pyplot as plt
import seaborn as sns
%config InlineBackend.figure_format = 'retina'
sns.set(context="talk")
f, axx = plt.subplots(ncols=2, figsize=(10, 7),
                      # Adjust the width-wise spacing
                      gridspec_kw={"wspace": 0.5})

plot_kwargs = dict(float_contrast=False,
                   contrast_ylim=(0, 0.7))

first.cohens_d.plot(ax=axx[0], **plot_kwargs);
second.cohens_d.plot(ax=axx[1], **plot_kwargs);
You can read more here.
Hope this helps!
from dabest-python.
Hi @paul-hawkins,
Are you asking how to compute paired Cohen's d? Or are you saying that the paired Cohen's d returned by DABEST is not actually paired?
If your question is the first one, simply load the experiment as a paired experiment:
import pandas as pd
import dabest
# Load the iris dataset. Requires internet access.
iris = pd.read_csv("https://github.com/mwaskom/seaborn-data/raw/master/iris.csv")
iris.reset_index(inplace=True)
virginica = iris[iris.species=="virginica"].copy()
virginica_melted = pd.melt(virginica,
                           id_vars="index",
                           value_vars=["sepal_length", "petal_length"],
                           var_name="flower_part",
                           value_name="width")

virginica_paired = dabest.load(data=virginica_melted, x="flower_part", y="width",
                               paired=True, id_col="index",
                               idx=("sepal_length", "petal_length"))
Then produce the Cohen's d:
virginica_paired.cohens_d
DABEST v0.3.0
=============
Good evening!
The current time is Thu Apr 23 18:44:08 2020.
The paired Cohen's d between sepal_length and petal_length is -1.74 [95%CI -2.1, -1.37].
The p-value of the two-sided permutation t-test is 0.0.
5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
For each p-value, 5000 reshuffles of the control and test labels were performed.
To get the results of all valid statistical tests, use `.cohens_d.statistical_tests`
If you mean the latter (i.e. the paired Cohen's d returned by DABEST is not actually paired), could you provide a dummy dataset with the values you expect, alongside what DABEST produces? Thanks!
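For anyone putting together such a check by hand, here is a rough sketch of one common convention for a paired Cohen's d: the mean of the within-pair differences divided by the average of the two groups' standard deviations. Whether this is exactly the standardiser DABEST uses is an assumption on my part, so please verify it against the DABEST source before relying on it. The function name and the numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def paired_cohens_d(control, test):
    """Mean paired difference divided by the average SD of the two groups.

    Illustrative only: the choice of standardiser is an assumption and may
    differ from what DABEST computes.
    """
    if len(control) != len(test):
        raise ValueError("Paired data need the same number of observations.")
    # Observations are assumed to be in the same subject order in both lists.
    diffs = [t - c for c, t in zip(control, test)]
    # statistics.variance is the sample variance (n - 1 in the denominator).
    average_sd = sqrt((variance(control) + variance(test)) / 2)
    return mean(diffs) / average_sd


# Made-up measurements for four subjects, before and after.
control = [1.0, 2.0, 3.0, 4.0]
test = [2.0, 3.0, 5.0, 6.0]

d = paired_cohens_d(control, test)
print(round(d, 3))
```

If the value you get this way differs from what DABEST reports on the same data, posting both numbers together with the dataset would make the discrepancy much easier to track down.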
Hi @paul-hawkins,
I hope Joses sufficiently answered your question.
Just to let you know, we have just released a new version of DABEST, and you will have to use paired="baseline" or paired="sequential" for future paired comparisons. Please see the new documentation for details.
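As I understand the two modes (please check the new documentation to confirm), "baseline" compares every later group against the first group, while "sequential" compares each group against the one directly before it. The sketch below only lists which pairs each mode would imply. It does not call DABEST, and paired_comparisons is a hypothetical helper used purely for illustration.

```python
def paired_comparisons(groups, mode):
    """List the (control, test) pairs implied by a repeated-measures mode.

    Hypothetical helper, not part of DABEST; the reading of the two modes
    is an assumption to be checked against the documentation.
    """
    if mode == "baseline":
        # Every later group is compared with the first group.
        return [(groups[0], g) for g in groups[1:]]
    if mode == "sequential":
        # Each group is compared with the one immediately before it.
        return list(zip(groups[:-1], groups[1:]))
    raise ValueError("mode must be 'baseline' or 'sequential'")


groups = ("OMEGA", "MOE", "Macromodel")

print(paired_comparisons(groups, "baseline"))
print(paired_comparisons(groups, "sequential"))
```

With the three groups from this thread, the "baseline" pairs correspond to the two separate loads shown earlier (OMEGA vs MOE, and OMEGA vs Macromodel).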
I will now be closing this issue. Thank you!