
Comments (4)

josesho commented on May 30, 2024

Hi @paul-hawkins,

> It seems like DABEST only allows paired tests between two sets of data, while paired comparisons can be carried out on three or more sets of data.

This is half-correct: paired comparisons can only be done on pairs of data. For instance, using the data you posted in #98,

import pandas as pd
import dabest

new_df = pd.read_csv('all_ringrmsd_data_only.txt', sep='\t')

# Need to have an ID column so DABEST knows which observations go together.
new_df.rename(columns={"Unnamed: 0": "id"}, inplace=True)

multi_paired = dabest.load(
                   new_df,
                   # Here, we assume OMEGA and MOE are a set of repeated measures,
                   # while Macromodel and Desmond are a second, unrelated set of repeated measures.
                   idx=(('OMEGA','MOE'),
                        ('Macromodel', 'Desmond')), 

                   id_col="id", paired=True)

multi_paired.cohens_d.plot();
DABEST v0.3.0
=============
             
Good afternoon!
The current time is Fri Apr 24 16:12:45 2020.

The paired Cohen's d between OMEGA and MOE is 0.182 [95%CI 0.0781, 0.285].
The p-value of the two-sided permutation t-test is 0.0014. 

The paired Cohen's d between Macromodel and Desmond is 0.314 [95%CI 0.179, 0.448].
The p-value of the two-sided permutation t-test is 0.0. 

5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
For each p-value, 5000 reshuffles of the control and test labels were performed.

To get the results of all valid statistical tests, use `.cohens_d.statistical_tests`

DABEST's paired analysis design requires that no group appears more than once, either within a tuple or across the tuples passed to idx.

multi_paired_neg = dabest.load(new_df,
                           idx=(('OMEGA','MOE'),
                                ('OMEGA', 'Desmond')), 
                           
                           id_col="id", paired=True)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-16-53d5efeaf2d7> in <module>()
      5                                 ('OMEGA', 'Desmond')), 
      6 
----> 7                            id_col="id", paired=True)

~/anaconda3/envs/dabest-dev-py3.6/lib/python3.6/site-packages/dabest/_api.py in load(data, idx, x, y, paired, id_col, ci, resamples, random_seed)
     63     from ._classes import Dabest
     64 
---> 65     return Dabest(data, idx, x, y, paired, id_col, ci, resamples, random_seed)

~/anaconda3/envs/dabest-dev-py3.6/lib/python3.6/site-packages/dabest/_classes.py in __init__(self, data, idx, x, y, paired, id_col, ci, resamples, random_seed)
     60                 err1 = ' or a tuple has repeated groups in it.'
     61                 err2 = ' Please remove any duplicates and try again.'
---> 62                 raise ValueError(err0 + err1 + err2)
     63 
     64         else: # mix of string and tuple?

ValueError: Groups are repeated across tuples, or a tuple has repeated groups in it. Please remove any duplicates and try again.

This is deliberate, to reduce confusion. A paired comparison, by definition, has only a before measure and an after measure. Setting up the comparison as in multi_paired_neg implies that it is not a strict paired comparison.

If you are doing a successive repeated-measures experiment (i.e. OMEGA is t=0, MOE is t=1, and Macromodel is t=2), the way to do this is:
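As a rough illustration of the rule above, the check can be reproduced before calling dabest.load. This is only a sketch: `repeated_groups` is a hypothetical helper written for this reply, not part of DABEST, and it only mirrors the condition described in the error message.

```python
from collections import Counter


def repeated_groups(idx):
    """Return the group names that appear more than once in idx.

    Hypothetical helper (not part of DABEST). idx may be a single
    tuple of names or a tuple of tuples, as passed to dabest.load.
    """
    # Treat a flat tuple of names as a single comparison.
    if all(isinstance(g, str) for g in idx):
        idx = (idx,)
    counts = Counter(name for pair in idx for name in pair)
    return sorted(name for name, n in counts.items() if n > 1)


ok_idx = (("OMEGA", "MOE"), ("Macromodel", "Desmond"))
bad_idx = (("OMEGA", "MOE"), ("OMEGA", "Desmond"))

print(repeated_groups(ok_idx))   # []
print(repeated_groups(bad_idx))  # ['OMEGA']
```

An empty list means the idx should pass the duplicate check; any names returned are the ones to split into separate dabest.load calls.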

first = dabest.load(new_df,
                    idx=('OMEGA','MOE'),
                    id_col="id", paired=True)

second = dabest.load(new_df,
                    idx=('OMEGA', 'Macromodel'),
                    id_col="id", paired=True)
first.cohens_d
DABEST v0.3.0
=============
             
Good afternoon!
The current time is Fri Apr 24 16:34:10 2020.

The paired Cohen's d between OMEGA and MOE is 0.182 [95%CI 0.0781, 0.285].
The p-value of the two-sided permutation t-test is 0.0014. 

5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
For each p-value, 5000 reshuffles of the control and test labels were performed.

To get the results of all valid statistical tests, use `.cohens_d.statistical_tests`
second.cohens_d
DABEST v0.3.0
=============
             
Good afternoon!
The current time is Fri Apr 24 16:34:12 2020.

The paired Cohen's d between OMEGA and Macromodel is 0.0295 [95%CI -0.0672, 0.122].
The p-value of the two-sided permutation t-test is 0.541. 

5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
For each p-value, 5000 reshuffles of the control and test labels were performed.

To get the results of all valid statistical tests, use `.cohens_d.statistical_tests`
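The two loads above follow one pattern: the first time point is paired with each later one. With more time points, the idx tuples could be generated rather than typed out. A minimal sketch, assuming the group names are listed in time order; `baseline_pairs` is a hypothetical helper, not a DABEST function.

```python
def baseline_pairs(groups):
    """Pair the first group with each later group.

    Hypothetical helper: groups is a sequence of column names in
    time order, with the baseline (t=0) first.
    """
    baseline, *rest = groups
    return [(baseline, later) for later in rest]


time_points = ["OMEGA", "MOE", "Macromodel"]
pairs = baseline_pairs(time_points)
print(pairs)  # [('OMEGA', 'MOE'), ('OMEGA', 'Macromodel')]

# Each pair would then get its own paired load, as in the calls above:
# analyses = [dabest.load(new_df, idx=p, id_col="id", paired=True)
#             for p in pairs]
```

Each tuple contains distinct groups, so every load is a strict two-group paired comparison.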

To plot them alongside each other:

import matplotlib.pyplot as plt
import seaborn as sns
%config InlineBackend.figure_format = 'retina'

sns.set(context="talk")
f, axx = plt.subplots(ncols=2, figsize=(10, 7),
                      # Adjust the width-wise spacing
                      gridspec_kw={"wspace":0.5})

plot_kwargs = dict(float_contrast=False, 
                    contrast_ylim=(0, 0.7))

first.cohens_d.plot(ax=axx[0], **plot_kwargs);
second.cohens_d.plot(ax=axx[1], **plot_kwargs);


You can read more here.

Hope this helps!

from dabest-python.

josesho commented on May 30, 2024

Hi @paul-hawkins,

Are you asking how to compute paired Cohen's d? Or are you saying that the paired Cohen's d returned by DABEST is not actually paired?

If your question is the first one, simply load the experiment as a paired experiment:

import pandas as pd
import dabest

# Load the iris dataset. Requires internet access.
iris = pd.read_csv("https://github.com/mwaskom/seaborn-data/raw/master/iris.csv")
iris.reset_index(inplace=True)

virginica = iris[iris.species=="virginica"].copy()

virginica_melted = pd.melt(virginica, 
                           id_vars="index", 
                           value_vars=["sepal_length", "petal_length"],
                           var_name="flower_part",
                           value_name="width")

virginica_paired = dabest.load(data=virginica_melted,  x="flower_part", y="width",
                                 paired=True, id_col="index",
                                 idx=("sepal_length", "petal_length"))

then produce the Cohen's d:

virginica_paired.cohens_d
DABEST v0.3.0
=============
             
Good evening!
The current time is Thu Apr 23 18:44:08 2020.

The paired Cohen's d between sepal_length and petal_length is -1.74 [95%CI -2.1, -1.37].
The p-value of the two-sided permutation t-test is 0.0. 

5000 bootstrap samples were taken; the confidence interval is bias-corrected and accelerated.
The p-value(s) reported are the likelihood(s) of observing the effect size(s),
if the null hypothesis of zero difference is true.
For each p-value, 5000 reshuffles of the control and test labels were performed.

To get the results of all valid statistical tests, use `.cohens_d.statistical_tests`

If you are saying the latter (i.e. the paired Cohen's d returned by DABEST is not actually paired), could you provide a dummy dataset with the values you expect, alongside what DABEST produces? Thanks!
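For such a check, expected values on a small dummy dataset can be worked out by hand. The sketch below uses one common definition of paired Cohen's d: the mean of the within-subject differences divided by the average of the two groups' standard deviations. That standardiser is an assumption on my part here, so please compare it against the formula in the DABEST source before treating any mismatch as a bug. The function name and the toy numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def paired_cohens_d(control, test):
    """Mean paired difference divided by the average SD.

    Assumed definition, for hand-checking only:
    d = mean(test - control) / sqrt((var(control) + var(test)) / 2)
    using sample variances (n - 1 in the denominator).
    """
    if len(control) != len(test):
        raise ValueError("Paired data must have equal lengths.")
    diffs = [t - c for c, t in zip(control, test)]
    average_sd = sqrt((variance(control) + variance(test)) / 2)
    return mean(diffs) / average_sd


# Toy data: four subjects measured before and after.
before = [1, 2, 3, 4]
after = [2, 3, 4, 6]
print(round(paired_cohens_d(before, after), 3))  # 0.826
```

The rows must be in the same subject order in both lists, which is the role id_col plays in dabest.load.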


paul-hawkins commented on May 30, 2024


maiyishan commented on May 30, 2024

Hi @paul-hawkins,

I hope Joses sufficiently answered your question.

Just to let you know, we have just released a new version of DABEST, and you will have to use paired="baseline" or paired="sequential" for future paired comparisons. Please see the new documentation for details.

I will now be closing this issue. Thank you!

