correlation's Introduction

easystats: An R Framework for Easy Statistical Modeling, Visualization, and Reporting

What is easystats?

easystats is a collection of R packages, which aims to provide a unifying and consistent framework to tame, discipline, and harness the scary R statistics and their pesky models.

However, there is not (yet) a unique “easystats” way of doing data analysis. Instead, start with one package and, when you face a new challenge, check whether there is an easystats answer for it in the other packages. You will slowly discover how using them together makes your life easier. And, who knows, you might even end up using them all.

Installation

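Type        | Source     | Command
-------------------------------------------------------------------------------------------------------
Release     | CRAN       | install.packages("easystats")
Development | r-universe | install.packages("easystats", repos = "https://easystats.r-universe.dev")
Development | GitHub     | remotes::install_github("easystats/easystats")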

Finally, easystats sometimes depends on additional packages for specific functions, and these are not downloaded by default. If you want to benefit from the full easystats experience without any hiccups, simply run the following:

easystats::install_suggested()

Citation

To cite the package, run the following command:

citation("easystats")
To cite easystats in publications use:

  Lüdecke, Patil, Ben-Shachar, Wiernik, Bacher, Thériault, & Makowski
  (2022). easystats: Framework for Easy Statistical Modeling,
  Visualization, and Reporting. CRAN.
  doi:10.32614/CRAN.package.easystats
  <https://doi.org/10.32614/CRAN.package.easystats>

A BibTeX entry for LaTeX users is

  @Article{,
    title = {easystats: Framework for Easy Statistical Modeling, Visualization, and Reporting},
    author = {Daniel Lüdecke and Mattan S. Ben-Shachar and Indrajeet Patil and Brenton M. Wiernik and Etienne Bacher and Rémi Thériault and Dominique Makowski},
    journal = {CRAN},
    doi = {https://doi.org/10.32614/CRAN.package.easystats},
    year = {2022},
    note = {R package},
    url = {https://easystats.github.io/easystats/},
  }

If you want to do this only for certain packages in the ecosystem, have a look at this article on how you can do so! https://easystats.github.io/easystats/articles/citation.html

Getting started

Each easystats package has a different scope and purpose. This means your best way to start is to explore and pick the one(s) that you feel might be useful to you. However, as they are built with a “bigger picture” in mind, you will realize that using more of them creates a smooth workflow, as these packages are meant to work together. Ideally, these packages work in unison to cover all aspects of statistical analysis and data visualization.

  • report: 📜 🎉 Automated statistical reporting of objects in R
  • correlation: 🔗 Your all-in-one package to run correlations
  • modelbased: 📈 Estimate effects, group averages and contrasts between groups based on statistical models
  • bayestestR: 👻 Great for beginners or experts of Bayesian statistics
  • effectsize: 🐉 Compute, convert, interpret and work with indices of effect size and standardized parameters
  • see: 🎨 The plotting companion to create beautiful results visualizations
  • parameters: 📊 Obtain a table containing all information about the parameters of your models
  • performance: 💪 Models’ quality and performance metrics (R2, ICC, LOO, AIC, BF, …)
  • insight: 🔮 For developers, a package to help you work with different models and packages
  • datawizard: 🧙 Magic potions to clean and transform your data

Frequently Asked Questions

How is easystats different from the tidyverse?

You’ve probably already heard about the tidyverse, another very popular collection of packages (ggplot2, dplyr, tidyr, …) that also makes using R easier. So, should you pick the tidyverse or easystats? Pick both!

Indeed, these two ecosystems have been designed with very different goals in mind. The tidyverse packages are primarily made to create a new R experience, where data manipulation and exploration are intuitive and consistent. On the other hand, easystats focuses more on the final stretch of the analysis: understanding and interpreting your results and reporting them in a manuscript or a report, while following best practices. You can definitely use the easystats functions in a tidyverse workflow!

easystats + tidyverse = ❤️

Can easystats be useful to advanced users and/or developers?

Yes, definitely! easystats is built in terms of modules that are general enough to be used inside other packages. For instance, the insight package is made to easily implement support for post-processing of pretty much all regression model packages under the sun. We use it in all the easystats packages, but it is also used in other non-easystats packages, such as ggstatsplot, modelsummary, ggeffects, and more.

So why not in yours?

Moreover, the easystats packages are very lightweight, with a minimal set of dependencies, which again makes them a good choice if you want to rely on them.

Documentation

Websites

Each easystats package has a dedicated website.

For example, the website for parameters is https://easystats.github.io/parameters/.

Blog

In addition to the websites containing documentation for these packages, you can also read posts from the easystats blog: https://easystats.github.io/blog/posts/.

Other learning resources

In addition to these websites and blog posts, you can also check out the following presentations and talks to learn more about this ecosystem:

https://easystats.github.io/easystats/articles/resources.html

Dependencies

easystats packages are designed to be lightweight, i.e., they don’t have any third-party hard dependencies other than base-R packages or other easystats packages! If you develop R packages, this means that you can safely use easystats packages as dependencies in your own packages without the risk of ending up in dependency hell.

library(deepdep)

plot_dependencies("easystats", depth = 2, show_stamp = FALSE)

As we can see, the only exception is the {see} package, which is responsible for plotting and creating figures and relies on {ggplot2}, which does have a substantial number of dependencies.

Usage

Total downloads

Package     | Downloads
------------------------
Total       | 22,663,701
insight     | 6,645,177
datawizard  | 4,004,503
parameters  | 2,765,087
performance | 2,682,787
bayestestR  | 2,629,672
effectsize  | 2,087,821
correlation | 690,405
see         | 572,721
modelbased  | 338,671
report      | 186,895
easystats   | 59,962

Contributing

We are happy to receive bug reports, suggestions, questions, and (most of all) contributions to fix problems and add features. Pull Requests for contributions are encouraged.

Here are some simple ways in which you can contribute (in increasing order of commitment):

  • Read and correct any inconsistencies in the documentation
  • Raise issues about bugs or wanted features
  • Review code
  • Add new functionality

Code of Conduct

Please note that the ‘easystats’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

correlation's People

Contributors

bwiernik, dominiquemakowski, etiennebacher, github-actions[bot], hy4m, indrajeetpatil, mattansb, rempsyc, strengejacke

correlation's Issues

Error in attributes(x)$ci * 100 : non-numeric argument to binary operator

I just installed the package and, when trying to replicate the example, got an error:

library(correlation)
correlation(iris)
Error in attributes(x)$ci * 100 : non-numeric argument to binary operator

With artificial data, I got the same error:

df = data.frame(a=rnorm(100,1,1), b=rnorm(100,2,3))
correlation(df)
Error in attributes(x)$ci * 100 : non-numeric argument to binary operator

This was run in a clean R 3.6.1 session.

allow users to change the bending constant for % bend correlation

  • WRS2

library(correlation)
library(WRS2)

df <- dplyr::select(ggplot2::msleep, c(sleep_rem, awake:bodywt))

set.seed(123)
pball(df, beta = 0.1)
#> Call:
#> pball(x = df, beta = 0.1)
#> 
#> Robust correlation matrix:
#>           sleep_rem   awake brainwt  bodywt
#> sleep_rem    1.0000 -0.7669 -0.3956 -0.4226
#> awake       -0.7669  1.0000  0.5697  0.5303
#> brainwt     -0.3956  0.5697  1.0000  0.8680
#> bodywt      -0.4226  0.5303  0.8680  1.0000
#> 
#> p-values:
#>           sleep_rem awake brainwt  bodywt
#> sleep_rem        NA     0 0.00538 0.00069
#> awake       0.00000    NA 0.00000 0.00000
#> brainwt     0.00538     0      NA 0.00000
#> bodywt      0.00069     0 0.00000      NA
#> 
#> 
#> Test statistic H: Inf, p-value = 0

set.seed(123)
pball(df, beta = 0.5)
#> Call:
#> pball(x = df, beta = 0.5)
#> 
#> Robust correlation matrix:
#>           sleep_rem   awake brainwt  bodywt
#> sleep_rem    1.0000 -0.6882 -0.4020 -0.4009
#> awake       -0.6882  1.0000  0.4959  0.4673
#> brainwt     -0.4020  0.4959  1.0000  0.9466
#> bodywt      -0.4009  0.4673  0.9466  1.0000
#> 
#> p-values:
#>           sleep_rem awake brainwt  bodywt
#> sleep_rem        NA 0e+00 0.00463 0.00136
#> awake       0.00000    NA 0.00010 0.00001
#> brainwt     0.00463 1e-04      NA 0.00000
#> bodywt      0.00136 1e-05 0.00000      NA
#> 
#> 
#> Test statistic H: Inf, p-value = 0

  • correlation

Correlation coefficients are identical for different betas.

set.seed(123)
correlation(df, method = "percentage", beta = 0.1)
#> Parameter1 | Parameter2 |     r |     t | df |      p |         95% CI |          Method | n_Obs
#> ------------------------------------------------------------------------------------------------
#> sleep_rem  |      awake | -0.75 | -8.79 | 59 | < .001 | [-0.84, -0.62] | Percentage Bend |    61
#> sleep_rem  |    brainwt | -0.41 | -3.04 | 46 | 0.004  | [-0.62, -0.14] | Percentage Bend |    48
#> sleep_rem  |     bodywt | -0.40 | -3.38 | 59 | 0.003  | [-0.59, -0.17] | Percentage Bend |    61
#> awake      |    brainwt |  0.59 |  5.36 | 54 | < .001 | [ 0.39,  0.74] | Percentage Bend |    56
#> awake      |     bodywt |  0.51 |  5.31 | 81 | < .001 | [ 0.33,  0.65] | Percentage Bend |    83
#> brainwt    |     bodywt |  0.92 | 16.71 | 54 | < .001 | [ 0.86,  0.95] | Percentage Bend |    56

set.seed(123)
correlation(df, method = "percentage", beta = 0.5)
#> Parameter1 | Parameter2 |     r |     t | df |      p |         95% CI |          Method | n_Obs
#> ------------------------------------------------------------------------------------------------
#> sleep_rem  |      awake | -0.75 | -8.79 | 59 | < .001 | [-0.84, -0.62] | Percentage Bend |    61
#> sleep_rem  |    brainwt | -0.41 | -3.04 | 46 | 0.004  | [-0.62, -0.14] | Percentage Bend |    48
#> sleep_rem  |     bodywt | -0.40 | -3.38 | 59 | 0.003  | [-0.59, -0.17] | Percentage Bend |    61
#> awake      |    brainwt |  0.59 |  5.36 | 54 | < .001 | [ 0.39,  0.74] | Percentage Bend |    56
#> awake      |     bodywt |  0.51 |  5.31 | 81 | < .001 | [ 0.33,  0.65] | Percentage Bend |    83
#> brainwt    |     bodywt |  0.92 | 16.71 | 54 | < .001 | [ 0.86,  0.95] | Percentage Bend |    56

Created on 2020-03-20 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                                             
#>  version  R Under development (unstable) (2020-02-28 r77874)
#>  os       Windows 10 x64                                    
#>  system   x86_64, mingw32                                   
#>  ui       RTerm                                             
#>  language (EN)                                              
#>  collate  English_United States.1252                        
#>  ctype    English_United States.1252                        
#>  tz       Europe/Berlin                                     
#>  date     2020-03-20                                        
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version    date       lib source                                
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.0.0)                        
#>  backports     1.1.5      2019-10-02 [1] CRAN (R 4.0.0)                        
#>  bayestestR    0.5.2.1    2020-03-16 [1] Github (easystats/bayestestR@6ee7e37) 
#>  callr         3.4.2      2020-02-12 [1] CRAN (R 4.0.0)                        
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 4.0.0)                        
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 4.0.0)                        
#>  correlation * 0.1.0      2020-03-17 [1] Github (easystats/correlation@c1c35b0)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 4.0.0)                        
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 4.0.0)                        
#>  devtools      2.2.2      2020-02-17 [1] CRAN (R 4.0.0)                        
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 4.0.0)                        
#>  dplyr         0.8.5      2020-03-07 [1] CRAN (R 4.0.0)                        
#>  effectsize    0.2.0.1    2020-03-06 [1] Github (easystats/effectsize@64bfbc3) 
#>  ellipsis      0.3.0      2019-09-20 [1] CRAN (R 4.0.0)                        
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.0.0)                        
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 4.0.0)                        
#>  fs            1.3.2      2020-03-05 [1] CRAN (R 4.0.0)                        
#>  ggplot2       3.3.0      2020-03-05 [1] CRAN (R 4.0.0)                        
#>  glue          1.3.2      2020-03-12 [1] CRAN (R 4.0.0)                        
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 4.0.0)                        
#>  highr         0.8        2019-03-20 [1] CRAN (R 4.0.0)                        
#>  htmltools     0.4.0      2019-10-04 [1] CRAN (R 4.0.0)                        
#>  insight       0.8.2.1    2020-03-16 [1] Github (easystats/insight@e0b229b)    
#>  knitr         1.28       2020-02-06 [1] CRAN (R 4.0.0)                        
#>  lifecycle     0.2.0.9000 2020-03-16 [1] Github (r-lib/lifecycle@355dcba)      
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 4.0.0)                        
#>  MASS          7.3-51.5   2019-12-20 [2] CRAN (R 4.0.0)                        
#>  mc2d          0.1-18     2017-03-06 [1] CRAN (R 4.0.0)                        
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 4.0.0)                        
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.0.0)                        
#>  mvtnorm       1.1-0      2020-02-24 [1] CRAN (R 4.0.0)                        
#>  parameters    0.6.0      2020-03-12 [1] CRAN (R 4.0.0)                        
#>  pillar        1.4.3      2019-12-20 [1] CRAN (R 4.0.0)                        
#>  pkgbuild      1.0.6      2019-10-09 [1] CRAN (R 4.0.0)                        
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.0.0)                        
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 4.0.0)                        
#>  plyr          1.8.6      2020-03-03 [1] CRAN (R 4.0.0)                        
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.0.0)                        
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 4.0.0)                        
#>  ps            1.3.2      2020-02-13 [1] CRAN (R 4.0.0)                        
#>  purrr         0.3.3      2019-10-18 [1] CRAN (R 4.0.0)                        
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 4.0.0)                        
#>  Rcpp          1.0.4      2020-03-17 [1] CRAN (R 4.0.0)                        
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 4.0.0)                        
#>  reshape       0.8.8      2018-10-23 [1] CRAN (R 4.0.0)                        
#>  rlang         0.4.5      2020-03-01 [1] CRAN (R 4.0.0)                        
#>  rmarkdown     2.1        2020-01-20 [1] CRAN (R 4.0.0)                        
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 4.0.0)                        
#>  scales        1.1.0      2019-11-18 [1] CRAN (R 4.0.0)                        
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.0.0)                        
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 4.0.0)                        
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.0.0)                        
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 4.0.0)                        
#>  tibble        2.1.3      2019-06-06 [1] CRAN (R 4.0.0)                        
#>  tidyselect    1.0.0      2020-01-27 [1] CRAN (R 4.0.0)                        
#>  usethis       1.5.1.9000 2020-03-18 [1] Github (r-lib/usethis@8c32c73)        
#>  vctrs         0.2.4      2020-03-10 [1] CRAN (R 4.0.0)                        
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 4.0.0)                        
#>  WRS2        * 1.0-0      2019-06-06 [1] CRAN (R 4.0.0)                        
#>  xfun          0.12       2020-01-13 [1] CRAN (R 4.0.0)                        
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.0.0)                        
#> 
#> [1] C:/Users/inp099/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-devel/library
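
To make the role of the bending constant concrete, here is a simplified base R sketch of the percentage bend correlation, written from the usual description of Wilcox's estimator. The function name `pb_correlation()` and the simulated data are hypothetical, complete cases are assumed, and this is not the code used by either correlation or WRS2; `WRS2::pbcor()` remains the reference implementation.

```r
# Percentage bend correlation (illustrative sketch, complete cases only).
# `beta` is the bending constant, with 0 < beta <= 0.5.
pb_correlation <- function(x, y, beta = 0.2) {
  stopifnot(length(x) == length(y), beta > 0, beta <= 0.5)

  # Centre a variable on its percentage bend location, scale it by omega,
  # and clip ("bend") the result to the interval [-1, 1].
  bend <- function(v) {
    n <- length(v)
    dev <- v - stats::median(v)
    # omega: the floor((1 - beta) * n)-th smallest absolute deviation
    omega <- sort(abs(dev))[floor((1 - beta) * n)]
    psi <- dev / omega
    n_low <- sum(psi < -1)
    n_high <- sum(psi > 1)
    inner <- v[psi >= -1 & psi <= 1]
    location <- (sum(inner) + omega * (n_high - n_low)) / (n - n_low - n_high)
    pmax(-1, pmin(1, (v - location) / omega))
  }

  a <- bend(x)
  b <- bend(y)
  sum(a * b) / sqrt(sum(a^2) * sum(b^2))
}

set.seed(123)
x <- c(rnorm(45), rnorm(5, mean = 8)) # a few extreme values in x
y <- 0.6 * x + rnorm(50)

r_01 <- pb_correlation(x, y, beta = 0.1)
r_05 <- pb_correlation(x, y, beta = 0.5)
c(beta_0.1 = r_01, beta_0.5 = r_05)
```

Because `beta` determines both the scale `omega` and how many observations are clipped, changing it should normally change the coefficient, which is the behaviour the `pball()` output above shows and the `correlation()` output does not.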

Scope

"wuut, yet another useless brick?"

@strengejacke don't worry this is a very small package with a very narrow focus, pretty much feature-complete, that just implements correlations (further to be displayed in nice tables through report) 😄

How is biserial correlation supposed to work?

data(iris)
x <- iris[iris$Species != "versicolor", ]
correlation::cor_test(x, "Species", "Sepal.Length", method = "biseral")
#> Error in match.arg(tolower(method), c("pearson", "kendall", "spearman"), : 'arg' should be one of "pearson", "kendall", "spearman"
correlation:::.cor_test_biserial(x, "Species", "Sepal.Length", method = "biseral")
#> Warning in Ops.factor(x, 1): '%%' not meaningful for factors
#> Error in if (all(x%%1 == 0)) {: missing value where TRUE/FALSE needed

Created on 2020-03-24 by the reprex package (v0.3.0)
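
As a point of reference for the simplest related case, the point-biserial correlation is just Pearson's r between a dichotomous variable coded 0/1 and a continuous variable. The base R sketch below assumes that coding (which species gets 1 is an arbitrary choice) and is not the package's internal implementation; note that the biserial coefficient proper is a different quantity, since it assumes a latent normal variable behind the dichotomy.

```r
data(iris)
x <- droplevels(iris[iris$Species != "versicolor", ])

# Code the two remaining species as 0/1 (virginica = 1 is an arbitrary choice)
group <- as.numeric(x$Species == "virginica")
y <- x$Sepal.Length

# Point-biserial correlation: Pearson's r with a 0/1 variable
r_pb <- cor(group, y)

# The same quantity from the textbook formula, using the population SD of y
p <- mean(group)
sd_pop <- sqrt(mean((y - mean(y))^2))
r_formula <- (mean(y[group == 1]) - mean(y[group == 0])) / sd_pop * sqrt(p * (1 - p))

c(pearson = r_pb, formula = r_formula)
```

Both routes give the same value, which can serve as a sanity check for whatever the package returns once the factor is converted to a numeric 0/1 variable.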

partial r hypothesis testing does not account for uncertainty in residualizing

Note the following examples:

res <- correlation::correlation(mtcars, partial = TRUE)
res[res$Parameter1=="mpg",]
#> Parameter1 | Parameter2 |     r |     t | df |     p |         95% CI |  Method | n_Obs
#> ---------------------------------------------------------------------------------------
#> mpg        |        cyl | -0.02 | -0.13 | 30 | 1.000 | [-0.37,  0.33] | Pearson |    32
#> mpg        |       disp |  0.16 |  0.89 | 30 | 1.000 | [-0.20,  0.48] | Pearson |    32
#> mpg        |         hp | -0.21 | -1.18 | 30 | 1.000 | [-0.52,  0.15] | Pearson |    32
#> mpg        |       drat |  0.10 |  0.58 | 30 | 1.000 | [-0.25,  0.44] | Pearson |    32
#> mpg        |         wt | -0.39 | -2.34 | 30 | 1.000 | [-0.65, -0.05] | Pearson |    32
#> mpg        |       qsec |  0.24 |  1.34 | 30 | 1.000 | [-0.12,  0.54] | Pearson |    32
#> mpg        |         vs |  0.03 |  0.18 | 30 | 1.000 | [-0.32,  0.38] | Pearson |    32
#> mpg        |         am |  0.26 |  1.46 | 30 | 1.000 | [-0.10,  0.56] | Pearson |    32
#> mpg        |       gear |  0.10 |  0.52 | 30 | 1.000 | [-0.26,  0.43] | Pearson |    32
#> mpg        |       carb | -0.05 | -0.29 | 30 | 1.000 | [-0.39,  0.30] | Pearson |    32

res <- ppcor::pcor(mtcars)
data.frame(r = res$estimate[-1,1],
           t = res$statistic[-1,1],
           p = res$p.value[-1,1])
#>                r          t          p
#> cyl  -0.02326429 -0.1066392 0.91608738
#> disp  0.16083460  0.7467585 0.46348865
#> hp   -0.21052027 -0.9868407 0.33495531
#> drat  0.10445452  0.4813036 0.63527790
#> wt   -0.39344938 -1.9611887 0.06325215
#> qsec  0.23809863  1.1234133 0.27394127
#> vs    0.03293117  0.1509915 0.88142347
#> am    0.25832849  1.2254035 0.23398971
#> gear  0.09534261  0.4389142 0.66520643
#> carb -0.05243662 -0.2406258 0.81217871

Created on 2020-04-06 by the reprex package (v0.3.0)

The resulting partial correlations are identical, but the t values are not (and, by extension, neither are the CIs and the unadjusted p values). Why?

This is because correlation() computes partial correlations by residualizing the variables and then computing the correlations between the residuals. However, the df of the residualizing process (that is, the degree of uncertainty in estimating the residuals) is not accounted for. (Note that this should be true for Bayesian partial correlations as well: the priors and likelihood of the residualizing process are not accounted for.)

Solutions:

  • Account for these. [HARD]
  • Update the docs to explicitly mention this - that inference and CIs are conditional on, and do not account for the uncertainty in estimating the residuals. [EASY]
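
The discrepancy can be reproduced by hand in base R. The sketch below residualizes `mpg` and `wt` on the remaining `mtcars` columns and compares the t statistic computed with df = n - 2 against one computed with df = n - 2 - k, where k is the number of variables partialled out. The variable names are illustrative, and the df correction shown is the standard one for partial correlations rather than something taken from the package source.

```r
# Partial correlation between mpg and wt, controlling for every other column
covars <- setdiff(names(mtcars), c("mpg", "wt"))
res_mpg <- resid(lm(reformulate(covars, response = "mpg"), data = mtcars))
res_wt <- resid(lm(reformulate(covars, response = "wt"), data = mtcars))

r <- cor(res_mpg, res_wt)
n <- nrow(mtcars)
k <- length(covars) # number of variables partialled out (9 here)

# t statistic that treats the residuals as if they were raw data (df = n - 2)
t_naive <- r * sqrt((n - 2) / (1 - r^2))

# t statistic that also removes the df spent on residualizing (df = n - 2 - k)
t_adjusted <- r * sqrt((n - 2 - k) / (1 - r^2))

# r is about -0.39 in both cases;
# t_naive is about -2.34 (the correlation() row for mpg and wt above),
# t_adjusted is about -1.96 (the ppcor::pcor() row for wt above)
round(c(r = r, t_naive = t_naive, t_adjusted = t_adjusted), 2)
```

The estimate itself is untouched; only the degrees of freedom used for the test statistic, and therefore the p value and CI, differ.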

Pearson's r different depending on the `bayesian` argument's truth value

Pearson's correlation coefficient estimate is different depending on whether bayesian is set to TRUE or FALSE:

library(correlation)

set.seed(123)
tibble::as_tibble(correlation(iris, method = "pearson"))
#> # A tibble: 6 x 10
#>   Parameter1  Parameter2      r     t    df        p CI_low CI_high Method n_Obs
#>   <chr>       <chr>       <dbl> <dbl> <int>    <dbl>  <dbl>   <dbl> <chr>  <int>
#> 1 Sepal.Leng~ Sepal.Wid~ -0.118 -1.44   148 1.52e- 1 -0.273  0.0435 Pears~   150
#> 2 Sepal.Leng~ Petal.Len~  0.872 21.6    148 5.19e-47  0.827  0.906  Pears~   150
#> 3 Sepal.Leng~ Petal.Wid~  0.818 17.3    148 9.30e-37  0.757  0.865  Pears~   150
#> 4 Sepal.Width Petal.Len~ -0.428 -5.77   148 1.35e- 7 -0.551 -0.288  Pears~   150
#> 5 Sepal.Width Petal.Wid~ -0.366 -4.79   148 8.15e- 6 -0.497 -0.219  Pears~   150
#> 6 Petal.Leng~ Petal.Wid~  0.963 43.4    148 2.81e-85  0.949  0.973  Pears~   150

set.seed(123)
tibble::as_tibble(correlation(iris, method = "pearson", bayesian = TRUE))
#> Loading required namespace: BayesFactor
#> # A tibble: 6 x 12
#>   Parameter1 Parameter2    rho CI_low CI_high    pd ROPE_Percentage       BF
#>   <chr>      <chr>       <dbl>  <dbl>   <dbl> <dbl>           <dbl>    <dbl>
#> 1 Sepal.Len~ Sepal.Wid~ -0.114 -0.236  0.0189 0.924           0.443 5.09e- 1
#> 2 Sepal.Len~ Petal.Len~  0.863  0.827  0.895  1               0     2.14e+43
#> 3 Sepal.Len~ Petal.Wid~  0.806  0.759  0.850  1               0     2.62e+33
#> 4 Sepal.Wid~ Petal.Len~ -0.415 -0.517 -0.306  1               0     3.49e+ 5
#> 5 Sepal.Wid~ Petal.Wid~ -0.349 -0.462 -0.240  1               0     5.29e+ 3
#> 6 Petal.Len~ Petal.Wid~  0.959  0.949  0.969  1               0     1.24e+80
#> # ... with 4 more variables: Prior_Distribution <chr>, Prior_Location <dbl>,
#> #   Prior_Scale <dbl>, n_Obs <int>

Created on 2020-03-19 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                                             
#>  version  R Under development (unstable) (2020-02-28 r77874)
#>  os       Windows 10 x64                                    
#>  system   x86_64, mingw32                                   
#>  ui       RTerm                                             
#>  language (EN)                                              
#>  collate  English_United States.1252                        
#>  ctype    English_United States.1252                        
#>  tz       Europe/Berlin                                     
#>  date     2020-03-19                                        
#> 
#> - Packages -------------------------------------------------------------------
#>  package      * version    date       lib
#>  assertthat     0.2.1      2019-03-21 [1]
#>  backports      1.1.5      2019-10-02 [1]
#>  BayesFactor    0.9.12-4.2 2018-05-19 [1]
#>  bayestestR     0.5.2.1    2020-03-16 [1]
#>  callr          3.4.2      2020-02-12 [1]
#>  cli            2.0.2      2020-02-28 [1]
#>  coda           0.19-3     2019-07-05 [1]
#>  correlation  * 0.1.0      2020-03-17 [1]
#>  crayon         1.3.4      2017-09-16 [1]
#>  desc           1.2.0      2018-05-01 [1]
#>  devtools       2.2.2      2020-02-17 [1]
#>  digest         0.6.25     2020-02-23 [1]
#>  effectsize     0.2.0.1    2020-03-06 [1]
#>  ellipsis       0.3.0      2019-09-20 [1]
#>  evaluate       0.14       2019-05-28 [1]
#>  fansi          0.4.1      2020-01-08 [1]
#>  fs             1.3.2      2020-03-05 [1]
#>  glue           1.3.2      2020-03-12 [1]
#>  gtools         3.8.1      2018-06-26 [1]
#>  highr          0.8        2019-03-20 [1]
#>  htmltools      0.4.0      2019-10-04 [1]
#>  insight        0.8.2.1    2020-03-16 [1]
#>  knitr          1.28       2020-02-06 [1]
#>  lattice        0.20-40    2020-02-19 [2]
#>  magrittr       1.5        2014-11-22 [1]
#>  Matrix         1.2-18     2019-11-27 [2]
#>  MatrixModels   0.4-1      2015-08-22 [1]
#>  memoise        1.1.0      2017-04-21 [1]
#>  mvtnorm        1.1-0      2020-02-24 [1]
#>  parameters     0.6.0      2020-03-12 [1]
#>  pbapply        1.4-2      2019-08-31 [1]
#>  pillar         1.4.3      2019-12-20 [1]
#>  pkgbuild       1.0.6      2019-10-09 [1]
#>  pkgconfig      2.0.3      2019-09-22 [1]
#>  pkgload        1.0.2      2018-10-29 [1]
#>  prettyunits    1.1.1      2020-01-24 [1]
#>  processx       3.4.2      2020-02-09 [1]
#>  ps             1.3.2      2020-02-13 [1]
#>  R6             2.4.1      2019-11-12 [1]
#>  Rcpp           1.0.4      2020-03-17 [1]
#>  remotes        2.1.1      2020-02-15 [1]
#>  rlang          0.4.5      2020-03-01 [1]
#>  rmarkdown      2.1        2020-01-20 [1]
#>  rprojroot      1.3-2      2018-01-03 [1]
#>  sessioninfo    1.1.1      2018-11-05 [1]
#>  stringi        1.4.6      2020-02-17 [1]
#>  stringr        1.4.0      2019-02-10 [1]
#>  testthat       2.3.2      2020-03-02 [1]
#>  tibble         2.1.3      2019-06-06 [1]
#>  usethis        1.5.1.9000 2020-03-18 [1]
#>  utf8           1.1.4      2018-05-24 [1]
#>  vctrs          0.2.4      2020-03-10 [1]
#>  withr          2.1.2      2018-03-15 [1]
#>  xfun           0.12       2020-01-13 [1]
#>  yaml           2.2.1      2020-02-01 [1]
#>  source                                
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (easystats/bayestestR@6ee7e37) 
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (easystats/correlation@c1c35b0)
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (easystats/effectsize@64bfbc3) 
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (easystats/insight@e0b229b)    
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (r-lib/usethis@8c32c73)        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#> 
#> [1] C:/Users/inp099/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-devel/library

This is unexpected. The estimate for the association shouldn't change even if the hypothesis testing framework changes.

For example, here is what JASP produces for the same analyses (the estimates are identical):

[Image: JASP output for the same analyses]

Am I missing something?

replace `as.table`?

This seems problematic to me:

library(correlation)
cor <- correlation(iris)

class(as.table(cor))
#> [1] "easycormatrix" "data.frame"

(and not a table.)

I suggest changing as.matrix.easycorrelation in two ways:

  1. Add a redundant argument (default FALSE?).
  2. Add attributes that can be used by print.easycormatrix to give the same printing as is currently given?
    • Or have a stars argument that, when TRUE, returns a character matrix with the stars baked in?

remove dplyr dep

There is again a small legacy dplyr usage in correlation (the correlation.R file) for grouped data frames that needs to be removed. Master @strengejacke, is it as straightforward as in the other cases?

Bootstrapped Mahalanobis distance

In the paper about Shepherd's pi correlation (#15), they say:

The Mahalanobis distance (in squared units) measures the distance in multivariate space taking into account the covariance structure of the data. Because a few extreme outliers can skew the covariance estimate, Dm is not robust. We therefore bootstrap the Mahalanobis distance by resampling n observations with replacement (i.e., allowing duplicates) and then calculating the Mahalanobis distance for each actual observation from the bivariate mean of the resampled data. The bootstrapped Mahalanobis distance, Ds, for each observation is the mean across the distances from all resamples.

I tried to add a bootstrapped Mahalanobis distance, but something's wrong (the indices are roughly the same for all observations).

Does anyone have an idea? I added the basis of the function (currently on dev):

.distance_mahalanobis <- function(data, indices = 1:nrow(data), ...) {
  dat <- data[indices, ] # allows boot to select sample
  row.names(dat) <- NULL
  stats::mahalanobis(dat, center = colMeans(dat), cov = stats::cov(dat))
}

rez <- boot::boot(data = mtcars, statistic = .distance_mahalanobis, R = 1000, sim="permutation")
bayestestR::point_estimate(as.data.frame(rez$t), centrality="all")
#> # Point Estimates
#> 
#>  Parameter Median Mean MAP
#>         V1    9.5   11 9.0
#>         V2    9.9   11 9.0
#>         V3    9.5   11 9.0
#>         V4    9.5   11 9.0
#>         V5    9.9   11 9.0
#>         V6    9.5   11 9.0
#>         V7    9.9   11 9.0
#>         V8    9.9   11 9.0
#>         V9    9.5   11 9.0
#>        V10    9.5   10 9.0
#>        V11    9.5   11 9.0
#>        V12    9.7   11 9.0
#>        V13    9.9   11 9.0
#>        V14    9.9   11 9.0
#>        V15    9.9   10 9.0
#>        V16    9.5   10 9.0
#>        V17    9.9   11 9.0
#>        V18    9.5   11 9.0
#>        V19    9.9   11 9.0
#>        V20    9.9   11 9.0
#>        V21    9.5   11 9.0
#>        V22    9.5   11 8.9
#>        V23    9.9   11 9.0
#>        V24    9.9   11 8.9
#>        V25    9.9   11 9.0
#>        V26    9.5   11 9.0
#>        V27    9.5   11 9.0
#>        V28    9.5   10 9.0
#>        V29    9.5   11 9.0
#>        V30    9.5   11 9.0
#>        V31    9.5   10 9.0
#>        V32    9.9   11 9.0

Created on 2019-12-09 by the reprex package (v0.3.0)
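
A possible explanation for the output above: with sim = "permutation", boot only reorders the rows, so every replicate contains the same set of distances in a different order, and each column of rez$t ends up with the same distribution. The statistic also measures the resampled rows against their own mean, whereas the quoted paper measures each actual observation against the mean of the resampled data. A minimal sketch of that reading in base R follows. The helper name bootstrap_mahalanobis is hypothetical, and using the covariance of the resample (rather than of the full data) is an assumption.

```r
# Hypothetical helper: bootstrapped Mahalanobis distance for each observation.
bootstrap_mahalanobis <- function(data, iterations = 1000) {
  data <- as.matrix(data)
  n <- nrow(data)
  distances <- replicate(iterations, {
    # Resample n rows with replacement (duplicates allowed)
    resampled <- data[sample(n, n, replace = TRUE), , drop = FALSE]
    # Distance of every ACTUAL observation from the resampled mean
    stats::mahalanobis(
      data,
      center = colMeans(resampled),
      cov = stats::cov(resampled)  # assumption: covariance of the resample
    )
  })
  # Average over resamples: one value per observation
  rowMeans(distances)
}

set.seed(123)
ds <- bootstrap_mahalanobis(mtcars[, c("mpg", "wt")], iterations = 500)
head(sort(ds, decreasing = TRUE))
```

Because the rows of the original data are kept in place, each element of the result stays tied to one observation, which is what allows outliers to stand out.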

Uncertainty/variability propagation: weighted correlations?

I remember mentioning this somewhere, but I'll rephrase it here for future reference:

Sometimes we want to correlate x and y, where x is, for instance, the mean score of some measure for an individual (e.g., the average reaction time in condition A). This value is sometimes accompanied by some measure of variability or uncertainty (for instance, the SD). We might want to take this information into account in the correlation, so that observations of x that are more precise (with lower associated variability) have more weight than observations with large uncertainty.

One alternative is to use weighted correlation:
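
As a minimal sketch of that idea, assuming inverse-variance weights (1 / SD^2, which is an assumption and not something settled in this thread) and made-up data, base R's stats::cov.wt() can return a weighted Pearson correlation:

```r
set.seed(42)
n <- 50

# Simulated data: x is a noisy per-individual mean with a known SD
x_sd   <- runif(n, 0.2, 2)            # per-observation uncertainty (made up)
true_x <- rnorm(n)
x <- true_x + rnorm(n, sd = x_sd)
y <- 0.6 * true_x + rnorm(n, sd = 0.5)

# Assumed weighting scheme: more precise observations get more weight
w <- 1 / x_sd^2

# cov.wt() rescales the weights to sum to 1 internally
weighted_r   <- stats::cov.wt(cbind(x, y), wt = w, cor = TRUE)$cor[1, 2]
unweighted_r <- stats::cor(x, y)

c(unweighted = unweighted_r, weighted = weighted_r)
```

With equal weights the weighted estimate reduces to the ordinary Pearson correlation, so the weights only matter when precision differs across observations.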

`correlation` function struggles with `purrr`

I am trying to use correlation in my package and had to use it with purrr and discovered this weird behavior. I am not sure what's the source of this error though.

  • data without NAs

Works as expected.

# setup
set.seed(123)
library(tidyverse)

# creating a list of dataframes
df_ls1 <- iris %>%
  split(x = ., f = .$Species, drop = TRUE) 

# running function of interest  
purrr::pmap(list(df_ls1), correlation::correlation)
#> $setosa
#> Parameter1   |   Parameter2 |    r |    t | df |      p |        95% CI |  Method | n_Obs
#> -----------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | 0.74 | 7.68 | 48 | < .001 | [ 0.59, 0.85] | Pearson |    50
#> Sepal.Length | Petal.Length | 0.27 | 1.92 | 48 | 0.202  | [-0.01, 0.51] | Pearson |    50
#> Sepal.Length |  Petal.Width | 0.28 | 2.01 | 48 | 0.202  | [ 0.00, 0.52] | Pearson |    50
#> Sepal.Width  | Petal.Length | 0.18 | 1.25 | 48 | 0.217  | [-0.11, 0.43] | Pearson |    50
#> Sepal.Width  |  Petal.Width | 0.23 | 1.66 | 48 | 0.208  | [-0.05, 0.48] | Pearson |    50
#> Petal.Length |  Petal.Width | 0.33 | 2.44 | 48 | 0.093  | [ 0.06, 0.56] | Pearson |    50
#> 
#> $versicolor
#> Parameter1   |   Parameter2 |    r |    t | df |      p |       95% CI |  Method | n_Obs
#> ----------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | 0.53 | 4.28 | 48 | < .001 | [0.29, 0.70] | Pearson |    50
#> Sepal.Length | Petal.Length | 0.75 | 7.95 | 48 | < .001 | [0.60, 0.85] | Pearson |    50
#> Sepal.Length |  Petal.Width | 0.55 | 4.52 | 48 | < .001 | [0.32, 0.72] | Pearson |    50
#> Sepal.Width  | Petal.Length | 0.56 | 4.69 | 48 | < .001 | [0.33, 0.73] | Pearson |    50
#> Sepal.Width  |  Petal.Width | 0.66 | 6.15 | 48 | < .001 | [0.47, 0.80] | Pearson |    50
#> Petal.Length |  Petal.Width | 0.79 | 8.83 | 48 | < .001 | [0.65, 0.87] | Pearson |    50
#> 
#> $virginica
#> Parameter1   |   Parameter2 |    r |     t | df |      p |       95% CI |  Method | n_Obs
#> -----------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | 0.46 |  3.56 | 48 | 0.003  | [0.20, 0.65] | Pearson |    50
#> Sepal.Length | Petal.Length | 0.86 | 11.90 | 48 | < .001 | [0.77, 0.92] | Pearson |    50
#> Sepal.Length |  Petal.Width | 0.28 |  2.03 | 48 | 0.048  | [0.00, 0.52] | Pearson |    50
#> Sepal.Width  | Petal.Length | 0.40 |  3.03 | 48 | 0.012  | [0.14, 0.61] | Pearson |    50
#> Sepal.Width  |  Petal.Width | 0.54 |  4.42 | 48 | < .001 | [0.31, 0.71] | Pearson |    50
#> Petal.Length |  Petal.Width | 0.32 |  2.36 | 48 | 0.045  | [0.05, 0.55] | Pearson |    50

  • data with NAs

# data with NAs

# creating a list of dataframes
df_ls2 <- ggplot2::msleep %>%
  split(x = ., f = .$vore, drop = TRUE)

# running function of interest  
purrr::pmap(list(df_ls2), correlation::correlation)
#> Error in rbind(deparse.level, ...): numbers of columns of arguments do not match

Created on 2020-03-20 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                                             
#>  version  R Under development (unstable) (2020-02-28 r77874)
#>  os       Windows 10 x64                                    
#>  system   x86_64, mingw32                                   
#>  ui       RTerm                                             
#>  language (EN)                                              
#>  collate  English_United States.1252                        
#>  ctype    English_United States.1252                        
#>  tz       Europe/Berlin                                     
#>  date     2020-03-20                                        
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version    date       lib source                                
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.0.0)                        
#>  backports     1.1.5      2019-10-02 [1] CRAN (R 4.0.0)                        
#>  bayestestR    0.5.2.1    2020-03-16 [1] Github (easystats/bayestestR@6ee7e37) 
#>  broom         0.5.3.9000 2020-03-01 [1] Github (tidymodels/broom@3c922d5)     
#>  callr         3.4.2      2020-02-12 [1] CRAN (R 4.0.0)                        
#>  cellranger    1.1.0      2016-07-27 [1] CRAN (R 4.0.0)                        
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 4.0.0)                        
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 4.0.0)                        
#>  correlation   0.1.0      2020-03-17 [1] Github (easystats/correlation@c1c35b0)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 4.0.0)                        
#>  DBI           1.1.0      2019-12-15 [1] CRAN (R 4.0.0)                        
#>  dbplyr        1.4.2      2019-06-17 [1] CRAN (R 4.0.0)                        
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 4.0.0)                        
#>  devtools      2.2.2      2020-02-17 [1] CRAN (R 4.0.0)                        
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 4.0.0)                        
#>  dplyr       * 0.8.5      2020-03-07 [1] CRAN (R 4.0.0)                        
#>  effectsize    0.2.0.1    2020-03-06 [1] Github (easystats/effectsize@64bfbc3) 
#>  ellipsis      0.3.0      2019-09-20 [1] CRAN (R 4.0.0)                        
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.0.0)                        
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 4.0.0)                        
#>  forcats     * 0.5.0      2020-03-01 [1] CRAN (R 4.0.0)                        
#>  fs            1.3.2      2020-03-05 [1] CRAN (R 4.0.0)                        
#>  generics      0.0.2      2018-11-29 [1] CRAN (R 4.0.0)                        
#>  ggplot2     * 3.3.0      2020-03-05 [1] CRAN (R 4.0.0)                        
#>  glue          1.3.2      2020-03-12 [1] CRAN (R 4.0.0)                        
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 4.0.0)                        
#>  haven         2.2.0      2019-11-08 [1] CRAN (R 4.0.0)                        
#>  highr         0.8        2019-03-20 [1] CRAN (R 4.0.0)                        
#>  hms           0.5.3      2020-01-08 [1] CRAN (R 4.0.0)                        
#>  htmltools     0.4.0      2019-10-04 [1] CRAN (R 4.0.0)                        
#>  httr          1.4.1      2019-08-05 [1] CRAN (R 4.0.0)                        
#>  insight       0.8.2.1    2020-03-16 [1] Github (easystats/insight@e0b229b)    
#>  jsonlite      1.6.1      2020-02-02 [1] CRAN (R 4.0.0)                        
#>  knitr         1.28       2020-02-06 [1] CRAN (R 4.0.0)                        
#>  lifecycle     0.2.0.9000 2020-03-16 [1] Github (r-lib/lifecycle@355dcba)      
#>  lubridate     1.7.4      2018-04-11 [1] CRAN (R 4.0.0)                        
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 4.0.0)                        
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 4.0.0)                        
#>  modelr        0.1.6      2020-02-22 [1] CRAN (R 4.0.0)                        
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 4.0.0)                        
#>  parameters    0.6.0      2020-03-12 [1] CRAN (R 4.0.0)                        
#>  pillar        1.4.3      2019-12-20 [1] CRAN (R 4.0.0)                        
#>  pkgbuild      1.0.6      2019-10-09 [1] CRAN (R 4.0.0)                        
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.0.0)                        
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 4.0.0)                        
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.0.0)                        
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 4.0.0)                        
#>  ps            1.3.2      2020-02-13 [1] CRAN (R 4.0.0)                        
#>  purrr       * 0.3.3      2019-10-18 [1] CRAN (R 4.0.0)                        
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 4.0.0)                        
#>  Rcpp          1.0.4      2020-03-17 [1] CRAN (R 4.0.0)                        
#>  readr       * 1.3.1      2018-12-21 [1] CRAN (R 4.0.0)                        
#>  readxl        1.3.1      2019-03-13 [1] CRAN (R 4.0.0)                        
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 4.0.0)                        
#>  reprex        0.3.0      2019-05-16 [1] CRAN (R 4.0.0)                        
#>  rlang         0.4.5      2020-03-01 [1] CRAN (R 4.0.0)                        
#>  rmarkdown     2.1        2020-01-20 [1] CRAN (R 4.0.0)                        
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 4.0.0)                        
#>  rvest         0.3.5      2019-11-08 [1] CRAN (R 4.0.0)                        
#>  scales        1.1.0      2019-11-18 [1] CRAN (R 4.0.0)                        
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.0.0)                        
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 4.0.0)                        
#>  stringr     * 1.4.0      2019-02-10 [1] CRAN (R 4.0.0)                        
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 4.0.0)                        
#>  tibble      * 2.1.3      2019-06-06 [1] CRAN (R 4.0.0)                        
#>  tidyr       * 1.0.2      2020-01-24 [1] CRAN (R 4.0.0)                        
#>  tidyselect    1.0.0      2020-01-27 [1] CRAN (R 4.0.0)                        
#>  tidyverse   * 1.3.0      2019-11-21 [1] CRAN (R 4.0.0)                        
#>  usethis       1.5.1.9000 2020-03-18 [1] Github (r-lib/usethis@8c32c73)        
#>  vctrs         0.2.4      2020-03-10 [1] CRAN (R 4.0.0)                        
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 4.0.0)                        
#>  xfun          0.12       2020-01-13 [1] CRAN (R 4.0.0)                        
#>  xml2          1.2.5      2020-03-11 [1] CRAN (R 4.0.0)                        
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.0.0)                        
#> 
#> [1] C:/Users/inp099/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-devel/library

robust correlations with 'ranktransform'

As shown by @lindeloev, Spearman correlations can be estimated using a rank transformation. Hence, now that we have ranktransform() in effectsize, it might be interesting to develop a more flexible framework for robust Spearman-like correlations that works with all the correlation types (Bayesian, partial, etc.).
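
The equivalence this relies on can be checked in base R. The sketch below uses base rank() as a stand-in for ranktransform() (an assumption about how the two would line up) and simulated data:

```r
set.seed(1)
x <- rexp(100)
y <- x^2 + rnorm(100)

# Spearman's rho as computed by stats::cor()
spearman <- stats::cor(x, y, method = "spearman")

# Pearson correlation on the rank-transformed variables
pearson_on_ranks <- stats::cor(rank(x), rank(y))

all.equal(spearman, pearson_on_ranks)
#> [1] TRUE
```

Since the second form is just a Pearson correlation on transformed data, the same transformation could in principle be placed in front of any other correlation type.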

CRAN initial submission

@strengejacke I think I might try adding a few more tests, and then go ahead with the submission. Although there are potential developments for this package (#2), I think it should be OK for an initial release. What do you think?

using correlation in shiny apps in shinyapps.io

Hello,
Thank you for the package.

Your package is working as expected in local shiny apps. But when I try to upload it to a shinyapps.io server, I get the following error:

installed from sources; Packrat will assume this package is available from a CRAN-like repository during future restores Execution halted

I think some metadata is missing from the package, as described in this post:
https://community.rstudio.com/t/error-when-using-devtools-install-github-with-shiny-for-private-repository/39053/2?u=serdarbalci

Best wishes

upper and lower matrix with different correlation coefficients

From strengejacke/sjPlot#31:

Would be nice if sjt.corr could display different correlation coefficients in the upper and lower triangle of the matrix. I think it is quite common to display Pearson and Spearman correlation coefficients and to do this in one matrix.

lower.tri(x, diag = FALSE) and upper.tri(x, diag = FALSE) might come in handy as mentioned here:
http://www.sthda.com/english/wiki/elegant-correlation-table-using-xtable-r-package
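
A small sketch of how lower.tri() and upper.tri() could be combined for this, using base R and iris purely as example data (the layout, Pearson below and Spearman above the diagonal, is an arbitrary choice):

```r
num <- iris[, 1:4]

pearson  <- stats::cor(num, method = "pearson")
spearman <- stats::cor(num, method = "spearman")

# Start from the Pearson matrix (keeps the lower triangle),
# then overwrite the upper triangle with the Spearman coefficients
combined <- pearson
combined[upper.tri(combined, diag = FALSE)] <- spearman[upper.tri(spearman, diag = FALSE)]
diag(combined) <- NA  # the diagonal is not informative here

round(combined, 2)
```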

Degrees of freedom for multilevel correlation

Hi - I noticed what looks like a potential bug in the multilevel correlation function. Specifically, rows that have a value for the factor but are missing both of the variables to be correlated still seem to be counted in the degrees of freedom. As a result, dropping (or arbitrarily adding) such empty rows changes the inferential statistics.

MWE:

library(tidyverse)
library(correlation)

# Full data frame
df1 <- data.frame("id" = factor(rep(letters[1:10], each = 10)), 
                 "V1" = rnorm(100, 0, 1), 
                 "V2" = rnorm(100, 0, 1))

correlation(df1, multilevel = TRUE)

# Introduce missingness
df2 <- df1
df2[sample(1:100, 10), c("V1","V2")] <- NA

correlation(df2, multilevel = TRUE)

# Drop rows with missingness
df3 <- df2 %>%
  drop_na()

correlation(df3, multilevel = TRUE)
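
To show what is being compared, here is a base R sketch that counts rows versus complete cases for a data set like the one above. It assumes n - 2 degrees of freedom as for a simple Pearson correlation, which is only an illustration; the multilevel method may define them differently.

```r
set.seed(2020)

df_full <- data.frame(
  id = factor(rep(letters[1:10], each = 10)),
  V1 = rnorm(100),
  V2 = rnorm(100)
)

# Blank out both variables in 10 rows, keeping the grouping factor
df_missing <- df_full
df_missing[sample(1:100, 10), c("V1", "V2")] <- NA

n_rows     <- nrow(df_missing)
n_complete <- sum(stats::complete.cases(df_missing[, c("V1", "V2")]))

# Assumed df = n - 2: counting all rows versus only usable rows
c(df_all_rows = n_rows - 2, df_complete_rows = n_complete - 2)
```

If the second and third calls in the MWE report different degrees of freedom, the first of these two counts is probably the one being used.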

Printing rounds incorrectly?

Following @profandyfield's remarks, there seems to be a small discrepancy:

exam_tib <- readr::read_csv("http://discoveringstatistics.com/repository/dsr2/exam_anxiety.csv")
#> Parsed with column specification:
#> cols(
#>   id = col_double(),
#>   revise = col_double(),
#>   exam_grade = col_double(),
#>   anxiety = col_double(),
#>   sex = col_character()
#> )

data <- exam_tib[c("exam_grade", "revise", "anxiety")]

# With rounding
correlation::correlation(data, partial = TRUE)[c("r", "p", "t", "n_Obs", "Method")]
#> r     |      p |     t | n_Obs |  Method
#> ----------------------------------------
#> 0.13  | 0.182  |  1.35 |   103 | Pearson
#> -0.25 | 0.024  | -2.56 |   103 | Pearson
#> -0.65 | < .001 | -8.56 |   103 | Pearson

# Without
as.data.frame(correlation::correlation(data, partial = TRUE))[c("r", "p", "t", "n_Obs", "Method")]
#>            r            p         t n_Obs  Method
#> 1  0.1326783 1.815432e-01  1.345293   103 Pearson
#> 2 -0.2466658 2.402532e-02 -2.558002   103 Pearson
#> 3 -0.6485301 3.881594e-13 -8.562455   103 Pearson

ppcor::pcor.test(data$exam_grade, data$revise, data$anxiety)
#>    estimate   p.value statistic   n gp  Method
#> 1 0.1326783 0.1837308  1.338617 103  1 pearson
ppcor::pcor.test(data$exam_grade, data$anxiety, data$revise)
#>     estimate    p.value statistic   n gp  Method
#> 1 -0.2466658 0.01244581 -2.545307 103  1 pearson
ppcor::pcor.test(data$revise, data$anxiety, data$exam_grade)
#>     estimate      p.value statistic   n gp  Method
#> 1 -0.6485301 1.708019e-13 -8.519961 103  1 pearson

Created on 2020-04-15 by the reprex package (v0.3.0)
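
The estimates agree, so the difference sits in t and p. One way to look at it is to recompute t from the reported r values under two choices of degrees of freedom, n - 2 and n - 2 - k (with k = 1 control variable). This is only a sketch for checking the numbers above, not a statement about what either package does internally.

```r
# Partial correlations as reported above
r <- c(0.1326783, -0.2466658, -0.6485301)
n <- 103
k <- 1  # number of variables partialled out

# t statistic for a correlation coefficient with a given df
t_from_r <- function(r, df) r * sqrt(df / (1 - r^2))

t_df_101 <- t_from_r(r, n - 2)      # df = n - 2
t_df_100 <- t_from_r(r, n - 2 - k)  # df = n - 2 - k

data.frame(r = r, t_df_101 = t_df_101, t_df_100 = t_df_100)

# Two-sided p-values for df = n - 2, raw and Holm-adjusted
p_raw  <- 2 * stats::pt(-abs(t_df_101), df = n - 2)
p_holm <- stats::p.adjust(p_raw, method = "holm")
```

The df = n - 2 column reproduces the t values printed by correlation, and the df = n - 2 - k column reproduces the ppcor statistics. The remaining gap in the p-values appears consistent with a multiple-comparison adjustment on top of that, which p_raw and p_holm can be used to check.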

Partial / Rank correlation documentation

I think it should be made more explicit (especially for the Bayesian methods) how Partial and Rank correlation are actually obtained.

  • These are not "true" Bayesian rank correlations (which have some weird priors that I don't understand); instead, x & y are rank-transformed and then tested for a Pearson correlation.
  • And these are not "true" Bayesian partial correlations (no prior for the correlation between X<->Z or Z<->Y); instead, Z is partialled out of x & y using OLS, and the resulting residual scores are then tested for a Pearson correlation.
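
For the documentation, the residual approach in the second point could be illustrated along these lines. This is a base R sketch on simulated data that covers only the point estimate; the Bayesian test would then be run on the residuals and is not shown.

```r
set.seed(7)
z <- rnorm(200)
x <- 0.5 * z + rnorm(200)
y <- 0.5 * z + 0.3 * x + rnorm(200)

# Partial Z out of x and y with OLS, then correlate the residuals
x_res <- stats::residuals(stats::lm(x ~ z))
y_res <- stats::residuals(stats::lm(y ~ z))
partial_via_residuals <- stats::cor(x_res, y_res)

# Textbook first-order partial correlation formula for comparison
r_xy <- stats::cor(x, y)
r_xz <- stats::cor(x, z)
r_yz <- stats::cor(y, z)
partial_via_formula <- (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2))

all.equal(partial_via_residuals, partial_via_formula)
#> [1] TRUE
```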

difficulty in creating a matrix of p-values from `correlation`

This is in a way related to #12. If I wanted to create a correlation matrix visualization with ggcorrplot, I need both a matrix of correlations and p-values but as.matrix doesn't seem to work with the latter:

library(tidyverse)
library(correlation)

# formatting to respect current `ggcorrmat` defaults
df <-
  correlation::correlation(
    data = ggplot2::msleep,
    ci = "default",
    method = "pearson"
  )

# create a matrix of correlation values
df %>%  
  select(Parameter1, Parameter2, r) %>%
  as.matrix() 

#>             sleep_total  sleep_rem sleep_cycle      awake    brainwt     bodywt
#> sleep_total   1.0000000  0.7517550  -0.4737127 -0.9999986 -0.3604874 -0.3120106
#> sleep_rem     0.7517550  1.0000000  -0.3381235 -0.7517713 -0.2213348 -0.3276507
#> sleep_cycle  -0.4737127 -0.3381235   1.0000000  0.4737127  0.8516203  0.4178029
#> awake        -0.9999986 -0.7517713   0.4737127  1.0000000  0.3604874  0.3119801
#> brainwt      -0.3604874 -0.2213348   0.8516203  0.3604874  1.0000000  0.9337822
#> bodywt       -0.3120106 -0.3276507   0.4178029  0.3119801  0.9337822  1.0000000

# create a matrix of p-values
df %>%  
  select(Parameter1, Parameter2, p) %>%
  as.matrix() 

#> Error in frame[row, col] <- object[(object$Parameter1 == row & object$Parameter2 == : number of items to replace is not a multiple of replacement length

Created on 2020-02-27 by the reprex package (v0.3.0)
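
Until as.matrix handles this, one workaround is to pivot the long table by hand. Below is a base R sketch of that pivot on stand-in data built with cor.test(). The column names mirror the output above, and filling the diagonal with 0 is an assumption about what a plotting function would expect.

```r
# Stand-in for the long output: one row per pair with its p-value
vars  <- c("mpg", "hp", "wt", "qsec")
pairs <- utils::combn(vars, 2)
long <- data.frame(
  Parameter1 = pairs[1, ],
  Parameter2 = pairs[2, ],
  p = apply(pairs, 2, function(v) stats::cor.test(mtcars[[v[1]]], mtcars[[v[2]]])$p.value),
  stringsAsFactors = FALSE
)

# Square matrix; diagonal set to 0 (assumed convention)
p_mat <- matrix(0, nrow = length(vars), ncol = length(vars),
                dimnames = list(vars, vars))

# Fill both triangles using two-column character indexing
p_mat[cbind(long$Parameter1, long$Parameter2)] <- long$p
p_mat[cbind(long$Parameter2, long$Parameter1)] <- long$p

p_mat
```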

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.2 (2019-12-12)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Berlin               
#>  date     2020-02-27                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                                
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.6.0)                        
#>  backports     1.1.5      2019-10-02 [1] CRAN (R 3.6.0)                        
#>  bayestestR    0.5.2      2020-02-13 [1] Github (easystats/bayestestR@4350b4f) 
#>  broom         0.5.3.9000 2020-02-20 [1] Github (tidymodels/broom@3c922d5)     
#>  callr         3.4.2      2020-02-12 [1] CRAN (R 3.6.2)                        
#>  cellranger    1.1.0      2016-07-27 [1] CRAN (R 3.6.0)                        
#>  cli           2.0.1      2020-01-08 [1] CRAN (R 3.6.2)                        
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 3.6.0)                        
#>  correlation * 0.1.0      2020-02-27 [1] Github (easystats/correlation@f0ec824)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.6.0)                        
#>  DBI           1.1.0      2019-12-15 [1] CRAN (R 3.6.2)                        
#>  dbplyr        1.4.2      2019-06-17 [1] CRAN (R 3.6.0)                        
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.6.0)                        
#>  devtools      2.2.2      2020-02-17 [1] CRAN (R 3.6.2)                        
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 3.6.0)                        
#>  dplyr       * 0.8.4      2020-01-31 [1] CRAN (R 3.6.0)                        
#>  effectsize    0.2.0      2020-02-25 [1] CRAN (R 3.6.2)                        
#>  ellipsis      0.3.0      2019-09-20 [1] CRAN (R 3.6.0)                        
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 3.6.0)                        
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 3.6.2)                        
#>  forcats     * 0.4.0      2019-02-17 [1] CRAN (R 3.6.0)                        
#>  fs            1.3.1      2019-05-06 [1] CRAN (R 3.6.0)                        
#>  generics      0.0.2      2018-11-29 [1] CRAN (R 3.6.0)                        
#>  ggplot2     * 3.2.1      2019-08-10 [1] CRAN (R 3.6.0)                        
#>  glue          1.3.1      2019-03-12 [1] CRAN (R 3.6.0)                        
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 3.6.0)                        
#>  haven         2.2.0      2019-11-08 [1] CRAN (R 3.6.1)                        
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.6.0)                        
#>  hms           0.5.3      2020-01-08 [1] CRAN (R 3.6.2)                        
#>  htmltools     0.4.0      2019-10-04 [1] CRAN (R 3.6.0)                        
#>  httr          1.4.1      2019-08-05 [1] CRAN (R 3.6.0)                        
#>  insight       0.8.1.1    2020-02-20 [1] Github (easystats/insight@ff0c9a2)    
#>  jsonlite      1.6.1      2020-02-02 [1] CRAN (R 3.6.2)                        
#>  knitr         1.28       2020-02-06 [1] CRAN (R 3.6.2)                        
#>  lazyeval      0.2.2      2019-03-15 [1] CRAN (R 3.6.0)                        
#>  lifecycle     0.1.0      2019-08-01 [1] CRAN (R 3.6.0)                        
#>  lubridate     1.7.4      2018-04-11 [1] CRAN (R 3.6.0)                        
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.6.0)                        
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 3.6.0)                        
#>  modelr        0.1.6      2020-02-22 [1] CRAN (R 3.6.0)                        
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 3.6.0)                        
#>  parameters    0.5.0.1    2020-02-20 [1] Github (easystats/parameters@f62f3ea) 
#>  pillar        1.4.3      2019-12-20 [1] CRAN (R 3.6.2)                        
#>  pkgbuild      1.0.6      2019-10-09 [1] CRAN (R 3.6.0)                        
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 3.6.0)                        
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.6.0)                        
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 3.6.2)                        
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 3.6.2)                        
#>  ps            1.3.2      2020-02-13 [1] CRAN (R 3.6.2)                        
#>  purrr       * 0.3.3      2019-10-18 [1] CRAN (R 3.6.0)                        
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 3.6.0)                        
#>  Rcpp          1.0.3      2019-11-08 [1] CRAN (R 3.6.1)                        
#>  readr       * 1.3.1      2018-12-21 [1] CRAN (R 3.6.0)                        
#>  readxl        1.3.1      2019-03-13 [1] CRAN (R 3.6.0)                        
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 3.6.0)                        
#>  reprex        0.3.0      2019-05-16 [1] CRAN (R 3.6.0)                        
#>  rlang         0.4.4      2020-01-28 [1] CRAN (R 3.6.2)                        
#>  rmarkdown     2.1        2020-01-20 [1] CRAN (R 3.6.2)                        
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.6.0)                        
#>  rvest         0.3.5      2019-11-08 [1] CRAN (R 3.6.0)                        
#>  scales        1.1.0      2019-11-18 [1] CRAN (R 3.6.0)                        
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.6.0)                        
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 3.6.2)                        
#>  stringr     * 1.4.0      2019-02-10 [1] CRAN (R 3.6.0)                        
#>  testthat      2.3.1      2019-12-01 [1] CRAN (R 3.6.0)                        
#>  tibble      * 2.1.3      2019-06-06 [1] CRAN (R 3.6.0)                        
#>  tidyr       * 1.0.2      2020-01-24 [1] CRAN (R 3.6.2)                        
#>  tidyselect    1.0.0      2020-01-27 [1] CRAN (R 3.6.2)                        
#>  tidyverse   * 1.3.0      2019-11-21 [1] CRAN (R 3.6.0)                        
#>  usethis       1.5.1.9000 2020-02-18 [1] Github (r-lib/usethis@2a3d134)        
#>  vctrs         0.2.3      2020-02-20 [1] CRAN (R 3.6.2)                        
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.6.0)                        
#>  xfun          0.12       2020-01-13 [1] CRAN (R 3.6.2)                        
#>  xml2          1.2.2      2019-08-09 [1] CRAN (R 3.6.0)                        
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 3.6.0)                        
#> 
#> [1] /Users/patil/Library/R/3.6/library
#> [2] /Library/Frameworks/R.framework/Versions/3.6/Resources/library

including two columns for frequentist statistics: `p` and `p_adjusted`

It would be nice to have two columns here: p and p.adjusted, which would be identical only when p_adjust = "none". Maybe another column called p_value_adjustment containing details of the adjustment method would also be helpful (e.g., I do this for pairwise comparisons: https://indrajeetpatil.github.io/pairwiseComparisons/reference/pairwise_comparisons.html#examples)

Here initially I couldn't tell if the p-values were adjusted or not because they are so small, but having these new columns might avoid such confusion.

library(correlation)

correlation(iris, p_adjust = "none")
#> Parameter1   |   Parameter2 |     r |     t |  df |      p |         95% CI |  Method | n_Obs
#> ---------------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | -0.12 | -1.44 | 148 | 0.152  | [-0.27,  0.04] | Pearson |   150
#> Sepal.Length | Petal.Length |  0.87 | 21.65 | 148 | < .001 | [ 0.83,  0.91] | Pearson |   150
#> Sepal.Length |  Petal.Width |  0.82 | 17.30 | 148 | < .001 | [ 0.76,  0.86] | Pearson |   150
#> Sepal.Width  | Petal.Length | -0.43 | -5.77 | 148 | < .001 | [-0.55, -0.29] | Pearson |   150
#> Sepal.Width  |  Petal.Width | -0.37 | -4.79 | 148 | < .001 | [-0.50, -0.22] | Pearson |   150
#> Petal.Length |  Petal.Width |  0.96 | 43.39 | 148 | < .001 | [ 0.95,  0.97] | Pearson |   150

correlation(iris, p_adjust = "holm")
#> Parameter1   |   Parameter2 |     r |     t |  df |      p |         95% CI |  Method | n_Obs
#> ---------------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | -0.12 | -1.44 | 148 | 0.152  | [-0.27,  0.04] | Pearson |   150
#> Sepal.Length | Petal.Length |  0.87 | 21.65 | 148 | < .001 | [ 0.83,  0.91] | Pearson |   150
#> Sepal.Length |  Petal.Width |  0.82 | 17.30 | 148 | < .001 | [ 0.76,  0.86] | Pearson |   150
#> Sepal.Width  | Petal.Length | -0.43 | -5.77 | 148 | < .001 | [-0.55, -0.29] | Pearson |   150
#> Sepal.Width  |  Petal.Width | -0.37 | -4.79 | 148 | < .001 | [-0.50, -0.22] | Pearson |   150
#> Petal.Length |  Petal.Width |  0.96 | 43.39 | 148 | < .001 | [ 0.95,  0.97] | Pearson |   150

correlation(iris, p_adjust = "BH")
#> Parameter1   |   Parameter2 |     r |     t |  df |      p |         95% CI |  Method | n_Obs
#> ---------------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | -0.12 | -1.44 | 148 | 0.152  | [-0.27,  0.04] | Pearson |   150
#> Sepal.Length | Petal.Length |  0.87 | 21.65 | 148 | < .001 | [ 0.83,  0.91] | Pearson |   150
#> Sepal.Length |  Petal.Width |  0.82 | 17.30 | 148 | < .001 | [ 0.76,  0.86] | Pearson |   150
#> Sepal.Width  | Petal.Length | -0.43 | -5.77 | 148 | < .001 | [-0.55, -0.29] | Pearson |   150
#> Sepal.Width  |  Petal.Width | -0.37 | -4.79 | 148 | < .001 | [-0.50, -0.22] | Pearson |   150
#> Petal.Length |  Petal.Width |  0.96 | 43.39 | 148 | < .001 | [ 0.95,  0.97] | Pearson |   150

Created on 2020-03-19 by the reprex package (v0.3.0)
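
For illustration, the requested layout could look like the following base R sketch. The column names p_adjusted and p_adjustment are placeholders, not a decided naming scheme, and the p-values come from plain cor.test() calls.

```r
pairs <- utils::combn(names(iris)[1:4], 2)

# Unadjusted p-values, one per pair
p_raw <- apply(pairs, 2, function(v) {
  stats::cor.test(iris[[v[1]]], iris[[v[2]]])$p.value
})

adjustment <- "holm"

result <- data.frame(
  Parameter1   = pairs[1, ],
  Parameter2   = pairs[2, ],
  p            = p_raw,                                        # raw values
  p_adjusted   = stats::p.adjust(p_raw, method = adjustment),  # adjusted values
  p_adjustment = adjustment,                                   # method used
  stringsAsFactors = FALSE
)

result
```

Keeping the raw and adjusted values side by side makes it visible whether an adjustment was applied even when both print as "< .001".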

Session info
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                                             
#>  version  R Under development (unstable) (2020-02-28 r77874)
#>  os       Windows 10 x64                                    
#>  system   x86_64, mingw32                                   
#>  ui       RTerm                                             
#>  language (EN)                                              
#>  collate  English_United States.1252                        
#>  ctype    English_United States.1252                        
#>  tz       Europe/Berlin                                     
#>  date     2020-03-19                                        
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version    date       lib source                                
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.0.0)                        
#>  backports     1.1.5      2019-10-02 [1] CRAN (R 4.0.0)                        
#>  bayestestR    0.5.2.1    2020-03-16 [1] Github (easystats/bayestestR@6ee7e37) 
#>  callr         3.4.2      2020-02-12 [1] CRAN (R 4.0.0)                        
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 4.0.0)                        
#>  correlation * 0.1.0      2020-03-17 [1] Github (easystats/correlation@c1c35b0)
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 4.0.0)                        
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 4.0.0)                        
#>  devtools      2.2.2      2020-02-17 [1] CRAN (R 4.0.0)                        
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 4.0.0)                        
#>  effectsize    0.2.0.1    2020-03-06 [1] Github (easystats/effectsize@64bfbc3) 
#>  ellipsis      0.3.0      2019-09-20 [1] CRAN (R 4.0.0)                        
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.0.0)                        
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 4.0.0)                        
#>  fs            1.3.2      2020-03-05 [1] CRAN (R 4.0.0)                        
#>  glue          1.3.2      2020-03-12 [1] CRAN (R 4.0.0)                        
#>  highr         0.8        2019-03-20 [1] CRAN (R 4.0.0)                        
#>  htmltools     0.4.0      2019-10-04 [1] CRAN (R 4.0.0)                        
#>  insight       0.8.2.1    2020-03-16 [1] Github (easystats/insight@e0b229b)    
#>  knitr         1.28       2020-02-06 [1] CRAN (R 4.0.0)                        
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 4.0.0)                        
#>  memoise       1.1.0      2017-04-21 [1] CRAN (R 4.0.0)                        
#>  parameters    0.6.0      2020-03-12 [1] CRAN (R 4.0.0)                        
#>  pkgbuild      1.0.6      2019-10-09 [1] CRAN (R 4.0.0)                        
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 4.0.0)                        
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.0.0)                        
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 4.0.0)                        
#>  ps            1.3.2      2020-02-13 [1] CRAN (R 4.0.0)                        
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 4.0.0)                        
#>  Rcpp          1.0.4      2020-03-17 [1] CRAN (R 4.0.0)                        
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 4.0.0)                        
#>  rlang         0.4.5      2020-03-01 [1] CRAN (R 4.0.0)                        
#>  rmarkdown     2.1        2020-01-20 [1] CRAN (R 4.0.0)                        
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 4.0.0)                        
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.0.0)                        
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 4.0.0)                        
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.0.0)                        
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 4.0.0)                        
#>  usethis       1.5.1.9000 2020-03-18 [1] Github (r-lib/usethis@8c32c73)        
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 4.0.0)                        
#>  xfun          0.12       2020-01-13 [1] CRAN (R 4.0.0)                        
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.0.0)                        
#> 
#> [1] C:/Users/inp099/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-devel/library

feature request: consistently returning 95% CI across different correlation methods

I just randomly sampled a few methods here, but the point I am trying to make is that it would be nice for the output to look the same across methods, in particular with respect to the 95% CI column.

library(tidyverse)
library(correlation)

# select only numeric variables
df <- purrr::keep(ggplot2::msleep, is_bare_numeric)

# pearson (95% CI? : Yes)
correlation::correlation(df)
#> Parameter1  |  Parameter2 |     r |        t | df |      p |         95% CI |  Method
#> -------------------------------------------------------------------------------------
#> sleep_total |   sleep_rem |  0.75 |     8.76 | 59 | < .001 | [ 0.62,  0.84] | Pearson
#> sleep_total | sleep_cycle | -0.47 |    -2.95 | 30 | 0.049  | [-0.71, -0.15] | Pearson
#> sleep_total |       awake | -1.00 | -5328.71 | 81 | < .001 | [-1.00, -1.00] | Pearson
#> sleep_total |     brainwt | -0.36 |    -2.84 | 54 | 0.049  | [-0.57, -0.11] | Pearson
#> sleep_total |      bodywt | -0.31 |    -2.96 | 81 | 0.041  | [-0.49, -0.10] | Pearson
#> sleep_rem   | sleep_cycle | -0.34 |    -1.97 | 30 | 0.117  | [-0.61,  0.01] | Pearson
#> sleep_rem   |       awake | -0.75 |    -8.76 | 59 | < .001 | [-0.84, -0.62] | Pearson
#> sleep_rem   |     brainwt | -0.22 |    -1.54 | 46 | 0.131  | [-0.48,  0.07] | Pearson
#> sleep_rem   |      bodywt | -0.33 |    -2.66 | 59 | 0.049  | [-0.54, -0.08] | Pearson
#> sleep_cycle |       awake |  0.47 |     2.95 | 30 | 0.049  | [ 0.15,  0.71] | Pearson
#> sleep_cycle |     brainwt |  0.85 |     8.60 | 28 | < .001 | [ 0.71,  0.93] | Pearson
#> sleep_cycle |      bodywt |  0.42 |     2.52 | 30 | 0.052  | [ 0.08,  0.67] | Pearson
#> awake       |     brainwt |  0.36 |     2.84 | 54 | 0.049  | [ 0.11,  0.57] | Pearson
#> awake       |      bodywt |  0.31 |     2.96 | 81 | 0.041  | [ 0.10,  0.49] | Pearson
#> brainwt     |      bodywt |  0.93 |    19.18 | 54 | < .001 | [ 0.89,  0.96] | Pearson

# percentage bend (95% CI? : Yes)
correlation::correlation(df, method = "percentage")
#> Parameter1  |  Parameter2 |     r |        t | df |      p |         95% CI |          Method
#> ---------------------------------------------------------------------------------------------
#> sleep_total |   sleep_rem |  0.75 |     8.79 | 59 | < .001 | [ 0.62,  0.84] | Percentage_Bend
#> sleep_total | sleep_cycle | -0.51 |    -3.22 | 30 | 0.012  | [-0.73, -0.19] | Percentage_Bend
#> sleep_total |       awake | -1.00 | -6525.30 | 81 | < .001 | [-1.00, -1.00] | Percentage_Bend
#> sleep_total |     brainwt | -0.59 |    -5.36 | 54 | < .001 | [-0.74, -0.39] | Percentage_Bend
#> sleep_total |      bodywt | -0.51 |    -5.31 | 81 | < .001 | [-0.65, -0.33] | Percentage_Bend
#> sleep_rem   | sleep_cycle | -0.40 |    -2.37 | 30 | 0.025  | [-0.65, -0.06] | Percentage_Bend
#> sleep_rem   |       awake | -0.75 |    -8.79 | 59 | < .001 | [-0.84, -0.62] | Percentage_Bend
#> sleep_rem   |     brainwt | -0.41 |    -3.04 | 46 | 0.012  | [-0.62, -0.14] | Percentage_Bend
#> sleep_rem   |      bodywt | -0.40 |    -3.38 | 59 | 0.006  | [-0.59, -0.17] | Percentage_Bend
#> sleep_cycle |       awake |  0.51 |     3.22 | 30 | 0.012  | [ 0.19,  0.73] | Percentage_Bend
#> sleep_cycle |     brainwt |  0.89 |    10.45 | 28 | < .001 | [ 0.78,  0.95] | Percentage_Bend
#> sleep_cycle |      bodywt |  0.77 |     6.62 | 30 | < .001 | [ 0.58,  0.88] | Percentage_Bend
#> awake       |     brainwt |  0.59 |     5.36 | 54 | < .001 | [ 0.39,  0.74] | Percentage_Bend
#> awake       |      bodywt |  0.51 |     5.31 | 81 | < .001 | [ 0.33,  0.65] | Percentage_Bend
#> brainwt     |      bodywt |  0.92 |    16.71 | 54 | < .001 | [ 0.86,  0.95] | Percentage_Bend

# spearman (95% CI? : No)
correlation::correlation(df, method = "spearman")

#> Parameter1  |  Parameter2 |   rho |           S |      p |   Method
#> -------------------------------------------------------------------
#> sleep_total |   sleep_rem |  0.76 |     8920.08 | < .001 | Spearman
#> sleep_total | sleep_cycle | -0.49 |     8122.87 | 0.014  | Spearman
#> sleep_total |       awake | -1.00 | 1.90568e+05 | < .001 | Spearman
#> sleep_total |     brainwt | -0.59 |    46627.12 | < .001 | Spearman
#> sleep_total |      bodywt | -0.53 | 1.46223e+05 | < .001 | Spearman
#> sleep_rem   | sleep_cycle | -0.33 |     7280.52 | 0.061  | Spearman
#> sleep_rem   |       awake | -0.76 |    66719.92 | < .001 | Spearman
#> sleep_rem   |     brainwt | -0.41 |    26049.73 | 0.014  | Spearman
#> sleep_rem   |      bodywt | -0.45 |    54903.63 | 0.001  | Spearman
#> sleep_cycle |       awake |  0.49 |     2789.13 | 0.014  | Spearman
#> sleep_cycle |     brainwt |  0.87 |      572.26 | < .001 | Spearman
#> sleep_cycle |      bodywt |  0.85 |      837.92 | < .001 | Spearman
#> awake       |     brainwt |  0.59 |    11892.88 | < .001 | Spearman
#> awake       |      bodywt |  0.53 |    44345.02 | < .001 | Spearman
#> brainwt     |      bodywt |  0.96 |     1253.56 | < .001 | Spearman

# kendall (95% CI? : No)
correlation::correlation(df, method = "kendall")

#> Parameter1  |  Parameter2 |   tau |      z |      p |  Method
#> -------------------------------------------------------------
#> sleep_total |   sleep_rem |  0.59 |   6.64 | < .001 | Kendall
#> sleep_total | sleep_cycle | -0.35 |  -2.75 | 0.024  | Kendall
#> sleep_total |       awake | -1.00 | -13.30 | < .001 | Kendall
#> sleep_total |     brainwt | -0.43 |  -4.65 | < .001 | Kendall
#> sleep_total |      bodywt | -0.39 |  -5.14 | < .001 | Kendall
#> sleep_rem   | sleep_cycle | -0.21 |  -1.63 | 0.104  | Kendall
#> sleep_rem   |       awake | -0.59 |  -6.64 | < .001 | Kendall
#> sleep_rem   |     brainwt | -0.26 |  -2.61 | 0.024  | Kendall
#> sleep_rem   |      bodywt | -0.32 |  -3.56 | 0.002  | Kendall
#> sleep_cycle |       awake |  0.35 |   2.75 | 0.024  | Kendall
#> sleep_cycle |     brainwt |  0.71 |   5.47 | < .001 | Kendall
#> sleep_cycle |      bodywt |  0.65 |   5.21 | < .001 | Kendall
#> awake       |     brainwt |  0.43 |   4.65 | < .001 | Kendall
#> awake       |      bodywt |  0.39 |   5.14 | < .001 | Kendall
#> brainwt     |      bodywt |  0.84 |   9.11 | < .001 | Kendall

Created on 2020-02-10 by the reprex package (v0.3.0)
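
As a point of reference for what a uniform CI column could contain: stats::cor.test() builds its Pearson interval from Fisher's z-transformation, and the same construction is sometimes applied to rank-based coefficients as a rough approximation. The sketch below assumes a hypothetical helper, fisher_ci(), which is not part of the correlation package, and uses simulated data. Applying it to Spearman's rho is an approximation only; refined standard errors exist for rank correlations.

```r
# Hypothetical helper (not part of the correlation package):
# confidence interval for a correlation coefficient via Fisher's z-transformation.
fisher_ci <- function(r, n, ci = 0.95) {
  z <- atanh(r)                          # Fisher z
  se <- 1 / sqrt(n - 3)                  # standard error of z used for Pearson's r
  crit <- stats::qnorm(1 - (1 - ci) / 2) # two-sided critical value
  tanh(z + c(-1, 1) * crit * se)         # back-transform to the correlation scale
}

set.seed(123)
x <- rnorm(50)
y <- 0.5 * x + rnorm(50)

# Pearson: matches the interval reported by stats::cor.test()
r <- stats::cor(x, y)
ci_pearson <- fisher_ci(r, n = length(x))
ci_reference <- as.numeric(stats::cor.test(x, y)$conf.int)

# Spearman: the same formula applied to rho is only an approximation
rho <- stats::cor(x, y, method = "spearman")
ci_spearman <- fisher_ci(rho, n = length(x))
```

Whether an approximate interval like this is good enough to report by default for Spearman or Kendall is a design decision for the package.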

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.2 (2019-12-12)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Berlin               
#>  date     2020-02-10                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version   date       lib source                                
#>  assertthat    0.2.1     2019-03-21 [1] CRAN (R 3.6.0)                        
#>  backports     1.1.5     2019-10-02 [1] CRAN (R 3.6.0)                        
#>  bayestestR    0.5.1     2020-01-27 [1] CRAN (R 3.6.2)                        
#>  broom         0.5.4     2020-01-27 [1] CRAN (R 3.6.2)                        
#>  callr         3.4.1     2020-01-24 [1] CRAN (R 3.6.2)                        
#>  cellranger    1.1.0     2016-07-27 [1] CRAN (R 3.6.0)                        
#>  cli           2.0.1     2020-01-08 [1] CRAN (R 3.6.2)                        
#>  colorspace    1.4-1     2019-03-18 [1] CRAN (R 3.6.0)                        
#>  correlation * 0.1.0     2020-02-10 [1] Github (easystats/correlation@b80559a)
#>  crayon        1.3.4     2017-09-16 [1] CRAN (R 3.6.0)                        
#>  DBI           1.1.0     2019-12-15 [1] CRAN (R 3.6.2)                        
#>  dbplyr        1.4.2     2019-06-17 [1] CRAN (R 3.6.0)                        
#>  desc          1.2.0     2018-05-01 [1] CRAN (R 3.6.0)                        
#>  devtools      2.2.1     2019-09-24 [1] CRAN (R 3.6.0)                        
#>  digest        0.6.23    2019-11-23 [1] CRAN (R 3.6.0)                        
#>  dplyr       * 0.8.4     2020-01-31 [1] CRAN (R 3.6.0)                        
#>  ellipsis      0.3.0     2019-09-20 [1] CRAN (R 3.6.0)                        
#>  evaluate      0.14      2019-05-28 [1] CRAN (R 3.6.0)                        
#>  fansi         0.4.1     2020-01-08 [1] CRAN (R 3.6.2)                        
#>  forcats     * 0.4.0     2019-02-17 [1] CRAN (R 3.6.0)                        
#>  fs            1.3.1     2019-05-06 [1] CRAN (R 3.6.0)                        
#>  generics      0.0.2     2018-11-29 [1] CRAN (R 3.6.0)                        
#>  ggplot2     * 3.2.1     2019-08-10 [1] CRAN (R 3.6.0)                        
#>  glue          1.3.1     2019-03-12 [1] CRAN (R 3.6.0)                        
#>  gtable        0.3.0     2019-03-25 [1] CRAN (R 3.6.0)                        
#>  haven         2.2.0     2019-11-08 [1] CRAN (R 3.6.1)                        
#>  highr         0.8       2019-03-20 [1] CRAN (R 3.6.0)                        
#>  hms           0.5.3     2020-01-08 [1] CRAN (R 3.6.2)                        
#>  htmltools     0.4.0     2019-10-04 [1] CRAN (R 3.6.0)                        
#>  httr          1.4.1     2019-08-05 [1] CRAN (R 3.6.0)                        
#>  insight       0.8.1     2020-02-02 [1] CRAN (R 3.6.2)                        
#>  jsonlite      1.6.1     2020-02-02 [1] CRAN (R 3.6.2)                        
#>  knitr         1.28      2020-02-06 [1] CRAN (R 3.6.2)                        
#>  lattice       0.20-38   2018-11-04 [2] CRAN (R 3.6.2)                        
#>  lazyeval      0.2.2     2019-03-15 [1] CRAN (R 3.6.0)                        
#>  lifecycle     0.1.0     2019-08-01 [1] CRAN (R 3.6.0)                        
#>  lubridate     1.7.4     2018-04-11 [1] CRAN (R 3.6.0)                        
#>  magrittr      1.5       2014-11-22 [1] CRAN (R 3.6.0)                        
#>  memoise       1.1.0     2017-04-21 [1] CRAN (R 3.6.0)                        
#>  mnormt        1.5-6     2020-02-03 [1] CRAN (R 3.6.0)                        
#>  modelr        0.1.5     2019-08-08 [1] CRAN (R 3.6.0)                        
#>  munsell       0.5.0     2018-06-12 [1] CRAN (R 3.6.0)                        
#>  nlme          3.1-142   2019-11-07 [2] CRAN (R 3.6.2)                        
#>  parameters    0.5.0     2020-02-09 [1] CRAN (R 3.6.2)                        
#>  pillar        1.4.3     2019-12-20 [1] CRAN (R 3.6.2)                        
#>  pkgbuild      1.0.6     2019-10-09 [1] CRAN (R 3.6.0)                        
#>  pkgconfig     2.0.3     2019-09-22 [1] CRAN (R 3.6.0)                        
#>  pkgload       1.0.2     2018-10-29 [1] CRAN (R 3.6.0)                        
#>  prettyunits   1.1.1     2020-01-24 [1] CRAN (R 3.6.2)                        
#>  processx      3.4.2     2020-02-09 [1] CRAN (R 3.6.2)                        
#>  ps            1.3.0     2018-12-21 [1] CRAN (R 3.6.0)                        
#>  psych       * 1.9.12.31 2020-01-08 [1] CRAN (R 3.6.2)                        
#>  purrr       * 0.3.3     2019-10-18 [1] CRAN (R 3.6.0)                        
#>  R6            2.4.1     2019-11-12 [1] CRAN (R 3.6.0)                        
#>  Rcpp          1.0.3     2019-11-08 [1] CRAN (R 3.6.1)                        
#>  readr       * 1.3.1     2018-12-21 [1] CRAN (R 3.6.0)                        
#>  readxl        1.3.1     2019-03-13 [1] CRAN (R 3.6.0)                        
#>  remotes       2.1.0     2019-06-24 [1] CRAN (R 3.6.0)                        
#>  reprex        0.3.0     2019-05-16 [1] CRAN (R 3.6.0)                        
#>  rlang         0.4.4     2020-01-28 [1] CRAN (R 3.6.2)                        
#>  rmarkdown     2.1       2020-01-20 [1] CRAN (R 3.6.2)                        
#>  rprojroot     1.3-2     2018-01-03 [1] CRAN (R 3.6.0)                        
#>  rvest         0.3.5     2019-11-08 [1] CRAN (R 3.6.0)                        
#>  scales        1.1.0     2019-11-18 [1] CRAN (R 3.6.0)                        
#>  sessioninfo   1.1.1     2018-11-05 [1] CRAN (R 3.6.0)                        
#>  stringi       1.4.5     2020-01-11 [1] CRAN (R 3.6.2)                        
#>  stringr     * 1.4.0     2019-02-10 [1] CRAN (R 3.6.0)                        
#>  testthat      2.3.1     2019-12-01 [1] CRAN (R 3.6.0)                        
#>  tibble      * 2.1.3     2019-06-06 [1] CRAN (R 3.6.0)                        
#>  tidyr       * 1.0.2     2020-01-24 [1] CRAN (R 3.6.2)                        
#>  tidyselect    1.0.0     2020-01-27 [1] CRAN (R 3.6.2)                        
#>  tidyverse   * 1.3.0     2019-11-21 [1] CRAN (R 3.6.0)                        
#>  usethis       1.5.1     2019-07-04 [1] CRAN (R 3.6.0)                        
#>  vctrs         0.2.2     2020-01-24 [1] CRAN (R 3.6.2)                        
#>  withr         2.1.2     2018-03-15 [1] CRAN (R 3.6.0)                        
#>  xfun          0.12      2020-01-13 [1] CRAN (R 3.6.2)                        
#>  xml2          1.2.2     2019-08-09 [1] CRAN (R 3.6.0)                        
#>  yaml          2.2.1     2020-02-01 [1] CRAN (R 3.6.0)                        
#> 
#> [1] /Users/patil/Library/R/3.6/library
#> [2] /Library/Frameworks/R.framework/Versions/3.6/Resources/library

Paper

"JOSS submissions are suspended until at least 4th May 2020" 😞

Another alternative could be JORS (never tried), which has publication fees (£400.00) but states: "If you do not have funds to pay such fees, you will have an opportunity to waive each fee. We do not want fees to prevent the publication of worthy work." That could apply here, as easystats (as the mother project) has no funding. What do you say?

@strengejacke @mattansb @IndrajeetPatil

Biserial Correlation

Here you are:

#' @importFrom stats na.omit
.factor_to_numeric <- function(x, lowest = NULL) {
  if (is.numeric(x)) {
    return(x)
  }

  if (anyNA(suppressWarnings(as.numeric(as.character(stats::na.omit(x)))))) {
    if (is.character(x)) {
      x <- as.factor(x)
    }
    levels(x) <- 1:nlevels(x)
  }

  out <- as.numeric(as.character(x))

  if (!is.null(lowest)) {
    difference <- min(out) - lowest
    out <- out - difference
  }

  out
}

#' @importFrom stats dnorm qnorm complete.cases sd
own_biserial <- function(x, y) {
  cc <- stats::complete.cases(x, y)
  x <- x[cc]
  y <- y[cc]
  
  # recode y to 0/1; the result must be assigned, otherwise the call has no effect
  y <- .factor_to_numeric(y, lowest = 0)
  
  m1 <- mean(x[y == 1])
  m0 <- mean(x[y == 0])
  sn <- stats::sd(x)
  q <- mean(y)
  p <- 1 - q
  
  zp <- stats::dnorm(stats::qnorm(q))
  
  # reuse the standard deviation computed above
  ((m1 - m0) * (p * q / zp)) / sn
}

set.seed(123)
y <- rbinom(100, 1, .3)
x <- rnorm(100)

own_biserial(x, y)
#> [1] 0.08155037
psych::biserial(x, y)
#>            [,1]
#> [1,] 0.08155037


set.seed(456)
y <- rbinom(100, 1, .3)
x <- rnorm(100)

own_biserial(x, y)
#> [1] 0.02964972
psych::biserial(x, y)
#>            [,1]
#> [1,] 0.02964972

Created on 2020-03-23 by the reprex package (v0.3.0)

Originally posted by @strengejacke in #55
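
As a sanity check on the formula in own_biserial() above, the same value can be obtained by rescaling the ordinary point-biserial (Pearson) correlation. The sketch below only uses base R and simulated data; the sqrt((n - 1) / n) factor is there because stats::sd() divides by n - 1, while the proportions p and q are computed over n.

```r
set.seed(123)
y <- rbinom(100, 1, 0.3)
x <- rnorm(100)

n <- length(x)
q <- mean(y)                          # proportion of 1s
p <- 1 - q                            # proportion of 0s
zp <- stats::dnorm(stats::qnorm(q))   # normal density at the cut point

# biserial correlation, same formula as in own_biserial()
r_b <- (mean(x[y == 1]) - mean(x[y == 0])) * (p * q / zp) / stats::sd(x)

# same quantity, obtained by rescaling the point-biserial (Pearson) correlation
r_pb <- stats::cor(x, y)
r_b_from_pb <- r_pb * sqrt(p * q) / zp * sqrt((n - 1) / n)
```

Both routes agree up to floating-point error, which suggests the biserial method could reuse an existing Pearson result instead of recomputing group means.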

Correlation "long" data to matrix

I would like the report package to create traditional correlation matrices from the data returned by the new correlation package, which is in long format.

For square matrices (i.e., all variables correlated with all variables), something like this could be a first step:

model <- correlation::correlation(iris)
cells <- model$r
m <- matrix(cells, nrow = as.integer(sqrt(length(cells))), ncol=as.integer(sqrt(length(cells))), byrow = TRUE)

However, the row and column names still need to be set appropriately. Moreover, this wouldn't work for non-square matrices, such as:

model <- correlation::correlation(
 select(iris, Sepal.Length),
 select(iris, starts_with("Petal"))
)

@strengejacke do you by any chance have an intuition for how to do this?
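
One possible approach is to index an empty matrix by name, which handles both the square and the non-square case. This is only a sketch: long_to_matrix() is a hypothetical helper, and the long table is built here from stats::cor() with the column names Parameter1, Parameter2 and r, so that the example runs without the correlation package.

```r
# Illustrative long-format table (one row per pair), built from stats::cor()
# so that the example is self-contained.
rows <- "Sepal.Length"
cols <- c("Petal.Length", "Petal.Width")
long <- expand.grid(Parameter1 = rows, Parameter2 = cols, stringsAsFactors = FALSE)
long$r <- unname(mapply(
  function(a, b) stats::cor(iris[[a]], iris[[b]]),
  long$Parameter1, long$Parameter2
))

# Hypothetical helper: reshape a long table into a (possibly non-square) matrix
long_to_matrix <- function(long, value = "r") {
  row_names <- unique(long$Parameter1)
  col_names <- unique(long$Parameter2)
  m <- matrix(
    NA_real_,
    nrow = length(row_names), ncol = length(col_names),
    dimnames = list(row_names, col_names)
  )
  # a two-column character matrix indexes cells by row name and column name
  m[cbind(long$Parameter1, long$Parameter2)] <- long[[value]]
  m
}

m <- long_to_matrix(long)
```

Pairs that are absent from the long table stay NA, so a table that only holds one triangle would still need to be mirrored and given a diagonal.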

(Regularized) (partial) correlations adjusted for random effects

I woke up with this idea in mind, so I am putting it here for future reference.

I am currently working on some survey data, with a factor analysis part and exploring some psychometric networks.

Both of these are more or less based on some kind of correlation matrix. However, factor analysis requires, to my knowledge (?), a "regular" correlation matrix, whereas the second (see all the work by @SachaEpskamp) is often based on partial correlations (or regularized partial correlations obtained, for example, via LASSO regression).

Here's the thing. My data contain some factors, or grouping structure, which I'd like somehow to adjust the correlations for. However, to my knowledge, there is no package or function that gives "random-effects (partial) correlation matrices".

Our recent discussion on effectsize made salient the fact that you can extract partial correlations in a quite straightforward way from linear regression models. HENCE, I wonder if it would be possible to apply the same to linear mixed models to extract partial correlations "adjusted" for random effects?

Also, I wonder if there's a way to recover the full correlation matrix from such mixed model, which could be in turn useful for EFA.
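
For reference, the extraction from an ordinary linear model mentioned above relies on the identity partial r = t / sqrt(t^2 + residual df). The sketch below uses simulated data and base R only, and cross-checks the result against the residual-based definition of a partial correlation. Whether and how this carries over to mixed models is the open question of this issue and is not addressed here.

```r
set.seed(1)
n <- 100
z <- rnorm(n)
x <- 0.5 * z + rnorm(n)
y <- 0.3 * x + 0.4 * z + rnorm(n)

# partial correlation of y and x, adjusted for z, from the t statistic of x
fit <- lm(y ~ x + z)
t_x <- summary(fit)$coefficients["x", "t value"]
r_partial <- t_x / sqrt(t_x^2 + fit$df.residual)

# cross-check: correlate the parts of y and x that z does not explain
r_check <- cor(resid(lm(y ~ z)), resid(lm(x ~ z)))
```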

correlation for ordinal factors (polychoric) not working

library(correlation)
d <- data.frame(
  x = as.ordered(sample(1:5, 20, TRUE)),
  y = as.ordered(sample(letters[1:5], 20, TRUE))
)

correlation(d, method = "polychoric")
#> Error in .cor_test_polychoric(data, x, y, ci = ci, ...): Polychoric correlations can only be ran on ordinal factors.

Created on 2019-10-23 by the reprex package (v0.3.0)

problems with `correlation` with `dplyr 1.0.0`

This workflow used to work with the CRAN version of dplyr, but after updating to the development version of dplyr it no longer works. I can't seem to trace the source of the problem.

  • works outside of correlation
set.seed(123)
library(tidyverse)

iris %>%
  split(., .$Species) %>%
  map(., ~broom::tidy(stats::lm(formula = Sepal.Length ~ Sepal.Width, data = .x))) %>%
  purrr::map_dfr(., tibble::as_tibble)
#> # A tibble: 6 x 5
#>   term        estimate std.error statistic  p.value
#> * <chr>          <dbl>     <dbl>     <dbl>    <dbl>
#> 1 (Intercept)    2.64     0.310       8.51 3.74e-11
#> 2 Sepal.Width    0.690    0.0899      7.68 6.71e-10
#> 3 (Intercept)    3.54     0.563       6.29 9.07e- 8
#> 4 Sepal.Width    0.865    0.202       4.28 8.77e- 5
#> 5 (Intercept)    3.91     0.757       5.16 4.66e- 6
#> 6 Sepal.Width    0.902    0.253       3.56 8.43e- 4
  • doesn't work with correlation function
(ls <- 
  iris %>%
  split(., .$Species) %>%
  map(., correlation::correlation)) 
#> $setosa
#> Parameter1   |   Parameter2 |    r |        95% CI |    t | df |      p |  Method | n_Obs
#> -----------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | 0.74 | [ 0.59, 0.85] | 7.68 | 48 | < .001 | Pearson |    50
#> Sepal.Length | Petal.Length | 0.27 | [-0.01, 0.51] | 1.92 | 48 | 0.202  | Pearson |    50
#> Sepal.Length |  Petal.Width | 0.28 | [ 0.00, 0.52] | 2.01 | 48 | 0.202  | Pearson |    50
#> Sepal.Width  | Petal.Length | 0.18 | [-0.11, 0.43] | 1.25 | 48 | 0.217  | Pearson |    50
#> Sepal.Width  |  Petal.Width | 0.23 | [-0.05, 0.48] | 1.66 | 48 | 0.208  | Pearson |    50
#> Petal.Length |  Petal.Width | 0.33 | [ 0.06, 0.56] | 2.44 | 48 | 0.093  | Pearson |    50
#> 
#> $versicolor
#> Parameter1   |   Parameter2 |    r |       95% CI |    t | df |      p |  Method | n_Obs
#> ----------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | 0.53 | [0.29, 0.70] | 4.28 | 48 | < .001 | Pearson |    50
#> Sepal.Length | Petal.Length | 0.75 | [0.60, 0.85] | 7.95 | 48 | < .001 | Pearson |    50
#> Sepal.Length |  Petal.Width | 0.55 | [0.32, 0.72] | 4.52 | 48 | < .001 | Pearson |    50
#> Sepal.Width  | Petal.Length | 0.56 | [0.33, 0.73] | 4.69 | 48 | < .001 | Pearson |    50
#> Sepal.Width  |  Petal.Width | 0.66 | [0.47, 0.80] | 6.15 | 48 | < .001 | Pearson |    50
#> Petal.Length |  Petal.Width | 0.79 | [0.65, 0.87] | 8.83 | 48 | < .001 | Pearson |    50
#> 
#> $virginica
#> Parameter1   |   Parameter2 |    r |       95% CI |     t | df |      p |  Method | n_Obs
#> -----------------------------------------------------------------------------------------
#> Sepal.Length |  Sepal.Width | 0.46 | [0.20, 0.65] |  3.56 | 48 | 0.003  | Pearson |    50
#> Sepal.Length | Petal.Length | 0.86 | [0.77, 0.92] | 11.90 | 48 | < .001 | Pearson |    50
#> Sepal.Length |  Petal.Width | 0.28 | [0.00, 0.52] |  2.03 | 48 | 0.048  | Pearson |    50
#> Sepal.Width  | Petal.Length | 0.40 | [0.14, 0.61] |  3.03 | 48 | 0.012  | Pearson |    50
#> Sepal.Width  |  Petal.Width | 0.54 | [0.31, 0.71] |  4.42 | 48 | < .001 | Pearson |    50
#> Petal.Length |  Petal.Width | 0.32 | [0.05, 0.55] |  2.36 | 48 | 0.045  | Pearson |    50

purrr::map_dfr(ls, tibble::as_tibble)
#> Error in (function (x = list(), n = NULL, ..., class = NULL) : formal argument "n" matched by multiple actual arguments

Created on 2020-03-24 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value                                             
#>  version  R Under development (unstable) (2020-02-28 r77874)
#>  os       Windows 10 x64                                    
#>  system   x86_64, mingw32                                   
#>  ui       RTerm                                             
#>  language (EN)                                              
#>  collate  English_United States.1252                        
#>  ctype    English_United States.1252                        
#>  tz       Europe/Berlin                                     
#>  date     2020-03-24                                        
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version      date       lib
#>  assertthat    0.2.1        2019-03-21 [1]
#>  backports     1.1.5        2019-10-02 [1]
#>  bayestestR    0.5.2.1      2020-03-16 [1]
#>  broom         0.5.3.9000   2020-03-01 [1]
#>  callr         3.4.2        2020-02-12 [1]
#>  cellranger    1.1.0        2016-07-27 [1]
#>  cli           2.0.2        2020-02-28 [1]
#>  colorspace    1.4-1        2019-03-18 [1]
#>  correlation   0.1.1        2020-03-21 [1]
#>  crayon        1.3.4        2017-09-16 [1]
#>  DBI           1.1.0        2019-12-15 [1]
#>  dbplyr        1.4.2        2019-06-17 [1]
#>  desc          1.2.0        2018-05-01 [1]
#>  devtools      2.2.2        2020-02-17 [1]
#>  digest        0.6.25       2020-02-23 [1]
#>  dplyr       * 0.8.99.9002  2020-03-23 [1]
#>  effectsize    0.3.0        2020-03-22 [1]
#>  ellipsis      0.3.0        2019-09-20 [1]
#>  evaluate      0.14         2019-05-28 [1]
#>  fansi         0.4.1        2020-01-08 [1]
#>  forcats     * 0.5.0        2020-03-01 [1]
#>  fs            1.3.2        2020-03-05 [1]
#>  generics      0.0.2        2018-11-29 [1]
#>  ggplot2     * 3.3.0        2020-03-05 [1]
#>  glue          1.3.2        2020-03-12 [1]
#>  gtable        0.3.0        2019-03-25 [1]
#>  haven         2.2.0        2019-11-08 [1]
#>  highr         0.8          2019-03-20 [1]
#>  hms           0.5.3        2020-01-08 [1]
#>  htmltools     0.4.0        2019-10-04 [1]
#>  httr          1.4.1        2019-08-05 [1]
#>  insight       0.8.2.1      2020-03-22 [1]
#>  jsonlite      1.6.1        2020-02-02 [1]
#>  knitr         1.28         2020-02-06 [1]
#>  lifecycle     0.2.0.9000   2020-03-16 [1]
#>  lubridate     1.7.4        2018-04-11 [1]
#>  magrittr      1.5          2014-11-22 [1]
#>  memoise       1.1.0        2017-04-21 [1]
#>  modelr        0.1.6        2020-02-22 [1]
#>  munsell       0.5.0        2018-06-12 [1]
#>  parameters    0.6.0        2020-03-12 [1]
#>  pillar        1.4.3        2019-12-20 [1]
#>  pkgbuild      1.0.6        2019-10-09 [1]
#>  pkgconfig     2.0.3        2019-09-22 [1]
#>  pkgload       1.0.2        2018-10-29 [1]
#>  prettyunits   1.1.1        2020-01-24 [1]
#>  processx      3.4.2        2020-02-09 [1]
#>  ps            1.3.2        2020-02-13 [1]
#>  purrr       * 0.3.3        2019-10-18 [1]
#>  R6            2.4.1        2019-11-12 [1]
#>  Rcpp          1.0.4        2020-03-17 [1]
#>  readr       * 1.3.1        2018-12-21 [1]
#>  readxl        1.3.1        2019-03-13 [1]
#>  remotes       2.1.1        2020-02-15 [1]
#>  reprex        0.3.0        2019-05-16 [1]
#>  rlang         0.4.5.9000   2020-03-23 [1]
#>  rmarkdown     2.1          2020-01-20 [1]
#>  rprojroot     1.3-2        2018-01-03 [1]
#>  rvest         0.3.5        2019-11-08 [1]
#>  scales        1.1.0        2019-11-18 [1]
#>  sessioninfo   1.1.1        2018-11-05 [1]
#>  stringi       1.4.6        2020-02-17 [1]
#>  stringr     * 1.4.0        2019-02-10 [1]
#>  testthat      2.3.2        2020-03-02 [1]
#>  tibble      * 2.99.99.9014 2020-03-23 [1]
#>  tidyr       * 1.0.2        2020-01-24 [1]
#>  tidyselect    1.0.0        2020-01-27 [1]
#>  tidyverse   * 1.3.0        2019-11-21 [1]
#>  usethis       1.5.1.9000   2020-03-24 [1]
#>  utf8          1.1.4        2018-05-24 [1]
#>  vctrs         0.2.99.9010  2020-03-23 [1]
#>  withr         2.1.2        2018-03-15 [1]
#>  xfun          0.12         2020-01-13 [1]
#>  xml2          1.2.5        2020-03-11 [1]
#>  yaml          2.2.1        2020-02-01 [1]
#>  source                                
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (easystats/bayestestR@6ee7e37) 
#>  Github (tidymodels/broom@3c922d5)     
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (easystats/correlation@1fe04b9)
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (tidyverse/dplyr@35d3ace)      
#>  Github (easystats/effectsize@6f4d5a3) 
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (easystats/insight@b46a9eb)    
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (r-lib/lifecycle@355dcba)      
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (r-lib/rlang@a90b04b)          
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (tidyverse/tibble@96af653)     
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  Github (r-lib/usethis@01dbd8f)        
#>  CRAN (R 4.0.0)                        
#>  Github (r-lib/vctrs@3675fdf)          
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#>  CRAN (R 4.0.0)                        
#> 
#> [1] C:/Users/inp099/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-devel/library

Plot method for correlation matrix

Should we add a plot method in see for a correlation matrix obtained via summary() or as.table(), similar to the one made in the README using ggcorrplot? Can we do it with raw ggplot to avoid having another (conditional) dependency?
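
A raw ggplot2 version could look roughly like the sketch below. It makes assumptions: the input is a long table with columns Parameter1, Parameter2 and r (built here from stats::cor() so the example runs on its own), and the colours and theme are arbitrary choices rather than an existing see style.

```r
library(ggplot2)

# Illustrative long-format data, one row per pair of variables
vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
long <- expand.grid(Parameter1 = vars, Parameter2 = vars, stringsAsFactors = FALSE)
long$r <- unname(mapply(
  function(a, b) stats::cor(iris[[a]], iris[[b]]),
  long$Parameter1, long$Parameter2
))

# Heatmap: one tile per pair, coloured on a diverging scale centred at 0
p <- ggplot(long, aes(x = Parameter1, y = Parameter2, fill = r)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = sprintf("%.2f", r))) +
  scale_fill_gradient2(
    low = "#2166AC", mid = "white", high = "#B2182B",
    midpoint = 0, limits = c(-1, 1)
  ) +
  labs(x = NULL, y = NULL, fill = "r") +
  theme_minimal()
```

Printing p draws the matrix. Because only geom_tile() and geom_text() are involved, no dependency beyond ggplot2 would be needed.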

biweight correlation - help with formula

In genomics, we usually use the biweight correlation from the WGCNA package.
More details can be found at:

https://en.wikipedia.org/wiki/Biweight_midcorrelation
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3465711/

I understand this is outside your field, but adding this feature and making it tidy is something that no one has done before, to my knowledge. Perhaps it could be something to look into to push the package towards a larger group of users.

Originally posted by @JauntyJJS in #2 (comment)


@JauntyJJS I've started implementing it, but I am struggling a bit to get the formula right.

Assuming that:

set.seed(12345)
var_x <- rnorm(200)
var_y <- 0.5 * var_x  + sqrt(1 - 0.5^2) * rnorm(200)

I have:

  u <- (var_x - median(var_x)) / 9 * mad(var_x, constant = 1)
  v <- (var_y - median(var_y)) / 9 * mad(var_y, constant = 1)

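for

$$u_i = \frac{x_i - \operatorname{med}(x)}{9 \cdot \operatorname{MAD}(x)}, \qquad v_i = \frac{y_i - \operatorname{med}(y)}{9 \cdot \operatorname{MAD}(y)}$$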

Then:

  I_x <- ifelse((1 - abs(u)) > 0, 1, 0)
  I_y <- ifelse((1 - abs(v)) > 0, 1, 0)

  w_x <- (1 - u^2)^2 * (I_x * (1 - abs(u)))
  w_y <- (1 - v^2)^2 * (I_y * (1 - abs(v)))

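for

$$w_i^{(x)} = (1 - u_i^2)^2 \, I(1 - |u_i|), \qquad w_i^{(y)} = (1 - v_i^2)^2 \, I(1 - |v_i|)$$

where $I(a) = 1$ if $a > 0$ and $0$ otherwise.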

Finally:

  denominator_x <- sqrt(sum((var_x - median(var_x)) * w_x)^2)
  x_curly <- ((var_x - median(var_x)) * w_x) / denominator_x

  denominator_y <- sqrt(sum((var_y - median(var_y)) * w_y)^2)
  y_curly <- ((var_y - median(var_y)) * w_y) / denominator_y

  r <- sum(x_curly * y_curly)

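for

$$\tilde{x}_i = \frac{(x_i - \operatorname{med}(x)) \, w_i^{(x)}}{\sqrt{\sum_{j} \left[ (x_j - \operatorname{med}(x)) \, w_j^{(x)} \right]^2}}, \qquad \tilde{y}_i = \frac{(y_i - \operatorname{med}(y)) \, w_i^{(y)}}{\sqrt{\sum_{j} \left[ (y_j - \operatorname{med}(y)) \, w_j^{(y)} \right]^2}}, \qquad \operatorname{bicor}(x, y) = \sum_{i} \tilde{x}_i \, \tilde{y}_i$$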

However, it seems that something is wrong 😕, because the biweight correlation should be 0.5584808 and I have 7.70 😬

@mattansb @IndrajeetPatil @pdwaggoner @lindeloev and people who like equations ^^
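
A possible correction, sketched from the definition on the linked Wikipedia page, differs from the snippets above in three places: the scaled deviations divide by (9 * MAD), so the product needs parentheses; the weights multiply (1 - u^2)^2 by the 0/1 indicator only, not by (1 - |u|) as well; and the normalisation squares each term inside the sum before taking the root. bicor_sketch() is a hypothetical name, and whether it reproduces the reference value of 0.5584808 is not checked here.

```r
# Sketch of the biweight midcorrelation (hypothetical helper, base R only)
bicor_sketch <- function(x, y) {
  dx <- x - stats::median(x)
  dy <- y - stats::median(y)

  # scaled deviations: divide by (9 * MAD), hence the parentheses
  u <- dx / (9 * stats::mad(x, constant = 1))
  v <- dy / (9 * stats::mad(y, constant = 1))

  # weights: (1 - u^2)^2 times the indicator of (1 - |u|) > 0
  w_x <- (1 - u^2)^2 * as.numeric((1 - abs(u)) > 0)
  w_y <- (1 - v^2)^2 * as.numeric((1 - abs(v)) > 0)

  # normalise: square each weighted deviation, sum, then take the root
  x_curly <- (dx * w_x) / sqrt(sum((dx * w_x)^2))
  y_curly <- (dy * w_y) / sqrt(sum((dy * w_y)^2))

  sum(x_curly * y_curly)
}

set.seed(12345)
var_x <- rnorm(200)
var_y <- 0.5 * var_x + sqrt(1 - 0.5^2) * rnorm(200)

r <- bicor_sketch(var_x, var_y)
```

Because x_curly and y_curly are both scaled to unit length, the result is bounded by -1 and 1, which rules out values such as 7.70.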

Feature Request: Option to remove stars from output

Would love to see an option to remove the stars from the correlation output (could even be the default) to correspond to the ASA's recommendations to remove stars and other references to statistical significance. It also makes sense pedagogically that there should be an option to just report descriptive measures of correlation without reference to inference.

feature request: including sample size column in correlation output

I think it would be nice if the output also contained an n column that tracks the number of observations used for each correlation test. This might seem redundant for datasets without any NAs, but it would be a very handy feature when there is missing data. For example:

library(tidyverse)
library(psych)
library(correlation)

# select only numeric variables
df <- purrr::keep(ggplot2::msleep, is_bare_numeric)

# using `psych`
corr_obj <- psych::corr.test(df, method = "spearman")

# looking at sample sizes
corr_obj$n
#>             sleep_total sleep_rem sleep_cycle awake brainwt bodywt
#> sleep_total          83        61          32    83      56     83
#> sleep_rem            61        61          32    61      48     61
#> sleep_cycle          32        32          32    32      30     32
#> awake                83        61          32    83      56     83
#> brainwt              56        48          30    56      56     56
#> bodywt               83        61          32    83      56     83

# correlation output (no info about sample sizes)
correlation::correlation(df, method = "spearman")

#> Parameter1  |  Parameter2 |   rho |           S |      p |   Method
#> -------------------------------------------------------------------
#> sleep_total |   sleep_rem |  0.76 |     8920.08 | < .001 | Spearman
#> sleep_total | sleep_cycle | -0.49 |     8122.87 | 0.014  | Spearman
#> sleep_total |       awake | -1.00 | 1.90568e+05 | < .001 | Spearman
#> sleep_total |     brainwt | -0.59 |    46627.12 | < .001 | Spearman
#> sleep_total |      bodywt | -0.53 | 1.46223e+05 | < .001 | Spearman
#> sleep_rem   | sleep_cycle | -0.33 |     7280.52 | 0.061  | Spearman
#> sleep_rem   |       awake | -0.76 |    66719.92 | < .001 | Spearman
#> sleep_rem   |     brainwt | -0.41 |    26049.73 | 0.014  | Spearman
#> sleep_rem   |      bodywt | -0.45 |    54903.63 | 0.001  | Spearman
#> sleep_cycle |       awake |  0.49 |     2789.13 | 0.014  | Spearman
#> sleep_cycle |     brainwt |  0.87 |      572.26 | < .001 | Spearman
#> sleep_cycle |      bodywt |  0.85 |      837.92 | < .001 | Spearman
#> awake       |     brainwt |  0.59 |    11892.88 | < .001 | Spearman
#> awake       |      bodywt |  0.53 |    44345.02 | < .001 | Spearman
#> brainwt     |      bodywt |  0.96 |     1253.56 | < .001 | Spearman

Created on 2020-02-10 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.2 (2019-12-12)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Berlin               
#>  date     2020-02-10                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version   date       lib source                                
#>  assertthat    0.2.1     2019-03-21 [1] CRAN (R 3.6.0)                        
#>  backports     1.1.5     2019-10-02 [1] CRAN (R 3.6.0)                        
#>  bayestestR    0.5.1     2020-01-27 [1] CRAN (R 3.6.2)                        
#>  broom         0.5.4     2020-01-27 [1] CRAN (R 3.6.2)                        
#>  callr         3.4.1     2020-01-24 [1] CRAN (R 3.6.2)                        
#>  cellranger    1.1.0     2016-07-27 [1] CRAN (R 3.6.0)                        
#>  cli           2.0.1     2020-01-08 [1] CRAN (R 3.6.2)                        
#>  colorspace    1.4-1     2019-03-18 [1] CRAN (R 3.6.0)                        
#>  correlation * 0.1.0     2020-02-10 [1] Github (easystats/correlation@b80559a)
#>  crayon        1.3.4     2017-09-16 [1] CRAN (R 3.6.0)                        
#>  DBI           1.1.0     2019-12-15 [1] CRAN (R 3.6.2)                        
#>  dbplyr        1.4.2     2019-06-17 [1] CRAN (R 3.6.0)                        
#>  desc          1.2.0     2018-05-01 [1] CRAN (R 3.6.0)                        
#>  devtools      2.2.1     2019-09-24 [1] CRAN (R 3.6.0)                        
#>  digest        0.6.23    2019-11-23 [1] CRAN (R 3.6.0)                        
#>  dplyr       * 0.8.4     2020-01-31 [1] CRAN (R 3.6.0)                        
#>  ellipsis      0.3.0     2019-09-20 [1] CRAN (R 3.6.0)                        
#>  evaluate      0.14      2019-05-28 [1] CRAN (R 3.6.0)                        
#>  fansi         0.4.1     2020-01-08 [1] CRAN (R 3.6.2)                        
#>  forcats     * 0.4.0     2019-02-17 [1] CRAN (R 3.6.0)                        
#>  fs            1.3.1     2019-05-06 [1] CRAN (R 3.6.0)                        
#>  generics      0.0.2     2018-11-29 [1] CRAN (R 3.6.0)                        
#>  ggplot2     * 3.2.1     2019-08-10 [1] CRAN (R 3.6.0)                        
#>  glue          1.3.1     2019-03-12 [1] CRAN (R 3.6.0)                        
#>  gtable        0.3.0     2019-03-25 [1] CRAN (R 3.6.0)                        
#>  haven         2.2.0     2019-11-08 [1] CRAN (R 3.6.1)                        
#>  highr         0.8       2019-03-20 [1] CRAN (R 3.6.0)                        
#>  hms           0.5.3     2020-01-08 [1] CRAN (R 3.6.2)                        
#>  htmltools     0.4.0     2019-10-04 [1] CRAN (R 3.6.0)                        
#>  httr          1.4.1     2019-08-05 [1] CRAN (R 3.6.0)                        
#>  insight       0.8.1     2020-02-02 [1] CRAN (R 3.6.2)                        
#>  jsonlite      1.6.1     2020-02-02 [1] CRAN (R 3.6.2)                        
#>  knitr         1.28      2020-02-06 [1] CRAN (R 3.6.2)                        
#>  lattice       0.20-38   2018-11-04 [2] CRAN (R 3.6.2)                        
#>  lazyeval      0.2.2     2019-03-15 [1] CRAN (R 3.6.0)                        
#>  lifecycle     0.1.0     2019-08-01 [1] CRAN (R 3.6.0)                        
#>  lubridate     1.7.4     2018-04-11 [1] CRAN (R 3.6.0)                        
#>  magrittr      1.5       2014-11-22 [1] CRAN (R 3.6.0)                        
#>  memoise       1.1.0     2017-04-21 [1] CRAN (R 3.6.0)                        
#>  mnormt        1.5-6     2020-02-03 [1] CRAN (R 3.6.0)                        
#>  modelr        0.1.5     2019-08-08 [1] CRAN (R 3.6.0)                        
#>  munsell       0.5.0     2018-06-12 [1] CRAN (R 3.6.0)                        
#>  nlme          3.1-142   2019-11-07 [2] CRAN (R 3.6.2)                        
#>  parameters    0.5.0     2020-02-09 [1] CRAN (R 3.6.2)                        
#>  pillar        1.4.3     2019-12-20 [1] CRAN (R 3.6.2)                        
#>  pkgbuild      1.0.6     2019-10-09 [1] CRAN (R 3.6.0)                        
#>  pkgconfig     2.0.3     2019-09-22 [1] CRAN (R 3.6.0)                        
#>  pkgload       1.0.2     2018-10-29 [1] CRAN (R 3.6.0)                        
#>  prettyunits   1.1.1     2020-01-24 [1] CRAN (R 3.6.2)                        
#>  processx      3.4.2     2020-02-09 [1] CRAN (R 3.6.2)                        
#>  ps            1.3.0     2018-12-21 [1] CRAN (R 3.6.0)                        
#>  psych       * 1.9.12.31 2020-01-08 [1] CRAN (R 3.6.2)                        
#>  purrr       * 0.3.3     2019-10-18 [1] CRAN (R 3.6.0)                        
#>  R6            2.4.1     2019-11-12 [1] CRAN (R 3.6.0)                        
#>  Rcpp          1.0.3     2019-11-08 [1] CRAN (R 3.6.1)                        
#>  readr       * 1.3.1     2018-12-21 [1] CRAN (R 3.6.0)                        
#>  readxl        1.3.1     2019-03-13 [1] CRAN (R 3.6.0)                        
#>  remotes       2.1.0     2019-06-24 [1] CRAN (R 3.6.0)                        
#>  reprex        0.3.0     2019-05-16 [1] CRAN (R 3.6.0)                        
#>  rlang         0.4.4     2020-01-28 [1] CRAN (R 3.6.2)                        
#>  rmarkdown     2.1       2020-01-20 [1] CRAN (R 3.6.2)                        
#>  rprojroot     1.3-2     2018-01-03 [1] CRAN (R 3.6.0)                        
#>  rvest         0.3.5     2019-11-08 [1] CRAN (R 3.6.0)                        
#>  scales        1.1.0     2019-11-18 [1] CRAN (R 3.6.0)                        
#>  sessioninfo   1.1.1     2018-11-05 [1] CRAN (R 3.6.0)                        
#>  stringi       1.4.5     2020-01-11 [1] CRAN (R 3.6.2)                        
#>  stringr     * 1.4.0     2019-02-10 [1] CRAN (R 3.6.0)                        
#>  testthat      2.3.1     2019-12-01 [1] CRAN (R 3.6.0)                        
#>  tibble      * 2.1.3     2019-06-06 [1] CRAN (R 3.6.0)                        
#>  tidyr       * 1.0.2     2020-01-24 [1] CRAN (R 3.6.2)                        
#>  tidyselect    1.0.0     2020-01-27 [1] CRAN (R 3.6.2)                        
#>  tidyverse   * 1.3.0     2019-11-21 [1] CRAN (R 3.6.0)                        
#>  usethis       1.5.1     2019-07-04 [1] CRAN (R 3.6.0)                        
#>  vctrs         0.2.2     2020-01-24 [1] CRAN (R 3.6.2)                        
#>  withr         2.1.2     2018-03-15 [1] CRAN (R 3.6.0)                        
#>  xfun          0.12      2020-01-13 [1] CRAN (R 3.6.2)                        
#>  xml2          1.2.2     2019-08-09 [1] CRAN (R 3.6.0)                        
#>  yaml          2.2.1     2020-02-01 [1] CRAN (R 3.6.0)                        
#> 
#> [1] /Users/patil/Library/R/3.6/library
#> [2] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
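
For reference, the per-pair sample sizes requested above can be computed in base R without `psych`. This is a minimal sketch, assuming pairwise-complete observations are what the n column should report; the helper name `pairwise_n` is hypothetical and not part of `correlation`.

```r
# Count, for every pair of numeric columns, the rows where both are observed
pairwise_n <- function(data) {
  numeric_cols <- data[vapply(data, is.numeric, logical(1))]
  observed <- !is.na(as.matrix(numeric_cols))
  storage.mode(observed) <- "double"  # 1 = observed, 0 = missing
  # t(observed) %*% observed: entry [i, j] counts rows observed in both i and j
  crossprod(observed)
}

toy <- data.frame(
  a = c(1, 2, NA, 4),
  b = c(1, NA, NA, 4),
  c = c(1, 2, 3, 4)
)
n_mat <- pairwise_n(toy)
n_mat
```

The diagonal holds the number of non-missing values per variable, and the off-diagonal entries hold the number of complete pairs, which is the quantity a long-format n column would carry for each row.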

bootstrapped confidence intervals

Since we have several different types of correlations for which p-values are not straightforward, it would be nice to include a method for bootstrapped CIs. Any advice?
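
One simple option is a percentile bootstrap over resampled observation pairs. The sketch below uses only base R and `stats::cor()`; the function name `boot_cor_ci` and its arguments are hypothetical, and it assumes complete data (no NAs) in both vectors.

```r
# Percentile bootstrap CI for a correlation coefficient (sketch)
boot_cor_ci <- function(x, y, method = "spearman", n_boot = 2000, ci = 0.95) {
  stopifnot(length(x) == length(y))
  n <- length(x)

  # Resample row indices with replacement so that (x, y) pairs stay together
  boot_r <- replicate(n_boot, {
    idx <- sample.int(n, n, replace = TRUE)
    cor(x[idx], y[idx], method = method)
  })

  # Take the empirical quantiles of the bootstrap distribution
  alpha <- (1 - ci) / 2
  quantile(boot_r, probs = c(alpha, 1 - alpha), na.rm = TRUE, names = FALSE)
}

set.seed(123)
ci_spearman <- boot_cor_ci(iris$Sepal.Length, iris$Petal.Length, n_boot = 500)
ci_spearman
```

The percentile interval is the simplest choice; bias-corrected variants (for example BCa) may behave better near the boundaries of [-1, 1], at the cost of more computation.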

`correlation()` error when `data2` is a `grouped_df` object

When data2 is a grouped_df object, correlation() fails with an error:

library(correlation)
df1 <- data.frame(x = rnorm(30),
                  y = rnorm(30),
                  g = rep_len(LETTERS[1:3], 30))
df2 <- data.frame(a = rnorm(30),
                  b = rnorm(30),
                  g = rep_len(LETTERS[1:3], 30))
correlation(dplyr::group_by(df1, g),
            dplyr::group_by(df2, g))
##  Error in `[.data.frame`(out, c("Group", names(out)[names(out) != "Group"])) : 
##  undefined columns selected
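
Until this is handled, a possible workaround is to split the rows by group manually and correlate each subset. This is only a sketch with base R's `cor()`, assuming both data frames have the same row order and the same grouping column `g`; it does not reproduce the tidy output of `correlation()`.

```r
set.seed(1)
df1 <- data.frame(x = rnorm(30), y = rnorm(30), g = rep_len(LETTERS[1:3], 30))
df2 <- data.frame(a = rnorm(30), b = rnorm(30), g = rep_len(LETTERS[1:3], 30))

# Assumption: rows of df1 and df2 are aligned and share the same groups
stopifnot(identical(df1$g, df2$g))

# Row indices for each group level
rows_by_group <- split(seq_len(nrow(df1)), df1$g)

# One cross-correlation matrix (df1 variables x df2 variables) per group
cor_by_group <- lapply(rows_by_group, function(i) {
  cor(df1[i, c("x", "y")], df2[i, c("a", "b")])
})

cor_by_group$A
```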

Cannot follow one of the examples from the website

library(correlation)
library(dplyr)
library(see)

cor <- correlation(iris)

cor %>%
  as.table() %>%
  plot()
Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In data.matrix(x) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf
4: In min(x) : no non-missing arguments to min; returning Inf
5: In max(x) : no non-missing arguments to max; returning -Inf

Warning from parameters and insight when CI is NA

cor_test(iris, "Sepal.Length", "Sepal.Width", method = "spearman")
Error in max(unlist(lapply(stats::na.omit(round(CI_low, digits)), function(.i) nchar(as.character(.i))))) : 
  (converted from warning) no non-missing arguments to max; returning -Inf 
8.
doWithOneRestart(return(expr), restart) 
7.
withOneRestart(expr, restarts[[1L]]) 
6.
withRestarts({
    .Internal(.signalCondition(simpleWarning(msg, call), msg, 
        call))
    .Internal(.dfltWarn(msg, call)) ... 
5.
.signalSimpleWarning("no non-missing arguments to max; returning -Inf", 
    base::quote(max(unlist(lapply(stats::na.omit(round(CI_low, 
        digits)), function(.i) nchar(as.character(.i))))))) at format_ci.R#28
4.
insight::format_ci(x[[ci_low[i]]], x[[ci_high[i]]], ci = NULL, 
    digits = ci_digits, width = "auto", brackets = TRUE) at parameters_table.R#73
3.
parameters_table(x, pretty_names = pretty_names, ...) at print.parameters_model.R#72
2.
print.parameters_model(x) 
1.
(function (x, ...) 
UseMethod("print"))(x) 

@strengejacke do you think it's best to address it in insight or parameters?
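
Judging from the traceback, the warning comes from calling `max()` on an empty vector once every CI value has been dropped by `na.omit()`. Wherever the fix ends up living, a guard along these lines would avoid it. This is a sketch only: `max_nchar` is a hypothetical helper, not the actual code in insight or parameters.

```r
# Width of the widest rounded value, returning 0 when everything is NA
max_nchar <- function(values, digits = 2) {
  rounded <- stats::na.omit(round(values, digits))
  if (length(rounded) == 0) {
    return(0L)  # nothing to measure, so skip max() and its warning
  }
  max(nchar(as.character(rounded)))
}

width_all_na <- max_nchar(c(NA_real_, NA_real_))
width_mixed <- max_nchar(c(0.123, NA, 10.5))
```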
