Comments (3)
Hi,
The paper describes the procedure as
"then calculating the Mahalanobis distance for each actual observation from the bivariate mean of the resampled data."
Given that wording, shouldn't data (the actual observations) be passed to mahalanobis() instead of dat (the resampled data)?
That is,
stats::mahalanobis(data, center = colMeans(dat), cov = stats::cov(dat))
instead of
stats::mahalanobis(dat, center = colMeans(dat), cov = stats::cov(dat))
Removing sim = "permutation" from the boot::boot() call also helps: the bootstrap needs to sample with replacement, whereas a permutation only reorders the rows.
set.seed(123)
mean <- c(4, 6)
cov <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
d <- MASS::mvrnorm(100, mean, cov)
# Adding outliers
d[3, 1] <- 12
d[5, 2] <- -8
# Statistic for boot(): distance of each actual observation from the
# centre and covariance of the resampled data
.distance_mahalanobis <- function(data, indices = 1:nrow(data), ...) {
  dat <- data[indices, ] # allows boot to select sample
  row.names(dat) <- NULL
  stats::mahalanobis(data, center = colMeans(dat), cov = stats::cov(dat))
}
# Without a sim argument, boot() resamples with replacement
rez <- boot::boot(data = d, statistic = .distance_mahalanobis, R = 1000)
bayestestR::point_estimate(as.data.frame(rez$t), centrality = "all")
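To illustrate why sim = "permutation" does not help here, the following is a minimal base-R sketch on made-up toy data (x, perm_idx and boot_idx are illustrative names, not taken from the package). It assumes only that a permutation visits every row exactly once, so the centre and covariance passed to mahalanobis() come out the same in every replicate.

```r
set.seed(1)
x <- matrix(rnorm(40), ncol = 2)        # toy data: 20 observations, 2 variables
n <- nrow(x)

perm_idx <- sample(n)                   # permutation: every row exactly once
boot_idx <- sample(n, replace = TRUE)   # bootstrap: rows may repeat or be left out

# A permutation only reorders the rows, so centre and covariance are unchanged
same_centre <- isTRUE(all.equal(colMeans(x[perm_idx, ]), colMeans(x)))
same_cov <- isTRUE(all.equal(stats::cov(x[perm_idx, ]), stats::cov(x)))

# A bootstrap resample generally shifts the centre
boot_centre <- colMeans(x[boot_idx, ])

print(c(same_centre = same_centre, same_cov = same_cov))
print(rbind(original = colMeans(x), bootstrap = boot_centre))
```

With a permutation, every replicate would therefore return the same distances for the actual observations, and averaging over replicates would add nothing.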
from correlation.
In case you are not aware, the authors' MATLAB code is available at https://sampendu.net/publications/
Clicking the "Matlab Code" hyperlink there downloads this zip file:
Shepherd.zip
I managed to come up with this:
set.seed(123)
mean <- c(4, 6)
cov <- matrix(c(1, 0.5, 0.5, 1), 2, 2)
d <- MASS::mvrnorm(100, mean, cov)
# Correlation with no outliers
cor(d[, 1], d[, 2])
#> [1] 0.455973
# Adding outliers
d[3, 1] <- 12
d[5, 2] <- -8
# Correlation with outliers
cor(d[, 1], d[, 2])
#> [1] 0.2037985
j <- 1000
n <- nrow(d)
# One row per bootstrap replicate, one column per observation
Ms <- matrix(data = NA, nrow = j, ncol = n)
# Bootstrap the distances
for (i in seq_len(j)) {
  # Draw row indices from 1:n with replacement
  x <- sample(1:n, n, replace = TRUE)
  # Resampled data
  dat <- d[x, ]
  # Mahalanobis distance of each actual observation from the centre of the resampled data
  m <- stats::mahalanobis(d, center = colMeans(dat), cov = stats::cov(dat))
  Ms[i, ] <- m
}
# Average across all bootstraps to get the bootstrapped Mahalanobis distance
boot_m <- colMeans(Ms)
# Determine the outliers (bootstrapped distance of 6 or more)
outlier_indices <- which(boot_m >= 6)
# Remove outliers; logical indexing also works when no outlier is found,
# whereas d[-integer(0), ] would drop every row
new_d <- d[boot_m < 6, ]
# Pearson correlation after outlier removal, for comparison
cor(new_d[, 1], new_d[, 2])
#> [1] 0.4930791
# Shepherd's pi: Spearman correlation after outlier removal
cor(new_d[, 1], new_d[, 2], method = "spearman")
#> [1] 0.4525431
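The same steps could be wrapped in a function as a starting point for a pull request. This is only a sketch under assumptions: shepherd_pi_sketch and its arguments n_boot and threshold are hypothetical names and not part of the correlation package, the toy data is generated with base R rather than MASS::mvrnorm, and no p-value is computed.

```r
# Hypothetical helper, not part of the correlation package
shepherd_pi_sketch <- function(x, y, n_boot = 1000, threshold = 6) {
  d <- cbind(x, y)
  n <- nrow(d)
  # One row per bootstrap replicate, one column per observation
  dists <- matrix(NA_real_, nrow = n_boot, ncol = n)
  for (i in seq_len(n_boot)) {
    idx <- sample(n, n, replace = TRUE)   # resample rows with replacement
    dat <- d[idx, , drop = FALSE]
    # Distance of every actual observation from the resampled centre
    dists[i, ] <- stats::mahalanobis(d, center = colMeans(dat), cov = stats::cov(dat))
  }
  boot_dist <- colMeans(dists)            # average over bootstrap replicates
  keep <- boot_dist < threshold
  list(
    distance = boot_dist,
    outlier = !keep,
    rho = stats::cor(x[keep], y[keep], method = "spearman")
  )
}

# Toy data with two planted outliers (assumed values, for illustration only)
set.seed(123)
x <- rnorm(100, mean = 4)
y <- 6 + 0.5 * (x - 4) + rnorm(100, sd = sqrt(0.75))
x[3] <- 12
y[5] <- -8

res <- shepherd_pi_sketch(x, y)
print(which(res$outlier))
print(res$rho)
```

Keeping the rows with a logical vector rather than negative indices means the function still returns a correlation when no observation crosses the threshold.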
@JauntyJJS awesome!
Want to give it a try and make a pull request 😏?