roahd's Introduction

roahd

The roahd (Robust Analysis of High-dimensional Data) package provides a set of statistical tools for the exploration and robustification of univariate and multivariate functional datasets through depth-based statistical methods.

The functions were implemented with special attention to efficiency, so that they can also be used profitably for the analysis of high-dimensional datasets.

For a full-featured description of the package, please take a look at the roahd vignette.

Installation

Install the released version of roahd from CRAN:

install.packages("roahd")

Or install the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("astamm/roahd")

fData and mfData objects

A simple S3 representation of a functional data object, fData, encapsulates the important features of univariate functional datasets (such as the grid of the independent variable, the pointwise observations, etc.):

library(roahd)

# Grid representing the independent variable
grid = seq( 0, 1, length.out = 100 )

# Pointwise measurements of the functional dataset
Data = matrix( c( sin( 2 * pi * grid ),
                  cos( 2 * pi * grid ),
                  sin( 2 * pi * grid + pi / 4 ) ), ncol = 100, byrow = TRUE )

# S3 object encapsulating the univariate functional dataset            
fD = fData( grid, Data )

# S3 representation of a multivariate functional dataset
mfD = mfData( grid, list( 'comp1' = Data, 'comp2' = Data ) )
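As a quick check on what these objects hold, the sketch below rebuilds the same dataset and inspects a few fields. The field names used here (`N`, `P`, `t0`, `tP`, `values`, and `L` and `fDList` for mfData) follow the package documentation; treat them as assumptions and verify them against your installed version.

```r
library(roahd)

# Same grid and pointwise measurements as above
grid <- seq(0, 1, length.out = 100)
Data <- matrix(c(sin(2 * pi * grid),
                 cos(2 * pi * grid),
                 sin(2 * pi * grid + pi / 4)),
               ncol = 100, byrow = TRUE)

fD  <- fData(grid, Data)
mfD <- mfData(grid, list(comp1 = Data, comp2 = Data))

# Univariate object: N observations on P grid points spanning [t0, tP]
c(N = fD$N, P = fD$P, t0 = fD$t0, tP = fD$tP)
dim(fD$values)  # one row per observation, one column per grid point

# Multivariate object: L components, each stored as an fData object
mfD$L
class(mfD$fDList[[1]])
```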

This representation also enables simple calls to customized functions, which simplify exploratory analysis:

# Algebra of fData objects
fD + 1 : 100
fD * 4

fD + fD

# Subsetting fData objects (the result is again an fData object)
fD[ 1, ]
fD[ 1, 2 : 4 ]

# Sample mean and (depth-based) median(s)
mean( fD )
mean( fD[ 1, 10 : 20 ] )
median_fData( fD, type = 'MBD' )

# Plotting functions
plot( fD )
plot( mean( fD ), lwd = 4, add = TRUE )

plot( fD[ 2:3, ] )

Robust methods for functional data analysis

A part of the package is specifically devoted to the computation of depths and other statistical indices for functional data:

  • Band depths and modified band depths,
  • Modified band depths for multivariate functional data,
  • Epigraph and hypograph indices,
  • Spearman's and Kendall's correlation indices for functional data,
  • Confidence intervals and tests on Spearman’s correlation coefficients for univariate and multivariate functional data.
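A minimal sketch of how some of these quantities can be computed on simulated data. The sample size, grid, and covariance parameters are illustrative assumptions, not values recommended by the package.

```r
library(roahd)

set.seed(1)  # reproducible synthetic data

# Simulate two small Gaussian functional datasets (illustrative values)
N <- 30
P <- 100
grid <- seq(0, 1, length.out = P)
Cov <- exp_cov_function(grid, alpha = 0.3, beta = 0.4)
Data1 <- generate_gauss_fdata(N, centerline = sin(2 * pi * grid), Cov = Cov)
Data2 <- generate_gauss_fdata(N, centerline = cos(2 * pi * grid), Cov = Cov)

fD  <- fData(grid, Data1)
mfD <- mfData(grid, list(Data1, Data2))

# Univariate band depth and modified band depth: one value per observation
bd  <- BD(fD)
mbd <- MBD(fD)

# Modified epigraph and hypograph indices
mei <- MEI(fD)
mhi <- MHI(fD)

# Multivariate modified band depth with uniform component weights
mmbd <- multiMBD(list(Data1, Data2), weights = "uniform")

# Spearman correlation between the two components of the bivariate dataset
rho <- cor_spearman(mfD, ordering = "MEI")

# Index of the deepest curve under MBD
which.max(mbd)
```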

These are also the core of visualization and robustification tools such as the functional boxplot (fbplot) and the outliergram (outliergram), which allow the visualization and identification of amplitude and shape outliers.

Thanks to the functions for simulating synthetic functional datasets, both the fbplot and outliergram procedures can be auto-tuned to the dataset at hand in order to control the true positive outlier rate.
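The sketch below runs both tools without plotting and then repeats the functional boxplot with an auto-tuned inflation factor. The dataset, the artificial shift applied to the first curve, and the small `N_trials` and `trial_size` values are assumptions chosen to keep the run short; the names in the `adjust` list follow the fbplot documentation and should be checked against your installed version.

```r
library(roahd)

set.seed(2)

N <- 100
P <- 50
grid <- seq(0, 1, length.out = P)
Cov <- exp_cov_function(grid, alpha = 0.3, beta = 0.4)
Data <- generate_gauss_fdata(N, centerline = sin(2 * pi * grid), Cov = Cov)

# Shift one curve vertically to create an artificial amplitude outlier
Data[1, ] <- Data[1, ] + 5

fD <- fData(grid, Data)

# Functional boxplot with the default inflation factor, no plotting
fb <- fbplot(fD, Fvalue = 1.5, display = FALSE)
fb$ID_outliers

# Outliergram, aimed at shape outliers, again without plotting
og <- outliergram(fD, display = FALSE)
og$ID_outliers

# Auto-tuned inflation factor, estimated on simulated outlier-free data
fb_adj <- fbplot(fD, display = FALSE,
                 adjust = list(N_trials = 5, trial_size = 2 * N,
                               TPR = 0.007, VERBOSE = FALSE))
fb_adj$Fvalue
```

Larger `N_trials` and `trial_size` values give a more stable estimate of the inflation factor at the cost of a longer run.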

Citation

If you use this package for your own research, please cite the corresponding R Journal article:

To cite roahd in publications use:

  Ieva, F., Paganoni, A. M., Romo, J., & Tarabelloni, N. (2019). roahd
  Package: Robust Analysis of High Dimensional Data. The R Journal,
  11(2), pp. 291-307.

A BibTeX entry for LaTeX users is

  @Article{,
    title = {{roahd Package: Robust Analysis of High Dimensional Data}},
    author = {Francesca Ieva and Anna Maria Paganoni and Juan Romo and Nicholas Tarabelloni},
    journal = {{The R Journal}},
    year = {2019},
    volume = {11},
    number = {2},
    pages = {291--307},
    url = {https://doi.org/10.32614/RJ-2019-032},
  }

roahd's People

Contributors

ntarabelloni, astamm, aefdz

roahd's Issues

scales 1.0.0 change to hue_pal() breaks fbplot

The upcoming scales 1.0.0 release includes some minor changes to hue_pal() that, according to my revdep checks, will break the multivariate boxplot example used in the documentation of fbplot() and in your vignette. The error seems to be caused at L572, where you call hue_pal()(length(ID_out)). With scales 0.5.0 this returned an empty string, whereas the newest version fails with the error "Error: Must request at least one colour from a hue palette." because length(ID_out) == 0.

The release is expected out by next week, and I wanted to give you a heads up about this minor breaking change.
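One way to guard such a call is to request colours only when at least one is needed. The helper below is hypothetical (it is not part of roahd or scales) and is only a sketch of the idea, not the fix adopted in the package.

```r
# Hypothetical helper, not part of roahd or scales: return a hue palette of
# length n, or an empty character vector when no colour is requested.
safe_hue_pal <- function(n) {
  if (n > 0) scales::hue_pal()(n) else character(0)
}

ID_out <- integer(0)  # no outliers detected
col_outlying <- safe_hue_pal(length(ID_out))
length(col_outlying)
```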

Warning message in multivariate_outliergram when shift = TRUE

Dear Astamm,

I kindly ask for your assistance: when I run ‘multivariate_outliergram’ with ‘shift = TRUE’, I get the following warning: ‘In max(max_diffs) : no non-missing arguments to max; returning -Inf’.

While debugging, I found out that the following lines of code give the warning/error:

max_diff_max = maxs %>%
  dplyr::filter(obs %in% ID_non_outlying_Low_MEI ) %>%
  dplyr::group_by(obs) %>%
  dplyr::summarize(max_diff_max = max(max_diffs))

when handling points with low MEI.

When I run the command with ‘shift = FALSE’ I do not get any warning.

Thanks in advance.
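The warning text is standard base R behaviour: max() applied to a zero-length vector returns -Inf and warns. Whether an empty vector is really what reaches max() inside multivariate_outliergram is an assumption here. The guard below is a hypothetical helper, not code taken from roahd.

```r
# Base R behaviour: max() of an empty vector warns and returns -Inf
empty <- numeric(0)
res <- suppressWarnings(max(empty))
res  # -Inf

# Hypothetical guard, not taken from roahd: fall back to NA for empty input
safe_max <- function(x) if (length(x) > 0) max(x) else NA_real_

safe_max(empty)        # NA
safe_max(c(0.2, 0.7))  # 0.7
```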

Multivariate support of plot.cov

From the example of the roahd documentation:

# Generating a univariate functional dataset
N = 1e2
P = 1e2
t0 = 0
t1 = 1
time_grid = seq( t0, t1, length.out = P )
Cov = exp_cov_function( time_grid, alpha = 0.3, beta = 0.4 )
D1 = generate_gauss_fdata( N, centerline = sin( 2 * pi * time_grid ), Cov = Cov )
fD1 = fData( time_grid, D1 )
plot( cov_fun( fD1 ), main = 'Covariance function', xlab = 'time', ylab = 'time' )

I tried to execute the command 'plot( cov_fun(mfData), main = 'Covariance function', xlab = 'time', ylab = 'time' )', where mfData is a multivariate functional dataset, but I get the following error:

Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' is a list, but does not have components 'x' and 'y'

I kindly ask again for your assistance, thank you in advance!
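A possible workaround, sketched under the assumption that cov_fun() applied to an mfData object returns a list of covariance blocks (the upper-triangular part of the covariance structure), each of which can be plotted on its own. The simulated bivariate dataset is only a stand-in for the reporter's data.

```r
library(roahd)

set.seed(3)

N <- 50
P <- 50
time_grid <- seq(0, 1, length.out = P)
Cov <- exp_cov_function(time_grid, alpha = 0.3, beta = 0.4)

# Two independently simulated components, used as a stand-in bivariate dataset
D1 <- generate_gauss_fdata(N, centerline = sin(2 * pi * time_grid), Cov = Cov)
D2 <- generate_gauss_fdata(N, centerline = cos(2 * pi * time_grid), Cov = Cov)
mfD <- mfData(time_grid, list(D1, D2))

# For mfData input the result is a list, so plot() cannot take it directly
C <- cov_fun(mfD)

# Plot each block in its own panel instead
op <- par(mfrow = c(1, length(C)))
for (i in seq_along(C)) {
  plot(C[[i]], main = paste("Covariance block", names(C)[i]),
       xlab = "time", ylab = "time")
}
par(op)
```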
