rrrlw / ripserr

R package porting Ripser-based persistent homology calculation engines from C++ via Rcpp. Currently ports Ripser (Vietoris-Rips complex) and Cubical Ripser (cubical complex).

Home Page: https://rrrlw.github.io/ripserr/

License: GNU General Public License v3.0

R 29.57% C++ 70.43%
ripser topological-data-analysis persistent-homology topology cubical-complex vietoris-complex rips-complex rcpp r cpp

ripserr's Introduction

ripserr: Calculate Persistent Homology of Vietoris-Rips and Cubical Complexes using Ripser in R

Overview

ripserr ports the Ripser and Cubical Ripser persistent homology calculation engines from C++ via Rcpp. It can be used as a convenient and rapid calculation tool in topological data analysis pipelines.

Installation

# install development version
devtools::install_github("rrrlw/ripserr")

# install from CRAN
install.packages("ripserr")

Sample code

Ripser (Vietoris-Rips complex) can be used as follows for data with dimension greater than or equal to 2.

# load ripserr
library("ripserr")

set.seed(42)
SIZE <- 100

# 2-dimensional example
dataset2 <- rnorm(SIZE * 2)
dim(dataset2) <- c(SIZE, 2)
vr_phom2 <- vietoris_rips(dataset2)
head(vr_phom2)
#>   dimension birth      death
#> 1         0     0 0.01004861
#> 2         0     0 0.02923702
#> 3         0     0 0.04550504
#> 4         0     0 0.06829826
#> 5         0     0 0.06853393
#> 6         0     0 0.07187663
tail(vr_phom2)
#>     dimension     birth     death
#> 112         1 0.3916344 0.4239412
#> 113         1 0.3906770 0.5577989
#> 114         1 0.3880186 0.4029842
#> 115         1 0.3703398 0.5007012
#> 116         1 0.3330234 0.3416054
#> 117         1 0.2418318 0.2504820

# 3-dimensional example
dataset3 <- rnorm(SIZE * 3)
dim(dataset3) <- c(SIZE, 3)
vr_phom3 <- vietoris_rips(dataset3, max_dim = 2) # default: max_dim = 1
head(vr_phom3)
#>   dimension birth     death
#> 1         0     0 0.1282935
#> 2         0     0 0.1421812
#> 3         0     0 0.1516424
#> 4         0     0 0.1819928
#> 5         0     0 0.1858051
#> 6         0     0 0.2114116
tail(vr_phom3)
#>     dimension     birth     death
#> 132         1 0.5212961 0.5233529
#> 133         2 1.1829207 1.1999911
#> 134         2 1.1194325 1.3245908
#> 135         2 1.0707410 1.0914850
#> 136         2 0.9433034 0.9867254
#> 137         2 0.6882204 0.6913078

Cubical Ripser (cubical complex) can be used as follows for data with dimension equal to 2, 3, or 4.

# load ripserr
library("ripserr")

set.seed(42)
SIZE <- 10

# 2-dimensional example
dataset2 <- rnorm(SIZE ^ 2)
dim(dataset2) <- rep(SIZE, 2)
cub_phom2 <- cubical(dataset2)
head(cub_phom2)
#>   dimension      birth      death
#> 1         0 -1.1943289 -0.8607926
#> 2         0 -2.4142076 -0.8509076
#> 3         0 -0.8113932 -0.7844590
#> 4         0 -1.7170087 -0.7844590
#> 5         0 -0.7272921 -0.5428288
#> 6         0 -0.9535234 -0.5428288
tail(cub_phom2)
#>    dimension     birth     death
#> 22         1 0.8217731 0.9333463
#> 23         1 0.7681787 1.0385061
#> 24         1 0.7581632 1.5757275
#> 25         1 0.7208782 1.3025426
#> 26         1 0.6792888 1.4441013
#> 27         1 0.6359504 1.8951935

# 3-dimensional example
dataset3 <- rnorm(SIZE ^ 3)
dim(dataset3) <- rep(SIZE, 3)
cub_phom3 <- cubical(dataset3)
head(cub_phom3)
#>   dimension     birth     death
#> 1         0 -1.926167 -1.737728
#> 2         0 -1.737297 -1.439229
#> 3         0 -1.924950 -1.439229
#> 4         0 -1.500221 -1.354600
#> 5         0 -2.277778 -1.354600
#> 6         0 -1.682481 -1.306676
tail(cub_phom3)
#>     dimension     birth    death
#> 324         2 1.2488637 1.258482
#> 325         2 1.2009654 2.036972
#> 326         2 1.0452759 1.199978
#> 327         2 0.9885968 1.809382
#> 328         2 0.9310749 1.179696
#> 329         2 0.8447922 1.709689

# 4-dimensional example
dataset4 <- rnorm(SIZE ^ 4)
dim(dataset4) <- rep(SIZE, 4)
cub_phom4 <- cubical(dataset4)
head(cub_phom4)
#>   dimension     birth     death
#> 1         0 -1.986299 -1.923519
#> 2         0 -1.822606 -1.816506
#> 3         0 -1.776392 -1.710786
#> 4         0 -1.833663 -1.710387
#> 5         0 -1.947054 -1.704791
#> 6         0 -1.701462 -1.639160
tail(cub_phom4)
#>      dimension    birth    death
#> 4329         3 1.676609 2.019277
#> 4330         3 1.675766 1.932152
#> 4331         3 1.669449 2.149646
#> 4332         3 1.662486 1.863734
#> 4333         3 1.535361 1.963609
#> 4334         3 1.349235 2.263581

Functionality

  1. Calculation of persistent homology of Vietoris-Rips complexes using Ripser (function named vietoris_rips).
  2. Calculation of persistent homology of cubical complexes using Cubical Ripser (function named cubical).

Citation

If you use the ripserr package in your work, please consider citing the following (based on use):

  • General use of ripserr: Wadhwa RR, Piekenbrock M, Scott JG. ripserr: Calculate Persistent Homology with Ripser-based Engines; version 0.1.0. URL https://github.com/rrrlw/ripserr.
  • Calculation using Vietoris-Rips complex: Bauer U. Ripser: Efficient computation of Vietoris-Rips persistence barcodes. 2019; arXiv: 1908.02518.
  • Calculation using cubical complex: Kaji S, Sudo T, Ahara K. Cubical Ripser: Software for computing persistent homology of image and volume data. 2020; arXiv: 2005.12692.

Contribute

To contribute to ripserr, you can create issues for any bugs/suggestions on the issues page. You can also fork the ripserr repository and create pull requests to add useful features.

ripserr's People

Contributors

corybrunson, emily-noble, rrrlw, xinyiemilyzhang

ripserr's Issues

Infinite bars?

Where are the infinite-length persistence bars? I tried out ripserr on a simple uniformly random point cloud and got:

> phom
# A tibble: 119 × 3
   dimension birth    death
       <int> <dbl>    <dbl>
 1         0     0 0.000583
 2         0     0 0.000980
 3         0     0 0.00102 
 4         0     0 0.00206 
 5         0     0 0.00224 
 6         0     0 0.00228 
 7         0     0 0.00249 
 8         0     0 0.00282 
 9         0     0 0.00295 
10         0     0 0.00315 

Notably,

> max(phom$death)
[1] 0.0377396

But unless you are deliberately computing reduced (co)homology, you should expect to see at least one infinite bar in dimension 0. Are you quietly dropping all the infinite bars?!

Another experiment, with a circle point cloud:

> circle.phom = vietoris_rips(circle.cloud)
> circle.phom
PHom object containing persistence data for 200 features.

Contains:
* 199 0-dim features
* 1 1-dim feature

Radius/diameter: min = 0; max = 1.7333.

but if I restrict max-radius:

> circle.phom = vietoris_rips(circle.cloud, threshold=1.5)
> circle.phom
PHom object containing persistence data for 199 features.

Contains:
* 199 0-dim features

Radius/diameter: min = 0; max = 0.15284.

I would like to strongly urge you to implement tracking and reporting of infinite bars.
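
As a stopgap on the user side, the missing bar can be appended by hand. The sketch below is a minimal illustration, not ripserr functionality: `add_essential_h0` is a hypothetical helper, and it assumes the output has `dimension`, `birth`, and `death` columns and that no threshold was applied, so the complex eventually becomes connected and exactly one dimension-0 class never dies. With a threshold, more components (and possibly higher-dimensional classes) may survive, and this helper would undercount them.

```r
# Hypothetical helper (not part of ripserr): prepend the one essential
# H0 class that a connected Vietoris-Rips filtration must have.
add_essential_h0 <- function(phom) {
  phom <- as.data.frame(phom)
  essential <- data.frame(dimension = 0L, birth = 0, death = Inf)
  out <- rbind(essential, phom[, c("dimension", "birth", "death")])
  rownames(out) <- NULL
  out
}

# Toy persistence data standing in for vietoris_rips() output
phom <- data.frame(
  dimension = c(0L, 0L, 1L),
  birth     = c(0, 0, 0.4),
  death     = c(0.1, 0.2, 0.5)
)
with_inf <- add_essential_h0(phom)
```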

Different call signatures on Mac (M1) and Ubuntu

I have installed ripserr 0.2.0 on two different computers: an Ubuntu x86_64 compute server and a MacBook Pro (running arm64).

The call signature for vietoris_rips differs between the two installations, requiring dim= on the Mac and max_dim= on the server to set the upper homological dimension limit. As a result, I cannot move scripts between the two computers.
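
One possible workaround until the signatures agree is to pick the argument name at run time. This is only a sketch: `call_with_max_dim` is a hypothetical helper, and it assumes the installed function lists the dimension argument directly in its formals. If `vietoris_rips` is an S3 generic that only takes `...`, the formals of the relevant method would have to be inspected instead. The two stub functions stand in for the two installations.

```r
# Hypothetical helper: call `f` with whichever of `max_dim` / `dim`
# appears among its formal arguments.
call_with_max_dim <- function(f, data, max_dim = 1L, ...) {
  arg <- if ("max_dim" %in% names(formals(f))) "max_dim" else "dim"
  args <- c(list(data), stats::setNames(list(max_dim), arg), list(...))
  do.call(f, args)
}

# Stubs standing in for the two differing installations
new_api <- function(dataset, max_dim = 1L) paste("max_dim", max_dim)
old_api <- function(dataset, dim = 1L) paste("dim", dim)

cloud <- matrix(0, nrow = 2, ncol = 2)
res_new <- call_with_max_dim(new_api, cloud, max_dim = 2L)
res_old <- call_with_max_dim(old_api, cloud, max_dim = 2L)
```

With ripserr loaded, the call would look like `call_with_max_dim(vietoris_rips, cloud, max_dim = 2L)`, subject to the assumption above.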

sundry storage and handling tweaks

While experimenting with Jose Bouza's tda-tools package, some possible improvements suggest themselves:

  • Rather than a single matrix or data frame, it would save space to store persistence data as a list of 2-column matrices. Dimension would be inferred from position in the list, and metadata (e.g. extended persistence) could be stored in a separate list with same-dimension members. This assumes that TDA uses only nonnegative integer dimensions, which is unlikely to change any time soon.
  • With the above change, a coercion method as.data.frame() would recover the current form of output (without the 'PHom' class).
  • Parameters like the maximum dimension and the distance threshold should be retained in the output object, as list elements or (less likely) as attributes. These are important for interpretation and may be used in downstream steps like plotting or calculating landscapes.
  • Helper functions could be written to access birth–death pairs for a single dimension. These would be useful in analysis pipelines.
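
A rough sketch of what these suggestions could look like together, in base R. All names here (`phom_list`, `new_phom_list`, `pairs_in_dim`) are hypothetical and chosen for illustration; none of this is existing ripserr or tda-tools API.

```r
# Hypothetical container: one 2-column (birth, death) matrix per
# dimension, with the computation parameters kept as list elements.
new_phom_list <- function(pairs, max_dim, threshold) {
  structure(
    list(pairs = pairs, max_dim = max_dim, threshold = threshold),
    class = "phom_list"
  )
}

# Coercion back to the current 3-column layout; dimension is inferred
# from the position in the list (first element = dimension 0).
as.data.frame.phom_list <- function(x, ...) {
  dims <- seq_along(x$pairs) - 1L
  rows <- Map(function(m, d) {
    data.frame(
      dimension = rep(d, nrow(m)),
      birth     = unname(m[, 1]),
      death     = unname(m[, 2])
    )
  }, x$pairs, dims)
  do.call(rbind, rows)
}

# Accessor for the birth-death pairs of a single dimension
pairs_in_dim <- function(x, d) x$pairs[[d + 1L]]

ph <- new_phom_list(
  pairs = list(
    matrix(c(0, 0, 0.1, 0.2), ncol = 2),  # dimension 0
    matrix(c(0.4, 0.5), ncol = 2)         # dimension 1
  ),
  max_dim = 1L,
  threshold = Inf
)
df <- as.data.frame(ph)
```

Keeping `max_dim` and `threshold` as list elements rather than attributes means they survive subsetting and are visible with `str()`, which is one reason to prefer that option.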

Coefficients

Would it be possible (and how much work would it be) to do computations with coefficients in Z/3?

Add vignettes

  • vietoris-rips phom w/ Ripser
  • cubical phom w/ Cubical Ripser

Single Data Point Error

Hi! My name is Xinyi and I am an undergraduate student. We are trying to use the ripserr package to produce homology features for a regression model. When we load a dataset of x and y coordinates that contains only a single point, an error occurs. Based on our experiments, we think the package should return an empty data frame in this case. We hope you find this helpful and can fix the issue. Thank you!

library("ripserr")
#> Warning: package 'ripserr' was built under R version 4.0.4
singlePoint <- data.frame(x = c(1), y = c(1))
rips <- as.data.frame(vietoris_rips(singlePoint));rips
#> Error in vietoris_rips(singlePoint): Point cloud must have at least 2 points and at least 2 dimensions.
#> Error in eval(expr, envir, enclos): object 'rips' not found
doublePoint <- data.frame(x = c(1,2), y = c(1,2))
rips <- as.data.frame(vietoris_rips(doublePoint));rips
#>   dimension birth    death
#> 1         0     0 1.414214

Created on 2021-05-31 by the reprex package (v2.0.0)
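
Until the package handles this case, a small guard around the call gives the behavior requested above. This is a sketch under assumptions: `safe_phom` is a hypothetical wrapper, and `engine` stands in for `vietoris_rips` so the example runs without the package installed.

```r
# Hypothetical wrapper: return an empty persistence data frame when the
# point cloud has fewer than 2 points, otherwise defer to `engine`.
safe_phom <- function(points, engine, ...) {
  if (nrow(points) < 2L) {
    return(data.frame(
      dimension = integer(0),
      birth     = numeric(0),
      death     = numeric(0)
    ))
  }
  as.data.frame(engine(points, ...))
}

# Stub standing in for ripserr::vietoris_rips (assumption for this demo)
stub_engine <- function(points, ...) {
  data.frame(dimension = 0L, birth = 0, death = 1)
}

singlePoint <- data.frame(x = 1, y = 1)
doublePoint <- data.frame(x = c(1, 2), y = c(1, 2))
empty_res <- safe_phom(singlePoint, stub_engine)
full_res  <- safe_phom(doublePoint, stub_engine)
```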

backward compatibility

In the development version, the parameter max_dim replaces the parameter dim used in the current CRAN release. Before the next CRAN submission, this and possibly other renamed parameters should be handled with corrections and warnings for backward compatibility.
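
One common way to do this in R is to keep accepting the old argument after `...` and warn when it is used. The sketch below is illustrative only: `vietoris_rips_compat` is a hypothetical name, and `vr_engine` is a stub standing in for whatever performs the actual computation.

```r
# Stub standing in for the real computation (assumption for this demo)
vr_engine <- function(dataset, max_dim) max_dim

# Hypothetical shim: accept the old `dim` argument, warn, and forward
# its value as `max_dim`.
vietoris_rips_compat <- function(dataset, max_dim = 1L, ..., dim = NULL) {
  if (!is.null(dim)) {
    warning("`dim` is deprecated; use `max_dim` instead.", call. = FALSE)
    max_dim <- dim
  }
  vr_engine(dataset, max_dim = max_dim, ...)
}

cloud <- matrix(0, nrow = 3, ncol = 2)
new_style <- vietoris_rips_compat(cloud, max_dim = 2L)
old_style <- suppressWarnings(vietoris_rips_compat(cloud, dim = 2L))
```

Placing `dim` after `...` means it must be named in full, so it cannot be matched by accident through partial matching.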

add S3 class for Ripser + Cubical Ripser outputs

An S3 class could be used by other TDA R packages to identify the type of persistence output, making it easier to plot (or calculate inference, etc.). It could be as simple as adding elements to the class of the returned variable. The classes would likely be distinct for Ripser and Cubical Ripser, and more can be added as Ripser-based engines are added.
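
A minimal sketch of the "adding elements to the class" idea follows. The class names (`phom`, `phom_ripser`, `phom_cubical_ripser`) and the helper `as_phom` are hypothetical, picked only for this illustration.

```r
# Hypothetical tagging helper: prepend an engine-specific class and a
# shared parent class while keeping the data frame behavior.
as_phom <- function(df, engine = c("ripser", "cubical_ripser")) {
  engine <- match.arg(engine)
  class(df) <- c(paste0("phom_", engine), "phom", class(df))
  df
}

raw <- data.frame(dimension = c(0L, 1L), birth = c(0, 0.4), death = c(0.1, 0.5))
vr  <- as_phom(raw, "ripser")
cub <- as_phom(raw, "cubical_ripser")

# Downstream packages can then branch on the engine
is_rips <- inherits(vr, "phom_ripser")
```

Because `data.frame` stays in the class vector, existing code that treats the output as a data frame should keep working.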

Add better automated testing

For Ripser & Cubical Ripser, should test more than the 2 basic tests currently included (checking nrow/ncol/# features in each dimension for cubical ripser 2dim & 4dim)
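
Beyond row and column counts, tests could check invariants that any persistence output should satisfy. The base-R sketch below shows one such check; `check_phom` is a hypothetical helper, and the assumed column layout (`dimension`, `birth`, `death`) follows the sample output in the README.

```r
# Hypothetical invariant check: column layout, integer-valued
# dimensions within 0..max_dim, and birth <= death for every feature.
check_phom <- function(phom, max_dim) {
  stopifnot(
    identical(names(phom), c("dimension", "birth", "death")),
    all(phom$dimension >= 0),
    all(phom$dimension <= max_dim),
    all(phom$dimension == round(phom$dimension)),
    all(phom$birth <= phom$death)
  )
  invisible(TRUE)
}

good <- data.frame(dimension = c(0L, 1L), birth = c(0, 0.4), death = c(0.1, 0.5))
bad  <- data.frame(dimension = c(0L, 1L), birth = c(0, 0.6), death = c(0.1, 0.5))

good_ok <- check_phom(good, max_dim = 1L)
bad_ok  <- !inherits(try(check_phom(bad, max_dim = 1L), silent = TRUE), "try-error")
```

The same function could be called inside a test suite on the output of `vietoris_rips()` and `cubical()` for several input sizes and dimensions.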
