gdkrmr / dimred
A Framework for Dimensionality Reduction in R
Home Page: https://www.guido-kraemer.com/software/dimred/
License: GNU General Public License v3.0
I have a huge data set with 98% of the data missing. I use a sparse matrix, and the data set fits easily into memory. As a full data frame it would use hundreds of GB. Could you please let embed accept a sparse matrix?
Sometimes I get the following error when testing:
✖ | 14 1 | the dimRedData class
────────────────────────────────────────────────────────────────────────────────
test_dimRedData.R:31: failure: misc functions
nrow(Iris) not equal to 150.
target is NULL, current is numeric
────────────────────────────────────────────────────────────────────────────────
This happens when using devtools::test() and R CMD check --run-donttest --run-dontrun --timings
The documentation for embed() suggests that additional parameters can be passed via ..., but they seem to be ignored:
library(dimRed)
#> Loading required package: DRR
#> Loading required package: kernlab
#> Loading required package: CVST
#> Loading required package: Matrix
#>
#> Attaching package: 'dimRed'
#> The following object is masked from 'package:stats':
#>
#> embed
#> The following object is masked from 'package:base':
#>
#> as.data.frame
sr <- loadDataSet("Swiss Roll", n = 2000, sigma = 0.05)
test <- embed(sr, "Isomap", knn = 50, eps = 1, ndim = 2, get_geod = FALSE)
#> Warning in matchPars(methodObject, list(...)): Parameter matching: eps is not a
#> standard parameter, ignoring.
#> 2020-04-28 17:11:55: Isomap START
#> 2020-04-28 17:11:55: constructing knn graph
#> 2020-04-28 17:11:55: calculating geodesic distances
#> 2020-04-28 17:11:59: Classical Scaling
Created on 2020-04-28 by the reprex package (v0.3.0)
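The warning in the output above comes from matchPars(), which reports eps as a non-standard parameter for Isomap and drops it. A minimal sketch of how such matching against a method's standard parameters could behave is given below. The function match_pars_sketch and the default values are hypothetical and are not taken from the dimRed source.

```r
# Hypothetical sketch: merge user-supplied parameters into a method's
# standard parameters, warning about (and dropping) unknown names.
match_pars_sketch <- function(stdpars, user_pars) {
  unknown <- setdiff(names(user_pars), names(stdpars))
  for (nm in unknown) {
    warning("Parameter matching: ", nm,
            " is not a standard parameter, ignoring.", call. = FALSE)
  }
  known <- intersect(names(user_pars), names(stdpars))
  stdpars[known] <- user_pars[known]
  stdpars
}

# Assumed defaults, for illustration only.
isomap_defaults <- list(knn = 50, ndim = 2, get_geod = FALSE)

# eps is unknown and is dropped; knn is known and overrides the default.
pars <- suppressWarnings(
  match_pars_sketch(isomap_defaults, list(knn = 10, eps = 1))
)
str(pars)
```

Under this reading, a parameter that is not listed among the method's standard parameters never reaches the underlying algorithm, which would explain why eps has no effect.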
The PCA_L1 method appears not to work when ndim is set to 1 (... last one for now ;))
To reproduce:
library(dimRed)
## Loading required package: DRR
## Loading required package: kernlab
## Loading required package: CVST
## Loading required package: Matrix
##
## Attaching package: 'dimRed'
## The following object is masked from 'package:stats':
##
## embed
## The following object is masked from 'package:base':
##
## as.data.frame
set.seed(1)
embed(matrix(rnorm(1E5), 100), 'PCA_L1', ndim = 1)
## Error in dimnames(rot) <- list(orgnames, newnames): 'dimnames' applied to non-array
System Information:
sessionInfo()
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Arch Linux
##
## Matrix products: default
## BLAS: /usr/lib/libblas.so.3.8.0
## LAPACK: /usr/lib/liblapack.so.3.8.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] dimRed_0.2.2 DRR_0.0.3 CVST_0.2-2 Matrix_1.2-15
## [5] kernlab_0.9-27 colorout_1.2-0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.0 lattice_0.20-38 digest_0.6.18
## [4] grid_3.5.2 magrittr_1.5 evaluate_0.12
## [7] stringi_1.2.4 pcaL1_1.5.2 rmarkdown_1.11
## [10] tools_3.5.2 stringr_1.3.1 xfun_0.4
## [13] yaml_2.2.0 compiler_3.5.2 BiocManager_1.30.4
## [16] htmltools_0.3.6 knitr_1.21
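The "'dimnames' applied to non-array" error reported above is what base R raises when a one-column matrix has been reduced to a plain vector. Whether that is the actual cause inside PCA_L1 is an assumption. The sketch below only demonstrates the base R behaviour and the usual drop = FALSE remedy, with made-up data and names.

```r
set.seed(1)
rot <- matrix(rnorm(12), nrow = 4)   # 4 variables x 3 components (toy data)

dropped <- rot[, 1]                  # plain numeric vector, dim() is NULL
kept    <- rot[, 1, drop = FALSE]    # still a 4 x 1 matrix

# Setting dimnames on the vector fails; capture the message instead of stopping.
msg <- tryCatch({
  dimnames(dropped) <- list(paste0("V", 1:4), "PC1")
  "no error"
}, error = function(e) conditionMessage(e))

# The same assignment works on the matrix that kept its dimensions.
dimnames(kept) <- list(paste0("V", 1:4), "PC1")

print(msg)
print(dim(kept))
```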
Non-Method Features to add:
Methods to add:
(Semi) supervised methods:
Please propose more
I see:
> install_github("gdkrmr/dimRed")
Using GitHub PAT from envvar GITHUB_PAT
Downloading GitHub repo gdkrmr/dimRed@master
from URL https://api.github.com/repos/gdkrmr/dimRed/zipball/master
Installing dimRed
'/Library/Frameworks/R.framework/Resources/bin/R' --no-site-file --no-environ --no-save \
--no-restore --quiet CMD INSTALL \
'/private/tmp/Rtmpt5ZdO6/devtools53646e44af92/gdkrmr-dimRed-97564ff' \
--library='/Users/hadley/R' --with-keep.source --install-tests --no-multiarch
* installing *source* package ‘dimRed’ ...
** R
Error in .install_package_code_files(".", instdir) :
files in 'Collate' field missing from '/private/tmp/Rtmpt5ZdO6/devtools53646e44af92/gdkrmr-dimRed-97564ff/R':
get_info.R
ERROR: unable to collate and parse R files for package ‘dimRed’
* removing ‘/Users/hadley/R/dimRed’
* restoring previous ‘/Users/hadley/R/dimRed’
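The install log above fails because the Collate field of DESCRIPTION lists a file (get_info.R) that is not present in R/. A small check of this kind can be run before pushing. The sketch below uses only base R; missing_collate_files and the demo package layout are hypothetical.

```r
# Hypothetical helper: files named in the Collate field that are absent from R/.
missing_collate_files <- function(pkg_dir) {
  desc <- read.dcf(file.path(pkg_dir, "DESCRIPTION"), fields = "Collate")
  # Entries are separated by whitespace and may be quoted; scan() handles both.
  listed <- scan(text = desc[1, "Collate"], what = character(), quiet = TRUE)
  setdiff(listed, list.files(file.path(pkg_dir, "R")))
}

# Throwaway package skeleton that reproduces the situation.
pkg <- tempfile("demoPkg")
dir.create(file.path(pkg, "R"), recursive = TRUE)
writeLines(
  c("Package: demoPkg",
    "Version: 0.0.1",
    "Collate: 'embed.R'",
    "    'get_info.R'"),
  file.path(pkg, "DESCRIPTION")
)
invisible(file.create(file.path(pkg, "R", "embed.R")))

missing_files <- missing_collate_files(pkg)
print(missing_files)
```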
The HLLE method appears not to work when ndim is set to 1.
To reproduce:
library(dimRed)
## Loading required package: DRR
## Loading required package: kernlab
## Loading required package: CVST
## Loading required package: Matrix
##
## Attaching package: 'dimRed'
## The following object is masked from 'package:stats':
##
## embed
## The following object is masked from 'package:base':
##
## as.data.frame
set.seed(1)
embed(matrix(rnorm(1E5), 100), 'HLLE', ndim = 1, knn = 10)
## 2019-01-26 23:42:42: Finding nearest neighbors
## 2019-01-26 23:42:42: Calculating Hessian
## 1/100
## Error in combn(seq_len(pars$ndim), 2): n < m
System Information:
sessionInfo()
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Arch Linux
##
## Matrix products: default
## BLAS: /usr/lib/libblas.so.3.8.0
## LAPACK: /usr/lib/liblapack.so.3.8.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] dimRed_0.2.2 DRR_0.0.3 CVST_0.2-2 Matrix_1.2-15
## [5] kernlab_0.9-27 nvimcom_0.9-75 colorout_1.2-0
##
## loaded via a namespace (and not attached):
## [1] RANN_2.6.1 Rcpp_1.0.0 lattice_0.20-38
## [4] digest_0.6.18 RSpectra_0.13-1 grid_3.5.2
## [7] magrittr_1.5 evaluate_0.12 stringi_1.2.4
## [10] rmarkdown_1.11 tools_3.5.2 stringr_1.3.1
## [13] xfun_0.4 yaml_2.2.0 compiler_3.5.2
## [16] BiocManager_1.30.4 htmltools_0.3.6 knitr_1.21
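The "n < m" error above is raised by combn(seq_len(pars$ndim), 2): with ndim = 1 there is only one element to choose pairs from. One possible guard is sketched below. It assumes the pairs are used to enumerate cross terms, so that no pairs is the right answer for a single dimension; cross_term_pairs is a hypothetical helper, not part of dimRed.

```r
# Hypothetical helper: all index pairs (i, j) with i < j, one pair per column.
cross_term_pairs <- function(ndim) {
  if (ndim < 2) {
    # No pairs exist for a single dimension: return an empty 2 x 0 matrix
    # so that downstream code looping over columns simply does nothing.
    return(matrix(integer(0), nrow = 2, ncol = 0))
  }
  utils::combn(seq_len(ndim), 2)
}

pairs_1d <- cross_term_pairs(1)
pairs_3d <- cross_term_pairs(3)
print(dim(pairs_1d))
print(pairs_3d)
```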
Function names are currently a mess; I will standardize them at some point, with a deprecation period.
TODOs for the Autoencoder:
dimRedResult object
I'm creating a package that imports dimRed and I'm getting an error ("Error in eval(expr, envir, enclos) : could not find function "dimRedMethodList") when invoking this code:
#' @importFrom dimRed Isomap dimRedData embed
foo <- function(x, training, ...) {
imap <- embed(dimRedData(training),
"Isomap", knn = x$options$knn,
ndim = x$num, .mute = x$options$.mute)
}
I know that this isn't reproducible, but it otherwise works when using Isomap directly instead of embed(,"Isomap"). I've tried importing dimRedMethodList too but had the same error. Loading the package prior to invoking this function also works.
Hi, the Isomap function runs fast. Is there any way to select landmark points automatically in your Isomap implementation? Many thanks.
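For reference, one common way to pick landmarks automatically is greedy farthest-point ("maxmin") selection: start from one point and repeatedly add the point farthest from all landmarks chosen so far. The sketch below is plain base R and is not a dimRed feature; select_landmarks_maxmin and its arguments are hypothetical names.

```r
# Hypothetical helper: greedy farthest-point (maxmin) landmark selection.
# Returns the row indices of the chosen landmarks.
select_landmarks_maxmin <- function(x, n_landmarks, first = 1L) {
  x <- as.matrix(x)
  landmarks <- integer(n_landmarks)
  landmarks[1] <- first
  # Distance from every point to its nearest landmark chosen so far.
  min_dist <- sqrt(rowSums(sweep(x, 2, x[first, ])^2))
  for (i in seq_len(n_landmarks)[-1]) {
    nxt <- which.max(min_dist)          # farthest point from current landmarks
    landmarks[i] <- nxt
    d <- sqrt(rowSums(sweep(x, 2, x[nxt, ])^2))
    min_dist <- pmin(min_dist, d)       # update nearest-landmark distances
  }
  landmarks
}

set.seed(1)
toy <- matrix(rnorm(300), ncol = 3)     # 100 points in 3 dimensions
idx <- select_landmarks_maxmin(toy, n_landmarks = 10)
print(idx)
```

Random sampling of rows is the simpler alternative; maxmin tends to spread landmarks more evenly at the cost of one pass over the data per landmark.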
Hi,
I am trying to find the reconstruction error for umap and tsne. I get this error.
ir <- loadDataSet("3D S Curve")
ir.umap <- embed(ir, "UMAP", ndim = ndims(ir))
ir.tsne <- embed(ir, "tSNE", ndim = ndims(ir))
rmse <- data.frame(
rmse_umap = reconstruction_error(ir.umap),
rmse_tsne = reconstruction_error(ir.tsne)
)
matplot(rmse, type = "l")
plot(ir)
plot(ir.umap)
plot(ir.tsne)
This gives me an error:
Error in .local(object, ...): object does not have an inverse function
Traceback:
data.frame(rmse_umap = reconstruction_error(ir.umap), rmse_tsne = reconstruction_error(ir.tsne))
reconstruction_error(ir.umap)
reconstruction_error(ir.umap)
.local(object, ...)
getData(inverse(object, getData(getDimRedData(object))[, seq_len(n[i]),
. drop = FALSE]))
inverse(object, getData(getDimRedData(object))[, seq_len(n[i]),
. drop = FALSE])
inverse(object, getData(getDimRedData(object))[, seq_len(n[i]),
. drop = FALSE])
.local(object, ...)
stop("object does not have an inverse function")
Please let me know where I am going wrong and how to fix this issue. Thanks!
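The error message says the embedding object has no inverse function. A reconstruction error needs a map from the low-dimensional embedding back to the original space, and the traceback shows that reconstruction_error() calls inverse() for this. The sketch below illustrates the idea with base R PCA, which does have such a map; it does not use dimRed, and rmse_by_ndim is a hypothetical helper.

```r
x <- as.matrix(iris[, 1:4])
pca <- prcomp(x, center = TRUE, scale. = FALSE)

# Hypothetical helper: RMSE of reconstructing x from the first k components.
rmse_by_ndim <- function(pca, x) {
  vapply(seq_len(ncol(pca$x)), function(k) {
    scores <- pca$x[, seq_len(k), drop = FALSE]
    rot    <- pca$rotation[, seq_len(k), drop = FALSE]
    # Inverse map of PCA: rotate back and undo the centering.
    rec <- sweep(scores %*% t(rot), 2, pca$center, "+")
    sqrt(mean((x - rec)^2))
  }, numeric(1))
}

rmse <- rmse_by_ndim(pca, x)
print(rmse)
```

Methods that only produce coordinates for the training points, without a function back to the input space, cannot be evaluated this way; a rank-based quality measure is the usual alternative for them.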
no applicable method for 'predict' applied to an object of class "c('NMFfit', 'NMF')"
seems like an import problem again.
✖ | 8 1 | NNMF [2.0 s]
────────────────────────────────────────────────────────────────────────────────
test_NNMF.R:68: error: other arguments
no applicable method for 'predict' applied to an object of class "c('NMFfit', 'NMF')"
1: embed(input_trn, "NNMF", seed = 13, nrun = 10, ndim = 3, method = "KL", options = list(.pbackend = NULL)) at /home/gkraemer/progs/R/dimRed/tests/testthat/test_NNMF.R:68
2: embed(input_trn, "NNMF", seed = 13, nrun = 10, ndim = 3, method = "KL", options = list(.pbackend = NULL))
3: .local(.data, ...)
4: do.call(methodObject@fun, args) at /home/gkraemer/progs/R/dimRed/R/embed.R:135
5: (function (data, pars, keep.org.data = TRUE)
{
chckpkg("NMF")
chckpkg("MASS")
meta <- data@meta
orgdata <- if (keep.org.data)
data@data
else NULL
data <- data@data
if (!is.matrix(data))
data <- as.matrix(data)
data <- t(data)
if (pars$ndim > nrow(data))
stop("`ndim` should be less than the number of columns.", call. = FALSE)
if (length(pars$method) != 1)
stop("only supply one `method`", call. = FALSE)
args <- list(x = quote(data), rank = pars$ndim, method = pars$method, nrun = pars$nrun,
seed = pars$seed)
if (length(pars$options) > 0)
args <- c(args, pars$options)
nmf_result <- do.call(NMF::nmf, args)
w <- NMF::basis(nmf_result)
h <- t(NMF::coef(nmf_result))
colnames(w) <- paste0("NNMF", 1:ncol(w))
other.data <- list(w = w)
colnames(h) <- paste0("NNMF", 1:ncol(h))
appl <- function(x) {
appl.meta <- if (inherits(x, "dimRedData"))
x@meta
else data.frame()
dat <- if (inherits(x, "dimRedData"))
x@data
else x
if (!is.matrix(dat))
dat <- as.matrix(dat)
if (ncol(dat) != nrow(w))
stop("x must have the same number of columns ", "as the original data (",
nrow(w), ")", call. = FALSE)
res <- dat %*% t(MASS::ginv(w))
colnames(res) <- paste0("NNMF", 1:ncol(res))
scores <- new("dimRedData", data = res, meta = appl.meta)
return(scores)
}
inv <- function(x) {
appl.meta <- if (inherits(x, "dimRedData"))
x@meta
else data.frame()
proj <- if (inherits(x, "dimRedData"))
x@data
else x
if (ncol(proj) > ncol(w))
stop("x must have less or equal number of dimensions ", "as the original data")
res <- tcrossprod(proj, w)
colnames(res) <- colnames(data)
res <- new("dimRedData", data = res, meta = appl.meta)
return(res)
}
inv <- function(x) {
appl.meta <- if (inherits(x, "dimRedData"))
x@meta
else data.frame()
proj <- if (inherits(x, "dimRedData"))
x@data
else x
if (ncol(proj) > ncol(data))
stop("x must have less or equal number of dimensions ", "as the original data")
reproj <- proj %*% other.data$H
reproj <- new("dimRedData", data = reproj, meta = appl.meta)
return(reproj)
}
res <- new("dimRedResult", data = new("dimRedData", data = h, meta = meta), org.data = orgdata,
apply = appl, inverse = inv, has.org.data = keep.org.data, has.apply = TRUE,
has.inverse = TRUE, method = "NNMF", pars = pars, other.data = other.data)
return(res)
})(data = <S4 object of class structure("dimRedData", package = "dimRed")>, keep.org.data = TRUE,
pars = structure(list(ndim = 3, method = "KL", nrun = 10, seed = 13, options = structure(list(
.pbackend = NULL), .Names = ".pbackend")), .Names = c("ndim", "method", "nrun",
"seed", "options")))
6: do.call(NMF::nmf, args) at /home/gkraemer/progs/R/dimRed/R/nnmf.R:93
7: (structure(function (x, rank, method, ...)
standardGeneric("nmf"), generic = structure("nmf", package = "NMF"), package = "NMF", group = list(), valueClass = character(0), signature = c("x",
"rank", "method"), default = `\001NULL\001`, skeleton = (function (x, rank, method,
...)
stop("invalid call in method dispatch to 'nmf' (no default method)", domain = NA))(x,
rank, method, ...), class = structure("standardGeneric", package = "methods")))(x = data,
rank = 3, method = "KL", nrun = 10, seed = 13, .pbackend = NULL)
8: (structure(function (x, rank, method, ...)
standardGeneric("nmf"), generic = structure("nmf", package = "NMF"), package = "NMF", group = list(), valueClass = character(0), signature = c("x",
"rank", "method"), default = `\001NULL\001`, skeleton = (function (x, rank, method,
...)
stop("invalid call in method dispatch to 'nmf' (no default method)", domain = NA))(x,
rank, method, ...), class = structure("standardGeneric", package = "methods")))(x = data,
rank = 3, method = "KL", nrun = 10, seed = 13, .pbackend = NULL)
9: nmf(x, rank, method = strategy, ...)
10: nmf(x, rank, method = strategy, ...)
...
15: (function (n, RNGobj)
{
if (verbose) {
if (verbose > 1) {
cat("\n## Run: ", n, "/", nrun, "\n", sep = "")
}
else {
cat("", n)
}
}
if (verbose > 2)
message("# Setting up loop RNG ... ", appendLF = FALSE)
setRNG(RNGobj, verbose = verbose > 3)
if (verbose > 2)
message("OK")
if (n == 1 && .checkRandomness) {
.RNGinit <- getRNG()
}
res <- nmf(x, rank, method, nrun = 1, seed = seed, model = model, .options = .options,
...)
if (n == 1 && .checkRandomness && rng.equal(.RNGinit)) {
warning("NMF::nmf - You are running multiple non-random NMF runs with a fixed seed",
immediate. = TRUE)
}
if (!keep.all) {
resList <- list(residuals = NA, .callback = NULL)
err <- residuals(res)
best <- best.static$residuals
if (is.na(best) || err < best) {
if (verbose) {
if (verbose > 1L)
cat("## Updating best fit [deviance =", err, "]\n", sep = "")
else cat("*")
}
best.static$fit <<- res
best.static$residuals <<- err
resList$residuals <- err
}
best.static$consensus <<- best.static$consensus + connectivity(res, no.attrib = TRUE)
if (!is.null(.callback)) {
resList$.callback <- tryCatch(.callback(res, n), error = function(e) e)
}
res <- resList
}
if (opt.gc && n%%opt.gc == 0) {
if (verbose > 1)
message("# Call garbage collection NOW")
else if (verbose)
cat("%")
gc(verbose = verbose > 3)
}
if (verbose > 1)
cat("## DONE\n")
res
})(dots[[1L]][[1L]], dots[[2L]][[1L]])
16: connectivity(res, no.attrib = TRUE)
17: connectivity(res, no.attrib = TRUE)
18: .local(object, ...)
19: callNextMethod(object = object, what = "samples")
20: eval(call, callEnv)
21: eval(call, callEnv)
22: .nextMethod(object = object, what = "samples")
23: predict(object, ...)
24: predict(object, ...)
─────────────────────────────────────────
The written documentation is very scarce (probably a consequence of using Roxygen ...). More detail would be helpful.
Technically, due to the documentation file names, the "see also" links are absolutely horrible; they would be much more helpful in their obvious short version. Can that be changed?
Best, Ulrike
It would be great if the quality methods would directly work on coRanking matrices.
Best, Ulrike
On my computer, running testthat::test(".") results in an endless loop with 100% CPU utilization. I think NMF is the culprit but cannot pin it down with absolute certainty. The tests run just fine on CRAN though.
Thanks for providing dimRed.
The documentation regarding AUC_lnK_R_NX is quite misleading, as you also seem to be aware of in your code. You currently use normalized inverse position weights, instead of the claimed logarithmic ones.
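To make the weighting in question concrete, it can be written as a weighted mean of the quality curve. The sketch assumes that R_NX(K) is available as a numeric vector indexed by K = 1, 2, ... and that "normalized inverse position weights" means weights 1/K divided by their sum. The name auc_inverse_position is hypothetical, and the code is not taken from dimRed.

```r
# Hypothetical helper: weighted mean of r_nx with weights 1/K, normalized
# by the sum of the weights.
auc_inverse_position <- function(r_nx) {
  k <- seq_along(r_nx)
  sum(r_nx / k) / sum(1 / k)
}

# A constant curve is returned unchanged, whatever the weights.
flat <- auc_inverse_position(rep(0.5, 10))

# Small K dominates: quality at K = 1 counts twice as much as at K = 2.
early <- auc_inverse_position(c(1, 0))
late  <- auc_inverse_position(c(0, 1))
print(c(flat = flat, early = early, late = late))
```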
Would be good if documentation were adapted to code or vice versa.
Best, Ulrike
Hi,
I am trying to apply the quality metrics from the dimRed package to the MNIST dataset, but I am unable to load the dataset and get a dimRedData object. Please help. Thanks!
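A possible starting point, as a sketch only: a dimRedData object holds a numeric matrix (one row per observation) plus a data frame of metadata, so image data has to be flattened to one row per image first. The example uses random numbers as a stand-in for MNIST pixels and does not show how to read the real files. The dimRed calls are guarded and run only if the package is installed, and the variable names are made up.

```r
set.seed(42)
n <- 20

# Stand-in for MNIST: n images of 28 x 28 pixels, flattened to 784 columns.
pixels <- matrix(runif(n * 784), nrow = n,
                 dimnames = list(NULL, paste0("px", seq_len(784))))

# Labels go into the metadata, one row per image.
labels <- data.frame(digit = sample(0:9, n, replace = TRUE))

if (requireNamespace("dimRed", quietly = TRUE)) {
  # Assumes the dimRedData constructor accepts `data` and `meta`.
  mnist_drd <- dimRed::dimRedData(data = pixels, meta = labels)
  print(dim(dimRed::getData(mnist_drd)))
}
```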
I'm not able to reproduce the kernel PCA results in comparison to the underlying function. Here's an example:
library(kernlab)
library(dimRed)
set.seed(131)
tr_dat <- matrix(rnorm(100*6), ncol = 6)
te_dat <- matrix(rnorm(20*6), ncol = 6)
colnames(tr_dat) <- paste0("X", 1:6)
colnames(te_dat) <- paste0("X", 1:6)
k_name <- "rbfdot"
k_par <- list(sigma = .2)
## test values
kpca_obj <- kPCA(stdpars = list(ndim = 3, kernel = k_name, kpar = k_par))
kpca_obj <- kpca_obj@fun(dimRedData(tr_dat), kpca_obj@stdpars)
kpca_pred <- kpca_obj@apply(te_dat)@data
## expected values
kpca_obj_exp <- kpca(tr_dat,
kernel = k_name,
kpar = k_par)
kpca_pred_exp <- predict(kpca_obj_exp, tr_dat)[, 1:3]
colnames(kpca_pred_exp) <- paste0("kPCA", 1:3)
I get
> head(kpca_pred)
kPCA1 kPCA2 kPCA3
[1,] -0.1754955 -2.8205993 0.51416167
[2,] 1.1112348 1.7925091 -0.02363246
[3,] 1.9973353 -0.9198911 0.14218226
[4,] 3.0105551 1.4249128 -2.79424169
[5,] -3.2053340 -2.0046749 -0.79662181
[6,] 1.5522026 3.6696689 -2.54760691
> head(kpca_pred_exp)
kPCA1 kPCA2 kPCA3
[1,] 2.614505 2.9551241 2.1230302
[2,] -1.827209 2.4680460 -2.3203690
[3,] 2.956935 -1.2295952 -2.9909752
[4,] -3.740879 -0.8210545 -4.0988922
[5,] -1.015746 -1.7453619 -0.5225218
[6,] -2.357748 2.1721046 -2.2195350
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: macOS Sierra 10.12.3
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dimRed_0.0.3.9001 DRR_0.0.2 CVST_0.2-1 Matrix_1.2-7.1 kernlab_0.9-25
loaded via a namespace (and not attached):
[1] tools_3.3.2 grid_3.3.2 lattice_0.20-34
BTW what's the best way to access the objects generated in the fun code from the base object? I'd like to get ahold of the PCA rotation matrix or the kPCA object res. That gets computed once on the first call?
Thanks,
Max
I find it misleading that embed() accepts a sparse matrix without raising an error. After some investigation, I see that as.matrix() is called on my sparse matrix. I think it would be reasonable to throw an error instead, to prevent a memory explosion. Moreover, the as.matrix() call means a user can pass some other kind of object and get an unexpected result. That is dangerous and, in most cases, of little use.
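A guard along the lines suggested here could look like the following sketch. It uses the Matrix package for the sparse class and estimates the dense size before converting. The helper name to_dense_checked and the max_bytes threshold are hypothetical, and this is not how dimRed currently behaves.

```r
library(methods)
library(Matrix)

# Hypothetical helper: convert to a dense matrix only if the dense copy
# would stay under max_bytes; otherwise stop with an informative error.
to_dense_checked <- function(x, max_bytes = 1e9) {
  if (is(x, "sparseMatrix")) {
    dense_bytes <- prod(dim(x)) * 8     # 8 bytes per double
    if (dense_bytes > max_bytes) {
      stop("refusing to densify a ", nrow(x), " x ", ncol(x),
           " sparse matrix (about ", dense_bytes, " bytes as dense)",
           call. = FALSE)
    }
  }
  as.matrix(x)
}

sp <- rsparsematrix(200, 50, density = 0.02)

dense <- to_dense_checked(sp)                      # small enough: converted
refused <- tryCatch(
  to_dense_checked(sp, max_bytes = 1000),          # too large: error
  error = function(e) conditionMessage(e)
)
print(dim(dense))
print(refused)
```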
It seems that NNLM is unmaintained and got removed from CRAN (@topepo ):
https://cran.r-project.org/web/packages/NNLM/index.html
Questions:
With the reference UMAP implementation (umap-learn 0.3.9, py27_0, conda-forge) installed, dimRed (0.2.3, R-) appears to use only the default knn as specified in umap@stdpars.
library(dimRed)
dat <- loadDataSet("3D S Curve", n = 300)
## use the S4 Class directly:
umap <- UMAP()
umap@stdpars
# $knn
# [1] 15
#
# $ndim
# [1] 2
#
# $d
# [1] "euclidean"
#
# $method
# [1] "umap-learn"
emb <- umap@fun(dat, umap@stdpars)
plot(emb)
umap@stdpars$knn <- 30
umap@stdpars
# $knn
# [1] 30
#
# $ndim
# [1] 2
#
# $d
# [1] "euclidean"
#
# $method
# [1] "umap-learn"
emb <- umap@fun(dat, umap@stdpars)
plot(emb) # same plot although it should be different because of change in knn
emb2 <- embed(dat, "UMAP", .mute = NULL, knn = 2, method="naive")
plot(emb2, type = "2vars")
emb2 <- embed(dat, "UMAP", .mute = NULL, knn = 200, method="naive")
plot(emb2, type = "2vars") # same here
sessionInfo()
# R version 3.6.0 (2019-04-26)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 18.04 (Bionic Beaver)
#
# Matrix products: default
# BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
# LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
#
# locale:
# [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_AG.UTF-8
# [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_AG.UTF-8 LC_MESSAGES=en_US.UTF-8
# [7] LC_PAPER=en_AG.UTF-8 LC_NAME=C LC_ADDRESS=C
# [10] LC_TELEPHONE=C LC_MEASUREMENT=en_AG.UTF-8 LC_IDENTIFICATION=C
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# other attached packages:
# [1] dimRed_0.2.3 DRR_0.0.3 CVST_0.2-2 Matrix_1.2-17 kernlab_0.9-27
#
# loaded via a namespace (and not attached):
# [1] compiler_3.6.0 magrittr_1.5 tools_3.6.0 yaml_2.2.0 reticulate_1.12 Rcpp_1.0.1
# [7] RSpectra_0.14-0 grid_3.6.0 jsonlite_1.6 umap_0.2.2.0 lattice_0.20-38
When do you think that you'll do another release? I have a recipes version going to CRAN before the end of the year and I wasn't sure if I could include the NNMF or autoencoder features from dimRed in that version.
Any plans on including this? I might get motivated enough to submit a PR. If so, do you prefer any particular package (NMF or NNLM)?
The DiffusionMaps method appears not to work when ndim is set to 1.
To reproduce:
library(dimRed)
set.seed(1)
embed(matrix(rnorm(1E5), 100), 'DiffusionMaps', ndim=1)
## Performing eigendecomposition
## Computing Diffusion Coordinates
## Elapsed time: 0.009 seconds
## Warning in seq_len(ncol(outdata)): first element used of 'length.out'
## argument
## Error in seq_len(ncol(outdata)): argument must be coercible to non-negative integer
System info
sessionInfo()
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Arch Linux
##
## Matrix products: default
## BLAS: /usr/lib/libblas.so.3.8.0
## LAPACK: /usr/lib/liblapack.so.3.8.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] dimRed_0.2.2 DRR_0.0.3 CVST_0.2-2 Matrix_1.2-15
## [5] kernlab_0.9-27 nvimcom_0.9-75 colorout_1.2-0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.0 lattice_0.20-38 digest_0.6.18
## [4] grid_3.5.2 magrittr_1.5 evaluate_0.12
## [7] stringi_1.2.4 scatterplot3d_0.3-41 rmarkdown_1.11
## [10] tools_3.5.2 stringr_1.3.1 igraph_1.2.2
## [13] xfun_0.4 yaml_2.2.0 compiler_3.5.2
## [16] pkgconfig_2.0.2 BiocManager_1.30.4 htmltools_0.3.6
## [19] diffusionMap_1.1-0.1 knitr_1.21
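The failing call in the output above is seq_len(ncol(outdata)). For a plain vector, ncol() returns NULL, which seq_len() cannot use, so a one-dimensional result that comes back as a vector rather than a one-column matrix would produce this kind of failure. That this is what happens inside DiffusionMaps is an assumption. The sketch shows the base R behaviour and one way to normalize the shape; the column name prefix is made up.

```r
outdata <- c(0.3, -1.2, 0.8)        # 1-D embedding returned as a plain vector

vector_has_no_ncol <- is.null(ncol(outdata))   # vectors have no dim attribute

# Coerce to a one-column matrix before asking for the number of columns.
outdata <- as.matrix(outdata)
new_names <- paste0("diffMap", seq_len(ncol(outdata)))
colnames(outdata) <- new_names

print(vector_has_no_ncol)
print(dim(outdata))
```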
Tensorflow 2.0 has a new api.
Include something like this and add some parameters to inverse(...)
library(dimRed)
ir <- loadDataSet("Iris")
ir.drr <- embed(ir, "DRR", ndim = ndims(ir))
ir.pca <- embed(ir, "PCA", ndim = ndims(ir))
get_rmse_by_ndim <- function (x, n = ndims(x)) {
res <- numeric(n)
org <- getData(getOrgData(x))
for (i in seq_len(n)) {
rec <- getData(inverse(x, getData(getDimRedData(x))[, seq_len(i), drop = FALSE]))
res[i] <- sqrt(mean((org - rec) ^ 2))
}
res
}
rmse <- data.frame(
rmse_drr = get_rmse_by_ndim(ir.drr),
rmse_pca = get_rmse_by_ndim(ir.pca)
)
matplot(rmse, type = "l")
plot(ir)
plot(ir.drr)
plot(ir.pca)
Current master does not do that!
Dear Guido Kraemer,
Thanks for the package! I am thinking of contributing a vignette that would help users quickly understand how to use the package (more on the usage side than illustrating different methods). Is it a good idea?
Regards,
Srikanth KS
You might consider adding some of Tukey's data depth methods. R has a few packages that you could wrap, including ddalpha (this paper gives a pretty good description of it).
Can you add functions or classes that allow the model to be estimated from one data set and then applied to any other data set? This wouldn't work for every method (e.g. MDS) but would be extremely useful.
For example, with PCA:
set.seed(12)
for_mod <- sample(1:nrow(USArrests), 40)
pca_mod <- prcomp(~ Murder + Assault + Rape, data = USArrests[for_mod, ], scale = TRUE)
## now apply the projection onto any data set:
pca_mod_data <- predict(pca_mod, USArrests[ for_mod, ])
pca_other_data <- predict(pca_mod, USArrests[-for_mod, ])
Thanks
Hi there,
Please see the repro below. The examples are taken from the documentation of embed.
library(dimRed)
#> Loading required package: DRR
#> Loading required package: kernlab
#> Loading required package: CVST
#> Loading required package: Matrix
#>
#> Attaching package: 'dimRed'
#> The following object is masked from 'package:stats':
#>
#> embed
#> The following object is masked from 'package:base':
#>
#> as.data.frame
as.data.frame(
embed(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
iris, "PCA", .keep.org.data = FALSE)
)
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': invalid class "dimRedResult" object: invalid object for slot "org.data" in class "dimRedResult": got class "NULL", should be or extend class "matrix"
as.data.frame(embed(iris[, 1:4], "PCA", .keep.org.data = FALSE))
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'as.data.frame': invalid class "dimRedResult" object: invalid object for slot "org.data" in class "dimRedResult": got class "NULL", should be or extend class "matrix"
Created on 2022-08-28 by the reprex package (v2.0.1)
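The error above states that the org.data slot received NULL where a matrix is required. The general S4 behaviour behind it can be reproduced with toy classes, together with two common ways around it: a class union that admits NULL, or an empty matrix as a placeholder. The class names below are hypothetical and are not the dimRed class definitions.

```r
library(methods)

# A slot typed as "matrix" rejects NULL.
setClass("ResultStrictSketch", representation(org.data = "matrix"))
strict_error <- tryCatch(
  new("ResultStrictSketch", org.data = NULL),
  error = function(e) conditionMessage(e)
)

# Option 1: a class union that allows either a matrix or NULL.
setClassUnion("MatrixOrNullSketch", c("matrix", "NULL"))
setClass("ResultLooseSketch", representation(org.data = "MatrixOrNullSketch"))
loose <- new("ResultLooseSketch", org.data = NULL)

# Option 2: keep the strict slot and store an empty matrix instead of NULL.
placeholder <- new("ResultStrictSketch", org.data = matrix(numeric(0), 0, 0))

print(strict_error)
print(is.null(loose@org.data))
print(dim(placeholder@org.data))
```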