Comments (6)

mtmorgan commented on July 17, 2024

This is the advice in the vignette section 4.1.2

from biocparallel.

DavoSam commented on July 17, 2024

@mtmorgan Thank you for this link (and for the reproducible example btw)! I must've missed that detail when I first combed through the vignette, but it makes a lot of sense now

Jiefei-Wang commented on July 17, 2024

Hi @DavoSam ,

FYI: The current version of BiocParallel does support exporting objects automatically. It is not a panacea and has certain limitations, but it should work in your example.

I think the issue here is that the worker does not attach the package SingleCellExperiment to the search path. The words "attach" and "load" have different meanings in R. In your example, you load the package SingleCellExperiment without attaching it to the R search path. By doing that, the functions in SingleCellExperiment are available to you, but not to R generic functions.
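To make the load/attach distinction concrete, here is a minimal base-R sketch. It uses the base package splines purely as a stand-in (an assumption for illustration; it is not part of the original example), and the variable names are hypothetical.

```r
# Load the namespace without attaching it: its functions are reachable
# via `::`, but the package is not placed on the search path.
invisible(loadNamespace("splines"))
loaded_only <- c(
    loaded   = "splines" %in% loadedNamespaces(),
    attached = "package:splines" %in% search()
)

# Attach the package: it now appears on the search path as well.
library(splines)
after_attach <- c(
    loaded   = "splines" %in% loadedNamespaces(),
    attached = "package:splines" %in% search()
)

print(loaded_only)
print(after_attach)
```

In a fresh session where splines has not been attached beforehand, the first check should report the package as loaded but not attached, and the second should report both.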

One solution is to explicitly attach the package

bplapply(
    1:n,
    function(y) {
        library(SingleCellExperiment)
        len = ncol(ob)
        sample(1:len, p*len, replace = r)
    },
    BPPARAM = BP
)

It should find the function BiocGenerics::ncol with no issue.

Best,
Jiefei

mtmorgan commented on July 17, 2024

Just to confirm that, for a reproducible example, after doing

library(SingleCellExperiment)
example(SingleCellExperiment)

The following fails (because SingleCellExperiment is loaded in the R session of the worker but not attached to the search path)

 > bplapply(list(1:2, 1:3), function(i, sce) sce[,i], sce, BPPARAM = SnowParam(2))
...
Error: BiocParallel errors
  2 remote errors, element index: 1, 2
  0 unevaluated and other errors
  first remote error:
Error in sce[, i]: object of type 'S4' is not subsettable

but the following succeeds

res <- bplapply(list(1:2, 1:3), function(i, sce) {
    suppressPackageStartupMessages({ library(SingleCellExperiment) })
    sce[,i]
}, sce, BPPARAM = SnowParam(2))

DavoSam commented on July 17, 2024

@mtmorgan @Jiefei-Wang

Thank you for the prompt reply, I tested the reproducible example on my local Windows machine and it worked once I attached the package using library(). I didn't recognize the difference between load and attach, thanks for the explanation!

Out of curiosity, I ran the reproducible example and some additional tests on my school's Hoffman2 cluster and noticed some differences in behavior compared to the tests on my local machine. Of course, I recognize the setup is markedly different (the cluster OS is Linux, and it uses JupyterLab, R 4.2.2, and more up-to-date packages). All test output is from the remote 'new test' setup, except for the last output shown for Test 4.

New test setup

R version 4.2.2 (2022-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /u/home/d/davidsam/miniconda3/envs/r_seurat/lib/libopenblasp-r0.3.21.so

locale:
[1] C

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] SingleCellExperiment_1.20.0 SummarizedExperiment_1.28.0
 [3] Biobase_2.58.0              GenomicRanges_1.50.0       
 [5] GenomeInfoDb_1.34.1         IRanges_2.32.0             
 [7] S4Vectors_0.36.0            BiocGenerics_0.44.0        
 [9] MatrixGenerics_1.10.0       matrixStats_0.63.0         
[11] BiocParallel_1.32.5        

loaded via a namespace (and not attached):
 [1] pillar_1.8.1           compiler_4.2.2         XVector_0.38.0        
 [4] base64enc_0.1-3        bitops_1.0-7           tools_4.2.2           
 [7] zlibbioc_1.44.0        digest_0.6.31          uuid_1.1-0            
[10] lattice_0.20-45        jsonlite_1.8.4         evaluate_0.20         
[13] lifecycle_1.0.3        rlang_1.0.6            Matrix_1.5-3          
[16] DelayedArray_0.24.0    IRdisplay_1.1          cli_3.6.0             
[19] IRkernel_1.3.2         parallel_4.2.2         fastmap_1.1.0         
[22] GenomeInfoDbData_1.2.9 repr_1.1.6             vctrs_0.5.2           
[25] grid_4.2.2             glue_1.6.2             snow_0.4-4            
[28] fansi_1.0.4            pbdZMQ_0.3-9           codetools_0.2-18      
[31] htmltools_0.5.4        utf8_1.2.2             RCurl_1.98-1.9        
[34] crayon_1.5.2

Local test setup

R version 4.1.3 (2022-03-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BiocParallel_1.28.3         SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0 Biobase_2.54.0              GenomicRanges_1.46.1       
 [6] GenomeInfoDb_1.30.1         IRanges_2.28.0              S4Vectors_0.32.3            BiocGenerics_0.40.0         MatrixGenerics_1.6.0       
[11] matrixStats_0.61.0         

loaded via a namespace (and not attached):
 [1] SeuratObject_4.0.4     Rcpp_1.0.8.3           lattice_0.20-45        tidyr_1.2.0            snow_0.4-4             prettyunits_1.1.1     
 [7] ps_1.6.0               assertthat_0.2.1       rprojroot_2.0.2        utf8_1.2.2             R6_2.5.1               SeuratData_0.2.1      
[13] pillar_1.7.0           zlibbioc_1.40.0        rlang_1.0.2            callr_3.7.0            Matrix_1.4-0           desc_1.4.1            
[19] devtools_2.4.3         RCurl_1.98-1.6         DelayedArray_0.20.0    compiler_4.1.3         pkgconfig_2.0.3        pkgbuild_1.3.1        
[25] tidyselect_1.1.2       tibble_3.1.6           GenomeInfoDbData_1.2.7 fansi_1.0.3            crayon_1.5.1           dplyr_1.0.8           
[31] withr_2.5.0            bitops_1.0-7           brio_1.1.3             rappdirs_0.3.3         grid_4.1.3             gtable_0.3.0          
[37] lifecycle_1.0.1        DBI_1.1.2              magrittr_2.0.2         cli_3.2.0              cachem_1.0.6           XVector_0.34.0        
[43] fs_1.5.2               remotes_2.4.2          testthat_3.1.2         ellipsis_0.3.2         generics_0.1.2         vctrs_0.3.8           
[49] tools_4.1.3            glue_1.6.2             purrr_0.3.4            processx_3.5.3         pkgload_1.2.4          parallel_4.1.3        
[55] fastmap_1.1.0          sessioninfo_1.2.2      memoise_2.0.1          usethis_2.1.5

Test 1:

#reproducible example
#result: FAIL (expected), same as on local setup
library(BiocParallel)
library(SingleCellExperiment)
example(SingleCellExperiment)
bplapply(list(1:2, 1:3), function(i, sce) sce[,i], sce, BPPARAM = SnowParam(2))
Test 1 Output


Loading required package: SummarizedExperiment

Loading required package: MatrixGenerics

Loading required package: matrixStats


Attaching package: 'MatrixGenerics'


The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars


Loading required package: GenomicRanges

Loading required package: stats4

Loading required package: BiocGenerics


Attaching package: 'BiocGenerics'


The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs


The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min


Loading required package: S4Vectors


Attaching package: 'S4Vectors'


The following objects are masked from 'package:base':

    I, expand.grid, unname


Loading required package: IRanges

Loading required package: GenomeInfoDb

Loading required package: Biobase

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.



Attaching package: 'Biobase'


The following object is masked from 'package:MatrixGenerics':

    rowMedians


The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians



SnglCE> ncells <- 100

SnglCE> u <- matrix(rpois(20000, 5), ncol=ncells)

SnglCE> v <- log2(u + 1)

SnglCE> pca <- matrix(runif(ncells*5), ncells)

SnglCE> tsne <- matrix(rnorm(ncells*2), ncells)

SnglCE> sce <- SingleCellExperiment(assays=list(counts=u, logcounts=v),
SnglCE+     reducedDims=SimpleList(PCA=pca, tSNE=tsne))

SnglCE> sce
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(2): PCA tSNE
mainExpName: NULL
altExpNames(0):

SnglCE> ## coercion from SummarizedExperiment
SnglCE> se <- SummarizedExperiment(assays=list(counts=u, logcounts=v))

SnglCE> as(se, "SingleCellExperiment")
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
Loading required package: SingleCellExperiment
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: 'MatrixGenerics'

The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars

Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    I, expand.grid, unname

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.


Attaching package: 'Biobase'

The following object is masked from 'package:MatrixGenerics':

    rowMedians

The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians



Loading required package: SingleCellExperiment
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: 'MatrixGenerics'

The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars

Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    I, expand.grid, unname

Loading required package: IRanges
Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.


Attaching package: 'Biobase'

The following object is masked from 'package:MatrixGenerics':

    rowMedians

The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians



Error: BiocParallel errors
  2 remote errors, element index: 1, 2
  0 unevaluated and other errors
  first remote error:
Error in sce[, i]: object of type 'S4' is not subsettable

Traceback:

1. bplapply(list(1:2, 1:3), function(i, sce) sce[, i], sce, BPPARAM = SnowParam(2))
2. bplapply(list(1:2, 1:3), function(i, sce) sce[, i], sce, BPPARAM = SnowParam(2))
3. .bpinit(manager = manager, X = X, FUN = FUN, ARGS = ARGS, BPPARAM = BPPARAM, 
 .     BPOPTIONS = BPOPTIONS, BPREDO = BPREDO)

Test 2:

#reproducible example #2
#result: SUCCESS (expected), same as on local setup
library(BiocParallel)
library(SingleCellExperiment)
example(SingleCellExperiment)
res <- bplapply(list(1:2, 1:3), function(i, sce) {
    suppressPackageStartupMessages({ library(SingleCellExperiment) })
    sce[,i]
}, sce, BPPARAM = SnowParam(2))
res
Test 2 Output

Loading required package: SummarizedExperiment

Loading required package: MatrixGenerics

Loading required package: matrixStats


Attaching package: 'MatrixGenerics'


The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars


Loading required package: GenomicRanges

Loading required package: stats4

Loading required package: BiocGenerics


Attaching package: 'BiocGenerics'


The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs


The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min


Loading required package: S4Vectors


Attaching package: 'S4Vectors'


The following objects are masked from 'package:base':

    I, expand.grid, unname


Loading required package: IRanges

Loading required package: GenomeInfoDb

Loading required package: Biobase

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.



Attaching package: 'Biobase'


The following object is masked from 'package:MatrixGenerics':

    rowMedians


The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians



SnglCE> ncells <- 100

SnglCE> u <- matrix(rpois(20000, 5), ncol=ncells)

SnglCE> v <- log2(u + 1)

SnglCE> pca <- matrix(runif(ncells*5), ncells)

SnglCE> tsne <- matrix(rnorm(ncells*2), ncells)

SnglCE> sce <- SingleCellExperiment(assays=list(counts=u, logcounts=v),
SnglCE+     reducedDims=SimpleList(PCA=pca, tSNE=tsne))

SnglCE> sce
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(2): PCA tSNE
mainExpName: NULL
altExpNames(0):

SnglCE> ## coercion from SummarizedExperiment
SnglCE> se <- SummarizedExperiment(assays=list(counts=u, logcounts=v))

SnglCE> as(se, "SingleCellExperiment")
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
[[1]]
class: SingleCellExperiment 
dim: 200 2 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(2): PCA tSNE
mainExpName: NULL
altExpNames(0):

[[2]]
class: SingleCellExperiment 
dim: 200 3 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(2): PCA tSNE
mainExpName: NULL
altExpNames(0):

Test 3:

#test to see if explicit passing of function arguments is necessary with BiocParallel 1.32.5 (vs 1.28.3 on local)
#result: FAIL (unexpected) - same as on local setup
library(BiocParallel)
library(SingleCellExperiment)
example(SingleCellExperiment)
prop = 0.2
num = 2
dummyFUN2 = function(ob,n,p,n_cores=2) {
  BP = BiocParallel::SnowParam(workers = n_cores)
  bplapply(1:n, 
           function(y) { library(SingleCellExperiment); len = ncol(ob); sample(1:len, p*len, replace = FALSE) }
           , BPPARAM = BP)
  
}
dummyFUN2(sce,num,prop)
Test 3 Output

Loading required package: SummarizedExperiment

Loading required package: MatrixGenerics

Loading required package: matrixStats


Attaching package: 'MatrixGenerics'


The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars


Loading required package: GenomicRanges

Loading required package: stats4

Loading required package: BiocGenerics


Attaching package: 'BiocGenerics'


The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs


The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min


Loading required package: S4Vectors


Attaching package: 'S4Vectors'


The following objects are masked from 'package:base':

    I, expand.grid, unname


Loading required package: IRanges

Loading required package: GenomeInfoDb

Loading required package: Biobase

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.



Attaching package: 'Biobase'


The following object is masked from 'package:MatrixGenerics':

    rowMedians


The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians



SnglCE> ncells <- 100

SnglCE> u <- matrix(rpois(20000, 5), ncol=ncells)

SnglCE> v <- log2(u + 1)

SnglCE> pca <- matrix(runif(ncells*5), ncells)

SnglCE> tsne <- matrix(rnorm(ncells*2), ncells)

SnglCE> sce <- SingleCellExperiment(assays=list(counts=u, logcounts=v),
SnglCE+     reducedDims=SimpleList(PCA=pca, tSNE=tsne))

SnglCE> sce
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(2): PCA tSNE
mainExpName: NULL
altExpNames(0):

SnglCE> ## coercion from SummarizedExperiment
SnglCE> se <- SummarizedExperiment(assays=list(counts=u, logcounts=v))

SnglCE> as(se, "SingleCellExperiment")
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
Error: BiocParallel errors
  2 remote errors, element index: 1, 2
  0 unevaluated and other errors
  first remote error:
Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'ncol': object 'sce' not found

Traceback:

1. dummyFUN2(sce, num, prop)
2. bplapply(1:n, function(y) {
 .     library(SingleCellExperiment)
 .     len = ncol(ob)
 .     sample(1:len, p * len, replace = FALSE)
 . }, BPPARAM = BP)   # at line 10-12 of file <text>
3. bplapply(1:n, function(y) {
 .     library(SingleCellExperiment)
 .     len = ncol(ob)
 .     sample(1:len, p * len, replace = FALSE)
 . }, BPPARAM = BP)
4. .bpinit(manager = manager, X = X, FUN = FUN, ARGS = ARGS, BPPARAM = BPPARAM, 
 .     BPOPTIONS = BPOPTIONS, BPREDO = BPREDO)

Test 4:

#dummyFUN2 - test to see if (superficially) identical code works on cluster vs local (i.e. without doing library(SingleCellExperiment): 
#result - SUCCESS (unexpected) - different compared to local setup
library(BiocParallel)
library(SingleCellExperiment)
example(SingleCellExperiment)
prop = 0.2
num = 2
dummyFUN2 = function(ob,n,p,n_cores=2) {
  BP = BiocParallel::SnowParam(workers = n_cores)
  bplapply(1:n, 
           function(y, ob, p, r) { len = ncol(ob); sample(1:len, p*len, replace = r) }, 
           ob = ob, p = p, r = FALSE, BPPARAM = BP)
  
}
res = dummyFUN2(sce,num,prop)
print(res)
Test 4 Output

Loading required package: SummarizedExperiment

Loading required package: MatrixGenerics

Loading required package: matrixStats


Attaching package: 'MatrixGenerics'


The following objects are masked from 'package:matrixStats':

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
    colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
    colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
    colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
    colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
    colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds,
    colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
    rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
    rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
    rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
    rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
    rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
    rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
    rowWeightedSds, rowWeightedVars


Loading required package: GenomicRanges

Loading required package: stats4

Loading required package: BiocGenerics


Attaching package: 'BiocGenerics'


The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs


The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min


Loading required package: S4Vectors


Attaching package: 'S4Vectors'


The following objects are masked from 'package:base':

    I, expand.grid, unname


Loading required package: IRanges

Loading required package: GenomeInfoDb

Loading required package: Biobase

Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.



Attaching package: 'Biobase'


The following object is masked from 'package:MatrixGenerics':

    rowMedians


The following objects are masked from 'package:matrixStats':

    anyMissing, rowMedians



SnglCE> ncells <- 100

SnglCE> u <- matrix(rpois(20000, 5), ncol=ncells)

SnglCE> v <- log2(u + 1)

SnglCE> pca <- matrix(runif(ncells*5), ncells)

SnglCE> tsne <- matrix(rnorm(ncells*2), ncells)

SnglCE> sce <- SingleCellExperiment(assays=list(counts=u, logcounts=v),
SnglCE+     reducedDims=SimpleList(PCA=pca, tSNE=tsne))

SnglCE> sce
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(2): PCA tSNE
mainExpName: NULL
altExpNames(0):

SnglCE> ## coercion from SummarizedExperiment
SnglCE> se <- SummarizedExperiment(assays=list(counts=u, logcounts=v))

SnglCE> as(se, "SingleCellExperiment")
class: SingleCellExperiment 
dim: 200 100 
metadata(0):
assays(2): counts logcounts
rownames: NULL
rowData names(0):
colnames: NULL
colData names(0):
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
Loading required package: SingleCellExperiment


Loading required package: SingleCellExperiment


[[1]]
 [1] 43 69 85 51 33 82 20 42 57 73 32 63 37 47 92 48  8 23 55 39

[[2]]
 [1] 33 14 35 19  9  1 17 44 18 37 64 66 48 67 80 47 94 25 99 88

Test 4 Output (local setup)

Loading required package: SingleCellExperiment
Loading required package: SummarizedExperiment
Loading required package: MatrixGenerics
Loading required package: matrixStats

Attaching package: ‘MatrixGenerics’

The following objects are masked from ‘package:matrixStats’:

    colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse, colCounts, colCummaxs, colCummins, colCumprods, colCumsums, colDiffs,
    colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs, colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats, colProds,
    colQuantiles, colRanges, colRanks, colSdDiffs, colSds, colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
    colWeightedMeans, colWeightedMedians, colWeightedSds, colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet, rowCollapse,
    rowCounts, rowCummaxs, rowCummins, rowCumprods, rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps, rowMadDiffs, rowMads,
    rowMaxs, rowMeans2, rowMedians, rowMins, rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks, rowSdDiffs, rowSds, rowSums2,
    rowTabulates, rowVarDiffs, rowVars, rowWeightedMads, rowWeightedMeans, rowWeightedMedians, rowWeightedSds, rowWeightedVars

Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind,
    Reduce, rownames, sapply, setdiff, sort, table, tapply, union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: ‘S4Vectors’

The following objects are masked from ‘package:base’:

    expand.grid, I, unname

Loading required package: IRanges

Attaching package: ‘IRanges’

The following object is masked from ‘package:grDevices’:

    windows

Loading required package: GenomeInfoDb
Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with 'browseVignettes()'. To cite Bioconductor, see 'citation("Biobase")', and for
    packages 'citation("pkgname")'.


Attaching package: ‘Biobase’

The following object is masked from ‘package:MatrixGenerics’:

    rowMedians

The following objects are masked from ‘package:matrixStats’:

    anyMissing, rowMedians


Error: BiocParallel errors
  2 remote errors, element index: 1, 2
  0 unevaluated and other errors
  first remote error: argument of length 0
In addition: Warning messages:
1: In serialize(data, node$con) :
  'package:stats' may not be available when loading
2: In serialize(data, node$con) :
 
 Error: BiocParallel errors
2 remote errors, element index: 1, 2
0 unevaluated and other errors
first remote error: argument of length 0

Follow-up Questions:

  1. It seems the remote cluster setup is able to run Test 4 without needing to call library(SingleCellExperiment). Based on the output, the worker processes load the package themselves, whereas they do not on the local setup. @Jiefei-Wang Is this due to what you said about the newest version of BiocParallel exporting objects to the workers automatically? Or is it more likely due to non-BiocParallel differences between the two setups (R version, OS, etc.)?
  2. Since I was using the newest version of BiocParallel on the remote setup, I expected Test 3 to work, because arguments whose values live in the global environment would be exported to the workers. Or did I misunderstand what you meant by that statement?

Takeaways(?):
From what I'm seeing, the best practice for me is to 1) explicitly pass function arguments to the FUN in bplapply and 2) call library() within FUN when working with non-base, special object classes such as SingleCellExperiment or SeuratObject.
I will be sharing this code with other users in my lab and know that their setups will be different than mine, so I'm looking for the most robust way of guaranteeing equivalent behavior. Sorry for the long post and thanks again for helping!

Best,
David Samvelian

Jiefei-Wang commented on July 17, 2024

Hi, @DavoSam ,

The problem with your Test 3 is actually non-trivial. It is related to lazy evaluation in R. BiocParallel does export the object ob, but as an unevaluated promise. When that promise is evaluated in a worker, the worker looks for the object sce in its own global environment. The worker cannot find it because sce was never defined there, hence the error you saw.
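The lazy-evaluation behaviour described here can be seen in plain R, without any parallel back end. The sketch below is only an illustration: make_lazy, make_forced, and the matrix m are hypothetical names, and force() is shown as one commonly used way to evaluate an argument early, not as the fix that was applied in BiocParallel.

```r
# A function argument is a promise: it is not evaluated until first use.
make_lazy <- function(ob) {
    function() ncol(ob)      # `ob` is still an unevaluated promise here
}

# force() evaluates the promise immediately, so the closure keeps the value.
make_forced <- function(ob) {
    force(ob)
    function() ncol(ob)
}

m <- matrix(0, nrow = 2, ncol = 5)
lazy   <- make_lazy(m)
forced <- make_forced(m)

# Change `m` before either closure has been called.
m <- matrix(0, nrow = 2, ncol = 9)

lazy_cols   <- lazy()    # promise is evaluated only now, against the new `m`
forced_cols <- forced()  # value was captured when make_forced() ran
print(c(lazy = lazy_cols, forced = forced_cols))
```

Here the lazy closure reports 9 columns and the forced one reports 5, because the lazy promise looks up m only when it is first used. On a worker the analogous late lookup happens in a different R process, where the original variable does not exist at all.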

I think this is a place where we can improve; I'll make a pull request to fix this issue.

Best,
Jiefei
