Comments (40)
@jlmelville Yes! That fixed it. The patch added to rcpp-parallel on conda works for me.
conda install -c conda-forge r-rcppparallel
from uwot.
FWIW, I directly ran valgrind, and RDcsan in the container provided by https://github.com/wch/r-debug (RDsan seems to not work well with building RcppEigen) and did not see anything flagged that wasn't already something that shows up in the CRAN checks for RcppAnnoy and RcppParallel.
A new version of uwot is now on CRAN with the two fixes unearthed in this issue.
from uwot.
Sorry you are having trouble. I am unable to reproduce the crash you give, even with R-devel installed. If you get a stacktrace, do you get the same error as in the other issues, i.e. memory not mapped when RcppParallel::setThreadOptions() is called?
If so, what does RcppParallel::defaultNumThreads() say? My guess is that a non-integer value of n_threads < 1 is being passed in, which I have just discovered does seem to cause RcppParallel some grief.
from uwot.
The current master might solve the issue. Please give it a try if possible and let me know.
from uwot.
Yeah, looks nasty. I can confirm that 0 < n_threads < 1 blows up on my Mac. Probably RcppParallel::defaultNumThreads() is giving 1 on affected machines, so you get n_threads=0.5. Interesting that n_threads=0 works; I would have thought that there would be a cast to integer at some point such that a fractional value would get truncated to zero pretty quickly...
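A minimal sketch of the kind of guard that would keep a fractional thread count from ever reaching the threading backend. `sanitize_n_threads` is a hypothetical helper, not part of uwot or RcppParallel, and truncating toward zero is only one possible policy (it matches the truncation speculated about above):

```r
# Hypothetical helper (not a uwot or RcppParallel function): validate a
# user-supplied thread count and truncate it to a whole number.
sanitize_n_threads <- function(n_threads) {
  ok <- is.numeric(n_threads) && length(n_threads) == 1L &&
    is.finite(n_threads) && n_threads >= 0
  if (!ok) {
    stop("n_threads must be a single finite, non-negative number")
  }
  # floor() maps anything in [0, 1) to 0, so a value such as 0.5 can never
  # be passed on as a fraction.
  as.integer(floor(n_threads))
}

sanitize_n_threads(0.5)  # 0L
sanitize_n_threads(4)    # 4L
```

Whether 0 should then mean "no threading" is up to the caller; the comments here only establish that n_threads=0 runs and that values strictly between 0 and 1 crash.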
from uwot.
Thanks for confirming the issue @LTLA. That's one problem solved, but is it the problem? I suppose with uninitialized memory anything is possible, although it's odd that it works once then fails, and that more than one person reports in satijalab/seurat#2256 that installing RcppParallel from conda-forge solves the issue. Some compiler difference that initializes the memory differently?
from uwot.
Hmm. The fact that it fails on the second go does suggest it's a memory leak of some sort rather than the n_threads problem (which always fails immediately for me). Valgrind gives me a whole stack of warnings if run on the OP's code, but they all relate to base::eval rather than anything in uwot.
The question is whether this is a memory leak in uwot or RcppParallel, given that the problem was "fixed" by reinstalling the latter from conda. Though given how much conda messes with the libraries, it feels like a house of cards to rely on that to solve this kind of problem.
It would be nice to see what happens if someone can run Valgrind on a machine where the above code crashes. Might be pretty painful to do on Windows, though.
from uwot.
Dear @LTLA and @jlmelville ,
I just installed the latest uwot code, reinstalled RcppParallel and ran the code again....it now works!
I should not have reinstalled RcppParallel at the same time, so that I could confirm that the changes in the uwot code did the trick and not the reinstallation of RcppParallel. I am sorry for that ;-)
RcppParallel::defaultNumThreads() gives 8, by the way (as it did before).
Just to let you know, I had been running the code setting different values for n_threads, but to no avail.
Thanks for all your help!!
from uwot.
I'm glad it's working now, but I am mystified as to why. Did you reinstall RcppParallel from CRAN or from conda forge?
Edited to add: if n_threads was being set manually, then I am even more baffled. Seems like I will have to do another check of the parallel code to make sure it's not calling an R API at any point before I submit a new version to CRAN.
from uwot.
I reinstalled using CRAN (should have said so in previous comment). I am also mystified, but I am not that well versed in programming/tracing/debugging to be able to find the source of what went wrong...
And I don't know how RcppParallel in R works together with RcppParallel in conda.
from uwot.
I will note that in the following chunk of code:
out <- prcomp(as.matrix(iris[,-5]))
library(irlba)
out <- irlba(as.matrix(iris[,-5]), nu=1, nv=1)
library(Rtsne)
out <- Rtsne(as.matrix(iris[,-5]), check_duplicates=FALSE)
library(uwot)
iris_umap <- umap(as.matrix(iris[,-5]), pca = 50, n_threads=1)
# And a second time
iris_umap2 <- umap(iris[,-5], pca = 50, n_threads=1)
Only umap triggers Valgrind warnings. So it actually doesn't seem like a pure eval problem; there seems to be some interaction between something happening in uwot and eval.
The first message looks something like this:
> iris_umap <- umap(as.matrix(iris[,-5]), pca = 50, n_threads=1)
==20371== Invalid read of size 32
==20371== at 0x7154C91: __wcsnlen_avx2 (strlen-avx2.S:62)
==20371== by 0x7082EC1: wcsrtombs (wcsrtombs.c:104)
==20371== by 0x7008B20: wcstombs (wcstombs.c:34)
==20371== by 0x1BE142: wcstombs (stdlib.h:154)
==20371== by 0x1BE142: do_makenames (character.c:938)
==20371== by 0x238822: bcEval (eval.c:7041)
==20371== by 0x24519F: Rf_eval (eval.c:688)
==20371== by 0x246F4E: R_execClosure (eval.c:1852)
==20371== by 0x247C44: Rf_applyClosure (eval.c:1778)
==20371== by 0x23C1C4: bcEval (eval.c:7009)
==20371== by 0x24519F: Rf_eval (eval.c:688)
==20371== by 0x246F4E: R_execClosure (eval.c:1852)
==20371== by 0x247C44: Rf_applyClosure (eval.c:1778)
==20371== Address 0x1136db90 is 0 bytes inside a block of size 12 alloc'd
==20371== at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==20371== by 0x27A750: R_chk_calloc (memory.c:3422)
==20371== by 0x1BE0D0: do_makenames (character.c:931)
==20371== by 0x238822: bcEval (eval.c:7041)
==20371== by 0x24519F: Rf_eval (eval.c:688)
==20371== by 0x246F4E: R_execClosure (eval.c:1852)
==20371== by 0x247C44: Rf_applyClosure (eval.c:1778)
==20371== by 0x23C1C4: bcEval (eval.c:7009)
==20371== by 0x24519F: Rf_eval (eval.c:688)
==20371== by 0x246F4E: R_execClosure (eval.c:1852)
==20371== by 0x247C44: Rf_applyClosure (eval.c:1778)
==20371== by 0x23C1C4: bcEval (eval.c:7009)
Definitely cryptic enough to be a parallelization issue. Rtsne also parallelizes but via OpenMP, which is generally more restrictive so it's harder to accidentally put in R API calls.
from uwot.
Oops. Possibly maybe someone who shall remain nameless (spoiler alert: it's me) is calling the R random number generator from inside a thread? Fixing this isn't conceptually difficult, but requires a fair bit of typing (because it's C++) so might not get finished until later today.
from uwot.
I forgot to tag the commit with this issue, but what's currently on master should hopefully behave. @LTLA, if you ever install from master, re-running valgrind would be an interesting exercise.
from uwot.
master doesn't get rid of the valgrind warnings, but I did manage to track them down to find_ab_params(), most likely the stats::nls() call therein. Running umap() with specified a and b arguments avoids the warnings. This may well be a false positive; it's hard to believe that a base function would be compromised like that. I call nls all over the place in my own functions.
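For anyone wanting to try the same workaround, here is a rough sketch of precomputing a and b once and passing them to umap() explicitly. `fit_ab` is a hypothetical stand-in, not uwot's own code: the target curve, the grid, and the starting values are assumptions based on the usual UMAP parameterisation, so the resulting numbers may differ from what uwot would fit internally.

```r
# Hypothetical stand-in for the internal fit (assumed curve and start values).
fit_ab <- function(spread = 1, min_dist = 0.01) {
  xv <- seq(from = 0, to = spread * 3, length.out = 300)
  # Target: 1 up to min_dist, then exponential decay with scale `spread`.
  yv <- ifelse(xv < min_dist, 1, exp(-(xv - min_dist) / spread))
  fit <- stats::nls(yv ~ 1 / (1 + a * xv^(2 * b)), start = list(a = 1, b = 1))
  stats::coef(fit)
}

ab <- fit_ab()

# Pass the precomputed values so umap() has no need to fit them itself.
if (requireNamespace("uwot", quietly = TRUE)) {
  iris_umap <- uwot::umap(as.matrix(iris[, -5]),
                          a = ab[["a"]], b = ab[["b"]], n_threads = 1)
}
```

Since this sketch still calls stats::nls(), it only moves the fit out of umap(); to take nls out of a valgrind run entirely, the values would have to be computed in a separate session and hard-coded.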
from uwot.
I am still having issues with this, even with the new version on CRAN. I reinstalled everything and tried again, but same as before: on the second run of the example I get a segfault.
from uwot.
Operating system?
from uwot.
It is a Linux cluster, which unfortunately is running the 2.6 kernel. However, there don't seem to be any major issues with any other R package.
Linux n6426 2.6.32-754.14.2.el6.x86_64 #1 SMP Tue May 14 19:35:42 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
from uwot.
Do you have valgrind installed?
If so, could you try copying the OP's code into some file (e.g., test.R) and running:
R CMD BATCH --no-save -d valgrind test.R
and seeing what test.Rout gives? If you don't have valgrind installed, the top should just say that valgrind isn't available. If you do have it installed, it should have some blurb at the top with memcheck blah blah blah and then hopefully give some diagnostics before the crash.
Those diagnostics would be extremely helpful.
from uwot.
Also could you run
iris_umap <- umap(iris, pca = 50, verbose = TRUE, n_threads = 0)
twice in a row, as well as repeating twice with:
iris_umap <- umap(iris, pca = 50, verbose = TRUE, n_threads = 1)
and see if either makes a difference, providing the output for the second crashing run. Getting a clue to where the second crash occurs would be helpful (although I suspect the damage is already done at some point in the first run).
from uwot.
Edit: there is an explicit check that the number of components does not exceed the number of columns in the input, so for iris, the pca = 50 argument can be omitted without affecting the crash.
Also, does omitting pca = 50 make a difference? For iris, that step should be skipped anyway (I think; I'm away from my computer at the moment) because the input data doesn't have sufficient rank to extract 50 components. It would be good to get a minimal reproducible example.
from uwot.
Reinstalling RcppParallel from CRAN after you have installed uwot etc? It seemed to help for me...
from uwot.
If @theboocock has valgrind installed, please try running our suggested commands above before attempting reinstallation. This would be a rare opportunity to identify the problem on a known failing machine and to fix it once and for all - such chances are hard to come by.
from uwot.
@LTLA , you're completely right!! Apologies for suggesting reinstallation BEFORE checking....
from uwot.
Hey all,
Reinstalling never does anything for me anyways. Here is the valgrind error. Seems like it is coming from libtbb.
Is this post relevant https://software.intel.com/en-us/forums/intel-threading-building-blocks/topic/641654?
>
> library(irlba)
Loading required package: Matrix
> out <- irlba(as.matrix(iris[,-5]), nu=1, nv=1)
>
> library(Rtsne)
> out <- Rtsne(as.matrix(iris[,-5]), check_duplicates=FALSE)
>
> library(uwot)
> iris_umap <- umap(as.matrix(iris[,-5]), pca = 50, n_threads=1)
==264614== Invalid read of size 8
==264614== at 0x1BB2A5E8: ??? (in /u/project/kruglyak/smilefre/anaconda3/lib/R/library/RcppParallel/lib/libtbb.so.2)
==264614== by 0x1BDAB1FF: ???
==264614== by 0x1BDC757F: ???
==264614== Address 0xfffffffffffffff7 is not stack'd, malloc'd or (recently) free'd
==264614==
*** caught segfault ***
address 0xfffffffffffffff7, cause 'memory not mapped'
Traceback:
1: RcppParallel::setThreadOptions(numThreads = n_threads)
2: uwot(X = X, n_neighbors = n_neighbors, n_components = n_components, metric = metric, n_epochs = n_epochs, alpha = learning_rate, scale = scale, init = init, init_sdev = init_sdev, spread = spread, min_dist = min_dist, set_op_mix_ratio = set_op_mix_ratio, local_connectivity = local_connectivity, bandwidth = bandwidth, gamma = repulsion_strength, negative_sample_rate = negative_sample_rate, a = a, b = b, nn_method = nn_method, n_trees = n_trees, search_k = search_k, method = "umap", approx_pow = approx_pow, n_threads = n_threads, n_sgd_threads = n_sgd_threads, grain_size = grain_size, y = y, target_n_neighbors = target_n_neighbors, target_weight = target_weight, target_metric = target_metric, pca = pca, pca_center = pca_center, pcg_rand = pcg_rand, fast_sgd = fast_sgd, ret_model = ret_model, ret_nn = ret_nn, tmpdir = tempdir(), verbose = verbose)
3: umap(as.matrix(iris[, -5]), pca = 50, n_threads = 1)
An irrecoverable exception occurred. R is aborting now ...
--264614-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--264614-- si_code=1; Faulting address: 0x20000038; sp: 0x402efbf50
valgrind: the 'impossible' happened:
Killed by fatal signal
==264614== at 0x38047487: vgPlain_get_StackTrace_wrk (m_stacktrace.c:334)
==264614== by 0x3804756B: vgPlain_get_StackTrace (m_stacktrace.c:1086)
==264614== by 0x3802F82E: record_ExeContext_wrk (m_execontext.c:314)
==264614== by 0x38002A84: die_and_free_mem (mc_malloc_wrappers.c:361)
==264614== by 0x3807A59A: vgPlain_scheduler (scheduler.c:1665)
==264614== by 0x3803B63E: final_tidyup (m_main.c:2656)
==264614== by 0x3803B767: shutdown_actions_NORETURN (m_main.c:2457)
==264614== by 0x380A656B: run_a_thread_NORETURN (syswrap-linux.c:199)
sched status:
running_tid=1
Thread 1: status = VgTs_Runnable
from uwot.
@theboocock, thank you for running valgrind. Do you know if you are running any other packages that use RcppParallel? There are definitely some similar issues with memset and gcc6 but I'm loath to prematurely put the blame on TBB.
from uwot.
Well, it looks like it isn't even hitting uwot's C++ code, so it's hard to blame anything else... An even simpler test would be whether running RcppParallel::setThreadOptions() triggers the error, i.e.,
# valgrind me:
library(RcppParallel)
setThreadOptions(numThreads = 1)
If so, that seems like a slam dunk, though the use of conda does complicate matters.
from uwot.
@LTLA, seeing as we get through one run without a crash, is it possible that uwot just stomps all over some memory that RcppParallel or tbb is using? Seems like it's hard to completely rule out uwot being the villain. I'll have a look at finding a container with gcc6 in it and see if it can be reproduced.
from uwot.
I was looking at @theboocock's valgrind output above, where umap fails the first time it runs. (The difference from a non-valgrind context is expected.) Either that, or I've had one too many G&T's.
It's also possible that irlba() or Rtsne() are doing something Bad... which would be even more concerning. The minimal example would be clarifying. So, either just:
# Put into test1.R with nothing else, and run under valgrind:
library(RcppParallel)
setThreadOptions(numThreads = 1)
Or, if the above doesn't trigger the error, then:
# Put into test2.R with nothing else, and run under valgrind:
library(uwot)
iris_umap <- umap(as.matrix(iris[,-5]), pca = 50, n_threads=1)
from uwot.
I wonder if benjjneb/dada2#684 is a related problem? There are some suspicious similarities.
from uwot.
library(RcppParallel)
setThreadOptions(numThreads = 1)
Triggers the error for me. I am going to try the dada2 solution now.
from uwot.
This seems like the key piece in build.sh:
if [[ $target_platform =~ linux.* ]]; then
  # The vendored TBB library adds compile-time flags based on a probe of gcc,
  # this little "hack" ensures that the `gcc` executable is available when
  # TBB is built.
  mkdir $PWD/hack
  export PATH="$PWD/hack:$PATH"
  ln -s $CC $PWD/hack/gcc
  chmod +x $PWD/hack/gcc
fi
from uwot.
So the takeaway is that if you're running R under conda, you should be installing RcppParallel via conda as well? Not the most intuitive outcome, but tolerable. Possibly another thing to throw into the README; maybe it's worth having an entire section on "Known problems" along with the .Rprofile issue.
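As a rough aid for spotting this situation, the sketch below guesses whether the running R sits inside a conda prefix and reports where a given package is installed. `describe_install` is a hypothetical helper, and the layout it checks (`<prefix>/lib/R` next to `<prefix>/conda-meta`) is an assumption taken from the anaconda3 path in the valgrind output above; other platforms may lay things out differently.

```r
# Hypothetical diagnostic (not part of any package): guess whether the running
# R lives inside a conda prefix and report where a package is installed.
describe_install <- function(pkg) {
  r_home <- normalizePath(R.home(), winslash = "/", mustWork = FALSE)
  # Assumed layout: <prefix>/lib/R, with <prefix>/conda-meta marking a conda env.
  prefix <- dirname(dirname(r_home))
  # find.package() raises an error for a missing package; report NA instead.
  pkg_path <- tryCatch(find.package(pkg), error = function(e) NA_character_)
  list(
    r_home = r_home,
    r_looks_like_conda = dir.exists(file.path(prefix, "conda-meta")),
    package_path = pkg_path,
    library_paths = .libPaths()
  )
}

str(describe_install("RcppParallel"))
```

This cannot tell how the package was built: a CRAN source install into a conda R's library ends up in the same directory as a conda-forge build would. It only flags the combination (conda R plus RcppParallel) that seemed to cause trouble in this thread.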
from uwot.
Yes, I was hoping to work out if there is a lesson learned here. I don't want to mislead anyone. I don't have any experience using conda for R packages, just Python, and I have no knowledge of bioconda. Is it safe in general to mix CRAN and conda packages or is that always ill-advised?
from uwot.
@aldojongejan, you mentioned that you reinstalled RcppParallel from CRAN to fix the issue. Do you know if you had previously installed from conda? Or if you had a mix of conda-installed and CRAN-installed packages?
from uwot.
I worked with the development version of R to get the latest version of Seurat and SingleR working. I guess that also installed RcppParallel. Then, as a possible fix for my problem, I installed via conda as suggested here (cole-trapnell-lab/monocle3#186). That didn't solve it for me; it only worked when I later reinstalled RcppParallel from CRAN again (removing the RcppParallel directory from the 'library' folder etc.)... I should have paid more attention and documented the exact steps...
from uwot.
@aldojongejan, no problem, thank you for opening the issue here in the first place.
from uwot.
Ha ha, opening the issue was not a real problem ;-)
I am sorry I couldn't be of more help, and I really appreciate all the work you guys put into helping me out and solving the problem!
from uwot.
uwot 0.1.8 removes the dependency on RcppParallel so hopefully these problems are gone.
from uwot.
Hopefully this is solved. Closing.
from uwot.