
rcppannoy's Introduction

RcppAnnoy: Rcpp bindings for Annoy


What is Annoy?

Annoy is a small, fast and lightweight library for Approximate Nearest Neighbours with a particular focus on efficient memory use and the ability to load a pre-saved index.

Annoy is written by Erik Bernhardsson. See its page for more on features, its (Python) API, and the other language ports. Annoy is part of the esteemed "let us find other music you may like" algorithm by Spotify.

Why this package?

It provides a nice example of Rcpp Modules and the use of templates: Annoy uses a clean C++ core with a templated data type, as well as several distance measures. This package shows that it is easy to wrap both aspects from R, giving us multi-lingual approaches to data discovery and machine learning.
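
As a rough illustration of that design, the sketch below shows how a C++ class can be templated on both the data type and the distance measure. It is not Annoy's actual code: `ToyIndex`, `EuclideanSketch` and `ManhattanSketch` are hypothetical names made up for this example, and the real library adds tree building, search and file-backed storage on top.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Distance policies: each exposes a static distance() over raw arrays.
struct EuclideanSketch {
    template <typename T>
    static T distance(const T* x, const T* y, std::size_t n) {
        T sum = 0;
        for (std::size_t i = 0; i < n; ++i) {
            const T d = x[i] - y[i];
            sum += d * d;
        }
        return std::sqrt(sum);
    }
};

struct ManhattanSketch {
    template <typename T>
    static T distance(const T* x, const T* y, std::size_t n) {
        T sum = 0;
        for (std::size_t i = 0; i < n; ++i) sum += std::abs(x[i] - y[i]);
        return sum;
    }
};

// Toy index templated on item id type S, data type T and distance policy D.
// Items are simply stored in insertion order; there is no tree or search here.
template <typename S, typename T, typename D>
class ToyIndex {
public:
    explicit ToyIndex(std::size_t dim) : dim_(dim) {}

    void add_item(const std::vector<T>& v) {
        assert(v.size() == dim_);  // every item must have the declared dimension
        items_.push_back(v);
    }

    T get_distance(S i, S j) const {
        return D::distance(items_[static_cast<std::size_t>(i)].data(),
                           items_[static_cast<std::size_t>(j)].data(), dim_);
    }

private:
    std::size_t dim_;
    std::vector<std::vector<T>> items_;
};
```

Switching the distance measure is then a matter of changing one template argument, for example `ToyIndex<int, float, ManhattanSketch>` instead of `ToyIndex<int, float, EuclideanSketch>`.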

Status

The package matches the behaviour of the original Python wrapper for the Annoy library. It also replicates all unit tests written for the Python frontend, including a test for efficiently mmap-ing a binary index file.

The package originally built on Linux and OS X, and thanks to a patch by Qiang Kou now also builds on Windows.

Installation

You can either install from source via this repo, or install the CRAN package the usual way from R.

Author

Dirk Eddelbuettel

License

GPL (>= 2)

rcppannoy's People

Contributors

adamspannbauer, dcdillon, eddelbuettel, jlmelville, ltla, mikepb, petehaitch

rcppannoy's Issues

Warning during compilation with Ropen

Thanks for this useful package. I noticed a warning, but I don't know if it's important or not. Let me know, if you have time.

* installing *source* package 'RcppAnnoy' ...
** libs
c:/Rtools/mingw_64/bin/g++ -m64 -std=gnu++11 -I"C:/PROGRA~1/MICROS~4/ROPEN~1/R-35~1.1/include" -DNDEBUG -I../inst/include/ -I"C:/Users/sampgg/Documents/R/win-library/3.5/Rcpp/include"   -I"C:/swarm/workspace/External-R-3.5.1/vendor/extsoft/include"     -O2 -Wall  -mtune=core2 -c annoy.cpp -o annoy.o
In file included from ../inst/include/annoylib.h:42:0,
                 from annoy.cpp:39:
../inst/include/mman.h: In function 'void* mmap(void*, size_t, int, int, int, off_t)':
../inst/include/mman.h:102:48: warning: right shift count >= width of type
                     (DWORD)0 : (DWORD)((off >> 32) & 0xFFFFFFFFL);
                                                ^
../inst/include/mman.h:111:52: warning: right shift count >= width of type
                     (DWORD)0 : (DWORD)((maxSize >> 32) & 0xFFFFFFFFL);
                                                    ^

My current R installation is Microsoft R Open (Ropen), as shown here:

R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RcppAnnoy_0.0.11     Rcpp_1.0.0           DT_0.5               KernSmooth_2.23-15  
[5] shinyTree_0.2.6      shiny_1.2.0          flowCore_1.48.1      RevoUtils_11.0.1    
[9] RevoUtilsMath_11.0.0

loaded via a namespace (and not attached):
 [1] mvtnorm_1.0-8       lattice_0.20-38     corpcor_1.6.9       prettyunits_1.0.2  
 [5] ps_1.2.1            assertthat_0.2.0    rprojroot_1.3-2     digest_0.6.18      
 [9] mime_0.6            R6_2.3.0            backports_1.1.2     stats4_3.5.1       
[13] pcaPP_1.9-73        rlang_0.3.0.1       curl_3.2            rstudioapi_0.8     
[17] callr_3.0.0         desc_1.2.0          devtools_2.0.1      Rtsne_0.15         
[21] stringr_1.3.1       htmlwidgets_1.3     compiler_3.5.1      httpuv_1.4.5       
[25] BiocGenerics_0.28.0 base64enc_0.1-3     pkgbuild_1.0.2      htmltools_0.3.6    
[29] RANN_2.6            codetools_0.2-15    matrixStats_0.54.0  rrcov_1.4-7        
[33] crayon_1.3.4        withr_2.1.2         later_0.7.5         MASS_7.3-50        
[37] grid_3.5.1          jsonlite_1.5        xtable_1.8-3        magrittr_1.5       
[41] graph_1.60.0        cli_1.0.1           stringi_1.2.4       fs_1.2.6           
[45] promises_1.0.1      remotes_2.0.2       testthat_2.0.1      robustbase_0.93-3  
[49] RColorBrewer_1.1-2  tools_3.5.1         Cairo_1.5-9         Biobase_2.42.0     
[53] glue_1.3.0          DEoptimR_1.0-8      crosstalk_1.0.0     processx_3.2.0     
[57] pkgload_1.0.2       parallel_3.5.1      yaml_2.2.0          cluster_2.0.7-1    
[61] BiocManager_1.30.4  sessioninfo_1.1.1   memoise_1.1.0       usethis_1.4.0      
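
The warning above says that an operand is shifted right by 32 bits although its type is no wider than 32 bits, which suggests that `off_t` is a 32-bit type on this toolchain. The visible `(DWORD)0 :` fragment hints that the shift sits in the unused branch of a conditional, in which case the warning would be harmless, but that is an assumption based on the two lines shown. One way to avoid this kind of warning altogether is to widen the value to 64 bits before shifting, as in the following sketch; `split_offset` is a hypothetical helper, not part of `mman.h`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: returns {high, low} 32-bit halves of a file offset.
// Widening to 64 bits first keeps the shift count below the operand width,
// even when OffsetT itself is only 32 bits wide.
template <typename OffsetT>
std::pair<std::uint32_t, std::uint32_t> split_offset(OffsetT off) {
    assert(off >= 0);  // file offsets are assumed to be non-negative
    const std::uint64_t wide = static_cast<std::uint64_t>(off);
    const std::uint32_t high = static_cast<std::uint32_t>(wide >> 32);
    const std::uint32_t low  = static_cast<std::uint32_t>(wide & 0xFFFFFFFFu);
    return {high, low};
}
```

With a 32-bit offset type the high half is always zero, which matches the `(DWORD)0` branch visible in the warning.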

Run reverse depends checks

The most recent GH version switches to using a namespace and this is likely going to upset packages compiling against the exported header.

So we should try scDHA and uwot.

New / remaining sanitizer issues

Per email from Uwe:

Brian found clang-SAN  still gives

/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/internal/caster.h:30:25: 
runtime error: -1 is outside the range of representable values of type 
'unsigned long'
     #0 0x7f33c92f9c53 in unsigned long Rcpp::internal::caster<double, 
unsigned long>(double) 
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/internal/caster.h:30:25
     #1 0x7f33c92f9c53 in unsigned long 
Rcpp::internal::primitive_as<unsigned long>(SEXPREC*) 
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/as.h:39:21
     #2 0x7f33c9331a47 in unsigned long Rcpp::internal::as<unsigned 
long>(SEXPREC*, Rcpp::traits::r_type_primitive_tag) 
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/as.h:44:20
     #3 0x7f33c9331a47 in unsigned long Rcpp::as<unsigned 
long>(SEXPREC*) 
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/as.h:152:16
     #4 0x7f33c9331a47 in Rcpp::InputParameter<unsigned long>::operator 
unsigned long() 
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/InputParameter.h:34:38
     #5 0x7f33c9331a47 in Rcpp::CppMethod4<Annoy<int, float, Euclidean, 
Kiss64Random>, Rcpp::Vector<19, Rcpp::PreserveStorage>, int, unsigned 
long, unsigned long, bool>::operator()(Annoy<int, float, Euclidean, 
Kiss64Random>*, SEXPREC**) 
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/module/Module_generated_CppMethod.h:375:78
     #6 0x7f33c930c52a in Rcpp::class_<Annoy<int, float, Euclidean, 
Kiss64Random> >::invoke_notvoid(SEXPREC*, SEXPREC*, SEXPREC**, int) 
/data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/module/class.h:234:23
     #7 0x7f33c95b8489 in CppMethod__invoke_notvoid(SEXPREC*) 
/tmp/RtmpjzdNeY/R.INSTALL8afa7606c1f3/Rcpp/src/module.cpp:220:19
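
The trace points at a `double` to `unsigned long` conversion inside `Rcpp::as<unsigned long>`: a value of -1 arrives from R and cannot be represented in an unsigned type, which is undefined behaviour in C++ and is what the sanitizer flags. A possible remedy, sketched below under the assumption that the caller wants an error rather than a silent wrap-around, is to accept the argument as a `double` and validate it before converting. `checked_to_size` is a hypothetical helper and not part of Rcpp or RcppAnnoy.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: convert a double (the type R uses for numerics) to
// size_t, rejecting NaN, negative values and values too large for size_t.
inline std::size_t checked_to_size(double x) {
    // 2^N, where N is the bit width of size_t; exactly representable as a double.
    const double upper =
        std::ldexp(1.0, std::numeric_limits<std::size_t>::digits);
    if (std::isnan(x) || x < 0.0 || x >= upper) {
        throw std::range_error("value not representable as size_t");
    }
    return static_cast<std::size_t>(x);  // truncates any fractional part
}
```

If -1 is meant as a "use the default" sentinel, the wrapper could test for it on the `double` before calling such a helper, so that the unsigned conversion only ever sees valid values.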

Segfault adding

I ran into this segfault today after upgrading Rcpp and rebuilding RcppAnnoy. It was working yesterday, with the older Rcpp, but I'm not sure if that's the underlying issue. This seems like one of those "was working yesterday" bugs...

> str(knn_dt)
Classes ‘data.table’ and 'data.frame':  213451 obs. of  1050 variables:
 $ gender                                                                            : num  NaN NaN NaN 2 NaN 2 2 NaN 1 1 ...
 $ signup_method                                                                     : num  1 1 1 1 1 1 1 1 1 1 ...
 $ signup_flow                                                                       : num  1 1 1 1 17 1 1 16 3 4 ...
 $ language                                                                          : num  6 6 6 6 6 6 6 6 6 6 ...
 $ affiliate_channel                                                                 : num  3 3 3 3 3 3 3 3 7 3 ...
 $ affiliate_provider                                                                : num  5 5 5 5 5 5 5 5 9 5 ...
 $ first_affiliate_tracked                                                           : num  4 7 7 7 7 7 7 7 7 7 ...
 $ signup_app                                                                        : num  4 4 4 4 3 4 4 1 4 4 ...
 $ first_device_type                                                                 : num  6 4 9 6 9 9 6 7 6 6 ...
 $ first_browser                                                                     : num  43 30 17 43 8 17 17 NaN 17 8 ...
 $ age                                                                               : num  31 NaN NaN 40 NaN 38 41 NaN 34 28 ...
 $ age_ul                                                                            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ age_ll                                                                            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ date_account_created.wday                                                         : num  3 0 5 2 1 3 1 4 2 5 ...
 $ date_account_created.mday                                                         : num  14 4 27 3 8 27 8 24 27 21 ...
 $ date_account_created.yday                                                         : num  133 215 208 336 188 57 188 113 360 264 ...
 $ date_account_created.mon                                                          : num  4 7 6 11 6 1 6 3 11 8 ...
 $ date_account_created.yweek                                                        : num  2 4 4 6 3 1 3 2 6 5 ...
 $ date_account_created.year                                                         : num  2014 2013 2012 2013 2013 ...
 $ date_account_created.wday0                                                        : num  0 1 0 0 0 0 0 0 0 0 ...
 $ date_account_created.wday1                                                        : num  0 0 0 0 1 0 1 0 0 0 ...
 $ date_account_created.wday2                                                        : num  0 0 0 1 0 0 0 0 1 0 ...
 $ date_account_created.wday3                                                        : num  1 0 0 0 0 1 0 0 0 0 ...
 $ date_account_created.wday4                                                        : num  0 0 0 0 0 0 0 1 0 0 ...
 $ date_account_created.wday5                                                        : num  0 0 1 0 0 0 0 0 0 1 ...
 $ date_account_created.wday6                                                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ timestamp_first_active.wday                                                       : num  3 0 5 2 1 3 1 4 2 5 ...
 $ timestamp_first_active.mday                                                       : num  14 4 27 3 8 27 8 24 27 21 ...
 $ timestamp_first_active.yday                                                       : num  133 215 208 336 188 57 188 113 360 264 ...
 $ timestamp_first_active.mon                                                        : num  4 7 6 11 6 1 6 3 11 8 ...
 $ timestamp_first_active.hour                                                       : num  19 22 9 13 17 19 16 17 1 5 ...
 $ timestamp_first_active.yweek                                                      : num  2 4 4 6 3 1 3 2 6 5 ...
 $ timestamp_first_active.year                                                       : num  2014 2013 2012 2013 2013 ...
 $ timestamp_first_active.wday0                                                      : num  0 1 0 0 0 0 0 0 0 0 ...
 $ timestamp_first_active.wday1                                                      : num  0 0 0 0 1 0 1 0 0 0 ...
 $ timestamp_first_active.wday2                                                      : num  0 0 0 1 0 0 0 0 1 0 ...
 $ timestamp_first_active.wday3                                                      : num  1 0 0 0 0 1 0 0 0 0 ...
 $ timestamp_first_active.wday4                                                      : num  0 0 0 0 0 0 0 1 0 0 ...
 $ timestamp_first_active.wday5                                                      : num  0 0 1 0 0 0 0 0 0 1 ...
 $ timestamp_first_active.wday6                                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Android_App_Unknown_Phone_Tablet                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Android_Phone                                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Blackberry                                                         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Chromebook                                                         : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Linux_Desktop                                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Mac_Desktop                                                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Opera_Phone                                                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Tablet                                                             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Windows_Desktop                                                    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.Windows_Phone                                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent._unknown_                                                          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.iPad_Tablet                                                        : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.iPhone                                                             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ device_percent.iPodtouch                                                          : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Android_App_Unknown_Phone_Tablet           : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Android_Phone                              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Blackberry                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Chromebook                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Linux_Desktop                              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Mac_Desktop                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Opera_Phone                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Tablet                                     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Windows_Desktop                            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.Windows_Phone                              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time._unknown_                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.iPad_Tablet                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.iPhone                                     : num  0 0 0 0 0 0 0 0 0 0 ...
 $ count.secs_elapsed.Compose.device_time.iPodtouch                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Android_App_Unknown_Phone_Tablet             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Android_Phone                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Blackberry                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Chromebook                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Linux_Desktop                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Mac_Desktop                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Opera_Phone                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Tablet                                       : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Windows_Desktop                              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.Windows_Phone                                : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time._unknown_                                    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.iPad_Tablet                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.iPhone                                       : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sum.secs_elapsed.Compose.device_time.iPodtouch                                    : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Android_App_Unknown_Phone_Tablet            : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Android_Phone                               : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Blackberry                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Chromebook                                  : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Linux_Desktop                               : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Mac_Desktop                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Opera_Phone                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Tablet                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Windows_Desktop                             : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.Windows_Phone                               : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time._unknown_                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.iPad_Tablet                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.iPhone                                      : num  0 0 0 0 0 0 0 0 0 0 ...
 $ mean.secs_elapsed.Compose.device_time.iPodtouch                                   : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sd.secs_elapsed.Compose.device_time.Android_App_Unknown_Phone_Tablet              : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sd.secs_elapsed.Compose.device_time.Android_Phone                                 : num  0 0 0 0 0 0 0 0 0 0 ...
 $ sd.secs_elapsed.Compose.device_time.Blackberry                                    : num  0 0 0 0 0 0 0 0 0 0 ...
  [list output truncated]
 - attr(*, ".internal.selfref")=<externalptr> 
>   apply(knn_dt[1,], 1, function(row) {
+     knn$addItem(as.integer(row[[na_col]]), row[names(row) != na_col])
+   })

 *** caught segfault ***
address 0xfffff7c600000008, cause 'memory not mapped'

Traceback:
 1: .External(list(name = "CppMethod__invoke_void", address = <pointer: 0x7f92f6a37840>,     dll = list(name = "Rcpp", path = "/usr/local/lib/R/3.2/site-library/Rcpp/libs/Rcpp.so",         dynamicLookup = TRUE, handle = <pointer: 0x7f92f36f9360>,         info = <pointer: 0x107138670>), numParameters = -1L),     <pointer: 0x7f92f6e39f00>, <pointer: 0x7f92f3495f10>, .pointer,     ...)
 2: knn$addItem(as.integer(row[[na_col]]), row[names(row) != na_col])
 3: FUN(newX[, i], ...)
 4: apply(knn_dt[1, ], 1, function(row) {    knn$addItem(as.integer(row[[na_col]]), row[names(row) !=         na_col])})

R CMD build issue on Windows with R 4.2

I am having problems building the latest checkout of master on Windows with R 4.2.1:

E:\dev\R>"C:\Program Files\R\R-4.2.1\bin\R.exe" CMD build rcppannoy-gh
* checking for file 'rcppannoy-gh/DESCRIPTION' ... OK
* preparing 'RcppAnnoy':
* checking DESCRIPTION meta-information ... OK
* cleaning src
* installing the package to build vignettes
      -----------------------------------
* installing *source* package 'RcppAnnoy' ...
** using staged installation
** libs
g++  -std=gnu++11 -I"C:/PROGRA~1/R/R-42~1.1/include" -DNDEBUG -I../inst/include/  -I'E:/dev/R/win-library/4.0/Rcpp/include'   -I"C:/rtools42/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign  -c RcppExports.cpp -o RcppExports.o
g++  -std=gnu++11 -I"C:/PROGRA~1/R/R-42~1.1/include" -DNDEBUG -I../inst/include/  -I'E:/dev/R/win-library/4.0/Rcpp/include'   -I"C:/rtools42/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign  -c annoy.cpp -o annoy.o
g++  -std=gnu++11 -I"C:/PROGRA~1/R/R-42~1.1/include" -DNDEBUG -I../inst/include/  -I'E:/dev/R/win-library/4.0/Rcpp/include'   -I"C:/rtools42/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign  -c arch.cpp -o arch.o
gcc  -I"C:/PROGRA~1/R/R-42~1.1/include" -DNDEBUG -I../inst/include/  -I'E:/dev/R/win-library/4.0/Rcpp/include'   -I"C:/rtools42/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall  -std=gnu99 -mfpmath=sse -msse2 -mstackrealign  -c init.c -o init.o
g++  -std=gnu++11 -I"C:/PROGRA~1/R/R-42~1.1/include" -DNDEBUG -I../inst/include/  -I'E:/dev/R/win-library/4.0/Rcpp/include'   -I"C:/rtools42/x86_64-w64-mingw32.static.posix/include"     -O2 -Wall  -mfpmath=sse -msse2 -mstackrealign  -c version.cpp -o version.o
g++ -shared -s -static-libgcc -o RcppAnnoy.dll tmp.def RcppExports.o annoy.o arch.o init.o version.o -LC:/rtools42/x86_64-w64-mingw32.static.posix/lib/x64 -LC:/rtools42/x86_64-w64-mingw32.static.posix/lib -LC:/PROGRA~1/R/R-42~1.1/bin/x64 -lR
installing to C:/Users/jlmel/AppData/Local/Temp/RtmpOOV4HF/Rinst720470b2363b/00LOCK-RcppAnnoy/00new/RcppAnnoy/libs/x64
** R
** demo
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
ERROR: loading failed
* removing 'C:/Users/jlmel/AppData/Local/Temp/RtmpOOV4HF/Rinst720470b2363b/RcppAnnoy'
      -----------------------------------
ERROR: package installation failed

I get the same error with the latest R-devel. Using R-4.1.3 works. I have tried reinstalling rtools42 which had no effect.

I would chalk this up to something misconfigured on my local machine, except that I have forked this repo and added a GitHub action to build against the latest and previous Windows builds, and I get the same result: windows-latest (release) fails because RcppAnnoy does not build, while windows-latest (oldrel) succeeds.

In addition, my package RcppHNSW (which is modeled closely off RcppAnnoy) also shows the same problem, which suggests there is something going on here.

I can build other projects that use Rcpp without issues. So far RcppHNSW and RcppAnnoy are the only ones I have found that fail. One difference I have noted is that they both wrap C++ classes using RCPP_EXPOSED_CLASS_NODECL, RCPP_MODULE and Rcpp::class_, whereas the other projects I have built do not.

I am not quite sure where to start on this (maybe try to reproduce the issue by wrapping a minimal C++ class?). Any pointers or tips would be helpful to me.

Demo program crashes on windows

I'm using Windows 8, R 3.4.0, and RcppAnnoy 0.0.8. I copied the code from demo/SimpleExample.R and changed the path for "test.tree" to a bare filename (to avoid any path separator issues). The code runs fine until the very last line: "print(b$getNNsByItem(0, 40))". That line never returns; instead, Windows displays a crash report for R:

Problem signature:
Problem Event Name: APPCRASH
Application Name: rsession.exe
Application Version: 1.0.143.0
Application Timestamp: 58efc68f
Fault Module Name: RcppAnnoy.dll
Fault Module Version: 0.0.0.0
Fault Module Timestamp: 5921a5a5
Exception Code: c0000005
Exception Offset: 0000000000023e88
OS Version: 6.2.9200.2.0.0.256.48
Locale ID: 1033
Additional Information 1: 95c0
Additional Information 2: 95c0dadcc2e0790a3d6a0db9476ff7f7
Additional Information 3: c127
Additional Information 4: c127ededbb4a08cca6b24a2b1617bfb1

I get the same behavior whether I run it from the command line or from RStudio. The problem also occurred with R 3.3.0.

Not able to save a model

I am on Windows, and I noticed that when I train a model on more than 10K objects, I can't save it. I have not tried this on Linux.

After training, the model seems to work correctly ("seems": see my other posted issue), but once the call to save has finished, I can't query anything. I then unload it, restart R, recreate an object of the same size, and reload the saved model. Any query then gives this error message:

> a$getNNsByItem(18841, 5)
Error: vector::_M_range_insert

Any idea?

Kind regards,
Michael

Complete the documentation

There is an important parameter that is never documented: the number passed to the build function. The Annoy repository only says that it should be related to our data. What does the parameter mean? Is more better? Is there a rule to follow? Is it a problem to set it too high or too low?

Kind regards,
Michael

Error during compilation of RcppAnnoy

devtools::install_github('eddelbuettel/rcppannoy')
Downloading GitHub repo eddelbuettel/rcppannoy@master
v  checking for file '/tmp/RtmpOXmQ0c/remotesfe32bfac6b5/eddelbuettel-rcppannoy-faaba96/DESCRIPTION' ...
-  preparing 'RcppAnnoy':
v  checking DESCRIPTION meta-information ...
-  cleaning src
-  running 'cleanup'
-  checking for LF line-endings in source and make files and shell scripts
-  checking for empty or unneeded directories
-  building 'RcppAnnoy_0.0.13.tar.gz'
   Warning: file 'RcppAnnoy/cleanup' did not have execute permissions: corrected
   Warning: invalid uid value replaced by that for user 'nobody'
   Warning: invalid gid value replaced by that for user 'nobody'

Installing package into '/ddn1/vol1/staging/leuven/stg_00002/lcb/asundar/R/3.1.0-foss-2018a-R-3.6.1-X11-20180604'
(as 'lib' is unspecified)
* installing *source* package 'RcppAnnoy' ...
** using staged installation
** libs
g++ -std=gnu++11 -I"/apps/leuven/skylake/2018a/software/R/3.6.0-foss-2018a-bare/lib64/R/include" -DNDEBUG -I../inst/include/ -I"/ddn1/vol1/staging/leuven/stg_00002/lcb/asundar/R/3.1.0-foss-2018a-R-3.6.1-X11-20180604/Rcpp/include" -I/apps/leuven/skylake/2018a/software/OpenBLAS/0.2.20-GCC-6.4.0-2.28/include -I/apps/leuven/skylake/2018a/software/FFTW/3.3.7-gompi-2018a/include -I/apps/leuven/skylake/2018a/software/libreadline/7.0-GCCcore-6.4.0/include -I/apps/leuven/skylake/2018a/software/ncurses/6.0-GCCcore-6.4.0/include -I/apps/leuven/skylake/2018a/software/libpng/1.6.34-GCCcore-6.4.0/include -I/apps/leuven/skylake/2018a/software/libjpeg-turbo/1.5.3-GCCcore-6.4.0/include -I/apps/leuven/skylake/2018a/software/Java/1.8.0_162/include -I/apps/leuven/skylake/2018a/software/ScaLAPACK/2.0.2-gompi-2018a-OpenBLAS-0.2.20/include -I/apps/leuven/skylake/2018a/software/bzip2/1.0.6-GCCcore-6.4.0/include -I/apps/leuven/skylake/2018a/software/cURL/7.58.0-GCCcore-6.4.0/include  -fpic  -O2 -ftree-vectorize -march=native -fno-math-errno  -c annoy.cpp -o annoy.o
In file included from annoy.cpp:39:0:
../inst/include/annoylib.h: In function 'T {anonymous}::dot(const T*, const T*, int) [with T = float]':
../inst/include/annoylib.h:257:37: error: '_mm512_reduce_add_ps' was not declared in this scope
     result += _mm512_reduce_add_ps(d);
                                     ^
../inst/include/annoylib.h: In function 'T {anonymous}::manhattan_distance(const T*, const T*, int) [with T = float]':
../inst/include/annoylib.h:276:67: error: '_mm512_abs_ps' was not declared in this scope
       manhattan = _mm512_add_ps(manhattan, _mm512_abs_ps(x_minus_y));
                                                                   ^
../inst/include/annoylib.h:281:44: error: '_mm512_reduce_add_ps' was not declared in this scope
     result = _mm512_reduce_add_ps(manhattan);
                                            ^
../inst/include/annoylib.h: In function 'T {anonymous}::euclidean_distance(const T*, const T*, int) [with T = float]':
../inst/include/annoylib.h:304:36: error: '_mm512_reduce_add_ps' was not declared in this scope
     result = _mm512_reduce_add_ps(d);
                                    ^
make: *** [annoy.o] Error 1
ERROR: compilation failed for package 'RcppAnnoy'
* removing '/ddn1/vol1/staging/leuven/stg_00002/lcb/asundar/R/3.1.0-foss-2018a-R-3.6.1-X11-20180604/RcppAnnoy'
Error: Failed to install 'RcppAnnoy' from GitHub:
  (converted from warning) installation of package '/tmp/RtmpOXmQ0c/filefe3616bd57b/RcppAnnoy_0.0.13.tar.gz' had non-zero exit status

I am trying to install this into a local library on an HPC cluster. Any idea what the problem is?
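
The errors suggest that the compiler enabled the AVX-512 code path (the build uses `-march=native`, and the paths mention Skylake and GCC 6.4.0) while its intrinsics headers do not declare `_mm512_reduce_add_ps` or `_mm512_abs_ps`. A common way to handle this is to guard the vectorised path on the compiler version and always keep a scalar fallback. The sketch below illustrates the idea; the function name `dot_sketch`, the macro names `SKETCH_NO_MANUAL_VECTORIZATION` and `SKETCH_USE_AVX512`, and the GCC 7 cut-off are assumptions for this example, not a description of how `annoylib.h` resolves the issue.

```cpp
#include <cstddef>

// Enable the AVX-512 path only when the target supports it, the user has not
// opted out, and the compiler is a GCC recent enough (assumed: 7 or later)
// to declare the reduce intrinsics.
#if defined(__AVX512F__) && !defined(SKETCH_NO_MANUAL_VECTORIZATION) && \
    defined(__GNUC__) && (__GNUC__ > 6)
#include <immintrin.h>
#define SKETCH_USE_AVX512 1
#endif

// Dot product with an optional vectorised main loop and a scalar tail.
inline float dot_sketch(const float* x, const float* y, std::size_t n) {
    float result = 0.0f;
    std::size_t i = 0;
#ifdef SKETCH_USE_AVX512
    __m512 acc = _mm512_setzero_ps();
    for (; i + 16 <= n; i += 16) {
        acc = _mm512_fmadd_ps(_mm512_loadu_ps(x + i), _mm512_loadu_ps(y + i), acc);
    }
    result += _mm512_reduce_add_ps(acc);
#endif
    // Scalar loop: handles the remainder, or everything when AVX-512 is off.
    for (; i < n; ++i) result += x[i] * y[i];
    return result;
}
```

Defining the opt-out macro at compile time (for example `-DSKETCH_NO_MANUAL_VECTORIZATION`) forces the scalar loop regardless of the target flags.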

Vignette build on windows errors

Submitting to CRAN I got

More details are given in the directory:
https://win-builder.r-project.org/incoming_pretest/RcppAnnoy_0.0.11_20181029_133746/
The files will be removed after roughly 7 days.

We should either turn vignette builds off, or at least set eval=FALSE on Windows. Sadly I was unable to get the same error on win-builder :-/

/cc @LTLA

Expose `set_seed` method

The underlying C++ classes (and the Python wrapper) expose a set_seed method which seeds the internal KissRandom rng. Could it be added to RcppAnnoy?

I have an issue where if an Annoy index gets written to disk and is larger than 2GB in size, Annoy is unable to read it back in (at least on my machine, the off_t type used internally by Annoy is 32-bit). A possible work-around would be to construct multiple smaller indexes, and then merge the results. But this only works if the seed can be altered for each build.

I would be happy to work on a PR for this if there is a chance of it being accepted.
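
The work-around described above (several smaller indexes whose query results are merged) could look roughly like the sketch below on the merging side. It assumes each sub-index has already returned its candidates as (distance, item id) pairs with ids that are unique across sub-indexes; `Neighbour` and `merge_top_k` are hypothetical names, and nothing here is part of the Annoy or RcppAnnoy API.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// (distance, global item id) pair as returned by one sub-index query.
using Neighbour = std::pair<float, int>;

// Hypothetical helper: pool the per-index candidate lists and keep the k
// closest overall, sorted by increasing distance.
inline std::vector<Neighbour> merge_top_k(
        const std::vector<std::vector<Neighbour>>& per_index, std::size_t k) {
    std::vector<Neighbour> all;
    for (const auto& candidates : per_index) {
        all.insert(all.end(), candidates.begin(), candidates.end());
    }
    const std::size_t keep = std::min(k, all.size());
    // Pairs compare by distance first, then by id, so ties break deterministically.
    std::partial_sort(all.begin(),
                      all.begin() + static_cast<std::ptrdiff_t>(keep),
                      all.end());
    all.resize(keep);
    return all;
}
```

Each sub-index would need to be queried for at least k candidates for the merged list to be as good as the individual indexes allow.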

0.0.16 build from source issues on macOS

Hi @eddelbuettel ,

I'm now seeing some macOS-specific compilation issues with RcppAnnoy 0.0.16, on either Mojave (10.14.6) or Catalina (10.15.3). For reference, I have no issues installing RcppAnnoy on Debian or Fedora Linux.

I have my Mac set up with the current recommended compiler settings from CRAN, using clang 7 and gfortran 6.1.

For reference, here's my current R system configuration, including Makevars.site:
https://github.com/acidgenomics/koopa/tree/master/os/darwin/etc/R

In case anybody else comes across this issue, note that macOS currently requires the Xcode command line tools to build from source:

# Full Xcode isn't required, just the command line tools (CLT).

# > xcode-select --help
sudo xcode-select --reset
sudo xcode-select --install

# Software update may be required.
# > sudo softwareupdate --clear-catalog
# > sudo softwareupdate -i -a --restart

xcode-select --print-path
# /Library/Developer/CommandLineTools

Based on this config, here's what I'm seeing:

> install.packages("RcppAnnoy", type = "source")
Installing package into ‘/Users/mike/Library/R/3.6/library’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/src/contrib/RcppAnnoy_0.0.16.tar.gz'
Content type 'application/x-gzip' length 441533 bytes (431 KB)
==================================================
downloaded 431 KB

* installing *source* package ‘RcppAnnoy’ ...
** package ‘RcppAnnoy’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
/usr/local/clang7/bin/clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I"/Users/mike/Library/R/3.6/library/Rcpp/include" -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include  -fPIC  -Wall -g -O2  -c RcppExports.cpp -o RcppExports.o
/usr/local/clang7/bin/clang++ -std=gnu++11 -I"/Library/Frameworks/R.framework/Resources/include" -DNDEBUG -I../inst/include/ -I"/Users/mike/Library/R/3.6/library/Rcpp/include" -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk -I/usr/local/include  -fPIC  -Wall -g -O2  -c annoy.cpp -o annoy.o
In file included from annoy.cpp:39:
In file included from ../inst/include/annoylib.h:22:
In file included from /Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:658:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/gethostuuid.h:39:17: error: C++ requires a type
      specifier for all declarations
int gethostuuid(uuid_t, const struct timespec *) __OSX_AVAILABLE_STARTING(__MAC_10_5, __IPHONE_NA);
                ^
In file included from annoy.cpp:39:
In file included from ../inst/include/annoylib.h:22:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:665:27: error: unknown type name 'uuid_t';
      did you mean 'uid_t'?
int      getsgroups_np(int *, uuid_t);
                              ^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/sys/_types/_uid_t.h:31:31: note: 'uid_t' declared
      here
typedef __darwin_uid_t        uid_t;
                              ^
In file included from annoy.cpp:39:
In file included from ../inst/include/annoylib.h:22:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:667:27: error: unknown type name 'uuid_t';
      did you mean 'uid_t'?
int      getwgroups_np(int *, uuid_t);
                              ^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/sys/_types/_uid_t.h:31:31: note: 'uid_t' declared
      here
typedef __darwin_uid_t        uid_t;
                              ^
In file included from annoy.cpp:39:
In file included from ../inst/include/annoylib.h:22:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:730:31: error: unknown type name 'uuid_t';
      did you mean 'uid_t'?
int      setsgroups_np(int, const uuid_t);
                                  ^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/sys/_types/_uid_t.h:31:31: note: 'uid_t' declared
      here
typedef __darwin_uid_t        uid_t;
                              ^
In file included from annoy.cpp:39:
In file included from ../inst/include/annoylib.h:22:
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/unistd.h:732:31: error: unknown type name 'uuid_t';
      did you mean 'uid_t'?
int      setwgroups_np(int, const uuid_t);
                                  ^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/sys/_types/_uid_t.h:31:31: note: 'uid_t' declared
      here
typedef __darwin_uid_t        uid_t;
                              ^
5 errors generated.
make: *** [annoy.o] Error 1
ERROR: compilation failed for package ‘RcppAnnoy’
* removing ‘/Users/mike/Library/R/3.6/library/RcppAnnoy’
* restoring previous ‘/Users/mike/Library/R/3.6/library/RcppAnnoy’
Warning in install.packages("RcppAnnoy", type = "source") :
  installation of package ‘RcppAnnoy’ had non-zero exit status

It seems like the Xcode CLT headers aren't configured for uuid_t (libuuid) correctly, and I'm not sure how to fix this on the Mac. Has this come up in discussion before? I'm wondering if I'm missing something in my config, particularly the Makevars. Otherwise, I can follow up on r-sig-mac.

Best,
Mike

More content for RcppAnnoy.h

Per discussion in #64 we may as well add both these, no?

Comments, @LTLA @jlmelville ? Useful? Useless? Other additions?

#ifdef ANNOYLIB_MULTITHREADED_BUILD
  typedef AnnoyIndexMultiThreadedBuildPolicy AnnoyIndexThreadedBuildPolicy;
#else
  typedef AnnoyIndexSingleThreadedBuildPolicy AnnoyIndexThreadedBuildPolicy;
#endif

typedef Annoy<int32_t, float,    Angular,   Kiss64Random, AnnoyIndexThreadedBuildPolicy> AnnoyAngular;
typedef Annoy<int32_t, float,    Euclidean, Kiss64Random, AnnoyIndexThreadedBuildPolicy> AnnoyEuclidean;
typedef Annoy<int32_t, float,    Manhattan, Kiss64Random, AnnoyIndexThreadedBuildPolicy> AnnoyManhattan;
typedef Annoy<int32_t, uint64_t, Hamming,   Kiss64Random, AnnoyIndexThreadedBuildPolicy> AnnoyHamming;

Fix vignette description of output distance vector

Note to self: the specification of the output distance vector in the vignette is wrong, it should be std::vector<ANNOYTYPE> rather than a vector of doubles. I wonder if there is a way to extract out the C++ blocks from the vignette and compile them in some way, to make sure the advice is correct?
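A small C++ sketch of the point about types, under assumptions: `ANNOYTYPE` is modelled here as a plain alias for `float`, and `widenDistances` is a hypothetical helper, not something the package provides. It shows that distances collected in a `std::vector<ANNOYTYPE>` need an explicit copy if a vector of doubles is wanted afterwards.

```cpp
#include <vector>

// Stand-in for the index's value type (assumed to be float here).
using ANNOYTYPE = float;

// Hypothetical helper: copy distances held in the index's own value type
// into a vector of doubles, widening each element.
std::vector<double> widenDistances(const std::vector<ANNOYTYPE>& distances) {
    return std::vector<double>(distances.begin(), distances.end());
}
```

The distance output is declared with the index's value type, and any conversion to double happens as a separate step afterwards.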

Building on Windows with mingw_64

I tried building this (immensely helpful and educational) package on Windows 10 with RStudio via devtools::load_all(".") and building with the g++ that comes with Rtools\mingw_64 fails. It doesn't realize it's on Windows, and wants to #include <sys/mman.h>.

Both the 32-bit and 64-bit versions of MinGW have a __MINGW32__ macro, so changing the check on line 37 of annoylib.h worked for me:

#if defined(_MSC_VER) || defined(__MINGW32__)
#ifdef _MSC_VER
#define NOMINMAX
#endif
#include "mman.h"
#include <windows.h>
#else
#include <sys/mman.h>
#endif
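To check which branch of that guard a given toolchain takes, a tiny probe like the following can help. This is only a sketch, and `mmanBranch` is a hypothetical name used for illustration.

```cpp
#include <string>

// Reports which branch of the include guard this compiler would select.
// Hypothetical diagnostic helper, not part of annoylib.h.
std::string mmanBranch() {
#if defined(_MSC_VER) || defined(__MINGW32__)
    return "windows";  // bundled "mman.h" plus <windows.h>
#else
    return "posix";    // system <sys/mman.h>
#endif
}
```

Compiled with the Rtools g++, the function should return "windows" if the __MINGW32__ macro is defined as described above.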

RcppAnnoy fails to compile

Hi,

I am a novice at Rcpp and Docker/Rocker, so bear with me for a second. I am working on pushing an image of a package to the cloud platform Seven Bridges, where my colleagues and I run data analysis. In order to do this, I am using Docker and following the steps documented on the Seven Bridges website: https://docs.sevenbridges.com/docs/install-and-run-samtools-sort

I am currently trying to install Seurat. I have an R container with devtools, but the installation terminates because RcppAnnoy is not able to compile. This is the error message I get:

Downloading GitHub repo eddelbuettel/rcppannoy@master
Your system is ready to build packages!
✔  checking for file ‘/tmp/RtmpFYOTSd/remotes167126338/eddelbuettel-rcppannoy-faaba96/DESCRIPTION’
─  preparing ‘RcppAnnoy’:
✔  checking DESCRIPTION meta-information
─  cleaning src
─  running ‘cleanup’
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘RcppAnnoy_0.0.13.tar.gz’

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
* installing *source* package ‘RcppAnnoy’ ...
** using staged installation
** libs
g++ -std=gnu++11 -I"/usr/share/R/include" -DNDEBUG -I../inst/include/ -I"/usr/local/lib/R/site-library/Rcpp/include"   -fpic  -g -O2 -fdebug-prefix-map=/build/r-base-J8poo0/r-base-3.6.1=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g  -c annoy.cpp -o annoy.o
g++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
make: *** [/usr/lib/R/etc/Makeconf:176: annoy.o] Error 1
ERROR: compilation failed for package ‘RcppAnnoy’
* removing ‘/usr/local/lib/R/site-library/RcppAnnoy’
Error: Failed to install 'RcppAnnoy' from GitHub:
  (converted from warning) installation of package ‘/tmp/RtmpFYOTSd/file15e45f97b/RcppAnnoy_0.0.13.tar.gz’ had non-zero exit status

Thanks, and let me know what I can do!

Tiffany

Update binding to support optional arguments with `getNNsBy*` methods, with code

I managed to update the C++ code to use a template and changed getNNsBy* to accept search_k and include_distances with defaults:
https://github.com/mikepb/rcppannoy

However, the changes break the API:

# Before:
# a <- new(AnnoyEuclidean, f)
# After:
a <- AnnoyIndex(f, "euclidean")

I've only been using R for months and I'm not very familiar with the R class system. Do you think there could be a way to merge the changes and maintain backwards compatibility?

Unable to Save

I've fit a model on a very large data set (330K+ items, each with 350 elements), and while the model works fine, I don't seem to be able to save it. I don't get an error message when I save; it just seems to keep running indefinitely. From what I can tell, the CPU is the bottleneck. Do you have any recommendations for overcoming this issue?

I'm running R version 3.6.2 on Windows 2012 Server with 16 available processors.

error installing rcppannoy on Mac 10.9.5, R 3.4.4

Hello,
I'm having a problem installing your package on Mac. Any suggestions are appreciated!
Thanks a lot,
Alik

session_info()
version R version 3.4.4 Patched (2018-03-19 r75161)
os OS X Mavericks 10.9.5
system x86_64, darwin13.4.0
ui RStudio
language (EN)
collate de_CH.UTF-8
ctype de_CH.UTF-8
tz America/Los_Angeles
date 2019-08-26

install_github("eddelbuettel/rcppannoy")
Downloading GitHub repo eddelbuettel/rcppannoy@master
✔ checking for file ‘/private/var/folders/99/3xs1_5854ws2bskhd577pf9c0000gp/T/RtmpMWUQ46/remotes5d4b8bc44a/eddelbuettel-rcppannoy-052359e/DESCRIPTION’ ...
─ preparing ‘RcppAnnoy’:
✔ checking DESCRIPTION meta-information ...
─ cleaning src
─ running ‘cleanup’
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘RcppAnnoy_0.0.12.tar.gz’

  • installing source package ‘RcppAnnoy’ ...
    ** libs
    clang++ -std=gnu++11 -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I../inst/include/ -I"/Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/include" -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -fPIC -Wall -mtune=core2 -g -O2 -c annoy.cpp -o annoy.o
    clang -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I../inst/include/ -I"/Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rcpp/include" -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -fPIC -Wall -mtune=core2 -g -O2 -c init.c -o init.o
    clang++ -std=gnu++11 -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o RcppAnnoy.so annoy.o init.o -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
    installing to /Library/Frameworks/R.framework/Versions/3.4/Resources/library/RcppAnnoy/libs
    ** R
    ** demo
    ** inst
    ** preparing package for lazy loading
    ** help
    *** installing help indices
    ** building package indices
    ** installing vignettes
    ** testing if installed package can be loaded
    Error: package or namespace load failed for ‘RcppAnnoy’ in .doLoadActions(where, attach):
    error in load action .A.1 for package RcppAnnoy: loadModule(module = "AnnoyAngular", what = TRUE, env = ns, loadNow = TRUE): Unable to load module "AnnoyAngular": Vektor ist zu groß

Multithreading

Dear Dr. Eddelbuettel:

Could you please help me clarify how the multithreading works in RcppAnnoy.

  1. Suppose a client wants to use RcppAnnoy "as is", following your simpleExample.R.
    By default, will the multithreading be enabled?

  2. When multithreading is enabled (I take it that one has to define ANNOYLIB_MULTITHREADED_BUILD in the source)
    are the search results going to be identical to those of the single threaded version?

Thank you for your help,
Nik Tuzov, PhD

Pass the whole matrix to addItem and getNNs function

For a big matrix, it is too slow to add each row with the addItem function, and too slow to call the getNNsByVector function row by row for a big query matrix. Could you add new functions that accept the whole matrix as input, both for building the tree and for querying?
Thanks a lot!
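As a rough sketch of what the building half of such a function could look like on the C++ side: the helper below walks a row-major matrix and adds one item per row. `addAllRows` is a hypothetical name, and the code assumes only that the index type exposes an `add_item(id, const float*)` member.

```cpp
#include <cstddef>
#include <vector>

// Adds every row of a row-major matrix to an index in a single call.
// `Index` is any type with an add_item(int, const float*) member;
// `addAllRows` is a hypothetical helper for illustration.
// Returns the number of rows added, or 0 if the input is malformed.
template <typename Index>
std::size_t addAllRows(Index& index, const std::vector<float>& data,
                       std::size_t nCols) {
    if (nCols == 0 || data.size() % nCols != 0)
        return 0;  // length is not a multiple of nCols: add nothing
    const std::size_t nRows = data.size() / nCols;
    for (std::size_t i = 0; i < nRows; ++i)
        index.add_item(static_cast<int>(i), data.data() + i * nCols);  // zero-based ids
    return nRows;
}
```

The loop still adds one row at a time, but it runs entirely in compiled code, so there is one call from R for the whole matrix rather than one per row.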

rcppannoy loading/installation error

Hi

I got this error when installing RcppAnnoy. It first appeared when I was trying to load the Seurat package, which led me to RcppAnnoy. I uninstalled RcppAnnoy and am now trying to reinstall it, but the installation fails.

Thanks!

install.packages("RcppAnnoy")
Installing package into ‘/home/rstudio/R/R_3.6’
(as ‘lib’ is unspecified)
trying URL 'https://mran.microsoft.com/snapshot/2019-11-11/src/contrib/RcppAnnoy_0.0.13.tar.gz'
Content type 'application/octet-stream' length 439735 bytes (429 KB)
==================================================
downloaded 429 KB

  • installing source package ‘RcppAnnoy’ ...
    ** package ‘RcppAnnoy’ successfully unpacked and MD5 sums checked
    ** using staged installation
    ** libs
    g++ -std=gnu++11 -I"/usr/local/lib/R/include" -DNDEBUG -I../inst/include/ -I"/home/rstudio/R/R_3.6/Rcpp/include" -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c annoy.cpp -o annoy.o
    gcc -I"/usr/local/lib/R/include" -DNDEBUG -I../inst/include/ -I"/home/rstudio/R/R_3.6/Rcpp/include" -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c init.c -o init.o
    g++ -std=gnu++11 -shared -L/usr/local/lib/R/lib -L/usr/local/lib -o RcppAnnoy.so annoy.o init.o -L/usr/local/lib/R/lib -lR
    installing to /home/rstudio/R/R_3.6/00LOCK-RcppAnnoy/00new/RcppAnnoy/libs
    ** R
    ** demo
    ** inst
    ** byte-compile and prepare package for lazy loading
    ** help
    *** installing help indices
    ** building package indices
    ** installing vignettes
    ** testing if installed package can be loaded from temporary location
    Error: package or namespace load failed for ‘RcppAnnoy’ in .doLoadActions(where, attach):
    error in load action .A.1 for package RcppAnnoy: loadModule(module = "AnnoyAngular", what = TRUE, env = ns, loadNow = TRUE): Unable to load module "AnnoyAngular": cannot allocate vector of size 701930.9 Gb
    Error: loading failed
    Execution halted
    ERROR: loading failed
  • removing ‘/home/rstudio/R/R_3.6/RcppAnnoy’
    Warning in install.packages :
    installation of package ‘RcppAnnoy’ had non-zero exit status

The downloaded source packages are in
‘/tmp/Rtmpi97ovS/downloaded_packages’

sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)

Matrix products: default
BLAS/LAPACK: /usr/lib/libopenblasp-r0.2.19.so

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1

Strange behavior

I have trained a model with RcppAnnoy, and I don't think its behavior makes sense:

> a$getNNsByItem(12000, 2)
[1] 12000  8672
> a$getNNsByItem(12000, 10)
 [1] 12000 13752 15723  5171 13957 15702  1819 16020 13101 15279

Two things look wrong. First, document 12000 is both the query and part of the response.
Second, document 8672 is returned when I ask for the two closest items, but it is not among the ten closest.

Is it normal?

Kind regards,
Michael

getItemsVector crashes R

Calling getItemsVector makes R crash. BTW getItemsVector is not in the tests :-)

library(RcppAnnoy)
set.seed(123)
f <- 40
a <- new(AnnoyEuclidean, f)
n <- 50                                
for (i in seq(n)) {
    v <- rnorm(f)
    a$addItem(i-1, v)
}
a$build(50)                   
a$save("/tmp/test.tree")
b <- new(AnnoyEuclidean, f)   
b$load("/tmp/test.tree")	
print(b$getNNsByItem(0, 40))
b$getItemsVector(0) # <- CRASH !!!
> devtools::session_info()
Session info -----------------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.4.1 (2017-06-30)
 system   x86_64, linux-gnu           
 ui       RStudio (1.1.350)           
 language fr_FR:en                    
 collate  fr_FR.UTF-8                 
 tz       Europe/Paris                
 date     2017-09-23                  

Packages ---------------------------------------------------------------------------------------------------------------
 package   * version date       source        
 base      * 3.4.1   2017-07-08 local         
 codetools   0.2-15  2016-10-05 CRAN (R 3.3.1)
 compiler    3.4.1   2017-07-08 local         
 datasets  * 3.4.1   2017-07-08 local         
 devtools    1.13.3  2017-08-02 CRAN (R 3.4.1)
 digest      0.6.12  2017-01-27 CRAN (R 3.4.0)
 graphics  * 3.4.1   2017-07-08 local         
 grDevices * 3.4.1   2017-07-08 local         
 memoise     1.1.0   2017-04-21 CRAN (R 3.4.0)
 methods   * 3.4.1   2017-07-08 local         
 Rcpp        0.12.12 2017-07-15 CRAN (R 3.4.1)
 RcppAnnoy * 0.0.9   2017-08-31 CRAN (R 3.4.1)
 stats     * 3.4.1   2017-07-08 local         
 tools       3.4.1   2017-07-08 local         
 utils     * 3.4.1   2017-07-08 local         
 withr       2.0.0   2017-07-28 CRAN (R 3.4.1)
 yaml        2.1.14  2016-11-12 CRAN (R 3.4.0)

More granular documentation

This is essentially a re-opening of #6 (closed due to inactivity). I'm using this package on a project and would like to include more documentation around RcppAnnoy to coworkers.

I think it would make the most sense for this to be included in the package, but I will document locally if the answer to the following question is "No". Would a PR using roxygen to document the Annoy* family be welcome?

I was thinking of using a similar style as what's used here to document an R6 class and its methods. Using roxygen adds RoxygenNote: 6.1.1 & Encoding: UTF-8 to DESCRIPTION.

Error in loading RcppAnnoy

Hi,

I am quite a new R user (I used to work in Python). While trying to load the Seurat package I encountered the following error:

library(Seurat)
Error: package or namespace load failed for ‘Seurat’ in .doLoadActions(where, attach):
error in load action .A.1 for package RcppAnnoy: loadModule(module = "AnnoyAngular", what = TRUE, env = ns, loadNow = TRUE): Unable to load module "AnnoyAngular": cannot allocate vector of size 12012.2 Gb

Is "RcppAnnoy" really trying to allocate a vector of roughly 12,000 Gb? Does that make sense? I would be very glad for any help in solving this problem.

(R version 3.6.1)

Thanks a lot!
Tomer

segfault on install

On Ubuntu 16.04, R itself does not crash, but it reports a segfault.

> sessionInfo()
R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] magrittr_1.5   tools_3.3.1    roxygen2_5.0.1 Rcpp_0.12.6    stringi_1.1.1 
[6] stringr_1.1.0  tcltk_3.3.1 

The traceback:

> install.packages("RcppAnnoy")
Installing package into ‘/home/me/R/x86_64-pc-linux-gnu-library/3.3’
(as ‘lib’ is unspecified)
trying URL 'http://cran.wu.ac.at/src/contrib/RcppAnnoy_0.0.7.tar.gz'
Content type 'application/x-gzip' length 27566 bytes (26 KB)
==================================================
downloaded 26 KB

Loading required namespace: roxygen2
* installing *source* package ‘RcppAnnoy’ ...
** package ‘RcppAnnoy’ successfully unpacked and MD5 sums checked
** libs
g++ -std=c++11 -I/usr/share/R/include -DNDEBUG -I../inst/include/  -I"/home/me/R/x86_64-pc-linux-gnu-library/3.3/Rcpp/include"   -fpic  -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c annoy.cpp -o annoy.o
g++ -std=c++11 -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o RcppAnnoy.so annoy.o -L/usr/lib/R/lib -lR
installing to /home/me/R/x86_64-pc-linux-gnu-library/3.3/RcppAnnoy/libs
** R
** demo
** inst
** preparing package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded

 *** caught segfault ***
address 0x20, cause 'memory not mapped'

Traceback:
 1: .Call(Module__classes_info, xp)
 2: Module(module, mustStart = TRUE, where = env)
 3: doTryCatch(return(expr), name, parentenv, handler)
 4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 5: tryCatchList(expr, classes, parentenv, handlers)
 6: tryCatch(Module(module, mustStart = TRUE, where = env), error = function(e) e)
 7: loadModule(module = "AnnoyAngular", what = TRUE, env = ns, loadNow = TRUE)
 8: (function (ns) loadModule(module = "AnnoyAngular", what = TRUE, env = ns, loadNow = TRUE))(<environment>)
 9: doTryCatch(return(expr), name, parentenv, handler)
10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
11: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch((function (ns) loadModule(module = "AnnoyAngular", what = TRUE, env = ns, loadNow = TRUE))(<environment>),     error = function(e) e)
13: eval(expr, envir, enclos)
14: eval(substitute(tryCatch(FUN(WHERE), error = function(e) e),     list(FUN = f, WHERE = where)), where)
15: .doLoadActions(where, attach)
16: methods::cacheMetaData(ns, TRUE, ns)
17: loadNamespace(package, lib.loc)
18: doTryCatch(return(expr), name, parentenv, handler)
19: tryCatchOne(expr, names, parentenv, handlers[[1L]])
20: tryCatchList(expr, classes, parentenv, handlers)
21: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        msg <- conditionMessage(e)        sm <- strsplit(msg, "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && identical(getOption("show.error.messages"),         TRUE)) {        cat(msg, file = stderr())        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
22: try({    attr(package, "LibPath") <- which.lib.loc    ns <- loadNamespace(package, lib.loc)    env <- attachNamespace(ns, pos = pos, deps)})
23: library(pkg_name, lib.loc = lib, character.only = TRUE, logical.return = TRUE)
24: withCallingHandlers(expr, packageStartupMessage = function(c) invokeRestart("muffleMessage"))
25: suppressPackageStartupMessages(library(pkg_name, lib.loc = lib,     character.only = TRUE, logical.return = TRUE))
26: doTryCatch(return(expr), name, parentenv, handler)
27: tryCatchOne(expr, names, parentenv, handlers[[1L]])
28: tryCatchList(expr, classes, parentenv, handlers)
29: tryCatch(expr, error = function(e) {    call <- conditionCall(e)    if (!is.null(call)) {        if (identical(call[[1L]], quote(doTryCatch)))             call <- sys.call(-4L)        dcall <- deparse(call)[1L]        prefix <- paste("Error in", dcall, ": ")        LONG <- 75L        msg <- conditionMessage(e)        sm <- strsplit(msg, "\n")[[1L]]        w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w")        if (is.na(w))             w <- 14L + nchar(dcall, type = "b") + nchar(sm[1L],                 type = "b")        if (w > LONG)             prefix <- paste0(prefix, "\n  ")    }    else prefix <- "Error : "    msg <- paste0(prefix, conditionMessage(e), "\n")    .Internal(seterrmessage(msg[1L]))    if (!silent && identical(getOption("show.error.messages"),         TRUE)) {        cat(msg, file = stderr())        .Internal(printDeferredWarnings())    }    invisible(structure(msg, class = "try-error", condition = e))})
30: try(suppressPackageStartupMessages(library(pkg_name, lib.loc = lib,     character.only = TRUE, logical.return = TRUE)))
31: tools:::.test_load_package("RcppAnnoy", "/home/me/R/x86_64-pc-linux-gnu-library/3.3")
An irrecoverable exception occurred. R is aborting now ...
Segmentation fault (core dumped)
ERROR: loading failed
* removing ‘/home/me/R/x86_64-pc-linux-gnu-library/3.3/RcppAnnoy’

The downloaded source packages are in
    ‘/tmp/RtmpNAupSC/downloaded_packages’
Warning message:
In install.packages("RcppAnnoy") :
  installation of package ‘RcppAnnoy’ had non-zero exit status

Vignette update

While release 0.0.17 went out nicely and without issues thanks to the updates in uwot and BiocNeighbors, we did not get around to updating the vignette for the newly added header file.
