
collections's Introduction

High Performance Container Data Types


Github: https://github.com/randy3k/collections

Documentation: https://randy3k.github.io/collections/

Provides high performance container data types such as queues, stacks, deques, dicts and ordered dicts. Benchmarks (https://randy3k.github.io/collections/articles/benchmark.html) have shown that these containers are asymptotically more efficient than those offered by other packages.

Installation

You can install the released version of collections from CRAN with:

install.packages("collections")

Install the latest development version using:

devtools::install_github("randy3k/collections")

Example

library(collections, warn.conflicts = FALSE)

Queue

q <- queue()
q$push(1)$push(2)
q$pop()
## [1] 1

Stack

s <- stack()
s$push(1)$push(2)
s$pop()
## [1] 2

Deque

dq <- deque()
dq$push(1)$pushleft(2)
dq$pop()
## [1] 1

Priority Queue

pq <- priority_queue()
pq$push("not_urgent")
pq$push("urgent", priority = 2)
pq$push("not_as_urgent", priority = 1)
pq$pop()
## [1] "urgent"
pq$pop()
## [1] "not_as_urgent"
pq$pop()
## [1] "not_urgent"

Dictionary. Compared to R environments, dict() does not leak memory and supports various other types of keys.

d <- dict()
e <- new.env()
d$set(e, 1)$set(sum, 2)$set(c(1L, 2L), 3)
d$get(c(1L, 2L))
## [1] 3

Ordered Dictionary

d <- ordered_dict()
d$set("b", 1)$set("a", 2)
d$as_list()
## $b
## [1] 1
## 
## $a
## [1] 2

collections's Issues

Error: C stack usage 7969440 is too close to the limit

Printing the storage object (e.g. last in a deque) or calling saveRDS on a container object produces an error.

d <- collections::deque()
d$push(1)
d$push(2)
d$push(3)
d$last


saveRDS(d, "test.rds")
Error: C stack usage  7969440 is too close to the limit

It seems that some containers use R data structures to implement the algorithms. I'm wondering if this error could be eliminated somehow?

Huge speed difference depending on key type

Is it expected that non-environment and non-character keys are approx. 25x slower to get? I assume the reason can be that non-environment/non-character keys must be serialized first (see the script below).

library(collections)

env <- new.env(parent = emptyenv())
num <- 1
int <- 1L
char <- "a"
bool <- TRUE

d_env <- dict()$set(env, NULL)
d_num <- dict()$set(num, NULL)
d_int <- dict()$set(int, NULL)
d_char <- dict()$set(char, NULL)
d_bool <- dict()$set(bool, NULL)
d_ser_int <- dict()$set(serialize(int, NULL), NULL)

microbenchmark::microbenchmark(
  d_env$get(env),
  d_num$get(num),
  d_int$get(int),
  d_char$get(char),
  d_bool$get(bool)
)
# Unit: microseconds
#        expr    min      lq     mean  median      uq     max neval
#         env  1.299  1.9655  2.14228  2.2315  2.3325   5.150   100
#         num 40.160 41.2545 44.06497 42.2385 44.0770 141.945   100
#         int 39.036 40.9910 43.23154 42.1640 43.9230  70.183   100
#        char  1.051  1.5435  1.83238  1.9385  2.0555   4.794   100
#        bool 39.232 40.8540 43.58119 41.7245 44.5625  64.041   100
#  serial_int 43.176 44.7480 47.04926 45.9635 47.9180  66.670   100

Limitation of functions as keys?

MRE:

d <- dict()
d$set(function(x) 1L, TRUE) # OK

fun <- function(x) x
d$set(fun, TRUE) # error
# Error in base::serialize(object, connection = NULL, ascii = ascii, version = serializeVersion) :  object 'x' not found
# Error in d$set(fun, TRUE) : cannot compute digest of the key

sfun <- serialize(fun, NULL)
d$set(sfun, TRUE) # OK

The documentation of dict states:

key: scalar character, environment or function

It seems there is a serious limitation on which functions can be used as keys. Is this intentional, but (currently) undocumented?

Initialiser for Dict runs incredibly slowly

I haven't tried any of the other collection objects, just the Dict. Initialising it with a list of about 100,000 string-to-integer key-value pairs took about a minute. However, this only seems to happen when the items are passed in at creation time. If I create an empty Dict and use a for loop to populate the pairs one at a time with set, the time drops to about 4 seconds.

Error: attempt to apply non-function

Hello,

It seems the dictionary function is not working correctly:

> library(collections, warn.conflicts = FALSE)
> d <- dict()
> e <- new.env()
> d$set(e, 1)$set(sum, 2)$set(c(1L, 2L), 3)
Error: attempt to apply non-function
> d$get(c(1L, 2L))
Error: TypeError: unhashable type: 'list'

Add `mpush()` and `mpop()`

Having performant implementations of mpush() and mpop(), similar to what fastmap::faststack() offers, would be very nice.

Here are slow "reference implementations"

tmp1 <- collections::stack()
tmp2 <- fastmap::faststack()
tmp1$mpush <- function(items) {
  for (item in items) {
    self$push(item)
  }
}

tmp1$mpop <- function(n) {
  replicate(n, self$pop(), simplify = FALSE)
}
environment(tmp1$mpop) <- tmp1
environment(tmp1$mpush) <- tmp1

bench::mark(
  collections = {
    tmp1$mpush(seq_len(1000))
    tmp1$mpop(500)
  },
  fastmap = {
    tmp2$mpush(.list = seq_len(1000))
    tmp2$mpop(500)
  }
)
#> # A tibble: 2 × 6
#>   expression       min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>  <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 collections  834.6µs  943.7µs     1036.    10.2KB     35.9
#> 2 fastmap       33.2µs   37.1µs     5446.    33.9KB     14.1

Created on 2022-12-12 with reprex v2.0.2

Information about the type of data structure from object.

I am using a dict() object in a function and want to check that the argument passed to the function is of dict type.

The class function doesn't provide this information.

d <- dict(list(a = 1,b = 2))  
class(d)
[1] "environment"

Any workaround for this issue?
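
One possible workaround, sketched here with base R only, is a duck-typing check: test that the argument is an environment exposing the methods the function relies on. The helper name is_dict_like and its default method list are assumptions for illustration, not part of collections, and the stand-in object below only mimics the shape of a dict.

```r
# Hypothetical helper: TRUE if `x` is an environment that has all of the
# named methods bound as functions.
is_dict_like <- function(x, methods = c("set", "get", "has")) {
  if (!is.environment(x)) return(FALSE)
  all(vapply(
    methods,
    function(m) {
      exists(m, envir = x, inherits = FALSE) &&
        is.function(get(m, envir = x, inherits = FALSE))
    },
    logical(1)
  ))
}

# Stand-in object with the same shape as a dict (an environment of functions).
fake_dict <- new.env()
fake_dict$set <- function(key, value) invisible(NULL)
fake_dict$get <- function(key) NULL
fake_dict$has <- function(key) FALSE

is_dict_like(fake_dict)    # TRUE
is_dict_like(list(a = 1))  # FALSE: not an environment
is_dict_like(new.env())    # FALSE: no methods bound
```

A check like this cannot tell a dict from an ordered dict or any other environment with the same method names, so it is only a stopgap until the object carries a class attribute.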

fail to install collections: Symbol not found: _holes_clear

I'm unable to upgrade to the latest release of collections.

This is my sessionInfo():

version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin18.7.0 (64-bit)
Running under: macOS Catalina 10.15

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /usr/local/Cellar/openblas/0.3.7/lib/libopenblasp-r0.3.7.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.6.1 tools_3.6.1    packrat_0.5.0

This is what is happening:

install.packages("collections")
Installing package into ‘/Users/e/Library/R/userLibrary’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/src/contrib/collections_0.2.1.tar.gz'
Content type 'application/x-gzip' length 45896 bytes (44 KB)
==================================================
downloaded 44 KB

* installing *source* package ‘collections’ ...
** package ‘collections’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
clang -I"/usr/local/Cellar/r/3.6.1/lib/R/include" -DNDEBUG   -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I/usr/local/opt/icu4c/include -I/usr/local/include  -fPIC  -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include/darwin  -c tommyds/tommy.c -o tommyds/tommy.o
clang -I"/usr/local/Cellar/r/3.6.1/lib/R/include" -DNDEBUG   -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I/usr/local/opt/icu4c/include -I/usr/local/include  -fPIC  -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include/darwin  -c collections.c -o collections.o
clang -I"/usr/local/Cellar/r/3.6.1/lib/R/include" -DNDEBUG   -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I/usr/local/opt/icu4c/include -I/usr/local/include  -fPIC  -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include/darwin  -c deque.c -o deque.o
clang -I"/usr/local/Cellar/r/3.6.1/lib/R/include" -DNDEBUG   -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I/usr/local/opt/icu4c/include -I/usr/local/include  -fPIC  -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include/darwin  -c dict.c -o dict.o
clang -I"/usr/local/Cellar/r/3.6.1/lib/R/include" -DNDEBUG   -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I/usr/local/opt/icu4c/include -I/usr/local/include  -fPIC  -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include/darwin  -c priority_queue.c -o priority_queue.o
clang -I"/usr/local/Cellar/r/3.6.1/lib/R/include" -DNDEBUG   -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I/usr/local/opt/icu4c/include -I/usr/local/include  -fPIC  -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include/darwin  -c queue.c -o queue.o
clang -I"/usr/local/Cellar/r/3.6.1/lib/R/include" -DNDEBUG   -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I/usr/local/opt/icu4c/include -I/usr/local/include  -fPIC  -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include/darwin  -c stack.c -o stack.o
clang -I"/usr/local/Cellar/r/3.6.1/lib/R/include" -DNDEBUG   -I/usr/local/opt/gettext/include -I/usr/local/opt/readline/include -I/usr/local/opt/icu4c/include -I/usr/local/include  -fPIC  -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include -I/Library/Java/JavaVirtualMachines/jdk-8.0_172.jdk/Contents/Home/include/darwin  -c utils.c -o utils.o
clang -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/usr/local/Cellar/r/3.6.1/lib/R/lib -L/usr/local/opt/openblas/lib -L/usr/local/opt/gettext/lib -L/usr/local/opt/readline/lib -L/usr/local/opt/icu4c/lib -L/usr/local/lib -o collections.so tommyds/tommy.o collections.o deque.o dict.o priority_queue.o queue.o stack.o utils.o -L/usr/local/Cellar/r/3.6.1/lib/R/lib -lR -lintl -Wl,-framework -Wl,CoreFoundation
installing to /Users/e/Library/R/userLibrary/00LOCK-collections/00new/collections/libs
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for ‘collections’ in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/Users/e/Library/R/userLibrary/00LOCK-collections/00new/collections/libs/collections.so':
  dlopen(/Users/e/Library/R/userLibrary/00LOCK-collections/00new/collections/libs/collections.so, 6): Symbol not found: _holes_clear
  Referenced from: /Users/e/Library/R/userLibrary/00LOCK-collections/00new/collections/libs/collections.so
  Expected in: flat namespace
 in /Users/e/Library/R/userLibrary/00LOCK-collections/00new/collections/libs/collections.so
Error: loading failed
Execution halted
ERROR: loading failed
* removing ‘/Users/e/Library/R/userLibrary/collections’
* restoring previous ‘/Users/e/Library/R/userLibrary/collections’
Warning in install.packages :
  installation of package ‘collections’ had non-zero exit status

The downloaded source packages are in
	‘/private/var/folders/9v/251f7k_x6hn8t9wsz8v21rgh0000gn/T/RtmppoTOOx/downloaded_packages’

set method on one OrderedDict changes another OrderedDict

I created two OrderedDict objects x and r, set a key value pair in r, and found that the keys and values in x had also been modified to be a copy of r even though no set method was called on x. Here is a minimal working example of the bug with R output on lines that begin with ##.

library(collections)
x <- OrderedDict$new() 
r <- OrderedDict$new() 
print(paste("x$keys()",x$keys()))
## [1] "x$keys() "
print(paste("x$values()",x$values()))
## [1] "x$values() "
seed <- c(27)
for (s in seed) { 
  r$set(as.character(s), 1/length(seed))
}
print(paste("x$keys()",x$keys())) # this should still be empty, but it's not.  It's a copy of r
## [1] "x$keys() 27"
print(paste("x$values()",x$values())) # this should still be empty, but it's not.  It's a copy of r
## [1] "x$values() 1"
sessionInfo()
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 18.04.1 LTS
## 
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
## LAPACK: /usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] collections_0.1.2
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.4.4  backports_1.1.2 R6_2.2.2        magrittr_1.5   
##  [5] rprojroot_1.3-2 tools_3.4.4     htmltools_0.3.6 yaml_2.2.0     
##  [9] Rcpp_0.12.19    stringi_1.2.4   rmarkdown_1.10  knitr_1.20     
## [13] stringr_1.3.1   digest_0.6.17   evaluate_0.11

Dict$new() calls old key and values

Hi,

In a function when I call

paramList <- Dict$new()

the previous keys and values are preserved and show up in the new dictionary. I tried rm(paramList), but it had no effect.

Is this a bug or my fault?
Thanks

my symbolic code:

main <- function() {
method <- "auth.getaut"
GetRMResponse(method)
}

GetRMResponse <- function(method, params) {

paramList <- Dict$new()

...

}

Consider vector based stack for benchmarking vignette

For completeness would it be worth including a vector based stack for comparison in the benchmarking vignette? The performance of one based on Martin Morgan's suggestion on stackoverflow is better than a list/environment based one.

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(forcats)
library(ggplot2)
library(collections)
#> 
#> Attaching package: 'collections'
#> The following object is masked from 'package:utils':
#> 
#>     stack

# vector based stack (based on https://stackoverflow.com/a/18678440)
vec_stack <- function(type="double", length=1000L) {
    v <- vector(type, length)
    i <- 1L
    len <- length(v)
    list(
        push = function(elt) {
            if (typeof(elt) != type)
                stop("types must match")
            if (i == len) {
                length(v) <<- 1.6 * len
                len <<- length(v)
            }
            v[[i]] <<- elt
            i <<- i + 1L
        },
        pop = function() {
            i <<- i - 1L
            v[[i]]
        },
        clear = function() {
            v <<- vector(type, length)
            i <<- 1L           # reset the insertion index
            len <<- length(v)  # reset the tracked capacity
        }
    )
}

# list based stack from benchmark vignette
list_stack <- function() {
    self <- environment()
    q <- NULL
    n <- NULL
    push <- function(item) {
        if (is.null(item)) {
            q[n + 1] <<- list(item)
        } else {
            q[[n + 1]] <<- item
        }
        n <<- n + 1
        invisible(self)
    }
    pop <- function() {
        if (n == 0) stop("stack is empty")
        v <- q[[n]]
        q <<- q[-n]
        n <<- n - 1
        v
    }
    clear <- function() {
        q <<- list()
        n <<- 0
        invisible(self)
    }
    clear()
    self
}

# bench mark based on one in vignette (slightly extended n)
bench_stack <- bench::press(
    n = c(10, 50, 100, 200, 500, 1000),
    bench::mark(
        `base::list_stack_grow` = {
            q <- list_stack()
            x <- rnorm(n)
            for (i in 1:n) q$push(x[i])
            for (i in 1:n) q$pop()
        },
        `base::vec_stack_pre_allocate` = {
            q <- vec_stack(length = n)
            x <- rnorm(n)
            for (i in 1:n) q$push(x[i])
            for (i in 1:n) q$pop()
        },
        `base::vec_stack_grow` = {
            q <- vec_stack(length = 2L)
            x <- rnorm(n)
            for (i in 1:n) q$push(x[i])
            for (i in 1:n) q$pop()
        },
        `collections::stack` = {
            q <- stack()
            x <- rnorm(n)
            for (i in 1:n) q$push(x[i])
            for (i in 1:n) q$pop()
        },
        check = FALSE
    )
) |> 
    mutate(expression = fct_reorder(
        as.character(expression), median, .fun = mean, .desc = TRUE))
#> Running with:
#>       n
#> 1    10
#> 2    50
#> 3   100
#> 4   200
#> 5   500
#> 6  1000

# plot
bench_stack %>%
    ggplot(aes(x = n, y = median)) +
    geom_line(aes(color = expression)) +
    scale_colour_brewer(palette = "Set2", direction = -1) +
    ggtitle("push and pop n times") + ylab("time")

Created on 2023-01-04 with reprex v2.0.2

Evaluate 'default' lazily in 'dict$get()'

Maybe some users would find the following sugar useful:

longRunningFn <- function() {
  message("Evaluating started...")
  Sys.sleep(5)
  FALSE
}
d <- collections::dict()$set("existing_key", TRUE)
## instead of this:
value <- 
  if (d$has("existing_key")) {
    d$get("existing_key")
  } else {
    longRunningFn()
  }
## we could write:
value <- d$get("existing_key", default = longRunningFn())

Currently the default argument is evaluated even if the key exists in the dictionary.
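
Until something like this is supported, a small wrapper can get the same effect by relying on R's lazy evaluation of function arguments. This is a minimal sketch: get_or_else and toy_dict are hypothetical names, and toy_dict is only a stand-in for a dict with character keys so that the example runs without the package.

```r
# Hypothetical stand-in for a dict with character keys.
toy_dict <- function() {
  store <- new.env(parent = emptyenv())
  list(
    set = function(key, value) assign(key, value, envir = store),
    has = function(key) exists(key, envir = store, inherits = FALSE),
    get = function(key) get(key, envir = store, inherits = FALSE)
  )
}

# `default` is a promise, so it is only evaluated if the else branch runs.
get_or_else <- function(d, key, default) {
  if (d$has(key)) d$get(key) else default
}

calls <- 0
slow_default <- function() {
  calls <<- calls + 1  # count how often the default is actually computed
  FALSE
}

d <- toy_dict()
d$set("existing_key", TRUE)

value <- get_or_else(d, "existing_key", slow_default())  # default not evaluated
other <- get_or_else(d, "missing_key", slow_default())   # default evaluated once
```

The wrapper performs two lookups (has, then get) when the key exists, which is the cost of not changing get() itself.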

Consider locking methods in the environment

Hi,

Tiny suggestion:
Automatically lock all the methods in the environments that are used for dictionaries, stacks, etc.
Otherwise, the user can accidentally remove methods (like set()), forcing them to re-create the entire object.

Demonstration for clarification:

> d <- dict(list(apple = 5, orange = 10))
> d$set <- NULL
> d$set # this is now changed to "NULL"
NULL
> 
> d <- dict(list(apple = 5, orange = 10))
> lockBinding("set", d) # d$set() is now safe
> d$set <- NULL
Error in d$set <- NULL : cannot change value of locked binding for 'set'
> d$set("banana", 3) # this still works

Or is there some issue with locking methods that I don't see?
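
The suggestion can be sketched with base R only. The helper lock_methods is a hypothetical name, and the object below is a stand-in environment rather than a real collections container; whether locking is safe for the package's own objects depends on internals not shown here.

```r
# Hypothetical helper: lock every binding in `obj` that holds a function,
# leaving data bindings writable so methods can still update internal state.
lock_methods <- function(obj) {
  for (name in ls(obj, all.names = TRUE)) {
    if (is.function(get(name, envir = obj, inherits = FALSE))) {
      lockBinding(name, obj)
    }
  }
  invisible(obj)
}

# Stand-in object: an environment with one data field and one method.
obj <- local({
  self <- environment()
  items <- list()
  set <- function(key, value) {
    items[[key]] <<- value
    invisible(self)
  }
  self
})

lock_methods(obj)

# Overwriting the method now fails instead of silently removing it.
result <- tryCatch(
  {
    obj$set <- NULL
    "overwritten"
  },
  error = function(e) "blocked"
)

obj$set("banana", 3)  # the method itself still works
```

Only function bindings are locked here, because a method that updates state with `<<-` needs its data bindings to stay writable.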

Kind regards,

Tony.

Backwards compatibility issue

Hi,

Thanks for creating this package; it has been a great help.

I was using v0.3.3 and had saved the object, then upgraded to v0.3.5. When I read the v0.3.3 object into a v0.3.5 environment I get the following error: Error in missing_arg(default) : could not find function "missing_arg". If I re-create the same hash using v0.3.5 everything works OK.

Reproducible Example:

# install v0.3.3 & v0.3.5
devtools::install_version(
  "collections", 
  version = "0.3.3",
  lib = "~/v3.3"
)
devtools::install_version(
  "collections", 
  version = "0.3.5",
  lib = "~/v3.5"
)

# create and save hash in v0.3.3
detach("package:collections", unload = TRUE)
library("collections", lib = "~/v3.3")
h3.3 <- dict(
  items = list(1, 2, 3),
  keys = list("A", "B", "C")
)
saveRDS(h3.3, file = "~/v3.3.rds")

# load v0.3.5
detach("package:collections", unload = TRUE)
library("collections", lib = "~/v3.5")
h3.3 <- readRDS(file = "~/v3.3.rds")
h3.3$get("A")
# Error in missing_arg(default) : could not find function "missing_arg"

# create and test new hash in v0.3.5
h3.5 <- dict(
  items = list(1, 2, 3),
  keys = list("A", "B", "C")
)
h3.5$get("A")

ubuntu 18.04 installation fail

Hi,
I am trying to install the package via install.packages("collections") on Ubuntu 18.04.
However, I got the following message during installation.

In file included from  xxh.c:1:0: 
xxh.c: In function ‘xxh_digest’:  
xxh.h:8:20: error: ‘false’ undeclared (first use in this function); did you mean ‘fabsl’?
 # define ALTREP(x) false
                    ^
xxh.c:79:77: note: in expansion of macro ‘ALTREP’
     if (Rf_length(x) >= 0 && Rf_isVectorAtomic(x) && (Rf_length(x) == 1 || !ALTREP(x))) {
                                                                             ^~~~~~
xxh.h:8:20: note: each undeclared identifier is reported only once for each function it appears in
 # define ALTREP(x) false
                    ^
xxh.c:79:77: note: in expansion of macro ‘ALTREP’
     if (Rf_length(x) >= 0 && Rf_isVectorAtomic(x) && (Rf_length(x) == 1 || !ALTREP(x))) {
                                                                             ^~~~~~
/usr/lib/R/etc/Makeconf:159: recipe for target 'xxh.o' failed

I think "stdbool.h" has to be included or "false" has to be declared.

segfault when ht_xptr is missing for dict environment

> k = collections::dict()
> k$ht_xptr = NULL
> k$get("test")

 *** caught segfault ***
address 0x108030040, cause 'memory not mapped'

Traceback:
 1: k$get("test")

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:

Maybe the C code should lock the environment binding, check the existence of ht_xptr, or hide ht_xptr from users?

compilation error in Eigen during installation on linux

Hello,

When I try to install the package on linux I get a strange compilation error:

> install.packages("collections")
Installing package into ‘/home/backes/R/x86_64-pc-linux-gnu-library/4.1’
(as ‘lib’ is unspecified)
trying URL 'https://stat.ethz.ch/CRAN/src/contrib/collections_0.3.5.tar.gz'
Content type 'application/x-gzip' length 93547 bytes (91 KB)
==================================================
downloaded 91 KB

* installing *source* package ‘collections’ ...
** package ‘collections’ successfully unpacked and MD5 sums checked
** using staged installation
** libs
gcc -I"/usr/include/R/" -DNDEBUG   -I"/home/backes/R/x86_64-pc-linux-gnu-library/4.1/Rcpp/include/"  -I"/home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppEigen/include/"  -I"/home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppEigen/include/unsupported"  -I"/home/backes/R/x86_64-pc-linux-gnu-library/4.1/BH/include" -I"/home/backes/R/x86_64-pc-linux-gnu-library/4.1/StanHeaders/include/src/"  -I"/home/backes/R/x86_64-pc-linux-gnu-library/4.1/StanHeaders/include/"  -I"/home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppParallel/include/"  -I"/home/backes/R/x86_64-pc-linux-gnu-library/4.1/rstan/include" -DEIGEN_NO_DEBUG  -DBOOST_DISABLE_ASSERTS  -DBOOST_PENDING_INTEGER_LOG2_HPP  -DSTAN_THREADS  -DBOOST_NO_AUTO_PTR  -include '/home/backes/R/x86_64-pc-linux-gnu-library/4.1/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp'  -D_REENTRANT -DRCPP_PARALLEL_USE_TBB=1   -D_FORTIFY_SOURCE=2   -fpic  -march=x86-64 -mtune=generic -O2 -pipe -fno-plt  -c tommyds/tommy.c -o tommyds/tommy.o
In file included from /home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppEigen/include/Eigen/Core:88,
                 from /home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppEigen/include/Eigen/Dense:1,
                 from /home/backes/R/x86_64-pc-linux-gnu-library/4.1/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13,
                 from <command-line>:
/home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:1: error: unknown type name ‘namespace’
  628 | namespace Eigen {
      | ^~~~~~~~~
/home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppEigen/include/Eigen/src/Core/util/Macros.h:628:17: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘{’ token
  628 | namespace Eigen {
      |                 ^
In file included from /home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppEigen/include/Eigen/Dense:1,
                 from /home/backes/R/x86_64-pc-linux-gnu-library/4.1/StanHeaders/include/stan/math/prim/mat/fun/Eigen.hpp:13,
                 from <command-line>:
/home/backes/R/x86_64-pc-linux-gnu-library/4.1/RcppEigen/include/Eigen/Core:96:10: fatal error: complex: No such file or directory
   96 | #include <complex>
      |          ^~~~~~~~~
compilation terminated.
make: *** [/usr/lib64/R/etc/Makeconf:168: tommyds/tommy.o] Error 1
ERROR: compilation failed for package ‘collections’
* removing ‘/home/backes/R/x86_64-pc-linux-gnu-library/4.1/collections’

The downloaded source packages are in
        ‘/tmp/RtmpdlcMoz/downloaded_packages’
Warning message:
In install.packages("collections") :
  installation of package ‘collections’ had non-zero exit status

I have R version 4.1.2:

R.version
               _                           
platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          4                           
minor          1.2                         
year           2021                        
month          11                          
day            01                          
svn rev        81115                       
language       R                           
version.string R version 4.1.2 (2021-11-01)
nickname       Bird Hippie  

and the compiler is version 11.1.0:

$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --with-isl --with-linker-hash-style=gnu --with-system-zlib --enable-__cxa_atexit --enable-cet=auto --enable-checking=release --enable-clocale=gnu --enable-default-pie --enable-default-ssp --enable-gnu-indirect-function --enable-gnu-unique-object --enable-install-libiberty --enable-linker-build-id --enable-lto --enable-multilib --enable-plugin --enable-shared --enable-threads=posix --disable-libssp --disable-libstdcxx-pch --disable-libunwind-exceptions --disable-werror gdc_include_dir=/usr/include/dlang/gdc
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.1.0 (GCC) 

Has anyone else encountered the same issue, or does anyone know how to fix it?

Thank you

[feature request] List class

I think it may make sense to create a wrapper around R's list with clear append semantics that avoids copying the object. R lists (and actually arrays) seem to have improved a lot recently, in the sense that they no longer make a copy on each append, but this is not widely known.

N = 1e6
dynamic = function(N) {
  lst = vector('list', 0)
  for(i in 1:N)  lst[[i]] = TRUE
}
preallocated = function(N){
  lst_preallocated = vector('list', N)
  for(i in 1:N)  lst_preallocated[[i]] = TRUE
}

microbenchmark::microbenchmark( dynamic(N), preallocated(N), times = 10)
Unit: milliseconds
            expr       min        lq      mean   median        uq       max neval
      dynamic(N) 164.18238 168.08486 178.70834 172.7364 177.34840 232.65446    10
 preallocated(N)  42.41742  42.73463  44.25383  44.2013  44.35719  47.54679    10

As can be seen, it is still better to pre-allocate the result, but dynamic resizing is not that bad now (few re-allocations).
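
To make the request concrete, here is a minimal base R sketch of such a wrapper. It is a closure-based list that doubles its capacity when full; the name growable_list and its methods are hypothetical and are not an existing collections API.

```r
# Hypothetical list wrapper with explicit append semantics.
growable_list <- function(capacity = 16L) {
  items <- vector("list", capacity)
  n <- 0L  # number of slots actually in use

  append <- function(item) {
    if (n == length(items)) {
      # Storage is full: double the capacity (at least 1 slot).
      length(items) <<- max(1L, 2L * length(items))
    }
    n <<- n + 1L
    items[n] <<- list(item)  # wrapping in list() keeps NULL items
    invisible(n)
  }

  as_list <- function() items[seq_len(n)]
  size <- function() n

  list(append = append, as_list = as_list, size = size)
}

l <- growable_list(capacity = 2L)
for (i in 1:5) l$append(i)
l$append(NULL)  # NULL is stored as an element, not dropped
l$size()        # 6
```

Doubling keeps the number of re-allocations logarithmic in the final length, which is the same idea as the pre-allocation comparison in the benchmark.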

Counter-intuitive behavior with sorted keys

k1 <- sort(c(1L, 10L, 3L))
k2 <- c(1L, 3L, 10L)
d <- collections::dict()
identical(k1, k2)
#> [1] TRUE
d$set(k1, TRUE)
d$has(k2)
#> [1] FALSE
d$has(sort(k2))
#> [1] FALSE
d$has(sort(rev(k2)))
#> [1] FALSE
d$has(c(1L, 3L, 10L))
#> [1] FALSE

Created on 2022-12-23 with reprex v2.0.2

k1 <- sort(c(1L, 10L, 3L))
k2 <- c(1L, 3L, 10L)
d <- collections::dict()
identical(k1, k2)
#> [1] TRUE
d$set(k2, TRUE)
d$has(k1)
#> [1] FALSE
d$has(k2)
#> [1] TRUE
d$has(sort(k2))
#> [1] TRUE
d$has(sort(rev(k2)))
#> [1] FALSE
d$has(sort(k1))
#> [1] FALSE
d$has(sort(rev(k1)))
#> [1] FALSE
d$has(c(1L, 3L, 10L))
#> [1] TRUE
d$has(sort(c(1L, 10L, 3L)))
#> [1] FALSE

Created on 2022-12-23 with reprex v2.0.2

Integer vector keys cannot be reliably used as keys in collections::dict().
This behavior is very confusing. Is it documented somewhere?
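
As a stopgap that does not address the underlying cause, the vector can be turned into a canonical character key before it is used. The helper as_key is a hypothetical name, and the sketch assumes integer or character vectors (doubles could lose precision when pasted).

```r
# Hypothetical helper: build a character key from the type and the values,
# so that vectors with identical contents always map to the same key.
as_key <- function(x) {
  paste0(typeof(x), ":", paste(x, collapse = ","))
}

k1 <- sort(c(1L, 10L, 3L))
k2 <- c(1L, 3L, 10L)

as_key(k1)                         # "integer:1,3,10"
as_key(k2)                         # "integer:1,3,10"
identical(as_key(k1), as_key(k2))  # TRUE
```

With a dict this would be used as d$set(as_key(k1), TRUE) followed by d$has(as_key(k2)), relying on scalar character keys rather than on how integer vectors are hashed.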
