
datastructures's Introduction

Hello there 👋

I am a researcher in machine learning and computational statistics at the Swiss Data Science Center (SDSC) and ETHZ in Zurich.

  • 🔭 Research interests: causal inference, generative modelling, Bayesian inference, probabilistic programming, ...
  • 👋 Contact: firstname dot lastname @ protonmail dot com

datastructures's People

Contributors

dirmeier

datastructures's Issues

Package ‘datastructures’ was removed from the CRAN repository.

Package ‘datastructures’ was removed from the CRAN repository.
Formerly available versions can be obtained from the archive.
Archived on 2023-05-05 as issues were not corrected in time.
A summary of the most recent check results can be obtained from the check results archive.
Please use the canonical form https://cran.r-project.org/package=datastructures to link to this page.

Will datastructures return to CRAN any time soon?

Examples don't work on `R CMD check`

When calling R CMD check, the examples don't work, although they work if executed manually or with devtools.
What? Won't fix. Seems not to be a bug on our side.

remove an element from a hashmap

Consider this example:


hm <- hashmap("integer")
keys <- 1:2
values <- list(
  3L,
  data.frame(A=rbeta(3, .5, .5), B=rgamma(3, 1)))
hm[keys] <- values

How can I remove the element hm[1L]?
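The issue does not show the package's own removal function, so the following is only a minimal sketch of the same operation on a base R environment used as a hash table. It is a stand-in, not the datastructures API; the names `env` and `remaining` are illustrative, and integer keys are converted with `as.character()` because environments only accept character keys.

```r
# Base R environment used as a hash map (assumption: a stand-in for the
# package's hashmap, chosen because it runs without datastructures).
env <- new.env(hash = TRUE)

keys <- 1:2
values <- list(
  3L,
  data.frame(A = rbeta(3, .5, .5), B = rgamma(3, 1)))

# Store each value under its key; keys must be character strings.
for (i in seq_along(keys)) {
  assign(as.character(keys[i]), values[[i]], envir = env)
}

# Remove the element stored under key 1.
rm(list = as.character(1L), envir = env)

# Only key "2" is left.
remaining <- ls(env)
```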

Support arbitrary R objects

I am wondering whether there is any plan to support arbitrary R objects, e.g., lists or environments, in the future. The current implementation somewhat limits the practical usability of the package. Thanks.

Change names

When loading the package I get:

The following objects are masked from ‘package:utils’:

    head, stack

The following objects are masked from ‘package:base’:

    get, remove

The functions head and get are very basic (as are stack and remove), so overriding them in the working environment is likely to break things for people. It is probably better either to make these functions generic or to change their names.
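Until the names are resolved, one workaround on the caller's side is to use namespace-qualified calls, which bypass the search path and therefore any masking. This is a sketch of that general R mechanism, not a fix inside the package; it runs with or without datastructures attached.

```r
x <- 1:10

# `pkg::name` always resolves to the export of that package,
# regardless of what an attached package has masked.
value     <- base::get("x")
shortened <- utils::head(x, -1)
long      <- utils::stack(as.data.frame(x))

# Remove x from the current environment using the base function.
base::remove(x)
```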

Adding has_key() and/or default values to hashmaps

Hi! I just came across this very nice package and was trying to play around a bit with hash tables.

My particular use cases require many queries to keys which may be absent from the table. For a simplified example, suppose I want to build a table of occurrences of elements in a vector v of strings through a single pass over the vector; ideally, I would like to do something like this:

# pseudocode
h <- hashmap("integer")
for (x in v) {
    h[x] <- h[x] + 1
}

where, if x is not yet a key, h[x] on the right-hand side of the assignment should return zero (like e.g. a C++ STL map with numeric values).

As far as I can see, the only sensible way to correctly implement my pseudocode above is to wrap the h[x]<- assignment in a tryCatch() clause. It would be thus nice to have either:

  • a has_key() method to check the presence of x in keys(h) in constant time, or
  • the possibility to set a default (R object) value to missing keys.
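For comparison, both requested behaviours can be sketched with a base R environment. This is not the datastructures API: `has_key` here is a hypothetical helper written for illustration, and the default value comes from the `ifnotfound` argument of base R's `get0()`.

```r
v <- c("a", "b", "a", "c", "a", "b")

# Environment used as a string-keyed hash table.
counts <- new.env(hash = TRUE)

for (x in v) {
  # Missing keys fall back to 0L. inherits = FALSE keeps the lookup
  # inside `counts`, so a key such as "c" does not find base::c.
  current <- get0(x, envir = counts, inherits = FALSE, ifnotfound = 0L)
  assign(x, current + 1L, envir = counts)
}

# Hypothetical helper: membership test without retrieving the value.
has_key <- function(env, key) exists(key, envir = env, inherits = FALSE)

# Collect the counts as a named integer vector, ordered by key.
result <- unlist(mget(sort(ls(counts)), envir = counts))
```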

If you think either of these two features is worth implementing, I could try to provide some starting code!

Hope this helps,
Thanks.

Valerio

Is `insert` slow?

I came across your package (which is cool!) and played with adding an as.list.stack function, but it appears to be very slow. Below is the code; please let me know if I've missed anything:

# install.packages("datastructures")
# install.packages("microbenchmark")

library("datastructures")
library("microbenchmark")

as.list.stack <- function(x, ..., mode) {
  # x is a stack
  # mode is based on the class of the top element
  n <- size(x)
  top <- peek(x)
  if(missing(mode)) {
    if(is.list(top)) {
      mode <- "list"
    } else {
      mode <- class(top)
    }
  }
  out <- vector(n, mode = mode)
  for(i in n:1) {
    out[[i]] <- pop(x)
  }
  
  out
}


##
# Basic benchmarking:
f1 <- function(n = 100) {
  x <- c()
  for(i in 1:n) {
    x <- c(x, i)
  }
  x
}

f2 <- function(n = 100) {
  x <- stack()
  for(i in 1:n) {
    insert(x, i)
  }
  as.list(x)
}

# f2(10)

Benchmarking:

> library("microbenchmark")
> microbenchmark(f1(100),f2(100), times = 10)
Unit: microseconds
    expr       min        lq       mean    median        uq       max neval
 f1(100)    73.661    99.357   141.3709   124.345   197.047   210.013    10
 f2(100) 10904.531 11115.825 13590.8941 11988.055 17615.234 18755.162    10
> microbenchmark(f1(10000),f2(10000), times = 10)
Unit: milliseconds
      expr     min       lq     mean   median       uq       max neval
 f1(10000) 190.201 203.9211 230.4935 227.3870 246.1155  291.2356    10
 f2(10000) 581.717 642.4793 699.8962 655.3675 701.4607 1004.4701    10

When doing some profiling, it seems that the insert operation is very expensive:

library(profvis)
profvis({
  n <- 10000
  x <- c()
  for(i in 1:n) {
    x <- c(x, i)
  }
  x
})
profvis({
  n <- 10000
  x <- stack()
  for(i in 1:n) {
    insert(x, i)
  }
  as.list(x)
})

I suspect this might be because insert is a generic method and has to do a lookup every time I call it (and since it is S4, that might make it a bit slow).
I've tried to find the function directly but failed to get it:

> # showMethods("insert")
> getMethod("insert", "stack")
Error in getMethod("insert", "stack") : 
  no method found for function 'insert' and signature stack

Any thoughts?
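One possible reason for the failed lookup, offered as an assumption rather than a diagnosis of the package: `getMethod()` requires an exact signature match, so if a method is registered on more than one argument, naming only the first class fails. The class and generic below (`toy_stack`, `toy_insert`) are hypothetical and only mimic that situation.

```r
library(methods)

# Toy S4 class and a generic that dispatches on two arguments.
setClass("toy_stack", representation(items = "list"))
setGeneric("toy_insert", function(obj, x) standardGeneric("toy_insert"))
setMethod("toy_insert", signature("toy_stack", "numeric"),
  function(obj, x) {
    obj@items <- c(obj@items, list(x))
    obj
  })

# A partial signature is padded with "ANY", which matches no
# registered method, so the lookup errors; capture that as NULL.
partial <- tryCatch(getMethod("toy_insert", "toy_stack"),
                    error = function(e) NULL)

# The full signature finds the method definition.
full <- getMethod("toy_insert", c("toy_stack", "numeric"))

# The method itself works as expected.
s <- toy_insert(new("toy_stack"), 1)
```

If this is what happens with `insert`, `showMethods("insert")` should list the complete signatures to pass to `getMethod()`.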

Benchmark file seems to not work (for me)

When running this file:
https://github.com/dirmeier/datastructures/blob/master/benchmark/benchmark_hashmap.R
I get:

> microbenchmark(
+     hash = f1(10000),
+     env  = f2(10000)
+ )
 Error in initialize(value, ...) : 
  cannot use object of class “character” in new():  class “hashmap” does not extend that class 

My session info:

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X Yosemite 10.10.5

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] microbenchmark_1.4-6 datastructures_0.2.7 Rcpp_0.12.14         tableHTML_1.1.0      httr_1.3.1          
 [6] ggplot2_3.0.0        lubridate_1.7.1      stringr_1.3.1        bindrcpp_0.2         dplyr_0.7.4         
[11] rtweet_0.6.8        

loaded via a namespace (and not attached):
 [1] pillar_1.3.0     plyr_1.8.4       bindr_0.1.1      tools_3.3.3      digest_0.6.13    lattice_0.20-34 
 [7] nlme_3.1-131     jsonlite_1.5     tibble_1.4.2     gtable_0.2.0     mgcv_1.8-17      pkgconfig_2.0.2 
[13] rlang_0.2.2      Matrix_1.2-8     cli_1.0.0        rstudioapi_0.7   curl_3.2         yaml_2.1.16     
[19] withr_2.1.2      grid_3.3.3       glue_1.2.0       R6_2.2.2         fansi_0.3.0      purrr_0.2.4     
[25] magrittr_1.5     codetools_0.2-15 htmltools_0.3.6  scales_0.5.0     assertthat_0.2.0 colorspace_1.3-2
[31] labeling_0.3     utf8_1.1.3       stringi_1.2.4    openssl_1.0.2    lazyeval_0.2.1   munsell_0.5.0   
[37] crayon_1.3.4    
> 

utils::head, utils::stack, base::get, base::remove get overwritten without replacement

datastructures overwrites these fairly important functions/generics from always-loaded packages without replacement, which will break a lot of other people's code:

Expected behavior:

> x <- 1:10
> get("x")
 [1]  1  2  3  4  5  6  7  8  9 10
> head(x, -1)
[1] 1 2 3 4 5 6 7 8 9
> stack(as.data.frame(x))
   values ind
1       1   x
2       2   x
3       3   x
4       4   x
5       5   x
6       6   x
7       7   x
8       8   x
9       9   x
10     10   x
> remove(x)
> x
Error: object 'x' not found

After datastructures is loaded:

> library(datastructures)
> x <- 1:10
> get("x")
  Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘get’ for signature ‘"character", "missing", "missing"’
> head(x, -1)
Error in head(x, -1) : unused argument (-1)
> stack(as.data.frame(x))
Error in stack(as.data.frame(x)) : unused argument (as.data.frame(x))
> remove(x)
 Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘remove’ for signature ‘"integer", "missing", "missing"’

openjournals/joss-reviews#907

Convert entire stack/queue to list?

Once the data structure is "finished with", i.e., all the pushing and popping is done, how can we access the data as a native R vector or list?

For example:

require('datastructures')

q = queue()
insert(q, as.list(c(1, 2, 3, 4)))
as.list(q)
# Error in as.list.default(q) : 
#  no method for coercing this S4 class to a vector

Stack is not type stable when inserting/popping list objects

The ability to insert a list and have the elements added individually is nice; however, when pushing a list of length 1, the list is not unwrapped. This requires the user to detect length-1 lists and handle them specially, which is surprising and cumbersome.

library(datastructures)
#> Loading required package: Rcpp
#> 
#> Attaching package: 'datastructures'
#> The following object is masked from 'package:utils':
#> 
#>     stack
s1 <- stack()
s2 <- stack()

insert(s1, list("1"))
#> An object of class stack<SEXP>
#> 
#> Peek: list, ...
pop(s1)
#> [[1]]
#> [1] "1"
# class list

insert(s2, list("1","2"))
#> An object of class stack<SEXP>
#> 
#> Peek: character, ...
pop(s2)
#> [1] "2"
# class character

Created on 2019-05-15 by the reprex package (v0.2.1)

Insert a list of values under a key in heap

Hi,
The package is a nice contribution. I am using it for an epidemic simulation, but I am having trouble using the heap.

Your heap structure takes only a key and a value as arguments. But frequently we need to insert a key whose value is a list, like
fheap <- fibonacci_heap("numeric", c("integer", "integer")). That means a value's length could be 1 or more. This improvement would help users of the package a lot. The package would then also be a competitor to Python's heapq package.

I hope to see the update in the future.

Again, thanks for producing such a helpful package.

Yushuf

decrease-key

I am interested in using a priority queue for ropensci/drake#227, and I am really happy to have found this package. Is there a decrease-key method available? I could not find one in the documentation, and I will need it to modify the priorities in the queue as jobs complete at unpredictable times in a custom scheduler.
