profmem's Issues

profmem can't always parse output returned by (some) parallelized calls

This is stochastic: the failing examples below sometimes work for smaller jobs with fewer calls.

  • It seems to affect mainly forked computations, but not consistently (cf. future.apply::future_lapply, which has never failed for me so far).
  • I suspect that if multiple jobs write to the same log file simultaneously, that file gets corrupted.
  • Errors are triggered from:
> bench::mark(parallel::mclapply(seq_len(1e3), keep_busy))
Error in parse(text = trace) : <text>:1:88: unexpected symbol
1: c("qnorm", "FUN", "lapply", "doTryCatch", "tryCatchOne", "tryCatchList", "tryCat2048 :"seq.default
                                                                                         ^

Enter a frame number, or 0 to exit   

1: bench::mark(parallel::mclapply(seq_len(1000), keep_busy))
2: eval_one(exprs[[i]])
3: parse_allocations(f)
4: profmem::readRprofmem(filename)
5: lapply(bfr, FUN = function(x) {
    bytes <- gsub(pattern, "\\1", x)
    wh
6: FUN(X[[i]], ...)
7: eval(parse(text = trace))
8: parse(text = trace)
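
To make the suspected corruption concrete, here is a minimal base-R sketch of a line parser that rejects interleaved log lines instead of failing in parse(). It is only an illustration: parse_rprofmem_line() is a hypothetical helper, not part of profmem, the line layout (bytes, then " :", then quoted function names) is assumed from the error output above, and the `bad` line is hand-made to resemble that output. Anything that does not fit the layout is returned as NULL.

```r
# Hypothetical helper (NOT profmem's implementation): parse one Rprofmem-style
# log line without eval(parse()); return NULL when the line looks corrupted.
parse_rprofmem_line <- function(line) {
  # Assumed well-formed layout: <bytes> :"fn1" "fn2" ...
  pattern <- "^([0-9]+) :((\"[^\"]*\" ?)*)$"
  if (!grepl(pattern, line)) return(NULL)
  bytes <- as.numeric(sub(pattern, "\\1", line))
  trace_part <- sub(pattern, "\\2", line)
  # Pull out every quoted name, then drop the quotes
  trace <- regmatches(trace_part, gregexpr("\"[^\"]*\"", trace_part))[[1]]
  trace <- gsub("\"", "", trace, fixed = TRUE)
  list(bytes = bytes, trace = trace)
}

# A clean line, and a hand-made line where a second writer cut into the first
good <- "40048 :\"integer\" \"foo\" "
bad  <- "2048 :\"qnorm\" \"FUN\" \"tryCat2048 :\"seq.default\" "

str(parse_rprofmem_line(good))
is.null(parse_rprofmem_line(bad))
```

Skipping such lines would hide the failure but also undercount allocations, so a counter of rejected lines would probably be worth reporting alongside the result.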

reprex:

keep_busy <- function(n = 1e3) {
  r <- rnorm(n)
  p <- pnorm(r)
  q <- qnorm(p)
  o <- order(q)
}
bench::mark(parallel::mclapply(seq_len(1e3), keep_busy))
#> Error in parse(text = trace): <text>:1:158: unexpected symbol
#> 1: c("rnorm", "FUN", "lapply", "doTryCatch", "tryCatchOne", "tryCatchList", "tryCatch", "try", "sendMaster", "FUN", "lapply", "<Anonymous>", "eval", "eval", " "tryCatchOne
#>                                                                                                                                                                  ^

bench::mark(parallel::parLapply(seq_len(1e3), keep_busy,
                                cl = parallel::makeCluster(3)))
#> # A tibble: 1 x 6
#> # … with 6 more variables: expression <bch:expr>, min <bch:tm>,
#> #   median <bch:tm>, `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>

library(foreach)
library(doParallel); registerDoParallel(cores = 3)
#> Loading required package: iterators
#> Loading required package: parallel
bench::mark({foreach(x = seq_len(1e3)) %dopar% keep_busy})
#> Error in parse(text = trace): <text>:1:255: unexpected symbol
#> 1: lize", "sendMaster", "FUN", "lapply", "mclapply", "<Anonymous>", "%dopar%", "eval", "eval", "eval_one", "<Anonymous>", "eval", "eval", "withVisible", "withCallingHandlers", "doTryCatch", "tryC
#>                                                                                                                                                                                                                                                                   ^

future::plan("multicore")
bench::mark(future.apply::future_lapply(seq_len(1e3), keep_busy))
#> # A tibble: 1 x 6
#>   expression                                              min median
#>   <bch:expr>                                            <bch> <bch:>
#> 1 future.apply::future_lapply(seq_len(1000), keep_busy) 107ms  107ms
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>,
#> #   `gc/sec` <dbl>

future::plan("multisession")
bench::mark(future.apply::future_lapply(seq_len(1e3), keep_busy))
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 x 6
#>   expression                                              min median
#>   <bch:expr>                                            <bch> <bch:>
#> 1 future.apply::future_lapply(seq_len(1000), keep_busy) 552ms  552ms
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>,
#> #   `gc/sec` <dbl>

future::plan("sequential")
bench::mark(future.apply::future_lapply(seq_len(1e3), keep_busy))
#> # A tibble: 1 x 6
#>   expression                                              min median
#>   <bch:expr>                                            <bch> <bch:>
#> 1 future.apply::future_lapply(seq_len(1000), keep_busy) 113ms  113ms
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>,
#> #   `gc/sec` <dbl>

Created on 2019-10-06 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ──────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.1 (2019-07-05)
#>  os       Linux Mint 19.2             
#>  system   x86_64, linux-gnu           
#>  ui       X11                         
#>  language en_US                       
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Berlin               
#>  date     2019-10-06                  
#> 
#> ─ Packages ──────────────────────────────────────────────────────────────
#>  package      * version date       lib source        
#>  assertthat     0.2.1   2019-03-21 [1] CRAN (R 3.6.1)
#>  backports      1.1.4   2019-04-10 [1] CRAN (R 3.6.1)
#>  bench          1.0.4   2019-09-06 [1] CRAN (R 3.6.1)
#>  callr          3.3.1   2019-07-18 [1] CRAN (R 3.6.1)
#>  cli            1.1.0   2019-03-19 [1] CRAN (R 3.6.1)
#>  codetools      0.2-16  2018-12-24 [4] CRAN (R 3.5.2)
#>  crayon         1.3.4   2017-09-16 [1] CRAN (R 3.6.1)
#>  desc           1.2.0   2018-05-01 [1] CRAN (R 3.6.1)
#>  devtools       2.1.0   2019-07-06 [1] CRAN (R 3.6.1)
#>  digest         0.6.20  2019-07-04 [1] CRAN (R 3.6.1)
#>  doParallel   * 1.0.15  2019-08-02 [1] CRAN (R 3.6.1)
#>  doSNOW       * 1.0.18  2019-07-27 [1] CRAN (R 3.6.1)
#>  evaluate       0.14    2019-05-28 [1] CRAN (R 3.6.1)
#>  fansi          0.4.0   2018-10-05 [1] CRAN (R 3.6.1)
#>  foreach      * 1.4.7   2019-07-27 [1] CRAN (R 3.6.1)
#>  fs             1.3.1   2019-05-06 [1] CRAN (R 3.6.1)
#>  future         1.14.0  2019-07-02 [1] CRAN (R 3.6.1)
#>  future.apply   1.3.0   2019-06-18 [1] CRAN (R 3.6.1)
#>  globals        0.12.4  2018-10-11 [1] CRAN (R 3.6.1)
#>  glue           1.3.1   2019-03-12 [1] CRAN (R 3.6.1)
#>  highr          0.8     2019-03-20 [1] CRAN (R 3.6.1)
#>  htmltools      0.3.6   2017-04-28 [1] CRAN (R 3.6.1)
#>  iterators    * 1.0.12  2019-07-26 [1] CRAN (R 3.6.1)
#>  knitr          1.24    2019-08-08 [1] CRAN (R 3.6.1)
#>  listenv        0.7.0   2018-01-21 [1] CRAN (R 3.6.1)
#>  magrittr       1.5     2014-11-22 [1] CRAN (R 3.6.1)
#>  memoise        1.1.0   2017-04-21 [1] CRAN (R 3.6.1)
#>  pillar         1.4.2   2019-06-29 [1] CRAN (R 3.6.1)
#>  pkgbuild       1.0.4   2019-08-05 [1] CRAN (R 3.6.1)
#>  pkgconfig      2.0.2   2018-08-16 [1] CRAN (R 3.6.1)
#>  pkgload        1.0.2   2018-10-29 [1] CRAN (R 3.6.1)
#>  prettyunits    1.0.2   2015-07-13 [1] CRAN (R 3.6.1)
#>  processx       3.4.1   2019-07-18 [1] CRAN (R 3.6.1)
#>  profmem        0.5.0   2018-01-30 [1] CRAN (R 3.6.1)
#>  ps             1.3.0   2018-12-21 [1] CRAN (R 3.6.1)
#>  R6             2.4.0   2019-02-14 [1] CRAN (R 3.6.1)
#>  Rcpp           1.0.2   2019-07-25 [1] CRAN (R 3.6.1)
#>  remotes        2.1.0   2019-06-24 [1] CRAN (R 3.6.1)
#>  rlang          0.4.0   2019-06-25 [1] CRAN (R 3.6.1)
#>  rmarkdown      1.15    2019-08-21 [1] CRAN (R 3.6.1)
#>  rprojroot      1.3-2   2018-01-03 [1] CRAN (R 3.6.1)
#>  sessioninfo    1.1.1   2018-11-05 [1] CRAN (R 3.6.1)
#>  snow         * 0.4-3   2018-09-14 [1] CRAN (R 3.6.1)
#>  stringi        1.4.3   2019-03-12 [1] CRAN (R 3.6.1)
#>  stringr        1.4.0   2019-02-10 [1] CRAN (R 3.6.1)
#>  testthat       2.2.1   2019-07-25 [1] CRAN (R 3.6.1)
#>  tibble         2.1.3   2019-06-06 [1] CRAN (R 3.6.1)
#>  usethis        1.5.1   2019-07-04 [1] CRAN (R 3.6.1)
#>  utf8           1.1.4   2018-05-24 [1] CRAN (R 3.6.1)
#>  vctrs          0.2.0   2019-07-05 [1] CRAN (R 3.6.1)
#>  withr          2.1.2   2018-03-15 [1] CRAN (R 3.6.1)
#>  xfun           0.9     2019-08-21 [1] CRAN (R 3.6.1)
#>  yaml           2.2.0   2018-07-25 [1] CRAN (R 3.6.1)
#>  zeallot        0.1.0   2018-01-28 [1] CRAN (R 3.6.1)
#> 
#> [1] /home/fabian-s/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library

DOCS: Document the Rprofmem data.frame

Add more details on what an Rprofmem data.frame contains:

  • profmem::readRprofmem()
  • profmem::profmem()

Reminder:

> p <- profmem::profmem({ x <- integer(1e4); y <- foo(1e6) })
> p
Rprofmem memory profiling of:
{
    x <- integer(10000)
    y <- foo(1e+06)
}

Memory allocations:
       what   bytes              calls
1     alloc   40048          integer()
2     alloc 4000048 foo() -> integer()
total       4040096                   
> str(p)
Classes 'Rprofmem' and 'data.frame':	2 obs. of  3 variables:
 $ what : chr  "alloc" "alloc"
 $ bytes: num  4e+04 4e+06
 $ trace:List of 2
  ..$ : chr "integer"
  ..$ : chr  "integer" "foo"
 - attr(*, "threshold")= int 0
 - attr(*, "expression")= language {  x <- integer(10000); y <- foo(1e+06) }
  ..- attr(*, "srcref")=List of 3
  .. ..$ : 'srcref' int  1 23 1 23 23 23 1 1
  .. .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0> 
  .. ..$ : 'srcref' int  1 25 1 41 25 41 1 1
  .. .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0> 
  .. ..$ : 'srcref' int  1 44 1 56 44 56 1 1
  .. .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0> 
  ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0> 
  ..- attr(*, "wholeSrcref")= 'srcref' int  1 0 1 58 0 58 1 1
  .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0> 
 - attr(*, "value")= int  0 0 0 0 0 0 0 0 0 0 ...
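
As a possible starting point for that documentation, the sketch below rebuilds the three columns from the str(p) output above by hand (so it runs without profmem or a profiling-enabled R) and shows how they might be used. The hand-built data frame is a stand-in, not an object returned by profmem().

```r
# Stand-in mirroring the str(p) output above; not created by profmem()
p <- data.frame(
  what  = c("alloc", "alloc"),
  bytes = c(40048, 4000048),
  stringsAsFactors = FALSE
)
# 'trace' is a list column: one character vector per allocation,
# innermost call first (so "foo() -> integer()" is c("integer", "foo"))
p$trace <- list("integer", c("integer", "foo"))

# Total bytes allocated, matching the 'total' row of the printed output
total_bytes <- sum(p$bytes, na.rm = TRUE)

# Bytes grouped by the outermost call of each trace
outermost <- vapply(p$trace, function(tr) tr[length(tr)], character(1))
bytes_by_caller <- tapply(p$bytes, outermost, sum)

total_bytes
bytes_by_caller
```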

VIGNETTE: Was R updated such that example is no longer valid?

The vignette example may no longer hold in recent R versions; e.g., in R 3.4.3 there seems to be no coercion to double for x in:

> p <- profmem({
+     small <- (x < 5000)
+ })
> p
Rprofmem memory profiling of:
{
    small <- (x < 5000)
}
Memory allocations:
       bytes      calls
1      80040 <internal>
2      40040 <internal>
total 120080           

Memory profiling as of R 3.2.0 only?

R 3.1.3 gives me:

configure: WARNING: unrecognized options: --with-memory-profiling

Could be worthwhile to mention that in the README.

(My local build system could be wrong, too.)

profmem_begin() and profmem_end()

In addition to the current form:

p <- profmem({
  expressions profiled
})

Add support for something like:

profmem_begin()
{
 expressions profiled 
}
p <- profmem_end()

Shiny-based example

Hi there, I'm interested in applying this to Shiny apps - is that possible? Or would you recommend just using profvis?

Different result first time I run profmem...

Hi! I am trying to use profmem to compute memory usage of a set of functions and I get very different results when I run the same command again (it goes from 273376 to 36648 bytes).

Here is a reproducible example:

rm(list = ls())
r <- profmem({ 
  for (i in 1:5){
    for (j in 1:50){
      cat(i,j)
    }}
})
total(r) #19200
s <- profmem({ 
  for (i in 1:5){
    for (j in 1:50){
      cat(i,j)
    }}
})
total(s) #19760
t <- profmem({ 
  for (i in 1:5){
    for (j in 1:50){
      cat(i,j)
    }}
})
total(t) #19760

In this example memory usage increases while in my case it decreases a lot. Which one would be the "real" memory use, the first one or the following ones? Thank you!!!

Best,
Inés

print() for Rprofmem should report if there was an error while profiling

Related to #21, print() on an Rprofmem object does not report errors that occurred during profiling, although the error attribute of the returned result captures them:

> p <- profmem::profmem(log("a"))
> print(p)
Rprofmem memory profiling of:
log("a")
<environment: R_EmptyEnv>

Memory allocations:
      what bytes calls
total          0  

> attr(p, "error")
<simpleError in log("a"): non-numeric argument to mathematical function>
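
One way print() could surface this, sketched in base R under assumptions: profiling_error_note() is a hypothetical helper, the wording of the note is made up, and `p` below is a plain data frame with an "error" attribute attached by hand rather than a real profmem() result.

```r
# Hypothetical helper: emit one line if the object carries an "error" attribute
profiling_error_note <- function(p) {
  err <- attr(p, "error")
  if (!inherits(err, "condition")) return(invisible(NULL))
  note <- paste0("Profiling was terminated by an error: ", conditionMessage(err))
  cat(note, "\n", sep = "")
  invisible(note)
}

# Stand-in for a profmem() result whose expression failed
p <- data.frame(what = character(0), bytes = numeric(0))
attr(p, "error") <- simpleError(
  "non-numeric argument to mathematical function",
  call = quote(log("a"))
)

profiling_error_note(p)
```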

HELP NEEDED: R binaries known to have memory profiling enabled

There are a few options for ready-to-install R binaries for Linux, macOS and Windows users. Which of these are built with memory profiling enabled?

If you've installed any of the above binaries, or binaries from some other source, please check whether

> capabilities("profmem")

returns TRUE or FALSE and report back here and I'll add the info to the vignette.
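
For reporting back, a small snippet like the following prints the R version together with the result of the check. It uses only base R; the message wording is just a suggestion.

```r
# TRUE only if this R build was configured with memory profiling
has_profmem <- isTRUE(unname(capabilities("profmem")))

message(
  R.version.string, " on ", R.version$platform, ": memory profiling ",
  if (has_profmem) "enabled" else "NOT enabled"
)
```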

Support for nested `profmem()` calls

Wish

To be able to do:

p1 <- profmem({
  x <- 1:1000
  p2 <- profmem({
     y <- as.double(x)
  })
  z <- x * y
})

Where p1 should hold all memory allocations including those done during p2.

Issue

Rprofmem() itself can only handle a single file; from ?Rprofmem:

Enabling profiling automatically disables any existing profiling to another or the same file.

which is also confirmed when inspecting the internal code.

In develop, profmem_begin/end() now guards against the stack growing deeper than one level, in order to avoid overwriting existing profiling logs by mistake.

Solution

It should not be impossible to have the profmem package orchestrate a stack of files. One idea is to have profmem_begin() parse the existing profile buffer and stack it before starting over with a fresh one. Calls to profmem_end() will consume the current buffer and append it to any previous one existing at the level above. If not at the top level, it will then resume profiling to a fresh profile file.
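
The stacking idea can be simulated in a few lines of base R. This is only a toy model under stated assumptions: new_profile_stack() and its begin/record/end functions are hypothetical names, buffers are plain numeric vectors of byte counts, and record() stands in for what Rprofmem() would log; no real profiling or files are involved.

```r
# Toy model of a stack of profiling buffers, one per nesting level
new_profile_stack <- function() {
  stack <- list()
  begin <- function() {
    # Push a fresh, empty buffer for the new level
    stack[[length(stack) + 1L]] <<- numeric(0)
    invisible(length(stack))
  }
  record <- function(bytes) {
    # Stand-in for an allocation being logged at the current level
    n <- length(stack)
    stopifnot(n > 0L)
    stack[[n]] <<- c(stack[[n]], bytes)
    invisible(NULL)
  }
  end <- function() {
    # Pop the current buffer and append it to the parent's, if any
    n <- length(stack)
    stopifnot(n > 0L)
    buf <- stack[[n]]
    stack[[n]] <<- NULL
    if (n > 1L) stack[[n - 1L]] <<- c(stack[[n - 1L]], buf)
    buf
  }
  list(begin = begin, record = record, end = end)
}

ps <- new_profile_stack()
ps$begin()          # outer profmem({ ...
ps$record(4048)     #   x <- 1:1000         (made-up byte count)
ps$begin()          #   inner profmem({ ...
ps$record(8048)     #     y <- as.double(x) (made-up byte count)
p2 <- ps$end()      #   })
ps$record(8048)     #   z <- x * y          (made-up byte count)
p1 <- ps$end()      # })
```

Here p2 holds only the inner allocation, while p1 holds all three, which is the behavior asked for in the wish above.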

The threshold argument

  • The threshold argument should also be recorded in the stack.

  • If a nested profmem is created, its threshold must not be higher than its parent's. Generate a warning and fall back to the active threshold.

  • If the thresholds differ, then the recorded entries need to be filtered accordingly when returned.

  • There should be a method for retrieving threshold used.

  • The print() method should also report on the threshold used.

  • Drop threshold argument from profmem_resume() - use what's on the stack instead.

ROBUSTNESS: Add explicit 'stringsAsFactors' arguments [data.frame]

$ for pkg in $pkgs; do echo "$pkg:"; (cd "$pkg"; grep -E "^[ \t]*[^#].*data[.]frame" -- */*.R | grep -vF stringsAsFactors;); echo; read -r -p "Press ENTER to continue ..."; done

profmem:
R/profmem.R:  empty <- structure(empty, class = c("Rprofmem", "data.frame"), threshold = 0L)
R/Rprofmem-class.R:as.data.frame.Rprofmem <- function(x, ...) {
R/Rprofmem-class.R:  res <- data.frame(what = what, bytes = bytes, calls = traces,
R/Rprofmem-class.R:} ## as.data.frame()
R/Rprofmem-class.R:  data <- as.data.frame(x, ...)
tests/profmem.R:  data <- as.data.frame(p)
tests/profmem.R:  d <- as.data.frame(p)
tests/profmem.R:  d1 <- as.data.frame(p1)
tests/profmem.R:  d2 <- as.data.frame(p2)
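
For context, data.frame() converted character columns to factors by default before R 4.0.0, so the result of a call without the argument depends on the R version and on the user's options. A minimal sketch of the explicit form follows; the column names follow the grep hit in R/Rprofmem-class.R, and the values are illustrative only.

```r
# Explicit stringsAsFactors = FALSE gives character columns on any R version
res <- data.frame(
  what  = c("alloc", "alloc"),
  bytes = c(40048, 4000048),
  calls = c("integer()", "foo() -> integer()"),
  stringsAsFactors = FALSE
)

vapply(res, class, character(1))
```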

Errors are silenced

Example:

> p <- profmem::profmem({ y <- log("a") })
> p
Rprofmem memory profiling of:
{
    y <- log("a")
}

Memory allocations:
      what bytes calls
total          0
> p <- profmem::profmem({ y <- non.existing::foo() })
> p
Rprofmem memory profiling of:
{
    y <- non.existing::foo()
}

Memory allocations:
      what bytes calls
total          0

and

> p <- profmem::profmem({ stop("boom") })
> p
Rprofmem memory profiling of:
{
    stop("boom")
}

Memory allocations:
      what bytes calls
total          0      

Action

Capture error and re-signal as a warning;

p <- profvis::profvis(log("a"), errors = "ignore")  ## the default
p <- profvis::profvis(log("a"), errors = "warn")
Warning: profvis() detected a run-time error: 
Error in log("a") : non-numeric argument to mathematical function
p <- profvis::profvis(log("a"), errors = "error")
Error in log("a") : non-numeric argument to mathematical function
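
The three policies shown above can be sketched with tryCatch() in base R. This is an illustration of the capture-and-re-signal idea, not code from profmem or profvis: eval_with_error_policy() is a hypothetical name, and the warning text is made up.

```r
# Hypothetical helper: evaluate 'expr', capture any error, then apply a policy
eval_with_error_policy <- function(expr, errors = c("ignore", "warn", "error")) {
  errors <- match.arg(errors)
  err <- NULL
  # 'expr' is a promise; it is forced here, inside tryCatch()
  value <- tryCatch(expr, error = function(e) {
    err <<- e
    NULL
  })
  if (!is.null(err)) {
    if (errors == "warn") {
      warning("detected a run-time error: ", conditionMessage(err), call. = FALSE)
    } else if (errors == "error") {
      stop(err)  # re-signal the original condition
    }
  }
  # Keep the captured error available to the caller in every non-fatal case
  invisible(list(value = value, error = err))
}

res <- eval_with_error_policy(log("a"), errors = "ignore")
conditionMessage(res$error)
```

In a real implementation the profiling would be stopped between the capture and the re-signal, so that the log collected up to the error is still returned.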

Don't display NA by default

print() should not report on NAs by default because there can be quite a few at times. It could have a line saying:

Number of NA entries not displayed: 34

Add option to print() and as.data.frame() to include them.
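
A rough base-R sketch of that behavior, assuming the NA entries show up as NA in the bytes column. Both print_hiding_na() and its include_na argument are hypothetical names, and the data frame is made up for the example.

```r
# Hypothetical helper: hide rows with NA bytes unless include_na = TRUE
print_hiding_na <- function(df, include_na = FALSE) {
  is_na <- is.na(df$bytes)
  if (include_na || !any(is_na)) {
    print(df)
    return(invisible(0L))
  }
  print(df[!is_na, , drop = FALSE])
  cat(sprintf("Number of NA entries not displayed: %d\n", sum(is_na)))
  # Return the number of hidden rows, invisibly
  invisible(sum(is_na))
}

df <- data.frame(
  what  = c("alloc", "new page", "alloc", "new page"),
  bytes = c(40048, NA, 4000048, NA),
  stringsAsFactors = FALSE
)
print_hiding_na(df)
```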

VIGNETTE: Integers `5000L` in code snippets are displayed as `5000`

Integers 5000L in code snippets are displayed as 5000. This is due to a limitation in R.utils::withCapture();

> R.utils::withCapture(list(a=5000, b=5000L))
> list(a = 5000, b = 5000L)
$a
[1] 5000
$b
[1] 5000

which in turn is due to a limitation in how print() outputs integers;

> list(a = 5000, b = 5000L)
$a
[1] 5000
$b
[1] 5000
> 5000
[1] 5000
> 5000L
[1] 5000
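
For comparison, deparse() does keep the L suffix, so a capture mechanism based on deparsed values rather than printed ones would distinguish the two. The snippet below only demonstrates the base-R behavior; whether R.utils::withCapture() can be made to use it is a separate question.

```r
a <- 5000
b <- 5000L

# print() shows both the same way
print(a)
print(b)

# deparse() preserves the integer suffix
deparse(a)   # "5000"
deparse(b)   # "5000L"
deparse(list(a = a, b = b))   # "list(a = 5000, b = 5000L)"
```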
