henrikbengtsson / profmem
R package: profmem - Simple Memory Profiling for R
Home Page: https://cran.r-project.org/package=profmem
I was a bit too quick to add that microbenchmark paragraph to the vignette just before submitting to CRAN. It claims to measure overhead added by the garbage collector. However, the difference is more likely due to the coercion of x to double.
This should be fixed.
This is stochastic: the failing examples below sometimes work for smaller jobs with fewer calls.
The failure occurs mainly, but not consistently, with fork-ed computations (cf. future_lapply, which has never failed for me so far).
> bench::mark(parallel::mclapply(seq_len(1e3), keep_busy))
Error in parse(text = trace) : <text>:1:88: unexpected symbol
1: c("qnorm", "FUN", "lapply", "doTryCatch", "tryCatchOne", "tryCatchList", "tryCat2048 :"seq.default
^
Enter a frame number, or 0 to exit
1: bench::mark(parallel::mclapply(seq_len(1000), keep_busy))
2: eval_one(exprs[[i]])
3: parse_allocations(f)
4: profmem::readRprofmem(filename)
5: lapply(bfr, FUN = function(x) {
bytes <- gsub(pattern, "\\1", x)
wh
6: FUN(X[[i]], ...)
7: eval(parse(text = trace))
8: parse(text = trace)
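The traceback above ends in parse(text = trace), which stops as soon as one call-stack string in the profile log is malformed. As a minimal sketch, and not profmem's actual implementation, a tolerant parser could return NA for such a line instead of aborting the whole read. The helper name parse_trace_safely is hypothetical, and the malformed input is taken from the error message above.

```r
# Hypothetical helper: parse one deparsed call-stack string such as
# 'c("rnorm", "FUN", "lapply")'; return NA_character_ if it is malformed.
parse_trace_safely <- function(trace) {
  tryCatch(
    eval(parse(text = trace), envir = baseenv()),
    error = function(e) NA_character_
  )
}

# A well-formed entry parses to a character vector
ok <- parse_trace_safely('c("rnorm", "FUN", "lapply")')

# A garbled entry (as in the error above) yields NA instead of an error
bad <- parse_trace_safely('c("qnorm", "FUN", "tryCat2048 :"seq.default')
```

Whether silently skipping such entries is acceptable depends on the use case; the byte counts for those lines would still need separate handling.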
reprex:
keep_busy <- function(n = 1e3) {
  r <- rnorm(n)
  p <- pnorm(r)
  q <- qnorm(p)
  o <- order(q)
}
bench::mark(parallel::mclapply(seq_len(1e3), keep_busy))
#> Error in parse(text = trace): <text>:1:158: unexpected symbol
#> 1: c("rnorm", "FUN", "lapply", "doTryCatch", "tryCatchOne", "tryCatchList", "tryCatch", "try", "sendMaster", "FUN", "lapply", "<Anonymous>", "eval", "eval", " "tryCatchOne
#> ^
bench::mark(parallel::parLapply(seq_len(1e3), keep_busy,
cl = parallel::makeCluster(3)))
#> # A tibble: 1 x 6
#> # … with 6 more variables: expression <bch:expr>, min <bch:tm>,
#> # median <bch:tm>, `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl>
library(foreach)
library(doParallel); registerDoParallel(cores = 3)
#> Loading required package: iterators
#> Loading required package: parallel
bench::mark({foreach(x = seq_len(1e3)) %dopar% keep_busy})
#> Error in parse(text = trace): <text>:1:255: unexpected symbol
#> 1: lize", "sendMaster", "FUN", "lapply", "mclapply", "<Anonymous>", "%dopar%", "eval", "eval", "eval_one", "<Anonymous>", "eval", "eval", "withVisible", "withCallingHandlers", "doTryCatch", "tryC
#> ^
future::plan("multicore")
bench::mark(future.apply::future_lapply(seq_len(1e3), keep_busy))
#> # A tibble: 1 x 6
#> expression min median
#> <bch:expr> <bch> <bch:>
#> 1 future.apply::future_lapply(seq_len(1000), keep_busy) 107ms 107ms
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>,
#> # `gc/sec` <dbl>
future::plan("multisession")
bench::mark(future.apply::future_lapply(seq_len(1e3), keep_busy))
#> Warning: Some expressions had a GC in every iteration; so filtering is
#> disabled.
#> # A tibble: 1 x 6
#> expression min median
#> <bch:expr> <bch> <bch:>
#> 1 future.apply::future_lapply(seq_len(1000), keep_busy) 552ms 552ms
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>,
#> # `gc/sec` <dbl>
future::plan("sequential")
bench::mark(future.apply::future_lapply(seq_len(1e3), keep_busy))
#> # A tibble: 1 x 6
#> expression min median
#> <bch:expr> <bch> <bch:>
#> 1 future.apply::future_lapply(seq_len(1000), keep_busy) 113ms 113ms
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>,
#> # `gc/sec` <dbl>
Created on 2019-10-06 by the reprex package (v0.3.0)
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────
#> setting value
#> version R version 3.6.1 (2019-07-05)
#> os Linux Mint 19.2
#> system x86_64, linux-gnu
#> ui X11
#> language en_US
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Berlin
#> date 2019-10-06
#>
#> ─ Packages ───────────────────────────────────────────────────────────────
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.1)
#> backports 1.1.4 2019-04-10 [1] CRAN (R 3.6.1)
#> bench 1.0.4 2019-09-06 [1] CRAN (R 3.6.1)
#> callr 3.3.1 2019-07-18 [1] CRAN (R 3.6.1)
#> cli 1.1.0 2019-03-19 [1] CRAN (R 3.6.1)
#> codetools 0.2-16 2018-12-24 [4] CRAN (R 3.5.2)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.1)
#> desc 1.2.0 2018-05-01 [1] CRAN (R 3.6.1)
#> devtools 2.1.0 2019-07-06 [1] CRAN (R 3.6.1)
#> digest 0.6.20 2019-07-04 [1] CRAN (R 3.6.1)
#> doParallel * 1.0.15 2019-08-02 [1] CRAN (R 3.6.1)
#> doSNOW * 1.0.18 2019-07-27 [1] CRAN (R 3.6.1)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.1)
#> fansi 0.4.0 2018-10-05 [1] CRAN (R 3.6.1)
#> foreach * 1.4.7 2019-07-27 [1] CRAN (R 3.6.1)
#> fs 1.3.1 2019-05-06 [1] CRAN (R 3.6.1)
#> future 1.14.0 2019-07-02 [1] CRAN (R 3.6.1)
#> future.apply 1.3.0 2019-06-18 [1] CRAN (R 3.6.1)
#> globals 0.12.4 2018-10-11 [1] CRAN (R 3.6.1)
#> glue 1.3.1 2019-03-12 [1] CRAN (R 3.6.1)
#> highr 0.8 2019-03-20 [1] CRAN (R 3.6.1)
#> htmltools 0.3.6 2017-04-28 [1] CRAN (R 3.6.1)
#> iterators * 1.0.12 2019-07-26 [1] CRAN (R 3.6.1)
#> knitr 1.24 2019-08-08 [1] CRAN (R 3.6.1)
#> listenv 0.7.0 2018-01-21 [1] CRAN (R 3.6.1)
#> magrittr 1.5 2014-11-22 [1] CRAN (R 3.6.1)
#> memoise 1.1.0 2017-04-21 [1] CRAN (R 3.6.1)
#> pillar 1.4.2 2019-06-29 [1] CRAN (R 3.6.1)
#> pkgbuild 1.0.4 2019-08-05 [1] CRAN (R 3.6.1)
#> pkgconfig 2.0.2 2018-08-16 [1] CRAN (R 3.6.1)
#> pkgload 1.0.2 2018-10-29 [1] CRAN (R 3.6.1)
#> prettyunits 1.0.2 2015-07-13 [1] CRAN (R 3.6.1)
#> processx 3.4.1 2019-07-18 [1] CRAN (R 3.6.1)
#> profmem 0.5.0 2018-01-30 [1] CRAN (R 3.6.1)
#> ps 1.3.0 2018-12-21 [1] CRAN (R 3.6.1)
#> R6 2.4.0 2019-02-14 [1] CRAN (R 3.6.1)
#> Rcpp 1.0.2 2019-07-25 [1] CRAN (R 3.6.1)
#> remotes 2.1.0 2019-06-24 [1] CRAN (R 3.6.1)
#> rlang 0.4.0 2019-06-25 [1] CRAN (R 3.6.1)
#> rmarkdown 1.15 2019-08-21 [1] CRAN (R 3.6.1)
#> rprojroot 1.3-2 2018-01-03 [1] CRAN (R 3.6.1)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.1)
#> snow * 0.4-3 2018-09-14 [1] CRAN (R 3.6.1)
#> stringi 1.4.3 2019-03-12 [1] CRAN (R 3.6.1)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.1)
#> testthat 2.2.1 2019-07-25 [1] CRAN (R 3.6.1)
#> tibble 2.1.3 2019-06-06 [1] CRAN (R 3.6.1)
#> usethis 1.5.1 2019-07-04 [1] CRAN (R 3.6.1)
#> utf8 1.1.4 2018-05-24 [1] CRAN (R 3.6.1)
#> vctrs 0.2.0 2019-07-05 [1] CRAN (R 3.6.1)
#> withr 2.1.2 2018-03-15 [1] CRAN (R 3.6.1)
#> xfun 0.9 2019-08-21 [1] CRAN (R 3.6.1)
#> yaml 2.2.0 2018-07-25 [1] CRAN (R 3.6.1)
#> zeallot 0.1.0 2018-01-28 [1] CRAN (R 3.6.1)
#>
#> [1] /home/fabian-s/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library
Add more details on what an Rprofmem data.frame contains:
profmem::readRprofmem()
profmem::profmem()
Reminder:
> p <- profmem::profmem({ x <- integer(1e4); y <- foo(1e6) })
> p
Rprofmem memory profiling of:
{
x <- integer(10000)
y <- foo(1e+06)
}
Memory allocations:
what bytes calls
1 alloc 40048 integer()
2 alloc 4000048 foo() -> integer()
total 4040096
> str(p)
Classes 'Rprofmem' and 'data.frame': 2 obs. of 3 variables:
$ what : chr "alloc" "alloc"
$ bytes: num 4e+04 4e+06
$ trace:List of 2
..$ : chr "integer"
..$ : chr "integer" "foo"
- attr(*, "threshold")= int 0
- attr(*, "expression")= language { x <- integer(10000); y <- foo(1e+06) }
..- attr(*, "srcref")=List of 3
.. ..$ : 'srcref' int 1 23 1 23 23 23 1 1
.. .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0>
.. ..$ : 'srcref' int 1 25 1 41 25 41 1 1
.. .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0>
.. ..$ : 'srcref' int 1 44 1 56 44 56 1 1
.. .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0>
..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0>
..- attr(*, "wholeSrcref")= 'srcref' int 1 0 1 58 0 58 1 1
.. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x556b805fa4e0>
- attr(*, "value")= int 0 0 0 0 0 0 0 0 0 0 ...
The vignette example may no longer be true in recent R versions, e.g. in R 3.4.3 there seems to be no coercion to double for x in:
> p <- profmem({
+ small <- (x < 5000)
+ })
> p
Rprofmem memory profiling of:
{
small <- (x < 5000)
}
Memory allocations:
bytes calls
1 80040 <internal>
2 40040 <internal>
total 120080
R 3.1.3 gives me:
configure: WARNING: unrecognized options: --with-memory-profiling
Could be worthwhile to mention that in the README.
(My local build system could be wrong, too.)
In addition to current:
p <- profmem({
expressions profiled
})
Add support for something like:
profmem_begin()
{
expressions profiled
}
p <- profmem_end()
Make it possible to disable printing of the expression in print() via an option
Hi there, I'm interested in applying this to Shiny apps - is that possible? Or would you recommend just using profvis?
Hi! I am trying to use profmem to compute memory usage of a set of functions and I get very different results when I run the same command again (it goes from 273376 to 36648 bytes).
Here is a reproducible example:
rm(list = ls())
r <- profmem({
  for (i in 1:5) {
    for (j in 1:50) {
      cat(i, j)
    }
  }
})
total(r) #19200
s <- profmem({
  for (i in 1:5) {
    for (j in 1:50) {
      cat(i, j)
    }
  }
})
total(s) #19760
t <- profmem({
  for (i in 1:5) {
    for (j in 1:50) {
      cat(i, j)
    }
  }
})
total(t) #19760
In this example memory usage increases while in my case it decreases a lot. Which one would be the "real" memory use, the first one or the following ones? Thank you!!!
Best,
Inés
$ for pkg in $pkgs; do echo "$pkg:"; (cd "$pkg"; grep -E "^[ \t]*[^#].*[cr]bind" -- */*.R | grep -vF stringsAsFactors;); echo; read -r -p "Press ENTER to continue ..."; done
profmem:
R/Rprofmem-class.R: data <- rbind(data, list(what = "", bytes = total, calls = ""))
Related to #21, print() on Rprofmem does not report errors that occurred during profiling, although the attribute error of the returned result captures them:
> p <- profmem::profmem(log("a"))
> print(p)
Rprofmem memory profiling of:
log("a")
<environment: R_EmptyEnv>
Memory allocations:
what bytes calls
total 0
> attr(p, "error")
<simpleError in log("a"): non-numeric argument to mathematical function>
There are a few options for ready-to-install R binaries for Linux, macOS and Windows users. Which of these are built with memory profiling enabled?
If you've installed any of the above binaries, or binaries from some other source, please check whether
> capabilities("profmem")
returns TRUE or FALSE and report back here and I'll add the info to the vignette.
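For reporting back, a short snippet along these lines could collect the relevant facts in one message. It is only a sketch: it relies on base R's capabilities("profmem") and R.version.string, and the variable name has_profmem is made up for illustration.

```r
# capabilities("profmem") is TRUE only if R was built with memory profiling
has_profmem <- isTRUE(unname(capabilities("profmem")))

# Report the R build together with the result
message(sprintf(
  "%s (%s): memory profiling %s",
  R.version.string,
  R.version$platform,
  if (has_profmem) "enabled" else "disabled"
))
```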
To be able to do:
p1 <- profmem({
  x <- 1:1000
  p2 <- profmem({
    y <- as.double(x)
  })
  z <- x * y
})
Where p1 should hold all memory allocations including those done during p2.
Rprofmem() itself can only handle a single file; from ?Rprofmem:
Enabling profiling automatically disables any existing profiling to another or the same file.
which is also confirmed when inspecting the internal code.
In develop, profmem_begin/end() now protects against the stack growing deeper than one level, in order to avoid overwriting existing profiling logs by mistake.
It should not be impossible to have the profmem package orchestrate a stack of files. One idea is to have profmem_begin() parse the existing profile buffer and stack it before starting over with a fresh one. Calls to profmem_end() will consume the current buffer and append it to any previous one existing at the same level. If not at the top level, it will then resume profiling to a fresh profile file.
The threshold argument should also be recorded in the stack.
If a nested profmem is created, its threshold must not be higher than its parent's. Generate a warning and fall back to the active threshold.
If the thresholds differ, then the recorded entries need to be filtered accordingly when returned.
There should be a method for retrieving the threshold used.
The print() method should also report on the threshold used.
Drop the threshold argument from profmem_resume() and use what's on the stack instead.
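The stacking idea described above can be illustrated with a small simulation. This is only a sketch of the bookkeeping, not profmem code: it makes no Rprofmem() calls, allocations are fed in by hand, and all names (state, sketch_begin, sketch_record, sketch_end) are hypothetical.

```r
# Illustrative simulation only: no real Rprofmem() calls are made.
# Each stack level holds its threshold and the allocation sizes seen so far.
state <- new.env()
state$stack <- list()

sketch_begin <- function(threshold = 0L) {
  depth <- length(state$stack)
  if (depth > 0L) {
    parent <- state$stack[[depth]]$threshold
    if (threshold > parent) {
      warning("Nested threshold is higher than its parent's; using ", parent)
      threshold <- parent
    }
  }
  state$stack[[depth + 1L]] <- list(threshold = threshold, bytes = numeric(0))
  invisible(threshold)
}

# Stand-in for an allocation being logged at the current (top) level
sketch_record <- function(bytes) {
  depth <- length(state$stack)
  stopifnot(depth > 0L)
  if (bytes >= state$stack[[depth]]$threshold) {
    state$stack[[depth]]$bytes <- c(state$stack[[depth]]$bytes, bytes)
  }
  invisible(NULL)
}

sketch_end <- function() {
  depth <- length(state$stack)
  stopifnot(depth > 0L)
  top <- state$stack[[depth]]
  state$stack[[depth]] <- NULL            # pop the current level
  if (depth > 1L) {
    # Append to the parent, filtered by the parent's own threshold
    parent <- state$stack[[depth - 1L]]
    keep <- top$bytes[top$bytes >= parent$threshold]
    state$stack[[depth - 1L]]$bytes <- c(parent$bytes, keep)
  }
  top
}

sketch_begin(threshold = 1000L)   # outer level
sketch_record(4048)
sketch_begin(threshold = 0L)      # nested level with a lower threshold
sketch_record(48)
sketch_record(8048)
inner <- sketch_end()             # inner$bytes holds 48 and 8048
outer <- sketch_end()             # outer$bytes holds 4048 and 8048
```

The 48-byte entry is visible to the nested level but filtered out when merged into the outer level, which is the filtering step the notes above call for.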
$ for pkg in $pkgs; do echo "$pkg:"; (cd "$pkg"; grep -E "^[ \t]*[^#].*data[.]frame" -- */*.R | grep -vF stringsAsFactors;); echo; read -r -p "Press ENTER to continue ..."; done
profmem:
R/profmem.R: empty <- structure(empty, class = c("Rprofmem", "data.frame"), threshold = 0L)
R/Rprofmem-class.R:as.data.frame.Rprofmem <- function(x, ...) {
R/Rprofmem-class.R: res <- data.frame(what = what, bytes = bytes, calls = traces,
R/Rprofmem-class.R:} ## as.data.frame()
R/Rprofmem-class.R: data <- as.data.frame(x, ...)
tests/profmem.R: data <- as.data.frame(p)
tests/profmem.R: d <- as.data.frame(p)
tests/profmem.R: d1 <- as.data.frame(p1)
tests/profmem.R: d2 <- as.data.frame(p2)
The vignette reports:
Memory allocations (>= 1000 bytes):
[...]
Where is that 1000 threshold coming from?
Example:
> p <- profmem::profmem({ y <- log("a") })
> p
Rprofmem memory profiling of:
{
y <- log("a")
}
Memory allocations:
what bytes calls
total 0
> p <- profmem::profmem({ y <- non.existing::foo() })
> p
Rprofmem memory profiling of:
{
y <- non.existing::foo()
}
Memory allocations:
what bytes calls
total 0
and
> p <- profmem::profmem({ stop("boom") })
> p
Rprofmem memory profiling of:
{
stop("boom")
}
Memory allocations:
what bytes calls
total 0
Capture error and re-signal as a warning;
p <- profvis::profvis(log("a"), errors = "ignore") ## the default
p <- profvis::profvis(log("a"), errors = "warn")
Warning: profvis() detected a run-time error:
Error in log("a") : non-numeric argument to mathematical function
p <- profvis::profvis(log("a"), errors = "error")
Error in log("a") : non-numeric argument to mathematical function
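A comparable option for profmem could be sketched as follows, assuming (as reported in a related issue) that the profiling result carries the captured condition in an "error" attribute. The helper resignal_error and its errors argument are hypothetical and only mirror the profvis behaviour shown above; the stand-in object lets the sketch run without profmem.

```r
# Hypothetical helper (not part of profmem): act on the "error" attribute
# of a profiling result according to `errors`.
resignal_error <- function(p, errors = c("ignore", "warn", "error")) {
  errors <- match.arg(errors)
  err <- attr(p, "error")
  if (is.null(err) || errors == "ignore") {
    return(invisible(p))
  }
  if (errors == "warn") {
    warning("profmem() detected a run-time error: ",
            conditionMessage(err), call. = FALSE)
    return(invisible(p))
  }
  stop(err)  # errors == "error": re-signal the original condition
}

# Stand-in for a profiling result, so the sketch runs without profmem
p <- structure(data.frame(), error = simpleError("boom"))

ignored <- resignal_error(p, "ignore")
warned  <- tryCatch(resignal_error(p, "warn"),
                    warning = function(w) conditionMessage(w))
failed  <- tryCatch(resignal_error(p, "error"),
                    error = function(e) conditionMessage(e))
```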
print() should not report on NAs by default because there can be quite a few at times. It could have a line saying:
Number of NA entries not displayed: 34
Add option to print() and as.data.frame() to include them.
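As a rough sketch of that behaviour, assuming the data frame has a bytes column as in the printed output elsewhere in these issues, the filtering and the summary line could look like this. The helper name drop_na_entries and the example data are made up for illustration.

```r
# Hypothetical helper: drop rows whose `bytes` is NA and report how many
# entries were hidden.
drop_na_entries <- function(data) {
  is_na <- is.na(data$bytes)
  if (any(is_na)) {
    message("Number of NA entries not displayed: ", sum(is_na))
  }
  data[!is_na, , drop = FALSE]
}

# Made-up example data with one NA entry
d <- data.frame(what = "alloc", bytes = c(40048, NA, 4000048),
                stringsAsFactors = FALSE)
shown <- drop_na_entries(d)
```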
Integers 5000L in code snippets are displayed as 5000. This is due to a limitation in R.utils::withCapture():
> R.utils::withCapture(list(a=5000, b=5000L))
> list(a = 5000, b = 5000L)
$a
[1] 5000
$b
[1] 5000
which in turn is due to a limitation in how print() outputs integers:
> list(a = 5000, b = 5000L)
$a
[1] 5000
$b
[1] 5000
> 5000
[1] 5000
> 5000L
[1] 5000
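For comparison, base R's deparse() does keep the integer suffix, so one possible way to distinguish the two is to show deparsed values rather than printed ones. This is only an illustration of the difference, not a proposed change to R.utils:

```r
x <- list(a = 5000, b = 5000L)

# print() shows both elements as 5000
print(x$a)
print(x$b)

# deparse() keeps the L suffix on the integer
code <- deparse(x)
b_only <- deparse(x$b)
```

Here code is the string "list(a = 5000, b = 5000L)" and b_only is "5000L".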