Comments (14)
I'm amazed! I was wondering whether cabal might somehow pick a different, stale executable, but I can still reproduce the issue when running the executables directly.
from tasty-bench.
I can reproduce the issue, but the slowdown cannot be attributed to the `--pattern` option. If you pass `-p ers` (so that both readers and writers match), performance is as fast as without `-p`. And vice versa: if you do not pass `-p` at all, but just comment out `bgroup "readers"`, performance does degrade.
Good sleuthing! And what a weird effect!
I was able to get somewhat better results with this patch: https://github.com/Bodigrim/pandoc/commit/95aa335146a650544dc7dd4cbaf8d3cc9bf12c95

I think the rest is a weird code layout / sharing / laziness issue, business as usual (sigh), but I do not know pandoc well enough to investigate further.
Oof, sorry for bothering you with this. I didn't expect that the problem would be on the pandoc side!
I can reproduce this too, and I have been trying various things (increasing strictness, forcing values, etc.), to no avail. What's really odd is that if you use the pattern `writers`, everything is slower, but if you use `asciidoc`, it's fast again, even though there is no asciidoc reader, so it's only writers that are being tested in this case.
Btw, I tried the patch above and it didn't make the problem go away.
I tried switching to `gauge`, and I found that:

- I no longer get different results depending on whether I use the pattern `writers`
- I get longer run times than reported by tasty-bench (~9.3 ms for the asciidoc writer, compared to 5.0 or 6.9 ms with tasty-bench, depending on the pattern used)

This, and the fact that I couldn't get the funny behavior to go away by increasing strictness or making other changes in the benchmark suite, leads me to believe that this could in fact be an issue with tasty-bench.
Generally speaking, it is expected that executing only selected benchmarks (or the same benchmarks in a different order) can affect their measurements. That's because all benchmarks pay a tax for GC, which depends on the global heap layout. More often than not, executing fewer benchmarks makes them faster; for example, consider the following scenario:
```haskell
import Data.List (genericLength)
import Test.Tasty.Bench

testData :: String
testData = replicate 1000000 'a'

main :: IO ()
main = defaultMain
  [ bench "length" $ nf length testData
  , bench "square" $ nf (^2) 10
  , bench "genericLength" $ nf genericLength testData
  ]
```
Here the first benchmark allocates a huge amount of heap, which is retained because the third benchmark also uses it. That means the second benchmark will be really slow: each GC kicking in during its execution is extremely expensive. Now, if one reruns this suite with `-p square`, so that instead of a huge string only a small thunk is kept on the heap, the results will be drastically faster.
It is a bit less expected and more counterintuitive that executing only a few benchmarks can make them slower. A natural hypothesis would be that `--pattern` matching is expensive, but as discussed above this is not the case.
My (uneducated) guess is that since `readers` benchmarks involve the corresponding `writers` as well (e.g., `commonmark`), there is something funny with sharing going on. Say, there is a thunk referenced from both bench groups; as long as the second group is present, GC will never prune it. But once `readers` is disabled, GC eagerly prunes this thunk, causing its re-evaluation and slowing down `writers`. Dunno, it's hard for me to tell.
The issue of heap layout plagues all Haskell benchmarks. It was reported for `criterion`, and while in this particular case `gauge` seems stable, it is not immune to it either. One workaround is to run each benchmark in a separate process. If you are interested in exploring this path, I can come up with a Bash incantation.
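One possible shape for such an incantation, assuming the suite's executable accepts tasty's standard `-l` flag to list benchmark names and `-p` with an awk-style pattern (the executable path below is a placeholder; adjust it to whatever your project builds):

```shell
# Run every benchmark of a tasty-bench suite in its own process, so one
# benchmark's heap layout cannot affect another's measurements.
# Usage: run_isolated path/to/bench-executable
run_isolated() {
  bench="$1"
  # `-l` lists full benchmark names, one per line; `-p` with an awk-style
  # pattern `$0 == "name"` selects exactly one of them.
  "$bench" -l | while IFS= read -r name; do
    "$bench" -p "\$0 == \"$name\""
  done
}
```

Benchmark names containing double quotes would need extra escaping, but that is rare in practice.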
Why am I blaming heap layout and GC for this conundrum? If I run pandoc benchmarks with `-p ers +RTS -s`, I see 50% of time spent in GC (already quite bad for getting stable benchmark results). But if I run with `-p writers +RTS -s`, time spent in GC grows to 60%! I cannot really attribute this change to anything inherent to `tasty-bench`: we still use a pattern, and `readers` / `writers` benchmarks are relatively long (several ms), so it's not like `tasty-bench` bookkeeping can eat comparable resources.
Now, if I increase the nursery size with `-A256m`, the problem is gone, at least on my machine: `-p writers +RTS -A256m` produces the same measurements as just `+RTS -A256m`. That's the reason why I think the issue is caused by the RTS and heap layout and not by `tasty-bench`.
I suggest running benchmarks with an increased nursery (`+RTS -A256m`) to eliminate this kind of noise.
(Upd.: I copied `-A256m` from pandoc's `cabal.project`; in fact something like `-A32m` could probably do as well.)
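One way to make the larger nursery the default, so nobody has to remember to pass it by hand, is to bake the RTS options into the benchmark executable via its `.cabal` stanza. A sketch, where the component name and `main-is` are placeholders to be adjusted to your project:

```
benchmark benchmarks
  type:        exitcode-stdio-1.0
  main-is:     benchmark.hs
  ghc-options: -rtsopts "-with-rtsopts=-A32m"
```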
Thanks for the explanation and the observations -- and also for the suggestion!
I will try it (and maybe move back to tasty-bench, which I like).
Reporting back: using `+RTS -A256m` for the benchmarks does fix things. (The effect of `--pattern` disappears, all the timings get shorter, and the timings from gauge and tasty-bench line up.) So I think you correctly diagnosed this issue and it can be closed.
Awesome! It would be great if this advice was documented!
@sjakobi I updated documentation in 2b4d1e4
Looks great! Thank you! :)