Comments (14)

sjakobi commented on May 30, 2024

I'm amazed! I was wondering whether cabal might somehow pick a different, stale executable, but I can still reproduce the issue when running the executables directly.

from tasty-bench.

Bodigrim commented on May 30, 2024

I can reproduce the issue, but the slowdown cannot be attributed to --pattern option. If you pass -p ers (so that both readers and writers match), performance is as fast as without -p. And vice versa: if you do not pass -p at all, but just comment out bgroup "readers", performance does degrade.

sjakobi commented on May 30, 2024

Good sleuthing! And what a weird effect!

Bodigrim commented on May 30, 2024

I was able to get somewhat better results with this patch: https://github.com/Bodigrim/pandoc/commit/95aa335146a650544dc7dd4cbaf8d3cc9bf12c95
I think the rest is a weird code layout / sharing / laziness issue, business as usual (sigh), but I do not know pandoc well enough to investigate further.

sjakobi commented on May 30, 2024

Oof, sorry for bothering you with this. I didn't expect that the problem would be on the pandoc side!

jgm commented on May 30, 2024

I can reproduce this too, and I have been trying various things (increasing strictness, forcing values, etc.), to no avail. What's really very odd is that if you use the pattern writers, everything is slower, but if you use asciidoc, it's fast again, even though there is no asciidoc reader, so it's only writers that are being tested in this case.

jgm commented on May 30, 2024

Btw, I tried the patch above and it didn't make the problem go away.

jgm commented on May 30, 2024

I tried switching to gauge, and I found that

  • I no longer get different results depending on whether I use the pattern writers
  • I get longer run times than reported by tasty-bench (~9.3 ms for the asciidoc writer, compared to 5.0 or 6.9 ms with tasty-bench, depending on the pattern used)

This, together with the fact that I couldn't make the funny behavior go away by increasing strictness or other changes in the benchmark suite, leads me to believe that this could in fact be an issue with tasty-bench.

Bodigrim commented on May 30, 2024

@jgm

Generally speaking, it is expected that executing only selected benchmarks (or the same benchmarks in a different order) can affect their measurements. That's because all benchmarks pay a tax for GC, which depends on the global heap layout. More often, executing fewer benchmarks makes them faster; consider, e.g., the following scenario:

import Data.List (genericLength)
import Test.Tasty.Bench

testData :: String
testData = replicate 1000000 'a'

main :: IO ()
main = defaultMain
  [ bench "length" $ nf length testData
  , bench "square" $ nf (^2) (10 :: Int)
  , bench "genericLength" $ nf genericLength testData
  ]

Here the first benchmark allocates a huge amount of heap, which is retained because the third benchmark also uses it. This means the second benchmark will be really slow: each GC that kicks in during its execution is extremely expensive. Now, if one reruns this suite with -p square, so that only a small thunk is kept in the heap instead of a huge string, the results will be drastically faster.

It is less expected, and more counterintuitive, that executing only a few benchmarks can make them slower. A natural hypothesis would be that --pattern itself is expensive, but as discussed above this is not the case.

My (uneducated) guess is that since the readers benchmarks involve the corresponding writers as well (e.g., commonmark), there is something funny going on with sharing. Say there is a thunk which is referenced from both bench groups; as long as the second group is present, GC will never prune it. But once the readers group is disabled, GC eagerly prunes this thunk, causing its reevaluation and slowing down writers. Dunno, it's hard for me to tell.

The issue of heap layout plagues all Haskell benchmarks. It has been reported for criterion, and while gauge seems stable in this particular case, it is not immune either. One workaround is to run each benchmark in a separate process. If you are interested in exploring this path, I can come up with a Bash incantation.
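Following up on the per-process idea, here is a hedged sketch of such a Bash incantation. The executable name `./bench` is illustrative, not pandoc's actual setup, and the binary must have been linked with GHC's -rtsopts to accept +RTS flags. It relies on tasty's -l flag to list benchmark names and tasty's awk-style pattern language for exact matches:

```shell
#!/usr/bin/env bash
# Hypothetical benchmark binary; substitute your own executable path.
bench=./bench

# "-l" (tasty's --list-tests) prints one benchmark name per line.
"$bench" -l 2>/dev/null | while IFS= read -r name; do
  # One benchmark per process: every run starts from a fresh heap,
  # so benchmarks cannot perturb each other's GC behaviour.
  "$bench" -p '$0 == "'"$name"'"' +RTS -A256m -RTS
done
```

The exact-match pattern ($0 == "name") avoids accidentally running several benchmarks whose names merely share a substring.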

Why am I blaming heap layout and GC for this conundrum? If I run pandoc benchmarks with -p ers +RTS -s, I see 50% of time spent in GC (already quite bad for obtaining stable benchmarking results). But if I run with -p writers +RTS -s, the time spent in GC grows to 60%! I cannot really attribute this change to anything inherent to tasty-bench: we still use a pattern, and the readers / writers benchmarks are relatively long (several ms), so it's not like tasty-bench bookkeeping can eat comparable resources.

Now, if I increase nursery size with -A256m, the problem is gone, at least on my machine: -p writers +RTS -A256m produces the same measurements as just +RTS -A256m. That's the reason why I think that the issue is caused by RTS and heap layout and not by tasty-bench.

I suggest running benchmarks with an increased nursery (+RTS -A256m) to eliminate this kind of noise.

(Upd.: I copied -A256m from pandoc's cabal.project, in fact something like -A32m could probably do as well)
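For reference, one way to forward these RTS options without touching any code, assuming the benchmark executable was compiled with GHC's -rtsopts flag (a sketch, not pandoc's actual invocation):

```shell
# Pass RTS options through cabal to the benchmark executable.
# Alternatively, bake the default in at build time with
#   ghc-options: -rtsopts "-with-rtsopts=-A256m"
cabal bench --benchmark-options='+RTS -A256m -RTS'
```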

jgm commented on May 30, 2024

Thanks for the explanation and the observations -- and also for the suggestion!
I will try it (and maybe move back to tasty-bench, which I like).

jgm commented on May 30, 2024

Reporting back: using +RTS -A256m for the benchmarks does fix things. (The effect of --pattern disappears, all the timings get shorter, and the timings from gauge and tasty-bench line up.) So I think you correctly diagnosed this issue and it can be closed.

sjakobi commented on May 30, 2024

Awesome! It would be great if this advice were documented!

Bodigrim commented on May 30, 2024

@sjakobi I updated the documentation in 2b4d1e4

sjakobi commented on May 30, 2024

Looks great! Thank you! :)
