Comments (6)
Hi, thank you for your interest in TruffleRuby. How long does the benchmark run? TruffleRuby needs time to warm up. Please also make sure that Truffle::Graal.graal? returns true. TruffleRuby's BigDecimal is essentially a wrapper around Java's BigDecimal.
from truffleruby.
@pitr-ch Truffle::Graal.graal? returns true.
Changing the code as below I get better results but still 492ms for Truffle and 92ms for CRuby.
require 'bigdecimal'
cnt = 0
while true do
  cc = BigDecimal(cnt)
  cnt += 1
  break if cnt > 100000
end
cnt = 0
t1 = Time.now
while true do
  cc = BigDecimal(cnt)
  cnt += 1
  break if cnt > 100000
end
puts (Time.now - t1) * 1000
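To make warm-up visible rather than guessing at it, the same loop can be timed over several consecutive rounds. This is only a minimal sketch: the helper name `time_rounds` and the round count of 5 are assumptions for illustration, not part of the original benchmark.

```ruby
require 'bigdecimal'

# Time `rounds` consecutive runs of the given block and return each
# duration in milliseconds. A monotonic clock avoids wall-clock jumps.
def time_rounds(rounds)
  Array.new(rounds) do
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000.0
  end
end

# Same work as the loop above (cnt runs from 0 to 100000 inclusive).
timings = time_rounds(5) do
  cnt = 0
  while cnt <= 100_000
    BigDecimal(cnt)
    cnt += 1
  end
end

timings.each_with_index do |ms, i|
  puts format('round %d: %.1f ms', i + 1, ms)
end
```

If later rounds are much faster than the first, the first round was mostly measuring warm-up.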
I rewrote your benchmark using benchmark-ips, which is designed to accommodate optimising implementations of Ruby.
require 'bigdecimal'
require 'benchmark/ips'
Benchmark.ips do |x|
  x.iterations = 3
  x.report("bigdecimal") do
    cnt = 0
    while true do
      cc = BigDecimal(cnt)
      cnt += 1
      break if cnt > 100000
    end
  end
end
MRI 2.3.3: 4.276k (±12.8%) i/s
Rubinius 3.60: 2.554 (±78.3%) i/s
JRuby 9.1.6.0: 11.310 (± 8.8%) i/s
GraalVM 0.19: crashes :)
So we have a bug in compiling BigDecimal code, which we need to fix. The fact that you weren't seeing the bug also means you weren't triggering compilation with that benchmark, which is why I used benchmark-ips. Also note that JRuby is slow with BigDecimal. I think Java's BigDecimal is a bit slow, and we use that as well, so I don't expect TruffleRuby to be much faster either. I'm not sure what is wrong with Rubinius.
BigDecimal is not something that is going to be much faster on TruffleRuby. All Ruby implementations will use a system version of BigDecimal that is probably already native code that is optimised well. There isn't much actual Ruby code here to be optimised outside of that native code.
I wrote about why I rewrote your benchmark at https://github.com/graalvm/truffleruby/blob/truffle-head/doc/user/reporting-performance-problems.md.
Thanks @chrisseaton
This looks from experience like it's going to be a relatively simple problem to fix, so for interest I thought I'd document how I fix issues like this in TruffleRuby.
Now that the benchmark uses benchmark-ips, Graal attempts to compile the code and we see errors being reported. The errors don't stop the program, because an error in the compiler doesn't mean the program can't continue. We'd like to stop when the error occurs though, so I use the Graal option -J-Dgraal.TruffleCompilationExceptionsAreFatal=true. Now I see the error and the program stops.
I then switched away from GraalVM to a development repository of TruffleRuby with a build of the latest graal-core. The error is still there.
The error output is really verbose and appears like a kind of stack trace. I'll read it from the bottom up and I see the stack trace from the compiler at the bottom there.
...
org.graalvm.compiler.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:214)
at org.graalvm.compiler.replacements.PEGraphDecoder.tooDeepInlining(PEGraphDecoder.java:729)
at org.graalvm.compiler.replacements.PEGraphDecoder.doInline(PEGraphDecoder.java:585)
at org.graalvm.compiler.replacements.PEGraphDecoder.tryInline(PEGraphDecoder.java:570)
at org.graalvm.compiler.replacements.PEGraphDecoder.trySimplifyInvoke(PEGraphDecoder.java:490)
at org.graalvm.compiler.replacements.PEGraphDecoder.handleInvoke(PEGraphDecoder.java:464)
at org.graalvm.compiler.nodes.GraphDecoder.processNextNode(GraphDecoder.java:550)
at org.graalvm.compiler.nodes.GraphDecoder.decode(GraphDecoder.java:393)
at org.graalvm.compiler.replacements.PEGraphDecoder.decode(PEGraphDecoder.java:398)
...
The error is too deep inlining in the partial evaluator graph decode phase. The partial evaluator combines the runtime data structure that is the AST of this benchmark with the code that is the interpreter implementation methods written in Java, and partially evaluates (executes) as much of it as it can, leaving only the parts that depend on runtime data, which it will then feed into the rest of the Graal compiler. The graph decoder is the part that takes a Java program in bytecode and produces a Graal graph, on which the partial evaluator algorithm runs.
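As a toy illustration of the general idea only (this is not how Truffle or Graal implement it, and all names here are hypothetical), the sketch below partially evaluates a tiny arithmetic AST. It executes every subtree whose inputs are already known and leaves a residual program containing only the parts that depend on runtime data. It is much simpler than the real thing: here the "known" part is just constant operands, whereas Truffle treats the whole AST as known and specialises the interpreter's Java code against it.

```ruby
# Toy AST nodes: [:const, n], [:var, name], [:add, l, r], [:mul, l, r].

# Fold every subtree that does not depend on a runtime variable and
# return the residual AST.
def specialize(node)
  case node[0]
  when :const, :var
    node
  when :add, :mul
    left = specialize(node[1])
    right = specialize(node[2])
    if left[0] == :const && right[0] == :const
      value = node[0] == :add ? left[1] + right[1] : left[1] * right[1]
      [:const, value]
    else
      [node[0], left, right]
    end
  else
    raise ArgumentError, "unknown node type: #{node[0].inspect}"
  end
end

# Plain interpreter for the same AST, used to check that the residual
# program still computes the same result as the original.
def evaluate(node, env)
  case node[0]
  when :const then node[1]
  when :var then env.fetch(node[1])
  when :add then evaluate(node[1], env) + evaluate(node[2], env)
  when :mul then evaluate(node[1], env) * evaluate(node[2], env)
  else raise ArgumentError, "unknown node type: #{node[0].inspect}"
  end
end

program = [:add, [:mul, [:const, 2], [:const, 3]], [:var, :x]]
residual = specialize(program)
p residual                       # prints [:add, [:const, 6], [:var, :x]]
p evaluate(residual, { x: 4 })   # prints 10
```

The multiplication has disappeared from the residual program because it could be executed ahead of time; only the addition involving the runtime value `x` remains.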
Looking further up, I see what looks like a kind of stack overflow, or infinite recursion.
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.squareToomCook3(BigInteger.java:2049)
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.squareToomCook3(BigInteger.java:2049)
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.squareToomCook3(BigInteger.java:2049)
java.math.BigInteger.square(BigInteger.java:1899)
java.math.BigInteger.multiply(BigInteger.java:1491)
java.math.BigInteger.pow(BigInteger.java:2302)
java.math.BigDecimal.bigTenToThe(BigDecimal.java:3543)
java.math.BigDecimal.bigDigitLength(BigDecimal.java:3820)
java.math.BigDecimal.precision(BigDecimal.java:2240)
java.math.BigDecimal.doRound(BigDecimal.java:3988)
java.math.BigDecimal.plus(BigDecimal.java:2195)
org.truffleruby.stdlib.bigdecimal.CreateBigDecimalNode.create(CreateBigDecimalNode.java:83)
The partial evaluator inlines all Java methods. This is essential because the AST interpreter is composed of lots of little methods, and if we didn't inline them we wouldn't see many optimisation opportunities. To prevent inlining you use an annotation called @TruffleBoundary.
If your program recurses infinitely, Truffle will try to inline all those calls, and eventually it'll run out of memory (or actually it'll realise it's going too far and stop, which is what has happened here).
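A rough sketch of that behaviour, under loud assumptions: the call graph is a simplified, hypothetical version of the stack trace above, the depth budget of 10 is arbitrary, and `inline_all` is an illustrative helper, not Graal's algorithm. It shows an always-inline policy giving up on a recursive cycle, and a boundary marker stopping it from entering the cycle at all.

```ruby
class TooDeepInlining < StandardError; end

# Hypothetical, simplified call graph: method name => methods it calls.
CALL_GRAPH = {
  'create' => ['round'],
  'round' => ['plus'],
  'plus' => ['square'],
  'square' => ['squareToomCook3'],
  'squareToomCook3' => ['square']
}.freeze

# Follow every call as an always-inline policy would and return the
# number of method bodies inlined. Callees listed in `boundaries` are
# left as ordinary calls and not entered. Raises once `max_depth` is
# exceeded, which is what happens on a recursive cycle.
def inline_all(name, boundaries, max_depth, depth = 0)
  raise TooDeepInlining, "gave up inlining #{name} at depth #{depth}" if depth > max_depth

  callees = CALL_GRAPH.fetch(name, [])
  1 + callees.sum do |callee|
    boundaries.include?(callee) ? 0 : inline_all(callee, boundaries, max_depth, depth + 1)
  end
end

begin
  inline_all('create', [], 10)
rescue TooDeepInlining => e
  puts "without a boundary: #{e.message}"
end

puts "with a boundary on round: #{inline_all('create', ['round'], 10)} method(s) inlined"
```

At run time the real recursion terminates, but an inliner that only sees the call structure has no way to know that.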
I look at where the recursion starts, and look at the last bit of code that we wrote, which is this CreateBigDecimalNode.create. Looking at this there is some complexity here - creating a BigDecimal appears to involve reading some kind of global state in the mode variable. This isn't unusual for Ruby.
The stack trace shows that there is a call to BigDecimal.plus, but looking at the Java code there isn't. Sometimes the partial evaluator can't report exact source information. I'm not sure why. But I can see a call to BigDecimal.round, which I know calls BigDecimal.plus.
It looks like BigDecimal.plus is just not code that it makes sense to partially evaluate, since it has this recursion that is not statically knowable (or dynamically knowable with the profiling information we have) to be bounded. Rounding logic is often complex. The first thing I'll try is adding one of those @TruffleBoundary annotations to prevent the partial evaluator inlining this call to round (and so to plus).
That works. No more errors. There are likely some other errors in BigDecimal code like this because we clearly haven't exercised BigDecimal in the compiler much. If you try another BigDecimal benchmark you might see another similar error. We have a systematic way to fix these - we can run the specs with the compiler set to be very aggressive and always compiling things, but I don't have time right now to really dig into BigDecimal.
I was running the benchmark incorrectly last time (I left a |times| parameter in the benchmark-ips block, which makes it behave differently, but then I wasn't actually using that parameter), so here are some fresh results.
TruffleRuby:
Warming up --------------------------------------
bigdecimal 1.000 i/100ms
bigdecimal 1.000 i/100ms
bigdecimal 6.000 i/100ms
Calculating -------------------------------------
bigdecimal 64.757 (±17.0%) i/s - 312.000 in 5.044669s
bigdecimal 65.775 (±13.7%) i/s - 324.000 in 5.054271s
bigdecimal 69.920 (± 2.9%) i/s - 354.000 in 5.067837s
MRI 2.4.0:
Warming up --------------------------------------
bigdecimal 1.000 i/100ms
bigdecimal 1.000 i/100ms
bigdecimal 1.000 i/100ms
Calculating -------------------------------------
bigdecimal 13.369 (± 7.5%) i/s - 67.000 in 5.022234s
bigdecimal 13.166 (± 0.0%) i/s - 66.000 in 5.018765s
bigdecimal 13.580 (± 7.4%) i/s - 68.000 in 5.016456s
JRuby 9.1.6.0:
Warming up --------------------------------------
bigdecimal 20.000 i/100ms
bigdecimal 23.000 i/100ms
bigdecimal 22.000 i/100ms
Calculating -------------------------------------
bigdecimal 227.979 (± 3.1%) i/s - 1.144k in 5.023113s
bigdecimal 221.881 (± 4.1%) i/s - 1.122k in 5.064217s
bigdecimal 226.781 (± 4.4%) i/s - 1.144k in 5.053341s
Rubinius 3.60:
Warming up --------------------------------------
bigdecimal 1.000 i/100ms
bigdecimal 1.000 i/100ms
bigdecimal 1.000 i/100ms
Calculating -------------------------------------
bigdecimal 2.082 (± 0.0%) i/s - 10.000
bigdecimal 2.562 (±78.1%) i/s - 10.000
bigdecimal 2.931 (±68.2%) i/s - 10.000
So relative to MRI, Rubinius runs at about 0.2x the speed (roughly 5x slower), JRuby is a whopping 16.7x faster, and TruffleRuby is 5x faster. So we still have some work to do there - there's no reason we should be any slower than JRuby - but as I say I don't have time now to work on it for the sake of it. If you give me more BigDecimal issues which you actually encounter and actually want fixed I'll fix them though. I'm not sure what is wrong with Rubinius. You could open an issue for that if you wanted, as I think they want issues for when they're slower than MRI.
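For anyone who wants to check those ratios, here is a small sketch that recomputes them from the i/s figures listed above. Averaging the three measurement rounds is an assumption (the comment doesn't say how the quoted figures were derived), so the output may differ slightly from the numbers quoted. The helper names are illustrative.

```ruby
# Iterations per second from the three measurement rounds reported above.
RESULTS = {
  'MRI 2.4.0' => [13.369, 13.166, 13.580],
  'TruffleRuby' => [64.757, 65.775, 69.920],
  'JRuby 9.1.6.0' => [227.979, 221.881, 226.781],
  'Rubinius 3.60' => [2.082, 2.562, 2.931]
}.freeze

# Arithmetic mean of a list of floats.
def mean(values)
  values.sum / values.size
end

# Throughput of each implementation as a multiple of the baseline's
# throughput (1.0 means "same speed as the baseline").
def speed_relative_to(results, baseline)
  base = mean(results.fetch(baseline))
  results.transform_values { |runs| mean(runs) / base }
end

speed_relative_to(RESULTS, 'MRI 2.4.0').each do |name, ratio|
  puts format('%-14s %.2fx the throughput of MRI', name, ratio)
end
```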
Thanks for the issue! Fix is in 26e6a40, but it might not make it into GraalVM 0.20 as it's very close to the release.