Comments (13)

siddhesh commented on August 27, 2024

@lukego we have set up a CI loop for luajit on the Linaro CI to run tests on commits to v2.1 on arm64:

https://ci.linaro.org/job/luajit-aarch64-perf/

We'd be happy to add an x86_64 node to it if you have one, or to add one ourselves.

As for other architectures, please feel free to ping me either on this issue or personally to have more nodes added to the trigger. At some point we also need to figure out a place to report the results.

from luajit-test-cleanup.

corsix commented on August 27, 2024

corsix/x64 was effectively merged into v2.1, so I don't expect to be making any more commits to it. corsix/newgc on the other hand...

lukego commented on August 27, 2024

@corsix Roger. I updated the config to test newgc instead of x64. The results will automatically go up on the permalink above.

lukego commented on August 27, 2024

Is it hopelessly naive to simply run the benchmarks by evaluating them with no arguments? https://github.com/lukego/LuaJIT-branch-tests/blob/5043523d6cb59d35e7ecf5ee51f2253ab75d8675/default.nix#L57. I suppose that I should at least save the output to check if they are really working. Some execute very quickly.

@corsix do you need any special build options for newgc?

MikePall commented on August 27, 2024

@lukego Maybe you missed those bench/PARAM* files that contain the N arguments to each benchmark? Scale as appropriate to give a run time of a couple of seconds each. No point in running these more than a dozen times.

Consider verifying the checksum of the benchmark output against known good checksums for each N. E.g. generated with plain Lua or the C equivalents of the tests (you really need this for larger N).
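That verification scheme can be sketched in a few lines. A minimal Python sketch for illustration: the benchmark names, N values, and file layout below are hypothetical, the real N arguments live in the bench/PARAM_* files, and known-good digests would be generated with plain Lua or the C reference programs as suggested above.

```python
import hashlib
import subprocess

def checksum(output: bytes) -> str:
    # SHA-256 digest of a benchmark's stdout, to be compared against a
    # known-good digest for the same N.
    return hashlib.sha256(output).hexdigest()

# Hypothetical benchmark names and N values, for illustration only.
PARAMS = {"binarytrees": 16, "fannkuch": 10}
KNOWN_GOOD = {}  # e.g. {("binarytrees", 16): "<digest from a plain Lua run>"}

def run_and_verify(luajit="luajit", bench_dir="bench"):
    for name, n in PARAMS.items():
        out = subprocess.run(
            [luajit, f"{bench_dir}/{name}.lua", str(n)],
            capture_output=True, check=True,
        ).stdout
        expected = KNOWN_GOOD.get((name, n))
        if expected is not None and checksum(out) != expected:
            raise RuntimeError(f"{name} N={n}: output checksum mismatch")
```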

Note that mandelbrot suffers from numerical instability and may give different results depending on fused vs. unfused FP arithmetic on some platforms (JIT-compiled, i.e. fused, is actually more accurate). And partialsums depends on the accuracy of a couple of math library functions, which isn't very good on some platforms.
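The fused vs. unfused effect is easy to demonstrate outside LuaJIT. A small Python illustration (not the mandelbrot benchmark itself): rounding the product to a double before the addition, as unfused multiply-then-add does, can lose a low bit that a fused multiply-add keeps.

```python
from fractions import Fraction

# a*a is exactly 1 + 2**-26 + 2**-54, which does not fit in a double's
# 53-bit significand; c cancels everything except the 2**-54 bit.
a = 1.0 + 2.0**-27
c = -(1.0 + 2.0**-26)

# Unfused: the product a*a is rounded to a double first, losing 2**-54.
unfused = a * a + c

# Fused (emulated exactly with rationals): one rounding at the very end.
fused = float(Fraction(a) * Fraction(a) + Fraction(c))

print(unfused)  # 0.0
print(fused)    # 5.551115123125783e-17, i.e. 2**-54
```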

lukego commented on August 27, 2024

@MikePall Aha! Thanks for pointing out bench/PARAM*. Just the thing.

For me it is important to run tests 100+ times and to seed them with entropy. While we have issues like LuaJIT/LuaJIT#218 to contend with I think that benchmark results need to be interpreted as probability distributions rather than scalar values.

(The non-determinism is perhaps more important to me than to others. In the Snabb context we absolutely cannot have a situation where you deploy 100 routers and expect 5 of them to have half the capacity of the others. People are currently using lousy workarounds like detecting system overload and calling jit.flush() to roll the dice on a new trace. I need to find a proper solution to this & the CI has to show me improvements and regressions in how dependable performance is in the presence of workload entropy.)
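The "distributions rather than scalar values" point can be made concrete with a toy Python sketch using made-up timings: a bimodal spread (most runs fast, a few runs with unlucky traces slow) is invisible in the mean alone.

```python
import statistics

# Made-up timings for one benchmark: most runs ~1.0s, a couple ~1.5s,
# as if a few runs JIT-compiled an unlucky set of traces.
timings = [1.02, 0.98, 1.01, 1.50, 0.99, 1.03, 1.48, 1.00, 1.02, 0.97]

mean = statistics.mean(timings)      # ~1.10: looks like a modest slowdown
median = statistics.median(timings)  # ~1.015: the typical run is fine
spread = statistics.stdev(timings)   # large spread flags the bimodality

print(f"mean={mean:.3f} median={median:.3f} stdev={spread:.3f}")
```

The mean alone would suggest every run is ~10% slow; the median and spread show instead that most runs are fine and a few are ~50% slow, which is exactly the "5 of 100 routers at half capacity" failure mode.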

lukego commented on August 27, 2024

I have updated the CI to run from PARAM_x86_CI.txt in my branchmarks branch. This is closely based on PARAM_x86.txt, but I removed a couple of benchmarks that seemed to fail or hang.

The results permalink is the same, and hopefully the report is beginning to be meaningful. Each benchmark now takes between 0.1s and 10s, which should be a reasonable range for getting stable and meaningful results.

I have pulled the iteration count down from 100 to 12, so the Relative Standard Deviation graph probably needs to be taken with a grain of salt. I will revisit this when time permits. (Just now I am running all the iterations in a bash loop, which ties up a test server continuously. I should make each run a separate Nix derivation so that the CI can schedule them intelligently, e.g. parallelize across more servers and interleave with other CI tasks instead of blocking them.)
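For reference, the Relative Standard Deviation here is just the standard deviation expressed as a fraction of the mean; a minimal Python sketch, with the caveat that with only 12 samples the estimate is itself quite noisy.

```python
import statistics

def rsd(samples):
    # Relative standard deviation (coefficient of variation):
    # population standard deviation divided by the mean.
    return statistics.pstdev(samples) / statistics.mean(samples)

print(rsd([2, 4, 4, 4, 5, 5, 7, 9]))  # 0.4
```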

The notable difference, by eyeball, is that the report no longer flags corsix/newgc as slower on the binary-trees benchmark. Previously this benchmark ran for only around 0.001 seconds, so the difference may well have been due to some tiny constant factor.

SameeraDes commented on August 27, 2024

I am trying to run the benchmarks in a continuous integration job for the Aarch64 port, which is in v2.1. Is there a central CI system to which the Aarch64 tests can be added, or do I need to set up a completely new CI job?

nico-abram commented on August 27, 2024

@lukego
https://hydra.snabb.co/build/3807227 errors with "Aborted: cannot connect to ‘[email protected]’: ssh: connect to host murren-1.snabb.co port 22: Connection timed out (propagated from build 3807225) "
This (https://hydra.snabb.co/build/3803719) seems to be the most recent passing build.

lukego commented on August 27, 2024

@nico-abram ah yes! The compute hosts running these LuaJIT benchmarks have recently been retired. I didn't think of this job because I haven't seen much activity here over the past few years and don't know how much interest there is.

If you want to run the benchmarks locally and generate the report, you can use the instructions in the RaptorJIT README, which I hope will work with standard LuaJIT too. I'm happy to advise if someone wants to troubleshoot a local setup or run a new CI.

If someone wants to sponsor running and updating a benchmark CI for LuaJIT then I'm also happy to help with that in my professional capacity at Snabb Solutions.

P.S. Here are some of the other ways that I put these tests to use while exploring the contribution of individual optimizations to overall performance:

That last one turned up a potentially important micro-optimization:

Surprisingly interesting to take simple benchmarks and use them to make systematic experiments!

lukego commented on August 27, 2024

@SameeraDes Good question. This CI is based on Nix, and Nix seems to support ARM these days, so it should be possible to add an ARM server to the backend, but I don't know how much hassle to expect. The sticky-tape solution could also be for random machines to post results to Git repos in plain text and for this CI to download those and build/publish the reports.

I am meaning to migrate over to https://www.hercules-ci.com/ but haven't made time for that yet.

SameeraDes commented on August 27, 2024

Thanks for your response, @lukego
I have added a Jenkins-based CI for ARM64 for now. It would be great if we could have a central CI for all LuaJIT perf runs; I am willing to contribute for the ARM64 port.

lukego commented on August 27, 2024

@siddhesh Cool!

I am running a CI for RaptorJIT and related projects that sometimes covers LuaJIT too. I don't have spare machines to contribute to other CIs like yours though so please go ahead with your own.
