renaissance-benchmarks / renaissance

The Renaissance Benchmark Suite

Home Page: https://renaissance.dev

License: GNU General Public License v3.0

Scala 27.88% Shell 0.67% Batchfile 0.36% HTML 0.94% JavaScript 10.43% CSS 2.70% Makefile 0.01% Ruby 0.01% Java 5.77% Less 0.07% SCSS 0.08% SMT 51.08%
benchmark-suite scala java jvm-performance

renaissance's Introduction

Renaissance Benchmark Suite

The Renaissance Benchmark Suite aggregates common modern JVM workloads, including, but not limited to, Big Data, machine learning, and functional programming. The suite is intended for optimizing just-in-time compilers, interpreters, and GCs, for tools such as profilers, debuggers, or static analyzers, and even for evaluating different hardware. It is intended to be an open-source, collaborative project in which the community can propose and improve benchmark workloads.

Running the suite

To run the suite, you will need to download a Renaissance Suite JAR from https://renaissance.dev/download. If you wish to build it yourself, please consult CONTRIBUTING.md for build instructions.

To run a Renaissance benchmark, you need to have JRE 11 (or later) installed and execute the following java command:

$ java -jar 'renaissance-gpl-0.15.0.jar' <benchmarks>

In the above command, <benchmarks> is the list of benchmarks that you want to run. You can refer to individual benchmarks, e.g., scala-kmeans, or a group of benchmarks, e.g., apache-spark.

The suite generally executes the benchmark's measured operation multiple times. By default, the suite executes each benchmark operation for a benchmark-specific number of repetitions. This default is only intended for a quick visual evaluation of benchmark execution time; it is not sufficient for thorough experimental evaluation, which will generally need many more repetitions.

For thorough experimental evaluation, the benchmarks should be repeated many times or executed for a long time. The number of repetitions and the execution time can be set for all benchmarks using the -r or -t option, respectively. More fine-grained control over benchmark execution can be achieved by providing the harness with a plugin implementing a custom execution policy (see below for details).
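The two stopping criteria can be sketched as follows. This is a minimal illustration of the difference between a fixed repetition count and a wall-clock budget; the method names are hypothetical and the real harness logic is more involved.

```java
import java.util.function.LongSupplier;

public class RepetitionSketch {
    // Run the measured operation a fixed number of times (like -r).
    static long runFixedCount(LongSupplier operation, int repetitions) {
        long last = 0;
        for (int i = 0; i < repetitions; i++) {
            last = operation.getAsLong();
        }
        return last;
    }

    // Keep running the measured operation until the wall-clock budget
    // is exhausted (like -t); always runs at least once.
    static int runForSeconds(Runnable operation, long seconds) {
        long deadline = System.nanoTime() + seconds * 1_000_000_000L;
        int executed = 0;
        do {
            operation.run();
            executed++;
        } while (System.nanoTime() < deadline);
        return executed;
    }
}
```

Note that with a time budget the number of executed operations varies between runs, which is why results are usually reported per operation rather than per run.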

Licensing

The Renaissance Suite comes in two distributions, available under the MIT license and the GPL3 license, respectively. The GPL distribution, which contains all the benchmarks, is licensed under the GPL3 license, while the MIT distribution includes only those benchmarks that themselves have less restrictive licenses.

Depending on your needs, you can use either of the two distributions.

The list below contains the licensing information (and JVM version requirements) for each benchmark.

List of benchmarks

The following is the complete list of benchmarks, separated into groups.

apache-spark

  • als - Runs the ALS algorithm from the Spark ML library.
    Default repetitions: 30; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • chi-square - Runs the chi-square test from Spark MLlib.
    Default repetitions: 60; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • dec-tree - Runs the Random Forest algorithm from the Spark ML library.
    Default repetitions: 40; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • gauss-mix - Computes a Gaussian mixture model using expectation-maximization.
    Default repetitions: 40; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • log-regression - Runs the Logistic Regression algorithm from the Spark ML library.
    Default repetitions: 20; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • movie-lens - Recommends movies using the ALS algorithm.
    Default repetitions: 20; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • naive-bayes - Runs the multinomial Naive Bayes algorithm from the Spark ML library.
    Default repetitions: 30; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • page-rank - Runs a number of PageRank iterations, using RDDs.
    Default repetitions: 20; APACHE2 license, MIT distribution; Supported JVM: 11 and later

concurrency

  • akka-uct - Runs the Unbalanced Cobwebbed Tree actor workload in Akka.
    Default repetitions: 24; MIT license, MIT distribution; Supported JVM: 11 and later

  • fj-kmeans - Runs the K-Means algorithm using the fork/join framework.
    Default repetitions: 30; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • reactors - Runs benchmarks inspired by the Savina microbenchmark workloads in a sequence on Reactors.IO.
    Default repetitions: 10; MIT license, MIT distribution; Supported JVM: 11 and later

database

  • db-shootout - Executes a shootout test using several in-memory databases.
    Default repetitions: 16; APACHE2 license, MIT distribution; Supported JVM: 11 - 18

  • neo4j-analytics - Executes Neo4j graph queries against a movie database.
    Default repetitions: 20; GPL3 license, GPL3 distribution; Supported JVM: 17 and later

functional

  • future-genetic - Runs a genetic algorithm using the Jenetics library and futures.
    Default repetitions: 50; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • mnemonics - Solves the phone mnemonics problem using JDK streams.
    Default repetitions: 16; MIT license, MIT distribution; Supported JVM: 11 and later

  • par-mnemonics - Solves the phone mnemonics problem using parallel JDK streams.
    Default repetitions: 16; MIT license, MIT distribution; Supported JVM: 11 and later

  • rx-scrabble - Solves the Scrabble puzzle using the Rx streams.
    Default repetitions: 80; GPL2 license, GPL3 distribution; Supported JVM: 11 and later

  • scrabble - Solves the Scrabble puzzle using JDK Streams.
    Default repetitions: 50; GPL2 license, GPL3 distribution; Supported JVM: 11 and later

scala

  • dotty - Runs the Dotty compiler on a set of source code files.
    Default repetitions: 50; BSD3 license, MIT distribution; Supported JVM: 11 and later

  • philosophers - Solves a variant of the dining philosophers problem using ScalaSTM.
    Default repetitions: 30; BSD3 license, MIT distribution; Supported JVM: 11 and later

  • scala-doku - Solves Sudoku Puzzles using Scala collections.
    Default repetitions: 20; MIT license, MIT distribution; Supported JVM: 11 and later

  • scala-kmeans - Runs the K-Means algorithm using Scala collections.
    Default repetitions: 50; MIT license, MIT distribution; Supported JVM: 11 and later

  • scala-stm-bench7 - Runs the stmbench7 benchmark using ScalaSTM.
    Default repetitions: 60; BSD3, GPL2 license, GPL3 distribution; Supported JVM: 11 and later

web

  • finagle-chirper - Simulates a microblogging service using Twitter Finagle.
    Default repetitions: 90; APACHE2 license, MIT distribution; Supported JVM: 11 and later

  • finagle-http - Sends many small Finagle HTTP requests to a Finagle HTTP server and awaits response.
    Default repetitions: 12; APACHE2 license, MIT distribution; Supported JVM: 11 and later

The suite also contains a group of benchmarks intended solely for testing purposes:

dummy

  • dummy-empty - A dummy benchmark which only serves to test the harness.
    Default repetitions: 20; MIT license, MIT distribution; Supported JVM: 11 and later

  • dummy-failing - A dummy benchmark for testing the harness (fails during iteration).
    Default repetitions: 20; MIT license, MIT distribution; Supported JVM: 11 and later

  • dummy-param - A dummy benchmark for testing the harness (test configurable parameters).
    Default repetitions: 20; MIT license, MIT distribution; Supported JVM: 11 and later

  • dummy-setup-failing - A dummy benchmark for testing the harness (fails during setup).
    Default repetitions: 20; MIT license, MIT distribution; Supported JVM: 11 and later

  • dummy-teardown-failing - A dummy benchmark for testing the harness (fails during teardown).
    Default repetitions: 20; MIT license, MIT distribution; Supported JVM: 11 and later

  • dummy-validation-failing - A dummy benchmark for testing the harness (fails during validation).
    Default repetitions: 20; MIT license, MIT distribution; Supported JVM: 11 and later

If you are using an external tool to inspect a benchmark, such as an instrumentation agent or a profiler, you may need to make the tool aware of when a benchmark's measured operation is about to be executed and when it has finished executing.

If you need to collect additional metrics associated with the execution of the measured operation, e.g., hardware counters, you will need to be notified about operation execution, and you may want to store the measured values in the output files produced by the harness.

If you need the harness to produce output files in a different format (other than CSV or JSON), you will need to be notified about the values of metrics collected by the harness and other plugins.

If you need more fine-grained control over the repetition of the benchmark's measured operation, you will need to be able to tell the harness when to keep executing the benchmark and when to stop.

To this end, the suite provides hooks for plugins which can subscribe to events related to harness state and benchmark execution.

This repository contains two such plugins: one that uses a native agent built with PAPI to collect information from hardware counters and a plugin for collecting information from a CompilationMXBean. See their respective READMEs under the plugin directory for more information.

If you wish to create your own plugin, please consult documentation/plugins.md for more details.

To make the harness use an external plugin, it needs to be specified on the command line. The harness can load multiple plugins, and each must be enabled using the --plugin <class-path>[!<class-name>] option. The <class-path> is the class path on which to look for the plugin class (optionally, you may add <class-name> to specify a fully qualified name of the plugin class).

A custom execution policy must be enabled using the --policy <class-path>!<class-name> option. The syntax is the same as for normal plugins (and a policy is also a plugin, which can register for all event types), but this option tells the harness to actually use the plugin to control benchmark execution. Other than that, a policy is treated the same way as a plugin.

When registering plugins for paired events (harness init/shutdown, benchmark set up/tear down, operation set up/tear down), plugins specified earlier will "wrap" plugins specified later. This means, for example, that plugins which collect additional measurements and need to be invoked as close as possible to the measured operation should be specified last. Note that this also applies to an external execution policy, which would generally be specified first, but any order is possible.
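The wrapping behavior can be illustrated with a small sketch. The PairListener interface here is hypothetical, standing in for the harness's actual pair-event hooks: "before" callbacks fire in registration order and "after" callbacks in reverse, so the last-registered plugin sits closest to the operation.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapOrderSketch {
    // Hypothetical pair-event listener; the real harness API differs.
    interface PairListener {
        void before(List<String> log);
        void after(List<String> log);
    }

    // Earlier-registered plugins wrap later ones: "before" hooks run in
    // registration order, "after" hooks in reverse registration order.
    static List<String> runWrapped(List<PairListener> plugins, Runnable operation) {
        List<String> log = new ArrayList<>();
        for (PairListener p : plugins) p.before(log);
        operation.run();
        for (int i = plugins.size() - 1; i >= 0; i--) plugins.get(i).after(log);
        return log;
    }

    // Builds a listener that records its name and the event phase.
    static PairListener named(String name) {
        return new PairListener() {
            public void before(List<String> log) { log.add(name + ":before"); }
            public void after(List<String> log) { log.add(name + ":after"); }
        };
    }
}
```

With plugins A and B registered in that order, the event log is A:before, B:before, B:after, A:after, so a measurement plugin registered last sees the least harness overhead inside its before/after window.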

Plugins (and policies) can receive additional command line arguments. Each argument must be given using the --with-arg <arg> option, which appends <arg> to the list of arguments for the plugin (or policy) that was last mentioned on the command line. Whenever a --plugin (or --policy) option is encountered, the subsequent --with-arg options will append arguments to that plugin (or policy).
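The association rule for --with-arg can be sketched as a toy parser. This is not the harness's actual option parser, only an illustration of how each --with-arg attaches to the most recently mentioned --plugin or --policy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PluginArgsSketch {
    // Maps each plugin (or policy) specification to the list of
    // --with-arg values that followed it. Simplified: no validation.
    static Map<String, List<String>> parse(String... args) {
        Map<String, List<String>> argsByPlugin = new LinkedHashMap<>();
        List<String> current = null;
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("--plugin") || args[i].equals("--policy")) {
                // A new plugin/policy resets the target for --with-arg.
                current = new ArrayList<>();
                argsByPlugin.put(args[++i], current);
            } else if (args[i].equals("--with-arg") && current != null) {
                current.add(args[++i]);
            }
        }
        return argsByPlugin;
    }
}
```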

Complete list of command-line options

The following is a complete list of command-line options.

Renaissance Benchmark Suite, version 0.15.0
Usage: renaissance [options] [benchmark-specification]

  -h, --help               Prints this usage text.
  -r, --repetitions <count>
                           Execute the measured operation a fixed number of times.
  -t, --run-seconds <seconds>
                           Execute the measured operation for fixed time (wall-clock).
  --operation-run-seconds <seconds>
                           Execute the measured operation for fixed accumulated operation time (wall-clock).
  --policy <class-path>!<class-name>
                           Use policy plugin to control repetition of measured operation execution.
  --plugin <class-path>[!<class-name>]
                           Load external plugin. Can appear multiple times to load different plugins.
  --with-arg <value>       Adds an argument to the plugin or policy specified last. Can appear multiple times.
  --csv <csv-file>         Output results as CSV to <csv-file>.
  --json <json-file>       Output results as JSON to <json-file>.
  -c, --configuration <conf-name>
                           Use benchmark parameters from configuration <conf-name>.
  -o, --override <name>=<value>
                           Override the value of a configuration parameter <name> to <value>.
  --scratch-base <dir>     Create scratch directories in <dir>. Defaults to current directory.
  --keep-scratch           Keep the scratch directories after VM exit. Defaults to deleting scratch directories.
  --no-forced-gc           Do not force garbage collection before each measured operation. Defaults to forced GC.
  --no-jvm-check           Do not check benchmark JVM version requirements (for execution or raw-list).
  --list                   Print the names and descriptions of all benchmarks.
  --raw-list               Print the names of benchmarks compatible with this JVM (one per line).
  --group-list             Print the names of all benchmark groups (one per line).
  --benchmark-metadata <path-or-uri>
                           Path or an URI pointing to a .properties file with benchmark metadata.
  --standalone             Run harness in standalone mode. Disables benchmark module loader.
  benchmark-specification  List of benchmarks (or groups) to execute (or 'all').

JMH support

You can also build and run Renaissance with JMH. To build a JMH-enabled JAR, run:

$ tools/sbt/bin/sbt renaissanceJmhPackage

To run the benchmarks using JMH, you can execute the following java command:

$ java -jar 'renaissance-jmh/target/renaissance-jmh-0.15.0.jar'

Contributing

Please see the contribution guide for a description of the contribution process.

Documentation

Apart from documentation embedded directly in the source code, further information about design and internals of the suite can be found in the documentation folder of this project.

Support

When you find a bug in the suite, when you want to propose a new benchmark or ask for help, please, open an Issue at the project page at GitHub.

renaissance's People

Contributors

axel22, blankspaceplus, bohdanqq, ceresek, davleopo, dependabot[bot], ericcaspole, farquet, fithos, guilhas07, jexp, jovanstevanovic, lbulej, lovisek, parttimenerd, reneleonhardt, vhotspur, villazon


renaissance's Issues

Decide on benchmark-specific parametrization

Should the benchmarks have configurable sizes and parallelism levels, or other parameters?
If so, how should a benchmark implementation inform the harness about its available parameters, and how should these be passed from the harness to the benchmark?

Remove or increase timeouts

In some benchmarks (particularly those running on Spark), there are timeouts that trigger (e.g., causing a TimeoutException) if a response does not arrive within a given timeframe.
While the timeouts won't be triggered during a normal benchmark execution, expensive dynamic analyses performing heavy instrumentation may significantly slow down the application, triggering the timeouts. As a result, the benchmark will fail prematurely, and the analysis won't be possible for such benchmarks.

To support such kinds of analyses, this behavior should be minimized. All benchmarks should be revised to see whether they make use of timeouts and, if possible, remove them or significantly increase them.
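One alternative to removing timeouts outright would be to make them scalable, so slow instrumented runs can stretch them without changing the default behavior. A minimal sketch; the property name renaissance.timeout.factor is hypothetical, not an existing suite option.

```java
public class TimeoutSketch {
    // Scales a benchmark's base timeout by a factor taken from a
    // (hypothetical) system property, defaulting to no scaling.
    static long scaledTimeoutMillis(long baseMillis) {
        double factor = Double.parseDouble(
            System.getProperty("renaissance.timeout.factor", "1.0"));
        return (long) (baseMillis * factor);
    }
}
```

A user running under heavy instrumentation could then pass, say, -Drenaissance.timeout.factor=10 to the JVM without the benchmarks being edited individually.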

OutOfMemoryError : MetaSpace

Due to the number of things loaded by sbt in the current setup, it is possible that users hit an OOM when they run:
tools/sbt/bin/sbt 'runMain org.renaissance.RenaissanceSuite <bench_name>'

So we should find a way to force a bigger metaspace for all sbt tasks (or maybe just the assembly task, which is triggered by the run task?). This can be done with -XX:MaxMetaspaceSize=300M, with 300M being an example; it should be set to something realistic instead.

The error looks like:

[info] Non-compiled module 'compiler-bridge_2.12' for Scala 2.12.3. Compiling...
java.lang.OutOfMemoryError: Metaspace
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at scala.collection.convert.AsScalaConverters.asScalaBuffer(AsScalaConverters.scala:107)
	at scala.collection.convert.AsScalaConverters.asScalaBuffer$(AsScalaConverters.scala:105)
	at scala.collection.JavaConverters$.asScalaBuffer(JavaConverters.scala:73)
	at scala.collection.convert.DecorateAsScala.$anonfun$asScalaBufferConverter$1(DecorateAsScala.scala:52)
	at scala.collection.convert.DecorateAsScala$$Lambda$8001/2003911583.apply(Unknown Source)
	at scala.collection.convert.Decorators$AsScala.asScala(Decorators.scala:25)
	at scala.tools.nsc.backend.jvm.opt.LocalOpt.methodOptimizations(LocalOpt.scala:194)
	at scala.tools.nsc.backend.jvm.GenBCode$BCodePhase$Worker2.$anonfun$localOptimizations$1(GenBCode.scala:248)
	at scala.tools.nsc.backend.jvm.GenBCode$BCodePhase$Worker2.localOptimizations(GenBCode.scala:248)
	at scala.tools.nsc.backend.jvm.GenBCode$BCodePhase$Worker2.run(GenBCode.scala:267)
	at scala.tools.nsc.backend.jvm.GenBCode$BCodePhase.buildAndSendToDisk(GenBCode.scala:384)
	at scala.tools.nsc.backend.jvm.GenBCode$BCodePhase.run(GenBCode.scala:350)
	at scala.tools.nsc.Global$Run.compileUnitsInternal(Global.scala:1431)
	at scala.tools.nsc.Global$Run.compileUnits(Global.scala:1416)
	at scala.tools.nsc.Global$Run.compileSources(Global.scala:1412)
	at scala.tools.nsc.Global$Run.compile(Global.scala:1515)
	at scala.tools.nsc.Driver.doCompile(Driver.scala:35)
	at scala.tools.nsc.MainClass.doCompile(Main.scala:24)
	at scala.tools.nsc.Driver.process(Driver.scala:55)
	at scala.tools.nsc.Main.process(Main.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

Investigate Delta Lake

Databricks recently announced the open sourcing of Delta Lake, which provides ACID transactions on top of Apache Spark. Delta Lake is licensed under Apache 2.
It would be worth investigating whether it could be an interesting benchmark to add.

The main task would be to find out how to write a realistic application using their API and to find a reasonable data set for it.

Add commands for listing the benchmarks

A user should be able to see the list of available benchmarks on the command line, using, e.g., a --list flag.
Something along these lines:

benchmark | description
========= | ===========
page-rank | ...
...

We should also have a command that gives an easily parsable raw listing.

Possible finagle-chirper crash

Some users reported seeing this on some machines:

Exception in thread "Thread-938" Failure(connection timed out: localhost/127.0.0.1:51834 at remote address: localhost/127.0.0.1:51834. Remote Info: Not Available, flags=0x08) with RemoteInfo -> Upstream Address: Not Available, Upstream id: Not Available, Downstream Address: localhost/127.0.0.1:51834, Downstream label: :51834, Trace Id: 3b8504004e328e76.3b8504004e328e76<:3b8504004e328e76 with Service -> :51834
Caused by: com.twitter.finagle.ConnectionFailedException: connection timed out: localhost/127.0.0.1:51834 at remote address: localhost/127.0.0.1:51834. Remote Info: Not Available
	at com.twitter.finagle.netty4.ConnectionBuilder$$anon$1.operationComplete(ConnectionBuilder.scala:99)
	at com.twitter.finagle.netty4.ConnectionBuilder$$anon$1.operationComplete(ConnectionBuilder.scala:78)
	at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:504)
	at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:483)
	at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
	at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:121)
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:570)
	at io.netty.util.concurrent.PromiseTask$RunnableAdapter.call(PromiseTask.java:38)
	at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:127)
	at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
	at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:335)
	at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at com.twitter.finagle.util.BlockingTimeTrackingThreadFactory$$anon$1.run(BlockingTimeTrackingThreadFactory.scala:23)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at java.lang.Thread.run(Thread.java:748)
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: localhost/127.0.0.1:51834
	at io.netty.channel.epoll.AbstractEpollChannel$AbstractEpollUnsafe$2.run(AbstractEpollChannel.java:568)
	... 11 more

JDK 9+ support

We should make sure that the suite works well with the most recent JDK versions.
The Travis gate should include a JDK 11 test, and maybe newer if possible.

I currently see that there is an issue building the suite with JDK 11:

[info] Compiling 10 Scala sources to <suite-path>/benchmarks/apache-spark/target/scala-2.11/classes ...
[error] <suite-path>/benchmarks/apache-spark/src/main/scala/org/renaissance/apache/spark/PageRank.scala:43:52: value zipWithIndex is not a member of java.util.stream.Stream[String]
[error]       val sublist = for ((line, num) <- text.lines.zipWithIndex if num < MAX_LINE) yield line
[error]                                                    ^
[error] <suite-path>/renaissance/benchmarks/apache-spark/src/main/scala/org/renaissance/apache/spark/PageRank.scala:43:72: value < is not a member of Any
[error]       val sublist = for ((line, num) <- text.lines.zipWithIndex if num < MAX_LINE) yield line
[error]                                                                        ^
[error] two errors found

Make randomly generated input consistent between different JDKs

In some benchmarks, the input set is randomly generated before the first iteration using a fixed seed. While this ensures that the input set remains fixed on the same JDK, the assumption may not hold across different JDKs. That is, the input set generated on JDK A may differ from the one generated on JDK B, and hence the two workloads may differ. This is an important aspect when studying the determinism of the workloads: a good benchmark suite should minimize platform-specific behavior.

Reading the input set from a file or using a renaissance-specific RNG would remove this issue.
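The renaissance-specific RNG route can be sketched with SplitMix64, a small, fully specified generator. Since every step is defined purely in terms of 64-bit integer arithmetic, a given seed yields the same sequence on every JDK and platform; whether the suite would adopt this particular algorithm is an open choice.

```java
public class SplitMix64 {
    // SplitMix64: the entire state transition and output mix are
    // specified exactly, so the stream depends only on the seed.
    private long state;

    SplitMix64(long seed) { this.state = seed; }

    long next() {
        long z = (state += 0x9E3779B97F4A7C15L); // golden-ratio increment
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }
}
```

Replacing direct uses of platform-provided randomness with such a generator would make the generated input sets byte-for-byte identical across JDKs.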

Benchmarks currently affected by this issue:

  • als
  • chi-square
  • gauss-mix
  • fj-kmeans
  • scala-kmeans
  • [please add more if you find some other workloads]

Add Travis definition file for basic testing

We should have some basic testing for basic QA. The Travis job at GitHub should:

  • build the project (renaissanceBundle)
  • check that the formatting is correct (renaissanceFormat)
  • run all the benchmarks

Project listing should be more robust

If you are a Mac user and you browse to the benchmarks directory with the Finder, the OS will create a hidden directory there. While loading the project, this hidden directory will be considered a subproject, and the build will crash early with something like:

[error] Invalid build URI (no handler available): file:<path>/renaissance-benchmarks/benchmarks/.DS_Store/

While it's easy to exclude it, we should make the project listing more robust. Maybe only consider directories that contain a build.sbt file and ignore the rest.
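The suggested filter can be sketched as follows; this is an illustration of the idea, not the suite's actual sbt build code, and the method name is made up.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ProjectListingSketch {
    // Returns only the subdirectories that contain a build.sbt file,
    // ignoring stray entries such as .DS_Store.
    static List<Path> listSubprojects(Path benchmarksDir) throws IOException {
        List<Path> projects = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(benchmarksDir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)
                        && Files.isRegularFile(entry.resolve("build.sbt"))) {
                    projects.add(entry);
                }
            }
        }
        return projects;
    }
}
```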

Website mobile issues

Reported by a user:

Mobile view on renaissance doesn't work well for me; the page jitters

Unable to create plugin

Hello, I am unable to make plugins work. To me it looks like a classpath/classloader issue caused by the proxies we use (since we otherwise only pass things from java.*, it has not manifested itself yet?), but perhaps it is just a misconfiguration on my side?

[Context: I was working on JSON dump of the results but this prevented me from passing a class for storing the results to the actual benchmark.]

Steps to reproduce

  1. The following goes into renaissance-core/src/main/java/org/renaissance/ExamplePlugin.java:
package org.renaissance;

public class ExamplePlugin extends Plugin {
  public void beforeIteration(Policy policy) {
    System.out.println("Hey, I am plugin before iteration.");
  }

  @Override
  public void onCreation() {}

  @Override
  public void afterIteration(Policy policy, long duration) {}
}
  2. Compile via ./tools/sbt/bin/sbt assembly
  3. Run java -jar target/scala-2.12/renaissance-0.1.jar --plugins org.renaissance.ExamplePlugin -r 1 dummy
Exception occurred in org.renaissance.ProxyRenaissanceBenchmark@9660f4e: org.renaissance.ExamplePlugin cannot be cast to org.renaissance.Plugin
java.lang.ClassCastException: org.renaissance.ExamplePlugin cannot be cast to org.renaissance.Plugin
	at org.renaissance.RenaissanceBenchmark.runBenchmark(RenaissanceBenchmark.java:82)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.renaissance.ProxyRenaissanceBenchmark.call(ProxyRenaissanceBenchmark.scala:17)
	at org.renaissance.ProxyRenaissanceBenchmark.runBenchmark(ProxyRenaissanceBenchmark.scala:72)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$3(renaissance-suite.scala:135)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$3$adapted(renaissance-suite.scala:133)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at org.renaissance.RenaissanceSuite$.main(renaissance-suite.scala:133)
	at org.renaissance.RenaissanceSuite.main(renaissance-suite.scala)

What am I doing wrong?

No "group" execution works (but is mentioned in --help)

It is not possible to run any of the groups (core, apache-spark, jdk-concurrent, etc.)

java -jar target/renaissance-0.1.jar core
Benchmark `core` does not exist.

The --help does mention "groups":

benchmark-specification  Comma-separated list of benchmarks (or groups) that must be executed.

Perhaps a --group-list option could be added to list the groups?

neo4j-analytics and finagle-chirper have issues with JDK >= 9

When running these benchmarks with JDK >= 9 (compiled with JDK 8), a java.lang.IncompatibleClassChangeError exception is thrown:

java -jar target/scala-2.12/renaissance-0.1.jar neo4j-analytics
Checking previous DB remnants in /Users/alexvillazon/RENAISSANCE/renaissance-benchmarks/target/modules/neo4j/neo4j-analytics.db
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.neo4j.unsafe.impl.internal.dragons.UnsafeUtil (file:/Users/alexvillazon/RENAISSANCE/renaissance-benchmarks/target/modules/neo4j/neo4j-unsafe-3.3.4.jar) to constructor java.lang.String(char[],boolean)
WARNING: Please consider reporting this to the maintainers of org.neo4j.unsafe.impl.internal.dragons.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Creating indices...
Exception occurred in org.renaissance.ProxyRenaissanceBenchmark@315105f: Method org.neo4j.graphdb.Label.label(Ljava/lang/String;)Lorg/neo4j/graphdb/Label; must be InterfaceMethodref constant
java.lang.IncompatibleClassChangeError: Method org.neo4j.graphdb.Label.label(Ljava/lang/String;)Lorg/neo4j/graphdb/Label; must be InterfaceMethodref constant
	at org.renaissance.neo4j.analytics.AnalyticsBenchmark.createIndex(AnalyticsBenchmark.scala:190)
	at org.renaissance.neo4j.analytics.AnalyticsBenchmark.createIndices(AnalyticsBenchmark.scala:176)
	at org.renaissance.neo4j.analytics.AnalyticsBenchmark.setupAll(AnalyticsBenchmark.scala:64)
	at org.renaissance.neo4j.Neo4jAnalytics.setUpBeforeAll(Neo4jAnalytics.scala:31)
	at org.renaissance.RenaissanceBenchmark.runBenchmark(RenaissanceBenchmark.java:74)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at org.renaissance.ProxyRenaissanceBenchmark.call(ProxyRenaissanceBenchmark.scala:17)
	at org.renaissance.ProxyRenaissanceBenchmark.runBenchmark(ProxyRenaissanceBenchmark.scala:72)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$3(renaissance-suite.scala:135)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$3$adapted(renaissance-suite.scala:133)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at org.renaissance.RenaissanceSuite$.main(renaissance-suite.scala:133)
	at org.renaissance.RenaissanceSuite.main(renaissance-suite.scala)
java -jar target/scala-2.12/renaissance-0.1.jar finagle-chirper
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.twitter.jvm.Hotspot (file:/Users/alexvillazon/RENAISSANCE/renaissance-benchmarks/target/modules/twitter-finagle/util-jvm_2.11-19.4.0.jar) to field sun.management.ManagementFactoryHelper.jvm
WARNING: Please consider reporting this to the maintainers of com.twitter.jvm.Hotspot
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Apr 23, 2019 7:19:37 AM com.twitter.finagle.Init$$anonfun$6 apply$mcV$sp
INFO: Finagle version 19.4.0 (rev=15ae0aba979a2c11ed4a71774b2e995f5df918b4) built at 20190418-114039
Master port: 61807
Cache ports: 61808, 61809, 61810, 61811
====== finagle-chirper (finagle), iteration 0 started ======
Resetting master, feed map size: 5000
Apr 23, 2019 7:19:39 AM com.twitter.finagle.util.DefaultMonitor logWithRemoteInfo
WARNING: Exception propagated to the default monitor (upstream address: /127.0.0.1:61815, downstream address: n/a, label: ).
java.util.concurrent.ExecutionException: java.lang.IncompatibleClassChangeError: Method java.util.Comparator.naturalOrder()Ljava/util/Comparator; must be InterfaceMethodref constant
	at com.twitter.util.ExecutorServiceFuturePool$$anon$4.run(FuturePool.scala:147)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
	at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: java.lang.IncompatibleClassChangeError: Method java.util.Comparator.naturalOrder()Ljava/util/Comparator; must be InterfaceMethodref constant
	at org.renaissance.twitter.finagle.FinagleChirper$Master.mostRechirpsInAllFeeds(FinagleChirper.scala:105)
	at org.renaissance.twitter.finagle.FinagleChirper$Master$$anonfun$apply$10.apply(FinagleChirper.scala:184)
	at org.renaissance.twitter.finagle.FinagleChirper$Master$$anonfun$apply$10.apply(FinagleChirper.scala:183)
	at com.twitter.util.Try$.apply(Try.scala:26)
	at com.twitter.util.ExecutorServiceFuturePool$$anon$4.run(FuturePool.scala:140)
	... 5 more

Apr 23, 2019 7:19:39 AM com.twitter.finagle.util.DefaultMonitor logWithRemoteInfo
WARNING: Exception propagated to the default monitor (upstream address: /127.0.0.1:61816, downstream address: n/a, label: ).
java.util.concurrent.ExecutionException: java.lang.IncompatibleClassChangeError: Method java.util.Comparator.naturalOrder()Ljava/util/Comparator; must be InterfaceMethodref constant
	at com.twitter.util.ExecutorServiceFuturePool$$anon$4.run(FuturePool.scala:147)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
	at java.base/java.lang.Thread.run(Thread.java:844)
Caused by: java.lang.IncompatibleClassChangeError: Method java.util.Comparator.naturalOrder()Ljava/util/Comparator; must be InterfaceMethodref constant
	at org.renaissance.twitter.finagle.FinagleChirper$Master.mostRechirpsInAllFeeds(FinagleChirper.scala:105)
	at org.renaissance.twitter.finagle.FinagleChirper$Master$$anonfun$apply$10.apply(FinagleChirper.scala:184)
	at org.renaissance.twitter.finagle.FinagleChirper$Master$$anonfun$apply$10.apply(FinagleChirper.scala:183)
	at com.twitter.util.Try$.apply(Try.scala:26)
	at com.twitter.util.ExecutorServiceFuturePool$$anon$4.run(FuturePool.scala:140)

Most likely, solving one will solve the other.

Modify timing information in CSV/JSON output (PR #98)

For some processing it is useful to have information on the absolute time of benchmark repetitions (start and end), for example, to see how much time is spent between repetitions (GC or validation), or to correlate with events reported by the VM (GC log, JIT log), whose timestamps are measured since VM start. I would suggest the following modifications:

  • At the start of the harness, acquire the delta between System.nanoTime and VM start, as reported by the Runtime MX bean (java.lang.management.RuntimeMXBean) (= nanoStartTime).
  • For each benchmark repetition, report time when the repetition started relative to VM start (nanoTime-nanoStartTime) and repetition duration (as reported now).
  • In JSON, and in comments in CSV header, include information on VM start as ISO UTC timestamp (to permit correlation with external logs).
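Assuming a RuntimeMXBean-based implementation, the proposed bookkeeping could be sketched as follows (the names nanoStartTime and repetitionStart come from the proposal above; everything else is illustrative):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;
import java.time.Instant;
import java.time.format.DateTimeFormatter;

public class VmStartTiming {
    public static void main(String[] args) {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();

        // Delta between System.nanoTime() and VM start, derived from the
        // uptime reported by the Runtime MX bean.
        long nanoStartTime = System.nanoTime() - runtime.getUptime() * 1_000_000L;

        // Per repetition: report the start timestamp relative to VM start
        // (duration would be reported as it is now).
        long repetitionStart = System.nanoTime() - nanoStartTime;

        // VM start as an ISO UTC timestamp, for correlation with external logs.
        String vmStartUtc = DateTimeFormatter.ISO_INSTANT
            .format(Instant.ofEpochMilli(runtime.getStartTime()));

        System.out.println(repetitionStart > 0);
        System.out.println(vmStartUtc.endsWith("Z"));
    }
}
```

The uptime-based delta is only millisecond-accurate, but that should suffice for correlating with GC and JIT logs.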

Version is "?.?.?" (incorrectly reading the manifest on JDK8)

This issue seems to be a problem in MetaInfo.java when reading the "/META-INF/MANIFEST.MF" file.

On macOS with JDK 8, the version is printed as ?.?.?, whereas with JDK 9 it is printed correctly.

When trying to print all the attributes as follows:

      Attributes mainAttributes = manifest.getMainAttributes();
      for (Object key : mainAttributes.keySet()) {
        Attributes.Name attrName = (Attributes.Name) key;
        String attrValue = mainAttributes.getValue(attrName);
        System.out.println(attrName + " : " + attrValue);
      }

on JDK 8, only two attributes are printed:

Created-By : 1.4.2_09 (Apple Computer, Inc.)
Manifest-Version : 1.0

and therefore reading mainAttributes.getValue("Renaissance-Version") fails. Note that "Created-By" is not in the manifest!

On JDK 9, all the attributes from the manifest file are read correctly.
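If MetaInfo.java resolves the manifest via a single classpath lookup (such as getResourceAsStream), it may pick up the manifest of a different jar on JDK 8, which would also explain the unrelated "Created-By" attribute. One robust alternative could be to enumerate all manifests and pick the one that actually carries the attribute; a hypothetical sketch (the "?.?.?" fallback mirrors the observed behavior):

```java
import java.io.InputStream;
import java.net.URL;
import java.util.Enumeration;
import java.util.jar.Manifest;

public class VersionLookup {
    // Hypothetical robust lookup: instead of reading whatever manifest a
    // single classpath lookup happens to resolve to, enumerate all manifests
    // and pick the one that actually carries the attribute.
    static String version() throws Exception {
        Enumeration<URL> manifests =
            VersionLookup.class.getClassLoader().getResources("META-INF/MANIFEST.MF");
        while (manifests.hasMoreElements()) {
            try (InputStream in = manifests.nextElement().openStream()) {
                String v = new Manifest(in).getMainAttributes()
                    .getValue("Renaissance-Version");
                if (v != null) return v;
            }
        }
        return "?.?.?"; // fallback mirrors the observed behavior
    }

    public static void main(String[] args) throws Exception {
        // Outside the Renaissance jar, no manifest carries the attribute,
        // so this prints the fallback.
        System.out.println(version());
    }
}
```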

Make it possible to run two Renaissance instances side by side

I'm not sure whether it is currently possible, but as we merge benchmarks with complex underlying frameworks, we should make sure that they cannot step on each other's toes.

This is probably easy for things such as scratch space -- the harness should provide a benchmark with a scratch directory beneath a global scratch space.

It's probably going to be more difficult for things such as port numbers, if we use benchmarks that communicate over sockets (Spark comes to mind).
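For port numbers, one option worth considering is to let the OS assign ephemeral ports instead of hard-coding them, so that two suite instances cannot clash. A minimal sketch (reserveFreePort is a hypothetical helper, not part of the harness):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPorts {
    // Binding to port 0 asks the OS to pick a free ephemeral port;
    // closing the socket releases it for immediate reuse by the benchmark.
    static int reserveFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        int port = reserveFreePort();
        System.out.println(port > 0);
        System.out.println(port <= 65535);
    }
}
```

There is a small window between closing the probe socket and the benchmark binding the port, but it avoids deterministic clashes between side-by-side runs.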

dec-tree fails on Windows

Even with the work-around described in issue #58, dec-tree fails on Windows with the following exception:

Exception occurred in org.renaissance.ProxyRenaissanceBenchmark@13cc0b90: java.net.URISyntaxException: Relative path in absolute URI: file:C:/Users/Fithos/Desktop/spark-warehouse java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:C:/Users/Fithos/Desktop/spark-warehouse.

A possible fix is to explicitly set the spark.sql.warehouse.dir property when the SparkSession is created, as described here.
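For reference, a well-formed warehouse URI can be derived from a Path; a minimal sketch, assuming a hypothetical warehouse location (the SparkSession call in the comment shows where the value would go):

```java
import java.net.URI;
import java.nio.file.Paths;

public class WarehouseDirFix {
    public static void main(String[] args) {
        // Hypothetical warehouse location; on Windows this would be
        // something like Paths.get("C:/Users/alice/spark-warehouse").
        URI warehouseUri = Paths.get("/tmp/spark-warehouse").toAbsolutePath().toUri();

        // Passing a well-formed file URI avoids the "Relative path in
        // absolute URI" error, e.g.:
        //   SparkSession.builder()
        //     .config("spark.sql.warehouse.dir", warehouseUri.toString())
        //     .getOrCreate()
        System.out.println(warehouseUri.getScheme());
        System.out.println(warehouseUri.isAbsolute());
    }
}
```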

Better error handling in neo4j-analytics

When terminating neo4j-analytics by killing its threads (necessary when the harness does not have support for execution timeout), the benchmark can reach a state where the database is shut down and accesses report org.neo4j.graphdb.DatabaseShutdownException: This database is shutdown., yet the benchmark happily reports nonsensical repetition times (dropping from 5-6 seconds per repetition to sub-millisecond).

The benchmark should not report repetition times when the repetitions are obviously failing. We should check this for all benchmarks (that is, include result validation and report at least a warning if the results do not validate).

neo4j-analytics-stderr.log
neo4j-analytics-stdout.log

Prevent non-ASCII characters in Java source files

Non-English users on macOS will hit

[error] <path>/renaissance-benchmarks/benchmarks/rx/src/main/java/org/renaissance/rx/RxScrabbleImplementation.java:2:1: unmappable character for encoding ASCII

if they run sbt assembly, since a non-ASCII character appears in the source code and the locale environment variables set automatically by the OS make the compiler default to ASCII encoding.

We could add a Travis check that prevents non-ASCII characters in Java files (Scala is not a problem).
The goal is to make the process as smooth as possible for new cloners of the repository.
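Such a check could boil down to scanning the bytes of every Java source file; a minimal sketch with a hypothetical firstNonAscii helper (a CI job could run it over the Java sources and fail on any non-negative result):

```java
import java.nio.charset.StandardCharsets;

public class AsciiCheck {
    /** Returns the index of the first non-ASCII byte, or -1 if the text is pure ASCII. */
    static int firstNonAscii(String source) {
        byte[] bytes = source.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < bytes.length; i++) {
            if ((bytes[i] & 0xFF) > 127) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstNonAscii("class RxScrabble {}"));
        System.out.println(firstNonAscii("// caf\u00e9"));
    }
}
```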

Minimum heap size for each benchmark

In the last build, neo4j-analytics terminated with java.lang.OutOfMemoryError. The amount of RAM each Travis job gets is under 3 GB, so that may not be enough.

Thus we might need to exclude some benchmarks from running on Travis (or run them with smaller workloads).

Any thoughts on this?

Provide a detailed description of each benchmark

We should provide a detailed high-level description of each benchmark, so that a reader can quickly understand what a benchmark is doing and with what data. For benchmarks that we actually implement locally, there is always at least the code, but a high-level description would be nice anyway.

However, for some benchmarks, we just have a bit of Scala code that calls into a big library with some data, and that's it. This occurred to me when looking at the code of the Alternating Least Squares benchmark, which is exactly such a case (it calls into the Spark machine-learning library).

In addition to what the benchmark is doing, we should also document what kind of parallelism it uses (if any).

Website minor issues

As pointed out on the jdk-dev mailing list, there are some minor issues on the website:

The website could be improved by fixing the following issues:

  1. "target/scala-2.12" might be "target/renaissance-0.9.0.jar" on page[1] under "Building the suite".
  2. The page link[2], which was referenced in[1] under "Contributing", was not found there.

[1] https://renaissance.dev/docs
[2] https://renaissance.dev/CONTRIBUTION.md

Clean up suite design and API/SPI elements (tentative)

When looking at the changes related to the reworked class loading (#79), I realized that there are some elements of the suite design and the published SPI/API that I believe need a bit of a clean-up. When it comes to public SPI/API, I have this compulsive urge to (try to) do it right before first publishing the suite, but I do realize that I can't currently give it full attention in the remaining time to release due to other duties, so I'm not trying to pick a fight over it :-)

As far as benchmarks in the repository are concerned, we can deal with any API clean-ups/changes easily (I volunteer), so I'm mainly concerned for SPI/API users who create their own internal benchmarks/plugins/policies. Admittedly, it will take some time before there are any such users, so we might have quite some time here (all the time in the universe, if nobody cares).

Anyway, I would like to know what you think of what I have in mind:

Separate out API/SPI classes into a package
It would contain a handful of interfaces that define elements used by the harness and the benchmarks. We could use the org.renaissance.core package for that purpose, because it would be mostly SPI with a bit of API, and the core package does not currently contain anything useful (the Dummy benchmark should be moved out of there). The package would include:

  • Benchmark interface, which represents a benchmark to the harness. It would be renamed from RenaissanceBenchmarkApi, and instead of a runIteration method it would have an execute() or simply a run() method, so that we avoid the word iteration, which we want to avoid talking about (we are trying to consistently use the word repetition).
  • Plugin interface. It is currently an abstract class; I would reserve abstract classes for providing default implementations, not for defining the extension point. If there is some base code that would make sense to provide to potential plugin developers, we should offer such an abstract class, but we should not force anyone to subclass when providing a plugin.
  • Policy interface. Again, it is currently an abstract class, which forces anyone who wants to provide a policy to subclass, and subclassing should be optional. The class currently creates policy factories, but I believe that this is a task for the harness.
  • Config class, possibly modified or replaced. The Config object is currently provided to benchmarks, but it really is a harness config -- it contains things such as readme or printList flags. I think that possible consumers are at most plugins or policies, but not necessarily benchmarks. Benchmarks should receive a tailored configuration (or context) object, which tells them where their scratch directory is and what objects to use for input/output, possibly even provide some facade for filesystem operations so that benchmarks can request their resources through a central point. In addition to those, I think we may want to be able to pass various benchmark-specific properties (that the harness does not necessarily need to understand) as key-value pairs.
  • Context class (related to above discussion of Config), providing configuration relevant to benchmarks and possibly interface for some operations that we want to centralize in the harness.
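To make the proposal more concrete, the shapes I have in mind could look roughly like this (all names and signatures are tentative, following the discussion above):

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class SpiSketch {
    // Tentative Benchmark SPI: run() is one measured repetition
    // ("repetition", not "iteration"); default methods keep subclassing optional.
    interface Benchmark {
        default void setUpBeforeAll(BenchmarkContext c) {}
        void run(BenchmarkContext c);
        default void tearDownAfterAll(BenchmarkContext c) {}
    }

    // Tentative Plugin shape: an interface rather than an abstract class.
    interface Plugin {
        default void beforeRepetition(String benchmark) {}
        default void afterRepetition(String benchmark, long durationNanos) {}
    }

    // Tentative per-benchmark context instead of the harness-wide Config.
    interface BenchmarkContext {
        Path scratchDirectory();
        String property(String key, String defaultValue);
    }

    public static void main(String[] args) {
        BenchmarkContext ctx = new BenchmarkContext() {
            public Path scratchDirectory() { return Paths.get("target", "scratch", "dummy"); }
            public String property(String key, String def) { return def; }
        };
        // Benchmark has a single abstract method, so a lambda suffices:
        Benchmark dummy = c -> System.out.println("scratch=" + c.scratchDirectory());
        dummy.setUpBeforeAll(ctx);
        dummy.run(ctx);
        dummy.tearDownAfterAll(ctx);
    }
}
```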

Depend on the (renamed) Benchmark interface, not on RenaissanceBenchmark class
Even though we currently have the RenaissanceBenchmarkApi interface (which I propose to rename to Benchmark, and it's actually an SPI), in many places in the harness and in the Policy class (which I propose to turn into interface) we depend directly on the RenaissanceBenchmark class, basically forcing people to subclass it when they want to add a benchmark. Again, subclassing should be optional and not required for extension. As for the harness, it should depend on abstractions.

Split off benchmark executor/runner functionality from RenaissanceBenchmark class
The RenaissanceBenchmark class currently contains a runBenchmark() method, which handles the calling of plugins and uses the policy to execute the benchmark, i.e., itself. I think that the RenaissanceBenchmark class is hoarding responsibilities here. In my view, RenaissanceBenchmark should be a reasonable (optional) implementation base for a benchmark, period. The runBenchmark() and runIterationWithBeforeAndAfter() methods are different responsibilities that belong in a benchmark Executor / Runner class, and the process of running a benchmark should be fixed, not defined in a class that is supposed to provide an implementation base for benchmarks (i.e., a reasonable default implementation of the Benchmark interface, the one renamed from RenaissanceBenchmarkApi). Moving those things out of RenaissanceBenchmark would allow the harness to really depend on the Benchmark interface alone, instead of depending on a concrete class and forcing people to subclass it.

Move utility methods from RenaissanceBenchmark into utility classes
I understand that it may seem convenient to use some of the methods from the benchmark implementations directly, but I believe that, e.g., filesystem utility methods should be moved into, e.g., a FileUtils class living in the org.renaissance.util package. I would actually prefer if benchmarks just asked the harness (or some context object) for any of the bundled resources and avoided manipulating the filesystem on their own. That's something we can add later, but I would anyway want to avoid bloating the RenaissanceBenchmark class.

Let me know what you think. I know we are still missing bits that are commonly found in other harnesses, but those we can and will be adding later -- they will have relatively small impact on the API side of the suite. But I think we should also try not to have the design/API flaws that I have seen in the other harnesses (if not from the start, then at least later, rather than not at all :-). Some of the things (moving/renaming stuff and splitting off code to different classes) should be relatively easy, but coming up with the right Context and Config classes will require a bit more time.

Dump partial CSV/JSON output when benchmark interrupted

The CSV/JSON output is only dumped when the benchmark runs to completion. Consider dumping on external interrupt (CTRL-C) so that results collected so far are not lost. (Maybe also mark the run with some sort of incomplete flag in the JSON, or a non-zero exit code.)
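One way to implement this could be a JVM shutdown hook, which fires on CTRL-C (SIGINT) and can emit whatever was collected so far; a minimal sketch with a hypothetical dump routine and incomplete flag:

```java
import java.util.ArrayList;
import java.util.List;

public class PartialDump {
    static final List<Long> durationsNanos = new ArrayList<>();
    static volatile boolean complete = false;

    // Hypothetical dump routine: emit whatever was collected so far,
    // plus a flag marking whether the run finished normally.
    static String dumpJson() {
        return "{\"repetitions\":" + durationsNanos.size()
            + ",\"complete\":" + complete + "}";
    }

    public static void main(String[] args) {
        // The hook fires on normal exit and on SIGINT, so partial
        // results survive an external interrupt.
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            if (!complete) {
                System.err.println(dumpJson());
            }
        }));
        durationsNanos.add(123_456_789L); // one collected repetition
        System.out.println(dumpJson());
    }
}
```

Note that a shutdown hook does not fire on SIGKILL or a hard VM crash, so it only covers the CTRL-C case asked about here.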

Exit code 0, even on failure (makes Travis report success on failure)

There is an issue with the exit code of the benchmark execution.

For example, in this Travis execution for PR #92, there is an OutOfMemoryError, but the exit code of the whole execution is still 0:
https://travis-ci.com/D-iii-S/renaissance-benchmarks/jobs/196297155#L611

so the test in the .travis.yml https://github.com/D-iii-S/renaissance-benchmarks/blob/a2323e70d76ea47b7fe8b928a9a066d4c609dac1/.travis.yml#L21 will not report errors correctly.

The error-code propagation has some problems: we can make Renaissance fail, but the exit code is still 0:

java -Xmx16m  -jar ./target/renaissance-0.1.jar -r 1 neo4j
Checking previous DB remnants in /Users/alexvillazon/RENAISSANCE/renaissance-benchmarks/target/modules/neo4j/neo4j-analytics.db
Creating indices...
Populating vertices...
Exception occurred in org.renaissance.neo4j.Neo4jAnalytics@28fd3dc1: Java heap space
java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Arrays.java:3332)
	at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
	at java.lang.StringBuilder.append(StringBuilder.java:190)
	at org.apache.commons.io.output.StringBuilderWriter.write(StringBuilderWriter.java:142)
	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2538)
	at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:2516)
	at org.apache.commons.io.IOUtils.copy(IOUtils.java:2493)
	at org.apache.commons.io.IOUtils.copy(IOUtils.java:2441)
	at org.apache.commons.io.IOUtils.toString(IOUtils.java:1084)
	at org.renaissance.neo4j.analytics.AnalyticsBenchmark.populateVertices(AnalyticsBenchmark.scala:105)
	at org.renaissance.neo4j.analytics.AnalyticsBenchmark.setupAll(AnalyticsBenchmark.scala:77)
	at org.renaissance.neo4j.Neo4jAnalytics.setUpBeforeAll(Neo4jAnalytics.scala:31)
	at org.renaissance.RenaissanceBenchmark.runBenchmark(RenaissanceBenchmark.java:74)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$2(renaissance-suite.scala:174)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$2$adapted(renaissance-suite.scala:172)
	at org.renaissance.RenaissanceSuite$$$Lambda$91/445288316.apply(Unknown Source)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.renaissance.RenaissanceSuite$.main(renaissance-suite.scala:172)
	at org.renaissance.RenaissanceSuite.main(renaissance-suite.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.renaissance.Launcher.main(Launcher.java:18)
$ echo $?
0

Note that dummy reports the correct exit code 1:

java -Xmx2m  -jar ./target/renaissance-0.1.jar -r 1 dummy
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
	at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:149)
	at java.lang.StringCoding.decode(StringCoding.java:193)
	at java.lang.String.<init>(String.java:426)
	at java.util.jar.Manifest.parseName(Manifest.java:304)
	at java.util.jar.Manifest.read(Manifest.java:258)
	at java.util.jar.Manifest.<init>(Manifest.java:81)
	at java.util.jar.Manifest.<init>(Manifest.java:73)
	at java.util.jar.JarFile.getManifestFromReference(JarFile.java:199)
	at java.util.jar.JarFile.getManifest(JarFile.java:180)
	at sun.misc.URLClassPath$JarLoader$2.getManifest(URLClassPath.java:992)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:451)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
$ echo $?
1

Could it be related to class loading?
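Whatever the root cause, a top-level guard in the harness could make the exit code reliable: catch Throwable (OutOfMemoryError is an Error, not an Exception, so catching Exception is not enough) and map failure to a non-zero status. A minimal sketch (runGuarded is hypothetical):

```java
public class ExitCodes {
    // Hypothetical top-level guard: any Throwable, including Errors such
    // as OutOfMemoryError, maps to a non-zero exit code.
    static int runGuarded(Runnable benchmarkRun) {
        try {
            benchmarkRun.run();
            return 0;
        } catch (Throwable t) {
            System.err.println("Benchmark failed: " + t);
            return 1;
        }
    }

    public static void main(String[] args) {
        int ok = runGuarded(() -> {});
        int failed = runGuarded(() -> { throw new OutOfMemoryError("simulated"); });
        System.out.println(ok);
        System.out.println(failed);
        // In the launcher, the result would be passed to System.exit(...).
    }
}
```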

Website Completion

The website is mostly ready, but it is still missing a few things:

  • JavaScript logic that fetches the README.md and CONTRIBUTION.md and embeds the content into a special page on the website (so that the files and the website are always in sync) - the Contribute link at the top is already there, as is the Getting Started link (which should lead to a page with the README)
  • The link to the downloads page
  • The link to the Travis builds is there, but we should check that the link and the badge work
  • Link to the latest and historical benchmark results - we need an extra page for this
  • Link to the ScalaDoc/JavaDoc of the Renaissance API
  • Link to the download page
  • Link to a discussion forum - either Gitter or something else (Discuss at the top)

apache-spark benchmarks fail on Windows

All the apache-spark benchmarks currently merged fail on Windows with the following exception:

ERROR Shell: Failed to locate the winutils binary in the hadoop binary path java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

A work-around for this issue is to download winutils.exe, put it in (e.g.) C:\myfolder\bin, and set the HADOOP_HOME environment variable to C:\myfolder.

A possible persistent solution is to include winutils.exe in the bundle, and set the hadoop.home.dir system property from the code, as described here.
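A sketch of the persistent solution: extract the bundled winutils.exe into a scratch directory and point hadoop.home.dir there before any Hadoop class is initialized (the temp-directory layout below is purely illustrative):

```java
import java.nio.file.Files;
import java.nio.file.Path;

public class HadoopHomeFix {
    public static void main(String[] args) throws Exception {
        // Purely illustrative layout: in the suite, winutils.exe would be
        // extracted from the bundle into bin/ under this directory.
        Path hadoopHome = Files.createTempDirectory("hadoop-home");
        Files.createDirectory(hadoopHome.resolve("bin"));

        // Must be set before any Hadoop/Spark class is initialized,
        // otherwise the winutils lookup has already failed.
        System.setProperty("hadoop.home.dir", hadoopHome.toAbsolutePath().toString());

        System.out.println(System.getProperty("hadoop.home.dir") != null);
        System.out.println(Files.isDirectory(hadoopHome.resolve("bin")));
    }
}
```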

Add sbt command to clean all class files

I noticed that running clean in sbt, even followed by cleanFiles, keeps a lot of class files around. If you are switching between JDKs, you will hit bad class file versions and will have to remove the files manually somehow.
Currently, I simply do

find . -type d -name 'target' -prune -print -exec rm -rf {} ';'

but that's dangerous and removes more than necessary.
We should introduce an sbt command renaissanceClean that properly removes those class files.

Reference : https://stackoverflow.com/questions/4483230/an-easy-way-to-get-rid-of-everything-generated-by-sbt

Spark-based benchmarks fail on OpenJ9

Not sure if this is purely an OpenJ9 issue or whether we are involved, e.g., via packaging, but the Spark-based benchmarks fail on OpenJ9. Here is an example dump from chi-square:

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
19/05/07 15:43:22 INFO SparkContext: Running Spark version 2.0.0
19/05/07 15:43:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/05/07 15:43:23 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
19/05/07 15:43:23 INFO SecurityManager: Changing view acls to: root
19/05/07 15:43:23 INFO SecurityManager: Changing modify acls to: root
19/05/07 15:43:23 INFO SecurityManager: Changing view acls groups to: 
19/05/07 15:43:23 INFO SecurityManager: Changing modify acls groups to: 
19/05/07 15:43:23 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root); groups with view permissions: Set(); users  with modify permissions: Set(root); groups with modify permissions: Set()
19/05/07 15:43:24 INFO Utils: Successfully started service 'sparkDriver' on port 39263.
19/05/07 15:43:24 INFO SparkEnv: Registering MapOutputTracker
19/05/07 15:43:24 INFO SparkEnv: Registering BlockManagerMaster
19/05/07 15:43:24 INFO DiskBlockManager: Created local directory at /var/tmp/tmp.4h1JqcRmg7/chi_square927351186687148580/blockmgr-92b98c5d-cb1e-4a9b-9859-877a4b344304
19/05/07 15:43:24 INFO MemoryStore: MemoryStore started with capacity 7.0 GB
19/05/07 15:43:24 INFO SparkEnv: Registering OutputCommitCoordinator
19/05/07 15:43:24 INFO Utils: Successfully started service 'SparkUI' on port 4040.
19/05/07 15:43:24 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.10.49.19:4040
19/05/07 15:43:24 INFO Executor: Starting executor ID driver on host localhost
19/05/07 15:43:24 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34441.
19/05/07 15:43:24 INFO NettyBlockTransferService: Server created on 10.10.49.19:34441
19/05/07 15:43:24 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.10.49.19, 34441)
19/05/07 15:43:24 INFO BlockManagerMasterEndpoint: Registering block manager 10.10.49.19:34441 with 7.0 GB RAM, BlockManagerId(driver, 10.10.49.19, 34441)
19/05/07 15:43:24 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.10.49.19, 34441)
Error during tear-down: null
java.lang.NullPointerException
	at scala.collection.mutable.ArrayOps$ofRef$.length$extension(ArrayOps.scala:192)
	at scala.collection.mutable.ArrayOps$ofRef.length(ArrayOps.scala:192)
	at scala.collection.SeqLike$class.size(SeqLike.scala:106)
	at scala.collection.mutable.ArrayOps$ofRef.size(ArrayOps.scala:186)
	at scala.collection.mutable.Builder$class.sizeHint(Builder.scala:69)
	at scala.collection.mutable.ArrayBuilder.sizeHint(ArrayBuilder.scala:22)
	at scala.collection.TraversableLike$class.builder$1(TraversableLike.scala:230)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:233)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
	at org.renaissance.apache.spark.ChiSquare.tearDownAfterAll(ChiSquare.scala:98)
	at org.renaissance.RenaissanceBenchmark.runBenchmark(RenaissanceBenchmark.java:97)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$2(renaissance-suite.scala:308)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$2$adapted(renaissance-suite.scala:306)
	at org.renaissance.RenaissanceSuite$$$Lambda$93.000000005C8BA540.apply(Unknown Source)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.renaissance.RenaissanceSuite$.main(renaissance-suite.scala:306)
	at org.renaissance.RenaissanceSuite.main(renaissance-suite.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.renaissance.Launcher.main(Launcher.java:18)
Exception occurred in org.renaissance.apache.spark.ChiSquare@ec4bdca1: null
java.lang.ExceptionInInitializerError
	at java.lang.J9VMInternals.ensureError(J9VMInternals.java:148)
	at java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:137)
	at org.apache.commons.lang3.SerializationUtils.clone(SerializationUtils.java:88)
	at org.apache.spark.SparkContext$$anon$2.childValue(SparkContext.scala:336)
	at org.apache.spark.SparkContext$$anon$2.childValue(SparkContext.scala:332)
	at java.lang.ThreadLocal$ThreadLocalMap.<init>(ThreadLocal.java:391)
	at java.lang.ThreadLocal$ThreadLocalMap.<init>(ThreadLocal.java:298)
	at java.lang.ThreadLocal.createInheritedMap(ThreadLocal.java:255)
	at java.lang.Thread.initialize(Thread.java:356)
	at java.lang.Thread.<init>(Thread.java:324)
	at java.lang.Thread.<init>(Thread.java:117)
	at com.ibm.lang.management.internal.MemoryNotificationThread.<init>(MemoryNotificationThread.java:54)
	at com.ibm.lang.management.internal.ExtendedMemoryMXBeanImpl.<init>(ExtendedMemoryMXBeanImpl.java:50)
	at com.ibm.lang.management.internal.ExtendedMemoryMXBeanImpl.<clinit>(ExtendedMemoryMXBeanImpl.java:38)
	at com.ibm.java.lang.management.internal.ManagementUtils$Component.registerAll(ManagementUtils.java:728)
	at com.ibm.java.lang.management.internal.ManagementUtils$Metadata.<clinit>(ManagementUtils.java:1003)
	at com.ibm.java.lang.management.internal.ManagementUtils.getAllAvailableMXBeans(ManagementUtils.java:663)
	at java.lang.management.ManagementFactory$ServerHolder$1Registration.run(ManagementFactory.java:479)
	at java.lang.management.ManagementFactory$ServerHolder$1Registration.run(ManagementFactory.java:468)
	at java.security.AccessController.doPrivileged(AccessController.java:647)
	at java.lang.management.ManagementFactory$ServerHolder.<clinit>(ManagementFactory.java:505)
	at java.lang.management.ManagementFactory.getPlatformMBeanServer(ManagementFactory.java:247)
	at org.apache.spark.util.SizeEstimator$.getIsCompressedOops(SizeEstimator.scala:140)
	at org.apache.spark.util.SizeEstimator$.initialize(SizeEstimator.scala:112)
	at org.apache.spark.util.SizeEstimator$.<init>(SizeEstimator.scala:105)
	at org.apache.spark.util.SizeEstimator$.<clinit>(SizeEstimator.scala)
	at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
	at org.apache.spark.util.collection.SizeTracker$class.resetSamples(SizeTracker.scala:61)
	at org.apache.spark.util.collection.SizeTrackingVector.resetSamples(SizeTrackingVector.scala:25)
	at org.apache.spark.util.collection.SizeTracker$class.$init$(SizeTracker.scala:51)
	at org.apache.spark.util.collection.SizeTrackingVector.<init>(SizeTrackingVector.scala:26)
	at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:199)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:919)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:910)
	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:866)
	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:910)
	at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:700)
	at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1213)
	at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:103)
	at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:86)
	at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
	at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:56)
	at org.apache.spark.SparkContext.broadcast(SparkContext.scala:1370)
	at org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply(SparkContext.scala:984)
	at org.apache.spark.SparkContext$$anonfun$hadoopFile$1.apply(SparkContext.scala:981)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.SparkContext.withScope(SparkContext.scala:682)
	at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:981)
	at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:802)
	at org.apache.spark.SparkContext$$anonfun$textFile$1.apply(SparkContext.scala:800)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
	at org.apache.spark.SparkContext.withScope(SparkContext.scala:682)
	at org.apache.spark.SparkContext.textFile(SparkContext.scala:800)
	at org.renaissance.apache.spark.ChiSquare.loadData(ChiSquare.scala:66)
	at org.renaissance.apache.spark.ChiSquare.setUpBeforeAll(ChiSquare.scala:89)
	at org.renaissance.RenaissanceBenchmark.runBenchmark(RenaissanceBenchmark.java:74)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$2(renaissance-suite.scala:308)
	at org.renaissance.RenaissanceSuite$.$anonfun$main$2$adapted(renaissance-suite.scala:306)
	at org.renaissance.RenaissanceSuite$$$Lambda$93.000000005C8BA540.apply(Unknown Source)
	at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
	at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
	at org.renaissance.RenaissanceSuite$.main(renaissance-suite.scala:306)
	at org.renaissance.RenaissanceSuite.main(renaissance-suite.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.renaissance.Launcher.main(Launcher.java:18)
Caused by: java.lang.NullPointerException
	at org.apache.commons.lang3.SerializationUtils$ClassLoaderAwareObjectInputStream.<init>(SerializationUtils.java:300)
	at org.apache.commons.lang3.SerializationUtils.clone(SerializationUtils.java:88)
	at org.apache.spark.SparkContext$$anon$2.childValue(SparkContext.scala:336)
	at org.apache.spark.SparkContext$$anon$2.childValue(SparkContext.scala:332)
	at java.lang.ThreadLocal$ThreadLocalMap.<init>(ThreadLocal.java:391)
	at java.lang.ThreadLocal$ThreadLocalMap.<init>(ThreadLocal.java:298)
	at java.lang.ThreadLocal.createInheritedMap(ThreadLocal.java:255)
	at java.lang.Thread.initialize(Thread.java:356)
	at java.lang.Thread.<init>(Thread.java:324)
	at java.lang.Thread.<init>(Thread.java:231)
	at java.io.ClassCache$Reaper.<init>(ClassCache.java:221)
	at java.io.ClassCache$CreateReaperAction.run(ClassCache.java:212)
	at java.io.ClassCache$CreateReaperAction.run(ClassCache.java:201)
	at java.security.AccessController.doPrivileged(AccessController.java:647)
	at java.io.ClassCache.<init>(ClassCache.java:67)
	at java.io.ObjectInputStream.<clinit>(ObjectInputStream.java:346)
	... 69 more

Add automation for website publishing

Since we will be publishing from the website directory, we need a job that invokes Jekyll and copies the generated files to renaissance-benchmarks.github.io.

This is not critical for the first release, since we can copy the files manually, but we should eventually automate it.

Design API that denotes which releases the benchmark belongs to

Benchmarks may be added and removed across releases, and we also want a staging area for benchmarks under consideration, as well as to keep the removed benchmarks around for convenience. There should therefore be some way of denoting the releases to which a benchmark belongs.

One approach is to have something like this:

interface Benchmark {
  String[] releases();
}

The harness would then, by default, list only those benchmarks whose releases list contains the current Renaissance version.
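As a sketch of that idea (the class and method names below are hypothetical, not part of the suite), the harness could filter the benchmark list by checking whether the current suite version appears in each benchmark's declared releases:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class ReleaseFilterSketch {
    // The proposed descriptor, extended with a name for illustration.
    interface Benchmark {
        String name();
        String[] releases();
    }

    // Convenience factory for sample benchmark descriptors.
    static Benchmark benchmark(String name, String... releases) {
        return new Benchmark() {
            public String name() { return name; }
            public String[] releases() { return releases; }
        };
    }

    // Keep only benchmarks whose releases list contains the current version.
    static List<Benchmark> forRelease(List<Benchmark> all, String version) {
        return all.stream()
            .filter(b -> Arrays.asList(b.releases()).contains(version))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Benchmark> all = Arrays.asList(
            benchmark("scala-kmeans", "0.9", "0.10"),
            benchmark("retired-bench", "0.9"));
        List<Benchmark> visible = forRelease(all, "0.10");
        System.out.println(visible.get(0).name()); // prints "scala-kmeans"
    }
}
```

Removed benchmarks would simply stop listing newer versions, while staged benchmarks would list none until they are accepted.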

Benchmark result validation

Each benchmark should be able to validate its results so that we can detect broken runs. A proposed solution is for runIteration to return an object that can be validated later (i.e., the validation must occur outside the measured code).

interface BenchmarkResult {
    boolean validate();
}

...

class RenaissanceBenchmark {
    BenchmarkResult runIteration(Config config);
}
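A minimal sketch of how such a result object might be used (the helper below is hypothetical): the measured operation produces a value, wraps it in a BenchmarkResult, and the harness calls validate() only after timing has stopped.

```java
public class ValidationSketch {
    // The proposed result interface.
    interface BenchmarkResult {
        boolean validate();
    }

    // Hypothetical result that compares a computed value to an expected one.
    static BenchmarkResult countResult(long actual, long expected) {
        return () -> actual == expected;
    }

    public static void main(String[] args) {
        // The measured operation runs first and only records its output...
        long wordCount = 42; // stand-in for the benchmark's computed output
        BenchmarkResult result = countResult(wordCount, 42);
        // ...and validation happens afterwards, outside the timed section.
        System.out.println(result.validate()); // prints "true"
    }
}
```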

Travis: ensure all benchmarks are tested

Follow-up from #65:

Can we find a way to make sure that new pull requests add this [adding benchmark to .travis.yml] ? Some kind of Travis check to check the travis file :)

and

Opt: given that new benchmarks will be added in the future, we might want to consider auto-generating the list of benchmarks, either by somehow creating the matrix dynamically if that's possible, or by regenerating this file using a builtin command similar to --readme (we have the --raw-list command for that).

All Spark-related benchmarks may fail if ports are in use (spark.port.maxRetries not set)

Sometimes, when the randomly chosen ports are already in use, benchmarks fail: Spark checks the port only once, finds it occupied, and aborts.
Looking at the Spark console (http://localhost:4040/environment/), the spark.port.maxRetries configuration is not set.
The current implementation hardcodes the configuration values (e.g., in log-regression):

def setUpSpark() = {
  val conf = new SparkConf()
    .setAppName("logistic-regression")
    .setMaster(s"local[$THREAD_COUNT]")
    .set("spark.local.dir", tempDirPath.toString)
    .set("spark.sql.warehouse.dir", tempDirPath.resolve("warehouse").toString)
  sc = new SparkContext(conf)
  sc.setLogLevel("ERROR")
}

One possible solution is to set spark.port.maxRetries in the code:

def setUpSpark() = {
  val conf = new SparkConf()
    .setAppName("logistic-regression")
    .setMaster(s"local[$THREAD_COUNT]")
    .set("spark.local.dir", tempDirPath.toString)
    .set("spark.port.maxRetries", "16")
    .set("spark.sql.warehouse.dir", tempDirPath.resolve("warehouse").toString)
  sc = new SparkContext(conf)
  sc.setLogLevel("ERROR")
}

After adding this line, spark.port.maxRetries is correctly set, which can be verified at http://localhost:4040/environment/.

Another solution could be to put the values in a configuration file, conf/spark-defaults.conf, as described at https://spark.apache.org/docs/preview/configuration.html.

However, this still needs to be tested, as the configuration files would have to be bundled inside the Renaissance JARs.
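For reference, the relevant entry in such a conf/spark-defaults.conf would be a single whitespace-separated key-value line (the retry count of 16 mirrors the in-code fix above and is an arbitrary choice, not a recommended value):

```
# conf/spark-defaults.conf (sketch; would need to ship inside the suite JAR)
spark.port.maxRetries  16
```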
