
bench's Introduction

About

slinc vs jni benchmark

benchmark for simple operations

Qsort benchmark

Environment

  • Scala 3.2.2
  • JVM: OpenJDK Runtime Environment Zulu19.30+11-CA (build 19.0.1+10)
  • slinc: 0.1.1-110-7863cb
  • Apple clang version 13.1.6 (clang-1316.0.21.2.5)

In general, it is clear that upcalls from native code into JVM methods are quite slow. The following change makes the program dramatically slower (compare SimpleNativeCallBenchmarks.jniNativeQSort and SimpleNativeCallBenchmarks.jniQSort).

(JNIEnv *jenv, jobject jobj, jintArray jarr){
  jint *arr = (*jenv)->GetIntArrayElements(jenv,jarr, 0);
  jsize len = (*jenv)->GetArrayLength(jenv,jarr);
-  qsort(arr,len,sizeof(int),compare_int);
+  qsort(arr,len,sizeof(int),upcall_compare_int);
  (*jenv)->ReleaseIntArrayElements(jenv,jarr,arr, 0);
  return;
};
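
One way to see why swapping in upcall_compare_int hurts so much: qsort calls its comparator once per comparison, so with an upcall comparator every single comparison crosses from native code into the JVM. The sketch below is plain C with no JNI involved. The names counting_compare_int and count_comparisons are hypothetical, introduced here for illustration and not part of the benchmark; they only count how many comparator calls a single sort makes.

```c
#include <stdlib.h>

/* Hypothetical counter: number of times qsort invoked the comparator. */
static unsigned long compare_calls = 0;

/* Ordinary qsort comparator for int that also counts its invocations.
   In the upcall variant, each of these calls would enter the JVM. */
static int counting_compare_int(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    compare_calls++;
    return (x > y) - (x < y); /* avoids overflow of x - y */
}

/* Sorts arr in place and returns how many comparator calls qsort made. */
unsigned long count_comparisons(int *arr, size_t len) {
    compare_calls = 0;
    qsort(arr, len, sizeof(int), counting_compare_int);
    return compare_calls;
}
```

For typical inputs the count grows on the order of n log n, so a per-call transition cost that is negligible for one call is paid many times per sort. The exact count depends on the libc's qsort implementation and on the input.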

There is only a small performance difference between copying the array to native memory and back, and copying it to native memory without copying it back. See SimpleNativeCallBenchmarks.slincQSortWithCopyBack and SimpleNativeCallBenchmarks.slincQSortWithoutCopyBack. [1]

As mentioned in the comment quoted below, I confirmed that SimpleNativeCallBenchmarks.slincQsortAllocCallbackForEachIteration is much slower than slincQSortWithCopyBack and slincQSortWithoutCopyBack. Allocating an upcall seems to be a costly operation.

"Having cloned your bench and having the callback allocated once (rather than per benchmark iteration), I see an improvement in performance of Slinc's upcall code to just 2x slower than JNI, rather than 5x slower"

scala-interop/slinc#81 (comment)

With the callback allocated in advance, the SlinC version is around 2 times slower than the JNI version. Allocating the callback on every iteration makes it more than 5 times slower.

Part of the gap is probably because the SlinC qsort transfers the array between the JVM and native memory. However, I suspect there are other causes as well: there is no large difference between slincQSortWithCopyBack and slincQSortWithoutCopyBack, which implies that data transfer is not the bottleneck.

| Benchmark | NOTE | Mode | Cnt | Score | Error | Units |
| --- | --- | --- | --- | --- | --- | --- |
| SimpleNativeCallBenchmarks.slincQSortJVM | | avgt | 5 | 1774.509 | ± 4.972 | ns/op |
| SimpleNativeCallBenchmarks.jniNativeQSort | Using native comparator. No upcall | avgt | 5 | 4272.838 | ± 50.298 | ns/op |
| SimpleNativeCallBenchmarks.jniQSort | Using upcall. destructively mutate original array | avgt | 5 | 299570.811 | ± 4542.836 | ns/op |
| SimpleNativeCallBenchmarks.slincQSortWithCopyBack | Using global shared upcall. Copy array back and forth. | avgt | 5 | 618014.439 | ± 8280.107 | ns/op |
| SimpleNativeCallBenchmarks.slincQSortWithoutCopyBack | Using upcall. Copy and transfer array but not copy back. | avgt | 5 | 625336.580 | ± 10471.754 | ns/op |
| SimpleNativeCallBenchmarks.slincQsortAllocCallbackForEachIteration | Allocating upcall for each iteration. | avgt | 5 | 1700443.210 | ± 650331.220 | ns/op |
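
The "2x" and "5x" figures quoted in this section follow directly from the scores in the table. A minimal sketch that recomputes them: the constants are copied from the table, while slowdown and print_qsort_slowdowns are hypothetical helper names introduced here for illustration.

```c
#include <stdio.h>

/* Scores in ns/op, copied from the table above. */
#define JNI_NATIVE_QSORT     4272.838     /* native comparator, no upcall */
#define JNI_QSORT            299570.811   /* comparator is an upcall */
#define SLINC_WITH_COPY_BACK 618014.439   /* shared upcall, array copied back */
#define SLINC_ALLOC_PER_ITER 1700443.210  /* upcall allocated on each iteration */

/* Hypothetical helper: how many times slower `score` is than `baseline`. */
double slowdown(double score, double baseline) {
    return score / baseline;
}

/* Prints the ratios discussed in the text. */
void print_qsort_slowdowns(void) {
    printf("JNI upcall vs JNI native comparator: %.1fx\n",
           slowdown(JNI_QSORT, JNI_NATIVE_QSORT));
    printf("slinc (shared upcall) vs JNI upcall:  %.1fx\n",
           slowdown(SLINC_WITH_COPY_BACK, JNI_QSORT));
    printf("slinc (alloc per iter) vs JNI upcall: %.1fx\n",
           slowdown(SLINC_ALLOC_PER_ITER, JNI_QSORT));
}
```

Note that the per-iteration allocation row has an error margin of roughly a third of its score, so its ratio is much less precise than the others.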

Feedback from the SlinC author (@markehammons)

Having cloned your bench and having the callback allocated once (rather than per benchmark iteration), I see an improvement in performance of Slinc's upcall code to just 2x slower than JNI, rather than 5x slower. I think there may be more performance improvements to be found, but first I should make us able to generate an upcall from a method rather than a lambda and see what the performance from that looks like.

benchmark for more complex routine

For the following routine,

  1. copy string (jvm to native)
  2. invoke native call (jvm to native)
  3. dereferencing pointer (native to jvm)
  4. copy object (jvm to native)
  5. copy object (native to jvm)

with the following environment,

  • Scala 3.2.2
  • JVM: JDK 17.0.3, OpenJDK 64-Bit Server VM, 17.0.3+7-LTS
  • slinc: 0.1.1-110-7863cb
  • Apple clang version 13.1.6 (clang-1316.0.21.2.5)

the slinc version takes roughly 3.3x as long as the JNI version.

Result

Benchmark Mode Cnt Score Error Units
NativeBenchmarks.jni avgt 5 5064.292 ± 593.829 ns/op
NativeBenchmarks.slinc avgt 5 16882.792 ± 1172.054 ns/op

However, simply updating the JDK to Zulu19.30+11-CA makes the slinc version nearly as fast as the JNI one.

JVM: OpenJDK Runtime Environment Zulu19.30+11-CA (build 19.0.1+10)

Benchmark Mode Cnt Score Error Units
NativeBenchmarks.jni avgt 5 4872.056 ± 57.582 ns/op
NativeBenchmarks.slinc avgt 5 5607.126 ± 115.210 ns/op
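
"Nearly as fast" can be made a little more precise by comparing the mean scores and checking whether the Score ± Error ranges of the two results overlap. The sketch below does that with the numbers from the two result tables. Result, intervals_overlap and ratio are hypothetical names introduced for illustration, and treating Score ± Error as a simple interval is an assumption of this sketch, not a statement about how the benchmark was analysed.

```c
#include <stdbool.h>

/* One benchmark result: mean score and reported error margin, in ns/op. */
typedef struct {
    double score;
    double error;
} Result;

/* Values copied from the result tables above. */
const Result JDK17_JNI   = {5064.292, 593.829};
const Result JDK17_SLINC = {16882.792, 1172.054};
const Result JDK19_JNI   = {4872.056, 57.582};
const Result JDK19_SLINC = {5607.126, 115.210};

/* True if [score - error, score + error] of a and b share any point. */
bool intervals_overlap(Result a, Result b) {
    return (a.score - a.error) <= (b.score + b.error)
        && (b.score - b.error) <= (a.score + a.error);
}

/* Ratio of the mean scores (how many times slower `slower` is). */
double ratio(Result slower, Result faster) {
    return slower.score / faster.score;
}
```

With these inputs, ratio(JDK17_SLINC, JDK17_JNI) is about 3.3 and ratio(JDK19_SLINC, JDK19_JNI) is about 1.15. The ranges do not overlap in either case, so on JDK 19 slinc is still measurably slower, but only by around 15%.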

Caveat

JNI or FFI is not always the fastest solution, as the modern JVM is quite performant and communication between native code and the JVM is not free.

For example, in the quicksort benchmark above, SimpleNativeCallBenchmarks.slincQSortJVM is the fastest (1774.509 ns/op), ahead of even the no-upcall jniNativeQSort (4272.838 ns/op).

You should take overhead into consideration. Find a significant bottleneck and carefully measure the performance benefits before resorting to FFI.

Footnotes

  1. This is not the case with Slinc 0.3.0, where the copy-back variant takes around 1.3x longer than the variant without copy-back. Note that the author says this is the expected result, as 0.3.0 focuses on refactoring rather than performance optimization.
