lesnyrumcajs / grpc_bench

Various gRPC benchmarks

License: MIT License

Shell 10.10% Dockerfile 19.78% Rust 6.22% C++ 11.49% Go 3.76% Ruby 3.33% Python 1.91% Scala 10.76% Java 11.86% Kotlin 2.01% Dart 1.31% Swift 2.94% Lua 0.87% Crystal 0.33% JavaScript 1.19% PHP 1.32% C# 3.03% Elixir 3.02% Haskell 0.54% Erlang 4.22%
benchmark grpc performance

grpc_bench's Introduction


One repo to finally have a clear, objective gRPC benchmark with code for everyone to verify and improve.

Contributions are most welcome! Feel free to use discussions if you have questions/issues or ideas. There is also a category where you are encouraged to submit your own benchmark results!

See the Nexthink blog post for a deeper overview of the project and recent results.

Goal

The goal of this benchmark is to compare the performance and resource usage of various gRPC libraries across different programming languages and technologies. To achieve that, a minimal protobuf contract is used, so the results are not polluted by other concerns (e.g. the performance of hash maps) and the implementations stay simple.

That being said, the service implementations should NOT take advantage of that; keep the code generic and maintainable. What does generic mean? One should be able to easily adapt the existing code to some fundamental use cases (e.g. having a thread-safe hash map on the server side to provide values to clients given some key, performing blocking I/O, or retrieving a network resource).
Keep in mind the following guidelines:

  • No inline assembly or other language-specific tricks or hacks should be used
  • The code should be (reasonably) idiomatic, built upon the modern patterns of the language
  • Don't make any assumptions about the kind of work done inside the server's request handler
  • Don't assume all client requests will have the exact same content
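To illustrate what "generic" means in practice, here is a minimal sketch of a handler backed by a thread-safe map, as suggested above. `KeyValueGreeter` is a hypothetical name for illustration only; it is not part of this repo, and real implementations would plug such logic into their gRPC service method.

```python
import threading

class KeyValueGreeter:
    """Hypothetical 'generic' handler: a unary RPC whose reply comes from a
    lock-protected map rather than a hard-coded payload."""

    def __init__(self):
        self._lock = threading.Lock()
        self._store = {}

    def put(self, key, value):
        with self._lock:
            self._store[key] = value

    def hello(self, name):
        # Per the guidelines: don't assume every request carries the same content.
        with self._lock:
            return self._store.get(name, f"hello, {name}")

greeter = KeyValueGreeter()
greeter.put("world", "hello, cached world")
print(greeter.hello("world"))   # -> hello, cached world
print(greeter.hello("mars"))    # -> hello, mars
```

Code written this way stays easy to adapt: swapping the map lookup for a database query or a network call does not change the handler's shape.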

You decide what is better

Although the results are ultimately sorted by the number of requests served, one should look beyond that at resource usage - perhaps one implementation is slightly better in terms of raw speed but uses three times more CPU to achieve it. Maybe it's better to take the first one if you're running on a Raspberry Pi and want to get the most out of it. Maybe it's better to use the latter on a big server with 32 CPUs because it scales. It all depends on your use case. This benchmark is created to help people make an informed decision (and get ecstatic when their favourite technology really does come out looking good).

Metrics

We try to provide some metrics to make this decision easier:

  • req/s - the number of requests the service was able to successfully serve
  • average latency, and 90/95/99 percentiles - time from sending a request to receiving the response
  • average CPU, memory - average resource usage during the benchmark, as reported by docker stats
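Percentiles matter because an average can hide a long tail. A small illustrative sketch (nearest-rank convention; ghz may use a different interpolation, and the sample values here are made up):

```python
def percentile(sorted_samples, pct):
    """Nearest-rank percentile: the value below which pct% of samples fall."""
    rank = max(1, round(pct / 100.0 * len(sorted_samples)))
    return sorted_samples[rank - 1]

# Ten illustrative latency samples (ms); two slow outliers dominate the mean.
latencies_ms = sorted([0.8, 0.9, 1.0, 1.0, 1.1, 1.2, 1.2, 1.3, 5.0, 40.0])
avg = sum(latencies_ms) / len(latencies_ms)   # 5.35 ms - looks bad
p90 = percentile(latencies_ms, 90)            # 5.0 ms
p99 = percentile(latencies_ms, 99)            # 40.0 ms - the real tail
print(avg, p90, p99)
```

Here most requests finish in about 1 ms, yet the average is 5.35 ms; only the percentiles reveal that the slowdown is confined to a small tail.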

What this benchmark does NOT take into account

  1. Completeness of the gRPC library. We test only basic unary RPC at the moment. This is the most common service method, which may be enough for some business use cases but not for others. When you're happy with the results of some technology, you should check out its documentation (if it exists) and decide for yourself whether it is production-ready.
  2. Taste. Some may find beauty in Ruby; some may feel like Java is the only real deal. Others treat languages as tools and don't care at all. We don't judge (officially 😉). Unless it's a huge state machine with raw void pointers. Oops!

Prerequisites

Linux or macOS with Docker. Keep in mind that results on macOS may be less reliable, as Docker for Mac runs in a VM.

Running benchmark

To build the benchmark images, use: ./build.sh [BENCH1] [BENCH2] ... . You need the images to run the benchmarks.

To run the benchmarks, use: ./bench.sh [BENCH1] [BENCH2] ... . They will be run sequentially.

To clean up the benchmark images, use: ./clean.sh [BENCH1] [BENCH2] ...

Configuring the benchmark

The benchmark can be configured through the following environment variables:

| Name | Description | Default value |
| --- | --- | --- |
| GRPC_BENCHMARK_DURATION | Duration of the benchmark. | 20s |
| GRPC_BENCHMARK_WARMUP | Duration of the warmup. Stats won't be collected. | 5s |
| GRPC_REQUEST_SCENARIO | Scenario (from scenarios/) containing the protobuf and the data to be sent in the client request. | complex_proto |
| GRPC_SERVER_CPUS | Maximum number of CPUs used by the server. | 1 |
| GRPC_SERVER_RAM | Maximum memory used by the server. | 512m |
| GRPC_CLIENT_CONNECTIONS | Number of connections to use. | 50 |
| GRPC_CLIENT_CONCURRENCY | Number of requests to run concurrently. Can't be smaller than the number of connections. | 1000 |
| GRPC_CLIENT_QPS | Rate limit, in queries per second (QPS). | 0 (unlimited) |
| GRPC_CLIENT_CPUS | Maximum number of CPUs used by the client. | 1 |
| GRPC_IMAGE_NAME | Name of the Docker image built by ./build.sh | grpc_bench |

Parameter recommendations

  • GRPC_BENCHMARK_DURATION should not be too small. Some implementations need a warm-up before reaching their optimal performance, and most real-life gRPC services are expected to be long-running processes. From what we measured, 300s should be enough.
  • GRPC_SERVER_CPUS + GRPC_CLIENT_CPUS should not exceed the total number of cores on the machine. The reason is that you don't want the ghz client to steal precious CPU cycles from the service under test. Keep in mind that setting GRPC_CLIENT_CPUS too low may fail to saturate the service for some of the more performant implementations. Also note that limiting GRPC_SERVER_CPUS to 1 will severely hamper performance for some technologies - is running a service on 1 CPU your use case? It may be, but keep in mind that an eventual load balancer also incurs costs.
  • GRPC_REQUEST_SCENARIO is a parameter to both build.sh and bench.sh. The images must be rebuilt each time you intend to use a scenario whose helloworld.proto differs from the one run previously.

Other parameters will depend on your use-case. Choose wisely.

Results

You can find our old sample results in the Wiki. Be sure to run the benchmarks yourself if you have sufficient hardware, especially for multi-core scenarios. New results will be posted to discussions and you are encouraged to publish yours as well!

grpc_bench's People

Contributors

akshaymankar, alban-io, amondnet, berestovskyy, bmarwell, brunoborges, denisgolius, dependabot[bot], fenollp, gcnyin, he-pin, hxsf, ikhoon, jamesnk, joschi, jrudolph, jtjeferreira, kubo39, lesnyrumcajs, michalszynkiewicz, mrmage, naoh87, night-crawler, scala-steward, sleipnir, tradias, trezm, trisfald, vake93, vincentdephily


grpc_bench's Issues

Add Java AoT GraalVM benchmark

Add a Java grpc benchmark with AoT compilation and GraalVm.

  • gRPC server implementation
  • Dockerfile to build an image with the gRPC server
  • update build.sh, clean.sh and bench.sh scripts

Define Goals and Metrics for the Benchmark

This benchmark is interesting, and I personally think it is an important asset to help developers choose languages and runtimes for brand-new applications.

But here are a couple of things that I think should be addressed.

1. What is the goal of the benchmark?

An answer like "which one is faster" really doesn't help here, as "faster" can mean different things to different people. Faster also depends a lot on the environment chosen to run the benchmark, which is why we often see benchmarks with several scenarios (e.g. X CPUs, X RAM, X instances).

Perhaps a better answer is: "which runtime/language provides the best ratio of requests/s to CPU/memory used". Some runtimes will behave better than others when given less CPU/memory, but those same runtimes may not improve despite getting more CPU/memory.

And then, when we consider cloud deployments with multiple instances, despite scaling out (more instances) the ratio may not grow as it should, because the load balancer now becomes the bottleneck.

So again, understanding how this comparison is done and what its goal is, is extremely important so developers can better judge the results.

2. Metrics for comparison

Benchmarks also quite often provide more than just an "average" result. Ideally, the benchmark should provide 90th, 95th, and 99th percentile values for a true comparison. A pure average unfairly represents runtimes whose JIT compilers optimize code over time under load.

Duration is important as well. Truly honest benchmarks that aim to compare languages and runtimes for microservices should run their tests for an hour or more.

More could be added here, but this is what I have off the top of my head over the weekend :-)

All in all, thanks for putting this together.

I like this bench suite very much and I think it will go above and beyond.

Benchmark suite with LB for single-threaded servers

Based on @brunoborges idea in #76

Create a benchmark suite (ideally using the existing images) that would put single-threaded implementations behind an LB (to be chosen), so that we can compare e.g. rust_tonic_st x 2 + LB against a single dotnet instance, both running on e.g. 3 cores.

Make benchmark more realistic

The current results aren't realistic because the server method does no work. In the real world there will be I/O (reading from the FS, a DB, calling another service, etc.), and an execution model that assumes no blocking will grind to a halt.

Consider either:

  1. Simulating work on the server by adding a delay. For example, the method waits 50ms before sending a response. In .NET that would look something like this:
public async Task<HelloReply> Hello(HelloRequest request, ServerCallContext context)
{
    await Task.Delay(50);
    return new HelloReply { Message = request.Name };
}
  2. Or having rules about implementation requirements. I don't know if you're familiar with the TechEmpower benchmarks, but they have rules about how implementations are written and behave.
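The "grind to a halt" point above can be demonstrated without any gRPC dependency. In this sketch, ten concurrent handlers each simulate 50 ms of work; when the work yields to the event loop (like awaited I/O) the sleeps overlap, but when it blocks, they serialize:

```python
import asyncio, time

async def handler_nonblocking():
    await asyncio.sleep(0.05)   # yields to the event loop, like awaiting real I/O

async def handler_blocking():
    time.sleep(0.05)            # never yields: stalls the whole event loop

async def timed(handler, n=10):
    start = time.perf_counter()
    await asyncio.gather(*(handler() for _ in range(n)))
    return time.perf_counter() - start

t_async = asyncio.run(timed(handler_nonblocking))   # roughly 0.05s: sleeps overlap
t_block = asyncio.run(timed(handler_blocking))      # roughly 0.50s: sleeps serialize
print(f"non-blocking: {t_async:.2f}s, blocking: {t_block:.2f}s")
```

The same tenfold collapse would hit any single-threaded async gRPC server whose handler performs blocking I/O, which is why a delay-based scenario would change the rankings.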

Addressing encoder/decoder performance in benchmarks

First of all, thank you for your efforts!

Currently, I'm researching the possibility of employing Rust for some performance-critical microservices in our infrastructure. Unfortunately, the gRPC greeter does not seem to be enough to reveal the actual performance of concrete decoder/encoder implementations.

I have a repository that mimics a thing we mostly do: read something from gRPC, give it a bit of processing, and put it to Kafka.

On average the dry-run (skipping Kafka) gives me:

Profiling shows me this picture (I filtered not interesting things out):

[profiling screenshot]

It seems that the prost spends a lot of time processing the data. It's not a surprise, but it's interesting to compare it against some other implementations.

What do you think about adding some heavy production-like data structures that can help reveal more real-world-like performance?

Thank you.

Create benchmark scenarios script / script aggregate

So far we have a single script that runs all benchmarks, with some overridable defaults we need to change by hand. It'd be good to have something official, even something to be run overnight.

Some tests I can think of:

  1. 1, 2, 3 CPUs, ghz unrestrained - we can see how much each implementation can support, plus resource usage
  2. CPU limit 6, ghz at 20, 30, 40k req/s - we can see the resource usage of each implementation under a pre-defined load

Those could be run for 1, 5, and 10 minute periods per suite - to see e.g. Java getting better over time.
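Such a matrix could be driven by a small script. A hypothetical sketch (not in the repo): the env var names come from the README's configuration table and bench.sh is the repo's runner; it prints the invocations as a dry run, so piping the output to `sh` would actually execute them.

```python
import itertools, shlex

# Vary server CPUs and duration, emit one bench.sh invocation per combination.
cpus = [1, 2, 3]
durations = ["1m", "5m", "10m"]

commands = []
for n_cpu, duration in itertools.product(cpus, durations):
    env = {
        "GRPC_SERVER_CPUS": str(n_cpu),
        "GRPC_BENCHMARK_DURATION": duration,
    }
    prefix = " ".join(f"{k}={shlex.quote(v)}" for k, v in sorted(env.items()))
    commands.append(f"{prefix} ./bench.sh")

for cmd in commands:
    print(cmd)
```

Extending the matrix to GRPC_CLIENT_QPS limits (for the pre-defined-load case) is just another axis in the product.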

Add CSV output for reports

To make it easier to visualize the results, sort them, and integrate them into e.g. spreadsheets, a relevant CSV file should be created in the results directory.
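A hypothetical converter sketch: it parses the pipe-separated ASCII table that bench.sh prints (layout as in the sample runs in this page) into CSV. A real version would read the report files from the results/ directory rather than an inline snippet.

```python
import csv, io

# A fragment of the ASCII report, in the layout bench.sh prints.
table = """\
| name               |   req/s |   avg. latency |
| rust_tonic_st      |   48165 |        0.99 ms |
| cpp_grpc_st        |   47469 |        1.02 ms |
"""

rows = []
for line in table.splitlines():
    if line.startswith("|"):
        # Strip the outer pipes, split on the inner ones, trim the padding.
        rows.append([cell.strip() for cell in line.strip("|").split("|")])

buf = io.StringIO()
csv.writer(buf).writerows(rows)
print(buf.getvalue())
```

The separator lines (all dashes) don't start with `|`, so they are skipped automatically.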

Another set of benchmark results

Feel free to close this or copy it into the Wiki; it is purely for informational purposes.

See inline for the command invocation. Run on a Google C2 compute-optimized instance with 16 cores and 64 GB RAM running Ubuntu 20.04 with the latest updates. Note the settings `GRPC_CLIENT_CPUS=5` (to provide sufficient load) and `GRPC_BENCHMARK_DURATION=30s` (due to time constraints).

daniel@grpc-bench:~/grpc_bench$ GRPC_CLIENT_CPUS=5 GRPC_BENCHMARK_DURATION=30s GRPC_SERVER_CPUS=1 ./bench.sh *_bench
0.0.1: Pulling from infoblox/ghz
Digest: sha256:ce02c4410816d3dfc8497dfd38f0f5100cf04aa6e7de8645228430ca49d62f41
Status: Image is up to date for infoblox/ghz:0.0.1
docker.io/infoblox/ghz:0.0.1
==> Running benchmark for cpp_grpc_mt_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	17570.72
==> Running benchmark for cpp_grpc_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	47470.30
==> Running benchmark for crystal_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	29653.57
==> Running benchmark for csharp_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	9496.82
==> Running benchmark for dart_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	9146.68
Error response from daemon: No such container: dart_grpc_bench
==> Running benchmark for dotnet_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	40450.96
==> Running benchmark for elixir_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	6190.76
==> Running benchmark for go_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	17289.95
==> Running benchmark for java_aot_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	6578.12
==> Running benchmark for java_grpc_g1gc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	38236.71
==> Running benchmark for java_grpc_pgc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	40699.54
==> Running benchmark for java_grpc_sgc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	38608.68
==> Running benchmark for java_grpc_she_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	37361.22
==> Running benchmark for java_grpc_zgc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	41331.83
==> Running benchmark for java_micronaut_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	11189.75
==> Running benchmark for kotlin_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	6982.95
==> Running benchmark for lua_grpc_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	32052.83
==> Running benchmark for node_grpcjs_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	10192.49
==> Running benchmark for node_grpc_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	19421.70
==> Running benchmark for php_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	3010.23
==> Running benchmark for python_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	3629.02
==> Running benchmark for ruby_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	4617.75
==> Running benchmark for rust_thruster_mt_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	43606.76
==> Running benchmark for rust_thruster_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	45417.94
==> Running benchmark for rust_tonic_mt_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	46232.62
==> Running benchmark for rust_tonic_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	48165.17
==> Running benchmark for scala_akka_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	1043.99
==> Running benchmark for swift_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	16365.13
-----
Benchmark finished. Detailed results are located in: results/212602T144033
--------------------------------------------------------------------------------------------------------------------------------
| name               |   req/s |   avg. latency |        90 % in |        95 % in |        99 % in | avg. cpu |   avg. memory |
--------------------------------------------------------------------------------------------------------------------------------
| rust_tonic_st      |   48165 |        0.99 ms |        0.80 ms |        0.91 ms |        9.62 ms |   70.26% |      4.93 MiB |
| cpp_grpc_st        |   47469 |        1.02 ms |        1.21 ms |        1.28 ms |        2.30 ms |   95.74% |      3.73 MiB |
| rust_tonic_mt      |   46231 |        1.05 ms |        1.40 ms |        1.59 ms |       10.28 ms |   86.29% |      5.06 MiB |
| rust_thruster_st   |   45417 |        1.04 ms |        0.83 ms |        0.97 ms |       16.49 ms |   69.48% |      5.84 MiB |
| rust_thruster_mt   |   43606 |        1.11 ms |        1.63 ms |        1.86 ms |        3.55 ms |   93.88% |      5.91 MiB |
| java_grpc_zgc      |   41332 |        1.16 ms |        1.14 ms |        1.39 ms |       23.72 ms |   84.15% |     104.3 MiB |
| java_grpc_pgc      |   40699 |        1.18 ms |        1.07 ms |        1.37 ms |       30.67 ms |   81.88% |    133.68 MiB |
| dotnet_grpc        |   40450 |        1.20 ms |        1.03 ms |        1.28 ms |       14.23 ms |   99.48% |     84.38 MiB |
| java_grpc_sgc      |   38610 |        1.25 ms |        1.30 ms |        1.57 ms |       24.00 ms |   88.85% |     65.34 MiB |
| java_grpc_g1gc     |   38235 |        1.26 ms |        1.22 ms |        1.62 ms |       23.66 ms |   88.07% |     87.04 MiB |
| java_grpc_she      |   37360 |        1.29 ms |        1.24 ms |        1.57 ms |       23.47 ms |   90.03% |    309.55 MiB |
| lua_grpc_st        |   32051 |        1.53 ms |        2.18 ms |        2.40 ms |        3.01 ms |  100.65% |       7.7 MiB |
| crystal_grpc       |   29652 |        1.66 ms |        0.67 ms |        0.94 ms |       43.95 ms |   91.36% |      7.57 MiB |
| node_grpc_st       |   19420 |        2.54 ms |        2.76 ms |        2.85 ms |        4.09 ms |   99.19% |     19.08 MiB |
| cpp_grpc_mt        |   17570 |        2.81 ms |        0.95 ms |        1.33 ms |       81.37 ms |   99.74% |     11.31 MiB |
| go_grpc            |   17289 |        2.85 ms |        1.07 ms |        1.59 ms |       81.06 ms |   100.0% |     15.84 MiB |
| swift_grpc         |   16364 |        3.02 ms |        3.24 ms |        3.31 ms |        5.65 ms |   99.77% |      2.92 MiB |
| java_micronaut     |   11189 |        4.43 ms |        2.74 ms |        4.90 ms |       71.79 ms |  100.05% |    138.08 MiB |
| node_grpcjs_st     |   10191 |        4.87 ms |        5.74 ms |        9.42 ms |       20.87 ms |  100.16% |     30.15 MiB |
| csharp_grpc        |    9494 |        5.22 ms |        2.52 ms |       59.81 ms |       81.41 ms |  101.25% |     79.03 MiB |
| dart_grpc          |    9145 |        5.44 ms |        3.19 ms |       46.99 ms |       65.20 ms |  100.57% |     30.46 MiB |
| kotlin_grpc        |    6981 |        7.12 ms |        3.09 ms |       75.18 ms |       81.01 ms |   100.9% |    158.72 MiB |
| java_aot           |    6577 |        7.56 ms |        3.33 ms |       82.28 ms |      105.56 ms |  101.27% |    138.64 MiB |
| elixir_grpc        |    6189 |        8.04 ms |        9.04 ms |        9.33 ms |       10.09 ms |   101.0% |     58.08 MiB |
| ruby_grpc          |    4593 |       10.79 ms |       37.24 ms |       40.09 ms |       42.22 ms |  100.51% |     23.21 MiB |
| python_grpc        |    3627 |       13.74 ms |       42.08 ms |       43.58 ms |       44.98 ms |   99.93% |      16.5 MiB |
| php_grpc           |    3009 |       16.56 ms |       88.29 ms |       88.93 ms |       89.80 ms |  101.16% |     42.83 MiB |
| scala_akka         |    1042 |       47.83 ms |       97.39 ms |      103.63 ms |      201.28 ms |   99.89% |     133.5 MiB |
--------------------------------------------------------------------------------------------------------------------------------
All done.

Re-enable Haskell benchmark

The Haskell benchmark must be updated to use the new protobuf contract:

message HelloRequest {
  Hello request = 1;
}

message HelloReply {
  Hello response = 1;
}

Little confusion with benchmark results

So, I am confused: why is there a huge difference between these benchmark results:
2021 04 13 bench results
and
2021 05 10 bench results
What changed in the setup or versions?

Re-enable Swift benchmark

The Swift benchmark must be updated to compile the protobuf on the fly and use the new contract:

message HelloRequest {
  Hello request = 1;
}

message HelloReply {
  Hello response = 1;
}

Add multiple core benchmark

Hey!

Great work on this benchmark! It is very timely, as I was actually benchmarking a few libraries over the past few days as well! I noticed my results pretty much match yours, but they change when multiple cores are used.

So there should be a multi-core benchmark as well as a single-core one, to see how each implementation uses threads, as that would represent a production environment more closely.

More details on Benchmark Results Wiki

It would be great if results published on the wiki provided more details:

  1. Commit hash/tag of source code used to run the bench
  2. Any possible tweaks done during the execution (e.g. GRPC_SERVER_CPUS)
  3. Log file

The results state that three benchmarks were done (1 CPU, 2 CPUs, and 3 CPUs), but it is unclear how these CPUs were allocated between the ghz client and the server container.

By running the benchmark without limiting the client's CPUs, there will be contention, and the client will take CPU time away from the server.

Here's a benchmark I did on a subset of the benchs on my 8-core Intel(R) Xeon(R) CPU E5-1620 v3 @ 3.50GHz server.

$ GRPC_BENCHMARK_DURATION=300s GRPC_SERVER_CPUS=2 GRPC_CLIENT_CPUS=6 ./bench.sh java_micronaut* java_grpc* node* dotnet* go_* rust_*
0.0.1: Pulling from infoblox/ghz
Digest: sha256:ce02c4410816d3dfc8497dfd38f0f5100cf04aa6e7de8645228430ca49d62f41
Status: Image is up to date for infoblox/ghz:0.0.1
docker.io/infoblox/ghz:0.0.1
==> Running benchmark for java_micronaut_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	24082.78
==> Running benchmark for java_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	56083.47
==> Running benchmark for java_grpc_pgc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	58381.94
==> Running benchmark for java_grpc_she_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	55687.58
==> Running benchmark for java_grpc_zgc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	57031.26
==> Running benchmark for node_grpcjs_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	8844.26
==> Running benchmark for node_grpc_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	11968.34
==> Running benchmark for dotnet_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	34672.57
==> Running benchmark for go_grpc_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	36212.07
==> Running benchmark for rust_thruster_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	5360.58
==> Running benchmark for rust_tonic_mt_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	42860.32
==> Running benchmark for rust_tonic_st_bench...
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
    Requests/sec:	43511.97
-----
Benchmark finished. Detailed results are located in: results/201208T165544
--------------------------------------------------------------------------------------------------------------------------------
| name               |   req/s |   avg. latency |        90 % in |        95 % in |  99 % in |      avg. cpu |
--------------------------------------------------------------------------------------------------------------------------------
| java_grpc_pgc      |   58382 |        0.80 ms |        1.52 ms |        2.14 ms |  5.20 ms |       170.42% |
| java_grpc_zgc      |   57032 |        0.82 ms |        1.43 ms |        2.01 ms |  5.02 ms |       178.71% |
| java_grpc          |   56083 |        0.84 ms |        1.57 ms |        2.16 ms |  5.04 ms |       170.64% |
| java_grpc_she      |   55688 |        0.85 ms |        1.73 ms |        2.42 ms |  6.28 ms |        180.6% |
| rust_tonic_st      |   43512 |        1.10 ms |        1.27 ms |        1.40 ms |  2.13 ms |        98.97% |
| rust_tonic_mt      |   42860 |        1.12 ms |        1.66 ms |        1.89 ms |  2.69 ms |       189.13% |
| go_grpc            |   36212 |        1.32 ms |        1.85 ms |        2.50 ms | 14.20 ms |       205.25% |
| dotnet_grpc        |   34672 |        1.39 ms |        1.92 ms |        2.28 ms |  5.24 ms |       180.82% |
| java_micronaut     |   24082 |        2.02 ms |        2.87 ms |        4.18 ms | 39.78 ms |       200.75% |
| node_grpc_st       |   11968 |        4.09 ms |        4.43 ms |        4.60 ms |  5.88 ms |       102.26% |
| node_grpcjs_st     |    8844 |        5.56 ms |        6.58 ms |        7.77 ms | 11.15 ms |        113.4% |
| rust_thruster      |    5360 |        9.25 ms |       43.61 ms |       44.06 ms | 45.01 ms |         51.1% |
--------------------------------------------------------------------------------------------------------------------------------
All done.

This is the source I used: https://github.com/brunoborges/grpc_bench/tree/0ea80a174caf9f4763cbeac7b8976da149e2d745

Errors running benchmark in CI with "RUN BENCHMARK" issue

I tried running the benchmark in CI(In a fork) and got a number of errors:

==> Error building ./lua_grpc_st_bench
==> Error building ./kotlin_grpc_bench
==> Error building ./java_aot_bench
==> Error building ./haskell_mu_bench

They all seem to be related to running out of disk space.

Python server with grpc.aio (AsyncIO)

Please provide a benchmark with a server built on top of asyncio - this is the recommended way nowadays for Python.

See an example:
https://github.com/grpc/grpc/blob/master/examples/python/helloworld/async_greeter_server.py

API Reference:
https://grpc.github.io/grpc/python/grpc_asyncio.html

The gRPC AsyncIO API is the new version of gRPC Python, with an architecture tailored to AsyncIO. Under the hood it utilizes the same C extension, gRPC C-Core, as the existing stack, but it replaces all gRPC I/O operations with methods provided by the AsyncIO library.
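The handler shape in grpc.aio is a plain coroutine, as in the linked official example. A dependency-free sketch of that shape (the message classes here are hypothetical stand-ins, not the real generated protobuf classes, and the `context` argument a real service method receives is omitted):

```python
import asyncio
from dataclasses import dataclass

# Stand-ins for the classes generated from helloworld.proto (hypothetical).
@dataclass
class HelloRequest:
    name: bytes

@dataclass
class HelloReply:
    message: bytes

class AsyncGreeter:
    """grpc.aio-style service: methods are coroutines, so the server can
    await I/O without tying up a thread per request."""

    async def say_hello(self, request: HelloRequest) -> HelloReply:
        # A real handler would `await` DB/network calls here.
        return HelloReply(message=request.name)

reply = asyncio.run(AsyncGreeter().say_hello(HelloRequest(name=b"world")))
print(reply)
```

A grpcio-based benchmark entry would register such a servicer on `grpc.aio.server()` instead of calling it directly.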

Re-enable Dart benchmark

Dart benchmark stopped working. It was disabled temporarily, would be nice to re-enable it once the issue is identified and fixed.

Re-enable Erlang benchmark

The Erlang benchmark stopped working (on my local machine all requests failed, according to the ghz report). It would be nice to identify and resolve the issue. I temporarily disabled it to unblock CI.

Cache the images

The build-images step takes annoyingly long (e.g. Lua needs to build the entire gRPC library). We should devise a strategy to cache the built images / avoid rebuilding them on each run when nothing has changed. Any ideas?
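One possible caching strategy (a hypothetical sketch, not existing repo tooling): fingerprint each benchmark directory and skip `docker build` when the fingerprint matches the previous run's.

```python
import hashlib, pathlib, tempfile

def tree_digest(root):
    """Digest of every file (path + contents) under a benchmark directory.
    If it equals the digest recorded at the last build, the image is reusable."""
    root = pathlib.Path(root)
    h = hashlib.sha256()
    for path in sorted(root.rglob("*")):
        if path.is_file():
            h.update(str(path.relative_to(root)).encode())
            h.update(path.read_bytes())
    return h.hexdigest()

# Demo on a throwaway directory: the digest is stable until a file changes.
demo = pathlib.Path(tempfile.mkdtemp())
(demo / "Dockerfile").write_text("FROM scratch\n")
before = tree_digest(demo)
(demo / "Dockerfile").write_text("FROM alpine\n")
after = tree_digest(demo)
print(before != after)  # True
```

build.sh could store the digest as an image label or a file under results/, then compare before invoking docker; Docker's own layer cache helps too, but only on the machine that built the layers.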

Rust diff

Hi, what is the difference between these Rust gRPC implementations:

rust_tonic_st
rust_tonic_mt

Re-enable PHP swoole benchmark

The PHP swoole benchmark must be updated to use the new protobuf contract:

message HelloRequest {
  Hello request = 1;
}

message HelloReply {
  Hello response = 1;
}
