Comments (7)
Hey there @stavpie,
You are correct that we are missing documentation on how to run these performance tools. Let me help unblock you - the way you use them is via the ./bin/kafka-rest-run-class script.
An example:
~/Documents/code/kafka-rest - (master) $ ./bin/kafka-rest-run-class io.confluent.kafkarest.tools.ProducerPerformance
> Usage: java io.confluent.kafkarest.tools.ProducerPerformance rest_url topic_name num_records record_size batch_size target_records_sec
An example (assuming the proxy is running on localhost:8013) with 100 records, each of size 100, produced to the topic_name topic:
~/Documents/code/kafka-rest - (master) $ ./bin/kafka-rest-run-class io.confluent.kafkarest.tools.ProducerPerformance http://localhost:8013 topic_name 100 100 100 100
100 records processed, 81.900082 records/sec (0.01 MB/sec), 8.90 ms avg latency, 890.00 ms max latency, 890 ms 50th, 890 ms 95th, 890 ms 99th, 890 ms 99.9th.
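The example passes 100 four times, so the positional arguments are easy to mix up. Here is a minimal Python sketch that names each one, following the order in the usage string printed above. The helper name `producer_perf_command` is made up for illustration, and it only builds the command line; it does not run the tool.

```python
import shlex


def producer_perf_command(rest_url, topic_name, num_records, record_size,
                          batch_size, target_records_sec,
                          runner="./bin/kafka-rest-run-class"):
    """Build the argv list for ProducerPerformance.

    Argument order follows the tool's own usage string; this helper is a
    hypothetical convenience, not part of kafka-rest.
    """
    return [
        runner,
        "io.confluent.kafkarest.tools.ProducerPerformance",
        rest_url,
        topic_name,
        str(num_records),
        str(record_size),
        str(batch_size),
        str(target_records_sec),
    ]


argv = producer_perf_command(
    "http://localhost:8013", "topic_name",
    num_records=100, record_size=100,
    batch_size=100, target_records_sec=100,
)
# Print a copy-pasteable shell command.
print(shlex.join(argv))
```

To actually launch the tool from the repository root, the list could be handed to `subprocess.run(argv, check=True)`.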
from kafka-rest.
Using the (synchronous) ProducerPerformance and ConsumerPerformance tools, here are some numbers from EC2 m3.2xlarge instances with 1 ZK node, 1 broker, 100 byte messages, 1 ack:
INFO:_.NativeVsRestProducerPerformance:Producer performance: 294846.090341 per sec, 1137.000000 ms
INFO:_.NativeVsRestProducerPerformance:REST Producer performance: 193709.317463 per sec, 1214.000000 ms
INFO:_.NativeVsRestConsumerPerformance:Consumer performance: 123.821600 MB/s, 1298364.061300 msg/sec
INFO:_.NativeVsRestConsumerPerformance:REST Consumer performance: 1.590000 MB/s, 16348.791661 msg/sec
Given that the REST proxy requests are synchronous and there's overhead (the HTTP request, extra network hop, base64 encoding, JSON encoding, etc.), the producer performance doesn't look too much worse. The consumer is currently a lot worse. I haven't looked into where the bottleneck is yet; in addition to bugs, another possibility is that consumer requests default to 100 msgs/request. With small messages (100 bytes), this results in relatively small payloads, which may drastically limit throughput. If this is the case, we may want to switch to a maximum # of bytes rather than a max # of messages.
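To put rough numbers on the small-payload concern, here is a back-of-the-envelope calculation in Python using the REST consumer figure above. It assumes every request returns a full 100 messages and that requests are issued strictly one after another; both are simplifying assumptions, not measurements.

```python
# Figures taken from the REST consumer line in the run above.
rest_msgs_per_sec = 16348.791661
msgs_per_request = 100   # default max messages per consumer request
message_bytes = 100      # message size used in the test

# Raw message bytes per response (before base64 + JSON encoding overhead).
payload_bytes = msgs_per_request * message_bytes

# Implied request rate if every request returns a full batch (assumption).
requests_per_sec = rest_msgs_per_sec / msgs_per_request

# Implied time per request if requests are strictly sequential (assumption).
ms_per_request = 1000.0 / requests_per_sec

print(f"{payload_bytes} raw bytes per request")
print(f"{requests_per_sec:.1f} requests/sec, {ms_per_request:.2f} ms per request")
```

Under those assumptions each response carries only about 10 KB of message data and a round trip takes roughly 6 ms, so per-request overhead rather than bandwidth would plausibly dominate.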
For reference, here are some baseline Kafka benchmark results on the same set of servers (replicating this blog post but on different hardware):
INFO:_.KafkaBenchmark:=================
INFO:_.KafkaBenchmark:BENCHMARK RESULTS
INFO:_.KafkaBenchmark:=================
INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 rec/sec (65.240000 MB/s)
INFO:_.KafkaBenchmark:Single producer, async 3x replication: 667494.359673 rec/sec (63.660000 MB/s)
INFO:_.KafkaBenchmark:Single producer, sync 3x replication: 116485.764275 rec/sec (11.110000 MB/s)
INFO:_.KafkaBenchmark:Three producers, async 3x replication: 1696519.022182 rec/sec (161.790000 MB/s)
INFO:_.KafkaBenchmark:Message size:
INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.620000 MB/s)
INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.750000 MB/s)
INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.170000 MB/s)
INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.210000 MB/s)
INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.310000 MB/s)
INFO:_.KafkaBenchmark:Throughput over long run, data > memory:
INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.300000 MB/s)
INFO:_.KafkaBenchmark:Single consumer: 701031.140000 rec/sec (56.830500 MB/s)
INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec (267.830800 MB/s)
INFO:_.KafkaBenchmark:Producer + consumer:
INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.600000 MB/s)
INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.600000 MB/s)
INFO:_.KafkaBenchmark:End-to-end latency: median 2.000000 ms, 99% 4.000000 ms, 99.9% 19.000000 ms
As a simple test, I increased the max # of messages per consumer request to 1000 and ran the perf test again:
INFO:_.NativeVsRestConsumerPerformance:Consumer performance: 265.647400 MB/s, 2785515.320300 msg/sec
INFO:_.NativeVsRestConsumerPerformance:REST Consumer performance: 9.270000 MB/s, 95330.702206 msg/sec
So it looks like it's at least partially resolved by increasing that value. But the baseline performance also changed quite a bit, so there may be some consistency issues.
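For a quick comparison of the two runs, the snippet below computes the ratios from the msg/sec figures reported above. The dictionary keys are just labels chosen here.

```python
# msg/sec figures copied from the two NativeVsRestConsumerPerformance runs.
runs = {
    "first run (100 msgs/request)": {"native": 1298364.061300, "rest": 16348.791661},
    "second run (1000 msgs/request)": {"native": 2785515.320300, "rest": 95330.702206},
}

for label, r in runs.items():
    share = 100.0 * r["rest"] / r["native"]
    print(f"{label}: REST consumer at {share:.2f}% of native throughput")

first, second = runs.values()
rest_speedup = second["rest"] / first["rest"]        # change in REST throughput
native_change = second["native"] / first["native"]   # drift in the native baseline

print(f"REST speedup: {rest_speedup:.2f}x, native baseline change: {native_change:.2f}x")
```

The REST consumer improves by a bit under 6x, while the native baseline itself roughly doubles between runs, which is the consistency issue mentioned above.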
Reran the performance test and the REST proxy is somewhere between 8 and 10 MB/s. This is about the expected amount, since the read currently uses only 1 thread per request whereas the consumer performance test uses 10 threads by default (and we have overhead from the JSON + base64 binary encoding, synchronous requests, etc.). You can get the same boost in performance as the consumer performance test by using multiple parallel requests. This currently requires multiple consumers, since reads in the REST proxy are by consumer-topic and not parallelizable within a consumer. If we want to improve that, we can file specific issues rather than having an open-ended performance bug. However, we also need to balance a potentially more complex implementation + API (depending on how we end up supporting parallelism) against the fact that we'll have a much better underlying consumer API (hopefully with better performance) in the near future, which will likely result in a different API.
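A rough sketch of the "multiple parallel requests" approach, in Python. Each entry in `read_fns` stands for one blocking read against a separate REST consumer instance; the function names and the `fake_read` stand-in are hypothetical, and a real client would wrap an HTTP GET against the proxy instead.

```python
from concurrent.futures import ThreadPoolExecutor


def consume_in_parallel(read_fns, rounds):
    """Call each consumer's read function `rounds` times on its own thread.

    read_fns: one callable per REST consumer instance. Each call is expected
    to perform a single blocking read and return a list of messages
    (hypothetical interface, not the kafka-rest client API).
    Returns the total number of messages read.
    """
    def drain(read_once):
        total = 0
        for _ in range(rounds):
            total += len(read_once())
        return total

    # One worker per consumer, since reads are not parallelizable
    # within a single consumer.
    with ThreadPoolExecutor(max_workers=len(read_fns)) as pool:
        return sum(pool.map(drain, read_fns))


# Stand-in for a real HTTP read so the sketch runs without a proxy.
def fake_read():
    return ["msg"] * 100


# 10 consumers, mirroring the 10 threads the consumer performance test uses.
total = consume_in_parallel([fake_read] * 10, rounds=5)
print(total)
```

Threads fit here because each read spends most of its time waiting on the network; whether throughput scales with the number of consumers depends on the proxy and the topic's partition count.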
Excellent. Thanks for this update @ewencp
Hello @ewencp, sorry for this silly question, but how can we use ProducerPerformance and ConsumerPerformance to run performance tests? I searched the repo and was not able to find any instructions.
Thanks a lot @stanislavkozlovski :)