Comments (7)

stanislavkozlovski commented on May 8, 2024

Hey there @stavpie,

You are correct that we are missing documentation on how to run these performance tools. Let me help unblock you - the way you use them is via the ./bin/kafka-rest-run-class script.

An example:

~/Documents/code/kafka-rest -  (master) $ ./bin/kafka-rest-run-class io.confluent.kafkarest.tools.ProducerPerformance

> Usage: java io.confluent.kafkarest.tools.ProducerPerformance rest_url topic_name num_records record_size batch_size target_records_sec

An example (assuming the proxy is running on localhost:8013) that produces 100 records, each of size 100, to the topic_name topic, with a batch size of 100 and a target rate of 100 records/sec:

 ~/Documents/code/kafka-rest -  (master) $ ./bin/kafka-rest-run-class io.confluent.kafkarest.tools.ProducerPerformance http://localhost:8013 topic_name 100 100 100 100
100 records processed, 81.900082 records/sec (0.01 MB/sec), 8.90 ms avg latency, 890.00 ms max latency, 890 ms 50th, 890 ms 95th, 890 ms 99th, 890 ms 99.9th.
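
If you want to collect these numbers across several runs, the summary line can be parsed with a few lines of Python. This is only a sketch: `parse_summary` is a hypothetical helper (not part of kafka-rest), the regex is written against the single output line shown above, and the sanity check assumes record_size is in bytes and that MB means 2^20 bytes.

```python
import re

# Summary line copied from the run above.
line = ("100 records processed, 81.900082 records/sec (0.01 MB/sec), "
        "8.90 ms avg latency, 890.00 ms max latency, "
        "890 ms 50th, 890 ms 95th, 890 ms 99th, 890 ms 99.9th.")


def parse_summary(text):
    """Hypothetical helper: pull the headline figures out of one summary line."""
    m = re.search(
        r"(\d+) records processed, ([\d.]+) records/sec \(([\d.]+) MB/sec\), "
        r"([\d.]+) ms avg latency, ([\d.]+) ms max latency",
        text,
    )
    if m is None:
        raise ValueError("unrecognized summary line")
    return {
        "records": int(m.group(1)),
        "records_per_sec": float(m.group(2)),
        "mb_per_sec": float(m.group(3)),
        "avg_latency_ms": float(m.group(4)),
        "max_latency_ms": float(m.group(5)),
    }


summary = parse_summary(line)

# Sanity check: throughput implied by the record rate and record size.
record_size = 100  # assumed to be bytes (the record_size argument above)
implied_mb_per_sec = summary["records_per_sec"] * record_size / 2**20

print(summary)
print(round(implied_mb_per_sec, 2))
```

With only 100 records the percentiles all collapse to the max latency, so a larger num_records is probably needed before the latency figures mean much.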

from kafka-rest.

ewencp commented on May 8, 2024

Using the (synchronous) ProducerPerformance and ConsumerPerformance tools, here are some numbers from EC2 m3.2xlarge instances with 1 ZK node, 1 broker, 100 byte messages, 1 ack:

INFO:_.NativeVsRestProducerPerformance:Producer performance: 294846.090341 per sec, 1137.000000 ms
INFO:_.NativeVsRestProducerPerformance:REST Producer performance: 193709.317463 per sec, 1214.000000 ms
INFO:_.NativeVsRestConsumerPerformance:Consumer performance: 123.821600 MB/s, 1298364.061300 msg/sec
INFO:_.NativeVsRestConsumerPerformance:REST Consumer performance: 1.590000 MB/s, 16348.791661 msg/sec

Given that the REST proxy requests are synchronous and there's overhead (the HTTP request, extra network hop, base64 encoding, JSON encoding, etc.), the producer performance doesn't look too much worse. The consumer is currently a lot worse. I haven't looked into where the bottleneck is yet; in addition to bugs, another possibility is that consumer requests default to 100 msgs/request. With small messages (100 bytes), this results in relatively small payloads, which may drastically limit throughput. If this is the case, we may want to switch to a maximum # of bytes rather than a maximum # of messages.
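
To put rough numbers on that, here is a back-of-the-envelope calculation using the figures from the log lines above. It is a sketch under assumptions: every consumer request is assumed to return the full 100 messages, and the encoded payload counts only base64 expansion, not the surrounding JSON.

```python
# Figures copied from the log lines above.
native_producer = 294846.090341   # records/sec
rest_producer = 193709.317463     # records/sec
native_consumer = 1298364.061300  # msg/sec
rest_consumer = 16348.791661      # msg/sec

# REST throughput as a fraction of native throughput.
producer_ratio = rest_producer / native_producer
consumer_ratio = rest_consumer / native_consumer

# Payload per consumer request at 100 msgs/request and 100-byte messages.
msgs_per_request = 100
msg_bytes = 100
raw_payload = msgs_per_request * msg_bytes

# Padded base64 emits 4 characters for every 3 input bytes (rounded up).
base64_chars_per_msg = 4 * ((msg_bytes + 2) // 3)
encoded_payload = msgs_per_request * base64_chars_per_msg

# Assumption: every request returns a full batch of 100 messages.
requests_per_sec = rest_consumer / msgs_per_request

print(f"producer: REST is {producer_ratio:.1%} of native")
print(f"consumer: REST is {consumer_ratio:.1%} of native")
print(f"payload per request: {raw_payload} raw bytes, "
      f"{encoded_payload} base64 characters")
print(f"implied request rate: {requests_per_sec:.1f} requests/sec")
```

So the REST producer is at roughly two thirds of native while the REST consumer is at a bit over 1%, and each consumer request carries only about 10 KB of message data.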

For reference, here are some baseline Kafka benchmark results on the same set of servers (replicating this blog post but on different hardware):

INFO:_.KafkaBenchmark:=================
INFO:_.KafkaBenchmark:BENCHMARK RESULTS
INFO:_.KafkaBenchmark:=================
INFO:_.KafkaBenchmark:Single producer, no replication: 684097.470208 rec/sec (65.240000 MB/s)
INFO:_.KafkaBenchmark:Single producer, async 3x replication: 667494.359673 rec/sec (63.660000 MB/s)
INFO:_.KafkaBenchmark:Single producer, sync 3x replication: 116485.764275 rec/sec (11.110000 MB/s)
INFO:_.KafkaBenchmark:Three producers, async 3x replication: 1696519.022182 rec/sec (161.790000 MB/s)
INFO:_.KafkaBenchmark:Message size:
INFO:_.KafkaBenchmark: 10: 1637825.195625 rec/sec (15.620000 MB/s)
INFO:_.KafkaBenchmark: 100: 605504.877911 rec/sec (57.750000 MB/s)
INFO:_.KafkaBenchmark: 1000: 90351.817570 rec/sec (86.170000 MB/s)
INFO:_.KafkaBenchmark: 10000: 8306.180862 rec/sec (79.210000 MB/s)
INFO:_.KafkaBenchmark: 100000: 978.403499 rec/sec (93.310000 MB/s)
INFO:_.KafkaBenchmark:Throughput over long run, data > memory:
INFO:_.KafkaBenchmark: Time block 0: 684725.151324 rec/sec (65.300000 MB/s)
INFO:_.KafkaBenchmark:Single consumer: 701031.140000 rec/sec (56.830500 MB/s)
INFO:_.KafkaBenchmark:Three consumers: 3304011.014900 rec/sec (267.830800 MB/s)
INFO:_.KafkaBenchmark:Producer + consumer:
INFO:_.KafkaBenchmark: Producer: 624984.375391 rec/sec (59.600000 MB/s)
INFO:_.KafkaBenchmark: Consumer: 624984.375391 rec/sec (59.600000 MB/s)
INFO:_.KafkaBenchmark:End-to-end latency: median 2.000000 ms, 99% 4.000000 ms, 99.9% 19.000000 ms
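
When reading the message-size rows, note that the MB/s column appears to be rec/sec times message size with MB taken as 2^20 bytes. That is my inference from the numbers rather than something the benchmark output states; a quick check:

```python
# message size in bytes -> (rec/sec, reported MB/s), copied from the results above
rows = {
    10: (1637825.195625, 15.62),
    100: (605504.877911, 57.75),
    1000: (90351.817570, 86.17),
    10000: (8306.180862, 79.21),
    100000: (978.403499, 93.31),
}

computed = {}
for size, (rec_per_sec, reported) in rows.items():
    # Assumption: 1 MB = 2**20 bytes.
    computed[size] = rec_per_sec * size / 2**20
    print(f"{size:>6} bytes: computed {computed[size]:.2f} MB/s, "
          f"reported {reported:.2f} MB/s")
```

The same conversion also reproduces the 65.24 MB/s of the single-producer row at 100-byte messages.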

ewencp commented on May 8, 2024

As a simple test, I increased the max # of messages per consumer request to 1000 and ran the perf test again:

INFO:_.NativeVsRestConsumerPerformance:Consumer performance: 265.647400 MB/s, 2785515.320300 msg/sec
INFO:_.NativeVsRestConsumerPerformance:REST Consumer performance: 9.270000 MB/s, 95330.702206 msg/sec

So it looks like it's at least partially resolved by increasing that value. But the baseline performance also changed quite a bit, so there may be some consistency issues.
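
For a sense of scale, the change between the two runs works out as follows (figures copied from the log lines of both runs; this is just arithmetic on the reported numbers):

```python
# msg/sec with 100 msgs/request (first run) and 1000 msgs/request (second run)
rest_before, rest_after = 16348.791661, 95330.702206
native_before, native_after = 1298364.061300, 2785515.320300

rest_speedup = rest_after / rest_before
native_speedup = native_after / native_before

# REST throughput as a fraction of the native consumer in each run.
ratio_before = rest_before / native_before
ratio_after = rest_after / native_after

print(f"REST consumer speedup: {rest_speedup:.2f}x")
print(f"native consumer speedup between runs: {native_speedup:.2f}x")
print(f"REST as a fraction of native: {ratio_before:.1%} -> {ratio_after:.1%}")
```

The native consumer roughly doubling between runs with no change on its side is the consistency concern mentioned above.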

ewencp commented on May 8, 2024

Reran the performance test and the REST proxy is at somewhere between 8 and 10 MB/s. This is actually about the expected amount, since currently a read only uses 1 thread per request whereas the consumer performance test uses 10 threads by default (and we have overhead for all the JSON + base64 binary encoding, synchronous requests, etc.). You can get the same boost in performance as the consumer performance test by using multiple parallel requests. This currently requires multiple consumers, since reads in the REST proxy are by consumer-topic and not parallelizable within a consumer. If we want to improve that, we can file specific issues rather than having an open-ended performance bug. However, we also need to balance a potentially more complex implementation + API (depending on how we end up supporting parallelism) against the fact that we'll have a much better underlying consumer API (hopefully with better performance) in the near future, which will likely result in a different API.
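
A minimal sketch of the "multiple consumers, one request loop each" approach on the client side. Everything named here is hypothetical: `read_in_parallel` is not part of kafka-rest, and `fetch` stands in for whatever function issues one synchronous read against the proxy for a given consumer. A fake `fetch` is used so the sketch runs without a proxy.

```python
from concurrent.futures import ThreadPoolExecutor


def read_in_parallel(fetch, consumer_ids, rounds):
    """Run one reader loop per consumer and return the total bytes read.

    fetch(consumer_id) is supplied by the caller (hypothetical here); it should
    perform one synchronous read for that consumer and return the payload bytes.
    """
    def reader(consumer_id):
        total = 0
        # Requests stay sequential within a consumer, since reads are not
        # parallelizable within one consumer.
        for _ in range(rounds):
            total += len(fetch(consumer_id))
        return total

    # One thread per consumer, so the consumers issue requests concurrently.
    with ThreadPoolExecutor(max_workers=len(consumer_ids)) as pool:
        return sum(pool.map(reader, consumer_ids))


def fake_fetch(consumer_id):
    """Stand-in for a real HTTP read; returns a fixed 100-byte payload."""
    return b"x" * 100


total_bytes = read_in_parallel(fake_fetch, ["c0", "c1", "c2"], rounds=5)
print(total_bytes)  # 3 consumers * 5 rounds * 100 bytes = 1500
```

In a real test each consumer ID would correspond to a separate consumer instance created through the proxy, and the threads would spend most of their time waiting on HTTP, which is why plain threads are enough for this kind of client.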

nehanarkhede commented on May 8, 2024

Excellent. Thanks for this update @ewencp

stavpie commented on May 8, 2024

Hello @ewencp, sorry for this silly question, but how can we use ProducerPerformance and ConsumerPerformance to run performance tests? I have searched the repo and was not able to find any instructions.

stavpie commented on May 8, 2024

Thanks a lot @stanislavkozlovski :)
