Comments (15)
Myeah, the rdkafka consumer is analogue to the official SimpleConsumer, i.e.: it does not do zookeeper and it does not provide consumer group abstraction.
Providing an example utility to consume from multiple partitions is a good idea but I think it should be a separate program.
I wrote something similar for WMF called kafkatee that consumes from one or more topics and one or more partitions and outputs to one or more outputs. It is currently in review but will be available on github shortly.
from librdkafka.
rdkafka_example should now provide a better (but not perfect!) error when -p is not set.
from librdkafka.
I see that now. It's weird though that when I have the example running successfully and kill it/ctrl-c it, it shuts down but tells me that all brokers are down
from librdkafka.
Thats weird, rdkafka_example doesnt even have an error_cb :)
And I cant reproduce this with rdkafka_performance (which does).
What command line are you using?
from librdkafka.
Was example. It's running, then hit Ctrl-C and get that.
./rdkafka_example -C -b d146537-021.masked.com:5757,
d146537-022.masked.com:5757,d146537-023.masked.com:5757 -p 3 -t LOGS1
% Consumer reached end of LOGS1 [3] message queue at offset 666496
^C1391220310.289 RDKAFKA-3-ERROR: rdkafka#consumer-0: 3/3 brokers are down
On Fri, Jan 31, 2014 at 8:44 PM, Magnus Edenhill
[email protected]:
Thats weird, rdkafka_example doesnt even have an error_cb :)
And I cant reproduce this with rdkafka_performance (which does).
What command line are you using?
Reply to this email directly or view it on GitHubhttps://github.com//issues/65#issuecomment-33859686
.
from librdkafka.
Ah, yeah, errors are logged if no error_cb is set.
Anyhoo, it wont fire this event if rdkafka is being brought down, this fixes it.
Thanks.
from librdkafka.
Maybe you want to write a sample consumer consuming from all partitions, using the shiny new C++ interface? Thanks, great!
from librdkafka.
I am currently trying to write a consumer that uses consumer groups to synchronize with other consumers (so I can start multiple instances of the consumer and consume each message exactly once). If I understand this correctly I can do this by using offsets stored on the brokers. Since there is no consumer balancing as in the High Level Consumer I would need to run all consumers on all partitions of all topics and rely on the offsets stored on the broker to skip anything that was already consumed, right?
Does this work with multiple consumers consuming at the same time?
from librdkafka.
As long as you dont run multiple consumers for the same topic+partition (toppar) combo, this will work fine.
The problems start when you have multiple consumers for the same toppar, and thats where consumer balancing comes in and is strictly required.
Consumer balancing is unfortunately not implemented in librdkafka since it relies on ZooKeeper.
But, the good news is that the LinkedIn boys are working on moving the consumer balancing logic (et.al) to the broker and dropping the ZooKeeper requirement on the clients.
This will also be implemented in librdkafka thus providing proper and conformant consumer balancing.
Until then, you would need to roll your own, and I guess that is what you are saying you want to do using the current broker based offset storage, right?
This is probably possibly by utilizing the metadata field in the OffsetRequest and OffsetResponse messages (see here https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommitRequest, available since Apache Kafka 0.8.1).
But you would need to come up with your own semantics on how to do this and it is not completely trivial.
At this time I would advice against it since the proper broker mechanism for full consumer balancing will be available in Apache Kafka 0.8.1.1 (IIRC) which is due soon, shortly followed by the librdkafka implementation.
If you cant upgrade, or wait, I instead suggest that you look into implementing the old (current) consumer balancing algorithm and mechanism that utilizes ZooKeeper.
I hope that answers your questions.
from librdkafka.
Thanks for the great explanation. I think I will wait for the consumer balancing on the broker then. For the time being manually assigning the toppar should work for us.
Thanks for the help and for developing librdkafka!
from librdkafka.
Hey @winbatch,
I've now added support for consumer queues which reroutes fetched messages from multiple topics+partitions to one single queue that can be polled with one single call.
This is on the consumer_queues branch which will be merged into master after some more testing:
2e76915
With this feature, and the existing metadata API, you can easily create an application that consumes from all partitions.
I've added support for this in kafkacat's consumer_queue branch:
edenhill/kcat@aa45711
from librdkafka.
Oh awesome! I have wanted this feature in kafkacat!
from librdkafka.
Will it guarantee that messages from a given partition will be handed in
order?
On Wednesday, July 30, 2014, Magnus Edenhill [email protected]
wrote:
Hey @winbatch https://github.com/winbatch,
I've now added support for consumer queues which reroutes fetched messages
from multiple topics+partitions to one single queue that can be polled with
one single call.
This is on the consumer_queues branch which will be merged into master
after some more testing:
2e76915
2e76915With this feature, and the existing metadata API, you can easily create an
application that consumes from all partitions.
I've added support for this in kafkacat's consumer_queue branch:
edenhill/kcat@aa45711
edenhill/kcat@aa45711—
Reply to this email directly or view it on GitHub
#65 (comment).
from librdkafka.
@winbatch, yes
from librdkafka.
Support for this has now been merged into master in both librdkafka and kafkacat.
from librdkafka.
Related Issues (20)
- adminClient describeConsumerGroups leads to python: rdkafka_queue.h:1052: rd_kafka_enq_once_del_source_return: Assertion `eonce->refcnt > 0' failed. HOT 2
- Producer never reconnects to broker HOT 3
- ListOffsets loop of failed requests on leader epoch change until timeout happens HOT 3
- Question: Can event_cb be called on a stopped producer/consumer? HOT 1
- Data race in `rd_kafka_broker_timeout_scan`
- rd_kafka_produceva: double free headers on message sending error HOT 2
- coop sticky algo on large partition number HOT 1
- Duplicate messages could be received when resuming partitions that weren't paused
- Some components use `strcmp` which is vulnerable to timing attacks
- Timeout overflows when over 2047.48
- Crash in rd_kafka_broker_add_logical HOT 1
- Reading committed offsets where metadata contains null byte leads to reading random data after null byte.
- zlib library security vulnerability through to version 1.3
- when using librdkafka, create a thread will fail.
- Unable to connect C++ consumer client with OAUTHBEARER mechanism to server HOT 1
- query_watermark_offsets() Doesn't Return Authentication Failure Errors.
- Critical Vulnerabilities identified in librdkafka HOT 3
- Pr 1
- Sporadic crash in rd_kafka_buf_callback()
- Starting with 8e20e1ee, after broker goes down and back up, `rd_kafka_destroy` of groupconsumer hangs HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from librdkafka.