Comments (13)
@rwdaigle I'm mulling on whether you can bring the consumer_group
to be specific for the topic too:
config :kaffe,
consumers: [
[
endpoint: [kafka: 9092],
topics: [
[
topic: "topic1",
mode: :sync,
message_handler: MessageProcessor,
consumer_group: "consumer-group-for-topic-1",
offset: :earliest # or integer or :latest?
], [
topic: "topic2",
mode: :async,
message_handler: MessageProcessor2,
consumer_group: "consumer-group-for-topic-2",
offset: :latest
]
]
],
[
endpoint: [kafka: 8092],
# ....
]
]
from kaffe.
kaffe seems to assume that developers want only one consumer group, one producer. Most of the time people need more things at the same time. So yes, a rework of that would be really awesome
from kaffe.
I'm mulling on whether you can bring the consumer_group to be specific for the topic too:
Yes! Definitely seems like this should be the case.
from kaffe.
@duff That would be great, yeah
from kaffe.
I'm also very much in need for such a feature!
from kaffe.
Maybe we can rework the config for the next major revision.
from kaffe.
What does having a separate consumer group per topic provide? The consumer group is still specific to the topic.
from kaffe.
You might want to have multiple consumer groups, doing different things, like consuming from 2 different sets of topics independantly. You also could consume from two different kafka clusters and write to a third. I have some cases where I need to consume data from a lot of topics, using one consumer group, then I want a second consumer group to consume a set of "instructions" topics, where it'll be able to read which actions to perform. I don't want the data reading to affect the instruction readings. Etc.
from kaffe.
Yes, if you're consuming from the same topic multiple times, two consumer groups would be necessary.
However, I don't think consuming from two different clusters would necessitate different consumer groups. Each cluster would just store the same consumer group name associated with whatever topics your client is subscribed to.
In the example above, if the "instructions" topics are different from the first set of topics, using the same consumer group (which is just a name that refers to the partition offsets of a topic) shouldn't be a problem, right?
LMK if I'm not understanding.
from kaffe.
Well, it's an issue, because (correct me if I'm wrong) but kaffe can only setup one message handler, so you've got one handler for both data and instruction topics, so it's not easy to maintain isolation of processing. Also, if the data topics have a lot of partition reassignment, it's going to impact your instruction topic consumer, as the consumer group will have to pause. But maybe I'm missing something, like a way to have 2 different message handlers ? From what it seems, kaffe is really tailored towards doing one consuming thing, and one producer thing, which is limiting.
For instance, you can't setup one producer to send message with a :random partitioning, and an other one with a :md5 partitioner. Also producers to different kafka cluster is not supported.
from kaffe.
Well, it's an issue, because (correct me if I'm wrong) but kaffe can only setup one message handler, so you've got one handler for both data and instruction topics, so it's not easy to maintain isolation of processing.
This is true, only one message handler is supported. Isolation would have to be handled downstream.
Also, if the data topics have a lot of partition reassignment, it's going to impact your instruction topic consumer, as the consumer group will have to pause.
I might be wrong, but I don't think this is true. The GroupMember
s are specific to a topic. If one is rebalancing, it shouldn't impact subscribers to the other topics. However, even if Kaffe isn't doing the right thing here, it shouldn't have anything to do with consumer groups.
From what it seems, kaffe is really tailored towards doing one consuming thing, and one producer thing, which is limiting
Yes, I can see this from the standpoint in isolating the processing, so that errors in one don't impact another, for example. In this case, I think you'd need separate apps.
For instance, you can't setup one producer to send message with a :random partitioning, and an other one with a :md5 partitioner. Also producers to different kafka cluster is not supported.
Correct on both counts.
But I still don't think any of that most of that has anything to do with consumer groups (except for possibly the rebalance thing). This is a general design issue, which starts with how Kaffe is configured. Kaffe would require some refactoring to support this, but most of the internals are topic/subscriber/worker specific.
from kaffe.
This is true, only one message handler is supported. Isolation would have to be handled downstream.
Which might be an issue if for instance one topic if DDos-ed with msgs, you can't keep a sererate handler that would be unaffected. Well you can but it's not easy to implement I think.
I might be wrong, but I don't think this is true. The GroupMembers are specific to a topic. If one is rebalancing, it shouldn't impact subscribers to the other topics. However, even if Kaffe isn't doing the right thing here, it shouldn't have anything to do with consumer groups.
Hm it's weird because I see this issue that rebalancing is affecting all the groupmembers, and actually all the nodes part of the consumer group. But maybe it's something on my side, so it's interesting to know that it shouldn't happen. I'll dig into this.
Yes, I can see this from the standpoint in isolating the processing, so that errors in one don't impact another, for example. In this case, I think you'd need separate apps.
That would increase complexity by a substantial margin
Correct on both counts.
But I still don't think any of that most of that has anything to do with consumer groups (except for possibly the rebalance thing). This is a general design issue, which starts with how Kaffe is configured. Kaffe would require some refactoring to support this, but most of the internals are topic/subscriber/worker specific.
Also, don't get me wrong, kaffe is nice, it's just weird because to me the internal code and modules layout seems to have been designed to allow a lot of flexibility and different way to use it, but in reality it's arbitrary rigid about what it allows users to do. Maybe that's on purpose?
It looks like I could implement a thin layer on top of kaffe's internals, with a different approach to configuration and pluggability, and achieve total freedom of use. Which seems strange :)
Anyway keep up the great work !
from kaffe.
Which might be an issue if for instance one topic if DDos-ed with msgs, you can't keep a sererate handler that would be unaffected.
👍
Hm it's weird because I see this issue that rebalancing is affecting all the groupmembers, and actually all the nodes part of the consumer group. But maybe it's something on my side, so it's interesting to know that it shouldn't happen. I'll dig into this.
I might totally be wrong. I used a one_for_all
strategy in the supervisors to ensure everything was in sync if something went wrong. Maybe that's too aggressive for the supervision strategy Kaffe currently has.
That would increase complexity by a substantial margin
Too true! I wish umbrellas were a solution here, but they all share the same config.
Also, don't get me wrong, kaffe is nice, it's just weird because to me the internal code and modules layout seems to have been designed to allow a lot of flexibility and different way to use it, but in reality it's arbitrary rigid about what it allows users to do. Maybe that's on purpose?
Kaffe started out as a very simple consumer with the same process handling all partitions. It was a very limited use case.
When we discovered that the throughput wasn't good, we followed the design of Brucke and got great throughput. But we were still using the old style config. So it was a bit of a split personality. 😉
Anyway keep up the great work !
Thanks very much. We also appreciate your contributions, your willingness to argue your point and submit PRs. That all helps make it better for everyone!
from kaffe.
Related Issues (20)
- Defining multiple handlers HOT 1
- worker_per_topic_partition with multiple topics HOT 1
- Examples not compatible with Elixir 1.10 or 1.11 HOT 2
- extract_der is giving error with SSL HOT 2
- Undefined function exponential_backoff HOT 10
- Offset doesn't get updated between runs and runs crash with OOM errors HOT 4
- async ack - lots of duplicate messages until I restart the application HOT 2
- Kaffe.Producer.produce_sync raises on timeout
- How to set kafka headers when publishing message HOT 1
- Invalid call to raise/reraise on brod/kpro error
- Add support for SCRAM mechanism in SASL authentication. HOT 1
- Module to help write ExUnit tests
- It's impossible to create 2 separate consumers for different topics
- Running mix with kaffe deps fails to download pc package from hex
- Wrong place for configuration
- Repeated rebalance cycle with kafka broker 2.3.0 HOT 16
- kaffe cannot recover from unreachable Kafka HOT 18
- Missing documentation HOT 8
- Connecting to a TLS-based Kafka instance under AWS MSK? HOT 18
- Receives notification when rebalance in progress/assignments revoked HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from kaffe.