Comments (10)
Good one, we set it to lz4 everywhere. I'll have a look.
from racecar.
I wonder how compression comes into play on the consumer side. According to https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md the compression parameters can only be set for producers.
from racecar.
I'm not even sure why this happened; it was during broker shutdown. Could be an incorrect error message.
from racecar.
I'm on mobile right now, but generally speaking printing the error code might be helpful. Depending on the versions in use there might be a disagreement on what an error code means, see e.g. confluentinc/librdkafka#2245
from racecar.
I don't know how to reproduce this.
Using Kafka 2.3.0 / Zookeeper 3.4.13 from https://github.com/wurstmeister/kafka-docker in single broker mode (docker-compose-single-broker.yml with KAFKA_ADVERTISED_HOST_NAME: 127.0.0.1) I manage to get a
Rdkafka::RdkafkaError: Local: Broker transport failure (transport)
with essentially the same stack trace (as in the initial comment) when
- docker-compose -f docker-compose-single-broker.yml up
- start racecar consumer
- CTRL+C docker compose
This doesn't always work for some reason (timing?), but the error message seems decent enough. The transport failure one is also years old, so it's unlikely I'm hitting a code path with changed/unreleased/mismatched error codes from Kafka and librdkafka.
Can you provide more details about your local setup? Generally speaking I'm hesitant to add re-connect in these scenarios because resetting librdkafka's internal offset state is not trivial.
from racecar.
I can try to see if I can hit the condition again.
However, the consumers are supposed to be resilient to temporary cluster failures. I think we need some kind of automatic retries.
from racecar.
I think that let it crash is a reasonable strategy for unknown and unpredictable errors. It is advisable anyway to have a supervising process that does restarts (with exponential back-off) and alerting.
When there is a possibility to reproduce the condition, then it makes sense to handle it here. Probably catch/retry this specific exception + error code around poll in the ConsumerSet. Otherwise the situation could get worse, as @breunigs described with unknowns in the internal state of rdkafka.
I think that we should not introduce a generic mechanism for automatic retries.
from racecar.
Is Broker transport failure not a temporary error condition? Or does that indicate a configuration error?
from racecar.
Unfortunately it can be both: https://github.com/edenhill/librdkafka/wiki/FAQ#why-am-i-seeing-receive-failed-disconnected (via confluentinc/librdkafka#1664). However, client misconfiguration only happens when the librdkafka and broker versions are incompatible, which seems much less likely than a random TCP hiccup.
So, something like this around polling?
begin
  # …
rescue => e
  retries = (retries || 0) + 1
  raise if e.error_code != 123456 || retries > MAX_RETRIES
  sleep 2 ** retries
  retry
end
Please note that we were generally letting Kubernetes handle these restarts for us, so we might have missed some error cases if they didn't occur regularly enough.
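For illustration, here is a slightly fuller sketch of the same idea that runs on its own. Everything named here is an assumption for the example: `FakeKafkaError` is a hypothetical stand-in for `Rdkafka::RdkafkaError`, `with_retries`, `MAX_RETRIES` and `RETRIABLE_CODES` are not part of racecar, and treating `:transport` as the only retriable code is just a guess based on the error message above.

```ruby
# Hypothetical stand-in for Rdkafka::RdkafkaError so the sketch runs without the gem.
class FakeKafkaError < StandardError
  attr_reader :code

  def initialize(code)
    @code = code
    super("Kafka error (#{code})")
  end
end

MAX_RETRIES = 3                 # assumed limit, not a racecar setting
RETRIABLE_CODES = [:transport]  # assumed set of ephemeral error codes

# Runs the block, retrying only retriable errors with exponential backoff.
# `sleeper` is injectable so the backoff can be observed without waiting.
def with_retries(sleeper: ->(seconds) { sleep(seconds) })
  retries = 0
  begin
    yield
  rescue FakeKafkaError => e
    retries += 1
    # Re-raise anything that is not known to be ephemeral, or once we give up.
    raise if !RETRIABLE_CODES.include?(e.code) || retries > MAX_RETRIES
    sleeper.call(2**retries)
    retry
  end
end

# Usage: the simulated poll fails twice with :transport, then succeeds.
attempts = 0
delays = []
result = with_retries(sleeper: ->(seconds) { delays << seconds }) do
  attempts += 1
  raise FakeKafkaError.new(:transport) if attempts < 3
  :polled
end
```

Matching on an explicit list of codes keeps unknown errors on the "let it crash" path, so only failures that are believed to be ephemeral get retried.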
from racecar.
Racecar's contract with the user is that it doesn't crash on ephemeral problems, so I think we should do some kind of retries with exponential backoff.
from racecar.