Comments (10)
Thanks, fixed the formatting.
from strimzi-kafka-operator.
Thanks for raising the issue. Could you please format the YAMLs to make them readable? Thanks.
@urbandan I tried it with the latest operator (0.41.0) and was not able to reproduce this error. For me, when you increase the node pool brokers to 4, the `KafkaRebalance` moves into the `NotReady` state because the Kafka cluster is in the `NotReady` state while the pods are still coming up, which should be what we desire. Once the Kafka cluster is up with all pods running, the `KafkaRebalance` moves back to the `ProposalReady` state. I will try with 0.40.0 now.
@ShubhamRwt Why would the Kafka cluster move to `NotReady` just because you scaled up the node pool to 4 nodes? That sounds like you had some other issue interfering with the reproducer.
@scholzj I meant that when we scale up the nodes, the pods corresponding to `cruise-control` and the 4th node will be coming up. That means the Kafka cluster is not ready yet, and we have logic in the Rebalance operator that, if the Kafka cluster is not up yet, reports that the Kafka cluster is not `Ready`.
No, the Kafka cluster should stay in Ready while scaling up.
Also, look at it differently -> what happens if you simply delete the CC pod while a rebalance is in progress?
Discussed on the community call on 16.5.2024: (Assuming this can be reproduced - see the discussion above), this should be addressed by failing the rebalance or restarting the process if possible. (Let's keep it in triage for next time to make sure it is reproducible and discuss the options)
Note: This should be already handled by the Topic Operator when changing the replication factor, where the TO detects this and automatically restarts the RF change. @fvaleri will double-check.
Yeah, we have this corner case covered in the Topic Operator.
TL;DR: Cruise Control has no memory of the task that it was working on before a restart, so the operator is responsible for detecting this event and resubmitting any ongoing task.
The operator periodically calls the `user_tasks` endpoint with one or more Cruise Control generated `User-Task-ID` values in order to check the status of pending tasks. If it gets back an empty task list, then Cruise Control has restarted, and the tasks may or may not have been completed. This means that the operator must reset its internal state (switch the resource state back from ongoing to pending) and resubmit the tasks.
Note that there is a small chance that the task completed just before Cruise Control restarted, but the operator didn't have time to observe that. In this case, the new task submission would be a duplicate. This is not a problem in practice, as the work has already been done, and the duplicated task would complete quickly (no-op).
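The detect-and-resubmit loop described above could be sketched roughly as follows. This is a minimal illustration in Python, not the Topic Operator's actual code: `TrackedTask`, `reconcile`, and the `fetch_user_tasks` and `submit` callables are hypothetical names, and the status string `"Completed"` is an assumption about what the `user_tasks` endpoint reports.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Operator-side states (hypothetical names for this sketch).
PENDING = "pending"
ONGOING = "ongoing"
COMPLETED = "completed"


@dataclass
class TrackedTask:
    """Operator-side record of one request sent to Cruise Control."""
    name: str
    state: str = PENDING
    user_task_id: Optional[str] = None


def reconcile(
    tasks: List[TrackedTask],
    fetch_user_tasks: Callable[[List[str]], Dict[str, str]],
    submit: Callable[[str], str],
) -> None:
    """One periodic pass: check ongoing tasks, then (re)submit pending ones.

    fetch_user_tasks stands in for a call to the user_tasks endpoint and
    returns {User-Task-ID: status}; submit stands in for a new request and
    returns the generated User-Task-ID. Both are assumptions of this sketch.
    """
    ongoing = [t for t in tasks if t.state == ONGOING]
    if ongoing:
        statuses = fetch_user_tasks([t.user_task_id for t in ongoing])
        if not statuses:
            # Empty task list: treat it as a Cruise Control restart.
            # The old IDs are meaningless now, so fall back to pending.
            for t in ongoing:
                t.state = PENDING
                t.user_task_id = None
        else:
            for t in ongoing:
                if statuses.get(t.user_task_id) == "Completed":
                    t.state = COMPLETED

    # Submit everything that is pending, including tasks that were just reset.
    # A task that had in fact finished before the restart is resubmitted as a
    # duplicate, which is expected to complete quickly as a no-op.
    for t in tasks:
        if t.state == PENDING:
            t.user_task_id = submit(t.name)
            t.state = ONGOING
```

Resetting and resubmitting within the same pass keeps the sketch short; a real operator might instead wait for the next reconciliation before resubmitting.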