Comments (23)
from event-streams.
Hi Edwin
Thanks for getting in touch. From the logs you've sent in, it looks like this Kafka pod can't talk to at least one of the Zookeeper instances, even though the logs show that the three Zookeeper pods are up and running.
I can't tell from the information above why this might be. Could you run the script at
https://github.com/IBM/charts/blob/master/stable/ibm-eventstreams-dev/additionalFiles/get-logs.sh
and attach the file generated to this issue please?
Are you running SELinux, or standard Red Hat?
Can you ping one worker node from the other worker nodes? For example, ssh into 10.69.5.67 and see if you can ping 10.69.5.68 and 10.69.5.69.
Can you exec into the Kafka container:
kubectl exec -it KAFKA_POD_NAME -c kafka bash
and see if it can contact the zookeeper containers. There's no ping, so try nc:
nc -z 172.18.167.193 2181
nc -z 172.18.186.106 2181
Note that I can't work out the IP of the third Zookeeper from the Kafka healthcheck logs. I'm assuming the Kafka healthcheck can communicate with this Zookeeper, possibly because it is on the same node as the Kafka pod the logs above are from.
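When there are several addresses to test, the nc checks above can be wrapped in a small loop. This is a minimal sketch rather than a supported tool: the function name check_zookeeper_ports is hypothetical, the pod name and IPs are placeholders, and it assumes the nc in the Kafka container accepts the -w timeout option.

```shell
#!/usr/bin/env bash
# Hypothetical helper: probe port 2181 on each Zookeeper IP from inside the Kafka container.
# Usage: check_zookeeper_ports KAFKA_POD_NAME IP [IP...]
check_zookeeper_ports() {
  local pod="$1"; shift
  local ip
  local failed=0
  for ip in "$@"; do
    # nc -z only tests whether the port accepts a connection; it sends no data.
    # -w 5 gives up after 5 seconds (assumes this nc build supports -w).
    if kubectl exec "$pod" -c kafka -- nc -z -w 5 "$ip" 2181; then
      echo "$ip:2181 reachable"
    else
      echo "$ip:2181 NOT reachable"
      failed=1
    fi
  done
  # Non-zero exit status if any address could not be reached.
  return "$failed"
}
```

For example: check_zookeeper_ports KAFKA_POD_NAME 172.18.167.193 172.18.186.106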
Thanks
Emma
Hi Edwin - did you save the script on a Windows machine first? It looks like there are carriage return ("\r") characters on the end of each line, which Linux won't be happy with.
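One way to check for and remove the carriage returns, sketched with standard tools only. The helper names has_crlf and strip_cr are hypothetical, and get-logs.sh here stands for wherever the script was saved locally.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking a script saved with Windows line endings.

# Succeeds if the file contains at least one carriage return.
has_crlf() {
  grep -q "$(printf '\r')" "$1"
}

# Writes a copy of the file to stdout with every carriage return removed.
strip_cr() {
  tr -d '\r' < "$1"
}
```

For example: has_crlf get-logs.sh && strip_cr get-logs.sh > get-logs-unix.sh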
My suspicion is that there is a setting in the SELinux configuration that is preventing the pods from talking across nodes. Can you also attach your SELinux settings from /etc/selinux/config?
Is it possible for you to test with SELinux disabled to see if this resolves the problem? That will either rule SELinux configuration out, or highlight where to focus investigation.
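To read the configured mode out of that file, something like the following could be used. It is a minimal sketch: the helper name selinux_config_mode is hypothetical, and it assumes the usual SELINUX=mode line format. Note that the file records the mode applied at boot, while getenforce reports the mode currently in effect.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the SELINUX= mode from a config file
# (enforcing, permissive or disabled). Defaults to /etc/selinux/config.
selinux_config_mode() {
  # Matches only lines starting with "SELINUX=", so comment lines and
  # the separate SELINUXTYPE= setting are ignored.
  sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "${1:-/etc/selinux/config}"
}
```

For example: selinux_config_mode, then compare the result with the output of getenforce.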
Thanks
Emma
Hi Edwin - I can't see any new logs attached to this issue. Could you double check that you added them? It looks like replying via email might not be including files.
Thanks
Emma
Hi - thanks for attempting to upload the docs again. Can you see the attachments on this issue? I can't see them unfortunately. I was wondering if you could try using the git web client to upload the files rather than email.
How did you get on with disabling SELinux to exclude this from what's going on?
Regards
Emma
Hi Edwin,
I'm replying on behalf of Emma:
Did disabling SELinux have any effect on the state of the Event Streams install? For example, is the UI still showing the brokers as unavailable? If there’s no difference in behavior could you:
- confirm SELinux is disabled by running sestatus and checking the output, then
- run kubectl delete pod ZOOKEEPER_POD_NAME for each zookeeper in turn, followed by kubectl delete pod KAFKA_POD_NAME for each kafka in turn, waiting for each pod to restart before deleting the next, then
- run kubectl get pods to confirm the status of each pod
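The restart-in-turn step can be sketched as a loop. This is an illustration under assumptions, not part of the product: the function name restart_pods_in_turn and the RETRY_DELAY variable are hypothetical, and it assumes each replacement pod comes back under the same name as the one deleted (as StatefulSet pods do).

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete each named pod in turn and wait until its
# replacement is Ready before moving on to the next one.
# Assumes the replacement pod reuses the old pod's name.
restart_pods_in_turn() {
  local pod
  local attempts
  for pod in "$@"; do
    # By default kubectl delete waits for the old pod to be removed.
    kubectl delete pod "$pod" || return 1
    # The replacement may not exist yet, so retry until kubectl wait succeeds.
    attempts=0
    until kubectl wait --for=condition=Ready "pod/$pod" --timeout=60s; do
      attempts=$((attempts + 1))
      if [ "$attempts" -ge 10 ]; then
        echo "gave up waiting for $pod" >&2
        return 1
      fi
      sleep "${RETRY_DELAY:-5}"
    done
  done
}
```

For example, pass the zookeeper pod names first and then run it again with the kafka pod names, so the order matches the steps above.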
If the pods still don’t all come up, we can remove the event streams network policy next:
Save the network policy files for kafka and zookeeper:
1. kubectl get netpol -o yaml $(kubectl get netpol | grep zookeeper | awk '{ print $1 }') > zookeeper-netpol.yaml
2. kubectl get netpol -o yaml $(kubectl get netpol | grep kafka | awk '{ print $1 }') > kafka-netpol.yaml
Delete the network policies for kafka and zookeeper:
3. kubectl delete netpol $(kubectl get netpol | grep zookeeper | awk '{ print $1 }')
4. kubectl delete netpol $(kubectl get netpol | grep kafka | awk '{ print $1 }')
5. Restart the zookeeper and kafka pods as described above
6. Run kubectl get pods to see status
Recreate the network policies for kafka and zookeeper:
7. kubectl apply -f zookeeper-netpol.yaml
8. kubectl apply -f kafka-netpol.yaml
I’m wondering if there is some problem with attaching large files to this issue, as I still can’t see them linked, so I’m going to suggest another route to supply the logs:
https://slack-invite-ibm-cloud-tech.mybluemix.net/ - fill in your email to request an invite to join our slack workspace.
Once the invite has come through go to the channel https://ibm-cloud-tech.slack.com/messages/CADFRM4FR/ and post in there referencing this issue. You should be able to upload the log file there.
Thanks
Andrew
Hi Edwin,
It's a good sign that the containers are coming up without the network policies in place.
The purpose of the network policies is to enforce restrictions on which pods can communicate with each other on particular ports. For example: the Kafka network policy has a rule that allows Kafka to send traffic to port 2181 of the Zookeeper pod, and similarly the Zookeeper network policy has a rule which allows it to receive traffic from the Kafka pod on port 2181.
These rules are put in place to prevent communication between pods that aren't designed to talk to each other, for security reasons.
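To make the Zookeeper side of that concrete, the sketch below prints an illustrative NetworkPolicy that admits traffic on port 2181 only from Kafka pods. It is not the policy Event Streams installs: the policy name and the component labels are hypothetical, chosen only to show the shape of such a rule.

```shell
#!/usr/bin/env bash
# Prints an illustrative NetworkPolicy manifest. The policy name and the
# "component" labels are hypothetical, not the ones Event Streams uses.
print_example_zookeeper_netpol() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-zookeeper-ingress
spec:
  # Applies to the Zookeeper pods.
  podSelector:
    matchLabels:
      component: zookeeper
  policyTypes:
    - Ingress
  ingress:
    # Allow TCP traffic to port 2181, but only from the Kafka pods.
    - from:
        - podSelector:
            matchLabels:
              component: kafka
      ports:
        - protocol: TCP
          port: 2181
EOF
}
```

If the labels in a policy's selectors do not match the labels actually set on the pods, the rule does not apply to them, which is why the pod labels are worth comparing against the policy when traffic is being blocked unexpectedly.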
It looks like the problem you are experiencing is that the network policies for Kafka and Zookeeper are incorrectly preventing them from communicating, since when you deleted the network policies all of the containers came up. This is unexpected behaviour so I'd like to investigate why this is occurring.
Please could you send the following:
- The output of a kubectl get netpol command
- The zookeeper-netpol.yaml and kafka-netpol.yaml files
- The output of a kubectl describe pod ZOOKEEPER_POD and a kubectl describe pod KAFKA_POD for one of each of the new pods
In the short term, you can decide whether you are happy to run without the Zookeeper and Kafka network policies in place. The consequence of this would be that the Kafka and Zookeeper pods can be contacted by other pods within your cluster.
Thanks,
Andrew
Hi Edwin
Thanks for the output of the kafka and zookeeper kubectl describe pod commands.
Could you also supply the contents of the two .yaml files used to save the network policy information, i.e. zookeeper-netpol.yaml and kafka-netpol.yaml?
It looks like something has gone awry with the network policy and we'll need these two files to start to understand what that might be.
Thank you!
Emma
Hi @egargaritano There's still something odd about how files get attached to this issue and I can't see them.
Could you copy and paste the text of the two network policy output files into the email response directly, as you did for the kubectl describe commands above? That way I can see the content.
Thanks
Emma
The network policies for Kafka, Zookeeper and the deny policy match my working system, the ports look correct, as do the container names and policy labels.
So, let's start back at the beginning.
- Disable SELinux
- Install a fresh Event Streams with a new release name
- Check the pods all come up
Let me know how you get on.
Thanks
Emma
Hi Edwin,
From the PMR ticket I saw, I think you're up and running.
I'm closing this off for now. Please do reopen if you have any further issues.
Thanks
Emma