
Comments (23)

egargaritano avatar egargaritano commented on August 14, 2024

from event-streams.

EmmaHumber avatar EmmaHumber commented on August 14, 2024

Hi Edwin

Thanks for getting in touch. From the logs you've sent in, it looks like this Kafka pod can't talk to at least one of the Zookeeper instances, even though the logs show that all three Zookeeper pods are up and running.

I can't tell from the information above why this might be. Could you run the script at

https://github.com/IBM/charts/blob/master/stable/ibm-eventstreams-dev/additionalFiles/get-logs.sh

and attach the file generated to this issue please?

Are you running SELinux, or standard Red Hat?

Can you ping one worker node from the other worker nodes? For example, ssh into 10.69.5.67 and see if you can ping 10.69.5.68 and 10.69.5.69.

Can you exec into the Kafka container:
kubectl exec -it KAFKA_POD_NAME -c kafka -- bash

and see if it can contact the zookeeper containers. There's no ping, so try nc:

nc -z 172.18.167.193 2181
nc -z 172.18.186.106 2181

Note that I can't work out the ip of the third zookeeper from the Kafka healthcheck logs - I'm assuming the Kafka healthcheck can communicate with this ZK, possibly because it is on the same node as the Kafka pod the logs above are from.
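
The two nc checks above can be wrapped in a small loop so that every address gets a clear pass/fail line. This is only a sketch: `probe_port` and `check_zookeepers` are made-up helper names, and it assumes the `nc` inside the Kafka container accepts both `-z` and `-w`.

```shell
# Hypothetical helpers, not part of Event Streams.

# probe_port HOST PORT: succeeds if a TCP connection can be opened.
# Assumes nc supports -z (connect only, send nothing) and -w (timeout, seconds).
probe_port() {
  nc -z -w 5 "$1" "$2"
}

# check_zookeepers HOST...: probe the Zookeeper client port on each host
# and print one result line per host.
check_zookeepers() {
  port=2181
  for host in "$@"; do
    if probe_port "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
}
```

Run from inside the Kafka container, `check_zookeepers 172.18.167.193 172.18.186.106` would report on the two addresses taken from the healthcheck logs.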

Thanks
Emma


egargaritano avatar egargaritano commented on August 14, 2024


EmmaHumber avatar EmmaHumber commented on August 14, 2024

Hi Edwin - did you save the script on a Windows machine first? It looks like there are carriage return ("\r") characters at the end of each line, which Linux won't be happy with.
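
If that is what happened, the carriage returns can be stripped on the Linux side before running the script. A minimal sketch, using only `tr` and a temporary file; `strip_cr` is a made-up helper name, and note that it removes every carriage return in the file, not only those at line ends:

```shell
# strip_cr FILE: rewrite FILE in place with all carriage returns removed.
# Hypothetical helper. Writing back with cat (rather than mv) keeps the
# original file's permissions, including any executable bit.
strip_cr() {
  tmp="$1.tmp.$$"
  tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1" && rm -f "$tmp"
}
```

For the script in question this would be `strip_cr get-logs.sh`.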

My suspicion is that there is a setting in the SELinux configuration that is preventing the pods from talking across nodes. Can you also attach your SELinux settings from /etc/selinux/config?

Is it possible for you to test with SELinux disabled to see if this resolves the problem? That will either rule SELinux configuration out, or highlight where to focus investigation.
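
As a rough aid for that check, the configured mode can be read straight out of the config file. This is a sketch with a made-up helper name, `selinux_config_mode`; it reports only what the file says, while `getenforce` reports the mode currently in effect. Bear in mind that `setenforce 0` switches a running system to permissive mode until the next reboot rather than disabling SELinux outright, which may or may not be enough for this test.

```shell
# selinux_config_mode [FILE]: print the value of the SELINUX= setting
# (enforcing, permissive or disabled). Hypothetical helper.
# Defaults to /etc/selinux/config; comment lines and SELINUXTYPE= are ignored.
selinux_config_mode() {
  sed -n 's/^SELINUX=\([a-z]*\).*/\1/p' "${1:-/etc/selinux/config}"
}
```

Running `selinux_config_mode` on each worker node, next to the output of `getenforce`, shows whether the file and the running system agree.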

Thanks
Emma


egargaritano avatar egargaritano commented on August 14, 2024


EmmaHumber avatar EmmaHumber commented on August 14, 2024

Hi Edwin - I can't see any new logs attached to this issue. Could you double-check that you added them? It looks like replying via email might not be including files.

Thanks
Emma


egargaritano avatar egargaritano commented on August 14, 2024


egargaritano avatar egargaritano commented on August 14, 2024


egargaritano avatar egargaritano commented on August 14, 2024


EmmaHumber avatar EmmaHumber commented on August 14, 2024

Hi - thanks for attempting to upload the docs again. Can you see the attachments on this issue? I can't see them unfortunately. I was wondering if you could try using the git web client to upload the files rather than email.

How did you get on with disabling SELinux to exclude this from what's going on?

Regards
Emma


egargaritano avatar egargaritano commented on August 14, 2024


egargaritano avatar egargaritano commented on August 14, 2024


egargaritano avatar egargaritano commented on August 14, 2024


andrewdunnings avatar andrewdunnings commented on August 14, 2024

Hi Edwin,

I'm replying on behalf of Emma:

Did disabling SELinux have any effect on the state of the Event Streams install? For example, is the UI still showing the brokers as unavailable? If there’s no difference in behavior, could you:

  1. confirm SELinux is disabled by running sestatus and checking the output, then
  2. run kubectl delete pod ZOOKEEPER_POD_NAME for each zookeeper in turn, followed by a kubectl delete pod KAFKA_POD_NAME for each kafka in turn, waiting for each pod to restart before deleting the next, then
  3. run kubectl get pods to confirm the status of each pod
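
The "one pod at a time" restart in step 2 can be sketched as a small loop. Everything here is an assumption for illustration: `restart_in_turn` is a made-up helper, it needs a kubectl recent enough to have `kubectl wait`, and it relies on the replacement pod coming back under the same name, which is how StatefulSet pods behave.

```shell
# Hypothetical helper; KUBECTL can be overridden, e.g. for a dry run.
KUBECTL=${KUBECTL:-kubectl}

# restart_in_turn POD...: delete each pod, then wait until its replacement
# reports Ready before moving on to the next one.
# Assumes the replacement pod keeps the same name (StatefulSet behaviour).
restart_in_turn() {
  for pod in "$@"; do
    "$KUBECTL" delete pod "$pod" || return 1
    tries=0
    # kubectl wait fails while the new pod does not exist yet, so retry
    # a limited number of times instead of giving up on the first error.
    until "$KUBECTL" wait --for=condition=Ready "pod/$pod" --timeout=60s; do
      tries=$((tries + 1))
      if [ "$tries" -ge 10 ]; then
        echo "gave up waiting for $pod" >&2
        return 1
      fi
      sleep 5
    done
  done
}
```

It would be called once with the Zookeeper pod names and then once with the Kafka pod names, followed by `kubectl get pods` as in step 3.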

If the pods still don’t all come up, we can remove the event streams network policy next:

Save the network policy files for kafka and zookeeper:

  1. kubectl get netpol -o yaml $(kubectl get netpol | grep zookeeper | awk '{ print $1 }') > zookeeper-netpol.yaml
  2. kubectl get netpol -o yaml $(kubectl get netpol | grep kafka | awk '{ print $1 }') > kafka-netpol.yaml

Delete the network policies for kafka and zookeeper:

  3. kubectl delete netpol $(kubectl get netpol | grep zookeeper | awk '{ print $1 }')
  4. kubectl delete netpol $(kubectl get netpol | grep kafka | awk '{ print $1 }')

  5. Restart the zookeeper and kafka pods as described above
  6. Run kubectl get pods to see status

Recreate the network policies for kafka and zookeeper:

  7. kubectl apply -f zookeeper-netpol.yaml
  8. kubectl apply -f kafka-netpol.yaml
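
Since the saved files are the only way back, it may be worth refusing to delete anything unless the backup was actually written. The following is a sketch of the save-then-delete steps with that guard added; `backup_then_delete_netpol` is a made-up helper name, and it matches policy names with awk instead of grep.

```shell
# Hypothetical helper; KUBECTL can be overridden, e.g. for a dry run.
KUBECTL=${KUBECTL:-kubectl}

# backup_then_delete_netpol PATTERN FILE
# Saves every network policy whose name matches PATTERN to FILE as YAML,
# and deletes those policies only if FILE ends up non-empty.
backup_then_delete_netpol() {
  names=$("$KUBECTL" get netpol --no-headers | awk -v p="$1" '$1 ~ p { print $1 }')
  if [ -z "$names" ]; then
    echo "no network policy matches $1" >&2
    return 1
  fi
  # $names is deliberately unquoted so each policy name is its own argument.
  "$KUBECTL" get netpol -o yaml $names > "$2" || return 1
  if [ ! -s "$2" ]; then
    echo "backup $2 is empty, not deleting" >&2
    return 1
  fi
  "$KUBECTL" delete netpol $names
}
```

For example, `backup_then_delete_netpol zookeeper zookeeper-netpol.yaml` followed by `backup_then_delete_netpol kafka kafka-netpol.yaml`; the policies are recreated later with `kubectl apply -f` on the same files.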

I’m wondering if there is some issue attaching large files to this issue as I still can’t see them linked, so I’m going to suggest another route to supply the logs:

https://slack-invite-ibm-cloud-tech.mybluemix.net/ - fill in your email to request an invite to join our slack workspace.

Once the invite has come through go to the channel https://ibm-cloud-tech.slack.com/messages/CADFRM4FR/ and post in there referencing this issue. You should be able to upload the log file there.

Thanks
Andrew


egargaritano avatar egargaritano commented on August 14, 2024


andrewdunnings avatar andrewdunnings commented on August 14, 2024

Hi Edwin,

It's a good sign that the containers are coming up without the network policies in place.

The purpose of the network policies is to enforce restrictions on which pods can communicate with each other on particular ports. For example: the Kafka network policy has a rule that allows Kafka to send traffic to port 2181 of the Zookeeper pod, and similarly the Zookeeper network policy has a rule which allows it to receive traffic from the Kafka pod on port 2181.
These rules are put in place to prevent communication between pods that aren't designed to talk to each other, for security reasons.
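
To make the shape of such a rule concrete, here is an illustrative ingress policy of the kind described, printed from a small shell function. It is not the policy that Event Streams installs: the policy name and the `component` labels are invented for this sketch, and the real policies use their own names and labels.

```shell
# print_zk_ingress_policy: emit an illustrative NetworkPolicy manifest.
# The name and the "component" labels are made up for this example.
print_zk_ingress_policy() {
  cat <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: example-zookeeper-ingress
spec:
  # The pods this policy protects.
  podSelector:
    matchLabels:
      component: zookeeper
  policyTypes:
    - Ingress
  ingress:
    # Only pods labelled as kafka may connect, and only on port 2181.
    - from:
        - podSelector:
            matchLabels:
              component: kafka
      ports:
        - protocol: TCP
          port: 2181
EOF
}
```

If the labels in the selectors do not match the labels actually present on the pods, traffic that should be allowed is dropped, which is one reason the describe output and the policy YAML are useful to compare side by side.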

It looks like the problem you are experiencing is that the network policies for Kafka and Zookeeper are incorrectly preventing them from communicating, since when you deleted the network policies all of the containers came up. This is unexpected behaviour so I'd like to investigate why this is occurring.

Please could you send the following:

  • Output of a kubectl get netpol command
  • The zookeeper-netpol.yaml and kafka-netpol.yaml files
  • The output of a kubectl describe pod ZOOKEEPER_POD and kubectl describe pod KAFKA_POD of one of each of the new pods

In the short term, you can decide whether you are happy to run without the Zookeeper and Kafka network policies in place. The consequence of this would be that the Kafka and Zookeeper pods can be contacted by other pods within your cluster.

Thanks,
Andrew


egargaritano avatar egargaritano commented on August 14, 2024


EmmaHumber avatar EmmaHumber commented on August 14, 2024

Hi Edwin

Thanks for the output of the kafka and zookeeper kubectl describe pod commands.

Could you also supply the contents of the two .yaml files used to save the network policy information, i.e. zookeeper-netpol.yaml and kafka-netpol.yaml?

It looks like something has gone awry with the network policy and we'll need these two files to start to understand what that might be.

Thank you!
Emma


egargaritano avatar egargaritano commented on August 14, 2024


EmmaHumber avatar EmmaHumber commented on August 14, 2024

Hi @egargaritano, there's still something odd about how files get attached to this issue and I can't see them.

Could you copy and paste the text of the two network policy output files directly into the email response, as you did for the kubectl describe commands above? That way I can see the content.

Thanks
Emma


EmmaHumber avatar EmmaHumber commented on August 14, 2024

Hi @egargaritano

The network policies for Kafka, Zookeeper and the deny policy match my working system, the ports look correct, as do the container names and policy labels.

So, let's start back at the beginning.

  1. Disable SELinux
  2. Install a fresh Event Streams with a new release name
  3. Check the pods all come up

Let me know how you get on.

Thanks
Emma


egargaritano avatar egargaritano commented on August 14, 2024


EmmaHumber avatar EmmaHumber commented on August 14, 2024

Hi Edwin,

From the PMR ticket I saw, I think you're up and running.

I'm closing this off for now, please do reopen if you have any further issues.

Thanks
Emma

