Comments (8)

rolinh commented on June 2, 2024

Any idea ?

There's a new bit of information: the hubble-relay service runs on port 443. This means you enabled TLS for the Relay service so you will need to enforce TLS using the Hubble CLI:

$ hubble help observe | grep -i tls
      --tls                           Specify that TLS must be used when establishing a connection to a Hubble server.
                                      By default, TLS is only enabled if the server address starts with 'tls://'.
      --tls-allow-insecure            Allows the client to skip verifying the server's certificate chain and host name.
                                      This option is NOT recommended as, in this mode, TLS is susceptible to machine-in-the-middle attacks.
                                      See also the 'tls-server-name' option which allows setting the server name.
      --tls-ca-cert-files strings     Paths to custom Certificate Authority (CA) certificate files.The files must contain PEM encoded data.
      --tls-client-cert-file string   Path to the public key file for the client certificate to connect to a Hubble server (implies TLS).
      --tls-client-key-file string    Path to the private key file for the client certificate to connect a Hubble server (implies TLS).
      --tls-server-name string        Specify a server name to verify the hostname on the returned certificate (eg: 'instance.hubble-relay.cilium.io').
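
As a rough sketch of how the port and the 'tls://' prefix relate (not an official recipe): the helper below builds a server address and adds the 'tls://' prefix when the Relay service port is 443. The function name relay_server_addr is hypothetical, and the "443 means TLS" rule is only the heuristic stated in this comment. The command is printed rather than executed, so the snippet runs without a cluster.

```shell
# Hypothetical helper: build a Hubble server address from a host and a port.
# Assumption (taken from the comment above): a Relay service on port 443
# means TLS is enabled, so the address gets the 'tls://' prefix.
relay_server_addr() {
  host=$1
  port=$2
  if [ "$port" = "443" ]; then
    printf 'tls://%s:%s\n' "$host" "$port"
  else
    printf '%s:%s\n' "$host" "$port"
  fi
}

# Dry run: print the resulting command instead of executing it.
echo "hubble observe --server $(relay_server_addr hubble-relay.kube-system 443)"
```

With a 'tls://' address (or --tls) the client still verifies the server certificate, so either --tls-ca-cert-files or, for testing only, --tls-allow-insecure is likely needed as well.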

from hubble.

ledroide commented on June 2, 2024

Does anyone have an idea for a workaround to use Hubble?
Since it looks like a port-forwarding issue, how can I run "hubble observe" without port-forwarding?

rolinh commented on June 2, 2024

What you can try doing is accessing the Hubble Relay service via the Hubble CLI (e.g. from within a Cilium agent pod) and check if everything is fine there (e.g. hubble status against the Hubble Relay service).

ledroide commented on June 2, 2024

What you can try doing is accessing the Hubble Relay service via the Hubble CLI (e.g. from within a Cilium agent pod) and check if everything is fine there (e.g. hubble status against the Hubble Relay service).

Many thanks @rolinh for your suggestion. This is a good workaround.

$ kubectl exec -ti ds/cilium -c cilium-agent -- hubble observe --since 10m --follow --namespace kube-system
Feb  2 15:47:11.048: 10.233.68.120:42736 (remote-node) <> kube-system/metrics-server-6dbb566f54-n9j4g:10250 (ID:51921) to-overlay FORWARDED (TCP Flags: ACK, PSH)
Feb  2 15:47:19.722: 10.233.68.120:44872 (host) <- kube-system/dns-autoscaler-8576bb9f5b-n4vh7:8080 (ID:9883) to-stack FORWARDED (TCP Flags: SYN, ACK)

ledroide commented on June 2, 2024

Edit: the trick does not work as expected. The Hubble CLI in a cilium pod only observes workloads that run on the same node.

if running hubble observe on a cilium pod running on worker-1, hubble will only observe pods running on worker-1.

rolinh commented on June 2, 2024

if running hubble observe on a cilium pod running on worker-1, hubble will only observe pods running on worker-1.

Yes, by default the Hubble CLI on the pod queries the local Hubble server. However, you can point it to the hubble-relay service instead, either with the --server flag or with the HUBBLE_SERVER environment variable. So from a pod, you want to do something like:

HUBBLE_SERVER=hubble-relay.kube-system:80 hubble observe

ledroide commented on June 2, 2024

Current status :

  • Nothing has changed after upgrading cilium to v1.15.1
  • cilium hubble port-forward still leads to "error reading server preface: EOF" when using hubble locally from my workstation
  • kubectl exec (the first trick from @rolinh) works only if we target a cilium pod that is running on a worker node
  • but hubble does not see any traffic if kubectl exec targets a cilium pod on a master node
  • HUBBLE_SERVER=hubble-relay.kube-system:80 (the second trick from @rolinh) does not solve anything

Let's dig a bit further. Here are the cilium pods running on 3 masters and 3 workers :

$ kubectl get pod -n kube-system -l k8s-app=cilium -o wide
NAME           READY   STATUS    RESTARTS   AGE   IP               NODE             NOMINATED NODE   READINESS GATES
cilium-2q56l   1/1     Running   0          11h   100.94.2.140     k8ststworker-3   <none>           <none>
cilium-5wdv5   1/1     Running   0          11h   100.94.2.45      k8ststmaster-1   <none>           <none>
cilium-lsgfm   1/1     Running   0          11h   100.182.210.97   k8ststworker-2   <none>           <none>
cilium-vnllq   1/1     Running   0          11h   100.94.2.86      k8ststmaster-2   <none>           <none>
cilium-w7lph   1/1     Running   0          11h   100.94.2.34      k8ststmaster-3   <none>           <none>
cilium-z65zm   1/1     Running   0          11h   100.94.2.101     k8ststworker-1   <none>           <none>

The first trick works if I choose a worker node:

$ kubectl exec -ti pod/cilium-2q56l -c cilium-agent -n kube-system -- hubble observe --since 1m --namespace sxxxxxxl -l section=wxxxxxr
Feb 20 06:57:25.239: sxxxxxxl/zincsearch-0:43618 (ID:63357) -> 74.192.137.83:443 (world) to-stack FORWARDED (TCP Flags: ACK, PSH)
Feb 20 06:57:25.239: sxxxxxxl/zincsearch-0:43618 (ID:63357) -> 74.192.137.83:443 (world) to-stack FORWARDED (TCP Flags: ACK, FIN)
Feb 20 06:57:25.243: sxxxxxxl/zincsearch-0:43618 (ID:63357) <- 74.192.137.83:443 (world) to-endpoint FORWARDED (TCP Flags: ACK)
Feb 20 06:57:25.243: sxxxxxxl/zincsearch-0:43618 (ID:63357) <- 74.192.137.83:443 (world) to-endpoint FORWARDED (TCP Flags: ACK, FIN)
Feb 20 06:57:25.243: sxxxxxxl/zincsearch-0:43618 (ID:63357) -> 74.192.137.83:443 (world) to-stack FORWARDED (TCP Flags: ACK)
Feb 20 06:57:28.645: 10.233.65.97:43796 (host) -> sxxxxxxl/zincsearch-0:4080 (ID:63357) policy-verdict:L3-Only INGRESS ALLOWED (TCP Flags: SYN)
Feb 20 06:57:28.645: 10.233.65.97:43796 (host) -> sxxxxxxl/zincsearch-0:4080 (ID:63357) to-endpoint FORWARDED (TCP Flags: SYN)

Same command with exec to a master node -> no answer, no line displayed, hubble looks blind.

$ kubectl exec -ti pod/cilium-vnllq -c cilium-agent -n kube-system -- hubble observe --since 1m --namespace sxxxxxxl -l section=wxxxxxr

Now we try the variable HUBBLE_SERVER=hubble-relay.kube-system:80 from a worker node -> "connection refused"

$ kubectl exec -ti pod/cilium-2q56l -c cilium-agent -n kube-system -- bash
root@k8ststworker-3:/home/cilium# HUBBLE_SERVER=hubble-relay.kube-system:80 hubble observe --since 1m --namespace sxxxxxxl -l section=wxxxxxr
failed to connect to 'hubble-relay.kube-system:80': connection error: desc = "transport: error while dialing: dial tcp 10.233.3.157:80: connect: connection refused"
root@k8ststworker-3:/home/cilium# host hubble-relay.kube-system
bash: host: command not found
root@k8ststworker-3:/home/cilium# getent hosts hubble-relay.kube-system
10.233.3.157    hubble-relay.kube-system.svc.cluster.local
root@k8ststworker-3:/home/cilium# 
exit

service/hubble-relay listens on 443, not 80. Note that the hubble-relay pod has been scheduled on a worker node:

$ kubectl get svc,ep,pod -l k8s-app=hubble-relay -n kube-system -o wide
NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE   SELECTOR
service/hubble-relay           ClusterIP   10.233.3.157   <none>        443/TCP    42d   k8s-app=hubble-relay
service/hubble-relay-metrics   ClusterIP   None           <none>        9966/TCP   42d   k8s-app=hubble-relay

NAME                             ENDPOINTS            AGE
endpoints/hubble-relay           10.233.65.168:4245   42d
endpoints/hubble-relay-metrics   <none>               42d

NAME                               READY   STATUS    RESTARTS   AGE   IP              NODE             NOMINATED NODE   READINESS GATES
pod/hubble-relay-b4df78f74-p2rhd   1/1     Running   0          11h   10.233.65.168   k8ststworker-3   <none>           <none>
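
A small sketch of reading the service port from this kind of output, so the right port can be passed to HUBBLE_SERVER. The helper name relay_service_port is hypothetical, and the sample input is copied from the listing above so the snippet runs without a cluster.

```shell
# Hypothetical helper: read `kubectl get svc` output on stdin and print the
# port of the hubble-relay service (the number before "/TCP" in PORT(S)).
relay_service_port() {
  awk '$1 == "service/hubble-relay" || $1 == "hubble-relay" {
         split($5, parts, "/")   # "443/TCP" -> parts[1] = "443"
         print parts[1]
       }'
}

# Sample input copied from the output above; in a live cluster, pipe
# `kubectl get svc -l k8s-app=hubble-relay -n kube-system` into the helper.
port=$(relay_service_port <<'EOF'
NAME                           TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE   SELECTOR
service/hubble-relay           ClusterIP   10.233.3.157   <none>        443/TCP    42d   k8s-app=hubble-relay
service/hubble-relay-metrics   ClusterIP   None           <none>        9966/TCP   42d   k8s-app=hubble-relay
EOF
)
echo "hubble-relay service port: $port"
```

On a live cluster the same value can be read directly with kubectl get svc hubble-relay -n kube-system -o jsonpath='{.spec.ports[0].port}'.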

So let's try port 443 from the same worker node :

$ kubectl exec -ti pod/cilium-2q56l -c cilium-agent -n kube-system -- bash
root@k8ststworker-3:/home/cilium# HUBBLE_SERVER=hubble-relay.kube-system:443 hubble observe --since 1m --namespace sxxxxxxl -l section=wxxxxxr
failed to connect to 'hubble-relay.kube-system:443': context deadline exceeded: connection error: desc = "error reading server preface: EOF"

Conclusions :

  • At first I suspected a network issue between my workstation and the cluster that would prevent port-forwarding for some ports, even though port-forwarding works fine with all other services in other namespaces of the same cluster
  • But now, after testing further, I can see the issue from inside the cluster

Any idea ?

ledroide commented on June 2, 2024

Issue is SOLVED thanks to @rolinh. It was a TLS issue.

$ cilium hubble port-forward &
$ hubble config set tls true
$ hubble config set tls-allow-insecure true
$ 
$ cat ~/.config/hubble/config.yaml
tls: true
tls-allow-insecure: true
$ 
$ hubble status
Healthcheck (via localhost:4245): Ok
Current/Max Flows: 24,570/24,570 (100.00%)
Flows/s: 51.15
Connected Nodes: 6/6
$ 
$ hubble observe -n trivy-system --since 1m
Feb 20 14:05:47.862: trivy-system/trivy-operator-69cff49598-kwvgq:44670 (ID:3610) <- 100.94.2.25:443 (kube-apiserver) to-endpoint FORWARDED (TCP Flags: ACK, PSH)
Feb 20 14:05:47.862: trivy-system/trivy-operator-69cff49598-kwvgq:44670 (ID:3610) -> 100.94.2.25:6443 (kube-apiserver) to-stack FORWARDED (TCP Flags: ACK)
Feb 20 14:05:49.827: 10.233.70.129:49776 (host) -> trivy-system/trivy-operator-69cff49598-kwvgq:9090 (ID:3610) to-endpoint FORWARDED (TCP Flags: SYN)
Feb 20 14:05:49.827: 10.233.70.129:49776 (host) <- trivy-system/trivy-operator-69cff49598-kwvgq:9090 (ID:3610) to-stack FORWARDED (TCP Flags: SYN, ACK)
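
Since the help text marks tls-allow-insecure as NOT recommended, here is a hedged sketch of a verified alternative that uses only flags listed by "hubble help observe". The CA file path, the server name, and the helper name verified_tls_args are assumptions for illustration; the real values depend on how the Relay certificates were issued in your cluster. The command is printed rather than executed.

```shell
# Hypothetical values; adjust both to your cluster.
CA_FILE="${CA_FILE:-./hubble-ca.crt}"   # assumed: CA certificate exported beforehand
SERVER_NAME="${SERVER_NAME:-instance.hubble-relay.cilium.io}"   # example name from the help text

# Hypothetical helper: build the flags for a TLS connection that verifies
# the server certificate against a CA file ($1) and a server name ($2).
verified_tls_args() {
  printf '%s\n' "--tls --tls-ca-cert-files $1 --tls-server-name $2"
}

# Dry run: print the resulting command instead of executing it.
echo "hubble status $(verified_tls_args "$CA_FILE" "$SERVER_NAME")"
```

If this works, tls-allow-insecure can be dropped from ~/.config/hubble/config.yaml.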
