
Comments (17)

currycan commented on September 17, 2024

@chenchun I ran into this problem when using a floating IP: the pod's health check fails.
The deployment is:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-floatingip
spec:
  strategy:
    type: Recreate
  replicas: 3
  selector:
    matchLabels:
      app: nginx-floatingip
  template:
    metadata:
      name: nginx-floatingip
      labels:
        app: nginx-floatingip
      annotations:
        k8s.v1.cni.cncf.io/networks: "galaxy-k8s-vlan"
        k8s.v1.cni.galaxy.io/release-policy: "immutable"
    spec:
      tolerations:
        - operator: "Exists"
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
          - name: http-80
            containerPort: 80
        resources:
          requests:
            cpu: "0.1"
            memory: "32Mi"
            tke.cloud.tencent.com/eni-ip: "1"
          limits:
            cpu: "0.1"
            memory: "32Mi"
            tke.cloud.tencent.com/eni-ip: "1"
        livenessProbe:
          # httpGet:
          #   path: /
          #   port: 80
          #   scheme: HTTP
          tcpSocket:
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
          timeoutSeconds: 1
        readinessProbe:
          # httpGet:
          #   path: /
          #   port: 80
          #   scheme: HTTP
          tcpSocket:
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 2
          failureThreshold: 3
          timeoutSeconds: 1

The pod keeps restarting because the health check probe fails.
These are the events from the pod description:

  Warning  FailedScheduling  82s                default-scheduler  deployment nginx-floatingip has allocated 3 ips with replicas of 3, wait for releasing
  Warning  FailedScheduling  82s                default-scheduler  deployment nginx-floatingip has allocated 3 ips with replicas of 3, wait for releasing
  Normal   Scheduled         78s                default-scheduler  Successfully assigned default/nginx-floatingip-5cdcd7bcbd-6ql2x to 10.177.140.18
  Warning  Unhealthy         16s (x3 over 36s)  kubelet            Liveness probe failed: dial tcp 10.177.140.44:80: i/o timeout

from galaxy.

chenchun commented on September 17, 2024

This issue is about galaxy-ipam liveness and readiness gates.
Can you provide more information? Can you ping the pod IP from the host network? Can you curl the pod's port from inside the pod?


currycan commented on September 17, 2024

@chenchun
Both ping and curl succeed from inside the pod:

[root@k8s-master-01 ~]# kubectl get po -o wide
NAME                                READY   STATUS    RESTARTS   AGE    IP              NODE            NOMINATED NODE   READINESS GATES
nginx-floatingip-c895bbb7f-hs9bk    1/1     Running   1          2d     10.177.140.46   10.177.140.16   <none>           <none>
nginx-floatingip-c895bbb7f-tkl8j    1/1     Running   0          2d     10.177.140.53   10.177.140.18   <none>           <none>
nginx-floatingip-c895bbb7f-tplc9    1/1     Running   1          2d     10.177.140.44   10.177.140.16   <none>           <none>
[root@k8s-master-01 ~]# kubectl exec -it nginx-floatingip-c895bbb7f-hs9bk -- sh
/ # ping 10.177.140.46
PING 10.177.140.46 (10.177.140.46): 56 data bytes
64 bytes from 10.177.140.46: seq=0 ttl=64 time=0.046 ms
64 bytes from 10.177.140.46: seq=1 ttl=64 time=0.070 ms
^C
--- 10.177.140.46 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.046/0.058/0.070 ms
/ # curl 10.177.140.46
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
/ #

But both ping and curl fail when run from the node the pod is scheduled on. For example, the pod nginx-floatingip-c895bbb7f-hs9bk is scheduled on node 10.177.140.16, and both ping and curl from that node to the pod fail.


chenchun commented on September 17, 2024

May I ask what the underlay network is? Is it a VPC or an IDC network?
Many VPC networks drop packets from unknown MAC addresses. The galaxy-k8s-vlan CNI connects pods to the host via a veth pair, so pods have their own MAC addresses.


currycan commented on September 17, 2024

Yes, I use galaxy-k8s-vlan on an IDC underlay network. My galaxy.json is:

    {
      "NetworkConf":[
        {"name":"tke-route-eni","type":"tke-route-eni","eni":"eth1","routeTable":1},
        {"name":"galaxy-flannel","type":"galaxy-flannel", "delegate":{"type":"galaxy-veth"},"subnetFile":"/run/flannel/subnet.env"},
        {"name":"galaxy-k8s-vlan","type":"galaxy-k8s-vlan", "device":"ens192", "switch":"ipvlan", "ipvlan_mode":"l2"},
        {"name":"galaxy-k8s-sriov","type": "galaxy-k8s-sriov", "device": "ens192", "vf_num": 10}
      ],
      "DefaultNetworks": ["galaxy-flannel"],
      "ENIIPNetwork": "galaxy-k8s-vlan"
    }

How do I create a veth pair in the pod when using ipvlan mode? This has been troubling me a lot. I have already turned on promiscuous mode on the host.


chenchun commented on September 17, 2024

Do you mean the pod is unreachable when pinged from another node, or when pinged from another pod on the same node?


currycan commented on September 17, 2024

The pod is unreachable when pinged from the node it is running on.


chenchun commented on September 17, 2024

moby/moby#21735 (comment) @currycan

Note: In both Macvlan and Ipvlan you are not able to ping or communicate with the default namespace IP address. For example, if you create a container and try to ping the Docker host's eth0 it will not work. That traffic is explicitly filtered by the kernel modules themselves to offer additional provider isolation and security.

The default namespace is not reachable per ipvlan design in order to isolate container namespaces from the underlying host.


currycan commented on September 17, 2024

@chenchun
If the pod's livenessProbe and readinessProbe cannot work whenever a floating IP is used, that is a serious problem.
I found some other information at: https://hansedong.github.io/2019/03/19/14/
But how do I create another veth pair in the pod, as in this configuration:

{
    "name": "cni0",
    "cniVersion": "0.3.1",
    "plugins": [
        {
            "nodename": "k8s-node-2",
            "name": "myipvlan",
            "type": "ipvlan",
            "debug": true,
            "master": "eth0",
            "mode": "l2",
            "ipam": {
                "type": "host-local",
                "subnet": "172.18.12.0/24",
                "rangeStart": "172.18.12.211",
                "rangeEnd": "172.18.12.230",
                "gateway": "172.18.12.1",
                "routes": [
                    {
                        "dst": "0.0.0.0/0"
                    }
                ]
            }
        },
        {
            "name": "ptp",
            "type": "unnumbered-ptp",
            "hostInterface": "eth0",
            "containerInterface": "veth0",
            "ipMasq": true
        }
    ]
}


chenchun commented on September 17, 2024

@currycan I would suggest using galaxy-underlay-veth, which is based on proxy_arp, instead of ipvlan.
It's the ideal solution: livenessProbe, readinessProbe and Kubernetes services all work.
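
For illustration, a minimal sketch of how the Deployment from the first comment might select that network. It assumes galaxy.json contains a NetworkConf entry named "galaxy-underlay-veth"; that name and the fields such an entry needs are not shown in this thread, so treat both as assumptions and check the galaxy configuration docs.

```yaml
# Fragment of the nginx-floatingip Deployment; only the network annotation changes.
spec:
  template:
    metadata:
      annotations:
        # Assumption: "galaxy-underlay-veth" is the name of a NetworkConf entry in galaxy.json.
        k8s.v1.cni.cncf.io/networks: "galaxy-underlay-veth"
        k8s.v1.cni.galaxy.io/release-policy: "immutable"
```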


currycan commented on September 17, 2024

@chenchun
I changed the mode to galaxy-underlay-veth, and the probes now work well.
But something still seems wrong with the network: domain names can't be resolved inside the pod:

/ # nslookup cloud.tencent.com
;; connection timed out; no servers could be reached

/ # cat /etc/resolv.conf
nameserver 172.31.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.177.143.254  0.0.0.0         UG    0      0        0 eth0
10.177.140.0    0.0.0.0         255.255.252.0   U     0      0        0 eth0


chenchun commented on September 17, 2024

Can you try pinging 172.31.0.10, and also try pinging the coredns pod IP directly?
Is your coredns pod using the flannel network? Does the flannel network still work between these two hosts?


chenchun commented on September 17, 2024

I also suggest trying to run coredns with host network, which is simpler and more reliable.
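
As a rough sketch, host networking is enabled through the standard `hostNetwork` field of the pod spec. The Deployment name `coredns` and the `kube-system` namespace below are assumptions, since the thread does not show the actual manifest.

```yaml
# Fragment of a CoreDNS Deployment (name and namespace are assumed).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  template:
    spec:
      # Run the pod in the node's network namespace instead of a CNI-provided one.
      hostNetwork: true
```

With this setting CoreDNS listens on the node's own interfaces, so its port must be free on every node it is scheduled to.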


currycan commented on September 17, 2024

CoreDNS and flannel are both running, and CoreDNS uses the flannel CNI. Both the CoreDNS cluster IP and its pod IP are reachable by ping.
And if I run CoreDNS with host network, do I still need to create a Service for it?


currycan commented on September 17, 2024

@chenchun I tested this for a long time and finally found that the problem was the dnsPolicy setting of the CoreDNS Deployment. The value of dnsPolicy must be "Default".
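
For reference, a minimal sketch of where that setting lives. The Deployment name `coredns` and the `kube-system` namespace are assumptions, as the actual manifest is not shown in this thread. Kubernetes spells the value `Default` with a capital D; it makes the pod inherit the node's DNS resolver configuration rather than the cluster DNS service.

```yaml
# Fragment of a CoreDNS Deployment (name and namespace are assumed).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
spec:
  template:
    spec:
      # "Default" inherits name resolution from the node the pod runs on.
      dnsPolicy: Default
```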


chenchun commented on September 17, 2024

So, everything is working now?


currycan commented on September 17, 2024

@chenchun Yes, thank you very much!
