Comments (17)
@chenchun I ran into this problem when using a floating IP: the pod's health checks fail.
The deployment is:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-floatingip
spec:
  strategy:
    type: Recreate
  replicas: 3
  selector:
    matchLabels:
      app: nginx-floatingip
  template:
    metadata:
      name: nginx-floatingip
      labels:
        app: nginx-floatingip
      annotations:
        k8s.v1.cni.cncf.io/networks: "galaxy-k8s-vlan"
        k8s.v1.cni.galaxy.io/release-policy: "immutable"
    spec:
      tolerations:
      - operator: "Exists"
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - name: http-80
          containerPort: 80
        resources:
          requests:
            cpu: "0.1"
            memory: "32Mi"
            tke.cloud.tencent.com/eni-ip: "1"
          limits:
            cpu: "0.1"
            memory: "32Mi"
            tke.cloud.tencent.com/eni-ip: "1"
        livenessProbe:
          # httpGet:
          #   path: /
          #   port: 80
          #   scheme: HTTP
          tcpSocket:
            port: 80
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
          timeoutSeconds: 1
        readinessProbe:
          # httpGet:
          #   path: /
          #   port: 80
          #   scheme: HTTP
          tcpSocket:
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 5
          successThreshold: 2
          failureThreshold: 3
          timeoutSeconds: 1
The pod keeps restarting because the health check probes fail.
These are the events from the pod description:
Warning FailedScheduling 82s default-scheduler deployment nginx-floatingip has allocated 3 ips with replicas of 3, wait for releasing
Warning FailedScheduling 82s default-scheduler deployment nginx-floatingip has allocated 3 ips with replicas of 3, wait for releasing
Normal Scheduled 78s default-scheduler Successfully assigned default/nginx-floatingip-5cdcd7bcbd-6ql2x to 10.177.140.18
Warning Unhealthy 16s (x3 over 36s) kubelet Liveness probe failed: dial tcp 10.177.140.44:80: i/o timeout
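The kubelet runs a tcpSocket probe from the node, so the timeout in the last event can be reproduced by hand with a plain TCP connect from the node to the pod IP. The following is a minimal sketch, not kubelet code: `tcp_probe` is a hypothetical helper name, and the 1-second timeout mirrors `timeoutSeconds: 1` from the deployment.

```python
import socket


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Pod IP and port taken from the "Liveness probe failed" event.
    print(tcp_probe("10.177.140.44", 80))
```

Running it on the node that hosts the pod and again on a different node should show whether the failure is specific to node-to-local-pod traffic.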
from galaxy.
This issue is about galaxy-ipam liveness and readiness gates.
Can you provide more information? Can you ping the pod ip from host network? Can you curl the pod port from inside the pod?
from galaxy.
@chenchun
Ping and curl both succeed from inside the pod:
[root@k8s-master-01 ~]# kubectl get po -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP              NODE            NOMINATED NODE   READINESS GATES
nginx-floatingip-c895bbb7f-hs9bk   1/1     Running   1          2d    10.177.140.46   10.177.140.16   <none>           <none>
nginx-floatingip-c895bbb7f-tkl8j   1/1     Running   0          2d    10.177.140.53   10.177.140.18   <none>           <none>
nginx-floatingip-c895bbb7f-tplc9   1/1     Running   1          2d    10.177.140.44   10.177.140.16   <none>           <none>
[root@k8s-master-01 ~]# kubectl exec -it nginx-floatingip-c895bbb7f-hs9bk -- sh
/ # ping 10.177.140.46
PING 10.177.140.46 (10.177.140.46): 56 data bytes
64 bytes from 10.177.140.46: seq=0 ttl=64 time=0.046 ms
64 bytes from 10.177.140.46: seq=1 ttl=64 time=0.070 ms
^C
--- 10.177.140.46 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.046/0.058/0.070 ms
/ # curl 10.177.140.46
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
}
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>
<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>
<p><em>Thank you for using nginx.</em></p>
</body>
</html>
/ #
But ping and curl to a pod both fail when run from the node the pod is scheduled on. For example, the pod nginx-floatingip-c895bbb7f-hs9bk is scheduled on node 10.177.140.16, and both ping and curl to it fail from that node.
from galaxy.
May I ask what the underlay network is? Is it a VPC or an IDC network?
Many VPC networks drop packets from unknown MAC addresses. If you use the galaxy-k8s-vlan CNI, it connects pods to the host via a veth pair, so pods have their own MAC addresses.
from galaxy.
Yes, I use galaxy-k8s-vlan on an IDC underlay network. My galaxy.json is:
{
  "NetworkConf":[
    {"name":"tke-route-eni","type":"tke-route-eni","eni":"eth1","routeTable":1},
    {"name":"galaxy-flannel","type":"galaxy-flannel", "delegate":{"type":"galaxy-veth"},"subnetFile":"/run/flannel/subnet.env"},
    {"name":"galaxy-k8s-vlan","type":"galaxy-k8s-vlan", "device":"ens192", "switch":"ipvlan", "ipvlan_mode":"l2"},
    {"name":"galaxy-k8s-sriov","type": "galaxy-k8s-sriov", "device": "ens192", "vf_num": 10}
  ],
  "DefaultNetworks": ["galaxy-flannel"],
  "ENIIPNetwork": "galaxy-k8s-vlan"
}
What puzzles me is how to create a veth pair in the pod when using ipvlan mode. I have already turned on promiscuous mode on the host.
from galaxy.
Do you mean that pinging a pod on another node fails, or that pinging another pod on the same node fails?
from galaxy.
When a pod runs on a node, that node cannot ping the pod.
from galaxy.
moby/moby#21735 (comment) @currycan
Note: In both Macvlan and Ipvlan you are not able to ping or communicate with the default namespace IP address. For example, if you create a container and try to ping the Docker host's eth0 it will not work. That traffic is explicitly filtered by the kernel modules themselves to offer additional provider isolation and security.
The default namespace is not reachable per ipvlan design in order to isolate container namespaces from the underlying host.
from galaxy.
@chenchun
With a floating IP, the pod's livenessProbe and readinessProbe cannot work, which is a serious problem.
I found some more information at: https://hansedong.github.io/2019/03/19/14/
But how do I create another veth pair in the pod, like this:
{
  "name": "cni0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "nodename": "k8s-node-2",
      "name": "myipvlan",
      "type": "ipvlan",
      "debug": true,
      "master": "eth0",
      "mode": "l2",
      "ipam": {
        "type": "host-local",
        "subnet": "172.18.12.0/24",
        "rangeStart": "172.18.12.211",
        "rangeEnd": "172.18.12.230",
        "gateway": "172.18.12.1",
        "routes": [
          {
            "dst": "0.0.0.0/0"
          }
        ]
      }
    },
    {
      "name": "ptp",
      "type": "unnumbered-ptp",
      "hostInterface": "eth0",
      "containerInterface": "veth0",
      "ipMasq": true
    }
  ]
}
from galaxy.
@currycan I would suggest using galaxy-underlay-veth, which is based on proxy_arp, instead of ipvlan.
It's the ideal solution: livenessProbe, readinessProbe, and Kubernetes services all work.
from galaxy.
@chenchun
I switched to galaxy-underlay-veth, and the probes now work.
But something still seems wrong with the network: domain names can't be resolved in the pod:
/ # nslookup cloud.tencent.com
;; connection timed out; no servers could be reached
/ # cat /etc/resolv.conf
nameserver 172.31.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
/ # route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.177.143.254  0.0.0.0         UG    0      0        0 eth0
10.177.140.0    0.0.0.0         255.255.252.0   U     0      0        0 eth0
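Read against this routing table, the nameserver 172.31.0.10 is not in the on-link 10.177.140.0/22 subnet, so DNS queries from the pod can only match the default route via 10.177.143.254. The lookup can be checked with a small longest-prefix-match sketch. `ROUTES` and `pick_route` are hypothetical names used for illustration, and the routes are copied by hand from the output above.

```python
import ipaddress

# (destination network, gateway) pairs from the pod's `route -n` output.
# A gateway of None means the destination is on-link (gateway 0.0.0.0).
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "10.177.143.254"),
    (ipaddress.ip_network("10.177.140.0/255.255.252.0"), None),
]


def pick_route(dst: str):
    """Return the (network, gateway) pair with the longest matching prefix."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, gw) for net, gw in ROUTES if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda route: route[0].prefixlen)


if __name__ == "__main__":
    for dst in ("172.31.0.10", "10.177.140.46", "10.177.143.254"):
        net, gw = pick_route(dst)
        print(dst, "->", net, "via", gw or "on-link")
```

The gateway itself falls inside the /22, so the pod reaches it directly on eth0. Whether traffic to the cluster DNS address is then forwarded correctly depends on the host, which this sketch does not model.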
from galaxy.
Can you try pinging 172.31.0.10, and also pinging the coredns pod IP directly?
Is your coredns pod using the flannel network? Does the flannel network still work between these two hosts?
from galaxy.
I also suggest trying to run coredns with host network, which is simpler and more reliable.
from galaxy.
Coredns and flannel are both running, and coredns uses the flannel CNI. The coredns cluster IP and pod IP both respond to ping.
And if I run coredns with host network, do I still need to create a service for coredns?
from galaxy.
@chenchun After a lot of testing, I finally found that the problem was the dnsPolicy setting of the CoreDNS deployment. The value of dnsPolicy must be "Default".
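Kubernetes spells this value `Default` and treats dnsPolicy values as case-sensitive. With `Default`, a pod inherits the node's DNS configuration instead of pointing at the cluster DNS service. A minimal sketch of building the corresponding merge patch follows. `dns_policy_patch` is a hypothetical helper, and the Deployment name `coredns` in namespace `kube-system` in the usage note is an assumption, not something stated in this thread.

```python
import json

# The dnsPolicy values accepted in a pod spec.
ALLOWED_DNS_POLICIES = {"ClusterFirst", "ClusterFirstWithHostNet", "Default", "None"}


def dns_policy_patch(policy: str = "Default") -> str:
    """Build a JSON merge patch that sets dnsPolicy on a Deployment's pod template."""
    if policy not in ALLOWED_DNS_POLICIES:
        raise ValueError(f"unknown dnsPolicy {policy!r}; values are case-sensitive")
    patch = {"spec": {"template": {"spec": {"dnsPolicy": policy}}}}
    return json.dumps(patch)


if __name__ == "__main__":
    print(dns_policy_patch())
```

Under the naming assumption above, the printed JSON could be applied with `kubectl -n kube-system patch deployment coredns --type merge -p '<json>'`.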
from galaxy.
So, everything is working now?
from galaxy.
@chenchun Yes, thank you very much!
from galaxy.