Comments (6)
Thanks - trying the new one :) I'll close this and report if I see this again.
from terraform-hcloud-kube-hetzner.
Hi @mysticaltech, thanks for the tips. I changed the IP in kubeconfig.yml to each of the other 2 control planes and got the following messages, respectively:
k3s-control-plane-1: Error from server (InternalError): an error on the server ("apiserver not ready") has prevented the request from succeeding
k3s-control-plane-2: The connection to the server 5.161.81.162:6443 was refused - did you specify the right host or port?
(same as k3s-control-plane-1)
Here is a dump of the last relevant logs (I believe) from k3s-control-plane-0:
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.240705 1485 status.go:71] apiserver received an error that is not an metav1.Status: context.deadlineExceededError{}: context deadline exceeded
Feb 18 00:27:24 static k3s[1485]: W0218 00:27:24.241283 1485 server.go:1299] [core] grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.241307 1485 writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.242417 1485 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.243533 1485 writers.go:130] apiserver was unable to write a fallback JSON response: http: Handler timeout
Feb 18 00:27:24 static k3s[1485]: I0218 00:27:24.244625 1485 trace.go:205] Trace[31918229]: "Get" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager,user-agent:k3s/v1.22.3 (linux/amd64) kubernetes/5d8c744/leader-election,audit-id:394cad73-0e98-4b0d-8cdc-e88764c23711,client:127.0.0.1,accept:application/vnd.kubernetes.protobuf, */*,protocol:HTTP/2.0 (18-Feb-2022 00:27:19.241) (total time: 5003ms):
Feb 18 00:27:24 static k3s[1485]: Trace[31918229]: [5.003546588s] [5.003546588s] END
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.244961 1485 timeout.go:135] post-timeout activity - time-elapsed: 4.001971ms, GET "/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager" result: <nil>
Feb 18 00:27:24 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:24.444Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:24 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:24.944Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:25 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:25.445Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:25 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:25.946Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.446Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.601777 1485 trace.go:205] Trace[1096205153]: "cacher list" type:*admissionregistration.MutatingWebhookConfiguration (18-Feb-2022 00:27:23.600) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1096205153]: [3.000823979s] [3.000823979s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.601981 1485 trace.go:205] Trace[211907812]: "List" url:/apis/admissionregistration.k8s.io/v1/mutatingwebhookconfigurations,user-agent:kube-state-metrics/v2.0.0 (linux/amd64) kube-state-metrics/,audit-id:3e0ed6be-5660-4270-a2d4-e80c4a8ea3d5,client:10.0.1.1,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.600) (total time: 3001ms):
Feb 18 00:27:26 static k3s[1485]: Trace[211907812]: [3.001114394s] [3.001114394s] END
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.673Z","caller":"etcdserver/cluster_util.go:288","msg":"failed to reach the peer URL","address":"https://10.0.0.3:2380/version","remote-member-id":"44b74a9b080ff870","error":"Get \"https://10.0.0.3:2380/version\": dial tcp 10.0.0.3:2380: i/o timeout"}
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.673Z","caller":"etcdserver/cluster_util.go:155","msg":"failed to get version","remote-member-id":"44b74a9b080ff870","error":"Get \"https://10.0.0.3:2380/version\": dial tcp 10.0.0.3:2380: i/o timeout"}
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.699Z","logger":"raft","caller":"etcdserver/zap_raft.go:85","msg":"fde9dd315b6d0b2 stepped down to follower since quorum is not active"}
Feb 18 00:27:26 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:26.699Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 became follower at term 3"}
Feb 18 00:27:26 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:26.699Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: fde9dd315b6d0b2 lost leader fde9dd315b6d0b2 at term 3"}
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.753307 1485 trace.go:205] Trace[1013056742]: "cacher list" type:*core.Service (18-Feb-2022 00:27:23.752) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1013056742]: [3.000953712s] [3.000953712s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.753573 1485 trace.go:205] Trace[485383494]: "List" url:/api/v1/services,user-agent:kube-state-metrics/v2.0.0 (linux/amd64) kube-state-metrics/,audit-id:b3eba650-41a2-4a9a-ac5c-59a0d0795452,client:10.0.1.1,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.752) (total time: 3001ms):
Feb 18 00:27:26 static k3s[1485]: Trace[485383494]: [3.001236562s] [3.001236562s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.780071 1485 trace.go:205] Trace[454733439]: "cacher list" type:*core.PersistentVolume (18-Feb-2022 00:27:23.779) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[454733439]: [3.000144994s] [3.000144994s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.780257 1485 trace.go:205] Trace[403897678]: "List" url:/api/v1/persistentvolumes,user-agent:csi-attacher/v0.0.0 (linux/amd64) kubernetes/$Format,audit-id:12be4c46-e55e-46a7-8d63-a3f7e2fe32f2,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.779) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[403897678]: [3.000399301s] [3.000399301s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.859301 1485 trace.go:205] Trace[1989561755]: "cacher list" type:*storage.VolumeAttachment (18-Feb-2022 00:27:23.858) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1989561755]: [3.000442703s] [3.000442703s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.859543 1485 trace.go:205] Trace[1203395251]: "List" url:/apis/storage.k8s.io/v1/volumeattachments,user-agent:csi-provisioner/v0.0.0 (linux/amd64) kubernetes/$Format,audit-id:d4fbea16-f6b2-47fd-992f-463e55e8a0bc,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.858) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1203395251]: [3.000777741s] [3.000777741s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.861466 1485 trace.go:205] Trace[130762900]: "cacher list" type:*storage.CSINode (18-Feb-2022 00:27:23.860) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[130762900]: [3.00077687s] [3.00077687s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.861643 1485 trace.go:205] Trace[1105326689]: "List" url:/apis/storage.k8s.io/v1/csinodes,user-agent:csi-provisioner/v0.0.0 (linux/amd64) kubernetes/$Format,audit-id:91cee22c-fc8e-4a7d-84b9-956482896d5a,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.860) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1105326689]: [3.000991402s] [3.000991402s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.887041 1485 trace.go:205] Trace[2035177880]: "cacher list" type:*unstructured.Unstructured (18-Feb-2022 00:27:23.886) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[2035177880]: [3.000398188s] [3.000398188s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.890501 1485 trace.go:205] Trace[2142499412]: "List" url:/apis/traefik.containo.us/v1alpha1/ingressroutetcps,user-agent:traefik/2.5.0 (linux/amd64) kubernetes/crd,audit-id:93b0da4c-54cf-42cd-97a5-3c715f7a2f8f,client:10.0.1.2,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.886) (total time: 3003ms):
Feb 18 00:27:26 static k3s[1485]: Trace[2142499412]: [3.003883441s] [3.003883441s] END
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.946Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:26 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:26.946Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.135168 1485 trace.go:205] Trace[1842589464]: "cacher list" type:*core.Secret (18-Feb-2022 00:27:24.134) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1842589464]: [3.000398107s] [3.000398107s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.135457 1485 trace.go:205] Trace[800629758]: "cacher list" type:*networking.IngressClass (18-Feb-2022 00:27:24.134) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[800629758]: [3.000772461s] [3.000772461s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.135645 1485 trace.go:205] Trace[250511849]: "List" url:/apis/networking.k8s.io/v1/ingressclasses,user-agent:traefik/2.5.0 (linux/amd64) kubernetes/ingress,audit-id:a556ac86-143a-4fff-b6ce-8a1f3884c2c6,client:10.0.1.2,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.134) (total time: 3001ms):
Feb 18 00:27:27 static k3s[1485]: Trace[250511849]: [3.001004457s] [3.001004457s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.135820 1485 trace.go:205] Trace[1439385675]: "List" url:/api/v1/secrets,user-agent:traefik/2.5.0 (linux/amd64) kubernetes/crd,audit-id:03ed2b1e-d94d-4618-9f8f-62cc540cbe05,client:10.0.1.2,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.134) (total time: 3001ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1439385675]: [3.001148157s] [3.001148157s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.212478 1485 trace.go:205] Trace[1799538963]: "cacher list" type:*core.Service (18-Feb-2022 00:27:24.211) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1799538963]: [3.000917214s] [3.000917214s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.212689 1485 trace.go:205] Trace[1804022479]: "List" url:/api/v1/namespaces/lens-metrics/services,user-agent:Prometheus/2.27.1,audit-id:f07c56ca-7700-46db-9bac-b58c931487f2,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.211) (total time: 3001ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1804022479]: [3.001159417s] [3.001159417s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.212939 1485 trace.go:205] Trace[1724410243]: "cacher list" type:*core.ResourceQuota (18-Feb-2022 00:27:24.211) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1724410243]: [3.000905552s] [3.000905552s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.212958 1485 trace.go:205] Trace[570188394]: "cacher list" type:*core.Service (18-Feb-2022 00:27:24.212) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[570188394]: [3.000415902s] [3.000415902s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.213315 1485 trace.go:205] Trace[396921058]: "List" url:/api/v1/namespaces/lens-metrics/services,user-agent:Prometheus/2.27.1,audit-id:e4834361-c059-449d-b188-962fa3301c29,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.212) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[396921058]: [3.00078711s] [3.00078711s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.213487 1485 trace.go:205] Trace[1318853724]: "List" url:/api/v1/resourcequotas,user-agent:kube-state-metrics/v2.0.0 (linux/amd64) kube-state-metrics/,audit-id:92fae7c9-0c34-4e83-8965-759be489bf99,client:10.0.1.1,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.211) (total time: 3001ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1318853724]: [3.001921268s] [3.001921268s] END
Feb 18 00:27:27 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:27.447Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:27 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:27.447Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:27 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:27.947Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:27 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:27.947Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:28 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:28.448Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:28 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:28.448Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:28 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:28.949Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:28 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:28.949Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:29 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:29.012Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2022-02-18T00:27:19.017Z","time spent":"9.994647691s","remote":"127.0.0.1:51596","response type":"/etcdserverpb.KV/Txn","request count":0,"request size":0,"response count":0,"response size":0,"request content":""}
Feb 18 00:27:29 static k3s[1485]: I0218 00:27:29.012273 1485 trace.go:205] Trace[566594349]: "GuaranteedUpdate etcd3" type:*coordination.Lease (18-Feb-2022 00:27:19.016) (total time: 9995ms):
Feb 18 00:27:29 static k3s[1485]: Trace[566594349]: [9.995541569s] [9.995541569s] END
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.015072 1485 writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.015108 1485 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.015160 1485 finisher.go:175] FinishRequest: post-timeout activity - time-elapsed: 33.031µs, panicked: false, err: context deadline exceeded, panic-reason: <nil>
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.015232 1485 controller.go:187] failed to update lease, error: Put "https://127.0.0.1:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-control-plane-0?timeout=10s": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.016840 1485 writers.go:130] apiserver was unable to write a fallback JSON response: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: I0218 00:27:29.018097 1485 trace.go:205] Trace[45207846]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-control-plane-0,user-agent:k3s/v1.22.3 (linux/amd64) kubernetes/5d8c744,audit-id:79b06743-d075-40c6-9309-54fc3bf5669b,client:127.0.0.1,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/1.1 (18-Feb-2022 00:27:19.016) (total time: 10001ms):
Feb 18 00:27:29 static k3s[1485]: Trace[45207846]: [10.001507037s] [10.001507037s] END
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.024268 1485 timeout.go:135] post-timeout activity - time-elapsed: 8.826889ms, PUT "/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-control-plane-0" result: <nil>
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.239960 1485 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://127.0.0.1:6444/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": context deadline exceeded
Feb 18 00:27:29 static k3s[1485]: I0218 00:27:29.240005 1485 leaderelection.go:283] failed to renew lease kube-system/kube-controller-manager: timed out waiting for the condition
Feb 18 00:27:29 static k3s[1485]: F0218 00:27:29.240168 1485 controllermanager.go:287] leaderelection lost
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.240478 1485 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}: context canceled
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.240587 1485 writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.241565 1485 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: I0218 00:27:29.241726 1485 event.go:291] "Event occurred" object="" kind="Lease" apiVersion="coordination.k8s.io/v1" type="Normal" reason="LeaderElection" message="static_229a9ab6-7621-4e9e-81ba-c12ae30ba0da stopped leading"
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.242396 1485 runtime.go:76] Observed a panic: F0218 00:27:29.240168 1485 controllermanager.go:287] leaderelection lost
Feb 18 00:27:29 static k3s[1485]: goroutine 13846 [running]:
Feb 18 00:27:29 static k3s[1485]: k8s.io/apimachinery/pkg/util/runtime.logPanic({0x5588d550c9e0, 0xc02154cc10})
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x85
Feb 18 00:27:29 static k3s[1485]: k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x1})
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
Feb 18 00:27:29 static k3s[1485]: panic({0x5588d550c9e0, 0xc02154cc10})
Feb 18 00:27:29 static k3s[1485]: /usr/lib64/go/1.17/src/runtime/panic.go:1038 +0x215
Feb 18 00:27:29 static k3s[1485]: k8s.io/klog/v2.(*loggingT).output(0x5588d9b426a0, 0x3, {0x0, 0x0}, 0xc000a46b60, 0x0, {0x5588d78746d0, 0x5588d659c400}, 0xc00a83a2c0, 0x0)
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/klog/v2/klog.go:970 +0x685
Feb 18 00:27:29 static k3s[1485]: k8s.io/klog/v2.(*loggingT).printf(0x0, 0x0, {0x0, 0x0}, {0x0, 0x0}, {0x5588d4496738, 0x13}, {0x0, 0x0, ...})
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/klog/v2/klog.go:753 +0x1e5
Feb 18 00:27:29 static k3s[1485]: k8s.io/klog/v2.Fatalf(...)
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/klog/v2/klog.go:1495
Feb 18 00:27:29 static k3s[1485]: k8s.io/kubernetes/cmd/kube-controller-manager/app.Run.func4()
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:287 +0x5c
Feb 18 00:27:29 static k3s[1485]: k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run.func1()
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:203 +0x1f
Feb 18 00:27:29 static k3s[1485]: k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run(0xc00e1b10e0, {0x5588d65f3a90, 0xc000074038})
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:213 +0x189
Feb 18 00:27:29 static k3s[1485]: k8s.io/client-go/tools/leaderelection.RunOrDie({0x5588d65f3a90, 0xc000074038}, {{0x5588d6643bc0, 0xc0129eac80}, 0x37e11d600, 0x2540be400, 0x77359400, {0xc00d8121e0, 0x5588d6465860, 0x0}, ...})
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:226 +0x94
Feb 18 00:27:29 static k3s[1485]: k8s.io/kubernetes/cmd/kube-controller-manager/app.leaderElectAndRun(0xc010cf08c0, {0xc0123fb6e0, 0x2b}, 0xc00fea72d8, {0x5588d4453d06, 0x6}, {0x5588d44b37a2, 0x17}, {0xc00d8121e0, 0x5588d6465860, ...})
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:688 +0x2c5
Feb 18 00:27:29 static k3s[1485]: created by k8s.io/kubernetes/cmd/kube-controller-manager/app.Run
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:272 +0x745
Feb 18 00:27:29 static k3s[1485]: panic: unreachable
Feb 18 00:27:29 static k3s[1485]: goroutine 13846 [running]:
Feb 18 00:27:29 static k3s[1485]: k8s.io/kubernetes/cmd/kube-controller-manager/app.leaderElectAndRun(0xc010cf08c0, {0xc0123fb6e0, 0x2b}, 0xc00fea72d8, {0x5588d4453d06, 0x6}, {0x5588d44b37a2, 0x17}, {0xc00d8121e0, 0x5588d6465860, ...})
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:698 +0x2d8
Feb 18 00:27:29 static k3s[1485]: created by k8s.io/kubernetes/cmd/kube-controller-manager/app.Run
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:272 +0x745
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Failed with result 'exit-code'.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2215 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2218 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2432 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2513 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2656 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2856 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Consumed 1h 2min 29.664s CPU time.
static:~ #
Per your instructions, I ran journalctl -u k3s-server > k3s-server.log to get them. I had deployed this cluster fresh from master about 3 days ago. I saw the PR coming for the straight-from-binary deployment but didn't realize it had already landed. I will grab that, start over from scratch, and let you know. As always, thanks for your help!
Hello @TimHeckel, I have seen this happen when the number of control plane nodes is less than 3 and kured still reboots the nodes: etcd can't maintain quorum and fails. For that case, the docs advise turning off automatic upgrades and doing the upgrade manually instead.
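The thread doesn't spell out the manual procedure, but a minimal sketch of a quorum-preserving maintenance step could look like this. The helper name is hypothetical; the cordon/drain/uncordon flags are standard kubectl, and the actual upgrade or reboot step is left as a comment since it depends on the cluster:

```shell
# Sketch: take control-plane nodes through maintenance one at a time,
# so a 3-node etcd cluster always keeps 2 members up (quorum).
manual_node_maintenance() {
  local node="$1"
  kubectl cordon "$node"                                        # stop new pods landing here
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
  # ...ssh in and upgrade/reboot k3s on the node, wait for it to be Ready again...
  kubectl uncordon "$node"                                      # let workloads return
}

# Example: manual_node_maintenance k3s-control-plane-0
```

Running this node by node, waiting for each member to rejoin before moving on, avoids the simultaneous-reboot situation that cost quorum here.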
What you can do is try changing the IP in kubeconfig.yaml to that of another control plane node. Get the IPs with hcloud server list | grep k3s-control-plane.
If all IPs are unreachable, you will indeed have to power-cycle the nodes with hcloud server reboot k3s-control-plane-<x>.
All of this still applies, but the old system had a few issues, now all fixed. We have changed a lot of things since yesterday; we now deploy k3s from the original binary straight from GitHub. If you can switch to the new cluster, it should be a lot better now.
In the new system, with at least 3 control plane nodes, the cluster has always remained online. If one control plane node goes down for a restart, the other 2 are still reachable. The same holds when k3s gets upgraded automatically!
If you see this again on any cluster @TimHeckel, please dump the logs to a file, and upload it here.
On the old cluster (the one you use now), for control-plane nodes:
ssh root@<control-plane-ip> -i ~/.ssh/id_ed25519 -o StrictHostKeyChecking=no journalctl -u k3s-server > k3s-server.log
On the new cluster, for control-plane nodes:
ssh root@<control-plane-ip> -i ~/.ssh/id_ed25519 -o StrictHostKeyChecking=no journalctl -u k3s > k3s.log
For agents, on both clusters:
ssh root@<agent-ip> -i ~/.ssh/id_ed25519 -o StrictHostKeyChecking=no journalctl -u k3s-agent > k3s-agent.log
Same for kured: kubectl -n kube-system logs -l name=kured > kured.log
Any of these log dumps would help bring more clarity to the matter.
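To collect these in one pass, a small loop around the same ssh invocation could be used. The helper name is hypothetical and the IP list is a placeholder; note the unit name differs between the old cluster (k3s-server) and the new one (k3s):

```shell
# Hypothetical helper: dump one systemd unit's journal from a remote node
# into a local file named after the unit and IP.
dump_unit_log() {
  local ip="$1" unit="$2"
  ssh "root@${ip}" -i ~/.ssh/id_ed25519 -o StrictHostKeyChecking=no \
    "journalctl -u ${unit} --no-pager" > "${unit}-${ip}.log"
}

# Usage, e.g. on the old cluster (substitute your control-plane IPs):
#   for ip in <cp-ip-1> <cp-ip-2> <cp-ip-3>; do dump_unit_log "$ip" k3s-server; done
```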
Thanks! In the "old" version, there was a collision between the node IPs and the load balancer IP. Try the new system and let me know.