
Comments (6)

TimHeckel commented on May 25, 2024

Thanks - trying the new one :) I'll close this and report if I see this again 🤞


TimHeckel commented on May 25, 2024

Hi @mysticaltech, thanks for the tips. I changed the IP in kubeconfig.yml to each of the other 2 control planes and got the messages below, respectively:

k3s-control-plane-1: Error from server (InternalError): an error on the server ("apiserver not ready") has prevented the request from succeeding

k3s-control-plane-2: The connection to the server 5.161.81.162:6443 was refused - did you specify the right host or port? (same as k3s-control-plane-1)

Here is a dump of the last relevant logs (I believe) from k3s-control-plane-0:

Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.240705    1485 status.go:71] apiserver received an error that is not an metav1.Status: context.deadlineExceededError{}: context deadline exceeded
Feb 18 00:27:24 static k3s[1485]: W0218 00:27:24.241283    1485 server.go:1299] [core] grpc: Server.processUnaryRPC failed to write status: connection error: desc = "transport is closing"
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.241307    1485 writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.242417    1485 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.243533    1485 writers.go:130] apiserver was unable to write a fallback JSON response: http: Handler timeout
Feb 18 00:27:24 static k3s[1485]: I0218 00:27:24.244625    1485 trace.go:205] Trace[31918229]: "Get" url:/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager,user-agent:k3s/v1.22.3 (linux/amd64) kubernetes/5d8c744/leader-election,audit-id:394cad73-0e98-4b0d-8cdc-e88764c23711,client:127.0.0.1,accept:application/vnd.kubernetes.protobuf, */*,protocol:HTTP/2.0 (18-Feb-2022 00:27:19.241) (total time: 5003ms):
Feb 18 00:27:24 static k3s[1485]: Trace[31918229]: [5.003546588s] [5.003546588s] END
Feb 18 00:27:24 static k3s[1485]: E0218 00:27:24.244961    1485 timeout.go:135] post-timeout activity - time-elapsed: 4.001971ms, GET "/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager" result: <nil>
Feb 18 00:27:24 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:24.444Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:24 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:24.944Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:25 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:25.445Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:25 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:25.946Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.446Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.601777    1485 trace.go:205] Trace[1096205153]: "cacher list" type:*admissionregistration.MutatingWebhookConfiguration (18-Feb-2022 00:27:23.600) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1096205153]: [3.000823979s] [3.000823979s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.601981    1485 trace.go:205] Trace[211907812]: "List" url:/apis/admissionregistration.k8s.io/v1/mutatingwebhookconfigurations,user-agent:kube-state-metrics/v2.0.0 (linux/amd64) kube-state-metrics/,audit-id:3e0ed6be-5660-4270-a2d4-e80c4a8ea3d5,client:10.0.1.1,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.600) (total time: 3001ms):
Feb 18 00:27:26 static k3s[1485]: Trace[211907812]: [3.001114394s] [3.001114394s] END
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.673Z","caller":"etcdserver/cluster_util.go:288","msg":"failed to reach the peer URL","address":"https://10.0.0.3:2380/version","remote-member-id":"44b74a9b080ff870","error":"Get \"https://10.0.0.3:2380/version\": dial tcp 10.0.0.3:2380: i/o timeout"}
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.673Z","caller":"etcdserver/cluster_util.go:155","msg":"failed to get version","remote-member-id":"44b74a9b080ff870","error":"Get \"https://10.0.0.3:2380/version\": dial tcp 10.0.0.3:2380: i/o timeout"}
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.699Z","logger":"raft","caller":"etcdserver/zap_raft.go:85","msg":"fde9dd315b6d0b2 stepped down to follower since quorum is not active"}
Feb 18 00:27:26 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:26.699Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 became follower at term 3"}
Feb 18 00:27:26 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:26.699Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"raft.node: fde9dd315b6d0b2 lost leader fde9dd315b6d0b2 at term 3"}
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.753307    1485 trace.go:205] Trace[1013056742]: "cacher list" type:*core.Service (18-Feb-2022 00:27:23.752) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1013056742]: [3.000953712s] [3.000953712s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.753573    1485 trace.go:205] Trace[485383494]: "List" url:/api/v1/services,user-agent:kube-state-metrics/v2.0.0 (linux/amd64) kube-state-metrics/,audit-id:b3eba650-41a2-4a9a-ac5c-59a0d0795452,client:10.0.1.1,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.752) (total time: 3001ms):
Feb 18 00:27:26 static k3s[1485]: Trace[485383494]: [3.001236562s] [3.001236562s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.780071    1485 trace.go:205] Trace[454733439]: "cacher list" type:*core.PersistentVolume (18-Feb-2022 00:27:23.779) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[454733439]: [3.000144994s] [3.000144994s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.780257    1485 trace.go:205] Trace[403897678]: "List" url:/api/v1/persistentvolumes,user-agent:csi-attacher/v0.0.0 (linux/amd64) kubernetes/$Format,audit-id:12be4c46-e55e-46a7-8d63-a3f7e2fe32f2,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.779) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[403897678]: [3.000399301s] [3.000399301s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.859301    1485 trace.go:205] Trace[1989561755]: "cacher list" type:*storage.VolumeAttachment (18-Feb-2022 00:27:23.858) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1989561755]: [3.000442703s] [3.000442703s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.859543    1485 trace.go:205] Trace[1203395251]: "List" url:/apis/storage.k8s.io/v1/volumeattachments,user-agent:csi-provisioner/v0.0.0 (linux/amd64) kubernetes/$Format,audit-id:d4fbea16-f6b2-47fd-992f-463e55e8a0bc,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.858) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1203395251]: [3.000777741s] [3.000777741s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.861466    1485 trace.go:205] Trace[130762900]: "cacher list" type:*storage.CSINode (18-Feb-2022 00:27:23.860) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[130762900]: [3.00077687s] [3.00077687s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.861643    1485 trace.go:205] Trace[1105326689]: "List" url:/apis/storage.k8s.io/v1/csinodes,user-agent:csi-provisioner/v0.0.0 (linux/amd64) kubernetes/$Format,audit-id:91cee22c-fc8e-4a7d-84b9-956482896d5a,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.860) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[1105326689]: [3.000991402s] [3.000991402s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.887041    1485 trace.go:205] Trace[2035177880]: "cacher list" type:*unstructured.Unstructured (18-Feb-2022 00:27:23.886) (total time: 3000ms):
Feb 18 00:27:26 static k3s[1485]: Trace[2035177880]: [3.000398188s] [3.000398188s] END
Feb 18 00:27:26 static k3s[1485]: I0218 00:27:26.890501    1485 trace.go:205] Trace[2142499412]: "List" url:/apis/traefik.containo.us/v1alpha1/ingressroutetcps,user-agent:traefik/2.5.0 (linux/amd64) kubernetes/crd,audit-id:93b0da4c-54cf-42cd-97a5-3c715f7a2f8f,client:10.0.1.2,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:23.886) (total time: 3003ms):
Feb 18 00:27:26 static k3s[1485]: Trace[2142499412]: [3.003883441s] [3.003883441s] END
Feb 18 00:27:26 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:26.946Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:26 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:26.946Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.135168    1485 trace.go:205] Trace[1842589464]: "cacher list" type:*core.Secret (18-Feb-2022 00:27:24.134) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1842589464]: [3.000398107s] [3.000398107s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.135457    1485 trace.go:205] Trace[800629758]: "cacher list" type:*networking.IngressClass (18-Feb-2022 00:27:24.134) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[800629758]: [3.000772461s] [3.000772461s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.135645    1485 trace.go:205] Trace[250511849]: "List" url:/apis/networking.k8s.io/v1/ingressclasses,user-agent:traefik/2.5.0 (linux/amd64) kubernetes/ingress,audit-id:a556ac86-143a-4fff-b6ce-8a1f3884c2c6,client:10.0.1.2,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.134) (total time: 3001ms):
Feb 18 00:27:27 static k3s[1485]: Trace[250511849]: [3.001004457s] [3.001004457s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.135820    1485 trace.go:205] Trace[1439385675]: "List" url:/api/v1/secrets,user-agent:traefik/2.5.0 (linux/amd64) kubernetes/crd,audit-id:03ed2b1e-d94d-4618-9f8f-62cc540cbe05,client:10.0.1.2,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.134) (total time: 3001ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1439385675]: [3.001148157s] [3.001148157s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.212478    1485 trace.go:205] Trace[1799538963]: "cacher list" type:*core.Service (18-Feb-2022 00:27:24.211) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1799538963]: [3.000917214s] [3.000917214s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.212689    1485 trace.go:205] Trace[1804022479]: "List" url:/api/v1/namespaces/lens-metrics/services,user-agent:Prometheus/2.27.1,audit-id:f07c56ca-7700-46db-9bac-b58c931487f2,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.211) (total time: 3001ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1804022479]: [3.001159417s] [3.001159417s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.212939    1485 trace.go:205] Trace[1724410243]: "cacher list" type:*core.ResourceQuota (18-Feb-2022 00:27:24.211) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1724410243]: [3.000905552s] [3.000905552s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.212958    1485 trace.go:205] Trace[570188394]: "cacher list" type:*core.Service (18-Feb-2022 00:27:24.212) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[570188394]: [3.000415902s] [3.000415902s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.213315    1485 trace.go:205] Trace[396921058]: "List" url:/api/v1/namespaces/lens-metrics/services,user-agent:Prometheus/2.27.1,audit-id:e4834361-c059-449d-b188-962fa3301c29,client:10.0.1.1,accept:application/json, */*,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.212) (total time: 3000ms):
Feb 18 00:27:27 static k3s[1485]: Trace[396921058]: [3.00078711s] [3.00078711s] END
Feb 18 00:27:27 static k3s[1485]: I0218 00:27:27.213487    1485 trace.go:205] Trace[1318853724]: "List" url:/api/v1/resourcequotas,user-agent:kube-state-metrics/v2.0.0 (linux/amd64) kube-state-metrics/,audit-id:92fae7c9-0c34-4e83-8965-759be489bf99,client:10.0.1.1,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/1.1 (18-Feb-2022 00:27:24.211) (total time: 3001ms):
Feb 18 00:27:27 static k3s[1485]: Trace[1318853724]: [3.001921268s] [3.001921268s] END
Feb 18 00:27:27 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:27.447Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:27 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:27.447Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:27 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:27.947Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:27 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:27.947Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:28 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:28.448Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:28 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:28.448Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:28 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:28.949Z","caller":"etcdserver/v3_server.go:815","msg":"waiting for ReadIndex response took too long, retrying","sent-request-id":15038221776147062290,"retry-timeout":"500ms"}
Feb 18 00:27:28 static k3s[1485]: {"level":"info","ts":"2022-02-18T00:27:28.949Z","logger":"raft","caller":"etcdserver/zap_raft.go:77","msg":"fde9dd315b6d0b2 no leader at term 3; dropping index reading msg"}
Feb 18 00:27:29 static k3s[1485]: {"level":"warn","ts":"2022-02-18T00:27:29.012Z","caller":"v3rpc/interceptor.go:197","msg":"request stats","start time":"2022-02-18T00:27:19.017Z","time spent":"9.994647691s","remote":"127.0.0.1:51596","response type":"/etcdserverpb.KV/Txn","request count":0,"request size":0,"response count":0,"response size":0,"request content":""}
Feb 18 00:27:29 static k3s[1485]: I0218 00:27:29.012273    1485 trace.go:205] Trace[566594349]: "GuaranteedUpdate etcd3" type:*coordination.Lease (18-Feb-2022 00:27:19.016) (total time: 9995ms):
Feb 18 00:27:29 static k3s[1485]: Trace[566594349]: [9.995541569s] [9.995541569s] END
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.015072    1485 writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.015108    1485 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.015160    1485 finisher.go:175] FinishRequest: post-timeout activity - time-elapsed: 33.031µs, panicked: false, err: context deadline exceeded, panic-reason: <nil>
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.015232    1485 controller.go:187] failed to update lease, error: Put "https://127.0.0.1:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-control-plane-0?timeout=10s": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.016840    1485 writers.go:130] apiserver was unable to write a fallback JSON response: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: I0218 00:27:29.018097    1485 trace.go:205] Trace[45207846]: "Update" url:/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-control-plane-0,user-agent:k3s/v1.22.3 (linux/amd64) kubernetes/5d8c744,audit-id:79b06743-d075-40c6-9309-54fc3bf5669b,client:127.0.0.1,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/1.1 (18-Feb-2022 00:27:19.016) (total time: 10001ms):
Feb 18 00:27:29 static k3s[1485]: Trace[45207846]: [10.001507037s] [10.001507037s] END
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.024268    1485 timeout.go:135] post-timeout activity - time-elapsed: 8.826889ms, PUT "/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/k3s-control-plane-0" result: <nil>
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.239960    1485 leaderelection.go:330] error retrieving resource lock kube-system/kube-controller-manager: Get "https://127.0.0.1:6444/apis/coordination.k8s.io/v1/namespaces/kube-system/leases/kube-controller-manager?timeout=5s": context deadline exceeded
Feb 18 00:27:29 static k3s[1485]: I0218 00:27:29.240005    1485 leaderelection.go:283] failed to renew lease kube-system/kube-controller-manager: timed out waiting for the condition
Feb 18 00:27:29 static k3s[1485]: F0218 00:27:29.240168    1485 controllermanager.go:287] leaderelection lost
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.240478    1485 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}: context canceled
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.240587    1485 writers.go:117] apiserver was unable to write a JSON response: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.241565    1485 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"http: Handler timeout"}: http: Handler timeout
Feb 18 00:27:29 static k3s[1485]: I0218 00:27:29.241726    1485 event.go:291] "Event occurred" object="" kind="Lease" apiVersion="coordination.k8s.io/v1" type="Normal" reason="LeaderElection" message="static_229a9ab6-7621-4e9e-81ba-c12ae30ba0da stopped leading"
Feb 18 00:27:29 static k3s[1485]: E0218 00:27:29.242396    1485 runtime.go:76] Observed a panic: F0218 00:27:29.240168    1485 controllermanager.go:287] leaderelection lost
Feb 18 00:27:29 static k3s[1485]: goroutine 13846 [running]:
Feb 18 00:27:29 static k3s[1485]: k8s.io/apimachinery/pkg/util/runtime.logPanic({0x5588d550c9e0, 0xc02154cc10})
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:74 +0x85
Feb 18 00:27:29 static k3s[1485]: k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x1})
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:48 +0x75
Feb 18 00:27:29 static k3s[1485]: panic({0x5588d550c9e0, 0xc02154cc10})
Feb 18 00:27:29 static k3s[1485]:         /usr/lib64/go/1.17/src/runtime/panic.go:1038 +0x215
Feb 18 00:27:29 static k3s[1485]: k8s.io/klog/v2.(*loggingT).output(0x5588d9b426a0, 0x3, {0x0, 0x0}, 0xc000a46b60, 0x0, {0x5588d78746d0, 0x5588d659c400}, 0xc00a83a2c0, 0x0)
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/klog/v2/klog.go:970 +0x685
Feb 18 00:27:29 static k3s[1485]: k8s.io/klog/v2.(*loggingT).printf(0x0, 0x0, {0x0, 0x0}, {0x0, 0x0}, {0x5588d4496738, 0x13}, {0x0, 0x0, ...})
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/klog/v2/klog.go:753 +0x1e5
Feb 18 00:27:29 static k3s[1485]: k8s.io/klog/v2.Fatalf(...)
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/klog/v2/klog.go:1495
Feb 18 00:27:29 static k3s[1485]: k8s.io/kubernetes/cmd/kube-controller-manager/app.Run.func4()
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:287 +0x5c
Feb 18 00:27:29 static k3s[1485]: k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run.func1()
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:203 +0x1f
Feb 18 00:27:29 static k3s[1485]: k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run(0xc00e1b10e0, {0x5588d65f3a90, 0xc000074038})
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:213 +0x189
Feb 18 00:27:29 static k3s[1485]: k8s.io/client-go/tools/leaderelection.RunOrDie({0x5588d65f3a90, 0xc000074038}, {{0x5588d6643bc0, 0xc0129eac80}, 0x37e11d600, 0x2540be400, 0x77359400, {0xc00d8121e0, 0x5588d6465860, 0x0}, ...})
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:226 +0x94
Feb 18 00:27:29 static k3s[1485]: k8s.io/kubernetes/cmd/kube-controller-manager/app.leaderElectAndRun(0xc010cf08c0, {0xc0123fb6e0, 0x2b}, 0xc00fea72d8, {0x5588d4453d06, 0x6}, {0x5588d44b37a2, 0x17}, {0xc00d8121e0, 0x5588d6465860, ...})
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:688 +0x2c5
Feb 18 00:27:29 static k3s[1485]: created by k8s.io/kubernetes/cmd/kube-controller-manager/app.Run
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:272 +0x745
Feb 18 00:27:29 static k3s[1485]: panic: unreachable
Feb 18 00:27:29 static k3s[1485]: goroutine 13846 [running]:
Feb 18 00:27:29 static k3s[1485]: k8s.io/kubernetes/cmd/kube-controller-manager/app.leaderElectAndRun(0xc010cf08c0, {0xc0123fb6e0, 0x2b}, 0xc00fea72d8, {0x5588d4453d06, 0x6}, {0x5588d44b37a2, 0x17}, {0xc00d8121e0, 0x5588d6465860, ...})
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:698 +0x2d8
Feb 18 00:27:29 static k3s[1485]: created by k8s.io/kubernetes/cmd/kube-controller-manager/app.Run
Feb 18 00:27:29 static k3s[1485]:         /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:272 +0x745
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Failed with result 'exit-code'.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2215 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2218 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2432 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2513 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2656 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2856 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Consumed 1h 2min 29.664s CPU time.
static:~ #

Per your instructions, I ran journalctl -u k3s-server > k3s-server.log to get them. I had deployed this cluster fresh from master about 3 days ago. I saw the PR for the straight-from-binary install but didn't realize it had already landed. I will grab that and start over from scratch and let you know. As always, thanks for your help!


mysticaltech commented on May 25, 2024

Hello @TimHeckel, I have seen this happen when the number of control plane nodes is less than 3 and kured still reboots the nodes: etcd needs a quorum of floor(n/2)+1 members, so with only 1 or 2 nodes a single reboot breaks quorum and the cluster fails. So in that case, the docs advise turning off automatic upgrades when the number of control plane nodes is less than 3, and doing the upgrade manually instead.

What you can do is change the IP in kubeconfig.yaml to that of another control plane node. Get those IPs with hcloud server list | grep k3s-control-plane.

If all the IPs are unreachable, you indeed have to power-cycle the nodes with hcloud server reboot k3s-control-plane-<x>.
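
A minimal recovery sketch putting these steps together, assuming 3 control plane nodes named k3s-control-plane-0/1/2 and a kubeconfig.yaml in the working directory (the sed pattern and placeholder IPs are illustrative, not from this thread):

# List the control plane servers and their public IPs
hcloud server list | grep k3s-control-plane

# Point the kubeconfig at another control plane node
# (replace <old-ip> and <reachable-ip>, or edit kubeconfig.yaml by hand)
sed -i 's/<old-ip>/<reachable-ip>/' kubeconfig.yaml
kubectl --kubeconfig kubeconfig.yaml get nodes

# If none of the IPs respond, power-cycle the nodes via the Hetzner API
for i in 0 1 2; do hcloud server reboot "k3s-control-plane-$i"; done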

All of this is good to know, but the old system had a few issues, now all fixed. We have changed a lot of things since yesterday; we now deploy k3s from the original binary, straight from GitHub. If you can switch to the new cluster, it should be a lot better now.


mysticaltech commented on May 25, 2024

In the new system, with at least 3 control plane nodes, the cluster has always remained online. If one of the control plane nodes goes down for a restart, the other 2 are still reachable. The same holds when k3s gets upgraded automatically!
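
As a quick reachability check, something like this works, assuming the default Kubernetes RBAC that exposes /readyz to unauthenticated clients (the IPs are placeholders):

# A healthy apiserver answers "ok"; -k skips verification of the self-signed cert
for ip in <cp0-ip> <cp1-ip> <cp2-ip>; do
  echo -n "$ip: "; curl -sk "https://$ip:6443/readyz"; echo
done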


mysticaltech commented on May 25, 2024

If you see this again on any cluster @TimHeckel, please dump the logs to a file, and upload it here.

On the old cluster (the one you use now), for control-plane nodes:

ssh root@<control-plane-ip> -i ~/.ssh/id_ed25519 -o StrictHostKeyChecking=no journalctl -u k3s-server > k3s-server.log

On the new cluster, for control-plane nodes:

ssh root@<control-plane-ip> -i ~/.ssh/id_ed25519 -o StrictHostKeyChecking=no journalctl -u k3s > k3s.log

For agents, on both clusters:

ssh root@<agent-ip> -i ~/.ssh/id_ed25519 -o StrictHostKeyChecking=no journalctl -u k3s-agent > k3s-agent.log

Same for kured: kubectl -n kube-system logs -l name=kured > kured.log

Any of these log dumps would help bring more clarity to the matter.
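
A minimal collection sketch tying these commands together, with placeholder IPs (use journalctl -u k3s-server on the old cluster and -u k3s on the new one):

KEY=~/.ssh/id_ed25519
OPTS="-o StrictHostKeyChecking=no"
# Control plane logs, one file per node
for ip in <cp-ips>; do
  ssh $OPTS -i "$KEY" "root@$ip" journalctl -u k3s-server > "k3s-server-$ip.log"
done
# Agent logs
for ip in <agent-ips>; do
  ssh $OPTS -i "$KEY" "root@$ip" journalctl -u k3s-agent > "k3s-agent-$ip.log"
done
# kured logs via any working kubeconfig
kubectl -n kube-system logs -l name=kured > kured.log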


mysticaltech commented on May 25, 2024

Thanks! In the "old" version, there was a collision between the node IPs and the load balancer IP. Try with the new system and let me know 🤞

