Comments (12)
This Terraform setup is too flaky, I'm afraid. As far as I can tell, it worked for a few days. Now I cannot create nodepools anymore. They are stuck in the creating state even though the servers are up and running in the Hetzner console.
On the servers I get:
"Waiting to retrieve agent configuration; server is not ready: Node password rejected, duplicate hostname or contents of '/etc/rancher/node/password' may not match server node-passwd entry, try enabling a unique node name with the --with-node-id flag"
from terraform-hcloud-kube-hetzner.
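A minimal sketch for checking the "Node password rejected" error above, assuming kubectl access to the control plane. The node name `agent-1` is a hypothetical placeholder, and the helper names are made up for illustration. As far as I know, k3s keeps each node's password in a kube-system secret named `<node>.node-password.k3s`, and deleting the stale node object lets a replacement with the same hostname register again; treat that as an assumption to verify against the k3s docs for your version.

```shell
#!/bin/sh
# KUBECTL can be overridden with a wrapper; defaults to plain kubectl.
KUBECTL=${KUBECTL:-kubectl}

# Build the name of the secret that holds a node's password.
# Assumption: k3s uses the template <node>.node-password.k3s.
node_password_secret() {
  printf '%s.node-password.k3s\n' "$1"
}

# Show whether a node object and its password secret still exist.
show_node_registration() {
  node=$1
  $KUBECTL get node "$node"
  $KUBECTL -n kube-system get secret "$(node_password_secret "$node")"
}

# Remove the stale node object so a replacement server with the same
# hostname can register again. Destructive: only run on a node that is
# really gone or about to be recreated.
forget_node() {
  $KUBECTL delete node "$1"
}

# Example (hypothetical node name):
#   show_node_registration agent-1
#   forget_node agent-1
```

On the agent itself, the file to compare is the one named in the error, /etc/rancher/node/password.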
Might be connected to the (unresolved) discussion I started recently #1287
I'm seeing weird behaviour after updating the nodes with a recent MicroOS update: network connectivity issues that I couldn't figure out yet (I just rolled back and disabled updates for now).
Edit: I also saw some "503 Service Unavailable" and "connection refused" in my logs. I know those are very generic errors, but still.
Now when running again I see the following logs for systemctl status k3s-agent:
el=info msg="Waiting to retrieve agent configuration; server is not ready: CA cert validation failed: Get \"https://127.0.0.1:6444/cacerts\": EOF"
el=info msg="Waiting to retrieve agent configuration; server is not ready: failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": EOF"
el=info msg="Waiting to retrieve agent configuration; server is not ready: failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": EOF"
el=info msg="Waiting to retrieve agent configuration; server is not ready: failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": EOF"
el=info msg="Waiting to retrieve agent configuration; server is not ready: failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": EOF"
el=info msg="Waiting to retrieve agent configuration; server is not ready: failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": EOF"
el=info msg="Waiting to retrieve agent configuration; server is not ready: failed to get CA certs: Get \"https://127.0.0.1:6444/cacerts\": EOF"
el=info msg="Waiting to retrieve agent configuration; server is not ready: https://127.0.0.1:6444/v1-k3s/serving-kubelet.crt: 503 Service Unavailable"
el=info msg="Waiting to retrieve agent configuration; server is not ready: https://127.0.0.1:6444/v1-k3s/serving-kubelet.crt: 503 Service Unavailable"
el=info msg="Waiting to retrieve agent configuration; server is not ready: https://127.0.0.1:6444/v1-k3s/serving-kubelet.crt: 503 Service Unavailable"
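Since the log above cycles through a few distinct failures, a rough tally per failure type can make it easier to see which one dominates. This is only a sketch: the function name and the pattern list are assumptions taken from the messages quoted in this thread, not an exhaustive list of k3s agent errors.

```shell
#!/bin/sh
# Count how often each known failure message appears in a saved
# k3s-agent log. Prints "<count><TAB><pattern>" for every pattern.
summarize_agent_log() {
  log_file=$1
  for pattern in \
    'Node password rejected' \
    'CA cert validation failed' \
    'failed to get CA certs' \
    '503 Service Unavailable' \
    'connection refused'
  do
    # grep -c exits non-zero when nothing matches, hence "|| true".
    count=$(grep -c -F -e "$pattern" "$log_file" || true)
    printf '%s\t%s\n' "$count" "$pattern"
  done
}

# Example, run on the affected agent node:
#   journalctl -u k3s-agent --no-pager > /tmp/k3s-agent.log
#   summarize_agent_log /tmp/k3s-agent.log
```

All of these requests go to 127.0.0.1:6444, so the tally only tells you which message is most common, not why the server behind that local endpoint is unavailable.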
Hmm, I might have had the same issue: I had nodes unable to come back to life after a reboot following a k3s upgrade. I replaced the nodes (long live Longhorn) and turned off upgrades. I have not verified that this was the real problem, but I haven't seen it happen again either. No need to roll back anything though: the fresh nodes are on the latest k3s and have MicroOS updating weekly without issues, just not automatically upgrading k3s.
I disabled WireGuard and recreated the cluster some time later. I haven't checked whether WireGuard works better with Cilium or whether that was the actual problem.
Did you manage to make it work? What about you @andi0b ?
No, I'm currently on Easter holiday and didn't investigate it further. I just disabled kured (I think with something like kubectl -n kube-system annotate ds kured weave.works/kured-node-lock='{"nodeID":"manual"}') and rolled back the nodes to the last working snapshot (I think with transactional-update rollback [number]).
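The two steps described above could look roughly like this. It is a sketch, not a verified procedure: the function names are hypothetical, and the unlock form (the same annotation key with a trailing "-") is the standard kubectl way to remove an annotation, which to my understanding is how kured's manual lock is released.

```shell
#!/bin/sh
# KUBECTL can be overridden with a wrapper; defaults to plain kubectl.
KUBECTL=${KUBECTL:-kubectl}

# Hold the kured lock manually so kured does not reboot any node.
kured_lock() {
  $KUBECTL -n kube-system annotate ds kured \
    'weave.works/kured-node-lock={"nodeID":"manual"}'
}

# Release the manual lock again by removing the annotation.
kured_unlock() {
  $KUBECTL -n kube-system annotate ds kured weave.works/kured-node-lock-
}

# Rolling back a MicroOS node is done on the node itself, for example:
#   snapper list                             # find the last working snapshot
#   transactional-update rollback [number]   # number taken from the list
#   reboot
```

Remember to call kured_unlock once the nodes are healthy, otherwise pending reboots will never be applied.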
Folks, this was probably due to a bug in the system upgrade controller, now fixed. Make sure to upgrade with terraform init -upgrade. If such an issue comes up again, please don't hesitate to open another one with your kube.tf. Closing this one for now.
@kube-hetzner/core Any ideas?
@mateuszlewko Try with cni_plugin = "cilium"; I would guess it works better with WireGuard.
Hi, I got the same error, "timed out waiting for the condition on deployments/system-upgrade-controller", with a very similar configuration and Cilium enabled.
I'm considering this an occasional hiccup, but will monitor the situation.
@kimdre Could you share your kube.tf, please?
@mateuszlewko Did you manage to make it work? What about you @andi0b ?