kube-hetzner / terraform-hcloud-kube-hetzner
Optimized and Maintenance-free Kubernetes on Hetzner Cloud in one command!
License: MIT License
I've set up terraform.tfvars with my hcloud_token, public_key, private_key, and:
location = "nbg1" # change to `ash` for us-east Ashburn, Virginia location
network_region = "eu-central" # change to `us-east` if location is ash
agent_server_type = "cx11"
control_plane_server_type = "cpx11"
lb_server_type = "lb11"
And as a result of terraform apply I can see:
hcloud_server.first_control_plane (local-exec): Executing: ["/bin/sh" "-c" "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i /Users/....user_name..../.ssh/id_rsa root@...IP... '(sleep 2; reboot)&'; sleep 3"]
hcloud_server.first_control_plane (local-exec): Warning: Permanently added '...IP...' (ED25519) to the list of known hosts.
hcloud_server.first_control_plane (local-exec): Connection to ...IP... closed by remote host.
hcloud_server.first_control_plane: Still creating... [2m10s elapsed]
hcloud_server.first_control_plane: Provisioning with 'local-exec'...
hcloud_server.first_control_plane (local-exec): Executing: ["/bin/sh" "-c" "until ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i /Users/....user_name..../.ssh/id_rsa -o ConnectTimeout=2 root@...IP... true 2> /dev/null\ndo\n echo \"Waiting for MicroOS to reboot and become available...\"\n sleep 3\ndone\n"]
hcloud_server.first_control_plane (local-exec): Waiting for MicroOS to reboot and become available...
...
hcloud_server.first_control_plane: Still creating... [3m30s elapsed]
hcloud_server.first_control_plane: Provisioning with 'file'...
hcloud_server.first_control_plane: Still creating... [3m40s elapsed]
...
hcloud_server.first_control_plane: Still creating... [8m30s elapsed]
╷
│ Error: file provisioner error
│
│ with hcloud_server.first_control_plane,
│ on master.tf line 55, in resource "hcloud_server" "first_control_plane":
│ 55: provisioner "file" {
│
│ timeout - last error: dial tcp ...IP...:22: connect: operation timed out
╵
I've tried to ssh to this machine:
$ ssh root@...IP... -o StrictHostKeyChecking=no
ssh: connect to host ...IP... port 22: Operation timed out
I tried again after manually restarting the machine from the Hetzner web console, and unfortunately it looks like the machine is still not responding.
Do you have an idea how to diagnose it or what could be the problem?
Thanks :)
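For what it's worth, the provisioner's until ssh ... loop in the output above retries forever, which makes a dead node look like an endless "Still creating". A bounded retry helper fails fast instead; this is only a debugging sketch (the wait_for name and the retry limits are mine, not part of the module):

```shell
#!/bin/sh
# wait_for CMD MAX: retry CMD up to MAX times, one second apart.
# Prints "ready" on success, "timeout" (and exits nonzero) on giving up.
wait_for() {
  cmd="$1"; max="$2"; i=0
  until $cmd 2>/dev/null; do
    i=$((i + 1))
    if [ "$i" -ge "$max" ]; then
      echo "timeout"
      return 1
    fi
    sleep 1
  done
  echo "ready"
}

# Example: give up after ~2 minutes instead of waiting forever.
# wait_for "ssh -o ConnectTimeout=2 -o StrictHostKeyChecking=no root@<IP> true" 60
```

If this times out while the Hetzner console still shows a login prompt, the problem is likely SSH/network configuration on the node rather than the provisioner itself.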
@mnencia Like you, I have created a cluster of 3 control planes and 2 agents. To accelerate the process, I issued touch /var/run/reboot-required
on all five of them, to simulate a post-update scenario.
I will report back on what happens after that. Please do not hesitate to share your findings here too.
Hi guys
Thanks to all contributors to this amazing project!
Is it possible to disable the Traefik ingress controller at all? I would prefer using nginx-ingress or istio-gateway as the ingress solution. In that case, I wouldn't need the load balancer or the Traefik installation.
I'm also open to contributing such a toggle function but would need some input on how best to implement it. :)
cheers,
Johann Schley
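For context, k3s itself can skip its bundled add-ons at install time via the --disable flag, so a module toggle would mostly need to thread extra flags into the generated k3s server command. A rough sketch of the flags involved (the K3S_EXTRA_FLAGS variable is hypothetical; the flag names are k3s's own):

```shell
# k3s supports skipping bundled components with --disable. A hypothetical
# module toggle could append these to the k3s server command it generates:
K3S_EXTRA_FLAGS="--disable traefik --disable servicelb"
echo "k3s server $K3S_EXTRA_FLAGS"
```

With Traefik disabled this way, nginx-ingress or an Istio gateway could then be installed separately.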
I followed the instructions from the README file and the error
local-exec provisioner error Error running command 'kubectl -n kube-system create secret generic hcloud exit status 1. Output: Unable to connect to the server: dial tcp
pops up.
Am I missing anything?
Recently, the master branch has been quite unstable and required users to either stay on an unsupported, older commit of kube-hetzner or to re-provision their whole cluster.
I believe that the time would be right to agree on a versioning scheme and implement at least a minimum of a release process to communicate breaking changes more clearly.
My proposal would be to just use https://semver.org/ and start to define a process after which we could release a 1.0.0 (or 0.1.0 if you prefer ;). Ideally we would end up with a git tag, an auto-generated github release & a ready-to-use module published on registry.terraform.io.
I also started a GitHub project regarding the whole thing, you can find it linked in the sidebar of this issue or at https://github.com/orgs/kube-hetzner/projects/1
Eager to hear what @mysticaltech, @mnencia, and others are thinking!
I've set up terraform.tfvars with my hcloud_token, public_key, private_key, and:
location = "nbg1" # change to `ash` for us-east Ashburn, Virginia location
network_region = "eu-central" # change to `us-east` if location is ash
agent_server_type = "cx11"
control_plane_server_type = "cpx11"
lb_server_type = "lb11"
servers_num = 1
agents_num = 0
# only one server is chosen because all servers have the same issue, so I focused on just one
Just after the successful steps in host/main.tf:
# Install MicroOS
# Issue a reboot command
I can see:
module.first_control_plane.hcloud_server.server (local-exec): Executing: ["/bin/sh" "-c" "until ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i /Users/drackowski/.ssh/id_rsa -o ConnectTimeout=2 root@........ true 2> /dev/null\ndo\n echo \"Waiting for MicroOS to reboot and become available...\"\n sleep 3\ndone\n"]
module.first_control_plane.hcloud_server.server (local-exec): Waiting for MicroOS to reboot and become available...
module.first_control_plane.hcloud_server.server: Still creating... [2m10s elapsed]
module.first_control_plane.hcloud_server.server (local-exec): Waiting for MicroOS to reboot and become available...
...
...
...
module.first_control_plane.hcloud_server.server: Provisioning with 'remote-exec'...
module.first_control_plane.hcloud_server.server (remote-exec): Connecting to remote host via SSH...
module.first_control_plane.hcloud_server.server (remote-exec): Host: .......
module.first_control_plane.hcloud_server.server (remote-exec): User: root
module.first_control_plane.hcloud_server.server (remote-exec): Password: false
module.first_control_plane.hcloud_server.server (remote-exec): Private key: true
module.first_control_plane.hcloud_server.server (remote-exec): Certificate: false
module.first_control_plane.hcloud_server.server (remote-exec): SSH Agent: true
module.first_control_plane.hcloud_server.server (remote-exec): Checking Host Key: false
module.first_control_plane.hcloud_server.server (remote-exec): Target Platform: unix
Error response is:
╷
│ Error: remote-exec provisioner error
│
│ with module.first_control_plane.hcloud_server.server,
│ on modules/host/main.tf line 60, in resource "hcloud_server" "server":
│ 60: provisioner "remote-exec" {
│
│ timeout - last error: dial tcp ........:22: i/o timeout
╵
When I try to ssh to this machine, it is not responding ("timed out" after a long time).
On the server console I can see the welcome message from openSUSE with a few SSH host keys and "static login: ".
Do you know what else I can check there to diagnose what happened? 😅
The full terraform apply console output is in the attachment full console output.txt
I followed the Readme and am getting this error. It seems to have created the 3 control planes, network, and firewall, but not the nodepool/nodes or load balancer.
Error: invalid input in field 'name' (invalid_input): [name => [Name must be a valid hostname.]]
│
│ with module.agents["myname_nodes-1"].hcloud_server.server,
│ on modules/host/main.tf line 1, in resource "hcloud_server" "server":
│ 1: resource "hcloud_server" "server" {
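Hetzner requires server names to be valid hostnames, and the underscore in myname_nodes-1 is the likely culprit: underscores are not allowed in hostname labels. A quick local check (the is_hostname helper is my own sketch, roughly following RFC 1123 label rules):

```shell
#!/bin/sh
# is_hostname NAME: succeed iff NAME is a valid hostname label
# (lowercase letters, digits, hyphens; no leading/trailing hyphen;
# at most 63 characters).
is_hostname() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$'
}

is_hostname "myname_nodes-1" || echo "invalid: underscore not allowed"
is_hostname "myname-nodes-1" && echo "ok"
```

Renaming the nodepool to use hyphens instead of underscores should satisfy the API's name validation.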
The option to add an existing certificate to the Hetzner load balancer.
In my case I have a Cloudflare origin server certificate.
I recently contacted Hetzner to increase my limits to deploy 3 master nodes and 3 worker nodes, and after the limit increase I executed Terraform, but the script exited with an unknown error:
╷
│ Error: hcloud/setRescue: hcclient/WaitForActions: action 382332309 failed: Unknown Error (unknown_error)
│
│ with hcloud_server.control_planes[0],
│ on servers.tf line 1, in resource "hcloud_server" "control_planes":
│ 1: resource "hcloud_server" "control_planes" {
│
here is my terraform.tfvars
# You need to replace these
hcloud_token = "my-token"
public_key = "/home/user/.ssh/id_ed25519.pub"
# Must be "private_key = null" when you want to use ssh-agent, for a Yubikey like device auth or an SSH key-pair with passphrase
private_key = "/home/user/.ssh/id_ed25519"
# These can be customized, or left with the default values
# For Hetzner locations see https://docs.hetzner.com/general/others/data-centers-and-connection/
# For Hetzner server types see https://www.hetzner.com/cloud
location = "fsn1" # change to `ash` for us-east Ashburn, Virginia location
network_region = "eu-central" # change to `us-east` if location is ash
agent_server_type = "cx41"
control_plane_server_type = "cx21"
lb_server_type = "lb21"
# At least 3 server nodes is recommended for HA, otherwise you need to turn off automatic upgrade (see ReadMe).
servers_num = 3
# For agent nodes, at least 2 is recommended for HA, but you can keep automatic upgrades.
agents_num = 3
# If you want to use a specific Hetzner CCM and CSI version, set them below, otherwise leave as is for the latest versions
# hetzner_ccm_version = ""
# hetzner_csi_version = ""
# If you want to kustomize the Hetzner CCM and CSI containers with the "latest" tags and imagePullPolicy Always,
# to have them automatically update when the nodes themselves get updated via the rancher system upgrade controller, the default is "false".
# If you choose to keep the default of "false", you can always use ArgoCD to monitor the CSI and CCM manifest for new releases,
# that is probably the more "vanilla" option to keep these components always updated.
# hetzner_ccm_containers_latest = true
# hetzner_csi_containers_latest = true
# If you want to use letsencrypt with tls Challenge, the email address is used to send you certificates expiration notices
traefik_acme_tls = true
traefik_acme_email = "my-email"
# If you want to allow non-control-plane workloads to run on the control-plane nodes set "true" below. The default is "false".
# allow_scheduling_on_control_plane = true
I have not edited any other file.
This issue provides visibility into Renovate updates and their statuses.
This repository currently has no open or pending branches.
Hi,
the metrics-server in my cluster is unable to scrape metrics from nodes:
I0305 08:15:26.352996 1 serving.go:341] Generated self-signed cert (/tmp/apiserver.crt, /tmp/apiserver.key)
I0305 08:15:26.719418 1 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0305 08:15:26.719485 1 shared_informer.go:240] Waiting for caches to sync for RequestHeaderAuthRequestController
I0305 08:15:26.719421 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0305 08:15:26.719520 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0305 08:15:26.719496 1 configmap_cafile_content.go:202] Starting client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0305 08:15:26.719646 1 shared_informer.go:240] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0305 08:15:26.720159 1 dynamic_serving_content.go:130] Starting serving-cert::/tmp/apiserver.crt::/tmp/apiserver.key
I0305 08:15:26.720267 1 secure_serving.go:202] Serving securely on :4443
I0305 08:15:26.720356 1 tlsconfig.go:240] Starting DynamicServingCertificateController
E0305 08:15:26.723158 1 scraper.go:139] "Failed to scrape node" err="Get \"https://10.2.0.1:10250/stats/summary?only_cpu_and_memory=true\": x509: certificate is valid for 127.0.0.1, 88.198.105.71, not 10.2.0.1" node="agent-big-0"
I0305 08:15:26.820176 1 shared_informer.go:247] Caches are synced for RequestHeaderAuthRequestController
I0305 08:15:26.820185 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0305 08:15:26.820191 1 shared_informer.go:247] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0305 08:15:27.267832 1 server.go:188] "Failed probe" probe="metric-storage-ready" err="not metrics to serve"
...
E0305 08:16:26.707787 1 scraper.go:139] "Failed to scrape node" err="Get \"https://10.2.0.1:10250/stats/summary?only_cpu_and_memory=true\": x509: certificate is valid for 127.0.0.1, 88.198.105.71, not 10.2.0.1" node="agent-big-0"
My cluster only consists of that one big agent and 3 control nodes. Any idea what's happening here?
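The kubelet's serving certificate here only covers 127.0.0.1 and the public IP, so scraping the private 10.2.0.1 address fails TLS verification. A common workaround, sketched below and not specific to this module, is to point metrics-server at an address the certificate covers, or to skip kubelet certificate verification (both flags are standard metrics-server options; weigh --kubelet-insecure-tls against your security requirements):

```shell
# Assumes the metrics-server container already has an args list.
kubectl -n kube-system patch deployment metrics-server --type=json -p='[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-",
   "value": "--kubelet-preferred-address-types=ExternalIP,Hostname"},
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-",
   "value": "--kubelet-insecure-tls"}
]'
```

The cleaner long-term fix is getting the node's private IP into the kubelet certificate's SANs, but the patch above is the quick way to confirm the diagnosis.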
One of the nodes was tainted. I tried to reapply Terraform via tf apply.
That didn't work, so I deleted the node (agent-3) completely in the Hetzner UI and tried a tf plan and tf apply --auto-approve:
hcloud_server.agents[3]: Creating...
╷
│ Error: hcloud/inlineAttachServerToNetwork: attach server to network: provided IP is not available (ip_not_available)
│
│ with hcloud_server.agents[3],
│ on agents.tf line 1, in resource "hcloud_server" "agents":
│ 1: resource "hcloud_server" "agents" {
Agent-3 was created but not attached to Kubernetes.
After that I tried to increase the node count from 3 agents to 5.
Node agent-4 was created and attached, but agent-3 was still not able to attach:
complete output:
tf apply --auto-approve
random_password.k3s_token: Refreshing state... [id=none]
local_file.traefik_config: Refreshing state... [id=25ba84696ee16d68f5b98f6ea6b70bb14c3c530c]
hcloud_placement_group.k3s_placement_group: Refreshing state... [id=19653]
hcloud_ssh_key.default: Refreshing state... [id=5492430]
hcloud_network.k3s: Refreshing state... [id=1352333]
hcloud_firewall.k3s: Refreshing state... [id=290151]
hcloud_network_subnet.k3s: Refreshing state... [id=1352333-10.0.0.0/16]
local_file.hetzner_csi_config: Refreshing state... [id=aa232912bcf86722e32b698e1e077522c7f02a9d]
local_file.hetzner_ccm_config: Refreshing state... [id=f5ec6cb5689cb5830d04857365d567edae562174]
hcloud_server.first_control_plane: Refreshing state... [id=17736249]
hcloud_server.control_planes[0]: Refreshing state... [id=17736377]
hcloud_server.control_planes[1]: Refreshing state... [id=17736378]
hcloud_server.agents[5]: Refreshing state... [id=17861319]
hcloud_server.agents[3]: Refreshing state... [id=17869801]
hcloud_server.agents[0]: Refreshing state... [id=17736379]
hcloud_server.agents[1]: Refreshing state... [id=17736385]
hcloud_server.agents[4]: Refreshing state... [id=17858945]
hcloud_server.agents[2]: Refreshing state... [id=17736383]
Note: Objects have changed outside of Terraform
Terraform detected the following changes made outside of Terraform since the last "terraform apply":
# hcloud_placement_group.k3s_placement_group has been changed
~ resource "hcloud_placement_group" "k3s_placement_group" {
id = "19653"
name = "k3s-placement-group"
~ servers = [
+ 17869801,
# (8 unchanged elements hidden)
]
# (2 unchanged attributes hidden)
}
# hcloud_server.agents[3] has been changed
~ resource "hcloud_server" "agents" {
+ datacenter = "fsn1-dc14"
id = "17869801"
+ ipv4_address = "78.47.82.149"
+ ipv6_address = "2a01:4f8:c17:8d4a::1"
+ ipv6_network = "2a01:4f8:c17:8d4a::/64"
name = "k3s-agent-3"
+ status = "running"
# (12 unchanged attributes hidden)
- network {
- alias_ips = [] -> null
- ip = "10.0.0.8" -> null
- network_id = 1352333 -> null
}
}
# hcloud_firewall.k3s has been changed
~ resource "hcloud_firewall" "k3s" {
id = "290151"
name = "k3s-firewall"
# (1 unchanged attribute hidden)
+ apply_to {
+ server = 17869801
}
# (21 unchanged blocks hidden)
}
Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may
include actions to undo or respond to these changes.
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement
Terraform will perform the following actions:
# hcloud_server.agents[3] is tainted, so must be replaced
-/+ resource "hcloud_server" "agents" {
+ backup_window = (known after apply)
~ datacenter = "fsn1-dc14" -> (known after apply)
~ id = "17869801" -> (known after apply)
~ ipv4_address = "78.47.82.xxx" -> (known after apply)
~ ipv6_address = "2a01:4f8:c17:xxxx::1" -> (known after apply)
~ ipv6_network = "2a01:4f8:c17:xxxx::/64" -> (known after apply)
name = "k3s-agent-3"
~ status = "running" -> (known after apply)
# (12 unchanged attributes hidden)
+ network {
+ alias_ips = []
+ ip = "10.0.0.8"
+ mac_address = (known after apply)
+ network_id = 1352333
}
}
Plan: 1 to add, 0 to change, 1 to destroy.
Changes to Outputs:
~ agents_public_ip = [
# (2 unchanged elements hidden)
"138.201.246.xxx",
+ (known after apply),
+ "78.46.163.xxx",
+ "49.12.100.xxx",
]
hcloud_server.agents[3]: Destroying... [id=17869801]
hcloud_server.agents[3]: Destruction complete after 2s
hcloud_server.agents[3]: Creating...
hcloud_server.agents[3]: Still creating... [10s elapsed]
╷
│ Error: hcloud/inlineAttachServerToNetwork: attach server to network: provided IP is not available (ip_not_available)
│
│ with hcloud_server.agents[3],
│ on agents.tf line 1, in resource "hcloud_server" "agents":
│ 1: resource "hcloud_server" "agents" {
│
╵
How can I get agent-3 working again?
Thank you in advance.
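A possible recovery path, assuming the private IP 10.0.0.8 is still held by a stale attachment left over from the UI-side delete: inspect what currently exists, drop the half-created agent from Terraform state so it is recreated from scratch, and re-apply. This is a sketch, not a guaranteed fix:

```shell
# See which servers still exist on the Hetzner side.
hcloud server list
# Forget the broken agent in Terraform state (resource address taken
# from the plan output above), then let Terraform recreate it.
terraform state rm 'hcloud_server.agents[3]'
terraform apply
```

If the IP is still reported as unavailable afterwards, checking the network's attached servers in the Hetzner console for a leftover holder of 10.0.0.8 would be the next step.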
Just apply the following command: kubectl apply -f https://raw.githubusercontent.com/hetznercloud/csi-driver/v1.6.0/deploy/kubernetes/hcloud-csi.yml
Somehow terraform destroy keeps hanging on destroying the network:
hcloud_placement_group.k3s_placement_group: Destruction complete after 0s
hcloud_firewall.k3s: Destruction complete after 0s
...
..
.
hcloud_network_subnet.k3s: Still destroying... [id=1352246-10.0.0.0/16, 10m40s elapsed]
...
hcloud_network_subnet.k3s: Still destroying... [id=1352246-10.0.0.0/16, 11m40s elapsed]
I also tried to re-run terraform destroy.
When I manually delete the network in the UI, it finishes a few seconds later.
Can someone please tell me what I am doing wrong?
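One plausible explanation, consistent with the manual delete unblocking things: resources created outside Terraform, such as load balancers provisioned by the Hetzner CCM or leftover CSI volumes, can keep the network attached, so the subnet delete waits indefinitely. A sketch of what to check before destroying:

```shell
# Look for leftovers that Terraform does not manage but that may still
# reference the k3s network.
hcloud load-balancer list
hcloud volume list
# Delete any that belong to the cluster, then retry:
terraform destroy
```

Deleting cluster-created load balancers and volumes first should let the network and subnet destroy complete on their own.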
❯ tf apply --auto-approve
Terraform used the selected providers to generate the following execution plan. Resource actions are
indicated with the following symbols:
+ create
<= read (data resources)
Terraform will perform the following actions:
# data.remote_file.kubeconfig will be read during apply
# (config refers to values not yet known)
<= data "remote_file" "kubeconfig" {
+ content = (known after apply)
+ id = (known after apply)
+ path = "/etc/rancher/k3s/k3s.yaml"
+ conn {
+ agent = false
+ host = (known after apply)
+ port = 22
+ private_key = (sensitive value)
+ user = "root"
}
}
# hcloud_firewall.k3s will be created
+ resource "hcloud_firewall" "k3s" {
+ id = (known after apply)
+ labels = (known after apply)
+ name = "k3s"
+ rule {
+ destination_ips = [
+ "0.0.0.0/0",
]
+ direction = "out"
+ protocol = "icmp"
+ source_ips = []
}
+ rule {
+ destination_ips = [
+ "0.0.0.0/0",
]
+ direction = "out"
+ port = "123"
+ protocol = "udp"
+ source_ips = []
}
+ rule {
+ destination_ips = [
+ "0.0.0.0/0",
]
+ direction = "out"
+ port = "443"
+ protocol = "tcp"
+ source_ips = []
}
+ rule {
+ destination_ips = [
+ "0.0.0.0/0",
]
+ direction = "out"
+ port = "53"
+ protocol = "tcp"
+ source_ips = []
}
+ rule {
+ destination_ips = [
+ "0.0.0.0/0",
]
+ direction = "out"
+ port = "53"
+ protocol = "udp"
+ source_ips = []
}
+ rule {
+ destination_ips = [
+ "0.0.0.0/0",
]
+ direction = "out"
+ port = "80"
+ protocol = "tcp"
+ source_ips = []
}
+ rule {
+ destination_ips = []
+ direction = "in"
+ protocol = "icmp"
+ source_ips = [
+ "0.0.0.0/0",
]
}
+ rule {
+ destination_ips = []
+ direction = "in"
+ protocol = "icmp"
+ source_ips = [
+ "10.0.0.0/8",
+ "127.0.0.1/32",
+ "169.254.169.254/32",
+ "213.239.246.1/32",
]
}
+ rule {
+ destination_ips = []
+ direction = "in"
+ port = "22"
+ protocol = "tcp"
+ source_ips = [
+ "0.0.0.0/0",
]
}
+ rule {
+ destination_ips = []
+ direction = "in"
+ port = "6443"
+ protocol = "tcp"
+ source_ips = [
+ "0.0.0.0/0",
]
}
+ rule {
+ destination_ips = []
+ direction = "in"
+ port = "any"
+ protocol = "tcp"
+ source_ips = [
+ "10.0.0.0/8",
+ "127.0.0.1/32",
+ "169.254.169.254/32",
+ "213.239.246.1/32",
]
}
+ rule {
+ destination_ips = []
+ direction = "in"
+ port = "any"
+ protocol = "udp"
+ source_ips = [
+ "10.0.0.0/8",
+ "127.0.0.1/32",
+ "169.254.169.254/32",
+ "213.239.246.1/32",
]
}
}
# hcloud_network.k3s will be created
+ resource "hcloud_network" "k3s" {
+ delete_protection = false
+ id = (known after apply)
+ ip_range = "10.0.0.0/8"
+ name = "k3s"
}
# hcloud_network_subnet.k3s will be created
+ resource "hcloud_network_subnet" "k3s" {
+ gateway = (known after apply)
+ id = (known after apply)
+ ip_range = "10.0.0.0/16"
+ network_id = (known after apply)
+ network_zone = "eu-central"
+ type = "cloud"
}
# hcloud_placement_group.k3s will be created
+ resource "hcloud_placement_group" "k3s" {
+ id = (known after apply)
+ labels = {
+ "engine" = "k3s"
+ "provisioner" = "terraform"
}
+ name = "k3s"
+ servers = (known after apply)
+ type = "spread"
}
# hcloud_server.agents[0] will be created
+ resource "hcloud_server" "agents" {
+ backup_window = (known after apply)
+ backups = false
+ datacenter = (known after apply)
+ delete_protection = false
+ firewall_ids = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-20.04"
+ ipv4_address = (known after apply)
+ ipv6_address = (known after apply)
+ ipv6_network = (known after apply)
+ keep_disk = false
+ labels = {
+ "engine" = "k3s"
+ "provisioner" = "terraform"
}
+ location = "fsn1"
+ name = "k3s-agent-0"
+ placement_group_id = (known after apply)
+ rebuild_protection = false
+ rescue = "linux64"
+ server_type = "cpx21"
+ ssh_keys = (known after apply)
+ status = (known after apply)
+ network {
+ alias_ips = []
+ ip = "10.0.1.1"
+ mac_address = (known after apply)
+ network_id = (known after apply)
}
}
# hcloud_server.agents[1] will be created
+ resource "hcloud_server" "agents" {
+ backup_window = (known after apply)
+ backups = false
+ datacenter = (known after apply)
+ delete_protection = false
+ firewall_ids = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-20.04"
+ ipv4_address = (known after apply)
+ ipv6_address = (known after apply)
+ ipv6_network = (known after apply)
+ keep_disk = false
+ labels = {
+ "engine" = "k3s"
+ "provisioner" = "terraform"
}
+ location = "fsn1"
+ name = "k3s-agent-1"
+ placement_group_id = (known after apply)
+ rebuild_protection = false
+ rescue = "linux64"
+ server_type = "cpx21"
+ ssh_keys = (known after apply)
+ status = (known after apply)
+ network {
+ alias_ips = []
+ ip = "10.0.1.2"
+ mac_address = (known after apply)
+ network_id = (known after apply)
}
}
# hcloud_server.control_planes[0] will be created
+ resource "hcloud_server" "control_planes" {
+ backup_window = (known after apply)
+ backups = false
+ datacenter = (known after apply)
+ delete_protection = false
+ firewall_ids = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-20.04"
+ ipv4_address = (known after apply)
+ ipv6_address = (known after apply)
+ ipv6_network = (known after apply)
+ keep_disk = false
+ labels = {
+ "engine" = "k3s"
+ "provisioner" = "terraform"
}
+ location = "fsn1"
+ name = "k3s-control-plane-1"
+ placement_group_id = (known after apply)
+ rebuild_protection = false
+ rescue = "linux64"
+ server_type = "cpx11"
+ ssh_keys = (known after apply)
+ status = (known after apply)
+ network {
+ alias_ips = []
+ ip = "10.0.0.3"
+ mac_address = (known after apply)
+ network_id = (known after apply)
}
}
# hcloud_server.control_planes[1] will be created
+ resource "hcloud_server" "control_planes" {
+ backup_window = (known after apply)
+ backups = false
+ datacenter = (known after apply)
+ delete_protection = false
+ firewall_ids = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-20.04"
+ ipv4_address = (known after apply)
+ ipv6_address = (known after apply)
+ ipv6_network = (known after apply)
+ keep_disk = false
+ labels = {
+ "engine" = "k3s"
+ "provisioner" = "terraform"
}
+ location = "fsn1"
+ name = "k3s-control-plane-2"
+ placement_group_id = (known after apply)
+ rebuild_protection = false
+ rescue = "linux64"
+ server_type = "cpx11"
+ ssh_keys = (known after apply)
+ status = (known after apply)
+ network {
+ alias_ips = []
+ ip = "10.0.0.4"
+ mac_address = (known after apply)
+ network_id = (known after apply)
}
}
# hcloud_server.first_control_plane will be created
+ resource "hcloud_server" "first_control_plane" {
+ backup_window = (known after apply)
+ backups = false
+ datacenter = (known after apply)
+ delete_protection = false
+ firewall_ids = (known after apply)
+ id = (known after apply)
+ image = "ubuntu-20.04"
+ ipv4_address = (known after apply)
+ ipv6_address = (known after apply)
+ ipv6_network = (known after apply)
+ keep_disk = false
+ labels = {
+ "engine" = "k3s"
+ "provisioner" = "terraform"
}
+ location = "fsn1"
+ name = "k3s-control-plane-0"
+ placement_group_id = (known after apply)
+ rebuild_protection = false
+ rescue = "linux64"
+ server_type = "cpx11"
+ ssh_keys = (known after apply)
+ status = (known after apply)
+ network {
+ alias_ips = []
+ ip = "10.0.0.2"
+ mac_address = (known after apply)
+ network_id = (known after apply)
}
}
# hcloud_ssh_key.k3s will be created
+ resource "hcloud_ssh_key" "k3s" {
+ fingerprint = (known after apply)
+ id = (known after apply)
+ name = "k3s"
+ public_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC5mH6iwpbJY+ssGIJUVsClE5LO/e9/YhA2k+oOP6VzxK2f9GutJu6wYNd6re5Ma1BRZL1ld95QKs/k1F1HWq75y1VJMawD+72+7OR6eT1nwJyrFDVk801UgCuOPJtLGAjNXx9uT2AMKZ08crnRGap3XzjLynVxoeETndINMew3LKnaL3zGkrDRRZnysrIoB3c8ywS9WlQxB5M3zdMICQ6aqsonIHChDybHnKb+wEKFUbND5ga/V1VG2GUR18uNGu01Zpxxof566C+26owSfrnA9R7KllUI/+/zYTqFRt5a2F3B/k0I+5WhSsAuRbI/eundl1oTP4sAtJ8qKBt20VYL [email protected]"
}
# local_file.kubeconfig will be created
+ resource "local_file" "kubeconfig" {
+ directory_permission = "0777"
+ file_permission = "600"
+ filename = "kubeconfig.yaml"
+ id = (known after apply)
+ sensitive_content = (sensitive value)
}
# random_password.k3s_token will be created
+ resource "random_password" "k3s_token" {
+ id = (known after apply)
+ length = 48
+ lower = true
+ min_lower = 0
+ min_numeric = 0
+ min_special = 0
+ min_upper = 0
+ number = true
+ result = (sensitive value)
+ special = false
+ upper = true
}
Plan: 12 to add, 0 to change, 0 to destroy.
Changes to Outputs:
+ agents_public_ip = [
+ (known after apply),
+ (known after apply),
]
+ controlplanes_public_ip = [
+ (known after apply),
+ (known after apply),
+ (known after apply),
]
+ kubeconfig = (sensitive value)
+ kubeconfig_file = (sensitive value)
random_password.k3s_token: Creating...
random_password.k3s_token: Creation complete after 0s [id=none]
hcloud_network.k3s: Creating...
hcloud_placement_group.k3s: Creating...
hcloud_ssh_key.k3s: Creating...
hcloud_firewall.k3s: Creating...
hcloud_placement_group.k3s: Creation complete after 1s [id=21532]
hcloud_ssh_key.k3s: Creation complete after 1s [id=5557860]
hcloud_network.k3s: Creation complete after 1s [id=1370757]
hcloud_network_subnet.k3s: Creating...
hcloud_firewall.k3s: Creation complete after 1s [id=300569]
hcloud_network_subnet.k3s: Creation complete after 1s [id=1370757-10.0.0.0/16]
hcloud_server.first_control_plane: Creating...
hcloud_server.first_control_plane: Still creating... [10s elapsed]
hcloud_server.first_control_plane: Provisioning with 'file'...
hcloud_server.first_control_plane: Still creating... [20s elapsed]
hcloud_server.first_control_plane: Still creating... [30s elapsed]
hcloud_server.first_control_plane: Provisioning with 'remote-exec'...
hcloud_server.first_control_plane (remote-exec): Connecting to remote host via SSH...
hcloud_server.first_control_plane (remote-exec): Host: 49.12.221.176
hcloud_server.first_control_plane (remote-exec): User: root
hcloud_server.first_control_plane (remote-exec): Password: false
hcloud_server.first_control_plane (remote-exec): Private key: true
hcloud_server.first_control_plane (remote-exec): Certificate: false
hcloud_server.first_control_plane (remote-exec): SSH Agent: true
hcloud_server.first_control_plane (remote-exec): Checking Host Key: false
hcloud_server.first_control_plane (remote-exec): Target Platform: unix
hcloud_server.first_control_plane (remote-exec): Connected!
hcloud_server.first_control_plane (remote-exec): + apt-get install -y aria2
hcloud_server.first_control_plane: Still creating... [40s elapsed]
hcloud_server.first_control_plane (remote-exec): Reading package lists... Done
hcloud_server.first_control_plane (remote-exec): Building dependency tree... Done
hcloud_server.first_control_plane (remote-exec): Reading state information... Done
hcloud_server.first_control_plane (remote-exec): The following additional packages will be installed:
hcloud_server.first_control_plane (remote-exec): libaria2-0 libc-ares2
hcloud_server.first_control_plane (remote-exec): The following NEW packages will be installed:
hcloud_server.first_control_plane (remote-exec): aria2 libaria2-0 libc-ares2
hcloud_server.first_control_plane (remote-exec): 0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
hcloud_server.first_control_plane (remote-exec): Need to get 1,571 kB of archives.
hcloud_server.first_control_plane (remote-exec): After this operation, 6,225 kB of additional disk space will be used.
hcloud_server.first_control_plane (remote-exec): Get:1 http://mirror.hetzner.com/debian/packages bullseye/main amd64 libc-ares2 amd64 1.17.1-1+deb11u1 [102 kB]
hcloud_server.first_control_plane (remote-exec): Get:2 http://mirror.hetzner.com/debian/packages bullseye/main amd64 libaria2-0 amd64 1.35.0-3 [1,107 kB]
hcloud_server.first_control_plane (remote-exec): Get:3 http://mirror.hetzner.com/debian/packages bullseye/main amd64 aria2 amd64 1.35.0-3 [362 kB]
hcloud_server.first_control_plane (remote-exec): Fetched 1,571 kB in 0s (4,481 kB/s)
hcloud_server.first_control_plane (remote-exec): Selecting previously unselected package libc-ares2:amd64.
hcloud_server.first_control_plane (remote-exec): (Reading database ... 62163 files and directories currently installed.)
hcloud_server.first_control_plane (remote-exec): Preparing to unpack .../libc-ares2_1.17.1-1+deb11u1_amd64.deb ...
hcloud_server.first_control_plane (remote-exec): Unpacking libc-ares2:amd64 (1.17.1-1+deb11u1) ...
hcloud_server.first_control_plane (remote-exec): Selecting previously unselected package libaria2-0:amd64.
hcloud_server.first_control_plane (remote-exec): Preparing to unpack .../libaria2-0_1.35.0-3_amd64.deb ...
hcloud_server.first_control_plane (remote-exec): Unpacking libaria2-0:amd64 (1.35.0-3) ...
hcloud_server.first_control_plane (remote-exec): Selecting previously unselected package aria2.
hcloud_server.first_control_plane (remote-exec): Preparing to unpack .../aria2_1.35.0-3_amd64.deb ...
hcloud_server.first_control_plane (remote-exec): Unpacking aria2 (1.35.0-3) ...
hcloud_server.first_control_plane (remote-exec): Setting up libc-ares2:amd64 (1.17.1-1+deb11u1) ...
hcloud_server.first_control_plane (remote-exec): Setting up libaria2-0:amd64 (1.35.0-3) ...
hcloud_server.first_control_plane (remote-exec): Setting up aria2 (1.35.0-3) ...
hcloud_server.first_control_plane (remote-exec): Processing triggers for man-db (2.9.4-2) ...
hcloud_server.first_control_plane (remote-exec): Processing triggers for libc-bin (2.31-13+deb11u2) ...
hcloud_server.first_control_plane (remote-exec): + aria2c --follow-metalink=mem https://download.opensuse.org/tumbleweed/appliances/openSUSE-MicroOS.x86_64-k3s-kvm-and-xen.qcow2.meta4
hcloud_server.first_control_plane (remote-exec): 02/13 07:16:42 [NOTICE] Downloading 1 item(s)
hcloud_server.first_control_plane (remote-exec): 02/13 07:16:43 [NOTICE] Download complete: [MEMORY]openSUSE-MicroOS.x86_64-16.0.0-k3s-kvm-and-xen-Snapshot20220210.qcow2.meta4
hcloud_server.first_control_plane (remote-exec): 02/13 07:16:54 [NOTICE] Download complete: /root/openSUSE-MicroOS.x86_64-16.0.0-k3s-kvm-and-xen-Snapshot20220210.qcow2
hcloud_server.first_control_plane (remote-exec): Download Results:
hcloud_server.first_control_plane (remote-exec): gid |stat|avg speed |path/URI
hcloud_server.first_control_plane (remote-exec): ======+====+===========+=======================================================
hcloud_server.first_control_plane (remote-exec): 19122d|OK | 141KiB/s|[MEMORY]openSUSE-MicroOS.x86_64-16.0.0-k3s-kvm-and-xen-Snapshot20220210.qcow2.meta4
hcloud_server.first_control_plane (remote-exec): 3aa26e|OK | 53MiB/s|/root/openSUSE-MicroOS.x86_64-16.0.0-k3s-kvm-and-xen-Snapshot20220210.qcow2
hcloud_server.first_control_plane (remote-exec): Status Legend:
hcloud_server.first_control_plane (remote-exec): (OK):download completed.
hcloud_server.first_control_plane (remote-exec): + + grep -ie ^opensuse.*microos.*k3s.*qcow2$
hcloud_server.first_control_plane (remote-exec): ls -a
hcloud_server.first_control_plane (remote-exec): + qemu-img convert -p -f qcow2 -O host_device openSUSE-MicroOS.x86_64-16.0.0-k3s-kvm-and-xen-Snapshot20220210.qcow2 /dev/sda
hcloud_server.first_control_plane (remote-exec): (0.00/100%)
hcloud_server.first_control_plane (remote-exec): (100.00/100%)
hcloud_server.first_control_plane (remote-exec): + sgdisk -e /dev/sda
hcloud_server.first_control_plane (remote-exec): The operation has completed successfully.
hcloud_server.first_control_plane (remote-exec): + parted -s /dev/sda resizepart 4 99%
hcloud_server.first_control_plane (remote-exec): + parted -s /dev/sda mkpart primary ext2 99% 100%
hcloud_server.first_control_plane (remote-exec): + partprobe /dev/sda
hcloud_server.first_control_plane (remote-exec): + udevadm settle
hcloud_server.first_control_plane (remote-exec): + fdisk -l /dev/sda
hcloud_server.first_control_plane (remote-exec): Disk /dev/sda: 38.15 GiB, 40961572864 bytes, 80003072 sectors
hcloud_server.first_control_plane (remote-exec): Disk model: QEMU HARDDISK
hcloud_server.first_control_plane (remote-exec): Units: sectors of 1 * 512 = 512 bytes
hcloud_server.first_control_plane (remote-exec): Sector size (logical/physical): 512 bytes / 512 bytes
hcloud_server.first_control_plane (remote-exec): I/O size (minimum/optimal): 512 bytes / 512 bytes
hcloud_server.first_control_plane (remote-exec): Disklabel type: gpt
hcloud_server.first_control_plane (remote-exec): Disk identifier: EC33AA26-C0DC-4B6C-AF09-4CA8108C7753
hcloud_server.first_control_plane (remote-exec): Device Start End Sectors Size Type
hcloud_server.first_control_plane (remote-exec): /dev/sda1 2048 6143 4096 2M BIOS
hcloud_server.first_control_plane (remote-exec): /dev/sda2 6144 47103 40960 20M EFI
hcloud_server.first_control_plane (remote-exec): /dev/sda3 47104 31438847 31391744 15G Linu
hcloud_server.first_control_plane (remote-exec): /dev/sda4 31438848 79203041 47764194 22.8G Linu
hcloud_server.first_control_plane (remote-exec): /dev/sda5 79204352 80001023 796672 389M Linu
hcloud_server.first_control_plane (remote-exec): + mount /dev/sda4 /mnt/
hcloud_server.first_control_plane (remote-exec): + btrfs filesystem resize max /mnt
hcloud_server.first_control_plane (remote-exec): Resize '/mnt' of 'max'
hcloud_server.first_control_plane (remote-exec): + umount /mnt
hcloud_server.first_control_plane (remote-exec): + mke2fs -L ignition /dev/sda5
hcloud_server.first_control_plane (remote-exec): mke2fs 1.46.2 (28-Feb-2021)
hcloud_server.first_control_plane (remote-exec): Discarding device blocks: done
hcloud_server.first_control_plane (remote-exec): Creating filesystem with 398336 1k blocks and 99960 inodes
hcloud_server.first_control_plane (remote-exec): Filesystem UUID: 8a3cd038-472e-4812-abe5-ad2f7a5980ef
hcloud_server.first_control_plane (remote-exec): Superblock backups stored on blocks:
hcloud_server.first_control_plane (remote-exec): 8193, 24577, 40961, 57345, 73729, 204801, 221185
hcloud_server.first_control_plane (remote-exec): Allocating group tables: done
hcloud_server.first_control_plane (remote-exec): Writing inode tables: done
hcloud_server.first_control_plane (remote-exec): Writing superblocks and filesystem accounting information: done
hcloud_server.first_control_plane (remote-exec): + mount /dev/sda5 /mnt
hcloud_server.first_control_plane (remote-exec): + mkdir /mnt/ignition
hcloud_server.first_control_plane (remote-exec): + cp /root/config.ign /mnt/ignition/config.ign
hcloud_server.first_control_plane (remote-exec): + umount /mnt
hcloud_server.first_control_plane: Provisioning with 'local-exec'...
hcloud_server.first_control_plane (local-exec): Executing: ["/bin/sh" "-c" "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ~/.ssh/id_rsa [email protected] '(sleep 2; reboot)&'; sleep 3"]
hcloud_server.first_control_plane: Still creating... [1m10s elapsed]
hcloud_server.first_control_plane (local-exec): Warning: Permanently added '49.12.221.176' (ECDSA) to the list of known hosts.
hcloud_server.first_control_plane (local-exec): Connection to 49.12.221.176 closed by remote host.
hcloud_server.first_control_plane: Provisioning with 'local-exec'...
hcloud_server.first_control_plane (local-exec): Executing: ["/bin/sh" "-c" "until ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ~/.ssh/id_rsa -o ConnectTimeout=2 [email protected] true 2> /dev/null\ndo\n echo \"Waiting for MicroOS to reboot and become available...\"\n sleep 2\ndone\n"]
hcloud_server.first_control_plane (local-exec): Waiting for MicroOS to reboot and become available...
hcloud_server.first_control_plane: Still creating... [1m20s elapsed]
hcloud_server.first_control_plane: Still creating... [1m50s elapsed]
hcloud_server.first_control_plane (local-exec): Waiting for MicroOS to reboot and become available...
hcloud_server.first_control_plane: Provisioning with 'file'...
hcloud_server.first_control_plane: Still creating... [2m0s elapsed]
hcloud_server.first_control_plane: Still creating... [6m50s elapsed]
╷
│ Error: file provisioner error
│
│ with hcloud_server.first_control_plane,
│ on master.tf line 54, in resource "hcloud_server" "first_control_plane":
│ 54: provisioner "file" {
│
│ timeout - last error: dial tcp 49.12.221.176:22: connect: operation timed out
╵
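The until-loop in the local-exec output above is a generic retry-until-success pattern. A minimal standalone sketch, with the `ssh ... true` probe replaced by a stub (`probe`, a hypothetical function that fails twice before succeeding):

```shell
#!/bin/sh
# Retry-until-success loop, as in the provisioner's local-exec script.
# `probe` stands in for `ssh -o ConnectTimeout=2 root@<ip> true`.
attempts=0
probe() {
  attempts=$((attempts + 1))
  [ "$attempts" -ge 3 ]    # succeed on the third try
}

until probe 2> /dev/null
do
  echo "Waiting for MicroOS to reboot and become available..."
  sleep 1   # the real script waits 2-3 seconds between probes
done
echo "Host is up after $attempts attempts"
```

The loop only exits when the probe returns success, which is why a host that never comes back up (as in the timeout error above) leaves Terraform hanging until the provisioner's own timeout fires.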
Hi again :) So... I thought I understood, but I continue to have issues setting up simple ingress routes. I understand if this is out of scope for the kube-hetzner project, as it may well just be a Traefik configuration that I don't understand. In this scenario, I have installed the whoami helm chart:
helm repo add cowboysysop https://cowboysysop.github.io/charts/
helm install my-release cowboysysop/whoami
And I can see the ingress, service, and pod all deployed correctly into the default namespace. Port-forwarding to the service or pod displays the whoami information, but the external route (https://whoami.site.com) returns an empty reply, as if the request was never 'caught' by the ingress. I keep getting these messages in the Traefik pod's log:
level=error msg="Skipping service: no endpoints found" ingress=whoami-1645104801 namespace=default serviceName=whoami-1645104801 servicePort="&ServiceBackendPort{Name:,Number:80,}" providerName=kubernetes
As always, any help is much appreciated.
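Traefik's "no endpoints found" usually means the Service's selector matches no ready pods (a label mismatch, or pods failing their readiness probe). A quick way to check against a live cluster (the release name below is taken from the log line above; adjust to yours):

```sh
# Does the Service have any endpoints at all?
kubectl get endpoints whoami-1645104801 -n default
# Compare the Service's selector with the actual pod labels
kubectl get svc whoami-1645104801 -n default -o jsonpath='{.spec.selector}'
kubectl get pods -n default --show-labels
```

If the endpoints list is empty while the pod is Running, the selector and the pod labels do not match, or the pod is not yet Ready.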
@mnencia @phaer At some point I was getting these errors:
It wouldn't even let me curl the links. nslookup would return the IP, and from my personal machine that IP responds over HTTPS, but from the node, total silence. In other words, the IP had downloaded so much that it was blacklisted.
I had to temporarily "host" the meta4 file over at https://raw.githubusercontent.com/kube-hetzner/kube-hetzner/staging/.files/openSUSE-MicroOS.x86_64-k3s-kvm-and-xen.qcow2.meta4
It works like a charm and is 10x faster to download (I have no idea why 🤯); the only downside is that it would be hard to keep the images up to date that way.
Hello, after a fresh terraform apply at Hetzner, all my pods are stuck with the following error message:
0/4 nodes are available: 4 node(s) had taint {node.cloudprovider.kubernetes.io/uninitialized: true}, that the pod didn't tolerate.
Running the command:
kubectl taint nodes --all node.cloudprovider.kubernetes.io/uninitialized-
fixed the problem, but the load balancer is not created.
Do you know how I can fix it, please?
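The `uninitialized` taint is normally removed by the Hetzner cloud-controller-manager once it starts and can reach the hcloud API; if the CCM never comes up (for example because its token secret is missing or invalid), the taint stays and no load balancer is created, so removing the taint by hand only masks the underlying problem. A few checks worth running (the deployment and secret names below are the usual hcloud-ccm defaults; adjust if your install differs):

```sh
# Is the CCM running, and what does it log?
kubectl -n kube-system get pods -l app=hcloud-cloud-controller-manager
kubectl -n kube-system logs deployment/hcloud-cloud-controller-manager
# Does the token secret exist? (usual default name: hcloud)
kubectl -n kube-system get secret hcloud
```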
Hi - first, THANK YOU so much for all the effort you've put into this repository @mysticaltech. I am very new to Kubernetes and the whole k3s ecosystem, and your effort to help others is really wonderful. Kudos to you, sir :)
Second - I have a small issue with my TLS and I cannot get it to work. I simply want blog.domain.com to be secured, and though there are lots of ways to get a cert, I'm simply trying to use a wildcard of my own. I've successfully created this using:
kubectl create secret tls domain-tls --cert ./domain.crt --key ./domain.key
And I have the ingress set up like this (bound to a ghost deployment):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: domain-ingress-blog
spec:
  tls:
    - hosts:
        - blog.domain.com
      secretName: domain-tls
  rules:
    - host: blog.domain.com
      http:
        paths:
          - pathType: Prefix
            path: "/"
            backend:
              service:
                name: domain-blog-ghost
                port:
                  number: 80
And I've created an A record pointing the blog.domain.com to the traefik load balancer IP with my dns provider.
I am missing something, though, because the default Traefik cert is always shown when I hit blog.domain.com:
My Traefik rendered yaml template is below, resulting from the terraform apply:
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    service:
      enabled: true
      type: LoadBalancer
      annotations:
        "load-balancer.hetzner.cloud/name": "traefik"
        # make hetzners load-balancer connect to our nodes via our private k3s-net.
        "load-balancer.hetzner.cloud/use-private-ip": "true"
        # keep hetzner-ccm from exposing our private ingress ip, which in general isn't routeable from the public internet.
        "load-balancer.hetzner.cloud/disable-private-ingress": "true"
        # disable ipv6 by default, because external-dns doesn't support AAAA for hcloud yet https://github.com/kubernetes-sigs/external-dns/issues/2044
        "load-balancer.hetzner.cloud/ipv6-disabled": "false"
        "load-balancer.hetzner.cloud/location": "ash"
        "load-balancer.hetzner.cloud/type": "lb11"
        "load-balancer.hetzner.cloud/uses-proxyprotocol": "true"
        # "load-balancer.hetzner.cloud/http-redirect-http": "true"
        "load-balancer.hetzner.cloud/http-sticky-sessions": "true"
    additionalArguments:
      - "--entryPoints.web.proxyProtocol.trustedIPs=127.0.0.1/32,10.0.0.0/8"
      - "--entryPoints.websecure.proxyProtocol.trustedIPs=127.0.0.1/32,10.0.0.0/8"
      - "--entryPoints.web.forwardedHeaders.trustedIPs=127.0.0.1/32,10.0.0.0/8"
      - "--entryPoints.websecure.forwardedHeaders.trustedIPs=127.0.0.1/32,10.0.0.0/8"
Any help would be greatly appreciated. Thanks again :)
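On the default-certificate symptom: Traefik falls back to its built-in default certificate whenever it cannot resolve the secret referenced by an Ingress, most commonly because the secret lives in a different namespace than the Ingress (TLS secrets are not shared across namespaces). A sketch of what the secret should look like, assuming the Ingress above sits in the `default` namespace:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: domain-tls
  namespace: default   # must match the Ingress's namespace
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded certificate>
  tls.key: <base64-encoded key>
```

If the secret is missing or malformed, the Traefik pod's log usually contains an error naming that secret, which is a quick way to confirm the cause.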
When I set up a cluster, the servers do not appear to be created with a placement group of type "spread".
This should be common practice, though, to maximise availability should a host machine fail.
There is a fairly recent tutorial on Hetzner Community that mentions this.
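For reference, a sketch of how a spread placement group could be wired up (resource names here are illustrative, not the module's actual code):

```terraform
resource "hcloud_placement_group" "k3s" {
  name = "k3s"
  type = "spread"   # spread servers across different physical hosts
}

resource "hcloud_server" "control_plane" {
  name               = "k3s-control-plane-0"
  server_type        = "cpx11"
  image              = "ubuntu-20.04"
  placement_group_id = hcloud_placement_group.k3s.id
}
```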
I am trying to create a small cluster with 1 control plane and 2 agents. I already increased the timeout of the bash script to 500 and I'm still having the issue. I also tried creating a new project and generating a new API token, but the result is the same.
Here's the loop output:
null_resource.first_control_plane: Still creating... [10m50s elapsed]
null_resource.first_control_plane (remote-exec): Waiting for load-balancer to get an IP...
null_resource.first_control_plane (remote-exec): Waiting for load-balancer to get an IP...
null_resource.first_control_plane (remote-exec): Waiting for load-balancer to get an IP...
Here is my vars file
location = "fsn1"
network_region = "eu-central"
agent_server_type = "cpx21"
control_plane_server_type = "cpx11"
lb_server_type = "lb11"
servers_num = 1
agents_num = 2
We need to remove SSH password auth, ideally through Ignition, or if that's not possible, through Combustion. We should also do some basic hardening of that service if needed.
First and foremost, we need to find the location of the SSH config file.
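For context, MicroOS ships a standard OpenSSH server, so assuming the image honors the usual `sshd_config.d` include (worth verifying on a node), the hardening written out via Ignition/Combustion could be a small drop-in like:

```
# /etc/ssh/sshd_config.d/50-hardening.conf — hypothetical drop-in; all options are standard OpenSSH
PasswordAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin prohibit-password
```

Otherwise the main config itself would be at /etc/ssh/sshd_config.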
Recently Rancher, the creator of k3s and k3os, was bought by SUSE, and in the process official support for k3os was dropped (k3s, on the other hand, is thriving and has been separated from Rancher).
I went on to contact Jacob Blain Christen, the lead maintainer of k3os, and he told me that he'll continue to do releases on the weekends and that the project could live on if the community maintained it.
However, that is not a stable backing for this project, so I did my own research and concluded that openSUSE MicroOS has HUGE backing, as it piggybacks on Tumbleweed, a major openSUSE distro, and has stable and automated transactional updates. As such, it's now the best OS to replace k3os.
What I did:
git clone https://github.com/kube-hetzner/kube-hetzner.git
git checkout staging
export TF_VAR_hcloud_token="my-hcloud-token"
Create a terraform.tfvars file:
# You need to replace these
public_key = "~/.ssh/id_ed25519.pub"
# Must be "private_key = null" when you want to use ssh-agent, for Yubikey-like device auth or an SSH key-pair with a passphrase
private_key = "~/.ssh/id_ed25519"
# These can be customized, or left with the default values
# For Hetzner locations see https://docs.hetzner.com/general/others/data-centers-and-connection/
# For Hetzner server types see https://www.hetzner.com/cloud
location = "nbg1" # change to `ash` for us-east Ashburn, Virginia location
network_region = "eu-central" # change to `us-east` if location is ash
agent_server_type = "cpx21"
control_plane_server_type = "cpx31"
lb_server_type = "lb11"
servers_num = 3
agents_num = 0
# If you want to use a specific Hetzner CCM and CSI version, set them below, otherwise leave as is for the latest versions
# hetzner_ccm_version = ""
# hetzner_csi_version = ""
# If you want to kustomize the Hetzner CCM and CSI containers with the "latest" tags and imagePullPolicy Always,
# to have them automatically update when the nodes themselves get updated via the Rancher system upgrade controller, the default is "false".
# If you choose to keep the default of "false", you can always use ArgoCD to monitor the CSI and CCM manifest for new releases,
# that is probably the more "vanilla" option to keep these components always updated.
# hetzner_ccm_containers_latest = true
# hetzner_csi_containers_latest = true
# If you want to use Let's Encrypt with the TLS challenge, the email address is used to send you certificate expiration notices
traefik_acme_tls = true
traefik_acme_email = "[email protected]"
# If you want to allow non-control-plane workloads to run on the control-plane nodes set "true" below. The default is "false".
allow_scheduling_on_control_plane = true
What happened:
All resources are properly scheduled, but the load balancer does not point to the control planes.
What I expect:
I expect a reference from the load balancer to the control planes.
Hi,
it would be great if we could enable IPv6 on the load balancer via a variable and get the v6 address just like the v4 (hcloud_load_balancer.traefik.ipv6)
Thanks
and great project!
Hi @mysticaltech !
Finally got time to test this beast repo this weekend ;)
This might sound a bit silly, but I'm kind of stuck on the remote-exec for the initialization of the first_control_plane.
I did clone and create a new terraform.tfvars
with the corresponding token, public key and private key.
The server spins up fine, but Terraform seems unable to connect to it.
However, if I manually connect from the host with ssh [email protected], this works!
Let me know what I'm missing - and great stuff! thank you for sharing this repo.
The TLS example configures a secret while at the same time providing annotations to let Traefik request a certificate from Let's Encrypt:
https://github.com/kube-hetzner/kube-hetzner/blob/6f6de884ec1baace14b894e5ee1917ffa947e1ca/examples/tls/ingress.yaml#L12
If I understand this correctly, the line referencing the secret is wrong here? If that's the case, I could open a pull request to fix the documentation and example.
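If the conclusion is that the annotations alone should drive the certificate, a minimal version of that example might drop the secretName entirely and keep only the resolver annotation (the resolver name, host, and service below are placeholders; the annotation key is from Traefik's Kubernetes ingress docs):

```yaml
# Hypothetical trimmed-down ingress: TLS via Traefik's ACME resolver, no pre-created secret
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    traefik.ingress.kubernetes.io/router.tls.certresolver: "le"  # placeholder resolver name
spec:
  rules:
    - host: example.com  # placeholder
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service  # placeholder
                port:
                  number: 80
```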
Running terraform apply on an already existing cluster fails because IP changes:
command:
terraform apply -var-file=prod.tfvars -var hcloud_token="REDACTED"
error:
module.kubernetes.hcloud_server.first_control_plane: Modifying... [id=18032205]
╷
│ Error: hcloud/updateServerInlineNetworkAttachments: hcloud/inlineAttachServerToNetwork: attach server to network: provided IP is not available (ip_not_available)
│
│ with module.kubernetes.hcloud_server.first_control_plane,
│ on .terraform/modules/kubernetes/master.tf line 1, in resource "hcloud_server" "first_control_plane":
│ 1: resource "hcloud_server" "first_control_plane" {
│
plan output:
$ terraform plan -var-file=prod.tfvars -var hcloud_token="REDACTED"
module.kubernetes.hcloud_ssh_key.k3s: Refreshing state... [id=5585976]
module.kubernetes.random_password.k3s_token: Refreshing state... [id=none]
module.kubernetes.hcloud_network.k3s: Refreshing state... [id=REDACTED]
module.kubernetes.hcloud_placement_group.k3s: Refreshing state... [id=22200]
module.kubernetes.hcloud_firewall.k3s: Refreshing state... [id=303856]
module.kubernetes.hcloud_network_subnet.k3s: Refreshing state... [id=REDACTED-10.0.0.0/16]
module.kubernetes.hcloud_server.first_control_plane: Refreshing state... [id=18032205]
module.kubernetes.hcloud_server.control_planes[0]: Refreshing state... [id=18036735]
module.kubernetes.hcloud_server.agents[2]: Refreshing state... [id=18076430]
module.kubernetes.hcloud_server.agents[0]: Refreshing state... [id=18032235]
module.kubernetes.hcloud_server.agents[1]: Refreshing state... [id=18036736]
module.kubernetes.hcloud_server.control_planes[1]: Refreshing state... [id=18032231]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
~ update in-place
<= read (data resources)
Terraform will perform the following actions:
# module.kubernetes.data.remote_file.kubeconfig will be read during apply
# (config refers to values not yet known)
<= data "remote_file" "kubeconfig" {
+ content = (known after apply)
+ id = (known after apply)
+ path = "/etc/rancher/k3s/k3s.yaml"
+ conn {
+ agent = (sensitive)
+ host = "REDACTED"
+ port = 22
+ user = "root"
}
}
# module.kubernetes.hcloud_server.agents[0] will be updated in-place
~ resource "hcloud_server" "agents" {
id = "18032235"
name = "k3s-agent-0"
# (17 unchanged attributes hidden)
- network {
- alias_ips = [] -> null
- ip = "10.0.1.1" -> null
- mac_address = "86:00:00:04:6a:7d" -> null
- network_id = REDACTED -> null
}
+ network {
+ alias_ips = []
+ ip = "10.0.2.1"
+ mac_address = (known after apply)
+ network_id = REDACTED
}
}
# module.kubernetes.hcloud_server.agents[1] will be updated in-place
~ resource "hcloud_server" "agents" {
id = "18036736"
name = "k3s-agent-1"
# (17 unchanged attributes hidden)
- network {
- alias_ips = [] -> null
- ip = "10.0.1.2" -> null
- mac_address = "86:00:00:04:70:79" -> null
- network_id = REDACTED -> null
}
+ network {
+ alias_ips = []
+ ip = "10.0.2.2"
+ mac_address = (known after apply)
+ network_id = REDACTED
}
}
# module.kubernetes.hcloud_server.agents[2] will be updated in-place
~ resource "hcloud_server" "agents" {
id = "18076430"
name = "k3s-agent-2"
# (17 unchanged attributes hidden)
- network {
- alias_ips = [] -> null
- ip = "10.0.1.3" -> null
- mac_address = "86:00:00:04:9c:3f" -> null
- network_id = REDACTED -> null
}
+ network {
+ alias_ips = []
+ ip = "10.0.2.3"
+ mac_address = (known after apply)
+ network_id = REDACTED
}
}
# module.kubernetes.hcloud_server.control_planes[0] will be updated in-place
~ resource "hcloud_server" "control_planes" {
id = "18036735"
name = "k3s-control-plane-1"
# (17 unchanged attributes hidden)
- network {
- alias_ips = [] -> null
- ip = "10.0.0.3" -> null
- mac_address = "86:00:00:04:70:78" -> null
- network_id = REDACTED -> null
}
+ network {
+ alias_ips = []
+ ip = "10.0.1.2"
+ mac_address = (known after apply)
+ network_id = REDACTED
}
}
# module.kubernetes.hcloud_server.control_planes[1] will be updated in-place
~ resource "hcloud_server" "control_planes" {
id = "18032231"
name = "k3s-control-plane-2"
# (17 unchanged attributes hidden)
- network {
- alias_ips = [] -> null
- ip = "10.0.0.4" -> null
- mac_address = "86:00:00:04:6a:7a" -> null
- network_id = REDACTED -> null
}
+ network {
+ alias_ips = []
+ ip = "10.0.1.3"
+ mac_address = (known after apply)
+ network_id = REDACTED
}
}
# module.kubernetes.hcloud_server.first_control_plane will be updated in-place
~ resource "hcloud_server" "first_control_plane" {
id = "18032205"
name = "k3s-control-plane-0"
# (17 unchanged attributes hidden)
+ network {
+ alias_ips = []
+ ip = "10.0.1.1"
+ mac_address = (known after apply)
+ network_id = REDACTED
}
}
# module.kubernetes.local_file.kubeconfig will be created
+ resource "local_file" "kubeconfig" {
+ directory_permission = "0777"
+ file_permission = "600"
+ filename = "kubeconfig.yaml"
+ id = (known after apply)
+ sensitive_content = (sensitive value)
}
Hi all - interestingly, I've left the k3s cluster running for some time, and twice now the cluster has become completely unreachable. This happens after a couple of hours, but I am not sure how many. I think it's somehow tied to the auto-rebooting nature of kured,
but that's a guess. If I restart the servers one by one via the Hetzner UI, the cluster comes back online.
This is the error I'm getting:
The connection to the server 5.161.69.37:6443 was refused - did you specify the right host or port?
And when I SSH into that box, this is the status I see for the k3s service:
static:~ # systemctl status k3s-server.service
× k3s-server.service - Lightweight Kubernetes
Loaded: loaded (/usr/lib/systemd/system/k3s-server.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2022-02-18 00:27:29 UTC; 2h 16min ago
Docs: https://k3s.io
Process: 1478 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 1484 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Process: 1485 ExecStart=/usr/bin/k3s server ${SERVER_OPTS} (code=exited, status=2)
Main PID: 1485 (code=exited, status=2)
Tasks: 82
CPU: 1h 2min 31.811s
CGroup: /system.slice/k3s-server.service
├─2215 /usr/sbin/containerd-shim-runc-v2 -namespace k8s.io -id e579d5aad3973a0ce14cbb971f1415469a925718cc2ed07f3556c70688f631f9 -address /run/k3s/containerd/containerd.sock
├─2218 /usr/sbin/containerd-shim-runc-v2 -namespace k8s.io -id a8b619e61d6205f9eb9ae4b7714dc9e5ed68042dcdac8230072682e2f984e972 -address /run/k3s/containerd/containerd.sock
├─2432 /usr/sbin/containerd-shim-runc-v2 -namespace k8s.io -id a31940dafe0a541b55356cdaf11845155b4261ea756ecc253eb4f3d17bacb571 -address /run/k3s/containerd/containerd.sock
├─2513 /usr/sbin/containerd-shim-runc-v2 -namespace k8s.io -id 6b6eb08633e07d89b0485d57bff93bdfd4f768e8da5d8e87edb8bcb58d7c7086 -address /run/k3s/containerd/containerd.sock
├─2656 /usr/sbin/containerd-shim-runc-v2 -namespace k8s.io -id 49b259381dae18c8113ff298eeab50130d6c6836aca4c2a42d57bb577bd0687d -address /run/k3s/containerd/containerd.sock
└─2856 /usr/sbin/containerd-shim-runc-v2 -namespace k8s.io -id a2b2f399521c8270574ab6f5b505806a4775ff4a76a44af53d87b534503a2088 -address /run/k3s/containerd/containerd.sock
Feb 18 00:27:29 static k3s[1485]: /home/abuild/rpmbuild/BUILD/k3s-1.22.3-k3s1/vendor/k8s.io/kubernetes/cmd/kube-controller-manager/app/controllermanager.go:272 +0x745
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Failed with result 'exit-code'.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2215 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2218 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2432 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2513 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2656 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Unit process 2856 (containerd-shim) remains running after unit stopped.
Feb 18 00:27:29 static systemd[1]: k3s-server.service: Consumed 1h 2min 29.664s CPU time.
Happy to dig in to find more detail, but I am pretty new to k3s (and k8s). I would suggest leaving a cluster running for a bit and seeing if this also happens to you. My terraform.tfvars
is pretty simple:
# You need to replace these
hcloud_token = "xxxx"
public_key = "./xxx-kube.pub"
# Must be "private_key = null" when you want to use ssh-agent, for Yubikey-like device auth or an SSH key-pair with a passphrase
private_key = "./xxx-kube"
# These can be customized, or left with the default values
# For Hetzner locations see https://docs.hetzner.com/general/others/data-centers-and-connection/
# For Hetzner server types see https://www.hetzner.com/cloud
location = "ash" # change to `ash` for us-east Ashburn, Virginia location
network_region = "us-east" # change to `us-east` if location is ash
agent_server_type = "cpx31"
control_plane_server_type = "cpx11"
lb_server_type = "lb11"
# At least 3 server nodes is recommended for HA, otherwise you need to turn off automatic upgrade (see ReadMe).
servers_num = 3
# For agent nodes, at least 2 is recommended for HA, but you can keep automatic upgrades.
agents_num = 3
# If you want to use a specific Hetzner CCM and CSI version, set them below, otherwise leave as is for the latest versions
# hetzner_ccm_version = ""
# hetzner_csi_version = ""
# If you want to allow non-control-plane workloads to run on the control-plane nodes set "true" below. The default is "false".
# allow_scheduling_on_control_plane = true
As always, thanks for the help! I hope this is just something with my setup, and not universal, but I thought I should report it now that I've seen it happen twice.
Hello, when I run the following command:
terraform apply -auto-approve
I get this error:
This is my configuration:
# Only the first values starting with a * are obligatory, the rest can remain with their default values, or you
# could adapt them to your needs.
#
# Note that some values, notably "location" and "public_key", have no effect after the initial cluster has been set up.
# This is in order to keep Terraform from re-provisioning all nodes at once, which would lose data. If you want to update
# those, you should instead change the value here and then manually re-provision each node one by one. Grep for "lifecycle".
# * Your Hetzner project API token
hcloud_token = "🤐"
# * Your public key
public_key = "id_rsa.pub"
# * Your private key; must be "private_key = null" when you want to use ssh-agent, for Yubikey-like device auth or an SSH key-pair with a passphrase
private_key = "id_rsa"
# These can be customized, or left with the default values
# For Hetzner locations see https://docs.hetzner.com/general/others/data-centers-and-connection/
# For Hetzner server types see https://www.hetzner.com/cloud
location = "fsn1" # change to `ash` for us-east Ashburn, Virginia location
network_region = "eu-central" # change to `us-east` if location is ash
# You can have as many subnets as you want (preferably in the form of 10.X.0.0/16),
# their primary use is to logically separate the nodes.
# The control_plane network is mandatory.
network_ipv4_subnets = {
control_plane = "10.1.0.0/16"
agent_big = "10.2.0.0/16"
agent_small = "10.3.0.0/16"
}
# At least 3 server nodes is recommended for HA, otherwise you need to turn off automatic upgrade (see ReadMe).
# As per the Rancher docs, it must always be an odd number, never even! See https://rancher.com/docs/k3s/latest/en/installation/ha-embedded/
# For instance, 1 is ok (non-HA), 2 not ok, 3 is ok (becomes HA).
control_plane_count = 3
# The type of control plane nodes, see https://www.hetzner.com/cloud, the minimum instance supported is cpx11 (just a few cents more than cx11)
control_plane_server_type = "cpx11"
# As for the agent nodepools, below is just an example. If you do not want multiple nodepools, just use one,
# and change the name to whatever you want; it need not be "agent-big" or "agent-small". Also give each the subnet you prefer.
# For single-node clusters, set this equal to {}
agent_nodepools = {
# agent-big = {
# server_type = "cpx21",
# count = 1,
# subnet = "agent_big",
# }
agent-small = {
server_type = "cpx11",
count = 2,
subnet = "agent_small",
}
}
# That will depend on how much load you want it to handle, see https://www.hetzner.com/cloud/load-balancer
load_balancer_type = "lb11"
### The following values are fully optional
# It's best to leave the network range as is, unless you know what you are doing. The default is "10.0.0.0/8".
# network_ipv4_range = "10.0.0.0/8"
# If you want to use a specific Hetzner CCM and CSI version, set them below, otherwise leave as is for the latest versions
# hetzner_ccm_version = ""
# hetzner_csi_version = ""
# If you want to use Let's Encrypt with the TLS challenge, the email address is used to send you certificate expiration notices
traefik_acme_tls = true
traefik_acme_email = "🤐"
# If you want to allow non-control-plane workloads to run on the control-plane nodes set "true" below. The default is "false".
# Also good for single node clusters.
/* allow_scheduling_on_control_plane = true */
# If you want to disable automatic upgrade of k3s, you can set this to false, default is "true".
# automatically_upgrade_k3s = false
# Allows you to specify either stable, latest, or testing (defaults to stable), see https://rancher.com/docs/k3s/latest/en/upgrades/basic/
# initial_k3s_channel = "latest"
# Adding extra firewall rules, like opening a port
# In this example we allow TCP port 5432 for a Postgres service that we will expose via a NodePort
# More info on the format here https://registry.terraform.io/providers/hetznercloud/hcloud/latest/docs/resources/firewall
# extra_firewall_rules = [
# {
# direction = "in"
# protocol = "tcp"
# port = "5432"
# source_ips = [
# "0.0.0.0/0"
# ]
# },
# ]
Do you know how I can fix this problem, please?
Hi - thanks so much for this project. When I attempt to deploy to the ash
region using the instructions, I get this after the terraform apply -auto-approve
command:
Error: hcloud/inlineAttachServerToNetwork: attach server to network: no subnet or IP available (service_error)
with hcloud_server.first_control_plane,
on master.tf line 1, in resource "hcloud_server" "first_control_plane":
1: resource "hcloud_server" "first_control_plane" {
Hi,
I'm trying to deploy a cluster with the same config as the template, but when the servers reboot, the deployment is unable to connect to them via SSH.
module.control_planes[1].hcloud_server.server (remote-exec): Connecting to remote host via SSH...
module.control_planes[1].hcloud_server.server (remote-exec):   Host: XXXXX
module.control_planes[1].hcloud_server.server (remote-exec):   User: root
module.control_planes[1].hcloud_server.server (remote-exec):   Password: false
module.control_planes[1].hcloud_server.server (remote-exec):   Private key: true
module.control_planes[1].hcloud_server.server (remote-exec):   Certificate: false
module.control_planes[1].hcloud_server.server (remote-exec):   SSH Agent: true
module.control_planes[1].hcloud_server.server (remote-exec):   Checking Host Key: false
module.control_planes[1].hcloud_server.server (remote-exec):   Target Platform: unix
module.agents["agent-small-0"].hcloud_server.server: Still creating... [4m30s elapsed]
module.agents["agent-big-0"].hcloud_server.server: Still creating... [4m30s elapsed]
With https://console.hetzner.cloud/projects/.../security/certificates there is a convenient interface for managing HTTPS certificates across different contexts.
Is there any way to use them with kube-hetzner?
So I spun up a cluster with 3 control planes and 2 agents, then noticed that only 2 servers and 2 agents were present. But hcloud server list
was listing them all.
It turns out one failed to join because of "too many learner", as follows:
So I issued systemctl start k3s-server
another time and it worked. Meaning we have to wait and make sure the servers have actually started, retrying if necessary, before returning success.
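The "retry before returning success" idea above could be sketched as a tiny helper (hypothetical, not the repo's actual code): re-run a check until it succeeds or attempts run out.

```shell
#!/bin/sh
# retry <attempts> <command...> — minimal sketch of a retry-until-success helper.
retry() {
  attempts=$1; shift
  n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "failed after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep 1
  done
}

# e.g. after starting the service, only report success once it is really active:
#   systemctl start k3s-server
#   retry 20 systemctl is-active --quiet k3s-server
```

The systemctl usage at the end is the intended application; the helper itself is generic.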
@mnencia @phaer Had a very interesting conversation with Richard Brown, he says no RPM is needed and that the btrfs sub-volumes are writable so we can just swap the binary, and voila!
So we can go back to vanilla MicroOS and just use the k3s binaries from https://github.com/k3s-io/k3s/releases as is. Maybe have a timer that checks for a new release; if there is one, touch /var/run/reboot-required, Kured drains the node and reboots it, and on reboot a small script does the swap :)
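The timer idea could be sketched like this (hypothetical, not an actual implementation; in real use `current` would come from `k3s --version` and `latest` from the k3s GitHub releases API — both are plain parameters here so the logic stays self-contained):

```shell
#!/bin/sh
# Flag a reboot when a newer k3s release exists than the one running.
flag_reboot_if_new() {
  current=$1
  latest=$2
  flag=${3:-/var/run/reboot-required}
  if [ -n "$latest" ] && [ "$current" != "$latest" ]; then
    # Kured watches this file: it drains the node and reboots it; on reboot a
    # small script would swap in the freshly downloaded binary.
    touch "$flag"
    echo "k3s $latest available (running $current): reboot flagged"
  fi
}
```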
Let's see - I will try to give it a shot this weekend, but please do not hesitate if you feel inspired.
Also, welcome to the team if you'll accept, just sent you the invitation :) 🍾
Hi there,
thank you very much for your awesome help in my last ticket regarding the TLS configuration. It works now :)
Now I'm wondering how exposing a service with a NodePort works. If I understand it correctly, the firewall and load balancer should get configured to forward the port once I create a NodePort Service in my cluster, but that does not happen.
Is my assumption correct? Is NodePort not possible with kube-hetzner?
Thank you very much
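For what it's worth, the Hetzner cloud controller manager only provisions a cloud load balancer for Services of type LoadBalancer; a plain NodePort on its own creates neither an LB entry nor a firewall opening (for the latter, see the extra_firewall_rules example elsewhere in this project). A sketch of the LoadBalancer route, with names and ports as placeholders:

```yaml
# Hypothetical Service that makes hcloud-ccm create and configure a Hetzner LB
apiVersion: v1
kind: Service
metadata:
  name: my-app  # placeholder
  annotations:
    load-balancer.hetzner.cloud/location: "fsn1"
spec:
  type: LoadBalancer
  selector:
    app: my-app  # placeholder
  ports:
    - port: 443
      targetPort: 8443  # placeholder container port
```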
Hi,
I really appreciate your work and have successfully created my own k8s cluster on the Hetzner cloud :)
Now I wanted to add TLS/HTTPS support and have the TLS connection terminate on the load balancer. Automatically retrieved certificates from Let's Encrypt seem fine. However, the load balancer does not seem to work when I change its service
from
"[tcp] 443 -> 31028"
to
"[https] 443 -> 30468"
I have completely removed the tcp service for port 80, because I think I will not need it.
The loadbalancer shows 'unhealthy' for this service and I cannot access any ingress anymore.
Can someone please advise me on how to achieve TLS support with the Hetzner load balancer and Traefik ingress? :) Thanks!
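One possible direction, assuming Traefik stays behind a CCM-managed Service: hcloud-cloud-controller-manager exposes annotations for terminating TLS on the load balancer itself, referencing a certificate from the Hetzner console. A sketch (the certificate reference is a placeholder; double-check the annotation keys against the CCM docs for your version):

```yaml
# Hypothetical annotations on the Traefik Service for LB-side TLS termination
metadata:
  annotations:
    load-balancer.hetzner.cloud/protocol: "https"
    load-balancer.hetzner.cloud/http-certificates: "my-cert"  # placeholder certificate name/ID
```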
commit e7f016f
staging is not available for me:
hcloud_server.first_control_plane (remote-exec): 02/10 12:20:52 [ERROR] CUID#7 - Download aborted. URI=https://raw.githubusercontent.com/kube-hetzner/kube-hetzner/staging/.files/openSUSE-MicroOS.x86_64-k3s-kvm-and-xen.qcow2.meta4
hcloud_server.first_control_plane (remote-exec): Exception: [AbstractCommand.cc:351] errorCode=3 URI=https://raw.githubusercontent.com/kube-hetzner/kube-hetzner/staging/.files/openSUSE-MicroOS.x86_64-k3s-kvm-and-xen.qcow2.meta4
hcloud_server.first_control_plane (remote-exec): -> [HttpSkipResponseCommand.cc:218] errorCode=3 Resource not found
I replaced staging with master; now it is running.
Hi!
For users that don't have the repo added to helm:
provisioner "local-exec" {
command = "helm repo add cilium https://helm.cilium.io/ "
}
Sometimes Hetzner nodes just fail to enter the rescue mode and the node stays off.
Suggest tainting the master node (control-plane-0) with node-role.kubernetes.io/master=true:NoSchedule,
as currently it is not tainted.
I.e. adding
# Taint Control Plane as master node to avoid scheduling of workloads here
provisioner "local-exec" {
command = <<-EOT
kubectl taint nodes "${self.name}" node-role.kubernetes.io/master=true:NoSchedule
EOT
}
to master.tf
Because it is not tainted, workload pods can get placed on control-plane-0
and disrupt the cluster
(e.g. a Cassandra or HDFS pod placed on control-plane-0
disrupts the cluster with almost 100% certainty).
Hello folks, there was an error in the definition of the agents launched from Feb 10 to Feb 15. Here's the fix; you have two options:
1/ Scale down to 0 agents, apply, then scale back up and apply again.
2/ Log in via SSH to each agent and issue a few commands to fix them:
hcloud server list
ssh root@IP -i ~/.ssh/id_ed25519 -o StrictHostKeyChecking=no
systemctl disable k3s-server
systemctl stop k3s-server
systemctl --now enable k3s-agent
In its current form, it's only possible to create a cluster with equally sized nodes.
It would be great to be able to have differently sized nodes, like this:
pools:
- id: "memory-pool"
count: 3
size: CX51
- id: "worker-pool"
count: 8
size: CX11
If we want to go further, it may be possible to spread the cluster across different physical locations too. But I think that would be a really hard nut to crack, because of private networks and so forth.
pools:
- id: "memory-pool"
location: "fsn1"
count: 3
size: CX51
- id: "worker-pool"
location: "hel1"
count: 8
size: CX11
If you get the following error:
Error: hcloud/setRescue: hcclient/WaitForActions: action 347680047 failed: Unknown Error (unknown_error)
Please run terraform apply -auto-approve
again. It happens because, on rare occasions, Hetzner Cloud takes a while to enter rescue mode.
[Fixed] k3s failed to start, see journalctl -u k3s; that error sometimes happens on first_control_plane when the eth1 network interface is not present. This bug is rare enough, and we believe it comes from Hetzner, randomly.
If it happens, destroy and re-apply with Terraform.
After running the cluster creation I get this:
╷
│ Error: local-exec provisioner error
│
│ with hcloud_server.first_control_plane,
│ on master.tf line 44, in resource "hcloud_server" "first_control_plane":
│ 44: provisioner "local-exec" {
│
│ Error running command 'sleep 60 && ping 138.201.89.68 | grep --line-buffered "bytes from" | head -1 && sleep 100 &&
│ scp -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i ./keys/id_rsa
│ [email protected]:/etc/rancher/k3s/k3s.yaml ./kubeconfig.yaml
│ sed -i -e 's/127.0.0.1/138.201.89.68/g' ./kubeconfig.yaml
│ ': exit status 1. Output: 'sleep' is not recognized as an internal or external command,
│ operable program or batch file.
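For context, the "'sleep' is not recognized" message means the local-exec provisioner ran the command through cmd.exe, the default shell on Windows. One possible workaround — assuming Git Bash is installed at its default path, which is an assumption about your machine — is local-exec's documented interpreter argument:

```hcl
# Hypothetical override: run local-exec commands under a POSIX shell on Windows
provisioner "local-exec" {
  interpreter = ["C:/Program Files/Git/bin/bash.exe", "-c"]
  command     = "sleep 60 && ..."  # the original command, unchanged
}
```

Running Terraform from within WSL should also avoid the issue entirely.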
I'm trying to access the server but I have another problem: I generated keys using PuTTYgen and can't connect to the control plane. Does anyone know how to export a key from PuTTYgen in the proper (OpenSSH) format? I'm using Windows 10.
Just a suggestion (but a really nice-to-have):
in order to keep variables where they should be, and never touch the .tf
files,
it would be nice if there were a place where custom ports can be managed.
Currently I do this in main.tf:
resource "hcloud_firewall" "k3s" {
name = "k3s-firewall"
## My Custom firewall rule
# Postgres
rule {
direction = "out"
protocol = "tcp"
port = "5432"
destination_ips = [
"0.0.0.0/0"
]
}
}
Maybe that part could be "outsourced" into a firewall.tf
file, or into a nested array in variables.
edit: removed the "multiple features" part, as mentioned in the first comment
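The "nested array in variables" idea can be wired up with Terraform's dynamic blocks, so main.tf stays untouched. A sketch under these assumptions: attribute names follow the hcloud provider's hcloud_firewall resource, and the variable name mirrors the extra_firewall_rules example used elsewhere in this project (the actual repo wiring may differ):

```hcl
variable "extra_firewall_rules" {
  type    = list(any)
  default = []
}

resource "hcloud_firewall" "k3s" {
  name = "k3s-firewall"

  # Splice each user-supplied rule into the firewall resource.
  dynamic "rule" {
    for_each = var.extra_firewall_rules
    content {
      direction       = rule.value.direction
      protocol        = rule.value.protocol
      port            = rule.value.port
      source_ips      = try(rule.value.source_ips, [])
      destination_ips = try(rule.value.destination_ips, [])
    }
  }
}
```

Callers then pass rules as a list of objects in terraform.tfvars, exactly like the commented extra_firewall_rules example above.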