Comments (21)
@phaer You got it right, that was indeed the file. Kured handles the OS: it manages everything, and it works.
And indeed, system-upgrade-controller handles k3s, and it also manages everything.
Yes, ideally it would have been great to use only Kured, if the RPM channel we now use (devel:kubic/k3s) offered recent versions and was updated regularly. But that is not the case. Also, Richard Brown basically admitted that it is not a priority and that the best way to install k3s is the install script, which I verified; it will indeed be ideal.
So we are left with installing k3s with the vanilla install script (which supports MicroOS, by the way; it tests for it in the code), and then one of two options. Either we build a custom upgrade mechanism that also creates a /var/run/reboot-required file so that Kured does the reboot, plus some custom mechanism to switch the binary at boot.
Or we just use system-upgrade-controller, which will do everything for us, without a reboot and fully independently of Kured. It will also run rarely, only when new stable releases come out.
So I really think that, in our case, what we're planning here is still really solid: Kured for the OS, system-upgrade-controller for k3s, fresh from GitHub, without middlemen.
from terraform-hcloud-kube-hetzner.
In git, see the commit named "before move to k3os".
I think this refers to https://raw.githubusercontent.com/kube-hetzner/kube-hetzner/f308220bfe1236d735172e11b7f1841ca2597d14/manifests/upgrade/plans.yaml ?
I see that you also used kured back then. Is there a way to couple those two upgrade mechanisms? I understand that kured is responsible for upgrading our MicroOS and system-upgrade-controller would be responsible for our k3s binary? That seems a bit sub-optimal to me, as it would complicate things such as scheduling maintenance windows for a cluster.
from terraform-hcloud-kube-hetzner.
It's probably even simpler than that: the system upgrade controller can probably do the swap itself, see https://rancher.com/docs/k3s/latest/en/upgrades/automated/
from terraform-hcloud-kube-hetzner.
That gives an example of how to run an upgrade script on the node: https://github.com/rancher/system-upgrade-controller/blob/master/examples/suse/sles.yaml
The system-upgrade-controller will just take care of the rest. I already had it running in the very first versions of this project, when it was deployed on Fedora Server. In git, see the commit named "before move to k3os".
from terraform-hcloud-kube-hetzner.
Alright, folks, this is done and ready for testing, see the k3s-install branch.
⌛ It does take a bit longer than the other method because now k3s does not come pre-installed, and one more reboot is required after it installs, because of the k3s-selinux RPM package, for the new snapshot to take effect.
However, clusters get deployed only once - and then they are either agile or stuck. It's an added 5 minutes investment for a far more flexible future!
With this new method, k3s is vanilla, full-fat, and automatically upgrades by following the stable channel (so always latest, greatest, and safest), unless a node label is changed to k3s_upgrade=false.
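As a rough sketch of how that opt-out label could be toggled per node: the helper name set_k3s_upgrade and the KUBECTL override are hypothetical, and it assumes the upgrade plans select nodes on the k3s_upgrade label mentioned above.

```shell
#!/bin/sh
# Sketch only: toggle the k3s_upgrade label on one node.
# Assumption: the upgrade plans select nodes whose k3s_upgrade label is "true".
# KUBECTL can be overridden (e.g. KUBECTL=echo) to preview the command.
set_k3s_upgrade() {
  node="$1"
  enabled="$2"
  if [ -z "$node" ]; then
    echo "usage: set_k3s_upgrade NODE true|false" >&2
    return 1
  fi
  case "$enabled" in
    true|false) ;;
    *)
      echo "usage: set_k3s_upgrade NODE true|false" >&2
      return 1
      ;;
  esac
  # --overwrite is needed because the label is expected to exist already.
  ${KUBECTL:-kubectl} label node "$node" "k3s_upgrade=${enabled}" --overwrite
}

# Example (node name is a placeholder):
#   set_k3s_upgrade my-node false
```

Setting the label back to true should make the node eligible for upgrades again, under the same assumption about the plans' node selector.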
You can also change the upgrade channel to the one you prefer, such as latest, stable, or testing, or even target a specific major version. See https://rancher.com/docs/k3s/latest/en/upgrades/basic/ and https://update.k3s.io/v1-release/channels.
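To illustrate what following a channel looks like in a plan, here is a minimal sketch that prints a system-upgrade-controller Plan for a given channel name. The helper name render_plan, the plan name k3s-upgrade, and the k3s_upgrade node selector are assumptions for illustration; the actual plans.yaml in the repository may differ.

```shell
#!/bin/sh
# Sketch only: print a Plan manifest that follows the given k3s release channel.
# Assumptions: the plan name and the k3s_upgrade node selector are illustrative.
render_plan() {
  channel="${1:-stable}"
  cat <<EOF
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-upgrade
  namespace: system-upgrade
spec:
  concurrency: 1
  cordon: true
  serviceAccountName: system-upgrade
  nodeSelector:
    matchExpressions:
      - {key: k3s_upgrade, operator: In, values: ["true"]}
  upgrade:
    image: rancher/k3s-upgrade
  channel: https://update.k3s.io/v1-release/channels/${channel}
EOF
}

# Example:
#   render_plan latest | kubectl apply -f -
```

Keeping the channel as the only parameter means switching from stable to latest is a one-word change.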
from terraform-hcloud-kube-hetzner.
kubectl get nodes
from terraform-hcloud-kube-hetzner.
kubectl get pods -A
from terraform-hcloud-kube-hetzner.
The next step will be to test the k3s automatic upgrade by changing the upgrade channel from stable to latest in plans.yaml and applying it again. By doing so, we should witness an upgrade to 1.23.x. Will do ASAP.
from terraform-hcloud-kube-hetzner.
OK, so just following the latest channel did not produce upgrade jobs, probably because it waits for new releases. But after setting the version I wanted manually (I replaced channel with version in the plans): boom!
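The channel-to-version swap described here can be sketched as a small filter over the plan file. The helper name pin_plan_version is hypothetical, and it assumes each plan carries a single-line channel: entry; the version string is whatever release tag you want to pin.

```shell
#!/bin/sh
# Sketch only: read a Plan manifest on stdin and replace each "channel:" line
# with a pinned "version:" line, keeping the original indentation.
# Assumption: the plan uses a single-line "channel: <url>" entry.
pin_plan_version() {
  version="$1"
  if [ -z "$version" ]; then
    echo "usage: pin_plan_version VERSION < plans.yaml" >&2
    return 1
  fi
  sed -E "s|^([[:space:]]*)channel:.*\$|\\1version: ${version}|"
}

# Example (K3S_VERSION is a placeholder for the release tag you want):
#   pin_plan_version "$K3S_VERSION" < plans.yaml | kubectl apply -f -
```

All other lines of the manifest pass through unchanged, so the same file can be switched back to a channel later.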
from terraform-hcloud-kube-hetzner.
While it did the upgrades for each node, concurrency 1:
And after having completed:
from terraform-hcloud-kube-hetzner.
kubectl get plans -n system-upgrade
from terraform-hcloud-kube-hetzner.
kubectl get jobs -n system-upgrade
from terraform-hcloud-kube-hetzner.
Now, will wait for a reboot by Kured to happen, to confirm that everything survives. Looking forward to any feedback.
from terraform-hcloud-kube-hetzner.
Good news! We can probably get away without the second reboot if we use a combustion script to install the k3s-selinux RPM package. Will try ASAP.
from terraform-hcloud-kube-hetzner.
Even better would be to install k3s through combustion itself. That would gain us some time, as indeed it would not require a second reboot, it just boots into the new snapshot.
To do so, both the config file and the combustion script need to be copied on the ignition partition into a combustion folder in rescue mode.
from terraform-hcloud-kube-hetzner.
Just to confirm that it also rebooted with Kured after a MicroOS upgrade like a charm. Both systems work completely separately without interference.
from terraform-hcloud-kube-hetzner.
OK, I tried calling the whole k3s install from combustion, and it fails. Probably because the right paths are not available yet, and probably also because everything in combustion executes in a transactional-update shell.
I will try doing just the RPM there, in the hope of avoiding the second reboot.
from terraform-hcloud-kube-hetzner.
Finally, k3s-selinux is being installed via combustion on MicroOS, and that indeed removes the need for the second reboot.
Loading repository data...
Reading installed packages...
Resolving package dependencies...
The following NEW package is going to be installed:
k3s-selinux
1 new package to install.
Overall download size: 19.9 KiB. Already cached: 0 B. After the operation, additional 84.3 KiB will be used.
Continue? [y/n/v/...? shows all options] (y): y
Retrieving package k3s-selinux-0.4-1.sle.noarch (1/1), 19.9 KiB ( 84.3 KiB unpacked)
Checking for file conflicts: [...done]
(1/1) Installing: k3s-selinux-0.4-1.sle.noarch [......done]
Executing %posttrans script 'k3s-selinux-0.4-1.sle.noarch.rpm' [....done]
Application returned with exit status 0.
Transaction completed.
tukit 3.6.2 started
Options: close 2
Failure (dbus fatal exception).
New default snapshot is #2 (/.snapshots/2/snapshot).
Transaction completed.
Please reboot your machine to activate the changes and avoid data loss.
New default snapshot is #2 (/.snapshots/2/snapshot).
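For reference, a combustion-based install of k3s-selinux like the one logged above might be prepared roughly as follows. This is a sketch under assumptions, not the script used in the branch: the helper name write_combustion_script is hypothetical, the script is assumed to live at combustion/script on the ignition partition, and a repository providing k3s-selinux is assumed to be configured already.

```shell
#!/bin/sh
# Sketch only: write a minimal combustion script under DIR/combustion/script.
# Assumptions: DIR is the mounted ignition partition, and a repository that
# provides k3s-selinux is already configured on the image.
write_combustion_script() {
  target_dir="$1"
  if [ -z "$target_dir" ]; then
    echo "usage: write_combustion_script DIR" >&2
    return 1
  fi
  mkdir -p "${target_dir}/combustion" || return 1
  cat > "${target_dir}/combustion/script" <<'EOF'
#!/bin/bash
# combustion: network
# Runs on first boot; the package lands in the new snapshot.
zypper --non-interactive install k3s-selinux
EOF
}

# Example (mount point is a placeholder):
#   write_combustion_script /mnt
```

The "# combustion: network" header asks combustion to bring up networking before the script runs, which the package download needs.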
from terraform-hcloud-kube-hetzner.
Thanks, @mnencia for the SSL error fix.
from terraform-hcloud-kube-hetzner.
Here's a forced upgrade of k3s to a specific version, working like a charm!
from terraform-hcloud-kube-hetzner.
Merged into master!
from terraform-hcloud-kube-hetzner.