
cluster-template's Introduction

⛵ Cluster Template

Welcome to my opinionated and extensible template for deploying a single Kubernetes cluster. The goal of this project is to make it easier for people interested in using Kubernetes to deploy a cluster at home on bare-metal or VMs.

At a high level this project makes use of makejinja to read in a configuration file which renders out templates that will allow you to install and manage your Kubernetes cluster with.
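
In practice that workflow boils down to two task commands, covered in detail in the Getting Started stages below (shown here only as a preview):

    task init       # generate config.yaml from config.sample.yaml
    task configure  # render the templates from config.yaml with makejinja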

✨ Features

The features included will depend on the type of configuration you want to use. There are currently 2 different types of configurations available with this template.

  1. "Flux cluster" - a Kubernetes distribution of your choosing: k3s or Talos. Deploys an opinionated implementation of Flux using GitHub as the Git provider and sops to manage secrets.

  2. "Flux cluster with Cloudflare" - An addition to "Flux cluster" that provides DNS and SSL with Cloudflare. Cloudflare Tunnel is also included to provide external access to certain applications deployed in your cluster.

Other features include:

💻 Machine Preparation

Hopefully some of this piqued your interest! If you are marching forward, now is a good time to choose whether you will deploy a Kubernetes cluster with k3s or Talos.

System requirements

Note

  1. The included behaviour of Talos or k3s is that all nodes are able to run workloads, including the controller nodes. Worker nodes are therefore optional.
  2. Do you have 3 or more nodes? It is highly recommended to make 3 of them controller nodes for a highly available control plane.
  3. Running the cluster on Proxmox VE? My thoughts and recommendations about that are documented here.
Role      Cores    Memory         System Disk
Control   4 (6*)   8GB (24GB*)    100GB (500GB*) SSD/NVMe
Worker    4 (6*)   8GB (24GB*)    100GB (500GB*) SSD/NVMe
* recommended

Talos

  1. Download the latest stable release of Talos from their GitHub releases. You will want to grab either metal-amd64.iso or metal-rpi_generic-arm64.raw.xz depending on your system.

  2. Take note of the OS drive serial numbers; you will need them later on (see the example command after this list).

  3. Flash the iso or raw file to a USB drive and boot to Talos on your nodes with it.

  4. Continue on to 🚀 Getting Started
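
As referenced in step 2, if you are unsure of a node's OS drive serial number, one hedged way to list serials is to boot any live Linux environment on the node and run something like:

    # list whole disks with their serial numbers (output varies by hardware)
    lsblk -d -o NAME,SERIAL,SIZE,MODEL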

k3s (AMD64)

  1. Download the latest stable release of Debian from here, then follow this guide to get it installed. Deviations from the guide:

    Choose "Guided - use entire disk"
    Choose "All files in one partition"
    Delete Swap partition
    Uncheck all Debian desktop environment options
  2. [Post install] Remove CD/DVD as apt source

    su -
    sed -i '/deb cdrom/d' /etc/apt/sources.list
    apt update
    exit
  3. [Post install] Enable sudo for your non-root user

    su -
    apt update
    apt install -y sudo
    usermod -aG sudo ${username}
    echo "${username} ALL=(ALL) NOPASSWD:ALL" | tee /etc/sudoers.d/${username}
    exit
    newgrp sudo
    sudo apt update
  4. [Post install] Add SSH keys (or use ssh-copy-id on the client that is connecting)

    📍 First make sure your ssh keys are up-to-date and added to your github account as instructed.

    mkdir -m 700 ~/.ssh
    sudo apt install -y curl
    curl https://github.com/${github_username}.keys > ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
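
If you prefer the ssh-copy-id route mentioned in step 4, a minimal sketch run from the connecting client (assuming your key is ~/.ssh/id_ed25519.pub and ${node_ip} is a placeholder for the node's address):

    ssh-copy-id -i ~/.ssh/id_ed25519.pub ${username}@${node_ip}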

k3s (RasPi4)

Click here to read about using a RasPi4

Note

  1. It is recommended to have an 8GB RasPi model. Most important is to boot from an external SSD/NVMe rather than an SD card. This is supported natively, however if you have an early model you may need to update the bootloader first.
  2. Check the power requirements if using a PoE Hat and a SSD/NVMe dongle.
  1. Download the latest stable release of Debian from here. Do not use Raspbian, DietPi or any other flavor of Linux OS.

  2. Flash the image onto an SSD/NVMe drive.

  3. Re-mount the drive to your workstation and then do the following (per the official documentation):

    Open 'sysconf.txt' in a text editor, update the information below and save it (a sketch follows this list)
      - Change 'root_authorized_key' to your desired public SSH key
      - Change 'root_pw' to your desired root password
      - Change 'hostname' to your desired hostname
  4. Connect SSD/NVMe drive to the Raspberry Pi 4 and power it on.

  5. [Post install] SSH into the device with the root user and then create a normal user account with adduser ${username}

  6. [Post install] Follow steps 3 and 4 from k3s (AMD64).

  7. [Post install] Install python3 which is needed by Ansible.

    sudo apt install -y python3
  8. Continue on to 🚀 Getting Started
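
For step 3, a hedged sketch of editing sysconf.txt from your workstation, assuming the drive's boot partition is mounted at /mnt/boot (the mount point on your system may differ):

    sudo nano /mnt/boot/sysconf.txt
    # set root_authorized_key to your public SSH key,
    # root_pw to your desired root password,
    # and hostname to your desired hostname, then save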

🚀 Getting Started

Once you have installed Talos or Debian on your nodes, there are six stages to getting a Flux-managed cluster up and running.

Note

For all stages below, the commands MUST be run on your personal workstation within your repository directory.

🎉 Stage 1: Create a Git repository

  1. Create a new public repository by clicking the big green "Use this template" button at the top of this page.

  2. Clone your new repo to your local workstation and cd into it (see the example after this list).

  3. Continue on to 🌱 Stage 2
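
A minimal sketch of step 2, assuming your new repository is named ${repo} (a placeholder) under ${github_username}:

    git clone git@github.com:${github_username}/${repo}.git
    cd ${repo}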

🌱 Stage 2: Setup your local workstation

You have two different options for setting up your local workstation.

  • The first option is using a devcontainer, which requires Docker and VSCode to be installed. This method is the fastest to get going because all the required CLI tools are provided for you in my devcontainer image.
  • The second option is setting up the CLI tools directly on your workstation.

Devcontainer method

  1. Start Docker and open your repository in VSCode. There will be a pop-up asking you to use the devcontainer; click the button to start using it.

  2. Continue on to 🔧 Stage 3

Non-devcontainer method

  1. Install the most recent version of task, see the installation docs for other supported platforms.

    # Homebrew
    brew install go-task
    # or, Arch
    pacman -S --noconfirm go-task && ln -sf /usr/bin/go-task /usr/local/bin/task
  2. Install the most recent version of direnv, see the installation docs for other supported platforms.

    # Homebrew
    brew install direnv
    # or, Arch
    pacman -S --noconfirm direnv
  3. Hook direnv into your preferred shell (a bash/zsh sketch follows this list), then run:

    task workstation:direnv

    📍 Verify that direnv is set up properly by opening a new terminal and cd-ing into your repository. You should see something like:

    cd /path/to/repo
    direnv: loading /path/to/repo/.envrc
    direnv: export +ANSIBLE_COLLECTIONS_PATH ...  +VIRTUAL_ENV ~PATH
  4. Install the additional required CLI tools

    📍 Not using Homebrew or Arch Linux? Try the generic Linux task below; if that fails, check the Brewfile/Archfile to see which CLI tools are needed and install them yourself.

    # Homebrew
    task workstation:brew
    # or, Arch with yay/paru
    task workstation:arch
    # or, Generic Linux (YMMV, this pulls binaries into ./bin)
    task workstation:generic-linux
  5. Set up a Python virtual environment by running the following task command.

    📍 This command requires Python 3.11+ to be installed.

    task workstation:venv
  6. Continue on to 🔧 Stage 3
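
For step 3, hooking direnv into bash or zsh looks like the following (a sketch; see the direnv docs for other shells):

    # bash
    echo 'eval "$(direnv hook bash)"' >> ~/.bashrc
    # or, zsh
    echo 'eval "$(direnv hook zsh)"' >> ~/.zshrc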

🔧 Stage 3: Bootstrap configuration

Note

The config.sample.yaml file contains config that is vital to the bootstrap process.

  1. Generate the config.yaml from the config.sample.yaml configuration file.

    task init
  2. Fill out the config.yaml configuration file using the comments in that file as a guide.

  3. Run the following command which will generate all the files needed to continue.

    task configure
  4. Push your changes to Git

    📍 Verify all the ./kubernetes/**/*.sops.* files are encrypted with SOPS (a quick check is sketched after this list)

    git add -A
    git commit -m "Initial commit :rocket:"
    git push
  5. Continue on to ⚡ Stage 4
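
One hedged way to run the check mentioned in step 4: SOPS-encrypted files contain ENC[AES256_GCM,...] values, so any file printed by the command below is likely still in plain text (assuming your manifests live under ./kubernetes):

    # list *.sops.* files that do NOT contain the SOPS encryption marker
    find ./kubernetes -type f -name '*.sops.*' -exec grep -L -F 'ENC[AES256_GCM' {} +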

⚡ Stage 4: Prepare your nodes for Kubernetes

Note

For Talos skip ahead to ⛵ Stage 5

k3s

📍 Here we will be running an Ansible playbook to prepare your nodes for running a Kubernetes cluster.

  1. Ensure you are able to SSH into your nodes from your workstation using a private SSH key without a passphrase (for example by using an SSH agent; see the sketch after this list). This lets Ansible interact with your nodes.

  2. Install the Ansible dependencies

    task ansible:deps
  3. Verify Ansible can view your config and ping your nodes

    task ansible:list
    task ansible:ping
  4. Run the Ansible prepare playbook (nodes will reboot when done)

    task ansible:run playbook=cluster-prepare
  5. Continue on to ⛵ Stage 5
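
For step 1, a minimal sketch of loading a key into an SSH agent and testing connectivity (assuming the key is ~/.ssh/id_ed25519 and ${node_ip} is a placeholder for one of your nodes):

    eval "$(ssh-agent -s)"
    ssh-add ~/.ssh/id_ed25519
    ssh ${username}@${node_ip} 'hostname'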

⛵ Stage 5: Install Kubernetes

Talos

  1. Deploy your cluster and bootstrap it. This generates secrets and the config files for your nodes, applies them, bootstraps the cluster, fetches the kubeconfig file, installs Cilium and kubelet-csr-approver, and finishes with some health checks.

    task talos:bootstrap
  2. ⚠️ It might take a while for the cluster to be set up (10+ minutes is normal), during which time you will see a variety of error messages like "couldn't get current server API group list" or "error: no matching resources found". This is normal. If this step gets interrupted, e.g. by pressing Ctrl + C, you will likely need to nuke the cluster before trying again.

k3s

  1. Install Kubernetes depending on the distribution you chose

    task ansible:run playbook=cluster-installation

Cluster validation

  1. The kubeconfig for interacting with your cluster should have been created in the root of your repository.

  2. Verify the nodes are online

    📍 If this command fails you likely haven't configured direnv as mentioned previously in the guide (a manual fallback is sketched after this list).

    kubectl get nodes -o wide
    # NAME           STATUS   ROLES                       AGE     VERSION
    # k8s-0          Ready    control-plane,etcd,master   1h      v1.29.1
    # k8s-1          Ready    worker                      1h      v1.29.1
  3. Continue on to 🔹 Stage 6
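
If kubectl cannot find your cluster even though the kubeconfig from step 1 exists, a hedged manual fallback (assuming the file in the root of your repository is named kubeconfig) is to export it explicitly:

    export KUBECONFIG="$(pwd)/kubeconfig"
    kubectl get nodes -o wide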

🔹 Stage 6: Install Flux in your cluster

  1. Verify Flux can be installed

    flux check --pre
    # ► checking prerequisites
    # ✔ kubectl 1.27.3 >=1.18.0-0
    # ✔ Kubernetes 1.27.3+k3s1 >=1.16.0-0
    # ✔ prerequisites checks passed
  2. Install Flux and sync the cluster to the Git repository

    📍 Run task flux:github-deploy-key first if using a private repository.

    task flux:bootstrap
    # namespace/flux-system configured
    # customresourcedefinition.apiextensions.k8s.io/alerts.notification.toolkit.fluxcd.io created
    # ...
  3. Verify Flux components are running in the cluster

    kubectl -n flux-system get pods -o wide
    # NAME                                       READY   STATUS    RESTARTS   AGE
    # helm-controller-5bbd94c75-89sb4            1/1     Running   0          1h
    # kustomize-controller-7b67b6b77d-nqc67      1/1     Running   0          1h
    # notification-controller-7c46575844-k4bvr   1/1     Running   0          1h
    # source-controller-7d6875bcb4-zqw9f         1/1     Running   0          1h
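
Flux polls your Git repository on an interval; if you want it to pick up a commit right away (without the webhook described later), a hedged example of forcing a reconciliation:

    flux reconcile source git flux-system
    flux reconcile kustomization flux-system --with-source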

🎤 Verification Steps

Mic check, 1, 2 - In a few moments applications should be lighting up like Christmas in July 🎄

  1. Output all the common resources in your cluster.

    📍 Feel free to use the provided kubernetes tasks for validation of cluster resources or continue to get familiar with the kubectl and flux CLI tools.

    task kubernetes:resources
  2. ⚠️ It might take cert-manager a while to generate certificates; this is normal, so be patient.

  3. 🏆 Congratulations! If all went smoothly you now have a Kubernetes cluster managed by Flux, with your Git repository driving the state of your cluster.

  4. 🧠 Now it's time to pause, go get some motel motor oil ☕, and admire that you made it this far!

📣 Flux w/ Cloudflare post installation

🌐 Public DNS

The external-dns application created in the networking namespace will handle creating public DNS records. By default, echo-server and the flux-webhook are the only subdomains reachable from the public internet. In order to make additional applications public you must set the correct ingress class name and ingress annotations, as in the HelmRelease for echo-server.

🏠 Home DNS

k8s_gateway will provide DNS resolution to external Kubernetes resources (i.e. points of entry to the cluster) from any device that uses your home DNS server. For this to work, your home DNS server must be configured to forward DNS queries for ${bootstrap_cloudflare.domain} to ${bootstrap_cloudflare.gateway_vip} instead of the upstream DNS server(s) it normally uses. This is a form of split DNS (aka split-horizon DNS / conditional forwarding).

Tip

Below is how to configure a Pi-hole for split DNS. Other platforms should be similar.

  1. Apply this file on the Pi-hole server, substituting the variables:

    # /etc/dnsmasq.d/99-k8s-gateway-forward.conf
    server=/${bootstrap_cloudflare.domain}/${bootstrap_cloudflare.gateway_vip}

  2. Restart dnsmasq on the server.
  3. Query an internal-only subdomain from your workstation (any internal-class ingress): dig @${home-dns-server-ip} echo-server-internal.${bootstrap_cloudflare.domain}. It should resolve to ${bootstrap_cloudflare.ingress_vip}.

If you're having trouble with DNS be sure to check out these two GitHub discussions: Internal DNS and Pod DNS resolution broken.
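
Another hedged check before digging into those discussions is to query k8s_gateway directly, which rules out your home DNS server as the culprit:

    # ask k8s_gateway itself, bypassing your home DNS server
    dig @${bootstrap_cloudflare.gateway_vip} echo-server-internal.${bootstrap_cloudflare.domain}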

... Nothing working? That is expected, this is DNS after all!

📜 Certificates

By default this template will deploy a wildcard certificate using the Let's Encrypt staging environment, which prevents you from getting rate-limited by the Let's Encrypt production servers if your cluster doesn't deploy properly (for example due to a misconfiguration). Once you are sure you will keep the cluster up for more than a few hours be sure to switch to the production servers as outlined in config.yaml.

📍 You will need a production certificate to reach internet-exposed applications through cloudflared.
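
A hedged way to confirm which environment issued your wildcard certificate, using cert-manager's custom resources (for this template the issuer should be letsencrypt-staging or letsencrypt-production):

    kubectl get certificates -A
    # inspect the issuer a certificate references
    kubectl -n <namespace> get certificate <name> -o jsonpath='{.spec.issuerRef.name}'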

🪝 Github Webhook

By default Flux will periodically check your git repository for changes. In order to have Flux reconcile on git push you must configure Github to send push events to Flux.

Note

This will only work after you have switched over certificates to the Let's Encrypt Production servers.

  1. Obtain the webhook path

    📍 Hook id and path should look like /hook/12ebd1e363c641dc3c2e430ecf3cee2b3c7a5ac9e1234506f6f5f3ce1230e123

    kubectl -n flux-system get receiver github-receiver -o jsonpath='{.status.webhookPath}'
  2. Piece together the full URL with the webhook path appended

    https://flux-webhook.${bootstrap_cloudflare.domain}/hook/12ebd1e363c641dc3c2e430ecf3cee2b3c7a5ac9e1234506f6f5f3ce1230e123
    
  3. Navigate to the settings of your repository on GitHub; under "Settings/Webhooks" press the "Add webhook" button. Fill in the webhook URL and your bootstrap_github_webhook_token secret and save.

💥 Nuke

There might be a situation where you want to destroy your Kubernetes cluster. This will completely clean the OS of all traces of the Kubernetes distribution you chose and then reboot the nodes.

# k3s: Remove all traces of k3s from the nodes
task ansible:run playbook=cluster-nuke
# Talos: Reset your nodes back to maintenance mode and reboot
task talos:soft-nuke
# Talos: Completely format the Talos installation and reboot
task talos:hard-nuke

🤖 Renovate

Renovate is a tool that automates dependency management. It is designed to scan your repository around the clock and open PRs for out-of-date dependencies it finds. Common dependencies it can discover are Helm charts, container images, GitHub Actions, Ansible roles... even Flux itself! Merging a PR will cause Flux to apply the update to your cluster.

To enable Renovate, click the 'Configure' button over at their Github app page and select your repository. Renovate creates a "Dependency Dashboard" as an issue in your repository, giving an overview of the status of all updates. The dashboard has interactive checkboxes that let you do things like advance scheduling or reattempt update PRs you closed without merging.

The base Renovate configuration in your repository can be viewed at .github/renovate.json5. By default it is scheduled to be active with PRs every weekend, but you can change the schedule to anything you want, or remove it if you want Renovate to open PRs right away.

🐛 Debugging

Below is a general guide on debugging an issue with a resource or application, for example when a workload/resource is not showing up or a pod is stuck in a CrashLoopBackOff or Pending state.

  1. Start by checking all Flux Kustomizations, GitRepositories and OCIRepositories and verify they are healthy.

    flux get sources oci -A
    flux get sources git -A
    flux get ks -A
  2. Then check all the Flux Helm Releases and verify they are healthy.

    flux get hr -A
  3. Then check if the pod is present.

    kubectl -n <namespace> get pods -o wide
  4. Then check the logs of the pod if it's there.

    kubectl -n <namespace> logs <pod-name> -f
    # or
    stern -n <namespace> <fuzzy-name>
  5. If a resource exists try to describe it to see what problems it might have.

    kubectl -n <namespace> describe <resource> <name>
  6. Check the namespace events

    kubectl -n <namespace> get events --sort-by='.metadata.creationTimestamp'

Resolving problems could take some tweaking of your YAML manifests to get things working; other times it could be an external factor like permissions on NFS. If you are unable to figure out your problem, see the help section below.

👉 Help

  • Make a post in this repository's Github Discussions.
  • Start a thread in the #support or #cluster-template channels in the Home Operations Discord server.

❔ What's next

The cluster is your oyster (or something like that). Below are some optional considerations you might want to review.

Ship it

To browse or get ideas on applications people are running, community member @whazor created Kubesearch as a creative way to search Flux HelmReleases across Github and Gitlab.

Storage

The included CSI (openebs in local-hostpath mode) is a great start for storage, but soon you might find you need more features, like replicated block storage or the ability to connect to an NFS/SMB/iSCSI server. If you need any of those features be sure to check out projects like rook-ceph, longhorn, openebs, democratic-csi, csi-driver-nfs, and synology-csi.

🙌 Related Projects

If this repo is too hot to handle or too cold to hold, check out the following projects.

  • khuedoan/homelab - Modern self-hosting framework, fully automated from empty disk to operating services with a single command.
  • danmanners/aws-argo-cluster-template - A community opinionated template for deploying Kubernetes clusters on-prem and in AWS using Pulumi, SOPS, Sealed Secrets, GitHub Actions, Renovate, Cilium and more!
  • ricsanfre/pi-cluster - Pi Kubernetes Cluster. Homelab kubernetes cluster automated with Ansible and ArgoCD
  • techno-tim/k3s-ansible - The easiest way to bootstrap a self-hosted High Availability Kubernetes cluster. A fully automated HA k3s etcd install with kube-vip, MetalLB, and more

⭐ Stargazers

Star History Chart

🤝 Thanks

Big shout out to all the contributors, sponsors and everyone else who has helped on this project.

cluster-template's People

Contributors

ajaykumar4, alex-matthews, arthurgeek, aumer-amr, axeii, bjw-s, brunnels, dcplaya, divialth, geckosplinter, giannoug, github-actions[bot], gregmulvaney, jfroy, joryirving, lltr, mrueg, nvanmeurs, onedr0p, phybros, prehor, prymalinstynct, renovate-bot, renovate[bot], samip5, smbonn2005, sp3nx0r, wasurerarenai, willianpaixao, wouterbouvy


cluster-template's Issues

Ansible Ubuntu Prepare playbook Raspberry Pi check tasks throw fatal error

Details

What steps did you take and what happened:

Running task ansible:playbook:ubuntu-prepare; all steps of the setup worked fine prior. Using this with nodes that are not Raspberry Pis.

What did you expect to happen:

Expected the Raspberry Pi check to run successfully and show that the nodes are not Pis.

Anything else you would like to add:

Additional Information:

Output:

TASK [ubuntu : Check if Raspberry Pi] ***************************************************************************************************************************************************
fatal: [k8s-2]: FAILED! => changed=true 
  cmd:
  - grep
  - -q
  - Raspberry Pi
  - /proc/cpuinfo
  delta: '0:00:00.019747'
  end: '2022-03-06 21:54:00.526425'
  msg: non-zero return code
  rc: 1
  start: '2022-03-06 21:54:00.506678'
  stderr: ''
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
...ignoring
fatal: [k8s-0]: FAILED! => changed=true 
  cmd:
  - grep
  - -q
  - Raspberry Pi
  - /proc/cpuinfo
  delta: '0:00:00.020600'
  end: '2022-03-06 21:54:00.622424'
  msg: non-zero return code
  rc: 1
  start: '2022-03-06 21:54:00.601824'
  stderr: ''
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
...ignoring
fatal: [k8s-1]: FAILED! => changed=true 
  cmd:
  - grep
  - -q
  - Raspberry Pi
  - /proc/cpuinfo
  delta: '0:00:00.019085'
  end: '2022-03-06 21:54:00.632873'
  msg: non-zero return code
  rc: 1
  start: '2022-03-06 21:54:00.613788'
  stderr: ''
  stderr_lines: <omitted>
  stdout: ''
  stdout_lines: <omitted>
...ignoring

Renovate Configuration

Details

Whenever I use the renovate.json5 from this template, I get this issue from renovate-bot:


There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Location: renovate.json5
Error type: The renovate configuration file contains some invalid settings
Message: packageRules[1]: packageRules cannot combine both matchUpdateTypes and separateMinorPatch. Rule: {"matchDatasources":["docker"],"enabled":true,"commitMessageTopic":"container image {{depName}}","commitMessageExtra":"to {{#if isSingleVersion}}v{{{newVersion}}}{{else}}{{{newValue}}}{{/if}}","matchUpdateTypes":["major","minor","patch"],"separateMinorPatch":true}


When line 45 is commented (// "separateMinorPatch": true)
https://github.com/k8s-at-home/template-cluster-k3s/blob/main/.github/renovate.json5#L45
things are OK.

I wonder is the file renovate.json5 in this repo "wrong" or am I in error somewhere? TIA!

Document authorizing flux to your repository over SSH

Details

Describe the solution you'd like:

  1. Generate new SSH key:
    ssh-keygen -t ecdsa -b 521 -C "github-deploy-key" -f ./cluster/github-deploy-key -q -P ""
  2. Paste public key: https://github.com/onedr0p/home-ops/settings/keys
  3. Create sops secret in cluster/base/flux-system/github-deploy-key.sops.yaml with the contents of:
    # yamllint disable
    apiVersion: v1
    kind: Secret
    metadata:
        name: github-deploy-key
        namespace: flux-system
    stringData:
        # Contents of: github-deploy-key
        identity: |
            -----BEGIN OPENSSH PRIVATE KEY-----
                ...
            -----END OPENSSH PRIVATE KEY-----
        # Output of: curl --silent https://api.github.com/meta | jq --raw-output '"github.com "+.ssh_keys[]'
        known_hosts: |
            github.com ssh-ed25519 ...
            github.com ecdsa-sha2-nistp256 ...
            github.com ssh-rsa ...
  4. Encrypt secret:
    sops --encrypt --in-place ./cluster/base/flux-system/github-deploy-key.sops.yaml
  5. Apply secret to cluster:
    sops -d cluster/base/flux-system/github-deploy-key.sops.yaml | kubectl apply -f -
  6. Update cluster/base/flux-system/gotk-sync.yaml:
    ---
    apiVersion: source.toolkit.fluxcd.io/v1beta2
    kind: GitRepository
    metadata:
      name: flux-system
      namespace: flux-system
    spec:
      interval: 5m0s
      url: ssh://[email protected]/onedr0p/home-ops
      ref:
        branch: main
      secretRef:
        name: github-deploy-key
  7. Commit and push changes
  8. Verify git repository is now using SSH:
    kubectl get gitrepository -n flux-system

invalid downgrade; server version is lower than determined cluster version

Details

What steps did you take and what happened:
I followed all instructions from the readme on a freshly installed Ubuntu Server 20.04.3 LTS and after a reboot, Kubernetes won't start.
journalctl -xe gives the following:

Dec 06 15:14:27 k8s-0 k3s[5219]: {"level":"fatal","ts":"2021-12-06T15:14:27.709Z","caller":"membership/cluster.go:790","msg":"invalid downgrade; server version is lower than determined cluster version","current-server-version":"3.4.13","determined-cluster-version":"3.5","stacktrace":"github.com/rancher/k3s/vendor/go.>

I tried from a fresh install twice, got the same results.

What did you expect to happen:
I expect kubernetes to be able to start after a reboot

terraform plan / cloudflare error - Error: Error getting data key: 0 successful groups required, got 0

Details

When running the Terraform plan for Cloudflare, I get an error:

12:53:40 fabrice@DESKTOP-FS home-k8s-cluster ±|main ✗|→ task terraform:plan:cloudflare
task: [terraform:plan:cloudflare] terraform plan
╷
│ Error: Error getting data key: 0 successful groups required, got 0
│ 
│   with data.sops_file.cloudflare_secrets,
│   on main.tf line 19, in data "sops_file" "cloudflare_secrets":
│   19: data "sops_file" "cloudflare_secrets" {
│ 
╵
task: Failed to run task "terraform:plan:cloudflare": exit status 1
12:53:51 fabrice@DESKTOP-FS home-k8s-cluster ±|main ✗|→ 

The secret.sops.yaml seems to be OK, it was generated with configure.sh

Ask your question:

Any idea how to resolve this? I tried to re-run the configure.sh, but the error remains. Any help would be appreciated.

Renovate: Dependency Lookup Warnings

Details

After its first run, the onboarding PR by Renovate "complains" about dependency adrienverge/yamllint.git.

See https://github.com/k8s-at-home/template-cluster-k3s/blob/main/.pre-commit-config.yaml#L3

Looks like the .git extension can be safely removed on line 3.
Works on my machine.

Additional Information:

Renovate warning in the PR:

---

### ⚠️ Dependency Lookup Warnings ⚠️

Please correct - or verify that you can safely ignore - these lookup failures before you merge this PR.

-   `Failed to look up dependency adrienverge/yamllint.git`

Files affected: `.pre-commit-config.yaml`


---

How to access services or apps locally?

Details

Ask your question:

Is there any way we can set up the cluster to allow local access to hajimari etc. instead of only going through Cloudflare? When I visit BOOTSTRAP_METALLB_TRAEFIK_ADDR locally it gives me an SSL error.

Disable Local-Path Provisioner

Details

k3s by default will enable the local-path-provisioner and set it as the default storage class. If any other storage class is added after install, such as nfs-subdir-external-provisioner or longhorn, you will have to disable local-path as the default storage class each time k3s is restarted.

Anything else you would like to add:

The local-path-provisioner manifest lives at /var/lib/rancher/k3s/server/manifests/local-storage.yaml and the recommended way to disable this provisioner is by using the --disable local-storage option when installing k3s. The local-path-provisioner helm chart can then be added to the cluster after the fact if it is still needed. https://github.com/rancher/local-path-provisioner/blob/master/deploy/chart/README.md#installing-the-chart

Additional Information:

Deprecate kube-vip vip_interface and Ansible interface name checks

Details

Describe the solution you'd like:

kube-vip has supported setting a --autoInterface flag to detect the interface with the default route. This flag is default when vip_interface is not set.

This will simplify configuration as we no longer need to ask for the interface name in the config.

If anyone is able to test this out that would be swell.

kube-vip/kube-vip#273

Migrate services using loadBalancerIP to the Metallb annotation

Details

Describe the solution you'd like:

svc.spec.loadBalancerIP has been deprecated in k8s 1.24. Metallb now has an annotation to assign Load Balancer IPs.

metallb/metallb#1182

For example the Traefik HelmRelease I believe will look like this:

...
    service:
      enabled: true
      type: LoadBalancer
      annotations:
        metallb.universe.tf/loadBalancerIPs: |-
          "${METALLB_TRAEFIK_ADDR}"
      spec:
        externalTrafficPolicy: Local
...

Should cluster-settings.yaml be encrypted with SOPS?

Details

The README makes references to a ./cluster/base/cluster-settings.sops.yaml file but configure.sh does not encrypt ./cluster/base/cluster-settings.yaml. Do you recommend encrypting this yaml? The variables in there post initialization do not seem concerning.

I learned a lot from the repo, thank you for maintaining it.

Webhook and DNS problem

Details

Ask your question:

So I am trying to add new apps into my cluster and set up webhooks for Flux. My webhook gets HTTP 523 and will not communicate; I know my secret and URL are correct. To top that off, I do not believe my DNS is working properly (such is DNS after all). My router is a Ubiquiti Security Gateway Pro, on which I have forwarded ports 443 and 80 to my Traefik IP. I ran
dig 192.168.86.150 hajimari.sinlessgamesllc.com
with this as a response:

; <<>> DiG 9.16.8-Ubuntu <<>> 192.168.86.150 hajimari.sinlessgamesllc.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 39759
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;192.168.86.150.			IN	A

;; Query time: 27 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Sun Jun 12 16:03:52 MDT 2022
;; MSG SIZE  rcvd: 43

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 47813
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 65494
;; QUESTION SECTION:
;hajimari.sinlessgamesllc.com.	IN	A

;; Query time: 0 msec
;; SERVER: 127.0.0.53#53(127.0.0.53)
;; WHEN: Sun Jun 12 16:03:52 MDT 2022
;; MSG SIZE  rcvd: 57

though I am uncertain as to how to make this all work properly. Any help would be greatly appreciated.

Issues with kubevip not deploying

Hi,
So the kube-vip container fails to start and gives the following error:
Input:
k3s kubectl logs kube-vip-c2wvw --namespace kube-system
Output:
level=fatal msg="lo is not valid interface, reason: get lo failed, error: Link not found"

lo exists on the node, and I also tried different interfaces, with none of them working.

What am I missing?

hajimari PVC missing namespace

Details

What steps did you take and what happened:

When using the default template the hajimari helm release fails to install because there is not a namespace defined for the PersistentVolumeClaim

flux get kustomizations -A
NAMESPACE       NAME            READY   MESSAGE                                                                                                                 REVISION                                                SUSPENDED 
flux-system     apps            False   PersistentVolumeClaim/hajimari-config namespace not specified, error: the server could not find the requested resource

What did you expect to happen:

I expected the hajimari helm release to be installed and start

Anything else you would like to add:

Updated cluster/apps/default/hajimari/config-pvc.yaml

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: hajimari-config
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 128Mi

Additional Information:

I would personally like to see this app in a namespace other than default.

Action Required: Fix Renovate Configuration

There is an error with this repository's Renovate configuration that needs to be fixed. As a precaution, Renovate will stop PRs until it is resolved.

Location: .github/renovate.json5
Error type: The renovate configuration file contains some invalid settings
Message: packageRules[1]: packageRules cannot combine both matchUpdateTypes and separateMinorPatch. Rule: {"matchDatasources":["docker"],"enabled":true,"commitMessageTopic":"container image {{depName}}","commitMessageExtra":"to {{#if isSingleVersion}}v{{{newVersion}}}{{else}}{{{newValue}}}{{/if}}","matchUpdateTypes":["major","minor","patch"],"separateMinorPatch":true}

Traefik 2

Details

Change the proxy server from nginx with cert-manager to use Traefik 2

Terraform "ERROR: Record does not exist (81044)"

Details

Ask your question:

So I am in the process of setting everything up using this template, and I ran into something; it may be your code, but it's most likely me. I am in the stage of setting up Terraform, but for whatever reason, even after searching through Google, I get this error.

sinless777@Sinless-Games-Desktop:~/Desktop/Cluster-1$ task terraform:plan
task: [terraform:plan] terraform plan 
data.sops_file.cloudflare_secrets: Reading...
data.sops_file.cloudflare_secrets: Read complete after 0s [id=-]
data.http.ipv4: Reading...
data.cloudflare_zones.domain: Reading...
data.http.ipv4: Read complete after 0s [id=http://ipv4.icanhazip.com]
data.cloudflare_zones.domain: Read complete after 0s [id=67a974a36446a1eb36e7a1755fa412f4]
cloudflare_record.ipv4: Refreshing state... [id=babc533255810d67bdf4e0f4af518660]
cloudflare_zone_settings_override.cloudflare_settings: Refreshing state... [id=80727c2755fc0be1d60b848b1315749e]
╷
│ Error: Record does not exist. (81044)
│ 
│   with cloudflare_record.ipv4,
│   on main.tf line 85, in resource "cloudflare_record" "ipv4":
│   85: resource "cloudflare_record" "ipv4" {
│ 
╵
task: Failed to run task "terraform:plan": exit status 1

After looking at the file, I notice my IDE flags a few unresolved references, but since the error doesn't pertain to that I am not concerned. So my question is: what am I doing wrong? Do I need to go to the cloudflare/cloudflare repo and file an issue, or is it the script itself?

I also tried creating the records myself, and I still get the same error. I double-checked and triple-checked, and I am confident my config is correct, but I seem to be unable to resolve this myself.

New echo-server TLS configuration is broken

Details

What steps did you take and what happened:

Upon setting up a fresh cluster using this template, echo-server attempts and fails to accept connections over HTTP on the HTTPS port (note that even if you connect this way you get the standard 404 not found from Traefik; the fact still remains that it's sending an HTTP response on the HTTPS port). It's unclear if this is an issue with the new echo-server implementation or the configuration of the new echo server, but regardless, this does not work.

What did you expect to happen: Probably for just about anything other than that to happen (preferably a properly functioning HTTPS connection on the HTTPS port).

Anything else you would like to add: I am available on Discord (xdMatthewbx#1337) outside of work hours to debug; I am currently running on a VM as a PoC before I migrate anything I actually use to this, so now is an ideal time to do any debugging.

Additional Information:

  • This is fixed by reverting commit 89d03e1 (the commit which replaced echo-server) (although this is not a permanent solution).
  • In the broken state no SSL certificate is even requested - only Hajimari and Traefik dashboard had certificates and certificate requests listed when running task cluster:certificates. After the revert echo server requests and successfully obtains a new certificate.
  • I assume the new hosts config is used for a reason, so I have not yet attempted with only that part reverted yet as I am going offline for the night. I will see what happens in this scenario and post an update here tomorrow.

Why not ArgoCD?

Details

Hi guys! I understand that this is "highly opinionated template" and as a non-devops guy I could learn a lot from it.

But please let me ask why not ArgoCD instead of Flux2 for home users?

Background: I'm not anyhow affiliated with ArgoCD, and might miss some obvious disadvantages of using it. And I'm still in the process of building my home k8s cluster. After a brief overview it looks like Argo CD might be better fitting for home users since it provides a nice Web UI with a clear picture, making it easier to debug what is going on in your cluster.

I would appreciate any feedback. Thanks!

Deprecate metallb in favor of kube-vip

Details

Describe the solution you'd like:

kube-vip is already installed and used for the control plane; why not extend it to handle k8s services as well and remove metallb?

Add documentation for a single node cluster

Describe the solution you'd like:

For people starting out with this project it might be easiest to run it on a single node to become familiar with it.
The README could be extended to describe what they'd need to do in order to run it on a single node (e.g. specific settings around the control plane, simplifying networking, etc.)

Error while running precommit

I'm getting an error trying to run the first command, task pre-commit:init.
Running on Ubuntu, and I checked that python-Levenshtein-wheels is installed and, to my knowledge, the prereqs are installed...
The question is: is it because I'm using the newest version of everything (I see some deprecation errors in the logs)? Or the wrong version of Python? (No version is specified in the readme.)

Trying to figure out what the problem might be here...


An unexpected error has occurred: CalledProcessError: command: ('/home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/bin/python', '-mpip', 'install', '.')
return code: 1
expected return code: 0
stdout:
Processing /home/[USER]/.cache/pre-commit/repobbmtfqhw
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Collecting fuzzywuzzy
Using cached fuzzywuzzy-0.18.0-py2.py3-none-any.whl (18 kB)
Collecting python-Levenshtein-wheels
Using cached python-Levenshtein-wheels-0.13.2.tar.gz (38 kB)
Preparing metadata (setup.py): started
Preparing metadata (setup.py): finished with status 'done'
Building wheels for collected packages: pre-commit-hooks, python-Levenshtein-wheels
Building wheel for pre-commit-hooks (setup.py): started
Building wheel for pre-commit-hooks (setup.py): finished with status 'done'
Created wheel for pre-commit-hooks: filename=pre_commit_hooks-1.1.10-py3-none-any.whl size=12935 sha256=53cbbea1b01fe2dc92cddd95719a6a81f48fab2d0c93623b2b3c780885965c58
Stored in directory: /tmp/pip-ephem-wheel-cache-38rrpw4b/wheels/69/55/9a/7edb989d561af202c6fb115bb4cb89090e30146869784f82c6
Building wheel for python-Levenshtein-wheels (setup.py): started
Building wheel for python-Levenshtein-wheels (setup.py): finished with status 'error'
Running setup.py clean for python-Levenshtein-wheels
Successfully built pre-commit-hooks
Failed to build python-Levenshtein-wheels
Installing collected packages: python-Levenshtein-wheels, fuzzywuzzy, pre-commit-hooks
Running setup.py install for python-Levenshtein-wheels: started
Running setup.py install for python-Levenshtein-wheels: finished with status 'error'

stderr:
ERROR: Command errored out with exit status 1:
command: /home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-bv001b0l/python-levenshtein-wheels_c9212f793da746af91729b824755bdb2/setup.py'"'"'; file='"'"'/tmp/pip-install-bv001b0l/python-levenshtein-wheels_c9212f793da746af91729b824755bdb2/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-hlc8bxfo
cwd: /tmp/pip-install-bv001b0l/python-levenshtein-wheels_c9212f793da746af91729b824755bdb2/
Complete output (14 lines):
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.10
creating build/lib.linux-x86_64-3.10/Levenshtein
copying Levenshtein/StringMatcher.py -> build/lib.linux-x86_64-3.10/Levenshtein
copying Levenshtein/init.py -> build/lib.linux-x86_64-3.10/Levenshtein
running build_ext
building 'Levenshtein._levenshtein' extension
creating build/temp.linux-x86_64-3.10
creating build/temp.linux-x86_64-3.10/Levenshtein
gcc-5 -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -O3 -Wall -fPIC -I/home/linuxbrew/.linuxbrew/include -I/home/linuxbrew/.linuxbrew/opt/[email protected]/include -I/home/linuxbrew/.linuxbrew/opt/sqlite/include -I/home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/include -I/home/linuxbrew/.linuxbrew/opt/[email protected]/include/python3.10 -c Levenshtein/_levenshtein.c -o build/temp.linux-x86_64-3.10/Levenshtein/_levenshtein.o
error: command 'gcc-5' failed: No such file or directory
----------------------------------------
ERROR: Failed building wheel for python-Levenshtein-wheels
ERROR: Command errored out with exit status 1:
command: /home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-bv001b0l/python-levenshtein-wheels_c9212f793da746af91729b824755bdb2/setup.py'"'"'; file='"'"'/tmp/pip-install-bv001b0l/python-levenshtein-wheels_c9212f793da746af91729b824755bdb2/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-gvg2lees/install-record.txt --single-version-externally-managed --compile --install-headers /home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/include/site/python3.10/python-Levenshtein-wheels
cwd: /tmp/pip-install-bv001b0l/python-levenshtein-wheels_c9212f793da746af91729b824755bdb2/
Complete output (16 lines):
running install
/home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
running build_py
creating build
creating build/lib.linux-x86_64-3.10
creating build/lib.linux-x86_64-3.10/Levenshtein
copying Levenshtein/StringMatcher.py -> build/lib.linux-x86_64-3.10/Levenshtein
copying Levenshtein/init.py -> build/lib.linux-x86_64-3.10/Levenshtein
running build_ext
building 'Levenshtein._levenshtein' extension
creating build/temp.linux-x86_64-3.10
creating build/temp.linux-x86_64-3.10/Levenshtein
gcc-5 -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -O3 -Wall -fPIC -I/home/linuxbrew/.linuxbrew/include -I/home/linuxbrew/.linuxbrew/opt/[email protected]/include -I/home/linuxbrew/.linuxbrew/opt/sqlite/include -I/home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/include -I/home/linuxbrew/.linuxbrew/opt/[email protected]/include/python3.10 -c Levenshtein/_levenshtein.c -o build/temp.linux-x86_64-3.10/Levenshtein/_levenshtein.o
error: command 'gcc-5' failed: No such file or directory
----------------------------------------
ERROR: Command errored out with exit status 1: /home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/bin/python -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-bv001b0l/python-levenshtein-wheels_c9212f793da746af91729b824755bdb2/setup.py'"'"'; file='"'"'/tmp/pip-install-bv001b0l/python-levenshtein-wheels_c9212f793da746af91729b824755bdb2/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(file) if os.path.exists(file) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-gvg2lees/install-record.txt --single-version-externally-managed --compile --install-headers /home/[USER]/.cache/pre-commit/repobbmtfqhw/py_env-default/include/site/python3.10/python-Levenshtein-wheels Check the logs for full command output.


Kustomize build failed on flux-system/apps

Details

What steps did you take and what happened:

I couldn't get Traefik to be deployed and got this error on flux-system/apps


What did you expect to happen:

flux-systems/apps to be in READY=True state

Anything else you would like to add:

It was missing the ./cluster/apps/default/echo-server/config-pvc.yaml file, so I created one.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: echo-server-config
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 128Mi

Additional Information:

It worked!

I thought I had accidentally deleted the config-pvc.yaml, but when I checked this template repo, it's not there either! Or was it?
I don't know if it only happens to my setup, so please close this issue if it's not actually a bug.

Root DNS record?

Details

Ask your question:

Hi!

So why would TF create a "root" DNS record?
And more importantly, since I do not have one: why does it fail?

Terraform will perform the following actions:

  # cloudflare_record.root will be created
  + resource "cloudflare_record" "root" {
      + allow_overwrite = false
      + created_on      = (known after apply)
      + hostname        = (known after apply)
      + id              = (known after apply)
      + metadata        = (known after apply)
      + modified_on     = (known after apply)
      + name            = (sensitive)
      + proxiable       = (known after apply)
      + proxied         = true
      + ttl             = 1
      + type            = "CNAME"
      + value           = (sensitive)
      + zone_id         = "cbf293fdb0b3c18eb6fdebc26acea5a4"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

cloudflare_record.root: Creating...
cloudflare_record.root: Still creating... [10s elapsed]
cloudflare_record.root: Still creating... [20s elapsed]
cloudflare_record.root: Still creating... [30s elapsed]
╷
│ Error: expected DNS record to not already be present but already exists
│ 
│   with cloudflare_record.root,
│   on main.tf line 94, in resource "cloudflare_record" "root":
│   94: resource "cloudflare_record" "root" {
│ 
╵
task: Failed to run task "terraform:apply:cloudflare": exit status 1

My current DNS records:

...
requests.<domain>.net.	1	IN	CNAME	ipv4.<domain>.net.
rook-ceph.<domain>.net.	1	IN	CNAME	ipv4.<domain>.net.
s3.<domain>.net.	1	IN	CNAME	ipv4.<domain>.net.
...

(no root.domain.net)

Option to integrate ufw or other firewall rules

Details

Describe the solution you'd like:

A template or other method to integrate ufw rules.

Anything else you would like to add:

Before using this template I added ufw on the machine(s). Then I either need to manually fix ufw on each machine, or write custom templates / scripts. Ideally, for each step such as installing k3s, there would be the option to open the relevant ports.

Additional Information:

Template does not work when using a single control-plane node

Details

What steps did you take and what happened:

I'm getting back to rebuilding my cluster and thought I'd try working through this template and adopt the k8s-at-home opinions. I'll keep debugging on my own but thought I'd document this as I believe I followed the readme verbatim and encountered this issue. I set up 2 fresh Ubuntu Server node VMs and defined them as _0 and _1 ansible hosts in the .config.env, with only the first being control node = True.

TASK [xanmanning.k3s : Check the conditions when a single controller is defined] ***************************************************************************************************************************************************
fatal: [k8s-0]: FAILED! => changed=false
assertion: (k3s_etcd_datastore is not defined or not k3s_etcd_datastore)
evaluated_to: false
msg: Control plane configuration is invalid. Please see notes about k3s_control_node and HA in README.md.
skipping: [k8s-1]

What did you expect to happen:

I hoped to install k3s and bootstrap the cluster moving on to the next steps.

Anything else you would like to add:

2 other minor onboarding points to mention for ease of use:

This warning several times when running ./configure.sh
[PGP] WARN[0000] Deprecation Warning: GPG key fetching from a keyserver within sops will be removed in a future version of sops. See getsops/sops#727 for more information.

I had already defined my own hostnames I wanted on the nodes and noticed that it automatically changed them to hardcoded k8s-0,1,.... I was going to change them back in the inventory but noticed they were also used elsewhere like in the sops yml (provision/ansible/inventory/host_vars/k8s-0.sops.yml).

Additional Information:

Where is ansible var {{ kubevip_interface }} declared?

I'm trying to follow along with the ansible provisioning of the cluster. I can see that k3s_registration_address is being brought in group_vars/kubernetes/k3s.yaml file, but I don't see kubevip_interface declared anywhere.

I can see we declare it as BOOTSTRAP_ANSIBLE_KUBE_VIP_INTERFACE in config.env, and that is picked up in global kube-vip template, but I can't trace the lineage into the ansible template file

Am I missing something?

https://github.com/k8s-at-home/template-cluster-k3s/blob/088c1d9bcf622ce6e522958d106a02b3b15de95d/provision/ansible/playbooks/templates/kube-vip-daemonset.yaml.j2#L41

Secret sops-age not found

Hello,

After following all the steps on the main branch and adding --kubeconfig=provision/kubeconfig to this command:

cat ~/.config/sops/age/keys.txt |
    kubectl create secret generic sops-age \
    --from-file=age.agekey=/dev/stdin

I can see with kubectl --kubeconfig=provision/kubeconfig -n default get secrets:
sops-age Opaque 1 8h
kubectl --kubeconfig=provision/kubeconfig -n default describe secret sops-age:

Name:         sops-age
Namespace:    default
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
age.agekey:  189 bytes

but whenever I do flux --kubeconfig=provision/kubeconfig get kustomizations -A, I get:

NAMESPACE  	NAME       	READY	MESSAGE                                             	REVISION	SUSPENDED 
flux-system	flux-system	False	decryption secret error: Secret "sops-age" not found	        	False    

Any idea what I might be doing wrong?

Allow custom naming of nodes

Details

I'd like to be able to provide custom names for each of my nodes.

This could be done by allowing a variable in the sample config. If the variable is omitted or "" it can use the default "k8s-00", but if it is included the hostname should be equal to the variable.

# set the hostname of the node
export BOOTSTRAP_ANSIBLE_HOSTNAME_0="control-plane-01"

Renovate Dashboard

This issue lists Renovate updates and detected dependencies. Read the Dependency Dashboard docs to learn more.

Awaiting Schedule

These updates are awaiting their schedule. Click on a checkbox to get an update now.

  • chore(deps): update dependency k3s-io/k3s to v1.24.4+k3s1
  • chore(deps): update helm release ingress-nginx to v4.2.3
  • chore(deps): update helm release kured to v3.0.1
  • chore(deps): update pre-commit hook lucas-c/pre-commit-hooks to v1.3.1
  • chore(deps): update container image ghcr.io/k8s-at-home/kubectl to v1.25.0
  • chore(deps): update container image ghcr.io/tarampampam/error-pages to v2.18.0
  • chore(deps): update dependency community.general to v5.5.0
  • chore(deps): update dependency community.sops to v1.3.0
  • chore(deps): update oxsecurity/megalinter action to v6.6.0
  • chore(deps): update pre-commit hook sirosen/texthooks to v0.4.0
  • chore(deps): update terraform cloudflare to v3.22.0

Open

These updates have all been created already. Click a checkbox below to force a retry/rebase of any.

Detected dependencies

ansible-galaxy
provision/ansible/requirements.yml
  • community.general 5.4.0
  • community.sops 1.2.3
  • ansible.posix 1.4.0
  • ansible.utils 2.6.1
  • xanmanning.k3s v3.2.0
flux
cluster/apps/default/echo-server/helm-release.yaml
  • app-template 0.1.1
cluster/apps/default/hajimari/helm-release.yaml
  • hajimari 1.2.0
cluster/apps/kube-system/cert-manager/helm-release.yaml
  • cert-manager v1.9.1
cluster/apps/kube-system/kured/helm-release.yaml
  • kured 3.0.0
cluster/apps/kube-system/metrics-server/helm-release.yaml
  • metrics-server 3.8.2
cluster/apps/kube-system/reloader/helm-release.yaml
  • reloader v0.0.118
cluster/apps/networking/external-dns/helm-release.yaml
  • external-dns 1.11.0
cluster/apps/networking/ingress-nginx/helm-release.yaml
  • ingress-nginx 4.2.1
cluster/apps/networking/k8s-gateway/helm-release.yaml
  • k8s-gateway 1.1.12
cluster/apps/networking/metallb/helm-release.yaml
  • metallb 0.12.1
github-actions
.github/workflows/invalid-template.yaml
  • dessant/support-requests v2
.github/workflows/lint.yaml
  • actions/checkout v3
  • oxsecurity/megalinter v6.4.0
.github/workflows/support.yaml
  • dessant/support-requests v2
helm-values
cluster/apps/default/echo-server/helm-release.yaml
  • docker.io/jmalloc/echo-server 0.3.3
cluster/apps/default/hajimari/helm-release.yaml
  • ghcr.io/toboshii/hajimari v0.2.0
cluster/apps/networking/cloudflare-ddns/cron-job.yaml
  • ghcr.io/k8s-at-home/kubectl v1.24.3
cluster/apps/networking/ingress-nginx/helm-release.yaml
  • ghcr.io/tarampampam/error-pages 2.17.0
cluster/apps/system-upgrade/system-upgrade-controller/plans/agent.yaml
cluster/apps/system-upgrade/system-upgrade-controller/plans/server.yaml
tmpl/cluster/kube-vip-daemonset.yaml
  • ghcr.io/kube-vip/kube-vip v0.5.0
kubernetes
cluster/apps/default/echo-server/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/default/echo-server/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/default/hajimari/config-pvc.yaml
  • PersistentVolumeClaim v1
cluster/apps/default/hajimari/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/default/hajimari/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/default/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/default/namespace.yaml
  • Namespace v1
cluster/apps/flux-system/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/flux-system/namespace.yaml
  • Namespace v1
cluster/apps/kube-system/cert-manager/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/kube-system/cert-manager/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/kube-system/kube-vip/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/kube-system/kube-vip/rbac.yaml
  • ServiceAccount v1
  • Secret v1
  • ClusterRole rbac.authorization.k8s.io/v1
  • ClusterRoleBinding rbac.authorization.k8s.io/v1
cluster/apps/kube-system/kured/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/kube-system/kured/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/kube-system/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/kube-system/metrics-server/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/kube-system/metrics-server/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/kube-system/namespace.yaml
  • Namespace v1
cluster/apps/kube-system/reloader/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/kube-system/reloader/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/networking/cloudflare-ddns/cron-job.yaml
  • ghcr.io/k8s-at-home/kubectl v1.24.3
  • CronJob batch/v1
cluster/apps/networking/cloudflare-ddns/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/networking/external-dns/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/networking/external-dns/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/networking/ingress-nginx/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/networking/ingress-nginx/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/networking/k8s-gateway/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/networking/k8s-gateway/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/networking/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/networking/metallb/helm-release.yaml
  • HelmRelease helm.toolkit.fluxcd.io/v2beta1
cluster/apps/networking/metallb/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/networking/namespace.yaml
  • Namespace v1
cluster/apps/system-upgrade/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/system-upgrade/namespace.yaml
  • Namespace v1
cluster/apps/system-upgrade/system-upgrade-controller/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/system-upgrade/system-upgrade-controller/plans/agent.yaml
  • Plan upgrade.cattle.io/v1
cluster/apps/system-upgrade/system-upgrade-controller/plans/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/apps/system-upgrade/system-upgrade-controller/plans/server.yaml
  • Plan upgrade.cattle.io/v1
cluster/bootstrap/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/charts/bitnami.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/bjw-s.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/external-dns.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/hajimari.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/ingress-nginx.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/jetstack.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/k8s-gateway.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/charts/metallb.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/metrics-server.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/stakater.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/charts/weaveworks-kured.yaml
  • HelmRepository source.toolkit.fluxcd.io/v1beta2
cluster/config/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/core/cluster-issuers/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/core/cluster-issuers/letsencrypt-production.yaml
  • ClusterIssuer cert-manager.io/v1
cluster/core/cluster-issuers/letsencrypt-staging.yaml
  • ClusterIssuer cert-manager.io/v1
cluster/core/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/crds/cert-manager/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/crds/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/crds/system-upgrade-controller/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
cluster/flux/apps.yaml
  • Kustomization kustomize.toolkit.fluxcd.io/v1beta2
cluster/flux/charts.yaml
  • Kustomization kustomize.toolkit.fluxcd.io/v1beta2
cluster/flux/config.yaml
  • Kustomization kustomize.toolkit.fluxcd.io/v1beta2
cluster/flux/core.yaml
  • Kustomization kustomize.toolkit.fluxcd.io/v1beta2
cluster/flux/crds.yaml
  • Kustomization kustomize.toolkit.fluxcd.io/v1beta2
cluster/flux/flux-system/flux-installation.yaml
  • GitRepository source.toolkit.fluxcd.io/v1beta2
  • Kustomization kustomize.toolkit.fluxcd.io/v1beta2
cluster/flux/flux-system/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
provision/ansible/playbooks/templates/calico-installation.yaml.j2
  • Installation operator.tigera.io/v1
provision/ansible/playbooks/templates/kube-vip-daemonset.yaml.j2
  • ghcr.io/kube-vip/kube-vip v0.5.0
  • DaemonSet apps/v1
tmpl/cluster/cert-manager-secret.sops.yaml
  • Secret v1
tmpl/cluster/cloudflare-ddns-secret.sops.yaml
  • Secret v1
tmpl/cluster/cluster-secrets.sops.yaml
  • Secret v1
tmpl/cluster/cluster-settings.yaml
  • ConfigMap v1
tmpl/cluster/external-dns-secret.sops.yaml
  • Secret v1
tmpl/cluster/flux-cluster.yaml
  • GitRepository source.toolkit.fluxcd.io/v1beta2
  • Kustomization kustomize.toolkit.fluxcd.io/v1beta2
tmpl/cluster/flux-system/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
tmpl/cluster/flux-system/webhooks/github/ingress.yaml
  • Ingress networking.k8s.io/v1
tmpl/cluster/flux-system/webhooks/github/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
tmpl/cluster/flux-system/webhooks/github/receiver.yaml
  • Receiver notification.toolkit.fluxcd.io/v1beta1
tmpl/cluster/flux-system/webhooks/github/secret.sops.yaml
  • Secret v1
tmpl/cluster/flux-system/webhooks/kustomization.yaml
  • Kustomization kustomize.config.k8s.io/v1beta1
tmpl/cluster/kube-vip-daemonset.yaml
  • ghcr.io/kube-vip/kube-vip v0.5.0
  • DaemonSet apps/v1
kustomize
cluster/apps/system-upgrade/system-upgrade-controller/kustomization.yaml
  • rancher/system-upgrade-controller v0.9.1
  • docker.io/rancher/system-upgrade-controller v0.9.1
cluster/bootstrap/kustomization.yaml
  • fluxcd/flux2 v0.32.0
pre-commit
.pre-commit-config.yaml
  • adrienverge/yamllint v1.27.1
  • pre-commit/pre-commit-hooks v4.3.0
  • Lucas-C/pre-commit-hooks v1.3.0
  • sirosen/texthooks 0.3.1
  • k8s-at-home/sops-pre-commit v2.1.1
  • gruntwork-io/pre-commit v0.1.17
regex
cluster/crds/cert-manager/kustomization.yaml
  • quay.io/jetstack/cert-manager-controller v1.9.1
cluster/crds/system-upgrade-controller/kustomization.yaml
  • rancher/system-upgrade-controller v0.9.1
cluster/apps/system-upgrade/system-upgrade-controller/plans/agent.yaml
  • k3s-io/k3s v1.24.3+k3s1
cluster/apps/system-upgrade/system-upgrade-controller/plans/server.yaml
  • k3s-io/k3s v1.24.3+k3s1
cluster/flux/flux-system/flux-installation.yaml
  • fluxcd/flux2 v0.32.0
provision/ansible/inventory/group_vars/kubernetes/k3s.yml
  • k3s-io/k3s v1.24.3+k3s1
terraform
provision/terraform/cloudflare/main.tf
  • cloudflare 3.21.0
  • http 3.0.1
  • sops 0.7.1

Need help to start

context

Hi;

I'm totally a newbie with kubernetes, but not with docker/docker-compose/VM/...

I tried to log in to Discord but I'm unable to see the hCaptcha, so I came back here...

Ask your question:

  • What is the most straightforward way to make it happen? I mean:
  • What is the minimum number of hosts required?
    I thought with k3s it was 1.

I figure I need some kind of Unix/Linux-compatible system, so I installed Ubuntu 20.04 LTS in a VM as my controller/workstation, where I installed Python3, Ansible, pre-commit, VSCode, Age, GoLang, integrated direnv in my .bashrc, and cloned the repo, but...

  • My system says task is not a command when I run task pre-commit:init, but the pre-commit command did do something :p (see the note below)
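
The task command here is go-task from taskfile.dev, which the template's task targets rely on and which has to be installed separately. A minimal sketch using its published install script, assuming a Linux workstation and that ~/.local/bin is on your PATH (the install directory is just a choice, not a requirement):

    # Install go-task (the `task` command) via its official install script
    sh -c "$(curl --location https://taskfile.dev/install.sh)" -- -d -b ~/.local/bin
    # Verify it is on your PATH
    task --version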

Moving the Section config higher in the Wiki

Logically, I believe the section https://github.com/k8s-at-home/template-cluster-k3s#page_facing_up-configuration should come before https://github.com/k8s-at-home/template-cluster-k3s#closed_lock_with_key-setting-up-age, so that other people know what they are supposed to start with:

BOOTSTRAP_AGE_PUBLIC_KEY??
Should it be followed by = and then my age key??
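
Since .config.env is a shell-style env file, yes: the variable is assigned with =. A hedged sketch of what the line would roughly look like (the export prefix is an assumption based on the template's env-file style, and the value is a placeholder -- paste the public key printed by age-keygen, not the secret key):

    # .config.env (sketch; the key shown is a placeholder)
    export BOOTSTRAP_AGE_PUBLIC_KEY="age1examplepublickeyxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"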

Sorry for the mess, I'm currently working remotely from a smartphone :p

PS: I'm very keen to help this project become better, even though it already looks awesome!!

Unable to get to traefik ingress

Details

Whenever I try to connect to the ingress IP I get a 522 error. Looking at the traefik helm-release, shouldn't there be ports 443 and 80 open here? Still learning this stuff...

    ports:
      traefik:
        expose: true
      web:
        redirectTo: websecure
      websecure:
        tls:
          enabled: true
          options: "default"
      metrics:
        port: 8082
        expose: true
        exposedPort: 8082
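
For anyone hitting the same thing: in the upstream traefik chart the web and websecure entry points are exposed on ports 80 and 443 by default, so they do not need to be listed explicitly here; a Cloudflare 522 usually means the LoadBalancer IP is not reachable from Cloudflare rather than that ports are missing. A quick hedged check (the networking namespace and the traefik release/Service name are assumptions -- adjust to your cluster):

    # Confirm the Service got an external IP and see which ports it exposes
    kubectl -n networking get svc traefik -o wide
    # Confirm the ingress controller pods are actually running and Ready
    kubectl -n networking get pods | grep -i traefik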

What did you expect to happen:

I am able to access the hajimari dashboard

Anything else you would like to add:

Additional Information:

question on <task ansible:playbook:k3s-install>

Hopefully a quick and simple question:

  • I have a current 4-node cluster that I deployed using this template
  • I plan to add another node (slightly different hardware, mainly to improve the Plex experience) to the cluster
  • If I install Ubuntu, add the node's details to the env file, and configure its NIC according to the docs, can I just re-run the Ansible playbook to add the new node?
    (It will not affect the existing cluster nodes, correct? So the cluster will be unaffected and only the new node will change...? See the sketch after this list.)
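
In general the playbook is written to be idempotent, so re-running the task against the full inventory should converge the existing nodes without changes and only install k3s on the new one. If you want to scope the run anyway, a hedged sketch (the playbook filename and host name below are placeholders, not taken from the repo):

    # Re-run only against the new node (playbook path and host name are illustrative)
    ansible-playbook -i provision/ansible/inventory \
      provision/ansible/playbooks/k3s-install.yml --limit new-node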

Choosing a specific NIC when there are multiple NICs with calico and kube-vip

Details

This is an advanced configuration, but I figured it would be worth opening an issue to explain how to deal with situations where your nodes have multiple active NICs and you want to choose which one Kubernetes uses.

If your nodes have different network interface names, it's advised to make the names consistent across devices. Review this question on AskUbuntu for how to do that.

These changes should be made prior to running task configure in the guide.

Calico

By default, Calico attaches to the first valid interface it detects (see https://projectcalico.docs.tigera.io/networking/ip-autodetection).

This can be worked around by setting an interface-name regex in the Calico Installation resource, as shown below.

# ./provision/ansible/playbooks/templates/calico-installation.yaml.j2
---
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    nodeAddressAutodetectionV4:
      interface: "eno.*"
...
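
After applying this, a quick way to verify which address Calico actually auto-detected on each node (the node name is a placeholder):

# Calico records the detected address as a node annotation
kubectl describe node <node-name> | grep -i projectcalico.org/IPv4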

kube-vip

By default kube-vip chooses the NIC that has the default gateway. To pin it to a specific NIC, update the configuration in both files below.

# ./provision/ansible/playbooks/templates/kube-vip-daemonset.yaml.j2
...
          env:
...
            - name: vip_interface
              value: "eno1"
...
# ./tmpl/cluster/kube-vip-daemonset.yaml
...
          env:
...
            - name: vip_interface
              value: "eno1"
...
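
Once the daemonset has rolled out, you can sanity-check that the VIP ended up on the intended interface (the DaemonSet name and namespace below are assumptions based on this template's layout):

# On the node currently holding the VIP, the address should appear on eno1
ip addr show eno1
# Inspect kube-vip's logs for the interface it bound to
kubectl -n kube-system logs daemonset/kube-vip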

Stuck at ssh ansible

Hello,

I'm just a starter who wants to explore Kubernetes and make it the core of my home lab.

I forked the template and have installed all prerequisites.
I filled in the .config.env file with my variables.
When I run config.sh --verify it keeps saying that it cannot connect to the Ansible host via SSH.
I've double-checked the username and password, and they work when I SSH through the terminal. Due to the limited info in the error, I do not know where to look further.

Can someone please give me guidance on where to look?

Thank you
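
A hedged debugging sketch for this kind of failure, assuming the template's standard inventory location (adjust the inventory path, user, and host to your setup). One common gotcha: Ansible defaults to key-based SSH, so a password that works interactively is not enough unless --ask-pass or sshpass is configured.

    # Confirm Ansible actually sees your hosts
    ansible all -i provision/ansible/inventory --list-hosts
    # Retry the connection with maximum verbosity to see which user/key/port Ansible uses
    ansible all -i provision/ansible/inventory -m ping -vvv
    # Compare with a plain SSH attempt from the same workstation
    ssh -v <user>@<node-ip>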

Upgrading Calico

Details

Describe the solution you'd like:

Document a way to upgrade Calico. For now, the upgrade can be done by running the following against an already provisioned cluster:

kubectl replace -f https://projectcalico.docs.tigera.io/archive/v3.24/manifests/tigera-operator.yaml
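
To confirm the operator and the Calico components come back healthy after the replace (tigerastatus is the status resource maintained by the tigera-operator):

# All components should eventually report AVAILABLE=True
kubectl get tigerastatus
kubectl -n tigera-operator get pods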

After it is upgraded, it is wise to manually bump the version in the Ansible config:

https://github.com/k8s-at-home/template-cluster-k3s/blob/63d077e1dd50cb0ae9af5c21d951bec1d78c60ad/provision/ansible/inventory/group_vars/kubernetes/k3s.yml#L31
