k3s-cluster-on-oracle-cloud-infrastructure's Introduction

Free K3s Cluster on the Oracle Cloud Infrastructure

The goal of this project is to provision a four-node K3s cluster fully automatically, using only Always Free infrastructure resources. The deployment is done with Terraform and user-data scripts, which install K3s automatically and build the cluster.

Architecture

The cluster infrastructure is based on four nodes: two server nodes and two agent nodes for your workload. A load balancer distributes the traffic to your nodes on port 443. The server nodes are placed in availability domain 2 (AD-2) and the agent nodes are created in AD-1. The cluster uses the storage solution Longhorn, which uses the block storage of the OCI instances and shares the Kubernetes volumes between them. The following diagram gives an overview of the infrastructure.

Configuration

First of all, you need to set up some environment variables that are required by the OCI Terraform provider. The Oracle Cloud Infrastructure documentation gives a good overview of where the IDs and related information are located and also explains how to set up Terraform.

export TF_VAR_compartment_id="<COMPARTMENT_ID>"
export TF_VAR_region="<REGION_NAME>"
export TF_VAR_tenancy_ocid="<TENANCY_OCID>"
export TF_VAR_user_ocid="<USER_OCID>"
export TF_VAR_fingerprint="<RSA_FINGERPRINT>"
export TF_VAR_private_key="<PRIVATE_KEY>"
export TF_VAR_ssh_authorized_keys='["<SSH_PUBLIC_KEY>"]'
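
A missing variable tends to surface only later, as a Terraform prompt or a provider error, so it can help to check the environment first. The following is a minimal sketch and not part of the repository: the function name missing_tf_vars is hypothetical, and the variable list is simply copied from the exports above. One of the issue reports further down shows Terraform prompting for var.private_key_password, so the list may need extending for newer versions of the project.

```shell
# Hypothetical helper (not part of the repository): prints the name of every
# TF_VAR_* variable from the list above that is unset, empty, or not exported.
missing_tf_vars() {
  for _name in compartment_id region tenancy_ocid user_ocid \
               fingerprint private_key ssh_authorized_keys; do
    # printenv only sees exported variables, which is also all Terraform sees.
    if [ -z "$(printenv "TF_VAR_${_name}")" ]; then
      printf 'TF_VAR_%s\n' "$_name"
    fi
  done
}
```

Calling missing_tf_vars before terraform plan prints one variable name per line, and prints nothing when everything in the list is exported.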

Deployment

The deployment is a straightforward process. First, initialize Terraform:

terraform init

Second, create a Terraform plan with this command:

terraform plan -out .tfplan

Finally, apply the plan:

terraform apply ".tfplan"

After a couple of minutes the OCI instances are created and the cluster is up and running. You can then connect via SSH to server node 1 to get the kubeconfig:

scp rancher@<SERVER_NODE_1_PUBLIC_IP>:/etc/rancher/k3s/k3s.yaml ~/.kube/config
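
The kubeconfig that K3s writes points at the loopback address https://127.0.0.1:6443, so a copy on your workstation usually has to be pointed at the server node instead. The following is a minimal sketch under stated assumptions: the helper name point_kubeconfig_at is hypothetical, and it assumes the API server is reachable on port 6443 at the address you pass and that the serving certificate covers that address (see the issue on security rules and kubectl further down).

```shell
# Hypothetical helper: rewrite the loopback API address in a copied k3s.yaml.
# $1 = path to the kubeconfig, $2 = public IP or DNS name of the server node.
point_kubeconfig_at() {
  # Write to a temp file first so a failed sed never truncates the kubeconfig.
  sed "s#https://127\.0\.0\.1:6443#https://$2:6443#" "$1" > "$1.tmp" &&
    mv "$1.tmp" "$1"
}
```

For example, point_kubeconfig_at ~/.kube/config followed by the public IP of the server node updates the file in place.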

Now you can use kubectl to manage your cluster and check the nodes:

kubectl get nodes

Longhorn Installation

Finally, deploy Longhorn, the distributed block storage, using either the kubectl or the Helm method:

Method 1 by kubectl:

kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.2.3/deploy/longhorn.yaml

Method 2 by helm: The services folder contains a shell script that runs all of the following commands at once.

helm repo add longhorn https://charts.longhorn.io
helm repo update
kubectl create namespace longhorn-system
helm install longhorn longhorn/longhorn --namespace longhorn-system

Additionally, for both methods you have to remove local-path as the default provisioner and set Longhorn as the default:

kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"false"}}}'
kubectl patch storageclass longhorn -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

Check the Longhorn storageclass:

kubectl get storageclass

After a few minutes all pods are in the running state and you can connect to the Longhorn UI by forwarding the port to your machine:

kubectl port-forward deployment/longhorn-ui 8000:8000 -n longhorn-system

Use this URL to access the interface: http://127.0.0.1:8000 .

Automatic certificate creation via Let's Encrypt

When exposing your services, it is strongly recommended to use SSL encryption. In this case you have to deploy certificates for all services that should be reachable from the internet. To fulfill this requirement you can use the cert-manager deployment in the services\cert-manager folder.

First, execute cert-manager.sh or run the following commands:

helm repo add jetstack https://charts.jetstack.io
helm repo update

helm install \
  cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --version v1.7.1 \
  --set installCRDs=true

Second, add a cluster issuer: edit the cluster_issuer.yaml file, replacing the placeholder with your email address and domain, and then deploy it:

...
spec:
  acme:
    email: <your_email>@<your-domain>.<tld> # replace
...

Finally, when you deploy a service you have to add an ingress resource. You can use the example file ingress_example.yaml and edit it for your service:

...
spec:
  rules:
  - host: <subdomain>.<your-domain>.<tld>                # replace
    http:
      paths:
      - path: /
        backend:
          serviceName: <service-name>                    # replace
          servicePort: 80
  tls:
  - hosts:
    - <subdomain>.<your-domain>.<tld>                    # replace
    secretName: <subdomain>-<your-domain>-<tld>-prod-tls # replace
...

The last step needs to be done for every service. In this deployment step, cert-manager handles the communication with Let's Encrypt and adds the certificate to your service's ingress resource.
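
Since the ingress has to be edited for every service, the secret name can be derived from the host name instead of being typed by hand. The following is a minimal sketch: the helper name tls_secret_name is hypothetical, and the naming pattern (dots replaced by dashes, suffix -prod-tls) is taken from the example ingress above.

```shell
# Hypothetical helper: derive the TLS secret name used in the example ingress
# (<subdomain>-<your-domain>-<tld>-prod-tls) from a host name.
tls_secret_name() {
  printf '%s-prod-tls\n' "$(printf '%s' "$1" | tr '.' '-')"
}
```

For a host such as app.example.com (a placeholder), this prints app-example-com-prod-tls.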

To Do's

  • Terraform Load Balancer deployment

k3s-cluster-on-oracle-cloud-infrastructure's People

Contributors

howstricks, ngeorger, r0b2g1t

k3s-cluster-on-oracle-cloud-infrastructure's Issues

It looks like the shape VM.Standard.E2.1.Micro isn't available in your region.

It looks like the shape VM.Standard.E2.1.Micro isn't available in your region.
You can check this under Home > Governance > Limits, Quotas and Usage, which shows where your VM.Standard.E2.1.Micro shapes are available.

_Originally posted by @r0b2g1t in https://github.com/r0b2g1t/k3s-cluster-on-oracle-cloud-infrastructure/issues/2#issuecomment-1083209846_

For others, it's available, but not in the availability domain set by the script.

Changing the AD allowed the instance to be created.

What is Server Node 1?

scp rancher@<SERVER_NODE_1_PUBLIC_IP>:/etc/rancher/k3s/k3s.yaml ~/.kube/config

And what is server node 1?

The server-nodes are at the availability domain 2 (AD-2) and the agent node are created in AD-1.

Can't you just say which processor architecture is used for the server nodes and which one for the agent nodes?

Error: 404;NotAuthorizedOrNotFound - caused by hard-coded compute image IDs in the wrong region

A terraform apply throws this error:

│ Error: 404-NotAuthorizedOrNotFound, Authorization failed or requested resource not found.
│ Suggestion: Either the resource has been deleted or service Core Instance need policy to access this resource. Policy reference: https://docs.oracle.com/en-us/iaas/Content/Identity/Reference/policyreference.htm
│ Documentation: https://registry.terraform.io/providers/oracle/oci/latest/docs/resources/core_instance 
│ API Reference: https://docs.oracle.com/iaas/api/#/en/iaas/20160918/Instance/LaunchInstance 
│ Request Target: POST https://iaas.eu-amsterdam-1.oraclecloud.com/20160918/instances 
│ Provider version: 4.119.0, released on 2023-05-03.  
│ Service: Core Instance 
│ Operation Name: LaunchInstance 
│ OPC request ID: 74975954df7e642d81de039b4c702ba1/380DB4D8255D882C2E7CABB029A00BAF/93E0CE782AD76D87AF42585D88D0D369 
│ 
│ 
│   with module.compute.oci_core_instance.server_0,
│   on compute/main.tf line 1, in resource "oci_core_instance" "server_0":
│    1: resource "oci_core_instance" "server_0" {

Suggestion: instead of hard-coding the image IDs in a locals section in the variables.tf file, use a data lookup for the correct region and shape.

Workaround: If anyone else is using the eu-amsterdam-1 region you can use these:

    // Canonical-Ubuntu-22.04-Minimal-aarch64-2023.04.18-0 eu-amsterdam-1
    source_id   = "ocid1.image.oc1.eu-amsterdam-1.aaaaaaaa4yojwha4rdsnagwul4ncgy45lx7q2g3pd5ru2io6rsx6pog35mfq"
    // Canonical-Ubuntu-20.04-Minimal-2023.04.19-0 eu-amsterdam-1
    source_id   = "ocid1.image.oc1.eu-amsterdam-1.aaaaaaaaml7w5cdrj2fzoa7yaaa4ynymkolvshyz3cc4rbxymk52kcwxt6ma"

If you're in another region, use the oci command to look them up. You'll need your compartment ID.

oci compute image list --shape "VM.Standard.E2.1.Micro" --compartment-id ${YOUR_COMPARTMENT_ID}

Fails to create Core Instance

I'm using a root user access key. When I run the script, it creates the network just fine but fails to create the actual instance.

╷
│ Error: 404-NotAuthorizedOrNotFound
│ Provider version: 4.67.0, released on 2022-03-10.
│ Service: Core Instance
│ Error Message: Authorization failed or requested resource not found.
│ OPC request ID: f39808081be0c66f025ee81eb87dbf3a/37EFF46E0A38072AF67D0343D2CDAC94/4F30CD595B5C0CC47C715402A16EA37A
│ Suggestion: Either the resource has been deleted or service Core Instance need policy to access this resource. Policy reference: https://docs.oracle.com/en-us/iaas/Content/Identity/Reference/policyreference.htm
│
│
│   with module.compute.oci_core_instance.server_1,
│   on compute/main.tf line 1, in resource "oci_core_instance" "server_1":
│    1: resource "oci_core_instance" "server_1" {
│

Error: shape VM.Standard.E2.1.Micro not found

I am getting the following error after updating the Ubuntu image to the one for my region, eu-frankfurt-1:

module.compute.oci_core_instance.worker[1]: Creating...
module.compute.oci_core_instance.worker[0]: Creating...
╷
│ Error: 404-NotAuthorizedOrNotFound 
│ Provider version: 4.69.0, released on 2022-03-23.  
│ Service: Core Instance 
│ Error Message: shape VM.Standard.E2.1.Micro not found 
│ OPC request ID: f0e16950924de78391d594d1481FF2194F/DF250747F204905D2CE7474649978B15 
│ Suggestion: Either the resource has been deleted or service Core Instance need policy to access this resource. Policy reference: https://docs.oracle.com/en-us/iaas/Content/Identity/Reference/policyreference.htm
│ 
│ 
│   with module.compute.oci_core_instance.worker[0],
│   on compute/main.tf line 68, in resource "oci_core_instance" "worker":
│   68: resource "oci_core_instance" "worker" {
│ 
╵

This is the worker node config I have:

  worker_instance_config = {
    shape_id = "VM.Standard.E2.1.Micro"
    ocpus    = 1
    ram      = 1
    // Canonical-Ubuntu-20.04-2022.03.02-0
    source_id   = "ocid1.image.oc1.eu-frankfurt-1.aaaaaaaavcrpbwmm75t6azhxgepxah6vigiwwvruti3gj2frhuxnvhzn3e5a"
    source_type = "image"
    worker_ip_0 = "10.0.0.21"
    worker_ip_1 = "10.0.0.22"
    // release: v0.21.5-k3s2r1
    k3os_image = "https://github.com/rancher/k3os/releases/download/v0.21.5-k3s2r1/k3os-amd64.iso"
    metadata = {
      "ssh_authorized_keys" = join("\n", var.ssh_authorized_keys)
    }
  }

The two server nodes VM.Standard.A1.Flex are up and running fine.

I am stuck unfortunately and any help would be gratefully received :)

Also, is starting and destroying instances chargeable? :)

security rules and kubectl

Your security rules are not allowing kubectl to reach the cluster externally.

Also, the public IP of the machine would need to be added to its allowed cert/listeners.

I haven't looked into how you're creating the K3s cluster, but:
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--tls-san x.x.x.x" sh -s -
with x.x.x.x being the public IP would give you access.

Or edit what K3s is serving directly:
kubectl -n kube-system edit secrets/k3s-serving
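
The --tls-san option can be given more than once, for example for a public IP and a DNS name. The following is a minimal sketch of assembling the value for INSTALL_K3S_EXEC; the helper name tls_san_args is hypothetical, and how this repository actually passes options to K3s has not been checked here.

```shell
# Hypothetical helper: turn a list of IPs or host names into repeated
# --tls-san options, suitable as the value of INSTALL_K3S_EXEC.
tls_san_args() {
  _args=""
  for _san in "$@"; do
    # ${_args:+ } adds a separating space only after the first option.
    _args="${_args}${_args:+ }--tls-san ${_san}"
  done
  printf '%s\n' "$_args"
}
```

For the placeholder arguments 203.0.113.10 and k3s.example.com, the function prints two --tls-san options separated by a space.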

Error: Self-referential block

Hi,

When I try to run the terraform plan command, I get the following error:

terraform plan -out .tfplan
var.private_key_password
  Password for private key to use for signing

  Enter a value: 

module.network.data.oci_identity_availability_domain.ad: Reading...
module.network.data.oci_identity_availability_domain.ad: Read complete after 0s [id=ocid1.availabilitydomain.oc1..xxxxxx]
╷
│ Error: Self-referential block
│ 
│   on compute/main.tf line 24, in resource "oci_core_instance" "server_0":
│   24:           server_0_ip = oci_core_instance.server_0.private_ip,
│ 
│ Configuration for oci_core_instance.server_0 may not refer to itself.
╵
╷
│ Error: Self-referential block
│ 
│   on compute/main.tf line 24, in resource "oci_core_instance" "server_0":
│   24:           server_0_ip = oci_core_instance.server_0.private_ip,
│ 
│ Configuration for oci_core_instance.server_0 may not refer to itself.

I had to change ad_number to 1 in network/data.tf because the region af-johannesburg-1 doesn't have an AD 2.

Any idea how I can fix this?
