kbst / terraform-kubestack

Kubestack is a framework for Kubernetes platform engineering teams to define the entire cloud native stack in one Terraform code base and continuously evolve the platform safely through GitOps.

Home Page: https://www.kubestack.com

License: Apache License 2.0

HCL 90.80% Shell 1.35% Smarty 0.65% Python 1.71% Dockerfile 3.76% Makefile 1.73%
gitops-framework aws gcp azure terraform terraform-modules terraform-framework kubernetes hacktoberfest devops gitops platform-engineering

terraform-kubestack's Introduction

Kubestack, The Open Source GitOps Framework

The Open Source Terraform framework for Kubernetes Platform Engineering

Introduction

Kubestack is a Terraform framework for Kubernetes Platform Engineering teams to define the entire cloud native stack in one Terraform code base and continuously evolve the platform safely through GitOps.

Getting Started

The easiest way to get started is to follow the Kubestack tutorial. It walks you through the framework and helps you build a Kubernetes platform application teams love.

Getting Help

Official Documentation
Refer to the official documentation for a deeper dive into how to use and configure Kubestack.

Community Help
If you have any questions while following the tutorial, join the #kubestack channel in the Kubernetes community Slack. To create an account, request an invitation.

Contributing

This repository holds Terraform modules in directories matching the respective provider name, e.g. aws, azurerm, google. Additionally, common holds the modules used by all providers, most notably the metadata module, which ensures a consistent naming scheme, and the cluster_services module, which integrates Kustomize into the Terraform apply.

Each cloud-provider-specific module directory always has a cluster and a _modules directory. The cluster module is user facing; once Kubestack is out of beta, the goal is to not change its interface unless the major version changes. Internally, the cluster module uses the module in _modules, which holds the actual implementation.

The quickstart directory is home to the source for the zip files that are used to bootstrap the user repositories when following the tutorial.

The tests directory holds a set of happy path tests.

Contributions to the Kubestack framework are welcome and encouraged. Before contributing, please read the Contributing and Code of Conduct Guidelines.

One super simple way to contribute to the success of this project is to give it a star.



terraform-kubestack's People

Contributors

ajrpayne, anhdle14, cbek, cctechwiz, cpanato, dependabot[bot], ederst, feend78, gullitmiranda, jdmarble, jfreuter-fin, krpatel19, leewardbound, mark5cinco, markszabo, nullck, pst, rzk, sondabar, soulshake, spazzy757, to266


terraform-kubestack's Issues

Upgrade Images to use Terraform 0.13

We ran into a requirement to use Terraform 0.13 and thought it would be an easy switch: just swap the Kubestack Docker image for a custom one with Terraform 0.13 installed. The issue is that Kubestack relies on its custom Kustomize Terraform provider, which is installed in the same container, and the path layout for custom Terraform providers changed in 0.13. So you need to dig into Kubestack, pull out the Dockerfile, and update it accordingly to get it to work (a sketch of the 0.13-style provider configuration follows below).

There were two problems with this:

  • The Kubestack Dockerfile setup is quite layered, so it was difficult to pinpoint the exact places to make the change
  • You lose the ability to just pull Kubestack's image and have to maintain your own
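For reference, a minimal sketch of the Terraform 0.13 provider requirements, assuming the Kustomize provider is published under the kbst registry namespace (the exact source address and version constraint here are assumptions):

terraform {
  required_version = ">= 0.13"

  required_providers {
    # Terraform 0.13 resolves providers via a source address (or a local
    # filesystem mirror) instead of the pre-0.13 plugin directory layout
    kustomization = {
      source  = "kbst/kustomization"
      version = ">= 0.2"
    }
  }
}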

AKS: Enable CNI/Advanced Networking Configuration

Enhancement Request:

As a user, I would like to configure my Kubestack environment in Azure with CNI/Advanced Networking Config to allow greater control over networking resources.

Background

We are looking to use Kubestack to manage our AKS clusters within a fairly complex hybrid cloud/on-prem environment. Because the on-premises resources we need to access from within the cluster are in a 10.10.10.0/24 subnet, and the vnets created automatically by the AKS managed service have an address space of 10.0.0.0/8, which contains 10.10.10.0/24, the collision of these address spaces has prevented us from creating a VPN gateway to link the AKS vnet and the local subnet. Using CNI will allow us to create a vnet with a smaller address space that we can configure and manage as needed.

Detail

To enable this behavior, the following changes would be needed (a hypothetical configuration sketch follows the list):

  • The user would create a custom vnet and subnet (outside the Kubestack directory, in a wider TF project?)
  • The user would pass the subnet ID in to Kubestack (via config.auto.tfvars?)
  • The user would pass in, minimally, values for service_cidr and dns_server_ip (optionally also network_plugin, network_policy, and pod_cidr)
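A hypothetical sketch of how the module could wire these inputs through to the AKS resource; the variable names and the exact interface are assumptions, not the current module contract:

resource "azurerm_kubernetes_cluster" "current" {
  # ...

  default_node_pool {
    # node subnet created and managed by the user outside Kubestack
    vnet_subnet_id = var.aks_subnet_id
    # ...
  }

  network_profile {
    network_plugin = var.network_plugin # "azure" for CNI
    network_policy = var.network_policy
    service_cidr   = var.service_cidr
    dns_service_ip = var.dns_service_ip # azurerm calls this dns_service_ip
  }
}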

Challenge

In order to complete the CNI setup, the user will need to associate the network security group generated by the AKS managed service with the custom subnet. Unfortunately, the group name includes a randomly generated number, e.g. aks-agentpool-22342344-nsg, and the azurerm_kubernetes_cluster provider does not expose this group name directly within the Terraform state. It is possible to retrieve it using an external data provider and a brief shell script. To facilitate this, it would be helpful if the Kubestack cluster module could expose the name of the auto-created node_resource_group that contains the NSG.

Kind should use lowercase name

The kind_cluster name should be lowercase, otherwise apply fails:

.terraform/modules/eks_zero/kind/_modules/kind/main.tf line 12, in resource "kind_cluster" "current":
  12: resource "kind_cluster" "current" {
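A minimal sketch of the fix, assuming the name comes in via a variable from the metadata module:

resource "kind_cluster" "current" {
  # kind only accepts lowercase cluster names, so normalize the input
  name = lower(var.metadata_name) # variable name is an assumption
}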

EKS: Istio sidecars are not being added.

Istio was not adding sidecars to pods. After some digging, I found an issue mentioning that EKS needs an ingress rule on the workers allowing port 443 from the masters, so the control plane can reach the sidecar injection webhook. Adding this ingress rule to the worker security group allowed the sidecars to be created.
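A hedged sketch of such a rule, assuming the module manages the worker and cluster security groups under these resource names:

# Allow the control plane to reach the workers on 443, e.g. for the
# Istio sidecar injection webhook
resource "aws_security_group_rule" "masters_to_workers_https" {
  type                     = "ingress"
  from_port                = 443
  to_port                  = 443
  protocol                 = "tcp"
  security_group_id        = aws_security_group.workers.id
  source_security_group_id = aws_security_group.cluster.id
}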

EKS: AMI name for creating EKS worker nodes is different in eu-central-1

Hello Team,

https://github.com/kbst/terraform-kubestack/blob/master/aws/_modules/eks/workers_asg.tf

The filter provided for retrieving the AMI used to create EKS worker nodes does not work in the eu-central-1 region; it does not return any AMI.

aws ec2 --region eu-central-1 describe-images --filters "Name=name,Values=amazon-eks-node*" --owners 602401143452

{
"Images": []
}

When trying the same query in the us-west-2 region, it does retrieve results.

The reason is that the EKS worker node AMIs in eu-central-1 include the EKS Kubernetes version in their names:

"Name": "amazon-eks-node-1.11-v20190109"

So the filter value has to be

["amazon-eks-node-${aws_eks_cluster.current.version}-v*"]

{
            "Architecture": "x86_64",
            "CreationDate": "2019-01-10T00:31:41.000Z",
            "ImageId": "ami-010caa98bae9a09e2",
            "ImageLocation": "602401143452/amazon-eks-node-1.11-v20190109",
            "ImageType": "machine",
            "Public": true,
            "OwnerId": "602401143452",
            "State": "available",
            "BlockDeviceMappings": [
                {
                    "DeviceName": "/dev/xvda",
                    "Ebs": {
                        "DeleteOnTermination": true,
                        "SnapshotId": "snap-0cb69b54efae33948",
                        "VolumeSize": 20,
                        "VolumeType": "gp2",
                        "Encrypted": false
                    }
                }
            ],
            "Description": "EKS Kubernetes Worker AMI with AmazonLinux2 image",
            "EnaSupport": true,
            "Hypervisor": "xen",
            "Name": "amazon-eks-node-1.11-v20190109",
            "RootDeviceName": "/dev/xvda",
            "RootDeviceType": "ebs",
            "SriovNetSupport": "simple",
            "VirtualizationType": "hvm"
        }

Please make the corresponding changes so that it works for the eu-central-1 region as well.
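A sketch of the adjusted AMI lookup; the data source layout is an assumption based on the description above, not a copy of workers_asg.tf:

data "aws_ami" "eks_worker" {
  most_recent = true
  owners      = ["602401143452"] # account publishing the official EKS AMIs

  filter {
    name = "name"
    # include the cluster's Kubernetes version so regions that only
    # publish versioned names, like eu-central-1, also match
    values = ["amazon-eks-node-${aws_eks_cluster.current.version}-v*"]
  }
}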

Version 1.11 of Terraform kubernetes provider causes stack launch failure

Using the latest version of the kubernetes provider fails to launch the aws quickstart stack (quickstart/src/configurations/eks) with error:

Error: Failed to initialize config: invalid configuration: no configuration has been provided

This seems to be a known issue - see hashicorp/terraform-provider-kubernetes#759

Pinning the provider at 1.10 seems to be a workaround for the issue:

quickstart/src/configurations/eks/versions.tf
terraform {
  required_version = ">= 0.12"
  required_providers {
    kubernetes = "1.10"
  }
}

GKE: Make autoscaling the default for clusters

The goal of the Kubestack modules is to harmonize the managed Kubernetes offerings across the different cloud providers as much as possible. This should be reflected in the input variables, too. Currently, EKS has min and max node settings; the GKE module does not enable the autoscaler and does not expose min and max variables.
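A hedged sketch of what the harmonized GKE side could look like; variable names are assumptions:

resource "google_container_node_pool" "current" {
  # ...

  # mirror the min/max node inputs the EKS module already exposes
  autoscaling {
    min_node_count = var.min_node_count
    max_node_count = var.max_node_count
  }
}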

Extending tf definitions of kubestack clusters with custom requirements

Currently, none of the Kubestack modules have any outputs. This makes it difficult to extend the clusters with custom Terraform code without touching the internals when provisioning custom resources around the cluster.

There are really a couple of questions:

  1. How do we envision extending the cluster with custom terraform declarations?
  2. How do we upgrade the kubestack version?
  3. Should we start implementing outputs similar to the output proposed in #133 ?

I envision upgrades being a rather manual process, maybe git merge upstream, keeping kbst/terraform as the upstream.
At the same time I'd envision a single extensions.tf using the proposed outputs in order to extend the clusters with custom terraform declarations.
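A minimal sketch of what such outputs could look like in a user configuration; the module name and attribute paths are assumptions, since the modules do not expose them today:

output "cluster_name" {
  value = module.gke_zero.cluster_name
}

output "cluster_endpoint" {
  value     = module.gke_zero.cluster_endpoint
  sensitive = true
}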

AKS: vnet/subnet names don't follow convention

When the AKS module creates vnet and subnet names, it does not use the metadata name. The generated names only work for multiple environments in the same Azure resource group, but not for multiple clusters per environment in the same resource group.

Error while executing `terraform apply`

Hi there,

Following the quickstart documentation, in step 3, Bootstrap the Ops- and Apps-cluster pair, I got the following error:

$ terraform apply --auto-approve

module.gke_zero.module.cluster.data.external.gcloud_account: Refreshing state...
module.gke_zero.module.cluster.data.google_client_config.default: Refreshing state...
module.gke_zero.module.cluster.module.cluster_services.data.kustomization.current: Refreshing state...

Error: Failed to initialize config: invalid configuration: no configuration has been provided

  on .terraform/modules/gke_zero/google/_modules/gke/provider.tf line 4, in provider "kubernetes":
   4: provider "kubernetes" {

Am I forgetting some configuration or could this be a bug?

Cheers!

Grant permissions to GKE nodes to pull images from GCR

Hi,

When provisioning the GKE infrastructure, it would be great to be able to pull images from the GCR that is within the same project as the GKE cluster.

Currently, there is an error:

Failed to pull image "gcr.io/xxxx-xxx/img:tag": rpc error: code = Unknown desc = Error response from daemon: pull access denied for gcr.io/xxxx-xxx/api, repository does not exist or may require 'docker login'
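A hedged sketch of one way to grant that access, assuming the module manages a dedicated node service account (resource names are assumptions):

# GCR images in the same project are backed by GCS buckets, so read
# access on the project lets the nodes pull them
resource "google_project_iam_member" "gcr_pull" {
  project = var.project_id
  role    = "roles/storage.objectViewer"
  member  = "serviceAccount:${google_service_account.nodes.email}"
}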

Update Get Started tutorial

The .github/workflows/main.yaml pipeline from the Get Started tutorial should be updated now that new repositories name their default branch "main" instead of "master". I tried this change and it seems to be working:

github.ref == 'refs/heads/master' || github.ref == 'refs/heads/main' || startsWith(github.ref, 'refs/tags/apps-deploy-')

Without this change, Terraform apply won't run in the ops cluster when merging into main.

Create secret containing cluster information

The AWS Cluster Autoscaler needs to know the name of the cluster for auto discovery. Currently, the recommended way of getting this information into the deployment is through secrets. Only the cluster name is needed for now, but other information may become useful later on.
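A hypothetical sketch of what such a secret could look like for EKS; the resource wiring is an assumption:

resource "kubernetes_secret" "cluster_info" {
  metadata {
    name      = "cluster-info"
    namespace = "kube-system"
  }

  data = {
    # the cluster autoscaler needs this for auto discovery
    cluster_name = aws_eks_cluster.current.name
  }
}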

How Kubestack can be used with Terraform Cloud

Hi,

I've been reading about Kubestack and I like the idea of GitOps. However, we are using Terraform Cloud, and the bootstrap documentation focuses only on running Terraform locally or via a pipeline. Is it possible to implement Kubestack on Terraform Cloud? Thank you.

GKE: Do not use default compute service account for clusters

We aim to support both separate project_id for ops and apps as well as same project_id for ops and apps so that teams that can't easily have a second project_id can use Kubestack.

For that, ops and apps should not use the default compute service account for nodes anymore but have dedicated, per cluster service accounts.
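A hedged sketch of a dedicated, per-cluster node service account; resource and variable names are assumptions:

resource "google_service_account" "nodes" {
  account_id   = "nodes-${var.cluster_name}" # must satisfy GCP's length and character rules
  display_name = "Nodes of ${var.cluster_name}"
}

resource "google_container_cluster" "current" {
  # ...

  node_config {
    # use the dedicated account instead of the default compute service account
    service_account = google_service_account.nodes.email
    oauth_scopes    = ["https://www.googleapis.com/auth/cloud-platform"]
  }
}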

EKS: Does not have attribute error on destroy right after a successful destroy

Step #3 - "terraform destroy": Error: Error applying plan:
Step #3 - "terraform destroy": 
Step #3 - "terraform destroy": 3 error(s) occurred:
Step #3 - "terraform destroy": 
Step #3 - "terraform destroy": * module.eks_zero.module.cluster.module.node_pool.var.cluster_ca: Resource 'aws_eks_cluster.current' does not have attribute 'certificate_authority.0.data' for variable 'aws_eks_cluster.current.certificate_authority.0.data'
Step #3 - "terraform destroy": * module.eks_zero.module.cluster.module.node_pool.var.cluster_endpoint: Resource 'aws_eks_cluster.current' does not have attribute 'endpoint' for variable 'aws_eks_cluster.current.endpoint'
Step #3 - "terraform destroy": * module.eks_zero.module.cluster.module.node_pool.var.cluster_name: Resource 'aws_eks_cluster.current' does not have attribute 'name' for variable 'aws_eks_cluster.current.name'

Possibly related: terraform-aws-modules/terraform-aws-eks#262

Make the Setup of the Ingress More Modular

There are some instances where the networking setup requires a different approach from what Kubestack provides (for instance, running the cluster with Cloudflare as the DNS). It would be great to pull out the ingress setup and make it more modular, so you could have an opt-in approach.

EKS: Support assuming roles across organizations in aws-iam-authenticator

To support the common setup of one org for identities and sub-orgs for the clusters, aws-iam-authenticator needs to be configured to assume a role. The role may need to be created first, and we need to figure out whether, and if so how, to allow assuming the created role.

Automating the creation and setup of that role seems preferable over expecting the user to provide a role. Subject to investigation and discussion.
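A hedged sketch of what the cross-account role could look like; the account variable and role name are assumptions, and how aws-iam-authenticator maps the role is left open:

resource "aws_iam_role" "cluster_admin" {
  name = "kubestack-cluster-admin"

  # allow principals from the identity account to assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = "arn:aws:iam::${var.identity_account_id}:root" }
      Action    = "sts:AssumeRole"
    }]
  })
}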

GKE: Do not use default VPC

We aim to support both separate project_id for ops and apps as well as same project_id for ops and apps so that teams that can't easily have a second project_id can use Kubestack.

For that, ops and apps should not use the default VPC for nodes anymore but have dedicated, per cluster VPCs.
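A hedged sketch of a dedicated, per-cluster VPC and subnet; names and the example CIDR are assumptions:

resource "google_compute_network" "current" {
  name                    = var.cluster_name
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "current" {
  name          = var.cluster_name
  region        = var.region
  network       = google_compute_network.current.self_link
  ip_cidr_range = "10.64.0.0/16" # example range
}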

AKS: Automatically rotate service principal credentials

The AKS module currently creates a random password for the service principal that is valid for 90 days. However, the password is currently not rotated. There are instructions for how to do this using the Azure CLI.

Test whether the credential rotation can be automated using Terraform and local provisioners, so that we can keep a password that's only valid for a limited number of days and rotate it before it expires.
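One possible sketch, not the module's implementation: key the random password on a rotating timestamp so Terraform regenerates it before the 90 days are up (resource names are assumptions, and the service principal password resource would consume the new value):

resource "time_rotating" "sp_password" {
  rotation_days = 80
}

resource "random_password" "sp_password" {
  length  = 32
  special = false

  # changing keepers forces a new password on the next apply
  keepers = {
    rotation = time_rotating.sp_password.id
  }
}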

Credentials in state and log output

Currently, the way a kubeconfig file is generated for EKS, AKS and GKE has the downside that credentials end up in Terraform state and in the log output of Terraform runs.

The idea is, instead of generating kubeconfig files, to use environment variables to configure kubectl to apply the kustomize output.

The Terraform kubernetes provider may cause similar problems and needs to be investigated as well.
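For EKS, one possible direction is exec-based provider authentication, sketched below; this is not the framework's current implementation, just an illustration of keeping short-lived credentials out of state:

provider "kubernetes" {
  host                   = aws_eks_cluster.current.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.current.certificate_authority[0].data)

  # fetch a short-lived token at plan/apply time instead of persisting
  # credentials in state or a kubeconfig file
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", aws_eks_cluster.current.name]
  }
}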

Feature request: cluster logging [EKS]

After manually enabling control plane logging on a Kubestack-managed cluster via the EKS console, a terraform plan shows this diff:

  # module.eks_platform.module.cluster.aws_eks_cluster.current will be updated in-place
  ~ resource "aws_eks_cluster" "current" {
        [...]
      ~ enabled_cluster_log_types = [
          - "api",
          - "audit",
          - "authenticator",
          - "controllerManager",
          - "scheduler",
        ]

It would be great if the github.com/kbst/terraform-kubestack//aws/cluster module could accept arguments to set enabled_cluster_log_types.
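A hedged sketch of such an argument; the variable name is an assumption:

variable "cluster_log_types" {
  type    = list(string)
  default = []
}

resource "aws_eks_cluster" "current" {
  # ...

  # forward the requested control plane log types to CloudWatch
  enabled_cluster_log_types = var.cluster_log_types
}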

Can no longer get DNS information

The current documentation states that to get the required DNS information, you need to run terraform output --module.... However, when running this with Terraform 0.12+ you get the following error:

Error: Unsupported option

The -module option is no longer supported since Terraform 0.12, because now
only root outputs are persisted in the state.

The Terraform documentation states that you must now specify outputs by adding:

output "fqdn" {
  value = module.cluster_metadata.fqdn
}

and then access it via terraform output fqdn.

Another output for ingress_zone_name_servers is also needed.
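A hypothetical sketch of that additional output; the module attribute is an assumption:

output "ingress_zone_name_servers" {
  value = module.cluster_metadata.ingress_zone_name_servers
}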

Make Workspace Names Configurable

While the ops/apps names are valid and I understand the theory behind them, some companies have very strict naming patterns to distinguish production from non-production (some are also pedantic, for instance requiring that all environment names have three letters, like prd and stg). It would be great if you could switch out the default workspace names, so companies with strict naming patterns don't have to move away from their best practices.

Cluster Recreate Causes Kustomize Module To Fail

Problem

We are recreating our cluster to enable private node pools. The issue seems to be that, because the cluster is being recreated, the Kustomize provider tries to communicate with the Kubernetes cluster on the default localhost address.

Logs

Error: ResourceDiff: Get "http://localhost/api?timeout=32s": dial tcp 127.0.0.1:80: connect: connection refused

  on .terraform/modules/gke_zero/common/cluster_services/main.tf line 16, in resource "kustomization_resource" "current":
  16: resource "kustomization_resource" "current" {


Error: Process completed with exit code 1.

Steps To Reproduce

Create a cluster with the setting:

enable_private_nodes = false

Then, once created, change the value:

enable_private_nodes = true

and run on that TF workspace:

terraform plan

Workaround

Currently there is a workaround by using:

terraform apply --target=<cluster module>

This will update the cluster, which should then fix the problem.

Support OpenStack

Are there plans for Kubestack to support deployment on OpenStack as well? I would be very much interested in this.

Thanks!

Remove `both` overlay

On the infrastructure side, ops inherits everything from apps and can overwrite attributes if necessary.

On the cluster manifests side, however, the both overlay breaks that same logic. Instead of ops and apps both inheriting from both, ops should simply inherit from apps, and apps should inherit from common.

GitHub template project for kubestack based local and other cloud providers K8s clusters.

I was looking for a way to manage my local kind-based Kubernetes cluster with the required components installed. I had initially automated this myself; however, I looked for a better approach and learned about the Kubestack project.

I did try the Kind Kubestack example and it worked fine; however, getting it to work without using a container is a bit hacky (compiling and installing into the local .terraform folder). I think it would be great to have a GitHub template project with the kind and Kustomize Terraform modules in one Terraform stack instead of two, so that, based on it, users can easily create local or other cloud-based projects as required.

Let me know if you are up for this, and I will be happy to contribute the corresponding changes. Thanks.

AKS: Adding labels to node_pool requires delete and recreate of the cluster

Adding node_labels to the default node pool forces a destroy and recreate plan.

  # module.aks_zero.module.cluster.azurerm_kubernetes_cluster.current must be replaced
-/+ resource "azurerm_kubernetes_cluster" "current" {
     [...]

      ~ default_node_pool {
          - availability_zones    = [] -> null
          - enable_node_public_ip = false -> null
          ~ max_pods              = 110 -> (known after apply)
            name                  = "default"
          ~ node_count            = 1 -> (known after apply)
          ~ node_labels           = { # forces replacement
              + "kubestack.com-cluster_domain"          = "azure.infra.serverwolken.de"
              + "kubestack.com-cluster_fqdn"            = "kbstacctest-ops-westeurope.azure.infra.serverwolken.de"
              + "kubestack.com-cluster_name"            = "kbstacctest-ops-westeurope"
              + "kubestack.com-cluster_provider_name"   = "azure"
              + "kubestack.com-cluster_provider_region" = "westeurope"
              + "kubestack.com-cluster_workspace"       = "ops"
            }
          - node_taints           = [] -> null
          ~ orchestrator_version  = "1.18.14" -> (known after apply)
          - tags                  = {} -> null
            # (7 unchanged attributes hidden)
        }
    }

GKE Service Account Name Too Long

Problem

The service account name that is created has a limit of 28 characters. This means that if your workspace name is too long, e.g. kubestack-staging, you will get the following error when you run apply:

Acquiring state lock. This may take a few moments...
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
module.gke_zero.module.cluster.data.external.gcloud_account: Refreshing state...
module.gke_zero.module.cluster.data.google_client_config.default: Refreshing state...
module.gke_zero.module.cluster.module.cluster_services.data.kustomization.current: Refreshing state...
------------------------------------------------------------------------
Error: "account_id" ("sp-kubestack-staging-us-central1") doesn't match regexp "^[a-z](?:[-a-z0-9]{4,28}[a-z0-9])$"
  on .terraform/modules/gke_zero/google/_modules/gke/service_account.tf line 1, in resource "google_service_account" "current":
   1: resource "google_service_account" "current" {

Short term workaround

Obviously, the workaround is to shorten the workspace name. However, if this is the proposed solution, organisations might have to change their naming standards (e.g. an organisation that uses staging everywhere would need to start using stg). This should then be documented.

Proposed fix

I am not sure about the complexity this would create, but a potential fix would be to remove the region that is added as the suffix, thus allowing for more flexibility in naming conventions.
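A hedged sketch of one possible alternative: truncate the generated account_id instead of dropping the region (locals and names are assumptions, and trailing hyphens would still need handling):

locals {
  # keep the generated id within the account_id length limit
  sa_account_id = substr("sp-${var.workspace}-${var.region}", 0, 28)
}

resource "google_service_account" "current" {
  account_id   = local.sa_account_id
  display_name = "Kubestack cluster service account"
}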

Feature request: arbitrary runtime script execution

Hi, really liking Kubestack so far! 🙏

It would be useful if the kubestack/framework image had a location where arbitrary scripts could be placed, which would then be executed at runtime by the default entrypoint.

This would facilitate running some extra tasks at runtime without needing to override the default entrypoint (and fixing it when that entrypoint changes).

It would enable some nice and lazy things like, say, authenticating to a cluster in CI (just to illustrate):

FROM kubestack/framework:v0.11.0-beta.0-eks
RUN echo "[ -f $HOME/.kube/config ] || aws eks update-kubeconfig --name platform-apps-us-east-2 --alias woot --region us-east-2" >> /opt/bin/entrypoint.d/cluster-auth.sh \
    && chmod +x /opt/bin/entrypoint.d/cluster-auth.sh

Examples of other images using this pattern

  • postgres uses /docker-entrypoint-initdb.d/ as described here
  • nginx uses /docker-entrypoint.d/ as shown here

GKE By default, include NAT gateway to allow internet access

Following upstream, the new default for GKE clusters is to use private nodes. Private nodes do not have internet access and require a NAT gateway.

Check if upstream includes a NAT gateway by default; if so, include one as well and add an opt-out variable for NAT gateway provisioning.
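A hedged sketch of the Cloud Router plus Cloud NAT pair that private nodes would need; names are assumptions:

resource "google_compute_router" "current" {
  name    = "${var.cluster_name}-router"
  region  = var.region
  network = google_compute_network.current.self_link
}

resource "google_compute_router_nat" "current" {
  name                               = "${var.cluster_name}-nat"
  router                             = google_compute_router.current.name
  region                             = var.region
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}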

Nss-wrapper doesn't work with user and group id 0

We need to use the same user id inside the container as the user has outside the container, so that files written to the mounted volume from inside the container are accessible by the user on the host.

In addition to that, the nss-wrapper script used as the ENTRYPOINT ensures a user and group are defined inside the container with matching user and group ids.

This works well in the current implementation for Mac and Linux, where the user most likely runs as an unprivileged user. On Windows using WSL, it seems the user is always root. For some reason, uid 0 and gid 0 outside the container break the script when the container is started like this:

docker run --rm -ti \
    -v `pwd`:/infra \
    -u `id -u`:`id -g` \
    IMAGE:TAG
