Jenkins X GKE Module

Terraform Version

NOTE: While the required minimum Terraform version is 0.12.0, automated CI tests are performed with 0.13 only. The only expected compatibility issues to be aware of are around provider requirements. For more information see here


This repo contains a Terraform module for provisioning a Kubernetes cluster for Jenkins X on Google Cloud.

What is a Terraform module

A Terraform "module" refers to a self-contained package of Terraform configurations that are managed as a group. For more information around modules refer to the Terraform documentation.

How do you use this module

Prerequisites

To make use of this module, you need a Google Cloud project. Instructions on how to set up such a project can be found in the Google Cloud Installation and Setup guide. You need your Google Cloud project id as an input variable for using this module.

You also need to install the Cloud SDK, in particular gcloud. You can find instructions on how to install and authenticate in the Google Cloud Installation and Setup guide as well.

Once you have gcloud installed, you need to create Application Default Credentials by running:

gcloud auth application-default login

Alternatively, you can export the environment variable GOOGLE_APPLICATION_CREDENTIALS referencing the path to a Google Cloud service account key file.
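
For example (the key file path below is a placeholder for your own key):

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json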

Last but not least, ensure you have the following binaries installed:

  • gcloud
  • kubectl ~> 1.14.0
    • kubectl comes bundled with the Cloud SDK
  • terraform ~> 0.12.0
    • Terraform installation instructions can be found here

Cluster provisioning

A default Jenkins X ready cluster can be provisioned by creating a file main.tf in an empty directory with the following content:

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project = "<my-gcp-project-id>"
}

output "jx_requirements" {
  value = module.jx.jx_requirements
}  

You can then apply this Terraform configuration via:

terraform init
terraform apply

This creates a cluster within the specified Google Cloud project with all possible configuration options defaulted.

⚠️ Note: This example is for getting up and running quickly. It is not intended for a production cluster. Refer to Production cluster considerations for things to consider when creating a production cluster.

On completion of terraform apply, a jx_requirements output is available which can be used as input to jx boot. Refer to Running jx boot for more information.
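
For example, the rendered requirements can be written to a file and passed to jx boot (a minimal sketch):

terraform output jx_requirements > jx-requirements.yml
jx boot -r jx-requirements.yml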

In the default configuration, no custom domain is used. DNS resolution occurs via nip.io. For more information on how to configure and use a custom domain, refer to Using a custom domain.

If you just want to experiment with Jenkins X, you can set force_destroy to true. This allows you to remove all generated resources when running terraform destroy, including any generated buckets and their contents.

If you want to remove a cluster with the terraform destroy command and the cluster is protected by the deletion_protection=true attribute, you can override the attribute by setting the delete_protect variable to false. It is recommended to override this value only at the time of cluster deletion, and you should successfully apply the attribute change before attempting the terraform destroy command.
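
For example, a sketch of the teardown flow (the project id is a placeholder): set delete_protect to false in the module block, apply the change, then destroy:

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project    = "<my-gcp-project-id>"
  delete_protect = false
}

terraform apply
terraform destroy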

The following two sections provide the full list of input and output variables of this Terraform module.

Inputs

Name Description Type Default Required
apex_domain The parent / apex domain to be used for the cluster string "" no
apex_domain_gcp_project The GCP project the apex domain is managed by, used to write recordsets for a subdomain if set. Defaults to current project. string "" no
apex_domain_integration_enabled Flag that when set attempts to create delegation records in apex domain to point to domain created by this module bool true no
artifact_description Artifact Registry repository description string "jenkins-x Docker Repository" no
artifact_enable Create an Artifact Registry repository bool true no
artifact_location Artifact Registry repository location string "us-central1" no
artifact_repository_id Artifact Registry repository name string "oci" no
autoscaler_location_policy location policy for primary node pool string "ANY" no
autoscaler_max_node_count primary node pool max nodes number 5 no
autoscaler_min_node_count primary node pool min nodes number 3 no
bucket_location Bucket location for storage string "US" no
cluster_location The location (region or zone) in which the cluster master will be created. If you specify a zone (such as us-central1-a), the cluster will be a zonal cluster with a single cluster master. If you specify a region (such as us-west1), the cluster will be a regional cluster with multiple masters spread across zones in the region string "us-central1-a" no
cluster_name Name of the Kubernetes cluster to create string "" no
cluster_network The name of the network (VPC) to which the cluster is connected string "default" no
cluster_subnetwork The name of the subnetwork to which the cluster is connected. Leave blank when using the 'default' vpc to generate a subnet for your cluster string "" no
create_ui_sa Whether the service accounts for the UI should be created bool true no
delete_protect Flag used to set the deletion_protection attribute to prevent cluster deletion bool true no
dev_env_approvers List of git users allowed to approve pull requests for the dev environment repository list(string) [] no
enable_backup Whether or not Velero backups should be enabled bool false no
enable_primary_node_pool create a node pool for primary nodes if disabled you must create your own pool bool true no
enable_private_endpoint (Beta) Whether the master's internal IP address is used as the cluster endpoint. Requires VPC-native bool false no
enable_private_nodes (Beta) Whether nodes have internal IP addresses only. Requires VPC-native bool false no
force_destroy Flag to determine whether storage buckets get forcefully destroyed bool false no
gcp_project The name of the GCP project to use string n/a yes
git_owner_requirement_repos The git id of the owner for the requirement repositories string "" no
gsm Enables Google Secrets Manager, not available with JX2 bool false no
initial_cluster_node_count Initial number of cluster nodes number 3 no
initial_primary_node_pool_node_count Initial primary node pool nodes number 3 no
ip_range_pods The IP range in CIDR notation to use for pods. Set to /netmask (e.g. /18) to have a range chosen with a specific netmask. Enables VPC-native string "" no
ip_range_services The IP range in CIDR notation to use for services. Set to /netmask (e.g. /21) to have a range chosen with a specific netmask. Enables VPC-native string "" no
jenkins_x_namespace Kubernetes namespace to install Jenkins X in string "jx" no
jx2 Is a Jenkins X 2 install bool true no
jx_bot_token Bot token used to interact with the Jenkins X cluster git repository string "" no
jx_bot_username Bot username used to interact with the Jenkins X cluster git repository string "" no
jx_git_operator_version The jx-git-operator helm chart version string "0.0.192" no
jx_git_url URL for the Jenkins X cluster git repository string "" no
kuberhealthy Enables Kuberhealthy helm installation bool true no
lets_encrypt_production Flag to determine whether or not to use the Let's Encrypt production server. bool true no
master_authorized_networks List of master authorized networks. If none are provided, disallow external access (except the cluster node IPs, which GKE automatically allowlists). list(object({ cidr_block = string, display_name = string })) [] no
master_ipv4_cidr_block The IP range in CIDR notation to use for the hosted master network. This range must not overlap with any other ranges in use within the cluster's network, and it must be a /28 subnet string "10.0.0.0/28" no
max_pods_per_node Max gke nodes = 2^($CIDR_RANGE_PER_NODE-$POD_NETWORK_CIDR) (see gke docs) number 64 no
node_disk_size Node disk size in GB string "100" no
node_disk_type Node disk type, either pd-standard or pd-ssd string "pd-standard" no
node_machine_type Node type for the Kubernetes cluster string "n1-standard-2" no
node_preemptible Use preemptible nodes bool false no
node_spot Use spot nodes bool false no
parent_domain Deprecated Please use apex_domain variable instead. string "" no
parent_domain_gcp_project Deprecated Please use apex_domain_gcp_project variable instead. string "" no
release_channel The GKE release channel to subscribe to. See https://cloud.google.com/kubernetes-engine/docs/concepts/release-channels string "REGULAR" no
resource_labels Set of labels to be applied to the cluster map(any) {} no
subdomain Optional sub domain for the installation string "" no
tls_email Email used by Let's Encrypt. Required for TLS when apex_domain is specified string "" no
vault_url URL to an external Vault instance in case Jenkins X shall not create its own system Vault string "" no
velero_namespace Kubernetes namespace for Velero string "velero" no
velero_schedule The Velero backup schedule in cron notation to be set in the Velero Schedule CRD (see default-backup.yaml) string "0 * * * *" no
velero_ttl The lifetime of a Velero backup to be set in the Velero Schedule CRD (see default-backup.yaml) string "720h0m0s" no
version_stream_ref The git ref for version stream to use when booting Jenkins X. See https://jenkins-x.io/docs/concepts/version-stream/ string "master" no
version_stream_url The URL for the version stream to use when booting Jenkins X. See https://jenkins-x.io/docs/concepts/version-stream/ string "https://github.com/jenkins-x/jenkins-x-versions.git" no
webhook Jenkins X webhook handler for git provider string "lighthouse" no
zone Zone in which to create the cluster (deprecated, use cluster_location instead) string "" no

Outputs

Name Description
backup_bucket_url The URL to the bucket for backup storage
cluster_location The location of the created Kubernetes cluster
cluster_name The name of the created Kubernetes cluster
connect The cluster connection string to use once Terraform apply finishes
externaldns_dns_name ExternalDNS name
externaldns_ns ExternalDNS nameservers
gcp_project The GCP project in which the resources got created
jx_requirements The jx-requirements rendered output
log_storage_url The URL to the bucket for log storage
report_storage_url The URL to the bucket for report storage
repository_storage_url The URL to the bucket for artifact storage
tekton_sa_email The Tekton service account email address, useful to provide further IAM bindings
tekton_sa_name The Tekton service account name, useful to provide further IAM bindings
vault_bucket_url The URL to the bucket for secret storage

Artifact Registry in setup with multiple Jenkins X clusters

In a multi-cluster setup, you should leave artifact_enable set to true only in a development cluster and set artifact_enable = false for the other clusters. A development cluster is one where application build pipelines are executed. If you have multiple development clusters, you can set artifact_repository_id to different values for them. Alternatively, you can have artifact_enable = true in one cluster and manually copy the values of cluster.registry and cluster.dockerRegistryOrg from jx-requirements.yml in that cluster repository to the other development cluster repositories.

If you leave artifact_enable as true for multiple clusters and don't override artifact_repository_id, Terraform will fail since it can't create an already existing repository.
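
For illustration, a sketch of such a setup (module names and project ids are placeholders):

# Development cluster: runs build pipelines and creates the Artifact Registry repository
module "jx_dev" {
  source      = "jenkins-x/jx/google"
  gcp_project = "<my-gcp-project-id>"
  # artifact_enable defaults to true
}

# Additional cluster: do not create another repository
module "jx_staging" {
  source          = "jenkins-x/jx/google"
  gcp_project     = "<my-gcp-project-id>"
  artifact_enable = false
}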

Migration from Container to Artifact Registry

Google has deprecated gcr.io and now recommends the use of Artifact Registry. The default of this module is now to create and use a repository in Artifact Registry for container images.

Google GKE clusters automatically have permission to download from the Artifact Registry. For multi-cluster setups across different projects, additional permission configuration may be necessary.
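
As an illustration only (the repository id and location below are the module defaults, and the member is a placeholder for the node service account of the consuming cluster), read access could be granted along these lines:

gcloud artifacts repositories add-iam-policy-binding oci \
  --project <my-gcp-project-id> \
  --location us-central1 \
  --member "serviceAccount:<node-service-account-of-other-cluster>" \
  --role roles/artifactregistry.reader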

Configuration Note

The jx-requirements.yml will be automatically updated by the Jenkins X boot job when triggered by a push to the main branch of the cluster repository.

Migration Options

Here are two strategies for transitioning container images from gcr.io to the Artifact Registry:

Don't Migrate Existing Images
  • Continue developing applications as usual. New images, upon their release, will be pushed to the Artifact Registry.
  • Important: Ensure that all builds are triggered and applications are promoted before Google completely shuts down the Container Registry. This step is critical to avoid disruptions in service. To identify which images from your Container Registry are currently used in your cluster, you can use the following command (replace project_id with your actual GCP project id):
kubectl get pods --all-namespaces -o jsonpath="{range .items[*].spec['initContainers', 'containers'][*]}{.image}{'\n'}{end}" | fgrep gcr.io/project_id | sort -u
Migrate Existing Images

If you have a large number of applications running that are unlikely to be released in the coming year, migration of images to artifact registry while retaining the image names (in the domain gcr.io) could be considered. This means that existing helm charts will continue to work.

This process is not supported by this Terraform module; instead you need to follow the steps outlined in the guide Set up repositories with gcr.io domain support. These steps include creating a repository in Artifact Registry, migrating images to it from Container Registry, and enabling redirection of gcr.io traffic.

If you keep the default settings for this module, it will create another artifact repository that will be used for new images. If you want to use the gcr.io artifact repository for new images, you should set artifact_enable = false.

Using a custom domain

If you want to use a custom domain with your Jenkins X installation, you need to provide values for the variables apex_domain and tls_email. apex_domain is the fully qualified domain name you want to use and tls_email is the email address you want to use for issuing Let's Encrypt TLS certificates.
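
For example (domain and email are placeholders):

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project = "<my-gcp-project-id>"
  apex_domain = "example.jenkins-x.rocks"
  tls_email   = "admin@example.jenkins-x.rocks"
}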

Before you apply the Terraform configuration, you also need to create a Cloud DNS managed zone, with the DNS name in the managed zone matching your custom domain name, for example in the case of example.jenkins-x.rocks as domain:
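
One way to create such a zone with gcloud (the zone name is illustrative):

gcloud dns managed-zones create jenkins-x-rocks \
  --dns-name "example.jenkins-x.rocks." \
  --description "Managed zone for Jenkins X" \
  --project <my-gcp-project-id>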

(Screenshot: Creating a Managed Zone)

When creating the managed zone, a set of DNS servers get created which you need to specify in the DNS settings of your DNS registrar.

(Screenshot: DNS settings of the Managed Zone)

It is essential that your DNS server settings have propagated and your domain resolves before you run jx boot. You can use DNS checker to verify whether your domain settings have propagated.

When a custom domain is provided, Jenkins X uses ExternalDNS together with cert-manager to create A record entries in your managed zone for the various exposed applications.

If apex_domain is not set, your cluster will use nip.io in order to create publicly resolvable URLs of the form http://<app-name>-<environment-name>.<cluster-ip>.nip.io.

Production cluster considerations

The configuration as seen in Cluster provisioning is not suited for creating and maintaining a production Jenkins X cluster. The following is a list of considerations for a production use case.

  • Specify the version attribute of the module, for example:

    module "jx" {
      source  = "jenkins-x/jx/google"
      version = "1.2.4"
      # insert your configuration
    }
    
    output "jx_requirements" {
     value = module.jx.jx_requirements
    }

     Specifying the version ensures that you are using a fixed version and that version upgrades cannot happen unintentionally.

  • Keep the Terraform configuration under version control, by creating a dedicated repository for your cluster configuration or by adding it to an already existing infrastructure repository.

  • Setup a Terraform backend to securely store and share the state of your cluster. For more information refer to Configuring a Terraform backend.

Configuring a Terraform backend

A "backend" in Terraform determines how state is loaded and how an operation such as apply is executed. By default, Terraform uses the local backend which keeps the state of the created resources on the local file system. This is problematic since sensitive information will be stored on disk and it is not possible to share state across a team. When working with Google Cloud a good choice for your Terraform backend is the gcs backend which stores the Terraform state in a Google Cloud Storage bucket. The examples directory of this repository contains configuration examples for using the gcs backed with and without optionally configured customer supplied encryption key.

To use the gcs backend you will need to create the bucket upfront. You can use gsutil to create the bucket:

gsutil mb gs://<my-bucket-name>/

It is also recommended to enable versioning on the bucket as an additional safety net in case of state corruption.

gsutil versioning set on gs://<my-bucket-name>

You can verify whether a bucket has versioning enabled via:

gsutil versioning get gs://<my-bucket-name>
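
Once the bucket exists, a minimal gcs backend configuration could look like this (bucket name and prefix are placeholders; see the examples directory for complete examples):

terraform {
  backend "gcs" {
    bucket = "<my-bucket-name>"
    prefix = "jx/state"
  }
}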

FAQ

How do I get the latest version of the terraform-google-jx module

terraform init -upgrade

How do I specify a specific google provider version

provider "google" {
  version = "~> 2.12.0"
  project = var.gcp_project
}

provider "google-beta" {
  version = "~> 2.12.0"
  project = var.gcp_project
}

Why do I need Application Default Credentials

The recommended way to authenticate to the Google Cloud API is by using a service account. This allows for authentication regardless of where your code runs. This Terraform module expects authentication via a service account key. You can either specify the path to this key directly using the GOOGLE_APPLICATION_CREDENTIALS environment variable or you can run gcloud auth application-default login. In the latter case gcloud obtains user access credentials via a web flow and puts them in the well-known location for Application Default Credentials (ADC), usually ~/.config/gcloud/application_default_credentials.json.

Development

Releasing

At the moment there is no release pipeline defined in jenkins-x.yml. A Terraform release does not require building an artifact; only a tag needs to be created and pushed. To make this task easier, there is a helper script release.sh which simplifies this process and creates the changelog as well:

./scripts/release.sh

This can be executed on demand whenever a release is required. For the script to work, the environment variable $GH_TOKEN must be exported and reference a valid GitHub API token.

How do I contribute

Contributions are very welcome! Check out the Contribution Guidelines for instructions.

terraform-google-jx's People

Contributors

abayer, adolfo, ankitm123, cagiti, garethjevans, haysclark, hferentschik, hrvolapeter, jenkins-x-bot, jenkins-x-bot-test, joostvdg, joshuasimon-taulia, jotka, jstrachan, khos2ow, kyounger, m1pl, mmn0o7, msvticket, patrickleet, paukul, peter-poki, rawlingsj, sergiogiuffrida, tgelpi, tomhobson


terraform-google-jx's Issues

Restrict the length of the google service accounts

At the moment, the Google service account ids are prefixed with the cluster name, but the maximum total length is 30 characters. This can easily lead to problems:

Error: "account_id" ("pr-6999-252-terraform-boot-vault-ko") doesn't match regexp "^[a-z](?:[-a-z0-9]{4,28}[a-z0-9])$"
  on ../terraform-google-**/modules/cluster/serviceaccount.tf line 28, in resource "google_service_account" "kaniko_sa":
  28: resource "google_service_account" "kaniko_sa" {

Let's try to ensure that the name is capped in case it is too long.
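
One possible approach (a sketch only, not the module's current implementation) is to truncate the cluster-name prefix before appending the service account suffix:

resource "google_service_account" "kaniko_sa" {
  project      = var.gcp_project
  # account_id must match ^[a-z](?:[-a-z0-9]{4,28}[a-z0-9])$, i.e. at most 30 characters,
  # so cap the cluster-name prefix before appending the suffix
  account_id   = "${substr(var.cluster_name, 0, 27)}-ko"
  display_name = "Kaniko service account"
}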

Have separate pipeline steps for cluster creation and shellspec tests

At the moment cluster creation and testing happens in the same pipeline step within the ci.sh script. When looking at the logs it would be nice to have separate steps for that.

The reason for having the tests in ci.sh is that the cluster destruction happens currently as well in ci.sh via a tap handler.

One solution would be to split ci.sh into two scripts.

zone is not copied into jx-requirements.yaml

When setting up a GKE cluster using this module (v1.4.0) with a non-default zone (europe-west1-b in my case), the terraform module does not copy it into the resulting jx-requirements.yml.

Therefore calling jx boot -r jx-requirements.yml afterwards fails since jx-requirements.yml points to zone us-central1-a.

Clarify getting started example in README

Currently, the example uses:

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project = var.gcp_project
}

That's confusing since using a variable for the GCP project would also force you to define a variables.tf. To simplify, the example should not use variables, but show how to use the property inlined.

Fail to apply module because of Error 403: The caller does not have permission, forbidden

I am logged in with the correct email on gcloud, but terraform apply fails with a series of Error 403 permission errors (screenshot omitted).

Steps to reproduce:

  • gcloud init -> choose email and project by id
  • gcloud auth application-default login
  • add main.tf ->
    module "jx" {
      source = "jenkins-x/jx/google"
      gcp_project = "[project-name]"
      cluster_name = "[cluster-name]"
      force_destroy = true
      version = "1.3.2"
      zone = "europe-west3-a"
    }
    
  • terraform init
  • terraform apply

Expected:
Finish apply process successfully

Terraform plan

When I execute terraform plan, there are 54 resources to be created. Are all those mandatory? Also, what are they used for? If terraform plan is also used as an "executable specification of requirements", we need to know what each of those resources is used for, so that we know what to create depending on the choices of the components we want to use with JX.

Allow specifying backup bucket location

It seems I cannot specify the location of the bucket for the backups: https://github.com/jenkins-x/terraform-google-jx/blob/master/modules/backup/main.tf#L7

I do not see an option. And when I plan I get the following:

# module.jx.module.backup.google_storage_bucket.backup_bucket will be created
  + resource "google_storage_bucket" "backup_bucket" {
      + bucket_policy_only = (known after apply)
      + force_destroy      = false
      + id                 = (known after apply)
      + location           = "US"
      + name               = (known after apply)
      + project            = (known after apply)
      + self_link          = (known after apply)
      + storage_class      = "STANDARD"
      + url                = (known after apply)
    }
      + location           = "US"

As an EU citizen, I generally want to store my data in the EU, or at least have the option to. See: https://cloud.google.com/storage/docs/locations

jx-requirements.yaml output for devEnvApprovers is malformed

After successfully creating a cluster as per the README, the resulting jx-requirements.yaml file does not have a properly formed empty value for devEnvApprovers.

cluster:
  clusterName: "dks-cjxd8-terra"
  devEnvApprovers: 

  environmentGitOwner: ""
  project: "apps-dev-229310"
  provider: gke
  zone: "us-central1-a"

Remove release pipeline from jenkins-x.yml

To better control semantic versioning, let's, for now, remove the release pipeline config and do only manual tags. For Terraform module releases a tag is all that is needed.

With the current release pipeline, we will always get a patch version increment, which is not what you want if you want to increase the minor or major version.

Generate terraform output for jx-requirements rather than a local file

The intention of the jx-requirements.yaml file is that it's only ever read at the initial setup of Jenkins X on a new cluster. With that said, producing the local_file resource means that terraform constantly manages that resource and wants to update a file which no longer has as much relevance to the cluster as it did.
The proposal is to define the jx-requirements.yaml as an output that could possibly be used to perform the initial boot (after the terraform module has created all of the resources) like:

jx boot -r <(terraform output jx-requirements)

Having the Jenkins X requirements available in this manner removes the pain of having to maintain the local file within terraform and the git repository.

Add tests for conditional logic within Terraform scripts

Certain flags enable conditional resources. We should be able to test that these conditionals work by using shellspec and the terraform plan. Looking at the plan should be enough to assert that certain resources will or will not be created.

jx-requirements.yml out of sync

Initially the tf module generates the jx-requirements.yml which is meant to be provided to jx boot -r.
However jx boot creates a new environment git repository that contains jx-requirements.yml as well.
This is potentially inconsistent because changes made using subsequent tf module executions do not make it into the environment repository.

An approach to solve this could be to let the tf module write the information contained within jx-requirements.yml into the k8s cluster so that jx boot could read it from there more consistently.

Allow to provide external Vault configuration

The latest jx version allows using an internal (using the Vault operator) or external Vault instance for secret storage. The Terraform module should allow configuring either of these two options.

Wrong path to jx-requirements.yaml.tpl when using Terraform registry as source

main.tf:

module "jx" {
  source  = "jenkins-x/jx/google"
  version = "1.2.1"
  gcp_project = "foo"
}
$ terraform init 
$ terraform plan
Error: Error in function call

  on .terraform/modules/jx/terraform-google-jx-1.2.1/main.tf line 186, in resource "local_file" "jx-requirements":
 186:   content = templatefile("${path.cwd}/modules/jx-requirements.yaml.tpl", {

Error applying IAM policy for service account after running terraform apply for creating GKE cluster

After running terraform apply using terraform file

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project = "<my-gcp-project-id>"
}

At the end of process I get:

Error: Error applying IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': Error setting IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': googleapi: Error 400: Identity namespace does not exist (x-project-275408.svc.id.goog)., badRequest

on .terraform/modules/jx/terraform-google-jx-1.2.5/modules/backup/main.tf line 44, in resource "google_service_account_iam_member" "velero_sa_workload_identity_user":
44: resource "google_service_account_iam_member" "velero_sa_workload_identity_user" {

Error: Error applying IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': Error setting IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': googleapi: Error 400: Identity namespace does not exist (x-project-275408.svc.id.goog)., badRequest

on .terraform/modules/jx/terraform-google-jx-1.2.5/modules/cluster/serviceaccount.tf line 73, in resource "google_service_account_iam_member" "build_controller_sa_workload_identity_user":
73: resource "google_service_account_iam_member" "build_controller_sa_workload_identity_user" {

Error: Error applying IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': Error setting IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': googleapi: Error 400: Identity namespace does not exist (x-project-275408.svc.id.goog)., badRequest

on .terraform/modules/jx/terraform-google-jx-1.2.5/modules/cluster/serviceaccount.tf line 124, in resource "google_service_account_iam_member" "kaniko_sa_workload_identity_user":
124: resource "google_service_account_iam_member" "kaniko_sa_workload_identity_user" {

Error: Error applying IAM policy for service account 'projects/x-project-275408/serviceAccounts/tf-jx-arriving-frog-tekton@x-project-275408.iam.gserviceaccount.com': Error setting IAM policy for service account 'projects/x-project-275408/serviceAccounts/tf-jx-arriving-frog-tekton@x-project-275408.iam.gserviceaccount.com': googleapi: Error 400: Identity namespace does not exist (x-project-275408.svc.id.goog)., badRequest

on .terraform/modules/jx/terraform-google-jx-1.2.5/modules/cluster/serviceaccount.tf line 133, in resource "google_service_account_iam_member" "tekton_sa_workload_identity_user":
133: resource "google_service_account_iam_member" "tekton_sa_workload_identity_user" {

Creating namespaces seems like scope creep for this module

resource "kubernetes_namespace" "cert-manager" {
  metadata {
    name = var.cert-manager-namespace
  }
  lifecycle {
    ignore_changes = [
      metadata[0].labels,
      metadata[0].annotations,
    ]
  }
}

This module creates namespaces (jx, cert-manager, velero), which seems to be outside the scope of this module's intent. From past experience it becomes difficult when mixing resource creation where resources have differing lifecycles and where more than one agent is responsible for configuring said resources (this can cause conflicts).

Use Terraform variable validation for checking TLS email

The tls_email variable needs to be conditionally set if custom domains are enabled. Right now, when you set parent_domain, there is no validation which ensures that you also set tls_email. Failing to set this variable will lead to issues later during jx boot.

Using Terraform's variable validation - https://www.terraform.io/docs/configuration/variables.html - this conditional dependency could be expressed. On the downside, variable validation is at the moment still an experimental feature.
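
In the Terraform versions targeted here, a validation block can only reference the variable it belongs to, so the dependency on parent_domain would still need another mechanism; a sketch of what is possible today, a simple format check:

variable "tls_email" {
  description = "Email used by Let's Encrypt. Required for TLS when apex_domain is specified"
  type        = string
  default     = ""

  validation {
    # allow empty (no custom domain) or anything that at least looks like an email address
    condition     = var.tls_email == "" || can(regex("@", var.tls_email))
    error_message = "The tls_email value must be a valid email address."
  }
}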

Make creation of Velero resources optional

At the moment, the backup configuration (backup storage bucket, namespace, service account, etc.) is always created. Given that this is not fully tested/supported yet, it makes sense to only create these resources conditionally. One needs to explicitly opt in.

On the downside, it means that if you want to enable it at a later stage, you can create the missing resources by enabling the feature and re-running terraform apply, but then you will have to manually sync the required jx-requirements.yml changes into the jx-requirements.yml of your dev repository.

Include disabled ingress section for jx-requirements.yaml

Even if a parent_domain parameter is not specified in a main.tf file, include a disabled ingress section in the jx-requirements.yaml output. This will better allow for the subsequent ability to supply DNS and/or TLS values prior to running jx boot.

Remove intermediate jx module directory

The current repo layout looks like this:

├── jx
│   ├── main.tf
│   ├── modules
│   │   ├── backup
│   │   ├── cluster
│   │   ├── dns
│   │   └── vault
│   ├── output.tf
│   └── variables.tf
├── jx-requirements.yaml
├── jx-requirements.yaml.tpl
├── main.tf
├── output.tf
└── variables.tf

The intermediate jx directory seems superfluous and should be removed. This means we are effectively merging main.tf of the root level with the one in jx.

As a side effect, we have to duplicate/copy fewer variables and outputs across module boundaries.

Allow to specify cluster resource labels as variables

To allow to apply custom labels to the created cluster, it makes sense to allow specifying cluster resource labels as input variables to the main Terraform script.

They belong to the google_container_cluster resource and look like this:

        resource_labels  = {
            "env"       = "staging"
            "managedby" = "terraform"
        }
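
With such an input variable, the labels could then be passed through to the module, for example (a sketch using the resource_labels input listed in the Inputs section above):

module "jx" {
  source      = "jenkins-x/jx/google"
  gcp_project = "<my-gcp-project-id>"

  resource_labels = {
    "env"       = "staging"
    "managedby" = "terraform"
  }
}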

Conditionally create the managed zone for DNS

Currently, we are not creating the managed zone with Terraform, but require it to be created upfront when using a custom domain. There are two reasons for that:

  1. If Terraform creates the managed zone and Jenkins X gets installed, new recordsets are created in the zone. This prevents a clean destroy, since you first have to manually delete all records.
  2. More importantly, if one creates the managed zone as part of the script, one gets a set of nameservers as output which one has to manually set with the DNS registrar. You then have to wait until the DNS settings have propagated before running jx boot.

Making the managed zone creation optional might be useful in some cases where you want to manage as much as possible via Terraform.

Not all input parameters are used

Using min_node_count and max_node_count

Using the min_node_count and max_node_count inputs like this:
module "jx" {
  source  = "jenkins-x/jx/google"
  gcp_project = ...

  cluster_name = ...
  git_owner_requirement_repos = "hemp0r"
  min_node_count = 1
  max_node_count = 2
  dev_env_approvers = ["hemp0r"]
  zone = "europe-west3"
  
  force_destroy = true 
}

leads to

+ resource "google_container_node_pool" "jx_node_pool" {
      + cluster             = ...
      + id                  = (known after apply)
      + initial_node_count  = 3
      + instance_group_urls = (known after apply)
      + location            = "europe-west3"
      + max_pods_per_node   = (known after apply)
      + name                = "autoscale-pool"
      + name_prefix         = (known after apply)
      + node_count          = (known after apply)
      + project             = (known after apply)
      + region              = (known after apply)
      + version             = (known after apply)
      + zone                = (known after apply)

      + autoscaling {
          + max_node_count = 5
          + min_node_count = 3
        }

      + management {
          + auto_repair  = true
          + auto_upgrade = false
        }

      + node_config {
          + disk_size_gb      = 100
          + disk_type         = (known after apply)
          + guest_accelerator = (known after apply)
          + image_type        = (known after apply)
          + labels            = (known after apply)
          + local_ssd_count   = (known after apply)
          + machine_type      = "n1-standard-2"
          + metadata          = (known after apply)
          + oauth_scopes      = [
              + "https://www.googleapis.com/auth/cloud-platform",
              + "https://www.googleapis.com/auth/compute",
              + "https://www.googleapis.com/auth/devstorage.full_control",
              + "https://www.googleapis.com/auth/logging.write",
              + "https://www.googleapis.com/auth/monitoring",
              + "https://www.googleapis.com/auth/service.management",
              + "https://www.googleapis.com/auth/servicecontrol",
            ]
          + preemptible       = false
          + service_account   = (known after apply)

          + workload_metadata_config {
              + node_metadata = "GKE_METADATA_SERVER"
            }
        }
    }

Error applying IAM policy

I'm getting the following error message.

Error: Error applying IAM policy for service account 'projects/jx-20200617185112/serviceAccounts/[email protected]': Error setting IAM policy for service account 'projects/jx-20200617185112/serviceAccounts/[email protected]': googleapi: Error 400: Identity namespace does not exist (jx-20200617185112.svc.id.goog)., badRequest

I tried it twice on a fresh Google project with the same result. However, if I re-run terraform apply again, it works. I'm guessing it's some kind of race condition where it tries to create resources before their dependencies are created.

Make the cluster regional

The zone field should be changed to location. Depending on the value, GKE will be zonal (e.g., us-east1-b) or regional (us-east1).

Make sure to update the docs to clarify that the initial number of nodes is multiplied by the number of zones. If, for example, a cluster is regional, an initial node count of 1 would result in 3 nodes (one in each of the three zones).
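
In the current module this maps to the cluster_location input, for example (a sketch, values are placeholders):

module "jx" {
  source      = "jenkins-x/jx/google"
  gcp_project = "<my-gcp-project-id>"

  # a region (rather than a zone) makes the cluster regional;
  # the node count then applies per zone, so 1 becomes 3 nodes in a three-zone region
  cluster_location = "us-east1"
}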

Provide scripts/documentation on how to manage state

We need to decide how we want to deal with state management. At the very least we need to document the importance of remote state storage with some references on how to do it. Even better might be to provide examples.

Find better alternative for local release note generation

The helper script release.sh uses:

jx step changelog -v $version -p $prev_tag_base -r $current_tag_base --generate-yaml=false --no-dev-release --update-release=false

This works; however, this step is supposed to run in-cluster and prints a lot of warnings when running locally, partly because it will always use the pipeline's user credentials.
Also, due to the credential issue, the changelog is just written to the console and one needs to manually copy and paste it.

A short jq script should do the trick.

Terraform apply breaks when service accounts have been deleted and restored

Detailed by the issue here: hashicorp/terraform-provider-google#4276

Get bad request errors like so:

  on jx/modules/cluster/serviceaccount.tf line 60, in resource "google_project_iam_member" "tekton_sa_project_viewer_binding":
  60: resource "google_project_iam_member" "tekton_sa_project_viewer_binding" {

The resolution is to modify jx/main.tf and change:

provider "google" {
  version = "~> 3.10.0"
  project = var.gcp_project
  zone    = var.zone
}

provider "google-beta" {
  version = "~> 3.10.0"
  project = var.gcp_project
  zone    = var.zone
}

to

provider "google" {
  version = "~> 2.12.0"
  project = var.gcp_project
  zone    = var.zone
}

provider "google-beta" {
  version = "~> 2.12.0"
  project = var.gcp_project
  zone    = var.zone
}

Empty tuple

terraform destroy produced the following output.

...
module.jx.module.cluster.google_container_cluster.jx_cluster: Destruction complete after 2m47s

Error: Error trying to delete bucket vault-jx-demo-53af71c741fa containing objects without `force_destroy` set to true



Error: Error trying to delete bucket logs-jx-demo-53af71c741fa containing objects without `force_destroy` set to true

That part is OK.

After that, I added force_destroy = true to my Terraform definition and re-ran terraform destroy. It produced the following error.

...

Error: Error trying to delete bucket vault-jx-demo-53af71c741fa containing objects without `force_destroy` set to true



Error: Error trying to delete bucket logs-jx-demo-53af71c741fa containing objects without `force_destroy` set to true



Error: Invalid index

  on .terraform/modules/jx/terraform-google-jx-1.4.0/modules/cluster/outputs.tf line 22, in output "report_storage_url":
  22:     value = google_storage_bucket.report_bucket[0].url
    |----------------
    | google_storage_bucket.report_bucket is empty tuple

The given key does not identify an element in this collection value.


Error: Invalid index

  on .terraform/modules/jx/terraform-google-jx-1.4.0/modules/cluster/outputs.tf line 26, in output "repository_storage_url":
  26:     value = google_storage_bucket.repository_bucket[0].url
    |----------------
    | google_storage_bucket.repository_bucket is empty tuple

The given key does not identify an element in this collection value.

local terraform provider version scope too narrow

We've experienced an issue within our repository where we use a newer 1.4.0 version of the local terraform provider. Due to changes in the minor version bumps, the current constraint of ~> 1.2.0 for the local provider corrupts our terraform state by adding content_base64 = null, resulting in the following error:

Error: unsupported attribute "content_base64"

Going forward I think all provider scopes in this module need to have their version constraints relaxed (whilst retaining the minimum version constraint), thus allowing the consuming project to determine which version to use.
https://www.terraform.io/docs/configuration/modules.html#provider-version-constraints-in-modules

Namespace "cert-manager" already exists

When the tf module is run without parent_domain configured, tf does not create the cert-manager namespace. However, if one decides to configure parent_domain and reapply the module afterwards, it fails with the following error since the namespace already exists:

Error: namespaces "cert-manager" already exists
  on .terraform/modules/jx/modules/dns/main.tf line 106, in resource "kubernetes_namespace" "cert-manager":
 106: resource "kubernetes_namespace" "cert-manager" {

As a fix, the cert-manager namespace could always be created by tf.

'Error: google: could not find default credentials' when specifying a credential file

module version: 1.4.0
google provider version: 3.25.0
Terraform v0.12.24
OS Ubuntu 20.04
Linux Kernel: 5.4

When specifying the service account credentials file like this:

provider "google" {
  credentials = file("secret.json")
  project = "my-project"
  zone = "europe-west1-b"
}

terraform plan fails with:

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.


Error: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.

  on .terraform/modules/automation-cluster/terraform-google-jx-1.4.0/main.tf line 13, in provider "google":
  13: provider "google" {

The very same configuration works, e.g., with (from the GCP website):

// Terraform plugin for creating random ids
resource "random_id" "instance_id" {
 byte_length = 8
}

// A single Google Cloud Engine instance
resource "google_compute_instance" "default" {
 name         = "flask-vm-${random_id.instance_id.hex}"
 machine_type = "f1-micro"
 zone         = "us-west1-a"

 boot_disk {
   initialize_params {
     image = "debian-cloud/debian-9"
   }
 }

// Make sure flask is installed on all new instances for later steps
 metadata_startup_script = "sudo apt-get update; sudo apt-get install -yq build-essential python-pip rsync; pip install flask"

 network_interface {
   network = "default"

   access_config {
     // Include this section to give the VM an external ip address
   }
 }
}

Workaround: Remove the credentials variable from the configuration and set the GOOGLE_CLOUD_KEYFILE_JSON environment variable.

Kubernetes Engine API not enabled

I'm getting the following error when running the module against a newly created Google project.

...
Error: googleapi: Error 403: Kubernetes Engine API is not enabled for this project. Please ensure it is enabled in Google Cloud Console and try again: visit https://console.cloud.google.com/apis/api/container.googleapis.com/overview?project=jx-20200617144145 to do so., forbidden

  on .terraform/modules/jx/terraform-google-jx-1.4.0/modules/cluster/main.tf line 6, in resource "google_container_cluster" "jx_cluster":
   6: resource "google_container_cluster" "jx_cluster" {
...

I believe that the module is already enabling quite a few APIs so it probably makes sense to enable that one through the module as well. Otherwise, maybe we should add a note to the docs that the "Kubernetes Engine API" should be enabled (e.g., gcloud services enable container.googleapis.com --project $PROJECT_ID).
