Jenkins X GKE Module

Terraform Version

NOTE: While the required minimum Terraform version is 0.12.0, automated CI tests are performed with 0.13 only. The only expected compatibility issues to be aware of are around provider requirements. For more information see here


This repo contains a Terraform module for provisioning a Kubernetes cluster for Jenkins X on Google Cloud.

What is a Terraform module

A Terraform "module" refers to a self-contained package of Terraform configurations that are managed as a group. For more information around modules refer to the Terraform documentation.

How do you use this module

Prerequisites

To make use of this module, you need a Google Cloud project. Instructions on how to set up such a project can be found in the Google Cloud Installation and Setup guide. You need your Google Cloud project id as an input variable for using this module.

You also need to install the Cloud SDK, in particular gcloud. You can find instructions on how to install and authenticate in the Google Cloud Installation and Setup guide as well.

Once you have gcloud installed, you need to create Application Default Credentials by running:

gcloud auth application-default login

Alternatively, you can export the environment variable GOOGLE_APPLICATION_CREDENTIALS referencing the path to a Google Cloud service account key file.
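
For example (the key file path below is a placeholder for your own key):

export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json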

Last but not least, ensure you have the following binaries installed:

  • gcloud
  • kubectl ~> 1.14.0
    • kubectl comes bundled with the Cloud SDK
  • terraform ~> 0.12.0
    • Terraform installation instructions can be found here

Cluster provisioning

A default Jenkins X ready cluster can be provisioned by creating a file main.tf in an empty directory with the following content:

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project = "<my-gcp-project-id>"
}

output "jx_requirements" {
  value = module.jx.jx_requirements
}  

You can then apply this Terraform configuration via:

terraform init
terraform apply

This creates a cluster within the specified Google Cloud project with all possible configuration options defaulted.

⚠️ Note: This example is for getting up and running quickly. It is not intended for a production cluster. Refer to Production cluster considerations for things to consider when creating a production cluster.

On completion of terraform apply, a jx_requirements output is available which can be used as input to jx boot. Refer to Running jx boot for more information.
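
For example, the rendered requirements can be written to a file and passed to jx boot (a minimal sketch):

terraform output jx_requirements > jx-requirements.yml
jx boot -r jx-requirements.yml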

In the default configuration, no custom domain is used. DNS resolution occurs via nip.io. For more information on how to configure and use a custom domain, refer to Using a custom domain.

If you just want to experiment with Jenkins X, you can set force_destroy to true. This allows you to remove all generated resources when running terraform destroy, including any generated buckets and their contents.

If you want to remove a cluster with the terraform destroy command and the cluster is protected by the deletion_protection=true attribute, you can override the attribute by setting the delete_protect variable to false. It is recommended to override this value only at the time of cluster deletion, and you should successfully apply the attribute change before attempting the terraform destroy command.
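
For example, a sketch of the teardown flow (the project id is a placeholder): set delete_protect to false in the module block, apply the change, then destroy:

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project    = "<my-gcp-project-id>"
  delete_protect = false
}

terraform apply
terraform destroy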

The following two sections provide the full list of input and output variables of this Terraform module.

Inputs

Name Description Type Default Required
apex_domain The parent / apex domain to be used for the cluster string "" no
apex_domain_gcp_project The GCP project the apex domain is managed by, used to write recordsets for a subdomain if set. Defaults to current project. string "" no
apex_domain_integration_enabled Flag that when set attempts to create delegation records in apex domain to point to domain created by this module bool true no
artifact_description Artifact Registry repository description string "jenkins-x Docker Repository" no
artifact_enable Create an Artifact Registry repository bool true no
artifact_location Artifact Registry repository location string "us-central1" no
artifact_repository_id Artifact Registry repository name string "oci" no
autoscaler_location_policy location policy for primary node pool string "ANY" no
autoscaler_max_node_count primary node pool max nodes number 5 no
autoscaler_min_node_count primary node pool min nodes number 3 no
bucket_location Bucket location for storage string "US" no
cluster_location The location (region or zone) in which the cluster master will be created. If you specify a zone (such as us-central1-a), the cluster will be a zonal cluster with a single cluster master. If you specify a region (such as us-west1), the cluster will be a regional cluster with multiple masters spread across zones in the region string "us-central1-a" no
cluster_name Name of the Kubernetes cluster to create string "" no
cluster_network The name of the network (VPC) to which the cluster is connected string "default" no
cluster_subnetwork The name of the subnetwork to which the cluster is connected. Leave blank when using the 'default' vpc to generate a subnet for your cluster string "" no
create_ui_sa Whether the service accounts for the UI should be created bool true no
delete_protect Flag used to set the deletion_protection attribute to prevent cluster deletion bool true no
dev_env_approvers List of git users allowed to approve pull requests for the dev environment repository list(string) [] no
enable_backup Whether or not Velero backups should be enabled bool false no
enable_primary_node_pool create a node pool for primary nodes if disabled you must create your own pool bool true no
enable_private_endpoint (Beta) Whether the master's internal IP address is used as the cluster endpoint. Requires VPC-native bool false no
enable_private_nodes (Beta) Whether nodes have internal IP addresses only. Requires VPC-native bool false no
force_destroy Flag to determine whether storage buckets get forcefully destroyed bool false no
gcp_project The name of the GCP project to use string n/a yes
git_owner_requirement_repos The git id of the owner for the requirement repositories string "" no
gsm Enables Google Secrets Manager, not available with JX2 bool false no
initial_cluster_node_count Initial number of cluster nodes number 3 no
initial_primary_node_pool_node_count Initial primary node pool nodes number 3 no
ip_range_pods The IP range in CIDR notation to use for pods. Set to /netmask (e.g. /18) to have a range chosen with a specific netmask. Enables VPC-native string "" no
ip_range_services The IP range in CIDR notation to use for services. Set to /netmask (e.g. /21) to have a range chosen with a specific netmask. Enables VPC-native string "" no
jenkins_x_namespace Kubernetes namespace to install Jenkins X in string "jx" no
jx2 Is a Jenkins X 2 install bool true no
jx_bot_token Bot token used to interact with the Jenkins X cluster git repository string "" no
jx_bot_username Bot username used to interact with the Jenkins X cluster git repository string "" no
jx_git_operator_version The jx-git-operator helm chart version string "0.0.192" no
jx_git_url URL for the Jenkins X cluster git repository string "" no
kuberhealthy Enables Kuberhealthy helm installation bool true no
lets_encrypt_production Flag to determine whether or not to use the Let's Encrypt production server. bool true no
master_authorized_networks List of master authorized networks. If none are provided, disallow external access (except the cluster node IPs, which GKE automatically allowlists). list(object({ cidr_block = string, display_name = string })) [] no
master_ipv4_cidr_block The IP range in CIDR notation to use for the hosted master network. This range must not overlap with any other ranges in use within the cluster's network, and it must be a /28 subnet string "10.0.0.0/28" no
max_pods_per_node Max gke nodes = 2^($CIDR_RANGE_PER_NODE-$POD_NETWORK_CIDR) (see gke docs) number 64 no
node_disk_size Node disk size in GB string "100" no
node_disk_type Node disk type, either pd-standard or pd-ssd string "pd-standard" no
node_machine_type Node type for the Kubernetes cluster string "n1-standard-2" no
node_preemptible Use preemptible nodes bool false no
node_spot Use spot nodes bool false no
parent_domain Deprecated Please use apex_domain variable instead. string "" no
parent_domain_gcp_project Deprecated Please use apex_domain_gcp_project variable instead. string "" no
release_channel The GKE release channel to subscribe to. See https://cloud.google.com/kubernetes-engine/docs/concepts/release-channels string "REGULAR" no
resource_labels Set of labels to be applied to the cluster map(any) {} no
subdomain Optional sub domain for the installation string "" no
tls_email Email used by Let's Encrypt. Required for TLS when apex_domain is specified string "" no
vault_url URL to an external Vault instance in case Jenkins X shall not create its own system Vault string "" no
velero_namespace Kubernetes namespace for Velero string "velero" no
velero_schedule The Velero backup schedule in cron notation to be set in the Velero Schedule CRD (see default-backup.yaml) string "0 * * * *" no
velero_ttl The lifetime of a Velero backup to be set in the Velero Schedule CRD (see default-backup.yaml) string "720h0m0s" no
version_stream_ref The git ref for version stream to use when booting Jenkins X. See https://jenkins-x.io/docs/concepts/version-stream/ string "master" no
version_stream_url The URL for the version stream to use when booting Jenkins X. See https://jenkins-x.io/docs/concepts/version-stream/ string "https://github.com/jenkins-x/jenkins-x-versions.git" no
webhook Jenkins X webhook handler for git provider string "lighthouse" no
zone Zone in which to create the cluster (deprecated, use cluster_location instead) string "" no

Outputs

Name Description
backup_bucket_url The URL to the bucket for backup storage
cluster_location The location of the created Kubernetes cluster
cluster_name The name of the created Kubernetes cluster
connect The cluster connection string to use once Terraform apply finishes
externaldns_dns_name ExternalDNS name
externaldns_ns ExternalDNS nameservers
gcp_project The GCP project in which the resources got created
jx_requirements The jx-requirements rendered output
log_storage_url The URL to the bucket for log storage
report_storage_url The URL to the bucket for report storage
repository_storage_url The URL to the bucket for artifact storage
tekton_sa_email The Tekton service account email address, useful to provide further IAM bindings
tekton_sa_name The Tekton service account name, useful to provide further IAM bindings
vault_bucket_url The URL to the bucket for secret storage

Artifact Registry in setup with multiple Jenkins X clusters

In a multi-cluster setup, you should leave artifact_enable set to true only in a development cluster and set artifact_enable = false for the other clusters. A development cluster is one where application build pipelines are executed. If you have multiple development clusters, you can set artifact_repository_id to different values for them. Alternatively, you can have artifact_enable = true in one cluster and manually copy the values of cluster.registry and cluster.dockerRegistryOrg from jx-requirements.yml in that cluster repository to the other development cluster repositories.

If you leave artifact_enable as true for multiple clusters and don't override artifact_repository_id, Terraform will fail since it can't create an already existing repository.
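
For illustration, a sketch of such a setup (module names and project ids are placeholders):

# Development cluster: runs build pipelines and creates the Artifact Registry repository
module "jx_dev" {
  source      = "jenkins-x/jx/google"
  gcp_project = "<my-gcp-project-id>"
  # artifact_enable defaults to true
}

# Additional cluster: do not create another repository
module "jx_staging" {
  source          = "jenkins-x/jx/google"
  gcp_project     = "<my-gcp-project-id>"
  artifact_enable = false
}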

Migration from Container to Artifact Registry

Google has deprecated gcr.io and now recommends the use of Artifact Registry. The default of this module is now to create and use a repository in Artifact Registry for container images.

Google GKE clusters automatically have permission to download from the Artifact Registry. For multi-cluster setups across different projects, additional permission configuration may be necessary.
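
As an illustration only (the repository id and location below are the module defaults, and the member is a placeholder for the node service account of the consuming cluster), read access could be granted along these lines:

gcloud artifacts repositories add-iam-policy-binding oci \
  --project <my-gcp-project-id> \
  --location us-central1 \
  --member "serviceAccount:<node-service-account-of-other-cluster>" \
  --role roles/artifactregistry.reader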

Configuration Note

The jx-requirements.yml will be automatically updated by the Jenkins X boot job when triggered by a push to the main branch of the cluster repository.

Migration Options

Here are two strategies for transitioning container images from gcr.io to the Artifact Registry:

Don't Migrate Existing Images
  • Continue developing applications as usual. New images, upon their release, will be pushed to the Artifact Registry.
  • Important: Ensure that all builds are triggered and applications are promoted before Google completely shuts down the Container Registry. This step is critical to avoid disruptions in service. To identify which images from your Container Registry are currently used in your cluster, you can use the following command (replace project_id with your actual GCP project id):
kubectl get pods --all-namespaces -o jsonpath="{range .items[*].spec['initContainers', 'containers'][*]}{.image}{'\n'}{end}" | fgrep gcr.io/project_id | sort -u
Migrate Existing Images

If you have a large number of applications running that are unlikely to be released in the coming year, migration of images to artifact registry while retaining the image names (in the domain gcr.io) could be considered. This means that existing helm charts will continue to work.

This process is not supported by this Terraform module; instead you need to follow the steps outlined in the guide Set up repositories with gcr.io domain support. These steps include creating a repository in Artifact Registry, migrating images to it from Container Registry, and enabling redirection of gcr.io traffic.

If you keep the default settings for this module, it will create another artifact repository that will be used for new images. If you want to use the gcr.io artifact repository for new images, you should set artifact_enable = false.

Using a custom domain

If you want to use a custom domain with your Jenkins X installation, you need to provide values for the variables apex_domain and tls_email. apex_domain is the fully qualified domain name you want to use and tls_email is the email address you want to use for issuing Let's Encrypt TLS certificates.
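
For example (domain and email are placeholders):

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project = "<my-gcp-project-id>"
  apex_domain = "example.jenkins-x.rocks"
  tls_email   = "admin@example.jenkins-x.rocks"
}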

Before you apply the Terraform configuration, you also need to create a Cloud DNS managed zone, with the DNS name in the managed zone matching your custom domain name, for example in the case of example.jenkins-x.rocks as domain:
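
One way to create such a zone with gcloud (the zone name is illustrative):

gcloud dns managed-zones create jenkins-x-rocks \
  --dns-name "example.jenkins-x.rocks." \
  --description "Managed zone for Jenkins X" \
  --project <my-gcp-project-id>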

(Screenshot: Creating a Managed Zone)

When creating the managed zone, a set of DNS servers get created which you need to specify in the DNS settings of your DNS registrar.

(Screenshot: DNS settings of the Managed Zone)

It is essential that your DNS server settings have propagated and your domain resolves before you run jx boot. You can use DNS checker to verify whether your domain settings have propagated.

When a custom domain is provided, Jenkins X uses ExternalDNS together with cert-manager to create A record entries in your managed zone for the various exposed applications.

If apex_domain is not set, your cluster will use nip.io in order to create publicly resolvable URLs of the form http://<app-name>-<environment-name>.<cluster-ip>.nip.io.

Production cluster considerations

The configuration as seen in Cluster provisioning is not suited for creating and maintaining a production Jenkins X cluster. The following is a list of considerations for a production use case.

  • Specify the version attribute of the module, for example:

    module "jx" {
      source  = "jenkins-x/jx/google"
      version = "1.2.4"
      # insert your configuration
    }
    
    output "jx_requirements" {
     value = module.jx.jx_requirements
    }

     Specifying the version ensures that you are using a fixed version and that version upgrades cannot happen unintentionally.

  • Keep the Terraform configuration under version control, by creating a dedicated repository for your cluster configuration or by adding it to an already existing infrastructure repository.

  • Setup a Terraform backend to securely store and share the state of your cluster. For more information refer to Configuring a Terraform backend.

Configuring a Terraform backend

A "backend" in Terraform determines how state is loaded and how an operation such as apply is executed. By default, Terraform uses the local backend which keeps the state of the created resources on the local file system. This is problematic since sensitive information will be stored on disk and it is not possible to share state across a team. When working with Google Cloud a good choice for your Terraform backend is the gcs backend which stores the Terraform state in a Google Cloud Storage bucket. The examples directory of this repository contains configuration examples for using the gcs backed with and without optionally configured customer supplied encryption key.

To use the gcs backend you will need to create the bucket upfront. You can use gsutil to create the bucket:

gsutil mb gs://<my-bucket-name>/

It is also recommended to enable versioning on the bucket as an additional safety net in case of state corruption.

gsutil versioning set on gs://<my-bucket-name>

You can verify whether a bucket has versioning enabled via:

gsutil versioning get gs://<my-bucket-name>
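
Once the bucket exists, a minimal gcs backend configuration could look like this (bucket name and prefix are placeholders; see the examples directory for complete examples):

terraform {
  backend "gcs" {
    bucket = "<my-bucket-name>"
    prefix = "jx/state"
  }
}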

FAQ

How do I get the latest version of the terraform-google-jx module

terraform init -upgrade

How do I specify a specific google provider version

provider "google" {
  version = "~> 2.12.0"
  project = var.gcp_project
}

provider "google-beta" {
  version = "~> 2.12.0"
  project = var.gcp_project
}

Why do I need Application Default Credentials

The recommended way to authenticate to the Google Cloud API is by using a service account. This allows for authentication regardless of where your code runs. This Terraform module expects authentication via a service account key. You can either specify the path to this key directly using the GOOGLE_APPLICATION_CREDENTIALS environment variable or you can run gcloud auth application-default login. In the latter case gcloud obtains user access credentials via a web flow and puts them in the well-known location for Application Default Credentials (ADC), usually ~/.config/gcloud/application_default_credentials.json.

Development

Releasing

At the moment there is no release pipeline defined in jenkins-x.yml. A Terraform release does not require building an artifact; only a tag needs to be created and pushed. To make this task easier, there is a helper script release.sh which simplifies this process and creates the changelog as well:

./scripts/release.sh

This can be executed on demand whenever a release is required. For the script to work, the environment variable $GH_TOKEN must be exported and reference a valid GitHub API token.

How do I contribute

Contributions are very welcome! Check out the Contribution Guidelines for instructions.

terraform-google-jx's People

Contributors

abayer, adolfo, ankitm123, cagiti, garethjevans, haysclark, hferentschik, hrvolapeter, jenkins-x-bot, jenkins-x-bot-test, joostvdg, joshuasimon-taulia, jotka, jstrachan, khos2ow, kyounger, m1pl, mmn0o7, msvticket, patrickleet, paukul, peter-poki, rawlingsj, sergiogiuffrida, tgelpi, tomhobson


terraform-google-jx's Issues

Restrict the length of the google service accounts

At the moment, the Google service account ids are prefixed with the cluster name, but the maximum total length is 30 characters. This can easily lead to problems:

Error: "account_id" ("pr-6999-252-terraform-boot-vault-ko") doesn't match regexp "^[a-z](?:[-a-z0-9]{4,28}[a-z0-9])$"
  on ../terraform-google-**/modules/cluster/serviceaccount.tf line 28, in resource "google_service_account" "kaniko_sa":
  28: resource "google_service_account" "kaniko_sa" {

Let's try to ensure that the name is capped in case it is too long.
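
One possible approach (a sketch only, not the module's current implementation) is to truncate the cluster-name prefix before appending the service account suffix:

resource "google_service_account" "kaniko_sa" {
  project      = var.gcp_project
  # account_id must match ^[a-z](?:[-a-z0-9]{4,28}[a-z0-9])$, i.e. at most 30 characters,
  # so cap the cluster-name prefix before appending the suffix
  account_id   = "${substr(var.cluster_name, 0, 27)}-ko"
  display_name = "Kaniko service account"
}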

Have separate pipeline steps for cluster creation and shellspec tests

At the moment cluster creation and testing happens in the same pipeline step within the ci.sh script. When looking at the logs it would be nice to have separate steps for that.

The reason for having the tests in ci.sh is that the cluster destruction happens currently as well in ci.sh via a tap handler.

One solution would be to split ci.sh into two scripts.

zone is not copied into jx-requirements.yaml

When setting up a GKE cluster using this module (v1.4.0) with a non-default zone (europe-west1-b in my case), the terraform module does not copy it into the resulting jx-requirements.yml.

Therefore calling jx boot -r jx-requirements.yml afterwards fails since jx-requirements.yml points to zone us-central1-a.

Clarify getting started example in README

Currently, the example uses:

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project = var.gcp_project
}

That's confusing since using a variable for the GCP project would also force you to define a variables.tf. To simplify, the example should not use variables, but show how to use the property inlined.

Fail to apply module because of Error 403: The caller does not have permission, forbidden

I am logged in with the correct email on gcloud, but terraform apply fails with a series of Error 403 permission errors (screenshot omitted).

Steps to reproduce:

  • gcloud init -> choose email and project by id
  • gcloud auth application-default login
  • add main.tf ->
    module "jx" {
      source = "jenkins-x/jx/google"
      gcp_project = "[project-name]"
      cluster_name = "[cluster-name]"
      force_destroy = true
      version = "1.3.2"
      zone = "europe-west3-a"
    }
    
  • terraform init
  • terraform apply

Expected:
Finish apply process successfully

Terraform plan

When I execute terraform plan, there are 54 resources to be created. Are all those mandatory? Also, what are they used for? If terraform plan is also used as an "executable specification of requirements", we need to know what each of those resources is used for, so that we know what to create depending on the choices of the components we want to use with JX.

Allow specifying backup bucket location

It seems I cannot specify the location of the bucket for the backups: https://github.com/jenkins-x/terraform-google-jx/blob/master/modules/backup/main.tf#L7

I do not see an option. And when I plan I get the following:

# module.jx.module.backup.google_storage_bucket.backup_bucket will be created
  + resource "google_storage_bucket" "backup_bucket" {
      + bucket_policy_only = (known after apply)
      + force_destroy      = false
      + id                 = (known after apply)
      + location           = "US"
      + name               = (known after apply)
      + project            = (known after apply)
      + self_link          = (known after apply)
      + storage_class      = "STANDARD"
      + url                = (known after apply)
    }
      + location           = "US"

As an EU citizen, I generally want to store my data in the EU, or at least have the option to. See: https://cloud.google.com/storage/docs/locations

jx-requirements.yaml output for devEnvApprovers is malformed

After successfully creating a cluster as per the README, the resulting jx-requirements.yaml file does not have a properly formed empty value for devEnvApprovers.

cluster:
  clusterName: "dks-cjxd8-terra"
  devEnvApprovers: 

  environmentGitOwner: ""
  project: "apps-dev-229310"
  provider: gke
  zone: "us-central1-a"

Remove release pipeline from jenkins-x.yml

To better control semantic versioning, let's, for now, remove the release pipeline config and do only manual tags. For Terraform module releases a tag is all that is needed.

With the current release pipeline, we will always get a patch version increment, which is not what you want if you want to increase the minor or major version.

Generate terraform output for jx-requirements rather than a local file

The intention of the jx-requirements.yaml file is that it's only ever read at the initial setup of Jenkins X on a new cluster. With that said, producing the local_file resource means that terraform constantly manages that resource and wants to update a file which no longer has as much relevance to the cluster as it did.
The proposal is to define the jx-requirements.yaml as an output that could possibly be used to perform the initial boot (after the terraform module has created all of the resources) like:

jx boot -r <(terraform output jx-requirements)

Having the Jenkins X requirements available in this manner removes the pain of having to maintain the local file within terraform and the git repository.

Add tests for conditional logic within Terraform scripts

Certain flags enable conditional resources. We should be able to test that these conditionals work by using shellspec and the terraform plan. Looking at the plan should be enough to assert that certain resources will or will not be created.

jx-requirements.yml out of sync

Initially the tf module generates the jx-requirements.yml which is meant to be provided to jx boot -r.
However jx boot creates a new environment git repository that contains jx-requirements.yml as well.
This is potentially inconsistent because changes made using subsequent tf module executions do not make it into the environment repository.

An approach to solve this could be to let the tf module write the information contained within jx-requirements.yml into the k8s cluster so that jx boot could read it from there more consistently.

Allow to provide external Vault configuration

The latest jx version allows using an internal (using the Vault operator) or external Vault instance for secret storage. The Terraform module should allow configuring either of these two options.

Wrong path to jx-requirements.yaml.tpl when using Terraform registry as source

main.tf:

module "jx" {
  source  = "jenkins-x/jx/google"
  version = "1.2.1"
  gcp_project = "foo"
}
$ terraform init 
$ terraform plan
Error: Error in function call

  on .terraform/modules/jx/terraform-google-jx-1.2.1/main.tf line 186, in resource "local_file" "jx-requirements":
 186:   content = templatefile("${path.cwd}/modules/jx-requirements.yaml.tpl", {

Error applying IAM policy for service account after running terraform apply for creating GKE cluster

After running terraform apply using terraform file

module "jx" {
  source  = "jenkins-x/jx/google"

  gcp_project = "<my-gcp-project-id>"
}

At the end of process I get:

Error: Error applying IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': Error setting IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': googleapi: Error 400: Identity namespace does not exist (x-project-275408.svc.id.goog)., badRequest

on .terraform/modules/jx/terraform-google-jx-1.2.5/modules/backup/main.tf line 44, in resource "google_service_account_iam_member" "velero_sa_workload_identity_user":
44: resource "google_service_account_iam_member" "velero_sa_workload_identity_user" {

Error: Error applying IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': Error setting IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': googleapi: Error 400: Identity namespace does not exist (x-project-275408.svc.id.goog)., badRequest

on .terraform/modules/jx/terraform-google-jx-1.2.5/modules/cluster/serviceaccount.tf line 73, in resource "google_service_account_iam_member" "build_controller_sa_workload_identity_user":
73: resource "google_service_account_iam_member" "build_controller_sa_workload_identity_user" {

Error: Error applying IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': Error setting IAM policy for service account 'projects/x-project-275408/serviceAccounts/[email protected]': googleapi: Error 400: Identity namespace does not exist (x-project-275408.svc.id.goog)., badRequest

on .terraform/modules/jx/terraform-google-jx-1.2.5/modules/cluster/serviceaccount.tf line 124, in resource "google_service_account_iam_member" "kaniko_sa_workload_identity_user":
124: resource "google_service_account_iam_member" "kaniko_sa_workload_identity_user" {

Error: Error applying IAM policy for service account 'projects/x-project-275408/serviceAccounts/tf-jx-arriving-frog-tekton@x-project-275408.iam.gserviceaccount.com': Error setting IAM policy for service account 'projects/x-project-275408/serviceAccounts/tf-jx-arriving-frog-tekton@x-project-275408.iam.gserviceaccount.com': googleapi: Error 400: Identity namespace does not exist (x-project-275408.svc.id.goog)., badRequest

on .terraform/modules/jx/terraform-google-jx-1.2.5/modules/cluster/serviceaccount.tf line 133, in resource "google_service_account_iam_member" "tekton_sa_workload_identity_user":
133: resource "google_service_account_iam_member" "tekton_sa_workload_identity_user" {

Creating namespaces seems like scope creep for this module

resource "kubernetes_namespace" "cert-manager" {
  metadata {
    name = var.cert-manager-namespace
  }
  lifecycle {
    ignore_changes = [
      metadata[0].labels,
      metadata[0].annotations,
    ]
  }
}

This module creates namespaces (jx, cert-manager, velero), which seems to be outside the scope of this module's intent. From past experience it becomes difficult when mixing resource creation where resources have differing lifecycles and where more than one agent is responsible for configuring said resources (this can cause conflicts).

Use Terraform variable validation for checking TLS email

The tls_email variable needs to be conditionally set if custom domains are enabled. Right now, when you set parent_domain, there is no validation which ensures that you also set tls_email. Failing to set this variable will lead to issues later during jx boot.

Using Terraform's variable validation - https://www.terraform.io/docs/configuration/variables.html - this conditional dependency could be expressed. On the downside, variable validation is at the moment still an experimental feature.
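
In the Terraform versions targeted here, a validation block can only reference the variable it belongs to, so the dependency on parent_domain would still need another mechanism; a sketch of what is possible today, a simple format check:

variable "tls_email" {
  description = "Email used by Let's Encrypt. Required for TLS when apex_domain is specified"
  type        = string
  default     = ""

  validation {
    # allow empty (no custom domain) or anything that at least looks like an email address
    condition     = var.tls_email == "" || can(regex("@", var.tls_email))
    error_message = "The tls_email value must be a valid email address."
  }
}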

Make creation of Velero resources optional

At the moment, the backup configuration (backup storage bucket, namespace, service account, etc.) is always created. Given that this is not fully tested/supported yet, it makes sense to only create these resources conditionally. One needs to explicitly opt in.

On the downside, it means that if you want to enable it at a later stage, you can create the missing resources by enabling the feature and re-running terraform apply, but then you will have to manually sync the required jx-requirements.yml changes into the jx-requirements.yml of your dev repository.

Include disabled ingress section for jx-requirements.yaml

Even if a parent_domain parameter is not specified in a main.tf file, include a disabled ingress section in the jx-requirements.yaml output. This will better allow for the subsequent ability to supply DNS and/or TLS values prior to running jx boot.

Remove intermediate jx module directory

The current repo layout looks like this:

├── jx
│   ├── main.tf
│   ├── modules
│   │   ├── backup
│   │   ├── cluster
│   │   ├── dns
│   │   └── vault
│   ├── output.tf
│   └── variables.tf
├── jx-requirements.yaml
├── jx-requirements.yaml.tpl
├── main.tf
├── output.tf
└── variables.tf

The intermediate jx directory seems superfluous and should be removed. This means we are effectively merging main.tf of the root level with the one in jx.

As a side effect, we have to duplicate/copy fewer variables and outputs across module boundaries.

Allow to specify cluster resource labels as variables

To allow to apply custom labels to the created cluster, it makes sense to allow specifying cluster resource labels as input variables to the main Terraform script.

They belong to the google_container_cluster resource and look like this:

        resource_labels  = {
            "env"       = "staging"
            "managedby" = "terraform"
        }
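
With such an input variable, the labels could then be passed through to the module, for example (a sketch using the resource_labels input listed in the Inputs section above):

module "jx" {
  source      = "jenkins-x/jx/google"
  gcp_project = "<my-gcp-project-id>"

  resource_labels = {
    "env"       = "staging"
    "managedby" = "terraform"
  }
}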

Conditionally create the managed zone for DNS

Currently, we are not creating the managed zone with Terraform, but require it to be created upfront when using a custom domain. There are two reasons for that:

  1. If Terraform creates the managed zone and Jenkins X gets installed, new recordsets are created in the zone. This prevents a clean destroy, since you first have to manually delete all records.
  2. More importantly, if one creates the managed zone as part of the script, one gets a set of nameservers as output which one has to manually set with the DNS registrar. You then have to wait until the DNS settings have propagated before running jx boot.

Making the managed zone creation optional might be useful in some cases where you want to manage as much as possible via Terraform.

Not all input parameters are used

Using min_node_count and max_node_count

Using the min_node_count and max_node_count inputs like this:
module "jx" {
  source  = "jenkins-x/jx/google"
  gcp_project = ...

  cluster_name = ...
  git_owner_requirement_repos = "hemp0r"
  min_node_count = 1
  max_node_count = 2
  dev_env_approvers = ["hemp0r"]
  zone = "europe-west3"
  
  force_destroy = true 
}

leads to

+ resource "google_container_node_pool" "jx_node_pool" {
      + cluster             = ...
      + id                  = (known after apply)
      + initial_node_count  = 3
      + instance_group_urls = (known after apply)
      + location            = "europe-west3"
      + max_pods_per_node   = (known after apply)
      + name                = "autoscale-pool"
      + name_prefix         = (known after apply)
      + node_count          = (known after apply)
      + project             = (known after apply)
      + region              = (known after apply)
      + version             = (known after apply)
      + zone                = (known after apply)

      + autoscaling {
          + max_node_count = 5
          + min_node_count = 3
        }

      + management {
          + auto_repair  = true
          + auto_upgrade = false
        }

      + node_config {
          + disk_size_gb      = 100
          + disk_type         = (known after apply)
          + guest_accelerator = (known after apply)
          + image_type        = (known after apply)
          + labels            = (known after apply)
          + local_ssd_count   = (known after apply)
          + machine_type      = "n1-standard-2"
          + metadata          = (known after apply)
          + oauth_scopes      = [
              + "https://www.googleapis.com/auth/cloud-platform",
              + "https://www.googleapis.com/auth/compute",
              + "https://www.googleapis.com/auth/devstorage.full_control",
              + "https://www.googleapis.com/auth/logging.write",
              + "https://www.googleapis.com/auth/monitoring",
              + "https://www.googleapis.com/auth/service.management",
              + "https://www.googleapis.com/auth/servicecontrol",
            ]
          + preemptible       = false
          + service_account   = (known after apply)

          + workload_metadata_config {
              + node_metadata = "GKE_METADATA_SERVER"
            }
        }
    }

Error applying IAM policy

I'm getting the following error message.

Error: Error applying IAM policy for service account 'projects/jx-20200617185112/serviceAccounts/[email protected]': Error setting IAM policy for service account 'projects/jx-20200617185112/serviceAccounts/[email protected]': googleapi: Error 400: Identity namespace does not exist (jx-20200617185112.svc.id.goog)., badRequest

I tried it twice on a fresh Google project with the same result. However, if I re-run terraform apply again, it works. I'm guessing it's some kind of race condition where it tries to create resources before their dependencies are created.

Make the cluster regional

The zone field should be changed to location. Depending on the value, GKE will be zonal (e.g., us-east1-b) or regional (us-east1).

Make sure to update the docs to clarify that the initial number of nodes is multiplied by the number of zones. If, for example, a cluster is regional, an initial node count of 1 would result in 3 nodes (one in each of the three zones).
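
In the current module this maps to the cluster_location input, for example (a sketch, values are placeholders):

module "jx" {
  source      = "jenkins-x/jx/google"
  gcp_project = "<my-gcp-project-id>"

  # a region (rather than a zone) makes the cluster regional;
  # the node count then applies per zone, so 1 becomes 3 nodes in a three-zone region
  cluster_location = "us-east1"
}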

Provide scripts/documentation on how to manage state

We need to decide how we want to deal with state management. At the very least we need to document the importance of remote state storage with some references on how to do it. Even better might be to provide examples.

Find better alternative for local release note generation

The helper script release.sh uses:

jx step changelog -v $version -p $prev_tag_base -r $current_tag_base --generate-yaml=false --no-dev-release --update-release=false

This works; however, this step is supposed to run in-cluster and prints a lot of warnings when running locally, partly because it will always use the pipeline's user credentials.
Also, due to the credential issue, the changelog is just written to the console and one needs to manually copy and paste it.

A short jq script should do the trick.

Terraform apply breaks when service accounts have been deleted and restored

Detailed by the issue here: hashicorp/terraform-provider-google#4276

Get bad request errors like so:

  on jx/modules/cluster/serviceaccount.tf line 60, in resource "google_project_iam_member" "tekton_sa_project_viewer_binding":
  60: resource "google_project_iam_member" "tekton_sa_project_viewer_binding" {

The resolution is to modify jx/main.tf and change:

provider "google" {
  version = "~> 3.10.0"
  project = var.gcp_project
  zone    = var.zone
}

provider "google-beta" {
  version = "~> 3.10.0"
  project = var.gcp_project
  zone    = var.zone
}

to

provider "google" {
  version = "~> 2.12.0"
  project = var.gcp_project
  zone    = var.zone
}

provider "google-beta" {
  version = "~> 2.12.0"
  project = var.gcp_project
  zone    = var.zone
}

Empty tuple

terraform destroy produced the following output.

...
module.jx.module.cluster.google_container_cluster.jx_cluster: Destruction complete after 2m47s

Error: Error trying to delete bucket vault-jx-demo-53af71c741fa containing objects without `force_destroy` set to true



Error: Error trying to delete bucket logs-jx-demo-53af71c741fa containing objects without `force_destroy` set to true

That part is OK.

After that, I added force_destroy = true to my Terraform definition and re-ran terraform destroy. It produced the following error.

...

Error: Error trying to delete bucket vault-jx-demo-53af71c741fa containing objects without `force_destroy` set to true



Error: Error trying to delete bucket logs-jx-demo-53af71c741fa containing objects without `force_destroy` set to true



Error: Invalid index

  on .terraform/modules/jx/terraform-google-jx-1.4.0/modules/cluster/outputs.tf line 22, in output "report_storage_url":
  22:     value = google_storage_bucket.report_bucket[0].url
    |----------------
    | google_storage_bucket.report_bucket is empty tuple

The given key does not identify an element in this collection value.


Error: Invalid index

  on .terraform/modules/jx/terraform-google-jx-1.4.0/modules/cluster/outputs.tf line 26, in output "repository_storage_url":
  26:     value = google_storage_bucket.repository_bucket[0].url
    |----------------
    | google_storage_bucket.repository_bucket is empty tuple

The given key does not identify an element in this collection value.

local terraform provider version scope too narrow

We've experienced an issue within our repository where we use a newer 1.4.0 version of the local terraform provider. Due to changes in the minor version bumps, the current constraint of ~> 1.2.0 for the local provider corrupts our terraform state by adding content_base64 = null, resulting in the following error:

Error: unsupported attribute "content_base64"

Going forward I think all provider scopes in this module need to have their version constraints relaxed (whilst retaining the minimum version constraint), thus allowing the consuming project to determine which version to use.
https://www.terraform.io/docs/configuration/modules.html#provider-version-constraints-in-modules

Namespace "cert-manager" already exists

When the tf module is run without parent_domain configured, tf does not create the cert-manager namespace. However, if one decides to configure parent_domain and reapply the module afterwards, it fails with the following error since the namespace already exists:

Error: namespaces "cert-manager" already exists
  on .terraform/modules/jx/modules/dns/main.tf line 106, in resource "kubernetes_namespace" "cert-manager":
 106: resource "kubernetes_namespace" "cert-manager" {

As a fix, the cert-manager namespace could always be created by tf.

'Error: google: could not find default credentials' when specifying a credential file

module version: 1.4.0
google provider version: 3.25.0
Terraform v0.12.24
OS Ubuntu 20.04
Linux Kernel: 5.4

When specifying the service account credentials file like this:

provider "google" {
  credentials = file("secret.json")
  project = "my-project"
  zone = "europe-west1-b"
}

terraform plan fails with:

Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.


Error: google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.

  on .terraform/modules/automation-cluster/terraform-google-jx-1.4.0/main.tf line 13, in provider "google":
  13: provider "google" {

The very same configuration works, e.g., with (from the GCP website):

// Terraform plugin for creating random ids
resource "random_id" "instance_id" {
 byte_length = 8
}

// A single Google Cloud Engine instance
resource "google_compute_instance" "default" {
 name         = "flask-vm-${random_id.instance_id.hex}"
 machine_type = "f1-micro"
 zone         = "us-west1-a"

 boot_disk {
   initialize_params {
     image = "debian-cloud/debian-9"
   }
 }

// Make sure flask is installed on all new instances for later steps
 metadata_startup_script = "sudo apt-get update; sudo apt-get install -yq build-essential python-pip rsync; pip install flask"

 network_interface {
   network = "default"

   access_config {
     // Include this section to give the VM an external ip address
   }
 }
}

Workaround: Remove the credentials variable from the configuration and set the GOOGLE_CLOUD_KEYFILE_JSON environment variable.

Kubernetes Engine API not enabled

I'm getting the following error when running the module against a newly created Google project.

...
Error: googleapi: Error 403: Kubernetes Engine API is not enabled for this project. Please ensure it is enabled in Google Cloud Console and try again: visit https://console.cloud.google.com/apis/api/container.googleapis.com/overview?project=jx-20200617144145 to do so., forbidden

  on .terraform/modules/jx/terraform-google-jx-1.4.0/modules/cluster/main.tf line 6, in resource "google_container_cluster" "jx_cluster":
   6: resource "google_container_cluster" "jx_cluster" {
...

I believe that the module is already enabling quite a few APIs so it probably makes sense to enable that one through the module as well. Otherwise, maybe we should add a note to the docs that the "Kubernetes Engine API" should be enabled (e.g., gcloud services enable container.googleapis.com --project $PROJECT_ID).
