
terraform-ibm-satellite's People

Contributors

anilkumarnagaraj, blintmester, crhuff-ibm, david-shepard, finken2, hkantare, jsloyer, kavya498, lionelmace, ocofaigh, sirspidey

terraform-ibm-satellite's Issues

`terraform plan` lists a resource will be replaced, but `terraform apply` throws an error

Problem Statement

When running tf plan --out=./tfplan, I see this

...
  # module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location is tainted, so must be replaced
-/+ resource "null_resource" "satellite_location" {
      ~ id       = "6249488150138523163" -> (known after apply)
        triggers = {
            "API_KEY"        = "NEbVeyabM26IvMkS-Xxw2hrhZqkpVxah7wA8m8aTm_RX"
            "ENDPOINT"       = "cloud.ibm.com"
            "LOCATION"       = "aws-us-east-1-nvirginia-test-z-2"
            "PROVIDER"       = "aws"
            "REGION"         = "us-east"
            "RESOURCE_GROUP" = "Default"
        }
    }

Plan: 1 to add, 0 to change, 1 to destroy.

------------------------------------------------------------------------

When I run tf apply "./tfplan" I get this

Error: Error running command '. .terraform/modules/apply_aws_sat_location.satellite-location/modules/location/../../modules/location/scripts/location.sh': exit status 1. Output: API endpoint: https://cloud.ibm.com
Authenticating...
OK

Targeted account NabooQA (910f83366dbf46a091aa3c69d320438a) <-> 2137386

Targeted resource group Default

Targeted region us-east

                      
API endpoint:      https://cloud.ibm.com   
Region:            us-east   
User:              [email protected]   
Account:           NabooQA (910f83366dbf46a091aa3c69d320438a) <-> 2137386   
Resource group:    Default   
CF API endpoint:      
Org:                  
Space:                
Location= aws-us-east-1-nvirginia-test-z-2
************* satellite location already exist *****************

Expected

Terraform apply should not delete and recreate a location that already exists, because doing so causes problems. For example:

Use case 1:
If we own all the clusters under that location, they would all be deleted before the location itself is deleted.
Everything would then be rebuilt, causing outages on the order of hours.

Use case 2:
If we don't own all the clusters in that location, and we also have our own clusters there, we won't be able to remove the location.

We need the location not to be recreated on subsequent apply calls.
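
One way to get that behaviour is to make the create step idempotent, so that an existing location counts as success instead of exit status 1. The following is a minimal sketch, not the module's actual location.sh: `location_exists` and `create_location` are hypothetical stand-ins for the real CLI calls and only track names in a shell variable, so the control flow can be run locally.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real CLI calls. They only record names in
# a shell variable so the logic below can be exercised without a cloud account.
EXISTING_LOCATIONS=""

location_exists() {
  case " $EXISTING_LOCATIONS " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

create_location() {
  EXISTING_LOCATIONS="$EXISTING_LOCATIONS $1"
  echo "created location $1"
}

# Create the location only when it is missing. An existing location is
# treated as success, so a repeated run does not fail the provisioner.
ensure_location() {
  if location_exists "$1"; then
    echo "location $1 already exists, skipping create"
    return 0
  fi
  create_location "$1"
}

# First call creates, second call is a no-op that still returns 0.
ensure_location "aws-us-east-1-nvirginia-test-z-2"
ensure_location "aws-us-east-1-nvirginia-test-z-2"
```

Separately, `terraform untaint` with the resource address shown in the plan clears the tainted mark, which may stop the plan from scheduling a replacement. Whether that is safe depends on why the resource was tainted in the first place.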

Variable host_labels too restrictive

Issue:

When attempting to use labels that include dashes, host_labels validation fails.

https://github.com/terraform-ibm-modules/terraform-ibm-satellite/blob/main/modules/host/variables.tf#L12

Solution

Support dashes:

variable "host_labels" {
  description = "Host labels to assign host to control plane "
  type        = list(string)
  default     = null

  validation {
    condition     = can([for s in var.host_labels : regex("^[a-zA-Z0-9-:]+$", s)])
    error_message = "Label must be of the form `key:value`."
  }
}
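
To see which labels the proposed character set accepts, the same pattern can be tried outside Terraform. This is a rough sketch using `grep -E`; the sample labels are made up for illustration, and the dash is moved to the end of the bracket expression so that POSIX tools read it as a literal character without ambiguity.

```shell
#!/bin/sh
# Same character set as the proposed Terraform pattern, with the dash placed
# last in the bracket expression so grep treats it as a literal dash.
LABEL_PATTERN='^[a-zA-Z0-9:-]+$'

# Succeeds when the label contains only letters, digits, colons and dashes.
label_is_valid() {
  printf '%s\n' "$1" | grep -Eq "$LABEL_PATTERN"
}

# Hypothetical sample labels, not taken from the module.
for label in "env:prod" "zone:us-east-1" "bad label"; do
  if label_is_valid "$label"; then
    echo "ok:       $label"
  else
    echo "rejected: $label"
  fi
done
```

Note that the pattern only restricts the allowed characters. It does not check that a label contains exactly one colon, so a value such as `prod` with no key would also pass.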

Azure VMs not created

We ran the Azure quick start to create a location, and none of the Azure VMs were created. Other resources, such as data disks and subnets, were created without problems, but not the VMs. According to the Schematics logs, the terraform apply step for the VMs never ran. See the attached Schematics log file.
Uploading schem-novm(1).txt…

Azure retryable error when checking for VM

After creating a VM, Terraform gets a retryable error from Azure while checking whether the VM exists. This causes Terraform to exit even though the VM was created successfully:

 2022/02/10 14:32:28 Terraform apply | azurerm_linux_virtual_machine.az_host[8]: Creation complete after 1m17s [id=/subscriptions/xxxhiddenxxx/resourceGroups/ps-az-cp4ba-5562/providers/Microsoft.Compute/virtualMachines/ps-az-cp4ba-6963-vm-8]
 2022/02/10 14:32:28 Terraform apply | 
 2022/02/10 14:32:28 Terraform apply | Error: waiting for creation of Linux Virtual Machine: (Name "ps-az-cp4ba-6963-vm-0" / Resource Group "ps-az-cp4ba-5562"): Code="RetryableError" Message="A retryable error occurred."

@hkavya26 This is the issue being discussed in the Slack thread: https://ibm-argonauts.slack.com/archives/C01149RMSCU/p1644509021980169
QueryResult.csv.txt
schem_new.txt

`terraform destroy` doesn't delete the satellite location

Problem statement

tf destroy doesn't delete the satellite location
When I run tf destroy, I get this

module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location: Refreshing state... [id=2014765997865129552]
module.apply_aws_sat_location.null_resource.createIfNotExists: Refreshing state... [id=8409264704357310651]

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  - destroy

Terraform will perform the following actions:

  # module.apply_aws_sat_location.null_resource.createIfNotExists will be destroyed
  - resource "null_resource" "createIfNotExists" {
      - id = "8409264704357310651" -> null
    }

  # module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location will be destroyed
  - resource "null_resource" "satellite_location" {
      - id       = "2014765997865129552" -> null
      - triggers = {
          - "API_KEY"        = "NEbVeyabM26IvMkS-Xxw2hrhZqkpVxah7wA8m8aTm_RX"
          - "ENDPOINT"       = "cloud.ibm.com"
          - "LOCATION"       = "aws-us-east-1-nvirginia-test-z-2"
          - "PROVIDER"       = "aws"
          - "REGION"         = "us-east"
          - "RESOURCE_GROUP" = "Default"
        } -> null
    }

Plan: 0 to add, 0 to change, 2 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

module.apply_aws_sat_location.null_resource.createIfNotExists: Destroying... [id=8409264704357310651]
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location: Destroying... [id=2014765997865129552]
module.apply_aws_sat_location.null_resource.createIfNotExists: Destruction complete after 0s
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location: Provisioning with 'local-exec'...
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Executing: ["/bin/sh" "-c" ". .terraform/modules/apply_aws_sat_location.satellite-location/modules/location/../../modules/location/scripts/destroy.sh"]
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): git-askpass-helper.sh main.tf output.tf terraform.tfstate terraform.tfstate.backup tf-naboo-qa.sh tf-pierre.sh tfplan workspace-aws-useast1-nvirginia.json Deleting location git-askpass-helper.sh main.tf output.tf terraform.tfstate terraform.tfstate.backup tf-naboo-qa.sh tf-pierre.sh tfplan workspace-aws-useast1-nvirginia.json
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): LOCATION= aws-us-east-1-nvirginia-test-z-2
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): API endpoint: https://cloud.ibm.com
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Authenticating...
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): OK

module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Targeted account NabooQA (910f83366dbf46a091aa3c69d320438a) <-> 2137386

module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Targeted resource group Default

module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Targeted region us-east


module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): API endpoint:      https://cloud.ibm.com
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Region:            us-east
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): User:              [email protected]
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Account:           NabooQA (910f83366dbf46a091aa3c69d320438a) <-> 2137386
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Resource group:    Default
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): CF API endpoint:
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Org:
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Space:
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Removing location aws-us-east-1-nvirginia-test-z-2...
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location: Still destroying... [id=2014765997865129552, 10s elapsed]
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location: Still destroying... [id=2014765997865129552, 20s elapsed]
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location: Still destroying... [id=2014765997865129552, 30s elapsed]
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): FAILED
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Could not complete request:
module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Unable to connect to https://containers.cloud.ibm.com/global/v2/satellite/removeController

module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location (local-exec): Incident ID: 9fe32175-1249-4bdc-8870-4afaaf67e87f

module.apply_aws_sat_location.module.satellite-location.null_resource.satellite_location: Destruction complete after 39s

Destroy complete! Resources: 2 destroyed.

But when I look in the IBM Cloud Satellite locations GUI, I see this
image
Subsequent tf destroy runs show this:

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes


Destroy complete! Resources: 0 destroyed.

Expected

The satellite location is fully removed after tf destroy, rather than left hanging.
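
In the log above, the CLI prints FAILED, yet Terraform reports the destruction as complete and drops the resource from state. That suggests, though it does not prove, that the destroy script returns status 0 even when the removal fails. Below is a minimal sketch of propagating the failure instead; `remove_location` is a hypothetical stand-in for the real CLI call and is hard-coded to fail so the error path can be exercised.

```shell
#!/bin/sh
# Hypothetical stand-in for the real removal command. It always fails here,
# imitating the FAILED output seen in the log.
remove_location() {
  echo "Removing location $1..."
  echo "FAILED"
  return 1
}

# Pass the removal status back to the caller instead of swallowing it, so a
# local-exec destroy provisioner would see a non-zero exit.
destroy_location() {
  if ! remove_location "$1"; then
    echo "location $1 was not removed" >&2
    return 1
  fi
  echo "location $1 removed"
}

status=0
destroy_location "aws-us-east-1-nvirginia-test-z-2" || status=$?
echo "exit status: $status"
```

With the default provisioner failure behaviour, a non-zero exit from a destroy-time provisioner makes Terraform report an error and keep the resource in state, so a later tf destroy would retry the removal rather than report 0 resources destroyed.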

`/tmp/.schematics/addhost.sh` Doesn't exist on `terraform plan`

Issue

When running terraform init; terraform plan, I get the following error

  on .terraform/modules/apply_aws_sat_location/aws/sat-location/modules/ec2/ec2.tf line 11, in resource "aws_instance" "ec2_instance":
  11:   user_data = file("/tmp/.schematics/addhost.sh")

Invalid value for "path" parameter: no file exists at
/tmp/.schematics/addhost.sh; this function works only with files that are
distributed as part of the configuration source code, so if this file will be
created by a resource in this configuration you must instead obtain this
result from an attribute of that resource.

Debugging

Attempt 1

Terraform seems to expect the file to exist before a plan and apply. By running

mkdir /tmp/.schematics; touch /tmp/.schematics/addhost.sh

I'm able to bypass the error locally. However, we're using this in a Terraform-as-a-service environment and can't rely on local setup commands.

Attempt 2

I added the above local commands to the module's location.sh.

I still got the same error, because Terraform evaluates file() before any resource or script has run, so the file does not exist yet. Even with depends_on = [module.satellite-location], Terraform still errors.

Solution thoughts

Perhaps Terraform's local_file will be the solution. That would require refactoring the last part of the module, but the file could also be passed as output from the module and imported into ec2 instances.

Cannot auto assign hosts to cluster due to wrong label

I'm provisioning a cluster on Satellite in Azure using these scripts:
https://github.com/lionelmace/satellite-utils/tree/main/terraform/satellite-azure

The creation of the Azure VMs, the Satellite location and the cluster is successful.

I'm missing the final step, where the VMs get automatically assigned to the cluster.
image

Where in the terraform script can I change the

The module ibm_satellite_location creates a label with zone eastus-1 instead of us-east-1

data "ibm_satellite_location" "location" {
        {
            cluster_name = "infrastructure"
            host_id      = "54b69e06af9f051ba8ce"
            host_labels  = {
                "cpu"      = "4"
                "env"      = "prod"
                "memory"   = "16266544"
                "os"       = "RHEL7"
                "provider" = "azure"
                "zone"     = "eastus-1"
            }
            host_name    = "satellite-azure-vm-0"
            ip_address   = "10.0.1.5"
            status       = "normal"
            zone         = "us-east-1"
        },

Because of this difference, host auto-assignment cannot be performed. Can we fix the module so that it uses the us-east format?

What is the workaround in the meantime?
