
Comments (15)

crsimmons avatar crsimmons commented on September 23, 2024

It looks like terraform is trying to downgrade the RDS Postgres version from 9.6.11 to 9.6.6. I'm not sure why you would be running 9.6.11 because we pin to 9.6.6 in our terraform (and have done since March 2018).

What version of Concourse-up are you using?

from concourse-up.

crsimmons avatar crsimmons commented on September 23, 2024

I've done a bit more research and have found that one of our deployments is using 9.6.11. I haven't figured out why or how yet though.


crsimmons avatar crsimmons commented on September 23, 2024

It seems that RDS will do a version upgrade in the maintenance window unless you tell it not to. In terraform we aren't specifying the option but it defaults to true. I think we've deployed against our 9.6.11 deployment before so I'm not sure why your terraform is breaking in this way. Does it work if you manually run concourse-up deploy <your deployment>?


sureshgoli81 avatar sureshgoli81 commented on September 23, 2024

Hi, today we started updating concourse-up and hit the same issue as described above.
By the way, AWS has stopped supporting 9.6.6, and the current minor version for PostgreSQL is 9.6.11.

I would suggest keeping engine_version as 9.6 in the terraform. auto_minor_version_upgrade defaults to true because it is not explicitly set to false.

As of now, our concourse-up deployment has become unstable due to the upgrade failure.


DanielJonesEB avatar DanielJonesEB commented on September 23, 2024

@sureshgoli81 Thanks for reporting - we're looking into this currently.

Can you please explain exactly what you mean by

As of now, our concourse-up deployment has become unstable due to the upgrade failure.

Is your Concourse still running, and can builds execute? Does the deployment fail?


sureshgoli81 avatar sureshgoli81 commented on September 23, 2024

Hi, no, Concourse is not running, because our build failed with the error "failed to obtain lock on concourse deployment". When we tried again with concourse-up deploy <project name>, we got a terraform error related to engine_version.


DanielJonesEB avatar DanielJonesEB commented on September 23, 2024

Thanks @sureshgoli81.

This should be a recoverable situation. "failed to obtain lock on concourse deployment" is a BOSH error: one deployment was already in progress on the concourse deployment (which the renew-certs job will have started), and presumably another deployment was attempted.

If you have BOSH CLI access, I'd recommend doing a bosh vms -d concourse, see if any are unhappy, and consider a bosh restart on the web VM. Then pause the renew-certs pipeline until we get the issue fixed.


sureshgoli81 avatar sureshgoli81 commented on September 23, 2024

Hi, yes, that was our initial idea. However, we were unable to log in to the BOSH director due to an X509 error related to the bosh ca-cert. After looking at the bosh creds file in our bucket, we noticed it was still holding the old bosh director ca cert values.
We have now destroyed the setup and are recreating Concourse.


sureshgoli81 avatar sureshgoli81 commented on September 23, 2024

Just noticed that, when provisioning PostgreSQL in AWS RDS, it still allows creating a DB with 9.6.6. Because auto_minor_version_upgrade is true by default, RDS was updated to 9.6.11 by the second run of concourse-up deploy, and on the third run of concourse-up deploy we started getting the engine version error when terraform checked the provisioned resource. So there are two options: either set auto_minor_version_upgrade to false, or specify only 9.6 in engine_version.


crsimmons avatar crsimmons commented on September 23, 2024

My understanding is that the auto_minor_version_upgrade option being true means that RDS will update your engine in the instance's maintenance window when Amazon detects there is a meaningful minor update. I would expect the first deploy to set 9.6.6 then every subsequent deploy until your first maintenance window will work. After RDS does maintenance and updates your instance then you will see the terraform error.

We're currently considering bumping the engine version to 9.6.11 and setting auto_minor_version_upgrade to false so this doesn't happen again. I've put a separate story in our backlog for increasing the hardcoded version when new versions become available.
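For anyone following along, here is a rough sketch of what that change might look like in Terraform. It is not the actual Concourse-Up configuration: the resource name aws_db_instance.default is taken from the error output in this thread, engine_version and auto_minor_version_upgrade are the two settings under discussion, and every other value (instance class, storage, username, the db_password variable) is a placeholder assumed purely for illustration. It also assumes Terraform 0.12 or later syntax.

```hcl
# Hypothetical sketch, not the real Concourse-Up terraform.
# Only engine_version and auto_minor_version_upgrade reflect the change
# discussed in this thread; all other values are illustrative placeholders.

variable "db_password" {
  type = string
}

resource "aws_db_instance" "default" {
  engine         = "postgres"
  engine_version = "9.6.11" # pin to the version RDS is actually running

  # Stop RDS from applying minor upgrades in the maintenance window, so the
  # running engine version cannot drift ahead of the pinned engine_version.
  auto_minor_version_upgrade = false

  instance_class      = "db.t2.small"     # placeholder
  allocated_storage   = 10                # placeholder, in GiB
  username            = "concourse_admin" # placeholder
  password            = var.db_password
  skip_final_snapshot = true              # placeholder
}
```

The trade-off with pinning is that the hardcoded version has to be bumped by hand when new minor versions come out, which is what the backlog story mentioned above is for.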

I wouldn't expect this terraform issue to negatively impact BOSH. In Concourse-Up the terraform runs first and therefore errors out before anything happens with BOSH. I'm not sure why you would have different bosh director ca cert values. If they are being updated then terraform failing would not stop eval "$(concourse-up info --env <deployment>)" from working. I agree with @DanielJonesEB that your problem sounds unrelated to this issue.


DanielJonesEB avatar DanielJonesEB commented on September 23, 2024

Hi, yes, that was our initial idea. However, we were unable to log in to the BOSH director due to an X509 error related to the bosh ca-cert. After looking at the bosh creds file in our bucket, we noticed it was still holding the old bosh director ca cert values.

I'm not sure why that would have happened, we'll keep an eye out for similar issues. For future reference, another way around this would have been to forcibly terminate the web VM through the AWS console and wait for BOSH's resurrector to recreate it.


sureshgoli81 avatar sureshgoli81 commented on September 23, 2024

After re-paving concourse-up, we are now seeing another strange error related to credhub.
We see the error below when logging in to credhub:
credhub api
Setting the target url: https://<domain name>:8844/
Error connecting to the targeted API: "Get https://<domain name>:8844/info: x509: certificate is not valid for any names, but wanted to match <<domain Name>>". Please validate your target and retry your request.
These are the steps I followed during and after the deployment of concourse-up:

concourse-up deploy apci --region eu-central-1 --domain <domain name> --workers 3 --web-size xlarge --db-size medium

Post-deployment steps:
Log in to Concourse to set up the pipeline: fly --target apci login --insecure --concourse-url https://<domain name> --username admin --password <<PWD>>
This login succeeds, and I am able to list workers as well.

Log in to credhub:
eval "$(concourse-up info --region eu-central-1 --iaas AWS --env apci)"
credhub api
This fails with the error above.
I am also unable to set parameters in credhub.


phynias avatar phynias commented on September 23, 2024

@crsimmons when I try a manual deploy I get the following:


  engine_version: "9.6.11" => "9.6.6"

Error: Error applying plan:

1 error(s) occurred:

* aws_db_instance.default: 1 error(s) occurred:

* aws_db_instance.default: Error modifying DB Instance terraform-20180817015841858400000002: InvalidParameterCombination: Cannot upgrade postgres from 9.6.11 to 9.6.6
	status code: 400, request id: 4a35af32-14ef-4ec4-8481-9a663d83ad73

Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.


DanielJonesEB avatar DanielJonesEB commented on September 23, 2024

Hi @phynias, we've got a fix for this incoming which is just awaiting PM acceptance. If you're feeling brave, you can build from commit 98030ba and use that to deploy. It's passed all system tests.


evadinckel avatar evadinckel commented on September 23, 2024

Hi everyone,

Thank you for reporting the issue to us.
This is to let you know that a patch release has just been published for this fix (@phynias)
https://github.com/EngineerBetter/concourse-up/releases/tag/0.20.1

Closing this thread now as the conversation on the other issue reported on this page has been carried out separately: [https://github.com//issues/97]

Best regards,
Eva

