Comments (15)
It looks like terraform is trying to downgrade the RDS Postgres version from 9.6.11 to 9.6.6. I'm not sure why you would be running 9.6.11 because we pin to 9.6.6 in our terraform (and have done since March 2018).
What version of Concourse-up are you using?
from concourse-up.
I've done a bit more research and have found that one of our deployments is using 9.6.11. I haven't figured out why or how yet though.
from concourse-up.
It seems that RDS will do a version upgrade in the maintenance window unless you tell it not to. In terraform we aren't specifying the option but it defaults to true. I think we've deployed against our 9.6.11 deployment before so I'm not sure why your terraform is breaking in this way. Does it work if you manually run concourse-up deploy <your deployment>?
from concourse-up.
Hi, today we started updating concourse-up and hit the same issue as described above.
BTW, AWS has stopped supporting 9.6.6, and the current minor version for PostgreSQL is 9.6.11.
I would suggest keeping engine_version as 9.6 in the terraform. auto_minor_version_upgrade is true by default unless it is explicitly set to false.
As of now our concourse-up deployment has become unstable due to the upgrade failure.
from concourse-up.
@sureshgoli81 Thanks for reporting - we're looking into this currently.
Can you please explain exactly what you mean by "As of now our concourse-up deployment has become unstable due to the upgrade failure"?
Is your Concourse still running, and can builds execute? Does the deployment fail?
from concourse-up.
Hi, no, Concourse is not running, because our build failed with the error "failed to obtain lock on concourse deployment". When we tried again with concourse-up deploy <project name> we got a terraform error related to engine_version.
from concourse-up.
Thanks @sureshgoli81.
This should be a recoverable situation. "failed to obtain lock on concourse deployment" is a BOSH error, whereby there was one deployment ongoing on the concourse deployment (which the renew-certs job will have started), and presumably another deployment was attempted.
If you have BOSH CLI access, I'd recommend doing a bosh vms -d concourse, seeing if any are unhappy, and considering a bosh restart on the web VM. Then pause the renew-certs pipeline until we get the issue fixed.
from concourse-up.
Hi, yes, that was our initial idea. However, we couldn't log in to the bosh director due to an X509 error related to the bosh ca-cert. After looking at the bosh creds file in our bucket, we noticed it was still holding the old bosh director ca cert values.
We have now destroyed the setup and are recreating Concourse.
from concourse-up.
Just noticed while provisioning PostgreSQL in AWS RDS: it allows creating the DB with 9.6.6. Since auto_minor_version_upgrade is true by default, RDS gets updated to 9.6.11 by the second run of concourse-up deploy, and then on the third run of concourse-up deploy we started getting the engine version error while terraform checked the provisioned resource. So there are two options: either set auto_minor_version_upgrade to false, or specify only 9.6 in engine_version.
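
The second option above could look roughly like the following Terraform fragment. This is a minimal sketch, not the project's actual configuration: the resource name aws_db_instance.default is taken from the terraform output quoted elsewhere in this thread, the other required arguments are omitted, and it assumes the AWS provider version in use accepts a version prefix for engine_version when auto_minor_version_upgrade is enabled.

```hcl
# Sketch only: a fragment, not a complete resource definition.
resource "aws_db_instance" "default" {
  engine = "postgres"

  # Assumption: a major.minor prefix is accepted here, so Terraform does not
  # try to move the instance back to a specific patch release after RDS
  # has upgraded it in a maintenance window.
  engine_version = "9.6"

  # true is the default; stated explicitly here for clarity.
  auto_minor_version_upgrade = true

  # ... other required arguments (instance class, storage, credentials) omitted
}
```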
from concourse-up.
My understanding is that the auto_minor_version_upgrade option being true means that RDS will update your engine in the instance's maintenance window when Amazon detects there is a meaningful minor update. I would expect the first deploy to set 9.6.6, then every subsequent deploy until your first maintenance window will work. After RDS does maintenance and updates your instance, you will see the terraform error.
We're currently considering bumping the engine version to 9.6.11 and setting auto_minor_version_upgrade to false so this doesn't happen again. I've put a separate story in our backlog for increasing the hardcoded version when new versions become available.
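
As a rough illustration of the change being considered, a Terraform fragment along these lines would pin the version and turn off automatic minor upgrades. It is a sketch under assumptions, not the actual concourse-up source: only engine_version and auto_minor_version_upgrade come from this discussion, the resource name is the one shown in the terraform output in this thread, and every other argument is omitted.

```hcl
# Sketch only: a fragment, not a complete resource definition.
resource "aws_db_instance" "default" {
  engine = "postgres"

  # Pin to the version RDS is currently running, so the plan shows no change.
  engine_version = "9.6.11"

  # Stop RDS from moving the instance to a newer minor version on its own,
  # which would otherwise drift away from the pinned value again.
  auto_minor_version_upgrade = false

  # ... other required arguments (instance class, storage, credentials) omitted
}
```

With automatic upgrades off, the pinned value has to be raised by hand whenever a newer minor version is wanted, which is what the backlog story is for.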
I wouldn't expect this terraform issue to negatively impact BOSH. In Concourse-Up the terraform runs first and therefore errors out before anything happens with BOSH. I'm not sure why you would have different bosh director ca cert values. If they are being updated then terraform failing would not stop eval "$(concourse-up info --env <deployment>)" from working. I agree with @DanielJonesEB that your problem sounds unrelated to this issue.
from concourse-up.
"Hi, yes, that was our initial idea. However, we couldn't log in to the bosh director due to an X509 error related to the bosh ca-cert. After looking at the bosh creds file in our bucket, we noticed it was still holding the old bosh director ca cert values."
I'm not sure why that would have happened; we'll keep an eye out for similar issues. For future reference, another way around this would have been to forcibly terminate the web VM through the AWS console and wait for BOSH's resurrector to recreate it.
from concourse-up.
After re-paving concourse-up, we are now seeing another strange error related to credhub.
We see the error below when logging in to credhub:
credhub api
Setting the target url: https://<domain name>:8844/
Error connecting to the targeted API: "Get https://<domain name>:8844/info: x509: certificate is not valid for any names, but wanted to match <<domain Name>>". Please validate your target and retry your request.
These are the steps I followed during and after the concourse-up deployment:
concourse-up deploy apci --region eu-central-1 --domain <domain name> --workers 3 --web-size xlarge --db-size medium
Post-deployment steps:
Log in to Concourse to set up the pipeline: fly --target apci login --insecure --concourse-url https://<domain name> --username admin --password <<PWD>>
Here the login succeeds and I am able to list workers as well.
Log in to credhub:
eval "$(concourse-up info --region eu-central-1 --iaas AWS --env apci)"
credhub api
This fails with the error shown above.
I am also unable to set parameters in credhub.
from concourse-up.
@crsimmons when I try to do a manual deploy I get the following:
engine_version: "9.6.11" => "9.6.6"
Error: Error applying plan:
1 error(s) occurred:
* aws_db_instance.default: 1 error(s) occurred:
* aws_db_instance.default: Error modifying DB Instance terraform-20180817015841858400000002: InvalidParameterCombination: Cannot upgrade postgres from 9.6.11 to 9.6.6
status code: 400, request id: 4a35af32-14ef-4ec4-8481-9a663d83ad73
Terraform does not automatically rollback in the face of errors.
Instead, your Terraform state file has been partially updated with
any resources that successfully completed. Please address the error
above and apply again to incrementally change your infrastructure.
from concourse-up.
Hi @phynias, we've got a fix for this incoming which is just awaiting PM acceptance. If you're feeling brave, you can build from commit 98030ba and use that to deploy. It's passed all system tests.
from concourse-up.
Hi everyone,
Thank you for reporting the issue to us.
This is to let you know that a patch release has just been published for this fix (@phynias)
https://github.com/EngineerBetter/concourse-up/releases/tag/0.20.1
Closing this thread now as the conversation on the other issue reported on this page has been carried out separately: [https://github.com//issues/97]
Best regards,
Eva
from concourse-up.