Comments (11)
Thanks to @pieterlange for the PR to fix this problem on master. That change should be built/published now, so this issue should be fixed.
Ideally the master/edge/latest mutable tags wouldn't be the way these images are consumed, but I think discussion of these tags belongs in a different issue.
Thanks for reporting the issue
from awscli.
Thanks for pointing this out @mumoshu. We pinned versions and set up an internal Docker registry after quay.io was down, and now we fetch new versions from quay or gcr to stay in sync.
So, I know it's bad to depend on "latest" / "master" unversioned image tags, but this has been kube-aws's dependency since its inception and may potentially hit every kube-aws deployed cluster. Can we restore the previous behaviour on master for now, @euank, and give users some time to update their clusters?
Maybe @ankon, but I suggest changing it to 'off' to be sure. I had an old 0.9.3 etcd reboot even with the 'etcd-lock' errors. It appeared to be for updates, but maybe it was for a different reason.
I noticed on quay.io/coreos/awscli that the 'master' image now has a 'CMD' in the Dockerfile, but I cannot see where this actually comes from.
The CMD /bin/sh comes from this change in alpine: gliderlabs/docker-alpine#199
The FROM alpine is implicitly including that if a new enough alpine image is referenced.
We can fix this by explicitly overriding cmd to be /bin/sh -c to match the previous behavior.
It can also be worked around by using either of the following commands:
/usr/bin/rkt run ..... quay.io/coreos/awscli:latest --exec=aws -- s3 ....
# or
/usr/bin/rkt run ..... quay.io/coreos/awscli:latest --exec=sh -- -c "aws s3 ...."
For now I worked around this by using :latest, which right now seems to be "previous". It would be great to have versions on this repository, so that one can point to the specific images.
Right now I need to recover my complete Kubernetes cluster, as the attempt to fix this problem ("cannot bring up new nodes") created a major mess when some other critical commands failed. I think it would be best to restore the previous behavior for now, on the assumption that this may have affected more people, and because the change is effectively an "API change" for awscli:master, if you will :)
I'm in the same boat with an old cluster that needs upgrading, but it looks like the current kube-aws behaviour would also trigger this.
cc'ing @Fsero @danielfm @c-knowles @camilb @whereisaaron
In case your nodes start failing while fetching cloud-configs from S3, this issue should help. Set awsCliImage.tag to something older, or override the aws command in cloud-config-*, as @ankon suggested.
Thanks @mumoshu. We should warn kube-aws users to check their etcd nodes ASAP. Versions of kube-aws from at least 0.9.3 0.9.5-rc.5 and earlier deployed etcd with the etcd-lock reboot strategy. Given this issue, it appears the next automatic update reboot could bring down everyone's k8s etcd cluster without notice? The current 0.9.5 and later versions appear safe ('reboot-strategy: "off"' by default). Not sure about 0.9.4/5/6/7/8?
It looks like this was fixed on March 20 for version v0.9.5-rc.5 and later, changing the etcd node strategy to 'off'. So any cluster older than ~9 months is ripe for sudden etcd failure.
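For anyone who wants to check their etcd nodes, here is a minimal sketch. It assumes the node runs CoreOS Container Linux and that the strategy set through cloud-config ends up as a REBOOT_STRATEGY= line in /etc/coreos/update.conf. The helper name reboot_strategy is made up for this example, and an empty result only means that no explicit strategy was found in that file.

```shell
# Hypothetical helper (not part of kube-aws or locksmith): print the reboot
# strategy recorded in a CoreOS update.conf file.
# Assumption: the file contains a line of the form REBOOT_STRATEGY=<value>.
reboot_strategy() {
  conf="${1:-/etc/coreos/update.conf}"
  # Print the value of the last REBOOT_STRATEGY= line, if there is one.
  sed -n 's/^REBOOT_STRATEGY=//p' "$conf" 2>/dev/null | tail -n 1
}
```

Running reboot_strategy on each etcd node should be enough for a first look; given this issue, anything other than off probably deserves closer attention.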
@ankon sorry you got to be the canary!
This is obviously a hard lesson in using versioned tags! To be fair though, coreos/awscli doesn't actually have any tagged versions. Only master, latest, and edge (or is that big hex string a 'version'?). Could we now tag the old ('latest') version and this new version with a versioned tag?
Given the purpose of this container is purely to provide aws, could we tag the old and new versions with the awscli version it contains? Then current and future clusters can be tagged to one of these versions, and we won't have this 'cluster apocalypse' again with the next breaking change :-).
Versions of kube-aws from at least 0.9.7 and earlier deployed etcd with the etcd-lock reboot strategy.
Just had a minor heart-attack, that was a good pointer to check!
At least on my (horribly old kube-aws 0.9.4-ish) cluster it seems that this etcd is indeed configured to use the 'etcd-lock' reboot strategy, BUT locksmithd complains about a missing "reboot window", and so the reboot doesn't actually happen. I'm in the process of replacing this cluster, so right now things seem to be ok ... for me.
Given the purpose of this container is purely to provide aws could we tag the old and new versions with the awscli version it contains?
This seems reasonable to me; there doesn't seem to be much point in using an arbitrary AWS CLI version when the actual needed parts are known in advance (copy stuff from S3, something else? :D).