Comments (7)
Thanks for the info! This should be enough for me to reproduce and resolve the issue. I should be able to start working on it today or tomorrow, and hopefully have a fix soon after.
from kubergrunt.
Thanks for reporting this bug. Can you share:
- Are you using the Kubernetes in-tree controller (type nlb) or the AWS LoadBalancer controller (type external) for the LoadBalancer Service?
- Are you using instance mode (default) or ip mode?
- Do you have more than one LoadBalancer Service using NLBs?
Once I have this information, I'll start working on it to repro and investigate a fix.
from kubergrunt.
Thanks for the quick follow up.
Are you using the kubernetes in-tree controller (type nlb) or the AWS LoadBalancer controller (type external) for the LoadBalancer Service?
Internal NLB
Are you using instance mode (default) or ip mode?
Default
Do you have more than one LoadBalancer Service using NLBs?
Yes
from kubergrunt.
FYI, I tried manually increasing the desired EC2 count and draining the node, but ran into two issues. Could you please share your insight on these?
- We have PVCs for some of our apps, and scheduling throws an error saying the volume affinity is not matched, since we do not control the AZ when new instances are created by the autoscaler. Ex: the new node is assigned to us-east-1b, but the drained node and its PVC are in us-east-1a.
- Some of the pods won't delete due to a pod disruption budget.
from kubergrunt.
We have PVCs for some of our apps, and scheduling throws an error saying the volume affinity is not matched, since we do not control the AZ when new instances are created by the autoscaler. Ex: the new node is assigned to us-east-1b, but the drained node and its PVC are in us-east-1a.
Ah, this is unfortunately a known issue in the k8s community when using EBS-based PVs with ASGs. You need to make sure each ASG is isolated to a single AZ to resolve this. See the note in cluster-autoscaler that highlights this (pasted below for convenience):
If you're using Persistent Volumes, your deployment needs to run in the same AZ as where the EBS volume is, otherwise the pod scheduling could fail if it is scheduled in a different AZ and cannot find the EBS volume. To overcome this, either use a single AZ ASG for this use case, or an ASG-per-AZ while enabling --balance-similar-node-groups. Alternately, and depending on your use-case, you might be able to switch from using EBS to using shared storage that is available across AZs (for each pod in its respective AZ). Consider AWS services like Amazon EFS or Amazon FSx for Lustre.
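To make the AZ constraint concrete, here is a minimal Python sketch of the check described above. It is only an illustration: the function names, node names, and ASG names are hypothetical, and the zone data is passed in as plain dictionaries rather than read from AWS or the Kubernetes API.

```python
def schedulable_nodes(volume_zone, node_zones):
    """Return the nodes that sit in the same AZ as an EBS-backed volume.

    node_zones maps a (hypothetical) node name to its availability zone.
    A pod bound to the volume can only run on one of the returned nodes.
    """
    return [name for name, zone in node_zones.items() if zone == volume_zone]


def multi_az_groups(group_zones):
    """Return the names of ASGs that span more than one AZ.

    group_zones maps a (hypothetical) ASG name to the list of AZs it may
    launch instances in. Groups returned here can place a replacement node
    in a zone where the volume does not exist.
    """
    return sorted(name for name, zones in group_zones.items() if len(set(zones)) > 1)


# Example mirroring the report: the PVC lives in us-east-1a, the new node does not.
nodes = {"old-node": "us-east-1a", "new-node": "us-east-1b"}
print(schedulable_nodes("us-east-1a", nodes))

groups = {"workers": ["us-east-1a", "us-east-1b"], "workers-1a": ["us-east-1a"]}
print(multi_az_groups(groups))
```

In the reported scenario only the old node matches the volume's zone, which is why the pod cannot be rescheduled once that node is drained and the replacement lands in us-east-1b.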
from kubergrunt.
Some of the pods won't delete due to a pod disruption budget.
Is the PDB set above or equal to the number of replicas in the deployment? If so, then yes, there is no way to drain the nodes. You need to either add more replicas to the deployment so that it is above the PDB, or decrease the PDB.
If not, then you can usually resolve this by doing a rolling update (draining 1 node at a time for small clusters, or N nodes for larger clusters, where N is less than the number of nodes currently active).
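The arithmetic behind both points can be sketched in a few lines of Python. This is a simplified illustration, assuming the PDB is expressed as an absolute minimum number of available pods; the function names and node names are hypothetical and nothing here talks to a real cluster.

```python
def allowed_disruptions(healthy_pods, min_available):
    """Number of pods that may be evicted right now without violating the PDB."""
    return max(healthy_pods - min_available, 0)


def can_evict(healthy_pods, min_available):
    """True if at least one pod can be evicted, so a drain can make progress."""
    return allowed_disruptions(healthy_pods, min_available) > 0


def drain_batches(nodes, batch_size):
    """Split nodes into consecutive batches for a rolling drain."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [nodes[i:i + batch_size] for i in range(0, len(nodes), batch_size)]


# PDB equal to the replica count: no eviction is ever allowed.
print(can_evict(healthy_pods=3, min_available=3))

# One extra replica leaves room to evict a single pod at a time.
print(can_evict(healthy_pods=4, min_available=3))

# Rolling drain of three hypothetical nodes, two at a time.
print(drain_batches(["node-a", "node-b", "node-c"], 2))
```

With the PDB equal to the replica count the allowed disruptions are zero, which matches the stuck drain described above; adding a replica or lowering the PDB makes the value positive.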
from kubergrunt.
This should be fixed in https://github.com/gruntwork-io/kubergrunt/releases/tag/v0.7.5 (binaries should show up in 15-30 mins).
from kubergrunt.
Related Issues (20)
- Kugergrunt using the wrong version for EKS 1.18 HOT 2
- No --version flag when building from source HOT 1
- EKS deploy error - LoadBalancer hostname is in an unexpected format HOT 1
- New VPC CNI version is available HOT 1
- --delete-local-data is deprecated; we should switch to --delete-emptydir-data HOT 1
- Configurable rolling deployment HOT 1
- cleanup-security-group misses elb security groups
- New recommended VPC CNI version HOT 1
- Support sync command with EKS 1.23
- missing ENABLE_IPv4 and ENABLE_IPv6 variable for cni version 1.10
- Support sync command with EKS 1.24
- Upgrade Go and dependencies
- Allow kubergrunt to continue even when it can't verify load balancer state.
- Add debug logs if kubergrunt cannot access Kubernetes cluster HOT 1
- Add support for 1.25 and drop support of 1.21
- Document a list of kubergrunt versions compatible with k8s versions
- Support for new regions: ECR repo HOT 1
- Update dependencies in kubergrunt
- eks 1.26 support for kubergrunt