
Comments (7)

yorinasub17 commented on May 28, 2024

Thanks for the info! This should be enough for me to repro and resolve the issue. I should be able to start working on it today or tomorrow, and hopefully have a fix soon after.

from kubergrunt.

yorinasub17 commented on May 28, 2024

Thanks for reporting this bug. Can you share:

  • Are you using the Kubernetes in-tree controller (type nlb) or the AWS LoadBalancer controller (type external) for the LoadBalancer Service?
  • Are you using instance mode (default) or ip mode?
  • Do you have more than one LoadBalancer Service using NLBs?

Once I have this information, I'll start working on it to repro and investigate a fix.


mraslam commented on May 28, 2024

@yorinasub17

Thanks for the quick follow-up.

Q: Are you using the Kubernetes in-tree controller (type nlb) or the AWS LoadBalancer controller (type external) for the LoadBalancer Service?
A: Internal NLB

Q: Are you using instance mode (default) or ip mode?
A: Default

Q: Do you have more than one LoadBalancer Service using NLBs?
A: Yes


mraslam commented on May 28, 2024

FYI, I tried manually increasing the desired EC2 instance count and draining the node, but ran into two issues. Could you please share your insight on these?

  1. We have PVCs for some of our apps, and they throw an error saying volume affinity is not matched, since we do not control the AZ when new instances are created by the autoscaler. Example: the new node is assigned to us-east-1b, but the drained node and its PVC are in us-east-1a.
  2. Some of the pods won't delete due to a pod disruption budget.


yorinasub17 commented on May 28, 2024

"We have PVCs for some of our apps, and they throw an error saying volume affinity is not matched, since we do not control the AZ when new instances are created by the autoscaler. Example: the new node is assigned to us-east-1b, but the drained node and its PVC are in us-east-1a."

Ah, this is unfortunately a known issue in the k8s community when using EBS-based PVs with ASGs. To resolve it, you need to make sure each ASG is isolated to a single AZ. See the note in cluster-autoscaler that highlights this (pasted below for convenience):

If you’re using Persistent Volumes, your deployment needs to run in the same AZ as where the EBS volume is, otherwise the pod scheduling could fail if it is scheduled in a different AZ and cannot find the EBS volume. To overcome this, either use a single AZ ASG for this use case, or an ASG-per-AZ while enabling --balance-similar-node-groups. Alternately, and depending on your use-case, you might be able to switch from using EBS to using shared storage that is available across AZs (for each pod in its respective AZ). Consider AWS services like Amazon EFS or Amazon FSx for Lustre.
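As a rough illustration of why the pod cannot be rescheduled, here is a minimal Python sketch of the zone constraint described in the note. It is a simplified model, not the Kubernetes scheduler: the `Node` class and `schedulable_nodes` function are hypothetical names invented for this example, and the zones are the ones from the report above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a worker node and its availability zone."""
    name: str
    zone: str


def schedulable_nodes(volume_zone: str, nodes: list[Node]) -> list[Node]:
    """Return the nodes in the same zone as the EBS-backed volume.

    An EBS volume can only be attached to instances in its own AZ, so a pod
    that mounts it can only land on nodes in that zone.
    """
    return [node for node in nodes if node.zone == volume_zone]


# The drained node was in us-east-1a; the ASG launched its replacement in us-east-1b.
remaining = [Node("new-node", "us-east-1b")]

# The volume lives in us-east-1a, so no remaining node can host the pod.
print(schedulable_nodes("us-east-1a", remaining))  # prints []
```

With a single-AZ ASG (or one ASG per AZ), the replacement node would come up in us-east-1a and the list would no longer be empty.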


yorinasub17 commented on May 28, 2024

"Some of the pods won't delete due to a pod disruption budget."

Is the PDB set at or above the number of replicas in the deployment? If so, then there is no way to drain the nodes. You need to either add replicas to the deployment so that the count exceeds the PDB, or lower the PDB.

If not, then usually you can resolve this by doing a rolling update (draining 1 node at a time for small clusters, or N nodes for larger clusters where N is less than the number of nodes currently active).
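The arithmetic behind this can be sketched in a few lines of Python. This is a simplified model that assumes the PDB is expressed as an integer minimum number of available pods; budgets can also be written as a maximum unavailable count or as percentages, which this sketch does not cover. The function names are hypothetical and chosen only for illustration.

```python
def allowed_disruptions(healthy_pods: int, min_available: int) -> int:
    """Number of pods that may be evicted without violating the budget."""
    return max(0, healthy_pods - min_available)


def drain_is_blocked(healthy_pods: int, min_available: int) -> bool:
    """A drain stalls on this workload when no eviction is permitted."""
    return allowed_disruptions(healthy_pods, min_available) == 0


# Budget equal to the replica count: no pod can ever be evicted.
print(allowed_disruptions(3, 3))  # prints 0
print(drain_is_blocked(3, 3))     # prints True

# One replica above the budget: pods can be evicted one at a time.
print(allowed_disruptions(3, 2))  # prints 1
print(drain_is_blocked(3, 2))     # prints False
```

When the result is 1, draining one node at a time gives each evicted pod a chance to become healthy elsewhere before the next eviction, which is why the rolling approach works.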


yorinasub17 commented on May 28, 2024

This should be fixed in https://github.com/gruntwork-io/kubergrunt/releases/tag/v0.7.5 (binaries should show up in 15-30 mins).

