Comments (27)

aveshagarwal commented on June 10, 2024

any more details, please expand, I would be interested in contributing

Let's say a pod is running on a node whose taints it does not tolerate. This can happen when the node's taints are updated after the pod was scheduled: a pod's tolerations are checked only at admission time, so the pod keeps running on the node after the taints change. This applies only to NoSchedule taints, since NoExecute taints already evict pods that do not tolerate them.

In summary, it would work as follows:

  1. Get the list of nodes (this already exists).
  2. Get the list of pods on each node (this already exists).
  3. Verify that the node's NoSchedule taints are still tolerated by each pod's tolerations.
  4. If a pod does not tolerate them, evict the pod.
  5. If it does, continue checking the other pods until you reach the end of the list of pods for that node.
  6. Repeat the above for all nodes.
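
For illustration, the steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the descheduler's actual implementation (which is written in Go): the Taint, Toleration, Pod and Node classes and the find_evictions function are hypothetical names made up for this example, and the nodes and pods are in-memory objects rather than results from the Kubernetes API. The matching rule is meant to mirror the usual Kubernetes toleration semantics (an empty effect or key on a toleration matches any effect or key; the Exists operator ignores the value).

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str              # "NoSchedule", "PreferNoSchedule" or "NoExecute"
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key matches every taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches every taint effect


@dataclass
class Pod:
    name: str
    tolerations: List[Toleration] = field(default_factory=list)


@dataclass
class Node:
    name: str
    taints: List[Taint] = field(default_factory=list)
    pods: List[Pod] = field(default_factory=list)


def tolerates(toleration: Toleration, taint: Taint) -> bool:
    """Return True if a single toleration matches a single taint."""
    if toleration.effect and toleration.effect != taint.effect:
        return False
    if toleration.key and toleration.key != taint.key:
        return False
    if toleration.operator == "Exists":
        return True
    return toleration.operator == "Equal" and toleration.value == taint.value


def find_evictions(nodes: List[Node]) -> List[Tuple[str, str]]:
    """Return (node name, pod name) pairs for pods that should be evicted."""
    evictions = []
    for node in nodes:                                   # steps 1 and 6
        no_schedule = [t for t in node.taints if t.effect == "NoSchedule"]
        for pod in node.pods:                            # steps 2 and 5
            tolerated = all(                             # step 3
                any(tolerates(tol, taint) for tol in pod.tolerations)
                for taint in no_schedule
            )
            if not tolerated:                            # step 4
                evictions.append((node.name, pod.name))
    return evictions


if __name__ == "__main__":
    dedicated = Taint(key="dedicated", value="gpu", effect="NoSchedule")
    node = Node(
        name="node-1",
        taints=[dedicated],
        pods=[
            Pod("web"),  # no tolerations, so it violates the taint
            Pod("trainer", [Toleration(key="dedicated", value="gpu")]),
        ],
    )
    print(find_evictions([node]))  # prints [('node-1', 'web')]
```

The sketch only reports which pods would be evicted; performing the eviction and honoring things like PodDisruptionBudgets is left out.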

from descheduler.

paktek123 commented on June 10, 2024

Are there any more details? Please expand, I would be interested in contributing.

pravarag commented on June 10, 2024

@seanmalloy @damemi it looks like this issue hasn't been picked up in a long time. Is it okay if I give it a try? I will post my questions here, if any.

/assign

pravarag commented on June 10, 2024

Thanks @damemi for clarifying, I guess this issue will be closed now 🙂

chadswen commented on June 10, 2024

@paktek123 Take a look at this issue for a use case currently implemented with Draino that could be replaced by descheduler: kubernetes/node-problem-detector#199 (comment)

paktek123 commented on June 10, 2024

Thanks for the explanation, I will hopefully try to contribute next week

warmchang commented on June 10, 2024

Nice idea! It is also very useful for customizing the migration time of pods.

fejta-bot commented on June 10, 2024

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot commented on June 10, 2024

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot commented on June 10, 2024

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot commented on June 10, 2024

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

damemi commented on June 10, 2024

/reopen

k8s-ci-robot commented on June 10, 2024

@damemi: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

fejta-bot commented on June 10, 2024

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot commented on June 10, 2024

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

isindir commented on June 10, 2024

/reopen

k8s-ci-robot commented on June 10, 2024

@isindir: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

seanmalloy commented on June 10, 2024

/reopen

k8s-ci-robot commented on June 10, 2024

@seanmalloy: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

seanmalloy commented on June 10, 2024

/remove-lifecycle rotten
/kind feature

StevenACoffman commented on June 10, 2024

Please do!

pravarag commented on June 10, 2024

@damemi @aveshagarwal one question about the description here: are we looking to add a new check in the API itself, somewhere like https://github.com/kubernetes-sigs/descheduler/tree/master/pkg/descheduler/node , or is this feature going to require a change to the command-line options as well, maybe in https://github.com/kubernetes-sigs/descheduler/tree/master/cmd/descheduler/app ?

damemi commented on June 10, 2024

I'm not actually sure why this issue is still open, I might have mistakenly reopened it when it went stale... we have a Taints/Tolerations strategy that was merged out of this issue (https://github.com/kubernetes-sigs/descheduler#removepodsviolatingnodetaints)

Is there something that strategy is missing from the discussion here? Or can we close this?

StevenACoffman commented on June 10, 2024

Just checking, but is it now possible to:

  1. Detect permanent node problems and set Node Conditions using the Node Problem Detector and the scheduler's TaintNodesByCondition functionality.
  2. Configure Descheduler to deschedule pods based on taints to cordon and drain nodes when they exhibit the NPD's KernelDeadlock condition, or a variant of KernelDeadlock we call VolumeTaskHung.
  3. Let the Cluster Autoscaler scale down underutilised nodes, including the nodes Descheduler has drained.

If so, is there an example?

damemi commented on June 10, 2024

@StevenACoffman yes, the descheduler will evict any pods that are currently running on a node that has any NoSchedule taint that the pods do not tolerate. So, your use case should work with the RemovePodsViolatingNodeTaints strategy

damemi commented on June 10, 2024

/close

k8s-ci-robot commented on June 10, 2024

@damemi: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
