
gpu-admission's Introduction

GPU admission

gpu-admission is a Kubernetes scheduler extender for GPU admission. It provides the following features:

  • provides quota limits according to GPU device type
  • avoids fragmented allocation on a node by working with gpu-manager (see the example pod spec below)
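
Below is a minimal pod spec sketch illustrating the kind of workload gpu-admission schedules. The tencent.com/vcuda-core and tencent.com/vcuda-memory resource names come from gpu-manager; the pod name, image, and values here are only examples (100 vcuda-core corresponds to one physical GPU).

apiVersion: v1
kind: Pod
metadata:
  name: vcuda-example                  # hypothetical name
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:10.1-devel-centos7
    resources:
      requests:
        tencent.com/vcuda-core: "50"   # half of one physical GPU (100 = one GPU)
        tencent.com/vcuda-memory: "14"
      limits:                          # extended resources require limits == requests
        tencent.com/vcuda-core: "50"
        tencent.com/vcuda-memory: "14"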

For more details, please refer to the documents in the docs directory of this project.

1. Build

$ make build

2. Run

2.1 Run gpu-admission.

$ bin/gpu-admission --address=127.0.0.1:3456 --v=4 --kubeconfig <your kubeconfig> --logtostderr=true

Other options

      --address string                   The address it will listen (default "127.0.0.1:3456")
      --alsologtostderr                  log to standard error as well as files
      --kubeconfig string                Path to a kubeconfig. Only required if out-of-cluster.
      --log-backtrace-at traceLocation   when logging hits line file:N, emit a stack trace (default :0)
      --log-dir string                   If non-empty, write log files in this directory
      --log-flush-frequency duration     Maximum number of seconds between log flushes (default 5s)
      --logtostderr                      log to standard error instead of files (default true)
      --master string                    The address of the Kubernetes API server. Overrides any value in kubeconfig. Only required if out-of-cluster.
      --pprofAddress string              The address for debug (default "127.0.0.1:3457")
      --stderrthreshold severity         logs at or above this threshold go to stderr (default 2)
  -v, --v Level                          number for the log level verbosity
      --version version[=true]           Print version information and quit
      --vmodule moduleSpec               comma-separated list of pattern=N settings for file-filtered logging

2.2 Configure the kube-scheduler policy file and run your Kubernetes cluster.

Example for scheduler-policy-config.json:

{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {
      "name": "PodFitsHostPorts"
    },
    {
      "name": "PodFitsResources"
    },
    {
      "name": "NoDiskConflict"
    },
    {
      "name": "MatchNodeSelector"
    },
    {
      "name": "HostName"
    }
  ],
  "extenders": [
    {
      "urlPrefix": "http://<gpu-admission ip>:<gpu-admission port>/scheduler",
      "apiVersion": "v1beta1",
      "filterVerb": "predicates",
      "enableHttps": false,
      "nodeCacheCapable": false
    }
  ],
  "hardPodAffinitySymmetricWeight": 10,
  "alwaysCheckAllPredicates": false
}

Do not forget to add the corresponding flags to kube-scheduler: --policy-config-file=XXX --use-legacy-policy-config=true, where XXX is the path to the policy file above. Keep this extender as the last of all scheduler extenders.
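
For example, when kube-scheduler is started directly from the command line, the extra flags might look like the following (the kubeconfig and policy file paths are assumptions; adjust them to your cluster):

$ kube-scheduler \
    --kubeconfig=/etc/kubernetes/scheduler.conf \
    --policy-config-file=/etc/kubernetes/scheduler-policy-config.json \
    --use-legacy-policy-config=true

If kube-scheduler runs as a static pod, the same flags typically go into its command list in /etc/kubernetes/manifests/kube-scheduler.yaml, and the policy file must be readable from inside that pod.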

gpu-admission's People

Contributors

genedna, lixiaocheng18, mymneo

gpu-admission's Issues

Scheduling failures on a node with 3 GPUs

I want to implement what is described in this paper: https://ieeexplore.ieee.org/abstract/document/8672318. After a successful deployment, scheduling works fine on a node with four GPUs, but on another machine with only three GPUs pods end up Pending: there are clearly enough resources, yet the scheduler says:

Events:
  Type     Reason            Age        From               Message
  ----     ------            ----       ----               -------
  Warning  FailedScheduling  <unknown>  default-scheduler  0/4 nodes are available: 1 Insufficient tencent.com/vcuda-core, 3 node(s) didn't match node selector.

Here is the output of kubectl describe node:

[two screenshots of the kubectl describe node output]

As you can see, only two of the GPUs have been allocated.

The node only has three GPUs because one card had a problem and was masked off. The scheduling failure can be worked around to some extent by forcibly restarting kube-scheduler; after a restart things are usually fine for a while, but the same kind of problem comes back later.

Running nvidia-smi topo -mp on the three-GPU machine gives:

	GPU0	GPU1	GPU2	CPU Affinity
GPU0	 X 	SYS	SYS	0-11
GPU1	SYS	 X 	PIX	0-11
GPU2	SYS	PIX	 X 	0-11

Legend:

  X    = Self
  SYS  = Connection traversing PCIe as well as the SMP interconnect between NUMA nodes (e.g., QPI/UPI)
  NODE = Connection traversing PCIe as well as the interconnect between PCIe Host Bridges within a NUMA node
  PHB  = Connection traversing PCIe as well as a PCIe Host Bridge (typically the CPU)
  PXB  = Connection traversing multiple PCIe bridges (without traversing the PCIe Host Bridge)
  PIX  = Connection traversing at most a single PCIe bridge
  • Kubernetes version: 1.17.3
  • NVIDIA driver version: 440.59

where docs?

I want to know more details but cannot find any docs. Where are the documents in this project?

pod xxx had been predicated!

I created a pod with vcuda and the pod is running successfully, but I noticed that the gpu-admission component has some error logs.

I0810 16:56:47.809528   11350 util.go:53] Determine if the pod tts1 needs GPU resource
I0810 16:56:47.809971   11350 gpu_predicate.go:379] Quota for namespace default is {Quota:map[] Pool:[]}
I0810 16:56:47.810000   11350 gpu_predicate.go:353] No GPU quota limit for default
I0810 16:56:47.810043   11350 nodeInfo.go:39] debug: NewNodeInfo() creates nodeInfo for turing-02-no.01.novalocal
I0810 16:56:47.810131   11350 util.go:71] Determine if the container nvidia needs GPU resource
I0810 16:56:47.810145   11350 share.go:58] Pick up 0 , cores: 100, memory: 29
I0810 16:56:47.815362   11350 routes.go:71] GPUQuotaPredicate: ExtenderArgs = {Pod:&Pod{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:tts1,GenerateName:,Namespace:default,SelfLink:/api/v1/namespaces/default/pods/tts1,UID:bdb78144-4ee9-4301-b071-9e31abb3cd41,ResourceVersion:674848,Generation:0,CreationTimestamp:2020-08-10 16:56:47 +0800 CST,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{tencent.com/vcuda-core-limit: 50,},OwnerReferences:[],Finalizers:[],ClusterName:,Initializers:nil,ManagedFields:[{kubectl Update v1 2020-08-10 16:56:47 +0800 CST nil}],},Spec:PodSpec{Volumes:[{test {HostPathVolumeSource{Path:/data/shengxu8/vcuda/xtts2.0.1029.P4.1,Type:*Directory,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {default-token-rl2vc {nil nil nil nil nil &SecretVolumeSource{SecretName:default-token-rl2vc,Items:[],DefaultMode:*420,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{nvidia nvidia/cuda:9.1-devel-centos7 [./start.sh] [] /tts [] [] [{LOGGER_LEVEL 5 nil}] {map[tencent.com/vcuda-core:{{50 0} {<nil>} 50 DecimalSI} tencent.com/vcuda-memory:{{14 0} {<nil>} 14 DecimalSI}] map[tencent.com/vcuda-core:{{50 0} {<nil>} 50 DecimalSI} tencent.com/vcuda-memory:{{14 0} {<nil>} 14 DecimalSI}]} [{test false /tts  <nil> } {default-token-rl2vc true /var/run/secrets/kubernetes.io/serviceaccount  <nil> }] [] nil nil nil /dev/termination-log File IfNotPresent nil false false false}],RestartPolicy:Never,TerminationGracePeriodSeconds:*30,ActiveDeadlineSeconds:nil,DNSPolicy:ClusterFirst,NodeSelector:map[string]string{},ServiceAccountName:default,DeprecatedServiceAccount:default,NodeName:,HostNetwork:true,HostPID:false,HostIPC:false,SecurityContext:&PodSecurityContext{SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,SupplementalGroups:[],FSGroup:nil,RunAsGroup:nil,Sysctls:[],WindowsOptions:nil,},ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:default-scheduler,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[{node.kubernetes.io/not-ready Exists  NoExecute 0xc000408480} {node.kubernetes.io/unreachable Exists  NoExecute 0xc0004084a0}],HostAliases:[],PriorityClassName:,Priority:*0,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,EnableServiceLinks:*true,PreemptionPolicy:nil,},Status:PodStatus{Phase:Pending,Conditions:[],Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[],QOSClass:BestEffort,InitContainerStatuses:[],NominatedNodeName:,},} Nodes:&NodeList{ListMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ListMeta{SelfLink:,ResourceVersion:,Continue:,RemainingItemCount:nil,},Items:[{{ } {turing-02-no.01.novalocal   /api/v1/nodes/turing-02-no.01.novalocal 93b77b65-b112-4d36-ab2c-0e34efe8d097 674660 0 2020-08-06 15:12:19 +0800 CST <nil> <nil> map[beta.kubernetes.io/arch:amd64 beta.kubernetes.io/os:linux kubernetes.io/arch:amd64 kubernetes.io/hostname:turing-02-no.01.novalocal kubernetes.io/os:linux node-role.kubernetes.io/master: nvidia-device-enable:enable] map[kubeadm.alpha.kubernetes.io/cri-socket:/var/run/dockershim.sock node.alpha.kubernetes.io/ttl:0 projectcalico.org/IPv4Address:172.31.236.28/24 projectcalico.org/IPv4IPIPTunnelAddr:100.77.205.64 volumes.kubernetes.io/controller-managed-attach-detach:true] [] nil []  [{kubeadm Update v1 2020-08-06 15:12:22 +0800 CST nil} {kubectl Update v1 2020-08-06 15:31:43 +0800 CST nil} 
{calico-node Update v1 2020-08-07 10:17:19 +0800 CST nil} {kube-controller-manager Update v1 2020-08-10 16:54:34 +0800 CST nil} {kubelet Update v1 2020-08-10 16:55:04 +0800 CST nil}]} {100.64.0.0/24  false [] nil } {map[cpu:{{32 0} {<nil>} 32 DecimalSI} ephemeral-storage:{{53465821184 0} {<nil>} 52212716Ki BinarySI} hugepages-1Gi:{{0 0} {<nil>} 0 DecimalSI} hugepages-2Mi:{{0 0} {<nil>} 0 DecimalSI} memory:{{270308057088 0} {<nil>}  BinarySI} pods:{{110 0} {<nil>} 110 DecimalSI} tencent.com/vcuda-core:{{100 0} {<nil>} 100 DecimalSI} tencent.com/vcuda-memory:{{29 0} {<nil>} 29 DecimalSI}] map[cpu:{{32 0} {<nil>} 32 DecimalSI} ephemeral-storage:{{52392079360 0} {<nil>} 51164140Ki BinarySI} hugepages-1Gi:{{0 0} {<nil>} 0 DecimalSI} hugepages-2Mi:{{0 0} {<nil>} 0 DecimalSI} memory:{{270308057088 0} {<nil>}  BinarySI} pods:{{110 0} {<nil>} 110 DecimalSI} tencent.com/vcuda-core:{{100 0} {<nil>} 100 DecimalSI} tencent.com/vcuda-memory:{{29 0} {<nil>} 29 DecimalSI}]  [{NetworkUnavailable False 2020-08-07 10:17:19 +0800 CST 2020-08-07 10:17:19 +0800 CST CalicoIsUp Calico is running on this node} {MemoryPressure False 2020-08-10 16:55:04 +0800 CST 2020-08-06 15:36:32 +0800 CST KubeletHasSufficientMemory kubelet has sufficient memory available} {DiskPressure False 2020-08-10 16:55:04 +0800 CST 2020-08-06 15:36:32 +0800 CST KubeletHasNoDiskPressure kubelet has no disk pressure} {PIDPressure False 2020-08-10 16:55:04 +0800 CST 2020-08-06 15:36:32 +0800 CST KubeletHasSufficientPID kubelet has sufficient PID available} {Ready True 2020-08-10 16:55:04 +0800 CST 2020-08-10 16:54:35 +0800 CST KubeletReady kubelet is posting ready status}] [{InternalIP 172.31.236.28} {Hostname turing-02-no.01.novalocal}] {{10250}} {da54aa58da54aa58da54aa58da54aa58 587DF4B2-1B95-4717-9775-F696ED79D950 06b1a58b-09b9-4068-9777-7dce4cfea189 3.10.0-957.el7.x86_64 CentOS Linux 7 (Core) docker://18.9.6 v1.18.0 v1.18.0 linux amd64} [{[hub.iflytek.com/turing/ssd_face@sha256:758fda54e4b87e1790ccbc93d842fd077e6fa09c04b7277b20d66a3f82405a13 hub.iflytek.com/turing/ssd_face:0.1] 5654144739} {[hub.iflytek.com/turing/gaea:1.0.4] 5636811356} {[am_train:1.0] 4440398750} {[honk:1.0] 4107130912} {[172.16.59.153/dlaas/pytorch-py36-cuda9@sha256:1cecf1e8b8f75e16e5514260190e677eb644c0c3d21630b75c0db5e1e5f02521 172.16.59.153/dlaas/pytorch-py36-cuda9:1.0.0] 3789072257} {[nvidia/cuda:10.1-cudnn7-devel-centos7] 3519393408} {[reg.deeplearning.cn/dlaas/cv_dist_openmpi:0.1] 3498923930} {[registry.turing.com:5000/dlaas/pytorch@sha256:1f79d91afb716ef6da8f167829c5fd101959b284028435b92fa3778719a0439b registry.turing.com:5000/dlaas/pytorch:1.5-cuda10.1-cudnn7-runtime] 3140164483} {[registry.turing.com:5000/dlaas/commonjob_centos7.2.1511@sha256:ad1e7f41fb15e4b400e0f4492b56c3b46ef077d7fa110e7515d4446f045eaa91 registry.turing.com:5000/dlaas/commonjob_centos7.2.1511:1.0.4] 3086599556} {[nvidia/cuda:10.1-devel-centos7] 2686646338} {[nvidia/cuda@sha256:86899043c7c1182046cdf9b89ce94ccf35c584b33875ba78316acd883bc6faf8 nvidia/cuda:9.1-devel-centos7] 1834703805} {[tkestack/gpu-manager:1.1.0] 437779384} {[k8s.gcr.io/etcd:3.4.3-0] 288425539} {[centos@sha256:62d9e1c2daa91166139b51577fe4f4f6b4cc41a3a2c7fc36bd895e2a17a3e4e6 centos:7.6.1810] 201756323} {[calico/node:v3.8.2] 188832890} {[k8s.gcr.io/kube-apiserver:v1.18.0] 172962942} {[k8s.gcr.io/kube-controller-manager:v1.18.0] 162366590} {[calico/cni@sha256:4922215c127c18b00c8f5916997259589577c132260181a2c50a093a78564c90 calico/cni:v3.8.2] 157232722} {[k8s.gcr.io/kube-proxy:v1.18.0] 116531001} 
{[k8s.gcr.io/kube-scheduler:v1.18.0] 95274110} {[calico/kube-controllers:v3.8.2] 46809393} {[k8s.gcr.io/coredns:1.6.7] 43781501} {[calico/pod2daemon-flexvol:v3.8.2] 9366797} {[k8s.gcr.io/pause:3.2] 682696}] [] [] nil}}],} NodeNames:<nil>}
I0810 16:56:47.816226   11350 routes.go:81] GPUQuotaPredicate: extenderFilterResult = {"Nodes":{"metadata":{},"items":[{"metadata":{"name":"turing-02-no.01.novalocal","selfLink":"/api/v1/nodes/turing-02-no.01.novalocal","uid":"93b77b65-b112-4d36-ab2c-0e34efe8d097","resourceVersion":"674660","creationTimestamp":"2020-08-06T07:12:19Z","labels":{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"turing-02-no.01.novalocal","kubernetes.io/os":"linux","node-role.kubernetes.io/master":"","nvidia-device-enable":"enable"},"annotations":{"kubeadm.alpha.kubernetes.io/cri-socket":"/var/run/dockershim.sock","node.alpha.kubernetes.io/ttl":"0","projectcalico.org/IPv4Address":"172.31.236.28/24","projectcalico.org/IPv4IPIPTunnelAddr":"100.77.205.64","volumes.kubernetes.io/controller-managed-attach-detach":"true"},"managedFields":[{"manager":"kubeadm","operation":"Update","apiVersion":"v1","time":"2020-08-06T07:12:22Z"},{"manager":"kubectl","operation":"Update","apiVersion":"v1","time":"2020-08-06T07:31:43Z"},{"manager":"calico-node","operation":"Update","apiVersion":"v1","time":"2020-08-07T02:17:19Z"},{"manager":"kube-controller-manager","operation":"Update","apiVersion":"v1","time":"2020-08-10T08:54:34Z"},{"manager":"kubelet","operation":"Update","apiVersion":"v1","time":"2020-08-10T08:55:04Z"}]},"spec":{"podCIDR":"100.64.0.0/24"},"status":{"capacity":{"cpu":"32","ephemeral-storage":"52212716Ki","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"263972712Ki","pods":"110","tencent.com/vcuda-core":"100","tencent.com/vcuda-memory":"29"},"allocatable":{"cpu":"32","ephemeral-storage":"51164140Ki","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"263972712Ki","pods":"110","tencent.com/vcuda-core":"100","tencent.com/vcuda-memory":"29"},"conditions":[{"type":"NetworkUnavailable","status":"False","lastHeartbeatTime":"2020-08-07T02:17:19Z","lastTransitionTime":"2020-08-07T02:17:19Z","reason":"CalicoIsUp","message":"Calico is running on this node"},{"type":"MemoryPressure","status":"False","lastHeartbeatTime":"2020-08-10T08:55:04Z","lastTransitionTime":"2020-08-06T07:36:32Z","reason":"KubeletHasSufficientMemory","message":"kubelet has sufficient memory available"},{"type":"DiskPressure","status":"False","lastHeartbeatTime":"2020-08-10T08:55:04Z","lastTransitionTime":"2020-08-06T07:36:32Z","reason":"KubeletHasNoDiskPressure","message":"kubelet has no disk pressure"},{"type":"PIDPressure","status":"False","lastHeartbeatTime":"2020-08-10T08:55:04Z","lastTransitionTime":"2020-08-06T07:36:32Z","reason":"KubeletHasSufficientPID","message":"kubelet has sufficient PID available"},{"type":"Ready","status":"True","lastHeartbeatTime":"2020-08-10T08:55:04Z","lastTransitionTime":"2020-08-10T08:54:35Z","reason":"KubeletReady","message":"kubelet is posting ready status"}],"addresses":[{"type":"InternalIP","address":"172.31.236.28"},{"type":"Hostname","address":"turing-02-no.01.novalocal"}],"daemonEndpoints":{"kubeletEndpoint":{"Port":10250}},"nodeInfo":{"machineID":"da54aa58da54aa58da54aa58da54aa58","systemUUID":"587DF4B2-1B95-4717-9775-F696ED79D950","bootID":"06b1a58b-09b9-4068-9777-7dce4cfea189","kernelVersion":"3.10.0-957.el7.x86_64","osImage":"CentOS Linux 7 
(Core)","containerRuntimeVersion":"docker://18.9.6","kubeletVersion":"v1.18.0","kubeProxyVersion":"v1.18.0","operatingSystem":"linux","architecture":"amd64"},"images":[{"names":["hub.iflytek.com/turing/ssd_face@sha256:758fda54e4b87e1790ccbc93d842fd077e6fa09c04b7277b20d66a3f82405a13","hub.iflytek.com/turing/ssd_face:0.1"],"sizeBytes":5654144739},{"names":["hub.iflytek.com/turing/gaea:1.0.4"],"sizeBytes":5636811356},{"names":["am_train:1.0"],"sizeBytes":4440398750},{"names":["honk:1.0"],"sizeBytes":4107130912},{"names":["172.16.59.153/dlaas/pytorch-py36-cuda9@sha256:1cecf1e8b8f75e16e5514260190e677eb644c0c3d21630b75c0db5e1e5f02521","172.16.59.153/dlaas/pytorch-py36-cuda9:1.0.0"],"sizeBytes":3789072257},{"names":["nvidia/cuda:10.1-cudnn7-devel-centos7"],"sizeBytes":3519393408},{"names":["reg.deeplearning.cn/dlaas/cv_dist_openmpi:0.1"],"sizeBytes":3498923930},{"names":["registry.turing.com:5000/dlaas/pytorch@sha256:1f79d91afb716ef6da8f167829c5fd101959b284028435b92fa3778719a0439b","registry.turing.com:5000/dlaas/pytorch:1.5-cuda10.1-cudnn7-runtime"],"sizeBytes":3140164483},{"names":["registry.turing.com:5000/dlaas/commonjob_centos7.2.1511@sha256:ad1e7f41fb15e4b400e0f4492b56c3b46ef077d7fa110e7515d4446f045eaa91","registry.turing.com:5000/dlaas/commonjob_centos7.2.1511:1.0.4"],"sizeBytes":3086599556},{"names":["nvidia/cuda:10.1-devel-centos7"],"sizeBytes":2686646338},{"names":["nvidia/cuda@sha256:86899043c7c1182046cdf9b89ce94ccf35c584b33875ba78316acd883bc6faf8","nvidia/cuda:9.1-devel-centos7"],"sizeBytes":1834703805},{"names":["tkestack/gpu-manager:1.1.0"],"sizeBytes":437779384},{"names":["k8s.gcr.io/etcd:3.4.3-0"],"sizeBytes":288425539},{"names":["centos@sha256:62d9e1c2daa91166139b51577fe4f4f6b4cc41a3a2c7fc36bd895e2a17a3e4e6","centos:7.6.1810"],"sizeBytes":201756323},{"names":["calico/node:v3.8.2"],"sizeBytes":188832890},{"names":["k8s.gcr.io/kube-apiserver:v1.18.0"],"sizeBytes":172962942},{"names":["k8s.gcr.io/kube-controller-manager:v1.18.0"],"sizeBytes":162366590},{"names":["calico/cni@sha256:4922215c127c18b00c8f5916997259589577c132260181a2c50a093a78564c90","calico/cni:v3.8.2"],"sizeBytes":157232722},{"names":["k8s.gcr.io/kube-proxy:v1.18.0"],"sizeBytes":116531001},{"names":["k8s.gcr.io/kube-scheduler:v1.18.0"],"sizeBytes":95274110},{"names":["calico/kube-controllers:v3.8.2"],"sizeBytes":46809393},{"names":["k8s.gcr.io/coredns:1.6.7"],"sizeBytes":43781501},{"names":["calico/pod2daemon-flexvol:v3.8.2"],"sizeBytes":9366797},{"names":["k8s.gcr.io/pause:3.2"],"sizeBytes":682696}]}}]},"NodeNames":null,"FailedNodes":{},"Error":""}
I0810 16:56:47.818814   11350 util.go:53] Determine if the pod tts1 needs GPU resource
I0810 16:56:47.819235   11350 gpu_predicate.go:379] Quota for namespace default is {Quota:map[] Pool:[]}
I0810 16:56:47.819251   11350 gpu_predicate.go:353] No GPU quota limit for default
I0810 16:56:47.819261   11350 routes.go:71] GPUQuotaPredicate: ExtenderArgs = {Pod:&Pod{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:tts1,GenerateName:,Namespace:default,SelfLink:/api/v1/namespaces/default/pods/tts1,UID:bdb78144-4ee9-4301-b071-9e31abb3cd41,ResourceVersion:674849,Generation:0,CreationTimestamp:2020-08-10 16:56:47 +0800 CST,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{tencent.com/gpu-assigned: false,tencent.com/predicate-gpu-idx-0: 0,tencent.com/predicate-node: turing-02-no.01.novalocal,tencent.com/predicate-time: 1597049807810150433,tencent.com/vcuda-core-limit: 50,},OwnerReferences:[],Finalizers:[],ClusterName:,Initializers:nil,ManagedFields:[{gpu-admission Update v1 2020-08-10 16:56:47 +0800 CST nil} {kubectl Update v1 2020-08-10 16:56:47 +0800 CST nil}],},Spec:PodSpec{Volumes:[{test {HostPathVolumeSource{Path:/data/shengxu8/vcuda/xtts2.0.1029.P4.1,Type:*Directory,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}} {default-token-rl2vc {nil nil nil nil nil &SecretVolumeSource{SecretName:default-token-rl2vc,Items:[],DefaultMode:*420,Optional:nil,} nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil nil}}],Containers:[{nvidia nvidia/cuda:9.1-devel-centos7 [./start.sh] [] /tts [] [] [{LOGGER_LEVEL 5 nil}] {map[tencent.com/vcuda-core:{{50 0} {<nil>} 50 DecimalSI} tencent.com/vcuda-memory:{{14 0} {<nil>} 14 DecimalSI}] map[tencent.com/vcuda-core:{{50 0} {<nil>} 50 DecimalSI} tencent.com/vcuda-memory:{{14 0} {<nil>} 14 DecimalSI}]} [{test false /tts  <nil> } {default-token-rl2vc true /var/run/secrets/kubernetes.io/serviceaccount  <nil> }] [] nil nil nil /dev/termination-log File IfNotPresent nil false false false}],RestartPolicy:Never,TerminationGracePeriodSeconds:*30,ActiveDeadlineSeconds:nil,DNSPolicy:ClusterFirst,NodeSelector:map[string]string{},ServiceAccountName:default,DeprecatedServiceAccount:default,NodeName:,HostNetwork:true,HostPID:false,HostIPC:false,SecurityContext:&PodSecurityContext{SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,SupplementalGroups:[],FSGroup:nil,RunAsGroup:nil,Sysctls:[],WindowsOptions:nil,},ImagePullSecrets:[],Hostname:,Subdomain:,Affinity:nil,SchedulerName:default-scheduler,InitContainers:[],AutomountServiceAccountToken:nil,Tolerations:[{node.kubernetes.io/not-ready Exists  NoExecute 0xc00060c4e0} {node.kubernetes.io/unreachable Exists  NoExecute 0xc00060c500}],HostAliases:[],PriorityClassName:,Priority:*0,DNSConfig:nil,ShareProcessNamespace:nil,ReadinessGates:[],RuntimeClassName:nil,EnableServiceLinks:*true,PreemptionPolicy:nil,},Status:PodStatus{Phase:Pending,Conditions:[],Message:,Reason:,HostIP:,PodIP:,StartTime:<nil>,ContainerStatuses:[],QOSClass:BestEffort,InitContainerStatuses:[],NominatedNodeName:,},} Nodes:&NodeList{ListMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ListMeta{SelfLink:,ResourceVersion:,Continue:,RemainingItemCount:nil,},Items:[{{ } {turing-02-no.01.novalocal   /api/v1/nodes/turing-02-no.01.novalocal 93b77b65-b112-4d36-ab2c-0e34efe8d097 674660 0 2020-08-06 15:12:19 +0800 CST <nil> <nil> map[beta.kubernetes.io/arch:amd64 beta.kubernetes.io/os:linux kubernetes.io/arch:amd64 kubernetes.io/hostname:turing-02-no.01.novalocal kubernetes.io/os:linux node-role.kubernetes.io/master: nvidia-device-enable:enable] map[kubeadm.alpha.kubernetes.io/cri-socket:/var/run/dockershim.sock node.alpha.kubernetes.io/ttl:0 projectcalico.org/IPv4Address:172.31.236.28/24 
projectcalico.org/IPv4IPIPTunnelAddr:100.77.205.64 volumes.kubernetes.io/controller-managed-attach-detach:true] [] nil []  [{kubeadm Update v1 2020-08-06 15:12:22 +0800 CST nil} {kubectl Update v1 2020-08-06 15:31:43 +0800 CST nil} {calico-node Update v1 2020-08-07 10:17:19 +0800 CST nil} {kube-controller-manager Update v1 2020-08-10 16:54:34 +0800 CST nil} {kubelet Update v1 2020-08-10 16:55:04 +0800 CST nil}]} {100.64.0.0/24  false [] nil } {map[cpu:{{32 0} {<nil>} 32 DecimalSI} ephemeral-storage:{{53465821184 0} {<nil>} 52212716Ki BinarySI} hugepages-1Gi:{{0 0} {<nil>} 0 DecimalSI} hugepages-2Mi:{{0 0} {<nil>} 0 DecimalSI} memory:{{270308057088 0} {<nil>}  BinarySI} pods:{{110 0} {<nil>} 110 DecimalSI} tencent.com/vcuda-core:{{100 0} {<nil>} 100 DecimalSI} tencent.com/vcuda-memory:{{29 0} {<nil>} 29 DecimalSI}] map[cpu:{{32 0} {<nil>} 32 DecimalSI} ephemeral-storage:{{52392079360 0} {<nil>} 51164140Ki BinarySI} hugepages-1Gi:{{0 0} {<nil>} 0 DecimalSI} hugepages-2Mi:{{0 0} {<nil>} 0 DecimalSI} memory:{{270308057088 0} {<nil>}  BinarySI} pods:{{110 0} {<nil>} 110 DecimalSI} tencent.com/vcuda-core:{{100 0} {<nil>} 100 DecimalSI} tencent.com/vcuda-memory:{{29 0} {<nil>} 29 DecimalSI}]  [{NetworkUnavailable False 2020-08-07 10:17:19 +0800 CST 2020-08-07 10:17:19 +0800 CST CalicoIsUp Calico is running on this node} {MemoryPressure False 2020-08-10 16:55:04 +0800 CST 2020-08-06 15:36:32 +0800 CST KubeletHasSufficientMemory kubelet has sufficient memory available} {DiskPressure False 2020-08-10 16:55:04 +0800 CST 2020-08-06 15:36:32 +0800 CST KubeletHasNoDiskPressure kubelet has no disk pressure} {PIDPressure False 2020-08-10 16:55:04 +0800 CST 2020-08-06 15:36:32 +0800 CST KubeletHasSufficientPID kubelet has sufficient PID available} {Ready True 2020-08-10 16:55:04 +0800 CST 2020-08-10 16:54:35 +0800 CST KubeletReady kubelet is posting ready status}] [{InternalIP 172.31.236.28} {Hostname turing-02-no.01.novalocal}] {{10250}} {da54aa58da54aa58da54aa58da54aa58 587DF4B2-1B95-4717-9775-F696ED79D950 06b1a58b-09b9-4068-9777-7dce4cfea189 3.10.0-957.el7.x86_64 CentOS Linux 7 (Core) docker://18.9.6 v1.18.0 v1.18.0 linux amd64} [{[hub.iflytek.com/turing/ssd_face@sha256:758fda54e4b87e1790ccbc93d842fd077e6fa09c04b7277b20d66a3f82405a13 hub.iflytek.com/turing/ssd_face:0.1] 5654144739} {[hub.iflytek.com/turing/gaea:1.0.4] 5636811356} {[am_train:1.0] 4440398750} {[honk:1.0] 4107130912} {[172.16.59.153/dlaas/pytorch-py36-cuda9@sha256:1cecf1e8b8f75e16e5514260190e677eb644c0c3d21630b75c0db5e1e5f02521 172.16.59.153/dlaas/pytorch-py36-cuda9:1.0.0] 3789072257} {[nvidia/cuda:10.1-cudnn7-devel-centos7] 3519393408} {[reg.deeplearning.cn/dlaas/cv_dist_openmpi:0.1] 3498923930} {[registry.turing.com:5000/dlaas/pytorch@sha256:1f79d91afb716ef6da8f167829c5fd101959b284028435b92fa3778719a0439b registry.turing.com:5000/dlaas/pytorch:1.5-cuda10.1-cudnn7-runtime] 3140164483} {[registry.turing.com:5000/dlaas/commonjob_centos7.2.1511@sha256:ad1e7f41fb15e4b400e0f4492b56c3b46ef077d7fa110e7515d4446f045eaa91 registry.turing.com:5000/dlaas/commonjob_centos7.2.1511:1.0.4] 3086599556} {[nvidia/cuda:10.1-devel-centos7] 2686646338} {[nvidia/cuda@sha256:86899043c7c1182046cdf9b89ce94ccf35c584b33875ba78316acd883bc6faf8 nvidia/cuda:9.1-devel-centos7] 1834703805} {[tkestack/gpu-manager:1.1.0] 437779384} {[k8s.gcr.io/etcd:3.4.3-0] 288425539} {[centos@sha256:62d9e1c2daa91166139b51577fe4f4f6b4cc41a3a2c7fc36bd895e2a17a3e4e6 centos:7.6.1810] 201756323} {[calico/node:v3.8.2] 188832890} {[k8s.gcr.io/kube-apiserver:v1.18.0] 172962942} 
{[k8s.gcr.io/kube-controller-manager:v1.18.0] 162366590} {[calico/cni@sha256:4922215c127c18b00c8f5916997259589577c132260181a2c50a093a78564c90 calico/cni:v3.8.2] 157232722} {[k8s.gcr.io/kube-proxy:v1.18.0] 116531001} {[k8s.gcr.io/kube-scheduler:v1.18.0] 95274110} {[calico/kube-controllers:v3.8.2] 46809393} {[k8s.gcr.io/coredns:1.6.7] 43781501} {[calico/pod2daemon-flexvol:v3.8.2] 9366797} {[k8s.gcr.io/pause:3.2] 682696}] [] [] nil}}],} NodeNames:<nil>}
I0810 16:56:47.819866   11350 routes.go:81] GPUQuotaPredicate: extenderFilterResult = {"Nodes":null,"NodeNames":null,"FailedNodes":null,"Error":"pod tts1 had been predicated!"}

The pod also has the following warning event.

Events:
  Type     Reason            Age                   From                                Message
  ----     ------            ----                  ----                                -------
  Warning  FailedScheduling  <unknown>             default-scheduler                   pod tts1 had been predicated!

Why does the pod show FailedScheduling even though it is running? Is there anything I'm misunderstanding?

Error when running build-img.sh

I ran build-img.sh in the hack folder but got the following error:

+++ [1008 04:24:10] Generating image...
Sending build context to Docker daemon  9.728kB
Step 1/21 : FROM centos:7 as build
 ---> eeb6ee3f44bd
Step 2/21 : ARG version
 ---> Using cache
 ---> 531c2de10efc
Step 3/21 : ARG commit
 ---> Using cache
 ---> 6fc182aa9c69
Step 4/21 : RUN yum install -y rpm-build make
 ---> Using cache
 ---> 31807ee37af1
Step 5/21 : ENV GOLANG_VERSION 1.13.4
 ---> Using cache
 ---> 40954c916844
Step 6/21 : RUN curl -sSL https://dl.google.com/go/go${GOLANG_VERSION}.linux-amd64.tar.gz     | tar -C /usr/local -xz
 ---> Using cache
 ---> 0449fd06b22d
Step 7/21 : ENV GOPATH /go
 ---> Using cache
 ---> 9f0244b0b788
Step 8/21 : ENV PATH $GOPATH/bin:/usr/local/go/bin:$PATH
 ---> Using cache
 ---> 4c390d59d877
Step 9/21 : RUN mkdir -p /root/rpmbuild/{SPECS,SOURCES}
 ---> Using cache
 ---> 8ceb116f7df8
Step 10/21 : COPY gpu-admission.spec /root/rpmbuild/SPECS
 ---> Using cache
 ---> f9c2b9e3038d
Step 11/21 : COPY gpu-admission-source.tar.gz /root/rpmbuild/SOURCES
 ---> 27e7f6056f67
Step 12/21 : RUN echo '%_topdir /root/rpmbuild' > /root/.rpmmacros           && echo '%__os_install_post %{nil}' >> /root/.rpmmacros                   && echo '%debug_package %{nil}' >> /root/.rpmmacros
 ---> Running in 9a7e7e8c1fe8
Removing intermediate container 9a7e7e8c1fe8
 ---> 05e83bb47665
Step 13/21 : WORKDIR /root/rpmbuild/SPECS
 ---> Running in c5dd17e7b575
Removing intermediate container c5dd17e7b575
 ---> 13b2d9326e03
Step 14/21 : RUN rpmbuild -ba --quiet   --define 'version '${version}''   --define 'commit '${commit}''   gpu-admission.spec
 ---> Running in dc272d3f908d
/usr/bin/tar: Removing leading `/' from member names
make: *** No rule to make target `all'.  Stop.
error: Bad exit status from /var/tmp/rpm-tmp.ExwRfK (%build)
    Bad exit status from /var/tmp/rpm-tmp.ExwRfK (%build)
The command '/bin/sh -c rpmbuild -ba --quiet   --define 'version '${version}''   --define 'commit '${commit}''   gpu-admission.spec' returned a non-zero code: 1

How can I solve this?
Thank you.

I couldn't understand how to run the scheduler.

Hi there, I'm new to this field and couldn't fully understand the README file. Could you give me some pointers on a couple of questions?

The first question is: how do I run the scheduler with --policy-config-file=XXX --use-legacy-policy-config=true?

The second is: what is XXX in --policy-config-file=XXX?

Any help would be appreciated!

Pod stays Pending while the node has enough resources

What happened:

We created a GPU deployment gpu-work; the resources requested by the deployment are:

    Limits:
      tencent.com/vcuda-core:    10
      tencent.com/vcuda-memory:  1
    Requests:
      cpu:                       200m
      memory:                    256Mi
      tencent.com/vcuda-core:    10
      tencent.com/vcuda-memory:  1

When we scale the replicas from 0 to 3, 5, and 7 in sequence, we end up with 5 running pods and 2 pending pods, even though the node has enough resources.

NAME                        READY   STATUS    RESTARTS   AGE   IP              NODE                 NOMINATED NODE   READINESS GATES
gpu-work-85bb88797b-2fqht   1/1     Running   0          9s    10.244.99.247   workstation-master   <none>           <none>
gpu-work-85bb88797b-2gdst   1/1     Running   0          9s    10.244.99.196   workstation-master   <none>           <none>
gpu-work-85bb88797b-8j4lt   0/1     Pending   0          7s    <none>          <none>               <none>           <none>
gpu-work-85bb88797b-gvzd4   1/1     Running   0          10s   10.244.99.241   workstation-master   <none>           <none>
gpu-work-85bb88797b-m292c   0/1     Pending   0          7s    <none>          <none>               <none>           <none>
gpu-work-85bb88797b-t5qtr   1/1     Running   0          10s   10.244.99.198   workstation-master   <none>           <none>
gpu-work-85bb88797b-ws5wl   1/1     Running   0          10s   10.244.99.252   workstation-master   <none>           <none>

The event for pod gpu-work-85bb88797b-8j4lt is:

16s         Warning   FailedScheduling    pod/gpu-work-85bb88797b-8j4lt    0/5 nodes are available: 4 Insufficient tencent.com/vcuda-core, 5 Insufficient tencent.com/vcuda-memory.

However, the node workstation-master has enough resources:

Name:               workstation-master
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=workstation-master
                    kubernetes.io/os=linux
                    nvidia-device-enable=enable
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 192.168.31.140/24
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Sun, 24 May 2020 16:24:02 +0800
Taints:             tencent.com/vcuda-core=1:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  workstation-master
  AcquireTime:     <unset>
  RenewTime:       Wed, 24 Jun 2020 11:14:31 +0800
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Tue, 09 Jun 2020 15:39:46 +0800   Tue, 09 Jun 2020 15:39:46 +0800   CalicoIsUp                   Calico is running on this node
  MemoryPressure       False   Wed, 24 Jun 2020 11:11:27 +0800   Thu, 18 Jun 2020 16:24:47 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Wed, 24 Jun 2020 11:11:27 +0800   Thu, 18 Jun 2020 16:38:00 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Wed, 24 Jun 2020 11:11:27 +0800   Thu, 18 Jun 2020 16:24:47 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Wed, 24 Jun 2020 11:11:27 +0800   Thu, 18 Jun 2020 16:24:47 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  192.168.31.140
  Hostname:    workstation-master
Capacity:
  cpu:                       24
  ephemeral-storage:         307175Mi
  hugepages-1Gi:             0
  hugepages-2Mi:             0
  memory:                    65807052Ki
  pods:                      110
  tencent.com/vcuda-core:    100
  tencent.com/vcuda-memory:  7
Allocatable:
  cpu:                       24
  ephemeral-storage:         306975Mi
  hugepages-1Gi:             0
  hugepages-2Mi:             0
  memory:                    65807052Ki
  pods:                      110
  tencent.com/vcuda-core:    100
  tencent.com/vcuda-memory:  7
System Info:
  Machine ID:                 c52fbc796b1a437c91a15fffc3cedcd9
  System UUID:                03000200-0400-0500-0006-000700080009
  Boot ID:                    f3c21d46-bb36-4b42-8eeb-1801c328ef84
  Kernel Version:             4.19.12-1.el7.elrepo.x86_64
  OS Image:                   CentOS Linux 7 (Core)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://19.3.8
  Kubelet Version:            v1.17.3
  Kube-Proxy Version:         v1.17.3
PodCIDR:                      10.244.6.0/24
PodCIDRs:                     10.244.6.0/24
Non-terminated Pods:          (10 in total)
  Namespace                   Name                              CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                              ------------  ----------  ---------------  -------------  ---
  gpu                         gpu-work-85bb88797b-2fqht         200m (0%)     0 (0%)      256Mi (0%)       0 (0%)         2m46s
  gpu                         gpu-work-85bb88797b-2gdst         200m (0%)     0 (0%)      256Mi (0%)       0 (0%)         2m46s
  gpu                         gpu-work-85bb88797b-gvzd4         200m (0%)     0 (0%)      256Mi (0%)       0 (0%)         2m47s
  gpu                         gpu-work-85bb88797b-t5qtr         200m (0%)     0 (0%)      256Mi (0%)       0 (0%)         2m47s
  gpu                         gpu-work-85bb88797b-ws5wl         200m (0%)     0 (0%)      256Mi (0%)       0 (0%)         2m47s
  kube-system                 calico-node-6sfwf                 250m (1%)     0 (0%)      0 (0%)           0 (0%)         30d
  kube-system                 gpu-admission-5dfb8754fd-fhz6f    0 (0%)        0 (0%)      0 (0%)           0 (0%)         39h
  kube-system                 gpu-manager-daemonset-pw79l       0 (0%)        0 (0%)      0 (0%)           0 (0%)         5d15h
  kube-system                 kube-proxy-jcw4b                  0 (0%)        0 (0%)      0 (0%)           0 (0%)         30d
  monitoring                  node-exporter-jktjx               112m (0%)     270m (1%)   200Mi (0%)       220Mi (0%)     5d18h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                  Requests     Limits
  --------                  --------     ------
  cpu                       1362m (5%)   270m (1%)
  memory                    1480Mi (2%)  220Mi (0%)
  ephemeral-storage         0 (0%)       0 (0%)
  hugepages-1Gi             0 (0%)       0 (0%)
  hugepages-2Mi             0 (0%)       0 (0%)
  tencent.com/vcuda-core    50           50
  tencent.com/vcuda-memory  5            5

What you expected to happen:

The pods should be scheduled successfully.

How to reproduce it (as minimally and precisely as possible):

The problem appears under the following conditions:

  1. We have extended the default scheduler's predicate interface.
  2. We consecutively scale up the GPU deployment in steps of 2.

Environment:

Kubernetes version (use kubectl version):

Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:48:36Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}

Others

We think the root cause of this problem is the same as in this issue. When gpu-admission receives a predicate request for a pod whose annotations it has already set, it responds with an empty filteredMap, which triggers this issue.

Fail on startup

Hello!

I'm trying to use gpu-admission with gpu-manager, installing it on the master node of my cluster with gpu-admission.yaml.

But the pod fails with an error.

F1026 16:50:08.270952       1 main.go:83] Failed to new gpu quota filter: invalid GPUFilter config in file /etc/kubernetes/gpu-admission.config
goroutine 1 [running]:
k8s.io/klog.stacks(0xc0002f9e00, 0xc000312000, 0x90, 0xe5)
	/go/pkg/mod/k8s.io/[email protected]/klog.go:855 +0xb8
k8s.io/klog.(*loggingT).output(0x1ebf960, 0xc000000003, 0xc00030a4d0, 0x1e29c93, 0x7, 0x53, 0x0)
	/go/pkg/mod/k8s.io/[email protected]/klog.go:806 +0x351
k8s.io/klog.(*loggingT).printf(0x1ebf960, 0x3, 0x13ed31a, 0x22, 0xc00016df10, 0x1, 0x1)
	/go/pkg/mod/k8s.io/[email protected]/klog.go:705 +0x14b
k8s.io/klog.Fatalf(...)
	/go/pkg/mod/k8s.io/[email protected]/klog.go:1256
main.main()
	/root/rpmbuild/BUILD/gpu-admission-unknown/main.go:83 +0x3ff

I am using the default gpu-admission.config:

{
	"QuotaConfigMapName": "gpuquota",
	"QuotaConfigMapNamespace": "kube-system",
	"GPUModelLabel": "gaia.tencent.com/gpu-model",
	"GPUPoolLabel": "gaia.tencent.com/gpu-pool"
}

Can anyone help me?

Why choose only one node?

Hey guys, can anyone tell me why deviceFilter chooses one and only one node to fulfill the request? I don't think that makes sense.

Scheduler doesn't work with affinity

Hi! Can anyone help me with this?

When I use this scheduler-policy-config.json file:

{
  "kind": "Policy",
  "apiVersion": "v1",
  "predicates": [
    {
      "name": "PodFitsHostPorts"
    },
    {
      "name": "PodFitsResources"
    },
    {
      "name": "NoDiskConflict"
    },
    {
      "name": "MatchNodeSelector"
    },
    {
      "name": "HostName"
    }
  ],
  "extenders": [
    {
      "urlPrefix": "http://127.0.0.1:3456/scheduler",
      "apiVersion": "v1beta1",
      "filterVerb": "predicates",
      "enableHttps": false,
      "nodeCacheCapable": false
    }
  ],
  "hardPodAffinitySymmetricWeight": 10,
  "alwaysCheckAllPredicates": false
}

The scheduler doesn't use or respect my pod affinity rule:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: release
            operator: In
            values:
            - appgpu
        topologyKey: kubernetes.io/hostname
      weight: 100

I don't want more than one replica pod on the same node.

Thanks!

The active pods considered by gpu-admission and gpu-manager are inconsistent

Hi @mYmNeo, I sometimes find that if a GPU pod is created while some GPU pods are being deleted or terminating, UnexpectedAdmissionError appears a little more frequently. I observed that the logic gpu-admission uses to get the active GPU pods on a node is different from gpu-manager's. When gpu-admission gets the active pods, it seems to treat pods that are being deleted as still occupying their GPUs, while gpu-manager excludes these pods. So I think their logic for getting active pods should be made consistent, to reduce the occurrence of UnexpectedAdmissionError caused by inconsistent GPU selection.
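
A minimal sketch of the kind of filtering being suggested here, assuming k8s.io/api/core/v1: treat pods that are terminating or already finished as no longer occupying GPUs, which is roughly how gpu-manager appears to count them. This illustrates the idea only and is not the actual gpu-admission code.

import v1 "k8s.io/api/core/v1"

// isActiveGPUPod reports whether a pod should still be counted as occupying
// GPU resources on a node. Pods with a DeletionTimestamp (terminating) or in
// a terminal phase are excluded.
func isActiveGPUPod(pod *v1.Pod) bool {
    if pod.DeletionTimestamp != nil {
        return false
    }
    if pod.Status.Phase == v1.PodSucceeded || pod.Status.Phase == v1.PodFailed {
        return false
    }
    return true
}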

More detailed documentation for running on a kubernetes cluster

Attempting to run gpu-admission on a Kubernetes cluster doesn't appear to work right out of the box, and how to do it isn't described in the documentation.
I built the image from the Dockerfile, but the run command in it is:

CMD ["/bin/bash", "-c", "/usr/bin/gpu-admission --kubeconfig=/etc/kubernetes/kube-scheduler/kubeconfig --config=/etc/kubernetes/gpu-admission.config --address=0.0.0.0:3456 --v=$LOG_LEVEL --logtostderr=false --log-dir=/var/log/gpu-admission $EXTRA_FLAGS"]

Does this have to run on the control plane? I am attempting to run this on the managed Kubernetes service EKS and can't access the control plane nodes. To get around this I am running a second scheduler on a worker node. The config does not exist at /etc/kubernetes/gpu-admission.config when I deploy using the YAML file. Does the Dockerfile need to be modified to create this directory and file on the node?

Why use Capacity instead of Allocatable?

Why use node.Status.Capacity instead of node.Status.Allocatable? Doesn't Allocatable determine what is actually schedulable? I feel there will be a problem when node.Status.Allocatable is less than node.Status.Capacity (perhaps due to a health check by the device plugin), because gpu-admission then thinks there is still a GPU that can be scheduled when in fact there is not. Wouldn't that cause problems when a container is allocated a GPU on that node?

func GetCapacityOfNode(node *v1.Node, resourceName string) int {
    val, ok := node.Status.Capacity[v1.ResourceName(resourceName)]
    if !ok {
        return 0
    }
    return int(val.Value())
}

// GetGPUDeviceCountOfNode returns the number of GPU devices
func GetGPUDeviceCountOfNode(node *v1.Node) int {
    val, ok := node.Status.Capacity[VCoreAnnotation]
    if !ok {
        return 0
    }
    return int(val.Value()) / HundredCore
}
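
For comparison, a sketch of what an Allocatable-based variant might look like, assuming the same package, imports, and VCoreAnnotation/HundredCore constants as the code quoted above. This is the reporter's suggestion, not the project's current behavior.

// GetAllocatableOfNode is a hypothetical variant of GetCapacityOfNode that
// reads node.Status.Allocatable, so devices removed by the device plugin
// (for example after a failed health check) are not counted as schedulable.
func GetAllocatableOfNode(node *v1.Node, resourceName string) int {
    val, ok := node.Status.Allocatable[v1.ResourceName(resourceName)]
    if !ok {
        return 0
    }
    return int(val.Value())
}

// GetAllocatableGPUCountOfNode would then report the number of usable GPUs.
func GetAllocatableGPUCountOfNode(node *v1.Node) int {
    val, ok := node.Status.Allocatable[VCoreAnnotation]
    if !ok {
        return 0
    }
    return int(val.Value()) / HundredCore
}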

update pod annotation failed

When I test gpu-admission on k8s v1.13.5, I get the following error:

I0814 08:36:19.986356       1 gpu_predicate.go:493] failed to add annotation map[tencent.com/gpu-assigned:false tencent.com/predicate-gpu-idx-0:0 tencent.com/predicate-node:ai-1080ti-15 tencent.com/predicate-time:1597394179983794058] to pod 9a3a7c36-dd45-11ea-8e57-6c92bf66acae due to pods "test33" not found
I0814 08:36:19.986380       1 util.go:71] Determine if the container test33 needs GPU resource
I0814 08:36:19.986394       1 share.go:58] Pick up 0 , cores: 100, memory: 43
I0814 08:36:19.988944       1 gpu_predicate.go:493] failed to add annotation map[tencent.com/gpu-assigned:false tencent.com/predicate-gpu-idx-0:0 tencent.com/predicate-node:ai-1080ti-57 tencent.com/predicate-time:1597394179986399567] to pod 9a3a7c36-dd45-11ea-8e57-6c92bf66acae due to pods "test33" not found
I0814 08:36:19.988971       1 util.go:71] Determine if the container test33 needs GPU resource
I0814 08:36:19.988986       1 share.go:58] Pick up 0 , cores: 100, memory: 43
I0814 08:36:19.991268       1 gpu_predicate.go:493] failed to add annotation map[tencent.com/gpu-assigned:false tencent.com/predicate-gpu-idx-0:0 tencent.com/predicate-node:ai-1080ti-62 tencent.com/predicate-time:1597394179988992239] to pod 9a3a7c36-dd45-11ea-8e57-6c92bf66acae due to pods "test33" not found
...
I0814 08:36:19.992368       1 routes.go:81] GPUQuotaPredicate: extenderFilterResult = {"Nodes":{"metadata":{},"items":[]},"NodeNames":null,"FailedNodes":{"ai-1080ti-15":"update pod annotation failed","ai-1080ti-57":"update pod annotation failed","ai-1080ti-62":"update pod annotation failed"},"Error":""}

  • The pod YAML:
Name:               test33
Namespace:          danlu-efficiency
Priority:           0
PriorityClassName:  <none>
Node:               <none>
Labels:             <none>
Annotations:        <none>
Status:             Pending
IP:
NominatedNodeName:  ai-1080ti-62
Containers:
  test33:
    Image:      danlu/tensorflow:tf1.9.0_py2_gpu_v0.1
    Port:       <none>
    Host Port:  <none>
    Command:
      /bin/bash
      -c
      sleep 100000000
    Limits:
      tencent.com/vcuda-core:    10
      tencent.com/vcuda-memory:  30
    Requests:
      tencent.com/vcuda-core:    10
      tencent.com/vcuda-memory:  30
    Environment:                 <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-p6lfp (ro)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  default-token-p6lfp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-p6lfp
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                     From           Message
  ----     ------            ----                    ----           -------
  Warning  FailedScheduling  4m56s (x1581 over 19h)  gpu-admission  0/16 nodes are available: 1 node(s) were unschedulable, 12 Insufficient tencent.com/vcuda-core, 12 Insufficient tencent.com/vcuda-memory, 3 update pod annotation failed.

Nodes information:

  • ai-1080ti-15
qa-jenkins@fuxi-qa-3:~/vgpu$ kubectl get nodes
NAME           STATUS                     ROLES             AGE    VERSION
ai-1080ti-15   Ready                      nvidia            463d   v1.13.3
ai-1080ti-57   Ready                      1080ti            463d   v1.13.3
ai-1080ti-62   Ready                      nvidia418         442d   v1.13.5
fuxi-dl-42     Ready                      <none>            302d   v1.13.5
fuxi-dl-46     Ready                      <none>            464d   v1.13.3
fuxi-dl-47     Ready                      <none>            464d   v1.13.3
fuxi-dl-48     Ready                      <none>            442d   v1.13.5
fuxi-qa-10g    Ready                      1080ti,training   414d   v1.13.5
fuxi-qa-12g    Ready                      nvidia            414d   v1.13.5
fuxi-qa-14     Ready                      <none>            353d   v1.13.5
fuxi-qa-15     Ready                      <none>            353d   v1.13.5
fuxi-qa-16     Ready                      <none>            309d   v1.13.5
fuxi-qa-3      Ready,SchedulingDisabled   master            603d   v1.13.5
fuxi-qa-4      Ready                      <none>            464d   v1.13.3
fuxi-qa-5      Ready                      <none>            464d   v1.13.3
fuxi-qa-8g     Ready                      nvidia            464d   v1.13.3

NOTE: The nodes beginning with 'ai' are GPU nodes and are labeled with 'nvidia-device-enable=enable'. Some information about the GPU nodes follows:

Name:               ai-1080ti-15
Roles:              nvidia
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    hardware=NVIDIAGPU
                    hardware-type=NVIDIAGPU
                    kubernetes.io/hostname=ai-1080ti-15
                    node-role.kubernetes.io/nvidia=GPU
                    nvidia-device-enable=enable
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 10.200.0.72/24
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 08 May 2019 14:28:18 +0800
Taints:             <none>
Unschedulable:      false
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------    -----------------                 ------------------                ------                       -------
  MemoryPressure   False     Fri, 14 Aug 2020 17:54:08 +0800   Thu, 30 Apr 2020 18:26:46 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False     Fri, 14 Aug 2020 17:54:08 +0800   Thu, 30 Apr 2020 18:26:46 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False     Fri, 14 Aug 2020 17:54:08 +0800   Thu, 30 Apr 2020 18:26:46 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True      Fri, 14 Aug 2020 17:54:08 +0800   Thu, 30 Apr 2020 18:26:46 +0800   KubeletReady                 kubelet is posting ready status
  OutOfDisk        Unknown   Wed, 08 May 2019 14:28:18 +0800   Wed, 08 May 2019 14:33:54 +0800   NodeStatusNeverUpdated       Kubelet never posted node status.
Addresses:
  InternalIP:  10.200.0.72
  Hostname:    ai-1080ti-15
Capacity:
 cpu:                       56
 ephemeral-storage:         1153070996Ki
 hugepages-1Gi:             0
 hugepages-2Mi:             0
 memory:                    264030000Ki
 nvidia.com/gpu:            8
 pods:                      110
 tencent.com/vcuda-core:    800
 tencent.com/vcuda-memory:  349
Allocatable:
 cpu:                       53
 ephemeral-storage:         1041195391675
 hugepages-1Gi:             0
 hugepages-2Mi:             0
 memory:                    251344688Ki
 nvidia.com/gpu:            8
 pods:                      110
 tencent.com/vcuda-core:    800
 tencent.com/vcuda-memory:  349
System Info:
 Machine ID:                   2030b7c755d0458cbe03ef3b39b9412b
 System UUID:                  00000000-0000-0000-0000-ac1f6b27b26a
 Boot ID:                      c9dce882-9bc3-478d-a81d-1a8dcfd02a4f
 Kernel Version:               4.19.0-0.bpo.8-amd64
 OS Image:                     Debian GNU/Linux 9 (stretch)
 Operating System:             linux
 Architecture:                 amd64
 Container Runtime Version:    docker://18.6.2
 Kubelet Version:              v1.13.3
 Kube-Proxy Version:           v1.13.3
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                  Requests            Limits
  --------                  --------            ------
  cpu                       51320m (96%)        93300m (176%)
  memory                    101122659840 (39%)  240384047Ki (95%)
  ephemeral-storage         0 (0%)              0 (0%)
  nvidia.com/gpu            7                   7
  tencent.com/vcuda-core    0                   0
  tencent.com/vcuda-memory  0                   0

  • ai-1080ti-57
Name:               ai-1080ti-57
Roles:              1080ti
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    hardware=NVIDIAGPU
                    hardware-type=NVIDIAGPU
                    kubernetes.io/hostname=ai-1080ti-57
                    node-role.kubernetes.io/1080ti=1080ti
                    nvidia-device-enable=enable
Annotations:        node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 10.90.1.126/24
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 08 May 2019 14:47:50 +0800
Taints:             <none>
Unschedulable:      false
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------    -----------------                 ------------------                ------                       -------
  MemoryPressure   False     Fri, 14 Aug 2020 17:56:33 +0800   Wed, 12 Aug 2020 19:44:29 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False     Fri, 14 Aug 2020 17:56:33 +0800   Wed, 12 Aug 2020 19:44:29 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False     Fri, 14 Aug 2020 17:56:33 +0800   Wed, 12 Aug 2020 19:44:29 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True      Fri, 14 Aug 2020 17:56:33 +0800   Wed, 12 Aug 2020 19:44:29 +0800   KubeletReady                 kubelet is posting ready status
  OutOfDisk        Unknown   Wed, 08 May 2019 14:47:50 +0800   Fri, 09 Aug 2019 11:58:18 +0800   NodeStatusNeverUpdated       Kubelet never posted node status.
Addresses:
  InternalIP:  10.90.1.126
  Hostname:    ai-1080ti-57
Capacity:
 cpu:                       56
 ephemeral-storage:         1152148172Ki
 hugepages-1Gi:             0
 hugepages-2Mi:             0
 memory:                    264029980Ki
 nvidia.com/gpu:            8
 pods:                      110
 tencent.com/vcuda-core:    800
 tencent.com/vcuda-memory:  349
Allocatable:
 cpu:                       53
 ephemeral-storage:         1040344917078
 hugepages-1Gi:             0
 hugepages-2Mi:             0
 memory:                    251344668Ki
 nvidia.com/gpu:            8
 pods:                      110
 tencent.com/vcuda-core:    800
 tencent.com/vcuda-memory:  349
System Info:
 Machine ID:                   3ff54e221e0d475bacbe8a68bd0dd2e2
 System UUID:                  00000000-0000-0000-0000-ac1f6b91d6e8
 Boot ID:                      efc67001-4c66-4fca-946c-d13f0931fcc2
 Kernel Version:               4.19.0-0.bpo.8-amd64
 OS Image:                     Debian GNU/Linux 9 (stretch)
 Operating System:             linux
 Architecture:                 amd64
 Container Runtime Version:    docker://18.6.2
 Kubelet Version:              v1.13.3
 Kube-Proxy Version:           v1.13.3
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                  Requests            Limits
  --------                  --------            ------
  cpu                       47730m (90%)        73100m (137%)
  memory                    111630921472 (43%)  186532353536 (72%)
  ephemeral-storage         0 (0%)              0 (0%)
  nvidia.com/gpu            7                   7
  tencent.com/vcuda-core    0                   0
  tencent.com/vcuda-memory  0                   0
Events:                     <none>
  • ai-1080ti-62
Name:               ai-1080ti-62
Roles:              nvidia418
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    hardware=NVIDIAGPU
                    hardware-type=NVIDIAGPU
                    kubernetes.io/hostname=ai-1080ti-62
                    node-role.kubernetes.io/nvidia418=nvidia418
                    nvidia-device-enable=enable
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    projectcalico.org/IPv4Address: 10.90.1.131/24
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 29 May 2019 18:02:54 +0800
Taints:             <none>
Unschedulable:      false
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 14 Aug 2020 17:57:17 +0800   Thu, 30 Jul 2020 16:42:25 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 14 Aug 2020 17:57:17 +0800   Thu, 30 Jul 2020 16:42:25 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 14 Aug 2020 17:57:17 +0800   Thu, 30 Jul 2020 16:42:25 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Fri, 14 Aug 2020 17:57:17 +0800   Thu, 30 Jul 2020 16:42:25 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  10.90.1.131
  Hostname:    ai-1080ti-62
Capacity:
 cpu:                       56
 ephemeral-storage:         1152148172Ki
 hugepages-1Gi:             0
 hugepages-2Mi:             0
 memory:                    264029984Ki
 nvidia.com/gpu:            8
 pods:                      110
 tencent.com/vcuda-core:    800
 tencent.com/vcuda-memory:  349
Allocatable:
 cpu:                       53
 ephemeral-storage:         1040344917078
 hugepages-1Gi:             0
 hugepages-2Mi:             0
 memory:                    251344672Ki
 nvidia.com/gpu:            8
 pods:                      110
 tencent.com/vcuda-core:    800
 tencent.com/vcuda-memory:  349
System Info:
 Machine ID:                 bf90cb25500346cb8178be49909651e4
 System UUID:                00000000-0000-0000-0000-ac1f6b93483c
 Boot ID:                    97927469-0e92-4816-880c-243a64ef293a
 Kernel Version:             4.19.0-0.bpo.8-amd64
 OS Image:                   Debian GNU/Linux 9 (stretch)
 Operating System:           linux
 Architecture:               amd64
 Container Runtime Version:  docker://18.6.2
 Kubelet Version:            v1.13.5
 Kube-Proxy Version:         v1.13.5
...
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                  Requests           Limits
  --------                  --------           ------
  cpu                       51063m (96%)       83248m (157%)
  memory                    99256222976 (38%)  132428537Ki (52%)
  ephemeral-storage         0 (0%)             0 (0%)
  nvidia.com/gpu            7                  7
  tencent.com/vcuda-core    0                  0
  tencent.com/vcuda-memory  0                  0
Events:                     <none>

The go.mod:

module tkestack.io/gpu-admission

go 1.13

replace (
        //k8s.io/api => github.com/kubernetes/kubernetes/staging/src/k8s.io/api v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/api => k8s.io/api kubernetes-1.13.5

        k8s.io/apiextensions-apiserver => github.com/kubernetes/kubernetes/staging/src/k8s.io/apiextensions-apiserver v0.0.0-20190816231410-2d3c76f9091b
        
        //k8s.io/apimachinery => github.com/kubernetes/kubernetes/staging/src/k8s.io/apimachinery v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/apimachinery => k8s.io/apimachinery kubernetes-1.13.5
        
        k8s.io/apiserver => github.com/kubernetes/kubernetes/staging/src/k8s.io/apiserver v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/cli-runtime => github.com/kubernetes/kubernetes/staging/src/k8s.io/cli-runtime v0.0.0-20190816231410-2d3c76f9091b

        //k8s.io/client-go => github.com/kubernetes/kubernetes/staging/src/k8s.io/client-go v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/client-go => k8s.io/client-go kubernetes-1.13.5


        k8s.io/cloud-provider => github.com/kubernetes/kubernetes/staging/src/k8s.io/cloud-provider v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/cluster-bootstrap => github.com/kubernetes/kubernetes/staging/src/k8s.io/cluster-bootstrap v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/code-generator => github.com/kubernetes/kubernetes/staging/src/k8s.io/code-generator v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/component-base => github.com/kubernetes/kubernetes/staging/src/k8s.io/component-base v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/cri-api => github.com/kubernetes/kubernetes/staging/src/k8s.io/cri-api v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/csi-translation-lib => github.com/kubernetes/kubernetes/staging/src/k8s.io/csi-translation-lib v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/kube-aggregator => github.com/kubernetes/kubernetes/staging/src/k8s.io/kube-aggregator v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/kube-controller-manager => github.com/kubernetes/kubernetes/staging/src/k8s.io/kube-controller-manager v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/kube-proxy => github.com/kubernetes/kubernetes/staging/src/k8s.io/kube-proxy v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/kube-scheduler => github.com/kubernetes/kubernetes/staging/src/k8s.io/kube-scheduler v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/kubelet => github.com/kubernetes/kubernetes/staging/src/k8s.io/kubelet v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/legacy-cloud-providers => github.com/kubernetes/kubernetes/staging/src/k8s.io/legacy-cloud-providers v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/metrics => github.com/kubernetes/kubernetes/staging/src/k8s.io/metrics v0.0.0-20190816231410-2d3c76f9091b
        k8s.io/sample-apiserver => github.com/kubernetes/kubernetes/staging/src/k8s.io/sample-apiserver v0.0.0-20190816231410-2d3c76f9091b
)

require (
        github.com/gogo/protobuf v1.1.1 // indirect
        github.com/golang/protobuf v1.3.2 // indirect
        github.com/json-iterator/go v1.1.7 // indirect
        github.com/julienschmidt/httprouter v1.3.1-0.20191005171706-08a3b3d20bbe
        github.com/spf13/pflag v1.0.5
        golang.org/x/net v0.0.0-20191109021931-daa7c04131f5 // indirect
        golang.org/x/sys v0.0.0-20191010194322-b09406accb47 // indirect
        k8s.io/api v0.0.0
        k8s.io/apimachinery v0.0.0
        k8s.io/client-go v0.0.0
        k8s.io/component-base v0.0.0
        k8s.io/klog v0.3.1
        k8s.io/kubernetes v1.15.5
)
