
Kueue


Kueue is a set of APIs and controller for job queueing. It is a job-level manager that decides when a job should be admitted to start (allowing its pods to be created) and when it should stop (requiring its active pods to be deleted).

Read the overview to learn more.

Features overview

  • Job management: Support job queueing based on priorities with different strategies: StrictFIFO and BestEffortFIFO.
  • Resource management: Support resource fair sharing and preemption with a variety of policies between different tenants.
  • Dynamic resource reclaim: A mechanism to release quota as the pods of a Job complete.
  • Resource flavor fungibility: Quota borrowing or preemption in ClusterQueue and Cohort.
  • Integrations: Built-in support for popular jobs, e.g. BatchJob, Kubeflow training jobs, RayJob, RayCluster, JobSet, plain Pod.
  • System insight: Built-in Prometheus metrics to help monitor the state of the system, as well as Conditions.
  • AdmissionChecks: A mechanism for internal or external components to influence whether a workload can be admitted.
  • Advanced autoscaling support: Integration with cluster-autoscaler's provisioningRequest via admissionChecks.
  • Sequential admission: A simple implementation of all-or-nothing scheduling.
  • Partial admission: Allows jobs to run with a smaller parallelism, based on available quota, if the application supports it.

Production Readiness status

  • ✔️ API version: v1beta1, respecting the Kubernetes Deprecation Policy

  • ✔️ Up-to-date documentation.

  • ✔️ Test coverage.

  • ✔️ Scalability verification via performance tests.

  • ✔️ Monitoring via metrics.

  • ✔️ Security: RBAC based accessibility.

  • ✔️ Stable release cycle (2-3 months) for new features, bugfixes, cleanups.

  • ✔️ Adopters running on production.

    Based on community feedback, we continue to simplify and evolve the API to address new use cases.

Installation

Requires Kubernetes 1.22 or newer.

To install the latest release of Kueue in your cluster, run the following command:

kubectl apply --server-side -f https://github.com/kubernetes-sigs/kueue/releases/download/v0.6.2/manifests.yaml

The controller runs in the kueue-system namespace.

Read the installation guide to learn more.

Usage

A minimal configuration can be set up by applying one of the provided examples:

kubectl apply -f examples/admin/single-clusterqueue-setup.yaml
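Roughly, this example creates a default ResourceFlavor, a ClusterQueue with quota for it, and a LocalQueue in the default namespace pointing at the ClusterQueue; a sketch along these lines (exact names and quota values may differ from the shipped file):

apiVersion: kueue.x-k8s.io/v1beta1
kind: ResourceFlavor
metadata:
  name: default-flavor
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: cluster-queue
spec:
  namespaceSelector: {}  # admit workloads from all namespaces
  resourceGroups:
  - coveredResources: ["cpu", "memory"]
    flavors:
    - name: default-flavor
      resources:
      - name: "cpu"
        nominalQuota: 9
      - name: "memory"
        nominalQuota: 36Gi
---
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default
  name: user-queue
spec:
  clusterQueue: cluster-queue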

Then you can run a job with:

kubectl create -f examples/jobs/sample-job.yaml
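The sample is an ordinary batch/v1 Job that targets the LocalQueue via the kueue.x-k8s.io/queue-name label and is created in a suspended state; a sketch (image, sizes, and counts are illustrative):

apiVersion: batch/v1
kind: Job
metadata:
  generateName: sample-job-
  labels:
    kueue.x-k8s.io/queue-name: user-queue  # targets the LocalQueue above
spec:
  parallelism: 3
  completions: 3
  suspend: true  # Kueue unsuspends the Job once it admits the workload
  template:
    spec:
      containers:
      - name: main
        image: busybox
        args: ["sleep", "30"]
        resources:
          requests:
            cpu: "1"
      restartPolicy: Never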

Learn more about Kueue concepts and common tasks in the documentation.

Architecture

Learn more about the architecture of Kueue in the project's design docs.

Roadmap

This is a high-level overview of the main priorities for 2023, in expected order of release:

  • Cooperative preemption support for workloads that implement checkpointing #477
  • Flavor assignment strategies, e.g. minimizing cost vs minimizing borrowing #312
  • Integration with cluster-autoscaler for guaranteed resource provisioning
  • Integration with common custom workloads #74:
    • Kubeflow (TFJob, MPIJob, etc.)
    • Spark
    • Ray
    • Workflows (Tekton, Argo, etc.)

These are features that we aim to have in the long-term, in no particular order:

  • Budget support #28
  • Dashboard for management and monitoring for administrators
  • Multi-cluster support

Community, discussion, contribution, and support

Learn how to engage with the Kubernetes community on the community page and the contributor's guide.

You can reach the maintainers of this project via the #wg-batch channel on Kubernetes Slack and the wg-batch mailing list.

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.


kueue's Issues

Add support for budgets

Currently ClusterQueue supports usage limits at a specific point in time. A common use case is for batch admins to set up budgets, meaning usage limits over periods of time; for example, x cores over a period of one month.

Support for hierarchical ClusterQueues

Systems like Yarn allow creating a hierarchy of fair sharing, which allows modeling deeper organizational structures with fair-sharing.

Kueue currently supports three organizational levels: Cohort (models a business unit), ClusterQueue (models divisions within a business unit), and namespace (models teams within a division). However, fair sharing is only supported at one level, within a cohort.

We opted out of supporting hierarchy from the beginning for two reasons: (1) it adds complexity to both the API and the implementation; (2) it is not clear that customers in practice need more than the two levels of sharing that the current model enables, which also seems to work for other frameworks like Slurm and HTCondor.

As Kueue evolves, we will likely need to revisit this decision.

Graduate API to beta

Currently, this would be very cumbersome due to the lack of support from kubebuilder kubernetes-sigs/controller-tools#656

Once the support is added and we are ready to publish a v1beta1, we should consider renaming the api group. Note that this requires an official api-review kubernetes/enhancements#1111

Summary doc: https://docs.google.com/document/d/1Uu4hfGxux4Wh_laqZMLxXdEVdty06Sb2DwB035hj700/edit?usp=sharing&resourcekey=0-b7mU7mGPCkEfhjyYDsXOBg (join https://groups.google.com/a/kubernetes.io/g/wg-batch to access)

Potential changes when graduating:

  • Move admission from Workload spec into status (from #498)
  • Rename min, max to something easier to understand.
  • Support queue name as a label, in addition to annotation (makes it easier to filter workloads by queue).
  • Add ObjectMeta into each PodSet template.

Dynamically reclaiming resources

Currently, a job's resources are reclaimed by Kueue only when the whole job finishes; for jobs with multiple pods, this entails waiting until the last pod finishes. This is not efficient, as a parallel job may have straggler pods that consume few resources compared to the overall job.

One solution is to continuously update the Workload object with the number of completed pods so that Kueue can gradually reclaim the resources of those pods.
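For illustration, the Workload status could carry a per-PodSet count of pods whose resources are no longer needed; the field names below are assumptions of this sketch:

status:
  reclaimablePods:   # assumed field: pods whose quota can be released early
  - name: main       # PodSet name
    count: 2         # pods already finished; their quota can be reclaimed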

Fix go lint warnings

After running golangci-lint and gofmt, the following is shown (i.e., errors returned by calls like kueue.AddToScheme and cache.AddCapacity should be checked or explicitly ignored):

pkg/capacity/capacity_test.go:37:19: Error return value is not checked (errcheck)
        kueue.AddToScheme(scheme)
                         ^
pkg/capacity/capacity_test.go:66:23: Error return value of `cache.AddCapacity` is not checked (errcheck)
                                        cache.AddCapacity(context.Background(), &c)
                                                         ^
pkg/capacity/capacity_test.go:89:26: Error return value of `cache.UpdateCapacity` is not checked (errcheck)
                                        cache.UpdateCapacity(&c)
                                                            ^
pkg/capacity/capacity_test.go:203:19: Error return value is not checked (errcheck)
        kueue.AddToScheme(scheme)
                         ^
pkg/capacity/snapshot_test.go:38:19: Error return value is not checked (errcheck)
        kueue.AddToScheme(scheme)
                         ^
pkg/capacity/snapshot_test.go:122:20: Error return value of `cache.AddCapacity` is not checked (errcheck)
                cache.AddCapacity(context.Background(), &cap)
                                 ^
pkg/queue/manager.go:61:21: Error return value of `qImpl.setProperties` is not checked (errcheck)
        qImpl.setProperties(q)
                           ^
pkg/queue/manager_test.go:394:19: Error return value of `manager.AddQueue` is not checked (errcheck)
                manager.AddQueue(ctx, &q)
                                ^
pkg/queue/manager_test.go:452:20: Error return value of `manager.AddQueue` is not checked (errcheck)
                        manager.AddQueue(ctx, &q)
                                        ^
pkg/queue/manager_test.go:462:19: Error return value of `manager.AddQueue` is not checked (errcheck)
                manager.AddQueue(ctx, &q)
                                ^
pkg/scheduler/scheduler.go:200:32: Error return value of `s.capacityCache.AssumeWorkload` is not checked (errcheck)
        s.capacityCache.AssumeWorkload(newWorkload)
                                      ^
pkg/capacity/capacity.go:120:2: S1023: redundant `return` statement (gosimple)
        return
        ^
pkg/capacity/snapshot_test.go:292:4: SA9003: empty branch (staticcheck)
                        if m == nil {
                        ^
make: *** [Makefile:73: ci-lint] Error 1

Make the GPU a prime citizen in kueue

Hello fellow HPC and batch enthusiasts. I have read your public doc with much interest, and I have seen that the GPU is mentioned a couple of times. To make kueue and GPUs a success story, I think we need to align the requirements that kueue has for scheduling with a k8s stack that exposes the right information for making the right scheduling decisions.

There are dedicated GPUs, MIG slices, and vGPUs (either time-shared or MIG-backed); all of these features need to be taken into consideration. Going further, if we're doing multi-node with MPI and such, we also need to think about network topologies and node interconnects. You may rather use nodes that have GPUDirect enabled than nodes that have "only" a GPU with a slow Ethernet connection.

I am one of the tech-leads for accelerator enablement on Kubernetes at NVIDIA and I am happy to help to move this forward.

Support dynamically sized (elastic) jobs

We should have a clear path towards supporting Spark and other dynamically sized jobs. Another example of this is Ray.

One related aspect is supporting dynamic updates to the resource requirements of a workload. We can probably limit that to changing the count of a PodSet in QueuedWorkload (in Spark, the number of workers can change during the runtime of the job, but not the resource requirements of a worker).

One idea is to model it in a way similar to "in-place update of pod resources" [1], but in our case it would be the count that is mutable. The driver pod in Spark would watch the corresponding QueuedWorkload instance and adjust the number of workers when the new count is admitted.

[1] https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/1287-in-place-update-pod-resources

Set pending condition on QueuedWorkload with message

A queued workload can be pending for several reasons:

  • The Queue doesn't exist
  • The ClusterQueue doesn't exist
  • The QW's namespace is not allowed by the ClusterQueue
  • The workload was attempted for scheduling but it didn't fit.

We need to find a way to set this information.

The first two can probably happen in the queuedworkload_controller, after every update.
The other two should probably be set during scheduling.
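For example, the controllers could surface this as a condition on the QueuedWorkload status; the type, reason, and message below are illustrative:

status:
  conditions:
  - type: Admitted          # illustrative condition type
    status: "False"
    reason: Pending
    message: "ClusterQueue main-queue doesn't exist"  # illustrative message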

/kind feature

Support Argo/Tekton workflows

This is lower priority than #65, but it would be good to have an integration with a workflow framework.

Argo supports the suspend flag, but the tricky part is that suspend applies to the whole workflow, meaning a QueuedWorkload would need to represent the resources of the whole workflow all at once.

Ideally, Argo would create jobs per sequential step, so that resource reservation happens one step at a time.

Match workload affinity with capacity labels

During workload scheduling, a workload's node affinities and selectors should be matched against the labels of the resource flavors. This allows a workload to specify which exact flavors to use, or even force a different evaluation order of the flavors than that defined by the capacity.
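For example, if a flavor carries the label cloud.provider.com/accelerator: nvidia-tesla-k80, a workload could select that flavor with a plain node selector on its pod template; a sketch:

spec:
  template:
    spec:
      nodeSelector:
        cloud.provider.com/accelerator: nvidia-tesla-k80  # matches the flavor's label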

/kind feature

Flavors with matching names should have identical labels/taints

A capacity can borrow resources from flavors matching the names of ones defined in the capacity. Those flavors with matching names should also have identical labels and taints.

One solution is to define a cluster-scoped object API that represents resource flavors that capacities refer to by name when setting a quota. It would look like this:

type ResourceFlavorSpec struct {
  // The object name serves as the flavor name, e.g., nvidia-tesla-k80.

  // Resource is the resource name, e.g., nvidia.com/gpus.
  Resource v1.ResourceName

  // Labels associated with this flavor. These labels are matched against or
  // converted to node affinity constraints on the workload's pods.
  // For example, cloud.provider.com/accelerator: nvidia-tesla-k80.
  Labels map[string]string

  // Taints associated with this flavor that workloads must explicitly
  // "tolerate" to be able to use this flavor.
  // e.g., cloud.provider.com/preemptible="true":NoSchedule
  Taints []Taint
}

This will avoid duplicating labels/taints on each capacity, and so makes it easier to create a cohort of capacities with similar resources.

The downside, of course, is that we now have another resource that the batch admin needs to deal with. But I expect that the number of flavors will typically be small.
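A sketch of what such a cluster-scoped object could look like, assuming the kueue.x-k8s.io API group used elsewhere in the project (the schema itself is what is being proposed here):

apiVersion: kueue.x-k8s.io/v1alpha1
kind: ResourceFlavor
metadata:
  name: nvidia-tesla-k80   # the object name serves as the flavor name
spec:
  resource: nvidia.com/gpus
  labels:
    cloud.provider.com/accelerator: nvidia-tesla-k80
  taints:
  - key: cloud.provider.com/preemptible
    value: "true"
    effect: NoSchedule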

Support for workload preemption

Preemption can be useful to reclaim borrowed capacity, however the obvious tradeoff is interrupting workloads and potentially losing significant progress.

There are two high-level design decisions we need to make, including whether they should be tunable:

  1. What triggers preemption? Reclaiming borrowed capacity? Workload priority?
  2. What is the scope? Is preemption a cohort knob? A capacity knob? A queue knob?

ClusterQueue updates/deletions and running workloads

With regard to CQ deletions, perhaps we can inject finalizers to block the delete until all running workloads finish, while at the same time stopping admission of new workloads.

What about CQ updates? One simple solution is to make everything immutable, so that updating a CQ is only possible by recreating it; this reduces update to a delete, which we already handled above. We can relax this a little by allowing the following updates:

  1. an increase to existing quota
  2. adding new resources and/or flavors
  3. setting a cohort only if it was not set before

All of these updates don't impact running workloads and can be applied without checking current usage levels.

/kind feature

Make unit tests run at least 3 times

We should not allow any flakiness in our unit tests. The prow job should run the tests with -race -count 3

Leave the option in the Makefile to run the tests only once (by default), as it's likely useful during development.
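For example, the CI target could run something like:

go test -race -count=3 ./...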

/priority important-soon
/kind cleanup

Make sure assumed workloads are deleted when the object is deleted

Since the scheduler works on a snapshot, it's possible that a workload is deleted between the time we get it from a queue and when we assume it.

We should check the client cache before Assuming a workload to make sure it still exists.

Also, when a workload is deleted, we should clear the cache even if the workload API object is not assigned (regardless of DeleteStateUnknown). This is because the workload could be deleted between the time the scheduler assumes it and the time it updates the assignment in the API.

/kind bug

Replace borrowing ceiling with weight

bit.ly/kueue-apis defined a weight to dynamically set a borrowing ceiling for each Capacity, based on the total resources in the Cohort and the capacities that have pending workloads.

We need to implement such behavior and remove the ceiling.
The weights and unused resources should lead to a dynamic ceiling that is calculated in every scheduling cycle. The exact semantics of this calculation are not fully understood:
In a given scheduling cycle, which capacities are considered for splitting the unused resources? Only the ones with pending jobs? What about the ones that are already borrowing but have no more pending jobs? What is considered unused resources once some resources have already been borrowed?

There are probably a few interpretations of these questions that lead to slightly different results. We need to explore them and pick one that sounds most reasonable or is grounded in existing systems.

Publish kueue in GCR

We don't necessarily need to wait for a production-ready version; we can publish alpha/beta builds.

/kind feature

Need to improve the readability of the log

1.6451684909657109e+09	INFO	controller-runtime.metrics	Metrics server is starting to listen	{"addr": "127.0.0.1:8080"}
1.6451684909663508e+09	INFO	setup	starting manager
1.6451684909665146e+09	INFO	Starting server	{"kind": "health probe", "addr": "[::]:8081"}
1.645168490966593e+09	INFO	Starting server	{"path": "/metrics", "kind": "metrics", "addr": "127.0.0.1:8080"}
I0218 07:14:51.066639       1 leaderelection.go:248] attempting to acquire leader lease kueue-system/c1f6bfd2.gke-internal.googlesource.com...
I0218 07:15:07.705977       1 leaderelection.go:258] successfully acquired lease kueue-system/c1f6bfd2.gke-internal.googlesource.com
1.6451685077060497e+09	DEBUG	events	Normal	{"object": {"kind":"ConfigMap","namespace":"kueue-system","name":"c1f6bfd2.gke-internal.googlesource.com","uid":"e70e4b9b-54f4-4782-a904-e57d3001c8e6","apiVersion":"v1","resourceVersion":"264201"}, "reason": "LeaderElection", "message": "kueue-controller-manager-7ff7b759bf-nszmb_05445f7f-a871-4a4c-83c1-af075b850e49 became leader"}
1.6451685077061899e+09	DEBUG	events	Normal	{"object": {"kind":"Lease","namespace":"kueue-system","name":"c1f6bfd2.gke-internal.googlesource.com","uid":"72b48bf0-20e0-42a4-823b-2a6edcb3288a","apiVersion":"coordination.k8s.io/v1","resourceVersion":"264202"}, "reason": "LeaderElection", "message": "kueue-controller-manager-7ff7b759bf-nszmb_05445f7f-a871-4a4c-83c1-af075b850e49 became leader"}
1.6451685077062488e+09	INFO	controller.queue	Starting EventSource	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "Queue", "source": "kind source: *v1alpha1.Queue"}
1.645168507706281e+09	INFO	controller.queue	Starting Controller	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "Queue"}
1.6451685077062566e+09	INFO	controller.queuedworkload	Starting EventSource	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "QueuedWorkload", "source": "kind source: *v1alpha1.QueuedWorkload"}
1.6451685077063015e+09	INFO	controller.queuedworkload	Starting Controller	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "QueuedWorkload"}
1.6451685077062776e+09	INFO	controller.capacity	Starting EventSource	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "Capacity", "source": "kind source: *v1alpha1.Capacity"}
1.6451685077063189e+09	INFO	controller.capacity	Starting Controller	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "Capacity"}
1.6451685077064047e+09	INFO	controller.job	Starting EventSource	{"reconciler group": "batch", "reconciler kind": "Job", "source": "kind source: *v1.Job"}
1.6451685077064307e+09	INFO	controller.job	Starting EventSource	{"reconciler group": "batch", "reconciler kind": "Job", "source": "kind source: *v1alpha1.QueuedWorkload"}
1.6451685077064393e+09	INFO	controller.job	Starting Controller	{"reconciler group": "batch", "reconciler kind": "Job"}
1.6451685078075259e+09	INFO	controller.queuedworkload	Starting workers	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "QueuedWorkload", "worker count": 1}
1.6451685078075113e+09	INFO	controller.capacity	Starting workers	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "Capacity", "worker count": 1}
1.645168507807566e+09	INFO	controller.queue	Starting workers	{"reconciler group": "kueue.x-k8s.io", "reconciler kind": "Queue", "worker count": 1}
1.6451685078076618e+09	INFO	controller.job	Starting workers	{"reconciler group": "batch", "reconciler kind": "Job", "worker count": 1}
1.645168507807886e+09	LEVEL(-2)	job-reconciler	Job reconcile event	{"job": {"name":"ingress-nginx-admission-create","namespace":"kube-system"}}
1.645168507808418e+09	LEVEL(-2)	job-reconciler	Job reconcile event	{"job": {"name":"ingress-nginx-admission-patch","namespace":"kube-system"}}
1.6451685078085716e+09	LEVEL(-2)	job-reconciler	Job reconcile event	{"job": {"name":"kube-eventer-init-v1.6-a92aba6-aliyun","namespace":"kube-system"}}
1.6451706903900485e+09	LEVEL(-2)	capacity-reconciler	Capacity create event	{"capacity": {"name":"cluster-total"}}
1.6451706904384277e+09	LEVEL(-2)	queue-reconciler	Queue create event	{"queue": {"name":"main","namespace":"default"}}
1.6451707150770907e+09	LEVEL(-2)	job-reconciler	Job reconcile event	{"job": {"name":"sample-job-jjbq2","namespace":"default"}}
1.6451707150895817e+09	LEVEL(-2)	queued-workload-reconciler	QueuedWorkload create event	{"queuedWorkload": {"name":"sample-job-jjbq2","namespace":"default"}, "queue": "main", "status": "pending"}
1.645170715089716e+09	LEVEL(-2)	scheduler	Workload assumed in the cache	{"queuedWorkload": {"name":"sample-job-jjbq2","namespace":"default"}, "capacity": "cluster-total"}
1.6451707150901928e+09	LEVEL(-2)	job-reconciler	Job reconcile event	{"job": {"name":"sample-job-jjbq2","namespace":"default"}}
1.6451707150984285e+09	LEVEL(-2)	scheduler	Successfully assigned capacity and resource flavors to workload	{"queuedWorkload": {"name":"sample-job-jjbq2","namespace":"default"}, "capacity": "cluster-total"}
1.6451707150985863e+09	LEVEL(-2)	queued-workload-reconciler	QueuedWorkload update event	{"queuedWorkload": {"name":"sample-job-jjbq2","namespace":"default"}, "queue": "main", "capacity": "cluster-total", "status": "assigned", "prevStatus": "pending", "prevCapacity": ""}
1.6451707150986767e+09	LEVEL(-2)	job-reconciler	Job reconcile event	{"job": {"name":"sample-job-jjbq2","namespace":"default"}}

We can choose to switch to klog/v2.

Validating that flavors of a resource are different

What if we validate that the flavors of a resource in a capacity have at least one common label key with different values?

This practically forces each flavor to point to a different set of nodes.
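For illustration, a valid pair of flavors could share the key instance-type with differing values (the shape and names below are illustrative):

flavors:
- name: spot
  labels:
    instance-type: spot        # common key, value "spot"
- name: on-demand
  labels:
    instance-type: on-demand   # common key, value "on-demand"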

Add events that track a workload's status

Two possible locations to issue events:

  • when it is assigned a capacity in the scheduling loop.
  • in the job-controller when a corresponding workload is created.

/kind feature

Match workload tolerations with capacity taints

During workload scheduling, a workload's tolerations should be matched against the taints of the resource flavors. This allows a workload to opt-in to specific flavors.
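For example, to opt in to a flavor tainted with cloud.provider.com/preemptible="true":NoSchedule, the workload's pod template would carry a matching toleration; a sketch:

spec:
  template:
    spec:
      tolerations:
      - key: cloud.provider.com/preemptible
        operator: Equal
        value: "true"
        effect: NoSchedule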

/kind feature
/priority important-soon

Support kubeflow's MPIJob

That is, kubeflow's mpi-operator. We could have started with other custom jobs, but this one seems important enough for our audience.

They currently don't have a suspend field, so we need to add it (see the sketch below). Then we can implement the controller based on the existing kueue job-controller.
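A sketch of what the proposed field could look like on an MPIJob; the field does not exist in mpi-operator yet, and its name and placement under runPolicy are assumptions:

apiVersion: kubeflow.org/v2beta1
kind: MPIJob
metadata:
  name: sample-mpijob
spec:
  runPolicy:
    suspend: true  # proposed field: created suspended, unsuspended on admission
  # mpiReplicaSpecs (Launcher/Worker) omitted for brevity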

/label feature
/size L
/priority important-longterm

Add info to Queue status

Suggestions:

  • Number of pending jobs
  • Number of started jobs
  • Resources currently used by the queue.
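Put together, a hypothetical Queue status could look like the following (all field names are assumptions of this sketch):

status:
  pendingWorkloads: 3    # hypothetical field: jobs waiting to be admitted
  admittedWorkloads: 2   # hypothetical field: jobs started
  usedResources:         # hypothetical field: resources currently in use
    cpu: "10"
    memory: 32Gi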

/kind feature

Ensure test cases are independent

In an effort to get a binary that "works", we wrote some tests where a test case depends on the state left by previous test cases.

This makes debugging problems harder, and it tends to require a lot of test changes when there is a behavior change or when you want to insert a case in the middle of the existing ones.

I'm aware of several places where this happens, and there are similar situations elsewhere that are more like a single complex test case each.

/priority backlog

Brainstorm enhancing UX

We are adding more information to the statuses of the various APIs we have (#7 and #5), but I am wondering what other UX-related enhancements we should pursue for our two personas: batch admin and batch user.

UX gets users excited about a system, and I think it should be a focal point as Kueue evolves.

Add user guide

/kind feature
/size M

Something more comprehensive than the existing README. Some of the use cases in bit.ly/kueue-apis can be turned into samples/guides.

If possible, generate some documentation out of the APIs, similar to https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.23/

Contents (not necessarily one page each; they could be sections on existing pages):

  • Single CQ setup
  • Multiple flavors
  • Multiple CQ setup (cohorts)
  • Namespace selectors
  • Cohorts
  • Running a Job
  • Configuring RBAC
  • Monitoring usage (kubectl describe)

controller.kubernetes.io/queue-name annotation not registered

The code in this repo uses an annotation, controller.kubernetes.io/queue-name, that is not registered in https://kubernetes.io/docs/reference/labels-annotations-taints/
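For reference, the annotation in question is attached to Job objects along these lines (the queue name is illustrative):

metadata:
  annotations:
    controller.kubernetes.io/queue-name: main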

We should either:

  • register and document the annotation
  • avoid specifying controller.kubernetes.io as the namespace for that annotation, and instead require specifying it as a command line option to the app. That way, end-users wouldn't assume that any particular namespace is expected.
  • use another namespace, that is appropriate for kueue.

Consider a different image for testing/samples

I am running into

  Warning  Failed     11s   kubelet            Failed to pull image "perl": rpc error: code = Unknown desc = reading manifest latest in docker.io/library/perl: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

We might want to consider an image from a different registry to avoid this unfortunate error.

/kind test

Enhance Makefile arguments for img building and pushing

The current Makefile doesn't provide flexible ways to modify how the image is built and pushed. Something like the following would help:

VERSION := $(shell git describe --tags --dirty --always)
# Image URL to use all building/pushing image targets
IMAGE_BUILD_CMD ?= docker build
IMAGE_PUSH_CMD ?= docker push
IMAGE_BUILD_EXTRA_OPTS ?=
IMAGE_REGISTRY ?= k8s.gcr.io/kueue
IMAGE_NAME := controller
IMAGE_TAG_NAME ?= $(VERSION)
IMAGE_EXTRA_TAG_NAMES ?=
IMAGE_REPO ?= $(IMAGE_REGISTRY)/$(IMAGE_NAME)
IMAGE_TAG ?= $(IMAGE_REPO):$(IMAGE_TAG_NAME)
BASE_IMAGE_FULL ?= golang:1.17

Also, in order to be more generic, rename:

  • docker-image to simply image or image-build
  • docker-push to simply push or image-push

This provides more flexibility when developing in a non-Docker environment (e.g., buildah or podman), or even when building the image with a CI tool on Kubernetes itself.
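For example, assuming the renamed targets above (the registry is illustrative), a podman-based workflow could then run:

make image-build image-push IMAGE_BUILD_CMD="podman build" IMAGE_PUSH_CMD="podman push" IMAGE_REGISTRY=registry.example.com/kueue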

/kind feature

Add scheduler integration tests

We have one that covers the job-controller on its own; we need a test that covers all the other controllers together, creating a queue, a capacity, and multiple jobs, and verifying that jobs are started as expected.

Add workload priority

This is a placeholder to discuss priority semantics.

We can have it at the workload level or queue level.

[Umbrella] ☂️ Requirements for release 0.1.0

Deadline: May 16th (KubeCon EU)

Issues that we need to complete to consider kueue ready for a first release:

  • Match workload affinities with flavors #3
  • Single heap per Capacity #87
  • Consistent flavors in a cohort #59
  • Queue status #5
  • Capacity status #7
  • Event for unschedulable workloads #91
  • Capacity namespace selector #4
  • Efficient requeuing #8
  • User guide #64
  • Publish image #52

Nice to have:

  • Add borrowing weight #62
  • E2E test #61
  • Use kueue.sigs.k8s.io API group #23
  • Support for one custom job #65

Rename Capacity to ClusterQueue

Capacity not only defines usage limits for a set of tenants; it is also the level at which ordering is done for workloads submitted to queues sharing a capacity.

Renaming Capacity to ClusterQueue could provide clarity, with Queue being the namespaced equivalent serving two purposes:

  1. Discoverability: tenants can simply list the queues that exist in their namespace to find which ones they can submit their workloads to, so it is simply a pointer to the cluster-scoped ClusterQueue.
  2. Addressing the use case where a tenant is running an experiment and wants to define usage limits for that experiment; in this case an experiment is modeled as a queue, which means tenants should be able to create/delete queues as they see fit.

Add scalability tests

This is critical to better understand kueue's limits and where its bottlenecks are. We should check whether there is a way to use clusterloader for this.
