Comments (34)
A use case for this feature: kube-state-metrics has automated horizontal sharding (https://github.com/kubernetes/kube-state-metrics#horizontal-scaling-sharding), which is based on a StatefulSet without PVCs. The metric the community recommends for scaling that StatefulSet is latency, but that requires an HPA with custom metrics (which takes some extra work to fit into our architecture). An approximate alternative is to scale the StatefulSet based on the number of nodes (I may be wrong about this scenario).
What I would like to say is that StatefulSet PVs can be left to users to handle, and some StatefulSets may not use PVs at all (as in this case), so it may be safe to use this feature in those cases.
from cluster-proportional-autoscaler.
Sounds good to me, would the idea also be scaling StatefulSets proportionally based on cluster size?
/cc @foxish @janetkuo for more insight :)
/reopen
This is a really obvious feature to add IMO... Would really like to see it worked on.
I assume this would require a newer version of the client library too
@mwielgus @fgrzadkowski @MaciekPytel
What would you be scaling based on?
Would it be safe to just remove some instances? Would it be safe to add new ones?
We'd likely be scaling on the number of nodes, but perhaps if #19 is completed, then also on the size of nodes.
I see it that you have to know in advance that you want to scale a StatefulSet, so it's up to you to know it is safe to do so and whether a min/max setting is required. I haven't looked too deeply at the code yet to see if it would handle not being able to scale down due to a node being unavailable.
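Scaling on the number of nodes with a min/max clamp is essentially the "linear" control pattern the cluster-proportional-autoscaler already uses for Deployments. A minimal sketch of that calculation (parameter names loosely mirror the autoscaler's linear-mode config, but this is an illustration, not the project's actual code):

```python
import math

def proportional_replicas(schedulable_nodes: int,
                          nodes_per_replica: float,
                          min_replicas: int = 1,
                          max_replicas: int = 100) -> int:
    """Target replica count grows with cluster size, clamped to [min, max]."""
    replicas = math.ceil(schedulable_nodes / nodes_per_replica)
    return max(min_replicas, min(max_replicas, replicas))

print(proportional_replicas(3, 4))     # small cluster -> 1
print(proportional_replicas(50, 4))    # ceil(12.5) -> 13
print(proportional_replicas(1000, 4))  # clamped to max -> 100
```

Whether applying the resulting number to a StatefulSet is safe is still left to the operator, as discussed above; the math itself is identical to the Deployment case.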
What would be the main use-case for this?
I said in the initial post :)
We'd like to have a common set of Kubernetes manifest files that set up a base cluster and any necessary services for dev, test, and prod. Currently we struggle to run this on very small clusters/minikube, because on larger clusters we want more instances of Prometheus/etcd/Docker registry mirrors/etc.
Right, must have missed that. Sorry :).
I see it that you have to know in advance that you want to scale a StatefulSet, so it's up to you to know it is safe to do so and whether a min/max setting is required. I haven't looked too deeply at the code yet to see if it would handle not being able to scale down due to a node being unavailable.
@stuart-warren The autoscaler only takes available nodes into account.
It is totally true that it's up to the users to know whether it is safe to scale a StatefulSet. Though I'm not sure how we could make our generic controllers application-aware.
Issue related in kubernetes/kubernetes: kubernetes/kubernetes#44033
@gyliu513
The doc talks about kubectl scale, which is shorthand from the CLI to modify the number of replicas in the StatefulSet spec. kubectl autoscale creates an HPA resource for you (described in kubernetes/kubernetes#48591), and that's different. This thread is talking about cluster-proportional-autoscaler, which would scale the StatefulSet according to the number of nodes in the cluster.
For most stateful applications (ZooKeeper, MySQL, etc.), the scale needs deliberate thought and is likely not something we want to vary with the size of the cluster. @stuart-warren, for the Docker registry mirror pods, why does that use-case require a StatefulSet?
/cc @kow3ns
Hi folks, reading the docs, node autoscaling says it won't scale down a node if there is a pod not backed by a ReplicationController. Similarly, looking at the API docs, it appears the Horizontal Pod Autoscaler currently supports only pods backed by a ReplicationController.
So, a tangential question: are there any plans to back StatefulSet with a ReplicationController? Right now I don't see any ReplicaSet behind a StatefulSet. Having the pod ID and ordering guarantees is really nice (reduced metrics explosion, better kubectl dev UX, etc.), and having the same functionality as Deployments on top of that would be cool 👍
@foxish technically yes, this app doesn't need to be a StatefulSet, but we'd still like to be able to control the size of a ZooKeeper/Cassandra cluster depending on the number of nodes in the cluster.
Ideally we'd have "everything" run in minikube with reduced resource requests and single instances and in a massive production cluster with many instances and increased resource requests.
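The "one instance on minikube, many on a big production cluster" pattern maps onto the autoscaler's ladder-style lookup, where size tiers pick the replica count. A hedged sketch of that idea (the tier values here are made-up examples, not recommendations, and this is not the project's actual code):

```python
def ladder_replicas(node_count: int, ladder: list[tuple[int, int]]) -> int:
    """Pick a replica count from (min_nodes, replicas) tiers, sorted ascending."""
    replicas = 1  # default for clusters smaller than the first tier
    for min_nodes, count in ladder:
        if node_count >= min_nodes:
            replicas = count
    return replicas

tiers = [(1, 1), (10, 3), (100, 5)]
print(ladder_replicas(1, tiers))    # minikube -> 1 instance
print(ladder_replicas(250, tiers))  # large cluster -> 5 instances
```

Resource requests would still need to vary separately (e.g. via different overlays per environment); the ladder only controls the replica count.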
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Hi, I would also like to see this functionality implemented.
Looking at the source code, the change seems super simple and straightforward (but I may be terribly wrong).
The use case relates to a StatefulSet's unique ability to use volumeClaimTemplates, which provisions PVs/PVCs automatically when necessary (https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#components).
I also use podManagementPolicy: Parallel, so there is no issue if one of the lower-index pods is erroring or can't be scheduled.
There could also be a big fat WARNING in the docs highlighting the less obvious issues (discussed in some comments above) of using a StatefulSet instead of a Deployment, but the functionality would be there for those who really want it.
Thanks!
/remove-lifecycle rotten
/reopen
/reopen
@salavessa: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@diranged: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
/reopen
This is a really obvious feature to add IMO... Would really like to see it worked on.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.