
Comments (6)

timoreimann commented on July 22, 2024

Hey @mishushakov and @sandhose

thanks for your feedback. Point taken on running a single node for cost-efficiency reasons while still wanting to ensure uptime during rolling restarts.

Regarding the point about "writing better software" to ensure that upgrades don't get delayed: that's partially true and partially not, because how long a drain takes also depends on the customer workload. We use the eviction API for draining and give each node (I think) 30 minutes to move pods around. Workloads that do not respond properly to SIGTERM end up using the entire timeframe for no reason. It's a trade-off between giving workloads a sane default to switch nodes and having a reasonable upper bound to ensure progress.
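To make the SIGTERM point concrete, here is a minimal sketch of the relevant pod settings (the name, image, and values are illustrative, not an official recommendation): a node can only finish draining early if the containers actually exit when they receive SIGTERM, within their termination grace period.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-app              # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: graceful-app
  template:
    metadata:
      labels:
        app: graceful-app
    spec:
      # Bounded shutdown window; the drain only finishes early if the
      # process exits promptly after receiving SIGTERM.
      terminationGracePeriodSeconds: 30
      containers:
      - name: app
        image: registry.example.com/graceful-app:latest   # illustrative image
        lifecycle:
          preStop:
            exec:
              # Optional small delay so in-flight requests can drain
              # before SIGTERM is sent to the process.
              command: ["sh", "-c", "sleep 5"]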

As far as I know, the nodes are within the same location, right? So if some datacenter hardware in this location fails, does it really matter how many nodes I have? Won't they all be affected?

Yes, all nodes from a given cluster run in the same datacenter. Not all failures occur at the datacenter level, however: you could experience issues within a subset of the datacenter, like individual racks or hypervisors.
Not trying to talk you into spinning up more nodes here, just clarifying the different error scenarios you could hit.

Appreciate your opinions on serverless-like platforms. I'm not absolutely certain that it would be easy to build once components like Istio and Knative are involved. That said, the value you can get out of a serverless platform for certain kinds of workloads is obvious to me as well.


timoreimann commented on July 22, 2024

We have been looking into a create-before-destroy upgrade strategy, though it comes with certain considerations. I suppose that you'd be okay with getting billed for that extra node while it is running? What if the rolling upgrade got significantly delayed or even blocked for some reason?
Genuine questions I'd like to hear your opinion on to better understand how we should design this approach.

Can I also ask why you are running a single-node cluster in the first place? That node could fail at any time and thereby cause downtime for your cluster. The legitimate concerns you raise about upgrades seem to apply to regular operations as well.


mishushakov commented on July 22, 2024

Hey all,
thanks for your responses.

I suppose that you'd be okay with getting billed for that extra node while it is running

You'd have to ask more people about that. Some will be fine with it, some maybe not, although I personally would consider a free node during upgrades a really nice perk. It would also force you to write better software, so that this:

[rolling upgrade] significantly delayed or even blocked

won't happen.

The legitimate concerns you raise about upgrades seem to apply to regular operations as well

To be honest, I have never experienced any issues with your Kubernetes service / Droplets so far. They seem to run pretty stably, but if some accident happens I will reconsider adding more nodes. As far as I know, the nodes are within the same location, right? So if some datacenter hardware in this location fails, does it really matter how many nodes I have? Won't they all be affected?

Can I also ask why you are running a single-node cluster in the first place

Oh, that's a very good question. Why would I need Kubernetes at all, when I only run a single node? Or why would I need a LoadBalancer, when there is nothing to balance?

I'm an application developer. I don't want to mess with any hardware, configuration, etc. In fact, I have never SSH'd into my node.

Kubernetes for me is a tool that I use to describe my application's requirements; it provisions those resources and handles re-scheduling of pods if they fail. Overall, it just keeps my computing resources more organised and portable.

The reason there is only a single node in my cluster: I don't require more nodes. The LoadBalancer ensures that even if the node's IP changes, traffic will still route to it.
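For context, the setup is roughly the following; a minimal sketch with made-up names, the point being that the cloud load balancer keeps a stable entry point in front of whichever node currently runs the pods:

apiVersion: v1
kind: Service
metadata:
  name: my-app                # hypothetical name
spec:
  type: LoadBalancer          # DigitalOcean provisions a cloud load balancer for this Service
  selector:
    app: my-app               # pods carrying this label receive the traffic
  ports:
  - port: 80                  # port exposed by the load balancer
    targetPort: 8080          # port the pod listens on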

Take a look at my usage (2GB 1vCPU):

[Screenshots from 2019-08-06: CPU and memory usage graphs for the node]

As you can see, I'm effectively losing 70-80% of CPU capacity and about 30-50% of RAM. That means that even with the smallest node possible, I still overpay.

It's a tradeoff between being HA and thinking rationally: why would I add more hardware if I can't even utilise what I have now? Why would you buy another house in the same city you already live in?

What can be done?

  1. Nodes that have dynamic hardware capacity, or just more options to choose from
  2. A managed, serverless offering from DO

I would still like the same UX as Kubernetes, but without managing, scaling, or maintaining nodes at all. Plus, I want my applications to idle when they aren't serving anything, so I don't overpay for resources I don't use and you can serve more customers with the same hardware.

I'm looking at Cloud Run and Fargate, and they seem pretty close. What I need is:

  • Managed runtime, billed per request
  • Managed routing and load balancing
  • Automatic scaling, including scaling down to zero
  • Networking between functions
  • Ability to run (cron) jobs
  • Ability to assign a domain, a subdomain, or a wildcard subdomain to a function, with automatic TLS certificates
  • Ability to deploy TCP, HTTP, and WebSocket functions
  • Ability to scale stateful workloads (based on CPU usage)
  • Ability to set secrets, environment variables, and runtime arguments
  • Support for volumes and volume mounts
  • Support for private registries

With all your resources it would be quite easy to build something like that.

The way Cloud Run works is that they just have a big shared k8s cluster + Istio + Knative, and you get a nice dashboard where you can put your .yaml and not worry about anything.
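As a rough illustration of the kind of .yaml I mean, assuming a Knative Serving setup (the name, image, and annotation values below are made up):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-function                                    # hypothetical function name
spec:
  template:
    metadata:
      annotations:
        # Autoscaler hints: allow scale-to-zero and cap the replica count.
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "10"
    spec:
      containers:
      - image: registry.example.com/my-function:latest   # illustrative image
        env:
        - name: LOG_LEVEL
          value: info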

Thanks. Would appreciate your feedback on that.


timoreimann commented on July 22, 2024

We have been supporting surge upgrades for quite a while now (enabled by default when creating clusters via the UI, and requiring explicit enablement when using the API / doctl): the tl;dr is that we can now create new worker nodes before tearing down old ones, thereby enabling proper uptime for single-node clusters as well (in addition to speeding up upgrades).

See the docs for details.


sandhose commented on July 22, 2024

I'd love to have an option for a create-before-destroy upgrade strategy.
Right now, when upgrading a 3-node cluster, you usually end up with two nodes that are overutilized and one that is almost idle. You end up with this because when each node is drained, its pods are re-scheduled onto the two remaining nodes, and the cluster does not rebalance itself when the new node becomes available.
Also, I kind of get why you'd have 1-node clusters for small/cheap side projects that don't need to be really reliable, but you still want to avoid 5-15 minute downtimes each time you do an upgrade.


liarco commented on July 22, 2024

Right now, when upgrading a 3-node cluster, you usually end up with two nodes that are overutilized and one that is almost idle. You end up with this because when each node is drained, its pods are re-scheduled onto the two remaining nodes, and the cluster does not rebalance itself when the new node becomes available.

I second this: even though things are getting better with 1.16+ (topologySpreadConstraints are useful), they are not always supported by Helm charts.

@sandhose maybe this can help you:

apiVersion: apps/v1
kind: Deployment
# ...
spec:
  # ...
  template:
    metadata:
      labels:
        app: my-app-name
    spec:
      # ...
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: doks.digitalocean.com/node-id
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: my-app-name

Topology spread constraints allow you to avoid scheduling multiple pods onto the same node and to keep them pending until a new node is available:

whenUnsatisfiable: DoNotSchedule tells the scheduler to let it stay pending if the incoming Pod can’t satisfy the constraint.

This is not limited to replicas of a single Deployment; it can also be applied to pods from different Deployments (as long as they match the labels), so you can express constraints like "no more than one replica on the same node" AND "no more than five pods of the same type on the same node".
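For example, a second constraint on a shared label could look roughly like this (a sketch with made-up label names; note that maxSkew bounds the imbalance between nodes rather than an absolute per-node count, so it only approximates "no more than five of the same type per node"):

apiVersion: apps/v1
kind: Deployment
# ...
spec:
  # ...
  template:
    metadata:
      labels:
        app: my-app-name
        type: web-service        # hypothetical label shared by several Deployments
    spec:
      # ...
      topologySpreadConstraints:
      # At most one replica of this Deployment per node.
      - maxSkew: 1
        topologyKey: doks.digitalocean.com/node-id
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: my-app-name
      # Keep all pods carrying the shared label roughly balanced across nodes
      # (at most 5 more on any one node than on the least-loaded node).
      - maxSkew: 5
        topologyKey: doks.digitalocean.com/node-id
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            type: web-service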

