
Comments (6)

timoreimann commented on July 22, 2024

Hey @mishushakov and @sandhose

thanks for your feedback. Point taken on running a single node for cost-efficiency reasons while still wanting to ensure uptime during rolling restarts.

Regarding the point about "writing better software" to ensure that upgrades don't get delayed: that's partially true and partially not, because how long a drain takes also depends on the customer workload. We use the eviction API for draining and give each node (I think) 30 minutes to move pods around. Workloads that do not respond properly to SIGTERM end up using the entire timeframe for no reason. It's a trade-off between giving workloads a sane default to switch nodes and having a reasonable upper bound to ensure progress.
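To make the SIGTERM point concrete, here is a minimal sketch of the relevant pod settings (the name, image, and values are illustrative, not an official recommendation): a node can only finish draining early if the containers actually exit when they receive SIGTERM, within their termination grace period.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: graceful-app              # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: graceful-app
  template:
    metadata:
      labels:
        app: graceful-app
    spec:
      # Bounded shutdown window; the drain only finishes early if the
      # process exits promptly after receiving SIGTERM.
      terminationGracePeriodSeconds: 30
      containers:
      - name: app
        image: registry.example.com/graceful-app:latest   # illustrative image
        lifecycle:
          preStop:
            exec:
              # Optional small delay so in-flight requests can drain
              # before SIGTERM is sent to the process.
              command: ["sh", "-c", "sleep 5"]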

As far as I know, the nodes are within the same location, right? So if some datacenter hardware in this location fails, does it really matter how many nodes I have? Won't they all be affected?

Yes, all nodes from a given cluster run in the same datacenter. Not all failures occur at the datacenter level, however: you could experience issues within a subset of the datacenter, like individual racks or hypervisors.
Not trying to talk you into spinning up more nodes here, just clarifying the different error scenarios you could hit.

Appreciate your opinions on serverless-like platforms. I'm not absolutely certain that it would be easy to build once components like Istio and Knative are involved. That said, the value you can get out of a serverless platform for certain kinds of workloads is obvious to me as well.


timoreimann commented on July 22, 2024

We have been looking into a create-before-destroy upgrade strategy, though it comes with certain considerations. I suppose that you'd be okay with getting billed for that extra node while it is running? What if the rolling upgrade got significantly delayed or even blocked for some reason?
Genuine questions I'd like to hear your opinion on to better understand how we should design this approach.

Can I also ask why you are running a single-node cluster in the first place? That node could fail at any time and thereby cause downtime for your cluster. The legitimate concerns you raise about upgrades seem to apply to regular operations as well.


mishushakov commented on July 22, 2024

Hey all,
thanks for your responses.

I suppose that you'd be okay with getting billed for that extra node while it is running

You'd have to ask more people about that. Some will be fine with it, some maybe not, although I personally would consider a free node during upgrades a really nice perk. It would also force you to write better software, so that this:

[rolling upgrade] significantly delayed or even blocked

won't happen.

The legitimate concerns you raise about upgrades seem to apply to regular operations as well

To be honest, I have never experienced any issues with your Kubernetes service / Droplets so far. They seem to run pretty stably, but if some accident happens I will reconsider adding more nodes. As far as I know, the nodes are within the same location, right? So if some datacenter hardware in this location fails, does it really matter how many nodes I have? Won't they all be affected?

Can I also ask why you are running a single-node cluster in the first place

Oh, that's a very good question. Why would I need Kubernetes at all, when I only run a single node? Or why would I need a LoadBalancer, when there is nothing to balance?

I'm an application developer. I don't want to mess with any hardware, configuration, etc. In fact, I have never SSH'd into my node.

Kubernetes for me is a tool that I use to describe my application's requirements; it provisions those resources and handles re-scheduling of pods if they fail. Overall, it just keeps my computing resources more organised and portable.

The reason there is only a single node in my cluster: I don't require more nodes. The LoadBalancer ensures that even if the node's IP changes, traffic will still route to it.
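For context, the setup is roughly the following; a minimal sketch with made-up names, the point being that the cloud load balancer keeps a stable entry point in front of whichever node currently runs the pods:

apiVersion: v1
kind: Service
metadata:
  name: my-app                # hypothetical name
spec:
  type: LoadBalancer          # DigitalOcean provisions a cloud load balancer for this Service
  selector:
    app: my-app               # pods carrying this label receive the traffic
  ports:
  - port: 80                  # port exposed by the load balancer
    targetPort: 8080          # port the pod listens on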

Take a look at my usage (2GB 1vCPU):

[Screenshots from 2019-08-06: CPU and memory usage graphs for the node]

As you can see, I'm effectively losing 70-80% of CPU capacity and about 30-50% of RAM. That means that even with the smallest node possible, I still overpay.

It's a tradeoff between being HA and thinking rationally: why would I add more hardware if I can't even utilise what I have now? Why would you buy another house in the same city you already live in?

What can be done?

  1. Nodes that have dynamic hardware capacity, or just more options to choose from
  2. A managed, serverless offering from DO

I would still like the same UX as Kubernetes, but without managing, scaling, or maintaining nodes at all. Plus, I want my applications to idle when they aren't serving anything, so I don't overpay for resources I don't use and you can serve more customers with the same hardware.

I'm looking at Cloud Run and Fargate, and they seem pretty close. What I need is:

  • Managed runtime, billed per request
  • Managed routing and load balancing
  • Automatic scaling, including scaling down to zero
  • Networking between functions
  • Ability to run (cron) jobs
  • Ability to assign a domain, a subdomain, or a wildcard subdomain to a function, with automatic TLS certificates
  • Ability to deploy TCP, HTTP, and WebSocket functions
  • Ability to scale stateful workloads (based on CPU usage)
  • Ability to set secrets, environment variables, and runtime arguments
  • Support for volumes and volume mounts
  • Support for private registries

With all your resources it would be quite easy to build something like that.

The way Cloud Run works is that they just have a big shared k8s cluster + Istio + Knative, and you get a nice dashboard where you can put your .yaml and not worry about anything.
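As a rough illustration of the kind of .yaml I mean, assuming a Knative Serving setup (the name, image, and annotation values below are made up):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-function                                    # hypothetical function name
spec:
  template:
    metadata:
      annotations:
        # Autoscaler hints: allow scale-to-zero and cap the replica count.
        autoscaling.knative.dev/minScale: "0"
        autoscaling.knative.dev/maxScale: "10"
    spec:
      containers:
      - image: registry.example.com/my-function:latest   # illustrative image
        env:
        - name: LOG_LEVEL
          value: info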

Thanks. Would appreciate your feedback on that.


timoreimann commented on July 22, 2024

We have been supporting surge upgrades for quite a while now (enabled by default when creating clusters via the UI, and requiring explicit enablement when using the API / doctl): the tl;dr is that we can now create new worker nodes before tearing down old ones, thereby enabling proper uptime for single-node clusters as well (in addition to speeding up upgrades).

See the docs for details.


sandhose commented on July 22, 2024

I'd love to have an option for a create-before-destroy upgrade strategy.
Right now, when upgrading a 3-node cluster, you usually end up with two nodes that are overutilized and one that is almost idle. You end up with this because when each node is drained, its pods are re-scheduled onto the two remaining nodes, and the cluster does not rebalance itself when the new node becomes available.
Also, I kind of get why you'd have 1-node clusters for small/cheap side projects that don't need to be really reliable, but you still want to avoid 5-15 minute downtimes each time you do an upgrade.


liarco commented on July 22, 2024

Right now, when upgrading a 3-node cluster, you usually end up with two nodes that are overutilized and one that is almost idle. You end up with this because when each node is drained, its pods are re-scheduled onto the two remaining nodes, and the cluster does not rebalance itself when the new node becomes available.

I second this: even though things are getting better with 1.16+ (topologySpreadConstraints are useful), they are not always supported by Helm charts.

@sandhose maybe this can help you:

apiVersion: apps/v1
kind: Deployment
# ...
spec:
  # ...
  template:
    metadata:
      labels:
        app: my-app-name
    spec:
      # ...
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: doks.digitalocean.com/node-id
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: my-app-name

Topology spread constraints allow you to avoid scheduling multiple pods onto the same node and to keep them pending until a new node is available:

whenUnsatisfiable: DoNotSchedule tells the scheduler to let it stay pending if the incoming Pod can’t satisfy the constraint.

This is not limited to replicas of a single Deployment; it can also be applied to pods from different Deployments (as long as they match the labels), so you can express constraints like "no more than one replica on the same node" AND "no more than five pods of the same type on the same node".
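For example, a second constraint on a shared label could look roughly like this (a sketch with made-up label names; note that maxSkew bounds the imbalance between nodes rather than an absolute per-node count, so it only approximates "no more than five of the same type per node"):

apiVersion: apps/v1
kind: Deployment
# ...
spec:
  # ...
  template:
    metadata:
      labels:
        app: my-app-name
        type: web-service        # hypothetical label shared by several Deployments
    spec:
      # ...
      topologySpreadConstraints:
      # At most one replica of this Deployment per node.
      - maxSkew: 1
        topologyKey: doks.digitalocean.com/node-id
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app: my-app-name
      # Keep all pods carrying the shared label roughly balanced across nodes
      # (at most 5 more on any one node than on the least-loaded node).
      - maxSkew: 5
        topologyKey: doks.digitalocean.com/node-id
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            type: web-service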

