digitalocean / doks
Managed Kubernetes designed for simple and cost effective container orchestration.
Home Page: https://www.digitalocean.com/products/kubernetes/
License: Apache License 2.0
https://kubernetes.io/docs/tasks/administer-cluster/reconfigure-kubelet/
i.e. set the --dynamic-config-dir flag of the kubelet so users can amend the kubelet config.
This would allow users to configure things like the max eviction grace period (#19), which is currently set to 0.
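For reference, a minimal sketch of how dynamic kubelet configuration would be wired up once the flag is enabled (DOKS does not currently enable it; the ConfigMap and node names below are placeholders):
# 1. Store the desired kubelet configuration in a ConfigMap (key "kubelet" is what configSource expects).
cat <<'EOF' > kubelet-config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionMaxPodGracePeriod: 300
EOF
kubectl -n kube-system create configmap custom-kubelet --from-file=kubelet=kubelet-config.yaml
# 2. Point the node at the ConfigMap; the kubelet picks it up through its dynamic config dir.
kubectl patch node my-node -p \
  '{"spec":{"configSource":{"configMap":{"name":"custom-kubelet","namespace":"kube-system","kubeletConfigKey":"kubelet"}}}}'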
Currently there seems to be no way to upgrade a Kubernetes cluster from one minor version to another. Therefore, upgrading to a new minor version of Kubernetes (e.g. from 1.14.x to 1.15.x) involves downtime in many cases, since resources such as load balancers and persistent volumes need to be re-created or re-attached manually.
$ doctl kubernetes cluster node-pool list my-cluster
ID Name Size Count Tags Nodes
xxx my-cluster-default-pool s-1vcpu-2gb 1 k8s,k8s:xxx,k8s:worker [my-cluster-default-pool-bvay]
$ doctl kubernetes cluster node-pool update my-cluster my-cluster-default-pool --count=0
Error: PUT https://api.digitalocean.com/v2/kubernetes/clusters/xxx/node_pools/xxx: 500 Server Error
(Actual IDs xxx'd out.)
I'm not sure what I expected scaling the default worker node pool to zero to do, but throwing an HTTP 500 feels like a bug.
Update: The same error occurs when scaling a manually created node pool to zero as well.
Versions:
$ doctl version
doctl version 1.31.2-release
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.0", GitCommit:"2bd9643cee5b3b3a5ecbd3af49d09018f0773c77", GitTreeState:"clean", BuildDate:"2019-09-21T15:34:26Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:50Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Are pod security policies supported yet?
I can't find a way to enable this in the user interface at digitalocean.com.
Thanks
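For context, a minimal example of the resource in question (sketch only; enforcing it requires the PodSecurityPolicy admission plugin to be enabled on the control plane, which is exactly what this question is about):
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-example
spec:
  privileged: false
  runAsUser:
    rule: MustRunAsNonRoot
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - configMap
  - secret
  - emptyDir
  - persistentVolumeClaim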
After upgrading from 1.15.x to 1.16.0, custom APIs appear to be broken. For example, running kubectl get apiservice shows these APIs as unavailable.
NAME                              SERVICE                             AVAILABLE                         AGE
...
v1beta1.metrics.k8s.io kube-system/metrics-server False (FailedDiscoveryCheck) 27s
...
v1beta1.webhook.cert-manager.io cert-manager/cert-manager-webhook False (FailedDiscoveryCheck) 9h
Checking these APIs reveals more info.
kubectl describe apiservice v1beta1.webhook.cert-manager.io
Name: v1beta1.webhook.cert-manager.io
Namespace:
Labels: app=webhook
app.kubernetes.io/instance=cert-manager
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=webhook
helm.sh/chart=cert-manager-v0.11.0
Annotations: cert-manager.io/inject-ca-from-secret: cert-manager/cert-manager-webhook-tls
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{"cert-manager.io/inject-ca-from-secret":"cer...
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2019-11-17T17:21:51Z
Resource Version: 11640563
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.webhook.cert-manager.io
UID: f93ef948-c0bc-42de-a2b7-9ae3248d15e2
Spec:
Ca Bundle: <certificate>
Group: webhook.cert-manager.io
Group Priority Minimum: 1000
Service:
Name: cert-manager-webhook
Namespace: cert-manager
Port: 443
Version: v1beta1
Version Priority: 15
Status:
Conditions:
Last Transition Time: 2019-11-17T17:21:51Z
Message: failing or missing response from https://10.245.173.138:443/apis/webhook.cert-manager.io/v1beta1: Get https://10.245.173.138:443/apis/webhook.cert-manager.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
kubectl describe apiservice v1beta1.metrics.k8s.io
Name: v1beta1.metrics.k8s.io
Namespace:
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apiregistration.k8s.io/v1beta1","kind":"APIService","metadata":{"annotations":{},"name":"v1beta1.metrics.k8s.io"},"spec":{"...
API Version: apiregistration.k8s.io/v1
Kind: APIService
Metadata:
Creation Timestamp: 2019-11-18T02:32:40Z
Resource Version: 11674638
Self Link: /apis/apiregistration.k8s.io/v1/apiservices/v1beta1.metrics.k8s.io
UID: 9fe336bf-f7a9-40bd-b6fe-ca3616bc28d3
Spec:
Group: metrics.k8s.io
Group Priority Minimum: 100
Insecure Skip TLS Verify: true
Service:
Name: metrics-server
Namespace: kube-system
Port: 443
Version: v1beta1
Version Priority: 100
Status:
Conditions:
Last Transition Time: 2019-11-18T02:32:40Z
Message: failing or missing response from https://10.245.47.213:443/apis/metrics.k8s.io/v1beta1: Get https://10.245.47.213:443/apis/metrics.k8s.io/v1beta1: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Reason: FailedDiscoveryCheck
Status: False
Type: Available
Events: <none>
The services above exist in the cluster, so I'm not sure what is happening. Any thoughts or ideas on how to fix this?
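One way to narrow this down (a sketch; the ClusterIP is the one from the condition message above, and the debug pod name/image are arbitrary):
# Check that the backing services have endpoints at all.
kubectl -n kube-system get endpoints metrics-server
kubectl -n cert-manager get endpoints cert-manager-webhook
# Probe the service ClusterIP from a throwaway pod to see whether it answers.
kubectl run -it --rm debug --image=curlimages/curl --restart=Never -- \
  curl -k https://10.245.47.213:443/apis/metrics.k8s.io/v1beta1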
It should be possible for pods to resolve node hostnames.
Apart from being generally useful, it is also a requirement for #2.
It would be nice if there were an option in the DO Control Panel to automatically apply a taint to any nodes that belong to a specific node pool; it could be added under the Node Pool settings.
This would help us better manage our clients, some of whom need to stay on specific nodes.
Currently I am handling it by manually re-applying the taints if I need to recycle the nodes.
I suppose I could script it and execute a job periodically, but it would be nice if it was done at the DO level.
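For reference, the manual step that has to be repeated today looks like this (node name and taint are placeholders); it is what the requested Node Pool setting would automate:
# Re-applied by hand whenever a node in the pool is recycled.
kubectl taint nodes my-pool-node-1 dedicated=client-a:NoSchedule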
Thanks for your consideration,
We intend to deploy an application that uses the io_uring API of the Linux kernel on DigitalOcean Kubernetes Service.
The API was added in Linux kernel 5.1, released on May 5, 2019.
However, DOKS nodes are running Linux 4.19.0-17, which makes io_uring unavailable to Kubernetes containers.
Is there a planned date for upgrading DOKS nodes to Linux kernel 5.x? Thanks
Hi, we're trying to configure DOKS for our disk I/O workloads. For this reason we use Droplets with NVMe disks (Basic Premium Droplets).
The problem is that these Droplets boot with the entire NVMe drive mounted as the / root ext4 partition.
For example, we use a Basic Premium Droplet with a 50GB NVMe drive, and it is partitioned like this after boot:
fdisk -l
Disk /dev/vda: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 3FAA642C-E0CB-4615-8382-0482DE0DF9B0
Device Start End Sectors Size Type
/dev/vda1 6144 104857566 104851423 50G Linux filesystem
/dev/vda2 2048 6143 4096 2M BIOS boot
Partition table entries are not in disk order.
For proper use of local drives in k8s with a CSI driver like https://github.com/minio/direct-csi, we need to have a separate partition for data, like so:
fdisk -l
Disk /dev/vda: 50 GiB, 53687091200 bytes, 104857600 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 3FAA642C-E0CB-4615-8382-0482DE0DF9B0
Device Start End Sectors Size Type
/dev/vda1 6144 31463423 31457280 15G Linux filesystem
/dev/vda2 2048 6143 4096 2M BIOS boot
/dev/vda3 31463424 104857566 73394143 35G Linux filesystem
Here the 35GB partition /dev/vda3 is used solely for data storage, and we can provide it to direct-csi or another k8s storage solution as a separate drive.
However, we cannot shrink the root / ext4 partition: doing so requires booting into safe mode, otherwise the root partition becomes read-only after being resized on a live system.
I've tried resizing / with fdisk, but then resize2fs gives an error:
resize2fs /dev/vda1
resize2fs 1.44.5 (15-Dec-2018)
Filesystem at /dev/vda1 is mounted on /; on-line resizing required
Basically, an ext4 partition needs to be unmounted before it can be shrunk, but unmounting the root / partition safely is nearly impossible: we'd need to stop all software that can write to disk, copy the root data into a tmpfs, unmount root, resize it, mount it back, and move the data back, which is very unsafe.
Ideally we'd be able to do this with either a custom ISO or a node initialization script, or DOKS could allow us to configure partitions upfront when creating a worker node so that it boots with the correct layout right away.
Related issue: digitalocean/csi-digitalocean#384
Can you please suggest any workarounds? Thanks!
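One generic workaround sometimes used when a dedicated partition cannot be carved out (not DOKS-specific, and it adds filesystem overhead compared to a raw partition; path and size are placeholders) is to back the data device with a loop device on the existing root filesystem:
# Create a file-backed block device instead of repartitioning the NVMe drive.
fallocate -l 30G /var/lib/data-disk.img
losetup --find --show /var/lib/data-disk.img   # prints e.g. /dev/loop0
# /dev/loop0 can then be handed to direct-csi or another local-storage
# provisioner in the same way a real partition such as /dev/vda3 would be.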
Being able to auto-scale DOKS clusters/nodes seems beneficial for users with dynamic workloads.
We should probably look into https://github.com/kubernetes/autoscaler for this purpose.
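For reference, DOKS later exposed node-pool autoscaling through the API and doctl; a minimal sketch (cluster and pool names are placeholders, verify the flags against your doctl version):
# Enable the cluster autoscaler for an existing node pool.
doctl kubernetes cluster node-pool update my-cluster my-pool \
  --auto-scale --min-nodes 1 --max-nodes 5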
We are implementing a GlusterFS cluster to be used as storage for persistent volumes, using the Heketi API and StorageClasses.
Pod volume mounting only works when the glusterfs client package is installed manually on the worker nodes. Since a manual package installation can be wiped out whenever the reconciler runs, we request making the package part of the worker node image so that it does not have to be installed by hand and also works in autoscale scenarios.
It appears that in a fresh DigitalOcean Kubernetes cluster, the kube-system namespace already contains a kube-state-metrics serviceaccount:
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","imagePullSecrets":[],"kind":"ServiceAccount","metadata":{"annotations":{},"name":"kube-state-metrics","namespace":"kube-system"}}
  creationTimestamp: "2021-04-24T16:33:13Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:secrets:
        .: {}
        k:{"name":"kube-state-metrics-token-766fc"}:
          .: {}
          f:name: {}
    manager: kube-controller-manager
    operation: Update
    time: "2021-04-24T16:33:13Z"
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:kubectl.kubernetes.io/last-applied-configuration: {}
    manager: kubectl-client-side-apply
    operation: Update
    time: "2021-04-24T16:33:13Z"
  name: kube-state-metrics
  namespace: kube-system
  resourceVersion: "288"
  uid: 1e228525-a73b-4a93-83d2-09c06ca719b8
secrets:
- name: kube-state-metrics-token-766fc
This is even though kube-state-metrics was not installed (and is not running in any namespace).
The cluster is at Kubernetes version 1.20.2-do.0 and was created using the Terraform DigitalOcean provider version 2.7.0.
This trips up the kube-state-metrics Helm chart version 2.13.2, which also tries to create this serviceaccount:
Error: rendered manifests contain a resource that already exists. Unable to continue with install: ServiceAccount "kube-state-metrics" in namespace "kube-system" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "kube-state-metrics"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kube-system"
In a cluster created a while back on Kubernetes 1.18, using a different version of the kube-state-metrics Helm chart, this was not the case.
Is it possible to not create this serviceaccount automatically?
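As a stopgap, the existing ServiceAccount can usually be adopted by the Helm release by adding the ownership metadata the install error asks for (release name and namespace below match the example error; DOKS may of course reconcile the object back):
kubectl -n kube-system label serviceaccount kube-state-metrics \
  app.kubernetes.io/managed-by=Helm
kubectl -n kube-system annotate serviceaccount kube-state-metrics \
  meta.helm.sh/release-name=kube-state-metrics \
  meta.helm.sh/release-namespace=kube-system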
Hello 👋
I have a bug report/question about the title:
I've recently created a new cluster on DO with the following Terraform configuration:
resource "digitalocean_kubernetes_cluster" "main" {
  # ...
  node_pool {
    # ...
    labels = {}
    tags   = []

    taint {
      key    = "x-resource-kind"
      value  = "apps"
      effect = "NoSchedule"
    }
  }
}

resource "digitalocean_kubernetes_node_pool" "pool-main-storages" {
  # ...
  labels = {}
  tags   = []

  taint {
    key    = "x-resource-kind"
    value  = "storages"
    effect = "NoSchedule"
  }
}
Basically I want newly spawned nodes to automatically be given a taint, since I want to control where my current/future pods are scheduled for internal usage. The clusters & node pools are created fine, and so is the taint:
captain@glados:~$ kubectl describe nodes pool-main-fv5zb
# ...
Taints: x-resource-kind=apps:NoSchedule
# ...
But I noticed that one of the deployments is not running (coredns):
captain@glados:~$ kubectl get deployment -n kube-system
NAME READY UP-TO-DATE AVAILABLE AGE
cilium-operator 1/1 1 1 10h
coredns 0/2 2 0 10h
captain@glados:~$ kubectl get pods -n kube-system
NAME READY STATUS RESTARTS AGE
cilium-operator-98d97cdf6-phw2j 1/1 Running 0 10h
cilium-plbv2 1/1 Running 0 10h
coredns-575d7877bb-9sxdl 0/1 Pending 0 10h
coredns-575d7877bb-pwjtl 0/1 Pending 0 10h
cpc-bridge-proxy-hl55s 1/1 Running 0 10h
konnectivity-agent-dcgsg 1/1 Running 0 10h
kube-proxy-zfn9p 1/1 Running 0 10h
captain@glados:~$ kubectl describe pod/coredns-575d7877bb-9sxdl -n kube-system
# ...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 31m (x118 over 10h) default-scheduler 0/1 nodes are available: 1 node(s) had untolerated taint {x-resource-kind: apps}. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling..
Normal NotTriggerScaleUp 2m10s (x431 over 7h16m) cluster-autoscaler pod didn't trigger scale-up: 1 node(s) had untolerated taint {x-resource-kind: apps}
Is this expected? From the events I understand why it didn't trigger the scale-up; I just don't know whether this is the proper behaviour or not.
Other kube-system pods/deployments are running fine, I think because their tolerations are set up to "always tolerate everything":
captain@glados:~$ kubectl describe pod/cilium-plbv2 -n kube-system
# ...
Tolerations: op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/network-unavailable:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
# ...
versus
captain@glados:~$ kubectl describe pod/coredns-575d7877bb-9sxdl -n kube-system
# ...
Tolerations: CriticalAddonsOnly op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
# ...
As per the reference.
If this is expected, you can close this issue. If not, the default deployment might need to be adjusted? Though I don't know whether that would affect others.
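For illustration, the toleration the coredns pod spec would need in order to schedule onto these tainted nodes looks like the snippet below (sketch only; coredns is a DOKS-managed deployment, so manual edits may be reverted by the reconciler):
# Added under spec.template.spec of the coredns Deployment; matches the
# x-resource-kind=apps:NoSchedule taint from the node pool above.
tolerations:
- key: x-resource-kind
  operator: Equal
  value: apps
  effect: NoSchedule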
The following flag is currently defaulted on the kubelet: --eviction-max-pod-grace-period="0", which renders any terminationGracePeriodSeconds set on the pod spec useless when said pod is (softly) evicted.
It would be nice to set this to a larger period (300s / 5m?) in order to support pods that need to terminate gracefully when handling long-running connections or processes.
Pods that are evicted due to 'soft' thresholds being met are effectively hard-killed, which doesn't seem right to me.
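In KubeletConfiguration terms, the requested change corresponds to something like the following (field name per the kubelet.config.k8s.io/v1beta1 API; 300 is the value suggested above):
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Upper bound on the grace period honoured for soft evictions,
# instead of the current default of 0 (immediate kill).
evictionMaxPodGracePeriod: 300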
I created a Kubernetes cluster using Terraform. Since the DigitalOcean Terraform provider does not yet support the urn attribute for the digitalocean_kubernetes_cluster resource, I manually moved the cluster and droplets to a different project. Afterwards I created an nginx ingress via Terraform and its Helm provider. The load balancer that the DigitalOcean cloud provider created for the type: LoadBalancer service of the nginx ingress Helm chart appeared in the default project.
It would be nice if the load balancer was created in the same project as the cluster, ideally following the cluster when it is moved through projects.
An alternative new feature would be to allow specifying the project as an annotation on the nginx ingress service (which I can set via the Helm chart -- I already do that for service.beta.kubernetes.io/do-loadbalancer-name, cf. https://docs.digitalocean.com/products/kubernetes/how-to/configure-load-balancers/). This could be a new service.beta.kubernetes.io/do-loadbalancer-project-name: "my-project-name" annotation.
Another alternative would be to create the load balancer using the digitalocean_loadbalancer Terraform resource, which does support the urn attribute and can thus be moved to a specific project. Afterwards I could set the kubernetes.digitalocean.com/load-balancer-id annotation on the nginx ingress service to make it use the existing load balancer instead of creating a new one. However, this would require me to configure a forwarding_rule for this resource, which would conflict with the configuration set by the cloud provider.
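A sketch of what the proposed annotation could look like on the Service, next to the existing do-loadbalancer-name annotation (the project-name annotation does not exist today; names are placeholders):
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-controller
  annotations:
    # Existing, documented annotation.
    service.beta.kubernetes.io/do-loadbalancer-name: "my-lb-name"
    # Proposed annotation from this issue; not currently supported.
    service.beta.kubernetes.io/do-loadbalancer-project-name: "my-project-name"
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
  - name: https
    port: 443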
As of today, installing metrics-server requires tweaking the configuration since the default is to reach out to nodes by DNS and use TLS. A fair number of users have asked to support the default setup including TLS, which is a highly reasonable request.
The issue has originally been discussed in digitalocean/digitalocean-cloud-controller-manager#150. Several comments describe how to run metrics-server in TLS-less mode as a workaround for now.
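For reference, the TLS-less workaround described in those comments usually amounts to the following flags on the metrics-server container (standard metrics-server flags; where exactly they go depends on how metrics-server was installed):
# Fragment of the metrics-server container spec.
args:
- --kubelet-insecure-tls                        # skip verification of kubelet serving certs
- --kubelet-preferred-address-types=InternalIP  # reach nodes by IP instead of DNS name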
Currently, cluster-internal users cannot talk to nodes over TLS because kubelets do not serve certificates with SANs. Many use cases (e.g., #2, DataDog/integrations-core/issues/2582) require this, however.
When trying to apply the timeZone field for the Kubernetes CronJob kind, I get:
CronJob.batch "helium-scheduler" is invalid: spec.timeZone: Invalid value: "Asia/Singapore": unknown time zone Asia/Singapore
Where can I check the supported time zones, or whether this feature is available?
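For reference, the field is spec.timeZone on the CronJob and only takes effect on clusters where the feature is available (introduced as alpha in Kubernetes 1.24 behind the CronJobTimeZone feature gate, stable since 1.27); a minimal sketch using the name from the error above:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: helium-scheduler
spec:
  schedule: "0 9 * * *"
  # IANA time zone name; rejected as "unknown time zone" when the cluster
  # version does not support the field.
  timeZone: "Asia/Singapore"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: scheduler
            image: busybox
            command: ["date"]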
When a pod within the cluster connects to a load balancer HTTPS port that is configured to perform TLS termination (i.e. has a certificate configured), TLS is not terminated and the connection is forwarded to the pod HTTP port as-is. This causes traffic from within the cluster to fail.
The Service definition:
kind: Service
apiVersion: v1
metadata:
  name: traefik
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-protocol: http
    service.beta.kubernetes.io/do-loadbalancer-tls-ports: "443"
    service.beta.kubernetes.io/do-loadbalancer-certificate-id: XXX
spec:
  type: LoadBalancer
  selector:
    app: traefik
  ports:
  - name: http
    port: 80
  - name: https
    port: 443
    targetPort: 80
(See also the https-with-cert-nginx.yml example.)
Connection flow:
External -> LB proto HTTPS port 443 -> Service proto HTTP port 443 -> Pod proto HTTP port 80
Internal -> LB proto HTTPS port 443 -> Service proto HTTPS port 443 -> Pod proto HTTPS port 80
(For DigitalOcean engineers, I posted debugging information in support ticket 3402891.)
Hello,
I am struggling to deploy metrics-server and see CPU/memory usage in the dashboard, using a control plane on 1.20.2-do.0.
kubectl top pods returns:
Error from server (ServiceUnavailable): the server is currently unable to handle the request (get pods.metrics.k8s.io)
I have tried many things, and currently everything *appears* to be running properly on the cluster, but there are still no metrics.
Several users have expressed the need to have an internal pod reach a service by going through the external load balancer. Reasons for the extra routing step are to have the proxy handle TLS and/or Proxy Protocol in a consistent manner.
The reason for the described behavior is that kube-proxy explicitly implements a bypassing logic. There is already an upstream issue describing this as a problem for users.
A workaround is outlined in the originating CCM issue. The only other alternative is to have pods speak to the service natively, which often isn't desired.
We should look into ways to support external routing one way or another in DOKS.
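The workaround referenced above is typically the do-loadbalancer-hostname annotation from the cloud controller manager, which publishes a hostname instead of the LB IP in the Service status so kube-proxy no longer short-circuits the traffic; a sketch based on the Traefik Service from the earlier report (the hostname is a placeholder and must resolve to the load balancer):
apiVersion: v1
kind: Service
metadata:
  name: traefik
  annotations:
    # Forces status.loadBalancer to carry this hostname rather than the LB IP,
    # so in-cluster clients resolve it and actually traverse the load balancer.
    service.beta.kubernetes.io/do-loadbalancer-hostname: "lb.example.com"
spec:
  type: LoadBalancer
  selector:
    app: traefik
  ports:
  - name: https
    port: 443
    targetPort: 80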
I want to use Wireguard (VPN) to create a persistent connection between a Kubernetes cluster and a server hosted elsewhere.
Wireguard uses UDP for communication, but UDP is not supported in the DigitalOcean load balancers. A NodePort Service is not ideal because Node IPs are not static, and will change when the nodes are recycled (like during a Kubernetes version upgrade). I want to use a LoadBalancer because this gives me a static IP.
apiVersion: v1
kind: Service
metadata:
  name: wireguard-load-balancer
spec:
  type: LoadBalancer
  selector:
    app: wireguard
  ports:
  - port: 51820
    targetPort: wireguard-port
    protocol: UDP
We integrate the upstream dashboard. In order for metrics to show up on the dashboard, the dedicated kubernetesui/metrics-scraper sidecar needs to be integrated as well. This issue is to track the necessary work.
This is probably more a general DO feature request than a k8s specific request.
We need static IPs for egress traffic, as it makes it a lot easier for our customers to whitelist traffic from us. We are currently using a kubeadm cluster and some custom scripting to route all traffic through a single node (which has a floating IP).
Ideally we could use DOKS directly.
https://cloud.google.com/nat/docs/using-nat
https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway.html
Hey,
please consider adding a feature that adds (and removes) an extra node to a single-node cluster during upgrades, to maintain high availability. Thanks.
Copying from this DO Idea:
As a user, I'd like to be able to add arbitrary (Kubernetes) labels to my worker node pools that are part of my DOKS cluster. Currently, tags on the pool are insufficient: they are not automatically synced, and their validation rules are generally different from those expected for Kubernetes labels. Labeling support is important for advanced scheduling/placement decisions in Kubernetes. Manually labeling nodes is infeasible, as any new nodes in the pool will not contain the same base set of labels. During events such as the recent CVE fix, all nodes were replaced, and workloads depending on nodes with certain labels were unable to be scheduled.
See existing thread at digitalocean/digitalocean-cloud-controller-manager#136 for more information.
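For context, the manual approach described as infeasible above looks like this and has to be repeated for every node that joins or replaces one in the pool (node name and label are placeholders):
# Must be re-run whenever a node in the pool is recycled or replaced.
kubectl label node my-pool-node-1 workload-class=batch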
DigitalOcean Projects allows logically grouping resources (such as Droplets, volumes, and load balancers). Currently, DOKS clusters and their managed resources are not supported.
This issue is to track support, including automatic association of managed resources (e.g., resources created through Kubernetes such as volumes, load balancers, and snapshots) to the cluster's project.