This repo contains Horizontal Pod Autoscaling (HPA) demo content.

Official Kubernetes documentation: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

Metrics Server: https://github.com/kubernetes-sigs/metrics-server
- Dev teams deploy to Kubernetes clusters with resource requests, limits, and replica counts defined.
- But most of the time, teams overprovision the resources in their Deployments/StatefulSets.
- As an SRE/K8s engineer, you must expand the on-premises clusters to meet the demand. If it's a Kubernetes cluster in the cloud, something like the Azure Cluster Autoscaler will kick in.
- Both scenarios are highly inefficient and not cost-effective.
So how can we address this issue?
One way to address this is to use Horizontal Pod Autoscaling (HPA).
A HorizontalPodAutoscaler automatically updates a workload resource (such as a Deployment or StatefulSet), with the aim of automatically scaling the workload to match demand.
When you have HPA implemented in a namespace in your Kubernetes clusters, you can:
- Improve resource utilization
- Increase availability
- Improve performance
- Reduce operational overhead
- Improve cost efficiency
Prerequisites:
- A Kubernetes cluster with admin privileges
- The Metrics Server
- A Deployment
- Create the namespace
kubectl create ns autoscale-test
- Install the Metrics Server if it is not present in the cluster
a. Check whether it is already installed
kubectl get pods -n kube-system | grep metric
b. If it is not present, deploy it with the following command (components.yaml is published on the Metrics Server releases page linked above). This creates the metrics-server Deployment with the correct Roles and RoleBindings.
kubectl apply -f components.yaml
- Create the PHP Apache deployment and expose it on port 80
kubectl apply -f apache-test.yaml
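For reference, here is a minimal sketch of what apache-test.yaml might contain, modeled on the official Kubernetes HPA walkthrough. The names, labels, and image (`registry.k8s.io/hpa-example`) are assumptions; adjust to match the repo's actual file.

```yaml
# Hypothetical apache-test.yaml — contents assumed, based on the
# official HPA walkthrough.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
  namespace: autoscale-test
spec:
  selector:
    matchLabels:
      run: php-apache
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
      - name: php-apache
        image: registry.k8s.io/hpa-example
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 200m    # the HPA needs a CPU request to compute utilization
          limits:
            cpu: 500m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  namespace: autoscale-test
  labels:
    run: php-apache
spec:
  ports:
  - port: 80
  selector:
    run: php-apache
```

Note that the CPU request is essential: a CPU-utilization HPA computes utilization as a percentage of the pod's request, so pods without requests cannot be autoscaled on CPU.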
- Verify your service and deployment
kubectl get all -n autoscale-test
- Configure autoscaling
a. Set the autoscaling values
kubectl autoscale deployment php-apache -n autoscale-test --cpu-percent=50 --min=1 --max=15
b. Verify the HPA settings
kubectl get hpa -n autoscale-test
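The imperative `kubectl autoscale` command above can equivalently be expressed as a declarative manifest using the `autoscaling/v2` API, which is easier to keep in version control. A sketch with the same target, min/max, and CPU threshold:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: autoscale-test
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 15
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50   # matches --cpu-percent=50
```

Apply it with `kubectl apply -f` like any other manifest.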
- Let's test the HPA on our deployment
a. Run a watch command to monitor the pods as they scale
watch kubectl get pods -n autoscale-test
b. Open another terminal and spin up a busybox pod that continuously queries the php-apache service
kubectl run -i --tty stress-generator --rm --image=busybox -n autoscale-test --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
c. You should see the pods scale up as the load grows.
d. Stop the stress-generator pod (Ctrl+C in the other terminal) and watch the php-apache deployment scale back down. Scale-down is deliberately slow (the default downscale stabilization window is 5 minutes), so it may take around 6-7+ minutes depending on your setup.
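To understand why the replica count moves the way it does, the HPA control loop uses the formula documented in the Kubernetes docs: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). A small illustrative sketch (the example utilization numbers are hypothetical):

```python
import math

def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float) -> int:
    """Core HPA formula: ceil(current * current / target)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)

# 3 replicas averaging 250% of their CPU request, target 50%:
print(desired_replicas(3, 250, 50))   # 15 (capped by --max=15)

# Load removed: 15 replicas idling at 5% utilization, target 50%:
print(desired_replicas(15, 5, 50))    # 2
```

In practice the scale-down result is further delayed by the stabilization window, which is why the pods linger for several minutes after the load stops.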
- Delete the HPA
kubectl delete hpa -n autoscale-test php-apache
- Delete the php-apache deployment
kubectl delete -f apache-test.yaml
- Delete the Metrics Server if you installed it in the previous steps
kubectl delete -f components.yaml