Introduction
HorizontalPodAutoscaler is a Kubernetes feature that allows you to automatically scale the number of pods in a deployment based on resource utilization. In this lab, we will learn how to use HorizontalPodAutoscaler to automatically scale our deployment.
Start the Minikube Cluster
Before creating resources, you need a running Kubernetes cluster. Minikube is a lightweight Kubernetes environment that runs on your local machine.
Navigate to your working directory:

Open the terminal and change to the default project folder:

```bash
cd /home/labex/project
```

Start Minikube:

Start Minikube to initialize a Kubernetes cluster:

```bash
minikube start
```

- This command sets up a single-node Kubernetes cluster on your local machine.
- Minikube may take a few minutes to start depending on your system's performance.
Verify Minikube is running:

Check the status of the Minikube cluster:

```bash
minikube status
```

- Look for components such as `kubelet` and `apiserver` listed as `Running`.
- If the cluster is not running, rerun `minikube start`.
- If you encounter issues starting Minikube, use `minikube delete` to reset the environment, then start again.
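The readiness check above can be scripted. This is a minimal sketch that parses a `minikube status`-style report and flags any component that is not healthy; the sample text below is illustrative, not captured from a live cluster (a real run would capture it with `status=$(minikube status)`):

```bash
# Illustrative sample of a `minikube status` report.
status="type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured"

# Flag any component line that is neither Running nor Configured.
not_ready=$(echo "$status" | grep -Ev 'Running|Configured|^type' || true)
if [ -n "$not_ready" ]; then
  echo "cluster not ready:"
  echo "$not_ready"
else
  echo "cluster ready"
fi
```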
Create a Deployment
First, we need to create a deployment to which we will apply the HorizontalPodAutoscaler.
- Create a deployment file named `deployment.yaml` with the following contents:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hpa-demo
  template:
    metadata:
      labels:
        app: hpa-demo
    spec:
      containers:
        - name: hpa-demo
          image: nginx
          resources:
            limits:
              cpu: "1"
              memory: 512Mi
            requests:
              cpu: "0.5"
              memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-demo
spec:
  selector:
    app: hpa-demo
  ports:
    - name: http
      port: 80
      targetPort: 80
```
This manifest defines a single-replica Deployment running an Nginx container with CPU and memory requests and limits, plus a Service that exposes it inside the cluster on port 80.
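The CPU request matters for autoscaling later in this lab: the HPA computes utilization as a percentage of each container's CPU *request*, not its limit. A quick sketch of the arithmetic, using the 0.5-CPU (500m) request above and the 50% utilization target we set in the next step:

```bash
# HPA utilization is measured against the CPU request (0.5 CPU = 500m).
# With a 50% utilization target, scale-up begins once a pod's average
# CPU usage exceeds 250 millicores.
request_millicores=500
target_percent=50
threshold_millicores=$(( request_millicores * target_percent / 100 ))
echo "scale-up threshold: ${threshold_millicores}m per pod"
```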
- Create the deployment:

```bash
kubectl apply -f deployment.yaml
```
Create a HorizontalPodAutoscaler
Now that we have a deployment, we can create a HorizontalPodAutoscaler to automatically scale the deployment.
- Create a HorizontalPodAutoscaler file named `hpa.yaml` with the following contents:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
```
This HorizontalPodAutoscaler specifies that we want to scale the hpa-demo deployment to have between 1 and 10 replicas, and that we want to target an average CPU utilization of 50%.
- Create the HorizontalPodAutoscaler:

```bash
kubectl apply -f hpa.yaml
```
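Under the hood, the HPA controller computes the desired replica count as `ceil(currentReplicas × currentUtilization / targetUtilization)`. A small sketch of that arithmetic with hypothetical observed numbers (2 current replicas at 120% average utilization against our 50% target):

```bash
# desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)
# Integer ceiling via (a + b - 1) / b.
current_replicas=2
current_utilization=120   # percent; hypothetical observation
target_utilization=50     # percent; from hpa.yaml
desired=$(( (current_replicas * current_utilization + target_utilization - 1) / target_utilization ))
echo "desired replicas: $desired"
```

With these numbers the controller would scale from 2 to 5 replicas, clamped to the `minReplicas`/`maxReplicas` range.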
Test the HorizontalPodAutoscaler
Now that we have a HorizontalPodAutoscaler, we can test it by generating load on the deployment.
- Enable the metrics-server add-on so the HPA can read pod CPU metrics:

```bash
minikube addons enable metrics-server
```
- Create a load-generator pod with an interactive shell:

```bash
kubectl run -i --tty load-generator --image=busybox --restart=Never -- /bin/sh
```
- In the load-generator pod, run the following command to generate load on the deployment:

```bash
while true; do wget -q -O- http://hpa-demo; done
```
- In another terminal, check the status of the HorizontalPodAutoscaler:

```bash
kubectl get hpa
```
After a minute or two, you should see the replica count of `hpa-demo` climb toward the maximum of 10 as the observed CPU utilization exceeds the target. You can list the replicas with the following command:

```bash
kubectl get pods -l app=hpa-demo
```
- Stop the load generation by pressing `Ctrl+C` in the load-generator pod, then type `exit` to leave the shell.
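After the load stops, the HPA does not scale back down immediately: by default it waits out a 300-second stabilization window before removing replicas. If you want faster scale-down while experimenting, the `autoscaling/v2` API exposes this as an optional `behavior` block; the 60-second value below is just an example:

```yaml
# Optional addition to hpa.yaml (under spec): shorten the scale-down
# stabilization window from the 300-second default to 60 seconds.
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60
```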
Summary
In this lab, we learned how to use HorizontalPodAutoscaler to automatically scale a deployment based on resource utilization. We created a deployment, created a HorizontalPodAutoscaler, and tested it by generating load on the deployment. We also saw how the HorizontalPodAutoscaler scaled the deployment in response to the increased load.


