Using HorizontalPodAutoscaler in Kubernetes


Introduction

HorizontalPodAutoscaler (HPA) is a Kubernetes feature that automatically scales the number of pods in a deployment based on observed resource utilization. In this lab, we will learn how to use a HorizontalPodAutoscaler to scale a deployment in response to CPU load.



Create a Deployment

First, we need to create a deployment to which we will apply the HorizontalPodAutoscaler.

  1. Create a deployment file named deployment.yaml with the following contents:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hpa-demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hpa-demo
  template:
    metadata:
      labels:
        app: hpa-demo
    spec:
      containers:
        - name: hpa-demo
          image: nginx
          resources:
            limits:
              cpu: "1"
              memory: 512Mi
            requests:
              cpu: "0.5"
              memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: hpa-demo
spec:
  selector:
    app: hpa-demo
  ports:
    - name: http
      port: 80
      targetPort: 80

This deployment specifies a single replica of an Nginx container with resource limits and requests for CPU and memory.

  2. Create the deployment:
kubectl apply -f deployment.yaml
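As a quick sanity check (assuming the default namespace), you can confirm the rollout completed before moving on:

```shell
# Block until the deployment has finished rolling out
kubectl rollout status deployment/hpa-demo

# Confirm the deployment and its service exist
kubectl get deployment,service hpa-demo
```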

Create a HorizontalPodAutoscaler

Now that we have a deployment, we can create a HorizontalPodAutoscaler to automatically scale the deployment.

  1. Create a HorizontalPodAutoscaler file named hpa.yaml with the following contents:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: hpa-demo
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          averageUtilization: 1
          type: Utilization

This HorizontalPodAutoscaler specifies that we want to scale the hpa-demo deployment to have between 1 and 10 replicas, targeting an average CPU utilization of 1%. This threshold is deliberately low so that even the light load we generate later will trigger scaling.

  2. Create the HorizontalPodAutoscaler:
kubectl apply -f hpa.yaml
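You can confirm the autoscaler registered its target. Note that the TARGETS column may read `<unknown>` until resource metrics become available (metrics-server is enabled in the next step):

```shell
# The TARGETS column shows current/target CPU utilization
kubectl get hpa hpa-demo

# Show the autoscaler's events and current status in detail
kubectl describe hpa hpa-demo
```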

Test the HorizontalPodAutoscaler

Now that we have a HorizontalPodAutoscaler, we can test it by generating load on the deployment.

  1. Enable the metrics-server addon:
minikube addons enable metrics-server
  2. Create a load generation pod:
kubectl run -i --tty load-generator --image=busybox --restart=Never -- /bin/sh
  3. In the load generation pod, run the following command to generate load on the deployment:
while true; do wget -q -O- http://hpa-demo; done
  4. Open another terminal and check the status of the HorizontalPodAutoscaler:
kubectl get hpa

You should see that the replica count of hpa-demo has scaled up toward the maximum of 10. You can list the running pods with the following command.

kubectl get pods -l app=hpa-demo
  5. Stop the load generation by pressing Ctrl+C in the load generation pod, then type exit to leave its shell.
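Once the load stops, the HPA scales the deployment back down, but not immediately: by default the downscale stabilization window is 300 seconds. You can watch this happen:

```shell
# Watch replica counts update live; scale-down takes a few minutes
# because of the default 300-second downscale stabilization window
kubectl get hpa hpa-demo --watch
```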

Summary

In this lab, we learned how to use HorizontalPodAutoscaler to automatically scale a deployment based on resource utilization. We created a deployment, created a HorizontalPodAutoscaler, and tested it by generating load on the deployment. We also saw how the HorizontalPodAutoscaler scaled the deployment in response to the increased load.
