A Brief Introduction to the Horizontal Pod Autoscaler (HPA) for Future Learning
In this step, you'll get a brief introduction to the Horizontal Pod Autoscaler (HPA), a Kubernetes feature that automatically adjusts the number of pod replicas in a workload based on observed resource utilization.
Enable the metrics-server addon in Minikube. The HPA relies on it for CPU and memory metrics:
minikube addons enable metrics-server
Example output:
* The 'metrics-server' addon is enabled
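The metrics server usually needs a short time after being enabled before it reports data, and until then the HPA shows `<unknown>` in its TARGETS column. The sketch below polls `kubectl top nodes` until it succeeds. The function name `wait_for_metrics` and its default retry values are assumptions made for illustration, not part of Minikube or kubectl.

```shell
# Hypothetical helper: poll until the metrics API answers, or give up.
# $1 = number of attempts (assumed default 30)
# $2 = seconds to wait between attempts (assumed default 5)
wait_for_metrics() {
  wm_tries=${1:-30}
  wm_delay=${2:-5}
  wm_i=0
  while [ "$wm_i" -lt "$wm_tries" ]; do
    # `kubectl top nodes` fails until metrics-server has collected data
    if kubectl top nodes >/dev/null 2>&1; then
      echo "metrics available"
      return 0
    fi
    wm_i=$((wm_i + 1))
    sleep "$wm_delay"
  done
  echo "metrics not available after $wm_tries attempts" >&2
  return 1
}

# Usage (requires a running cluster): wait_for_metrics 30 5
```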
Create a Deployment with CPU resource requests (the HPA computes utilization as a percentage of the requested CPU):
nano ~/project/k8s-manifests/hpa-example.yaml
Add the following content:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
        - name: php-apache
          image: registry.k8s.io/hpa-example
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 500m
            requests:
              cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
    - port: 80
  selector:
    run: php-apache
Apply the Deployment and Service:
kubectl apply -f ~/project/k8s-manifests/hpa-example.yaml
Create an HPA configuration:
nano ~/project/k8s-manifests/php-apache-hpa.yaml
Add the following HPA manifest:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
Apply the HPA configuration:
kubectl apply -f ~/project/k8s-manifests/php-apache-hpa.yaml
Verify the HPA configuration:
kubectl get hpa
Example output:
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          30s
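The TARGETS column reads as current/target. For a CPU metric of type Utilization, the current value is CPU usage expressed as a percentage of the container's CPU request (200m in the Deployment above), not of its limit. A rough sketch of that arithmetic follows. The helper name `utilization_percent` and the usage figures are illustrative assumptions, and the real controller averages usage across all pods.

```shell
# Hypothetical helper (not part of kubectl): CPU utilization as a
# percentage of the requested CPU, both given in millicores.
# $1 = measured usage, $2 = CPU request
utilization_percent() {
  echo $(( $1 * 100 / $2 ))
}

utilization_percent 0 200     # idle pod: prints 0
utilization_percent 136 200   # illustrative usage of 136m: prints 68
utilization_percent 300 200   # usage above the request: prints 150
```

Because the percentage is relative to the request, a pod can report more than 100% as long as its usage stays under the 500m limit.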
Simulate load to trigger scaling (optional):
kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"
Open another terminal and monitor the HPA behavior:
kubectl get hpa
Example output:
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   68%/50%   1         10        2          72s
Because average CPU utilization (68%) exceeds the 50% target, the HPA has scaled the Deployment to 2 replicas.
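The jump from 1 to 2 replicas follows from the scaling rule described in the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization), kept within the configured minimum and maximum. The sketch below reproduces that arithmetic with integer math. The function name `desired_replicas` is an assumption for illustration, and the sketch deliberately ignores details of the real controller such as its tolerance band and stabilization windows.

```shell
# Hypothetical helper: simplified HPA replica calculation.
# $1 = current replicas, $2 = current utilization %, $3 = target utilization %
# $4 = minReplicas (assumed default 1), $5 = maxReplicas (assumed default 10)
desired_replicas() {
  dr_min=${4:-1}
  dr_max=${5:-10}
  # Integer ceiling of ($1 * $2) / $3
  dr_want=$(( ($1 * $2 + $3 - 1) / $3 ))
  if [ "$dr_want" -lt "$dr_min" ]; then dr_want=$dr_min; fi
  if [ "$dr_want" -gt "$dr_max" ]; then dr_want=$dr_max; fi
  echo "$dr_want"
}

desired_replicas 1 68 50 1 10    # ceil(1.36) = 2, matching the output above
desired_replicas 2 68 50 1 10    # ceil(2.72) = 3
desired_replicas 4 250 50 1 10   # ceil(20) = 20, capped at maxReplicas: 10
```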
Press Ctrl+C to stop the load generator.
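After the load stops, the replica count does not drop immediately. By default the HPA waits through a scale-down stabilization window (5 minutes, unless the cluster is configured otherwise) before removing pods. The sketch below polls the Deployment until it is back at the expected replica count. The function name `wait_for_replicas` and its default retry values are assumptions for illustration.

```shell
# Hypothetical helper: wait until a Deployment reports the expected replica count.
# $1 = Deployment name, $2 = expected replicas
# $3 = number of attempts (assumed default 40)
# $4 = seconds between attempts (assumed default 15)
wait_for_replicas() {
  wr_name=$1
  wr_want=$2
  wr_tries=${3:-40}
  wr_delay=${4:-15}
  wr_i=0
  while [ "$wr_i" -lt "$wr_tries" ]; do
    # Read the current replica count from the Deployment status
    wr_have=$(kubectl get deployment "$wr_name" -o jsonpath='{.status.replicas}' 2>/dev/null)
    if [ "$wr_have" = "$wr_want" ]; then
      echo "$wr_name is at $wr_want replica(s)"
      return 0
    fi
    wr_i=$((wr_i + 1))
    sleep "$wr_delay"
  done
  echo "$wr_name did not reach $wr_want replica(s)" >&2
  return 1
}

# Usage (requires a running cluster): wait_for_replicas php-apache 1
```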
Key points about HPA:
- Automatically adjusts the number of pod replicas based on resource utilization
- Can scale based on CPU, memory, or custom metrics
- Keeps the replica count between a configured minimum and maximum
- Helps maintain application performance as load changes