Scale and Load Balance Applications

Introduction

In this lab, you will start a local Kubernetes cluster using Minikube, deploy a sample NGINX application, and then scale it to meet varying demands. You will observe load balancing across multiple pods, monitor cluster events, and gain a brief introduction to Horizontal Pod Autoscaler (HPA) for future scaling automation.


Start the Kubernetes Cluster

In this step, you'll learn how to start and verify a local Kubernetes cluster using Minikube. This is an essential first step for deploying and managing containerized applications in a Kubernetes environment.

First, start the Minikube cluster:

minikube start

Example output:

😄  minikube v1.29.0 on Ubuntu 22.04
✨  Automatically selected the docker driver
📌  Using Docker driver with root permissions
🔥  Creating docker container ...
🐳  Preparing Kubernetes v1.26.1 on Docker 20.10.23 ...
🚀  Launching Kubernetes ...
🌟  Enabling addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

Verify the cluster status using multiple commands:

minikube status
minikube kubectl -- get nodes

Example output for minikube status:

minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured

Example output for minikube kubectl -- get nodes:

NAME       STATUS   ROLES           AGE   VERSION
minikube   Ready    control-plane   1m    v1.26.1

These commands confirm that:

  1. Minikube is successfully running
  2. A local Kubernetes cluster has been created
  3. The cluster is ready to use
  4. You have a single-node cluster with control plane capabilities
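For an extra sanity check, you can confirm that kubectl is pointed at the Minikube context and can actually reach the API server:

kubectl cluster-info
kubectl config current-context

kubectl cluster-info prints the control plane URL, and the current context should be "minikube".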

Deploy a Sample Application

In this step, you'll learn how to deploy a simple web application using a Kubernetes Deployment with a single replica. We'll create a YAML manifest for an NGINX web server and apply it to the Minikube cluster.

First, create a directory for your Kubernetes manifests:

mkdir -p ~/project/k8s-manifests
cd ~/project/k8s-manifests

Create a new YAML file for the deployment:

nano nginx-deployment.yaml

Add the following deployment configuration:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80

Save the file (Ctrl+X, then Y, then Enter).

Apply the deployment to the Kubernetes cluster:

kubectl apply -f nginx-deployment.yaml

Example output:

deployment.apps/nginx-deployment created

Verify the deployment status:

kubectl get deployments
kubectl get pods

Example output for kubectl get deployments:

NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   1/1     1            1           30s

Example output for kubectl get pods:

NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-xxx-yyy            1/1     Running   0          30s

Key points about this deployment:

  1. We created a Deployment with a single replica
  2. The deployment uses the latest NGINX image
  3. The container exposes port 80
  4. The deployment has a label app: nginx for identification
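If you prefer not to write manifests from scratch, kubectl can also generate an equivalent Deployment spec for you. This is optional, and with --dry-run=client nothing is created on the cluster:

kubectl create deployment nginx-deployment --image=nginx --replicas=1 --dry-run=client -o yaml

The generated YAML differs slightly from the hand-written manifest above (for example, in defaulted fields), but it describes the same Deployment.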

Inspect the deployment details:

kubectl describe deployment nginx-deployment

Example output will show deployment configuration, events, and current state.
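As an optional sanity check that the pod really serves traffic, you can forward a local port to the Deployment and fetch the default page. This is a quick sketch; the local port 8080 is an arbitrary choice:

kubectl port-forward deployment/nginx-deployment 8080:80 &
curl -s http://localhost:8080 | grep title
kill %1

You should see the NGINX welcome page title. The port-forward runs in the background and is stopped with kill %1.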

Scale Deployments by Modifying the replicas Field in YAML

In this step, you'll learn how to scale your Kubernetes deployment by modifying the replicas field in the YAML manifest. Scaling allows you to increase or decrease the number of pod instances running in your cluster.

Open the previously created deployment manifest:

nano ~/project/k8s-manifests/nginx-deployment.yaml

Modify the replicas field from 1 to 3:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3 # Changed from 1 to 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80

Save the file (Ctrl+X, then Y, then Enter).

Apply the updated deployment:

kubectl apply -f ~/project/k8s-manifests/nginx-deployment.yaml

Example output:

deployment.apps/nginx-deployment configured

Verify the scaled deployment:

kubectl get deployments
kubectl get pods

Example output for kubectl get deployments:

NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   3/3     3            3           5m

Example output for kubectl get pods:

NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-xxx-yyy            1/1     Running   0          5m
nginx-deployment-xxx-zzz            1/1     Running   0          30s
nginx-deployment-xxx-www            1/1     Running   0          30s

Alternative scaling method using kubectl scale:

kubectl scale deployment nginx-deployment --replicas=4

Example output:

deployment.apps/nginx-deployment scaled

Verify the new number of replicas:

kubectl get deployments
kubectl get pods

Key points about scaling:

  1. Modify replicas in the YAML file
  2. Use kubectl apply to update the deployment
  3. Alternatively, use kubectl scale for quick scaling
  4. Kubernetes ensures the desired number of replicas are running
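To watch point 4 in action, you can follow the rollout as Kubernetes converges on the desired replica count:

kubectl rollout status deployment/nginx-deployment
kubectl get pods -l app=nginx

kubectl rollout status blocks until all replicas are ready, which is useful in scripts that must wait for a scale-up to complete.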

Verify Load Balancing by Checking Multiple Pod Responses

In this step, you'll learn how to verify load balancing in Kubernetes by creating a service and checking responses from multiple pods. We'll expose the NGINX deployment and demonstrate how Kubernetes distributes traffic across replicas.

Create a service to expose the deployment:

nano ~/project/k8s-manifests/nginx-service.yaml

Add the following service configuration:

apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  type: ClusterIP
  ports:
    - port: 80
      targetPort: 80

Apply the service:

kubectl apply -f ~/project/k8s-manifests/nginx-service.yaml

Example output:

service/nginx-service created

Verify the service:

kubectl get services

Example output:

NAME            TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
kubernetes      ClusterIP   10.96.0.1       <none>        443/TCP   30m
nginx-service   ClusterIP   10.96.xxx.xxx   <none>        80/TCP    30s
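Before testing from a client pod, it is worth confirming that the Service has selected all the NGINX pods. Each pod appears as one endpoint (the IPs below are illustrative):

kubectl get endpoints nginx-service

Example output:

NAME            ENDPOINTS                                                 AGE
nginx-service   10.244.0.5:80,10.244.0.6:80,10.244.0.7:80 + 1 more...     30s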

Create a temporary pod to test load balancing:

kubectl run curl-test --image=curlimages/curl --rm -it -- sh

Inside the temporary pod, run multiple requests (the image provides a minimal sh without {1..10} brace expansion, so seq is used instead):

for i in $(seq 1 10); do
  echo "Request $i: $(curl -s nginx-service | grep -o 'Welcome to nginx!')"
done

Example output:

Request 1: Welcome to nginx!
Request 2: Welcome to nginx!
Request 3: Welcome to nginx!
...

Note that all replicas run the same image and serve an identical default page, so the responses alone cannot tell you which pod handled each request. One way to make the distribution visible is shown in the sketch after the key points below.

Exit the temporary pod:

exit

Key points about load balancing:

  1. Services distribute traffic across all matching pods
  2. Each request can potentially hit a different pod
  3. kube-proxy selects a backend pod per connection (effectively random in the default iptables mode, rather than strict round-robin)
  4. The ClusterIP service type provides internal load balancing
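To actually see requests land on different pods, you can overwrite each pod's index page with its own hostname. This is a rough sketch: the change lives only inside the running containers and is lost whenever a pod restarts:

for pod in $(kubectl get pods -l app=nginx -o jsonpath='{.items[*].metadata.name}'); do
  kubectl exec "$pod" -- sh -c 'echo "Served by $(hostname)" > /usr/share/nginx/html/index.html'
done

Repeating the curl loop from the test pod should now print different pod names across requests, making the load balancing visible.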

Adjust the Deployment Scale to Meet Demand

In this step, you'll learn how to dynamically adjust your Kubernetes deployment scale to meet changing application demands using different scaling methods.

First, check the current deployment status:

kubectl get deployments

Example output:

NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   4/4     4            4           45m

Scale the deployment using kubectl command:

kubectl scale deployment nginx-deployment --replicas=5

Example output:

deployment.apps/nginx-deployment scaled

Verify the new number of replicas:

kubectl get deployments
kubectl get pods

Example output for deployments:

NAME               READY   UP-TO-DATE   AVAILABLE   AGE
nginx-deployment   5/5     5            5           46m

Example output for pods:

NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-xxx-yyy            1/1     Running   0          1m
nginx-deployment-xxx-zzz            1/1     Running   0          1m
nginx-deployment-xxx-www            1/1     Running   0          1m
nginx-deployment-xxx-aaa            1/1     Running   0          1m
nginx-deployment-xxx-bbb            1/1     Running   0          1m

Update the deployment YAML for persistent scaling:

nano ~/project/k8s-manifests/nginx-deployment.yaml

Modify the replicas field:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 5 # Updated from previous value
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80

Apply the updated configuration:

kubectl apply -f ~/project/k8s-manifests/nginx-deployment.yaml

Example output:

deployment.apps/nginx-deployment configured

Simulate scaling down for reduced demand:

kubectl scale deployment nginx-deployment --replicas=2

Example output:

deployment.apps/nginx-deployment scaled

Verify the reduced number of replicas:

kubectl get deployments
kubectl get pods

Key points about scaling:

  1. Use kubectl scale for quick, temporary scaling
  2. Update YAML for persistent configuration
  3. Kubernetes ensures smooth scaling with minimal disruption
  4. Can scale up or down based on application needs
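To observe point 3 for yourself, you can watch pods while a scale-down is in progress (press Ctrl+C to stop watching):

kubectl get pods -l app=nginx --watch

During a scale-down, surplus pods move to Terminating status while the remaining replicas keep serving traffic.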

Monitor Deployment and Pod Events for Changes

In this step, you'll learn how to monitor Kubernetes deployments and pods using various kubectl commands to track changes, troubleshoot issues, and understand the lifecycle of your applications.

Describe the current deployment to get detailed information:

kubectl describe deployment nginx-deployment

Example output:

Name:                   nginx-deployment
Namespace:              default
CreationTimestamp:      [timestamp]
Labels:                 app=nginx
Replicas:               2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=nginx
  Containers:
   nginx:
    Image:        nginx:latest
    Port:         80/TCP
    Host Port:    0/TCP
    Environment:  <none>
    Mounts:       <none>
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   nginx-deployment-xxx (2/2 replicas created)
Events:          <some deployment events>

Get detailed information about individual pods:

kubectl describe pods -l app=nginx

Example output will show details for each pod, including:

  • Current status
  • Container information
  • Events
  • IP addresses
  • Node information

View cluster-wide events:

kubectl get events

Example output:

LAST SEEN   TYPE      REASON              OBJECT                           MESSAGE
5m          Normal    Scheduled           pod/nginx-deployment-xxx-yyy    Successfully assigned default/nginx-deployment-xxx-yyy to minikube
5m          Normal    Pulled              pod/nginx-deployment-xxx-yyy    Container image "nginx:latest" already present on machine
5m          Normal    Created             pod/nginx-deployment-xxx-yyy    Created container nginx
5m          Normal    Started             pod/nginx-deployment-xxx-yyy    Started container nginx

Filter events for specific resources:

kubectl get events --field-selector involvedObject.kind=Deployment

Example output will show only deployment-related events.

Simulate an event by deleting a pod:

# Get a pod name
POD_NAME=$(kubectl get pods -l app=nginx -o jsonpath='{.items[0].metadata.name}')

# Delete the pod
kubectl delete pod $POD_NAME

Observe the events and pod recreation:

kubectl get events
kubectl get pods

Key points about monitoring:

  1. kubectl describe provides detailed resource information
  2. kubectl get events shows cluster-wide events
  3. Kubernetes automatically replaces deleted pods
  4. Events help troubleshoot deployment issues
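Two variations that are often useful in practice are sorting events chronologically and streaming them as they happen:

kubectl get events --sort-by='.lastTimestamp'
kubectl get events --watch

The default kubectl get events output is not guaranteed to be ordered, so --sort-by makes it much easier to follow a sequence such as the pod deletion and recreation above.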

Briefly Introduce Horizontal Pod Autoscaler (HPA) for Future Learning

In this step, you'll get an introduction to Horizontal Pod Autoscaler (HPA), a powerful Kubernetes feature that automatically scales applications based on resource utilization.

Enable metrics server addon in Minikube:

minikube addons enable metrics-server

Example output:

* The 'metrics-server' addon is enabled

Create a deployment with resource requests:

nano ~/project/k8s-manifests/hpa-example.yaml

Add the following content:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: php-apache
spec:
  selector:
    matchLabels:
      run: php-apache
  replicas: 1
  template:
    metadata:
      labels:
        run: php-apache
    spec:
      containers:
        - name: php-apache
          image: registry.k8s.io/hpa-example
          ports:
            - containerPort: 80
          resources:
            limits:
              cpu: 500m
            requests:
              cpu: 200m
---
apiVersion: v1
kind: Service
metadata:
  name: php-apache
  labels:
    run: php-apache
spec:
  ports:
    - port: 80
  selector:
    run: php-apache

Apply the deployment:

kubectl apply -f ~/project/k8s-manifests/hpa-example.yaml

Create an HPA configuration:

nano ~/project/k8s-manifests/php-apache-hpa.yaml

Add the following HPA manifest:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50

Apply the HPA configuration:

kubectl apply -f ~/project/k8s-manifests/php-apache-hpa.yaml
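For reference, the same autoscaler can be created imperatively with a single command instead of a manifest:

kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10

The YAML form is preferable when you keep configuration under version control; the imperative form is convenient for quick experiments.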

Verify the HPA configuration:

kubectl get hpa

Example output:

NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   0%/50%    1         10        1          30s

Simulate load to trigger scaling (optional):

kubectl run -i --tty load-generator --rm --image=busybox --restart=Never -- /bin/sh -c "while sleep 0.01; do wget -q -O- http://php-apache; done"

Open another terminal and monitor the HPA behavior:

kubectl get hpa

Example output:

NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
php-apache   Deployment/php-apache   68%/50%   1         10        2          72s

You can see that the HPA has scaled the deployment to 2 replicas based on CPU utilization.

Press Ctrl+C to stop the load generator.

Key points about HPA:

  1. Automatically scales pods based on resource utilization
  2. Can scale based on CPU, memory, or custom metrics
  3. Defines min and max replica counts
  4. Helps maintain application performance
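Since the metrics-server addon is now enabled, you can also inspect the live resource usage that the HPA bases its decisions on:

kubectl top pods
kubectl top nodes

If these commands report that metrics are not yet available right after enabling the addon, wait a minute or two for the first metrics to be collected.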

Summary

In this lab, you learned how to start and verify a local Kubernetes cluster using Minikube, which is an essential first step for deploying and managing containerized applications. You then deployed a simple web application using a Kubernetes Deployment with a single replica, creating a YAML manifest for an NGINX web server and applying it to the Minikube cluster. This allowed you to gain experience with the basic deployment and management of a sample application in a Kubernetes environment.

Next, you explored scaling deployments by modifying the replicas field in the YAML configuration, and verified the load balancing behavior by checking the responses from multiple pods. You also learned how to monitor deployment and pod events using kubectl commands, and were introduced to the concept of Horizontal Pod Autoscaler (HPA) for future learning, which can automatically scale your application based on resource utilization.
