How to handle pod scheduling when a node is cordoned?


Introduction

Kubernetes, the popular container orchestration platform, provides a robust set of features to manage and scale containerized applications. One crucial aspect of Kubernetes is the ability to handle pod scheduling when a node is cordoned. This tutorial will guide you through understanding node cordoning, managing pod scheduling on cordoned nodes, and configuring pod scheduling policies to ensure seamless application deployment and maintenance.



Understanding Node Cordoning in Kubernetes

What is Node Cordoning in Kubernetes?

Node cordoning is a Kubernetes feature that allows you to mark a node as unschedulable, preventing new pods from being scheduled on that node. This is useful when you need to perform maintenance or upgrades on a node without disrupting the running workloads.

When a node is cordoned, the Kubernetes scheduler will no longer place new pods on that node, but existing pods on the cordoned node will continue to run. This allows you to safely perform operations on the node without affecting the overall application availability.

Why Use Node Cordoning?

There are several common use cases for node cordoning in Kubernetes:

  1. Node Maintenance: When you need to perform maintenance tasks on a node, such as upgrading the operating system or installing security patches, you can cordon the node to prevent new pods from being scheduled on it.

  2. Node Decommissioning: If you are planning to remove a node from your Kubernetes cluster, you can cordon the node to gracefully drain any running pods before the node is shut down or removed.

  3. Node Failure: If a node is experiencing issues or has failed, you can cordon the node to prevent new pods from being scheduled on it while you investigate and resolve the problem.

  4. Resource Optimization: You can cordon nodes with specific resource constraints (e.g., high memory or CPU usage) to ensure that new pods are scheduled on more suitable nodes, optimizing resource utilization.

How to Cordon a Node in Kubernetes

To cordon a node in Kubernetes, you can use the kubectl cordon command:

kubectl cordon <node-name>

This will mark the specified node as unschedulable, preventing new pods from being placed on it.

To verify that a node is cordoned, you can use the kubectl get nodes command:

kubectl get nodes

The cordoned node will appear with Ready,SchedulingDisabled in the STATUS column.
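
For example, in a cluster where node-1 has been cordoned, the output looks something like this (node names, ages, and versions here are illustrative):

NAME     STATUS                     ROLES    AGE   VERSION
node-1   Ready,SchedulingDisabled   <none>   12d   v1.27.3
node-2   Ready                      <none>   12d   v1.27.3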

To uncordon a node and make it schedulable again, you can use the kubectl uncordon command:

kubectl uncordon <node-name>

This will allow the Kubernetes scheduler to start placing new pods on the node again.
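
Putting these commands together, a typical maintenance workflow looks like this (the node name node-1 is illustrative):

## Mark the node unschedulable before starting maintenance
kubectl cordon node-1

## ... perform the maintenance work on the node ...

## Make the node schedulable again once maintenance is complete
kubectl uncordon node-1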

Handling Pod Scheduling on Cordoned Nodes

Understanding Pod Behavior on Cordoned Nodes

When a node is cordoned in Kubernetes, the Kubernetes scheduler will no longer place new pods on that node. However, existing pods that are already running on the cordoned node will continue to run.

This behavior is important to understand, as it means that you can safely cordon a node without immediately disrupting any running workloads. The pods on the cordoned node will continue to run until they are terminated or rescheduled to a different node.
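
You can confirm this by listing the pods on a cordoned node; they remain in the Running state (the node name node-1 is illustrative):

## List all pods currently running on the cordoned node
kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=node-1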

Draining Pods from Cordoned Nodes

While existing pods will continue to run on a cordoned node, you may want to gracefully drain the pods from the node before performing maintenance or decommissioning the node. To do this, you can use the kubectl drain command:

kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

This command will evict all the pods from the specified node. The --ignore-daemonsets flag lets the drain proceed even though DaemonSet-managed pods cannot be evicted (they remain on the node), and --delete-emptydir-data (named --delete-local-data in older kubectl releases) allows pods that use emptyDir volumes to be evicted, deleting their local data in the process.
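
A successful drain reports the node being cordoned (drain cordons the node automatically if it isn't already), followed by each eviction; the node and pod names below are illustrative:

node/node-1 cordoned
evicting pod default/my-app-7d4b9c8f6d-x2k9p
pod/my-app-7d4b9c8f6d-x2k9p evicted
node/node-1 drained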

Handling Pod Rescheduling

When a node is cordoned, the Kubernetes scheduler will not place any new pods on that node. However, if a pod on the cordoned node needs to be replaced (e.g., after an eviction or a node failure), the controller that manages it, such as a Deployment's ReplicaSet, creates a replacement pod, and the scheduler places it on a different, available node. Bare pods that are not managed by a controller are not recreated automatically.

To ensure that pods are rescheduled correctly, you should configure appropriate pod scheduling policies, such as node affinity or pod anti-affinity, to control where the pods are placed. This can help prevent disruptions to your application's availability during node maintenance or decommissioning.
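
For example, running a workload through a Deployment rather than a bare Pod ensures that evicted pods are recreated and rescheduled by the ReplicaSet controller. Here is a minimal sketch (the name, labels, and image are illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: my-app:v1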

Example: Draining Pods from a Cordoned Node

Here's an example of how to drain pods from a cordoned node in a Kubernetes cluster running on Ubuntu 22.04:

## Cordon the node
kubectl cordon node-1

## Drain the node
kubectl drain node-1 --ignore-daemonsets --delete-emptydir-data

## Verify that the node is cordoned and drained
kubectl get nodes
kubectl get pods -o wide

This will cordon the node-1 node and drain the pods from it (DaemonSet pods stay in place, and pods using emptyDir volumes are evicted with their local data deleted). You can then verify the node's status and the new pod placements.

Configuring Pod Scheduling Policies for Cordoned Nodes

Importance of Pod Scheduling Policies

When working with cordoned nodes in Kubernetes, it's important to configure appropriate pod scheduling policies to ensure that pods are placed on suitable nodes and to minimize disruptions to your application's availability.

Pod scheduling policies, such as node affinity and pod anti-affinity, allow you to control where pods are scheduled within your Kubernetes cluster. These policies can be especially useful when dealing with cordoned nodes to ensure that pods are not scheduled on nodes that are undergoing maintenance or decommissioning.

Using Node Affinity

Node affinity allows you to specify a set of node selection criteria that must be met for a pod to be scheduled on a particular node. Since cordoning already blocks new pods on its own, node affinity is best used as a complementary measure, for example to steer pods away from nodes you have labeled as maintenance candidates before they are actually cordoned.

Here's an example that uses node affinity to keep pods off nodes carrying a custom node-status=SchedulingDisabled label (a label you would apply yourself as part of your maintenance workflow):

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: node-status
                operator: NotIn
                values:
                  - SchedulingDisabled
  containers:
    - name: my-app
      image: my-app:v1

In this example, the pod's node affinity rule prevents it from being scheduled on any node that carries a node-status: SchedulingDisabled label. Note that this label is a custom convention you must apply yourself: cordoning a node sets its spec.unschedulable field and adds the node.kubernetes.io/unschedulable:NoSchedule taint, but it does not add any label to the node.
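
Here is a sketch of how you might manage that custom label alongside cordoning (the node name node-1 is illustrative); kubectl describe node lets you confirm what cordoning actually records on the node:

## Apply the custom label when taking the node out of service
kubectl label nodes node-1 node-status=SchedulingDisabled
kubectl cordon node-1

## Inspect the node: cordoning appears as Unschedulable: true and the
## taint node.kubernetes.io/unschedulable:NoSchedule
kubectl describe node node-1

## Remove the label and uncordon once maintenance is complete
kubectl label nodes node-1 node-status-
kubectl uncordon node-1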

Using Pod Anti-Affinity

Pod anti-affinity allows you to specify that a pod should not be scheduled on a node that already has a certain pod running on it. This can be useful when you want to ensure that pods are not co-located with pods running on cordoned nodes.

Here's an example that uses pod anti-affinity to keep a pod off any node that already hosts a pod labeled node-status: SchedulingDisabled (again, a custom pod label you would apply yourself):

apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: node-status
                operator: In
                values:
                  - SchedulingDisabled
          topologyKey: kubernetes.io/hostname
  containers:
    - name: my-app
      image: my-app:v1

In this example, the pod's anti-affinity configuration (keyed on topologyKey: kubernetes.io/hostname) prevents it from being scheduled onto any node that already runs a pod labeled node-status: SchedulingDisabled. Note that the labelSelector in pod anti-affinity matches labels on pods, not on nodes, so this pattern only works if the affected pods actually carry that label.

By configuring these pod scheduling policies, you can reduce the risk of disruption to your application's pods when nodes are cordoned for maintenance or decommissioning.
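
To try either manifest, apply it and inspect where the pod lands; the file name is illustrative, and kubectl describe pod shows the scheduler's events (including FailedScheduling if no node satisfies the rules):

kubectl apply -f my-app.yaml
kubectl get pod my-app -o wide
kubectl describe pod my-app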

Summary

In this Kubernetes tutorial, you have learned how to effectively handle pod scheduling when a node is cordoned. By understanding the concept of node cordoning, configuring pod scheduling policies, and implementing strategies to manage pods on cordoned nodes, you can ensure the resilience and availability of your Kubernetes-based applications.
