Hadoop Applying Scheduler


Introduction

In the ancient land of Egypt, the Pharaoh's palace stood as a magnificent testament to the kingdom's power and prosperity. However, beneath the gilded surface, a crisis was brewing. The Pharaoh's vast storehouses, filled with the bounty of the Nile, were in disarray. The priests, responsible for managing the distribution of resources, struggled to keep up with the demands of the people.

Enter Amenhotep, a brilliant young priest tasked with restoring order to the kingdom's resources. His mission was to develop a system that would allocate the precious goods fairly and efficiently, so that every citizen received their due share.

The Pharaoh, impressed by Amenhotep's intellect and dedication, granted him access to the latest technological marvel: the Hadoop YARN (Yet Another Resource Negotiator) system. With this powerful tool at his disposal, Amenhotep set out to learn how to apply the YARN scheduler, the component that decides how cluster resources are shared, to achieve his goal.



Understanding the Hadoop YARN Schedulers

In this step, we will explore the different scheduling policies available in Hadoop YARN and their respective use cases.

First, we need to switch to the hadoop user:

su - hadoop

Hadoop YARN ships with three schedulers: the FIFO Scheduler, the Fair Scheduler, and the Capacity Scheduler. This lab focuses on the latter two. The Fair Scheduler aims to share resources fairly among users and applications, so that no single user or application monopolizes the cluster. The Capacity Scheduler, by contrast, organizes resources into hierarchical queues, each with a predefined capacity limit.

To display the current scheduler configuration, use the following command:

yarn scheduler -getConf

This command will show you the currently active scheduler and its configuration.
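If the yarn scheduler helper is not available in your environment, the scheduler can usually be identified from yarn-site.xml, where the yarn.resourcemanager.scheduler.class property names the scheduler class. The sketch below is only an illustration: the scheduler_class function is a hypothetical helper, the sample file is throwaway demo content, and the parsing assumes each <value> follows its <name> as in typical Hadoop config files.

```shell
# Hypothetical helper: print the value of yarn.resourcemanager.scheduler.class
# from the yarn-site.xml file given as $1.
scheduler_class() {
  awk '
    /<name>yarn\.resourcemanager\.scheduler\.class<\/name>/ { found = 1 }
    found && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")   # strip everything around the value
      print
      exit
    }
  ' "$1"
}

# Demo on a throwaway sample file (not your cluster's real configuration).
sample="$(mktemp)"
cat > "$sample" <<'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
</configuration>
EOF
scheduler_class "$sample"
rm -f "$sample"
```

To inspect a real cluster, pass the path of its yarn-site.xml instead of the sample file. If the property is absent, the function prints nothing and the distribution's default scheduler applies.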

Configuring the Fair Scheduler

In this step, we will configure the Fair Scheduler to ensure fair resource distribution among the kingdom's citizens.

First, create a new configuration file called fair-scheduler.xml in /home/hadoop for the Fair Scheduler:

<?xml version="1.0"?>
<!-- /home/hadoop/fair-scheduler.xml -->
<allocations>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
  <queue name="root">
    <weight>1.0</weight>
    <queue name="citizens">
      <weight>1.0</weight>
      <minResources>1024 mb, 1 vcores</minResources>
    </queue>
    <queue name="priests">
      <weight>2.0</weight>
      <minResources>2048 mb, 2 vcores</minResources>
    </queue>
  </queue>
</allocations>

In this configuration, we have defined two queues: citizens and priests. The citizens queue has a weight of 1.0 and a minimum resource allocation of 1024 MB memory and 1 vcore. The priests queue has a weight of 2.0 and a minimum resource allocation of 2048 MB memory and 2 vcores.
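To see what the weights mean in practice, here is a rough sketch of how a weighted split works when both queues are busy. It is a simplification: it looks at memory only, ignores minResources and actual demand, and the 12288 MB cluster size is a made-up figure for illustration. The weighted_share function is a hypothetical helper, not part of Hadoop.

```shell
# Hypothetical helper: weighted_share TOTAL_MB WEIGHT SUM_OF_WEIGHTS
# Prints the share of TOTAL_MB that a queue with the given weight would get
# if every sibling queue had pending demand.
weighted_share() {
  awk -v total="$1" -v w="$2" -v sum="$3" \
    'BEGIN { printf "%d\n", total * w / sum }'
}

# Assumed cluster size: 12288 MB. Weights come from fair-scheduler.xml (1.0 + 2.0 = 3.0).
weighted_share 12288 1.0 3.0   # citizens -> 4096
weighted_share 12288 2.0 3.0   # priests  -> 8192
```

With these numbers the priests queue ends up with twice the memory of the citizens queue, and both shares sit above the configured minimums of 1024 MB and 2048 MB.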

Next, apply the new configuration by running the following command:

yarn scheduler --setConf /home/hadoop/fair-scheduler.xml

Verify that the Fair Scheduler is now active by running the yarn scheduler -getConf command again.
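On a stock Apache Hadoop installation, the Fair Scheduler is normally enabled through yarn-site.xml: yarn.resourcemanager.scheduler.class selects the scheduler, and yarn.scheduler.fair.allocation.file points at the allocation file. If the --setConf helper is not available to you, the following sketch prints the two properties you would merge into yarn-site.xml by hand. The fair_scheduler_properties function is a hypothetical helper, and changing the scheduler class generally requires restarting the ResourceManager.

```shell
# Hypothetical helper: print the yarn-site.xml properties that select the
# Fair Scheduler and point it at the allocation file given as $1.
fair_scheduler_properties() {
  cat <<EOF
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
  <name>yarn.scheduler.fair.allocation.file</name>
  <value>$1</value>
</property>
EOF
}

fair_scheduler_properties /home/hadoop/fair-scheduler.xml
```

The output is meant to be pasted inside the <configuration> element of yarn-site.xml; the function itself does not modify any file.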

Configuring the Capacity Scheduler

In this step, we will configure the Capacity Scheduler to allocate resources based on predefined capacity limits.

First, create a new configuration file called capacity-scheduler.xml in /home/hadoop/ for the Capacity Scheduler:

<?xml version="1.0"?>
<!-- /home/hadoop/capacity-scheduler.xml -->
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>citizens,priests</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.citizens.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.priests.capacity</name>
    <value>50</value>
  </property>
</configuration>

In this configuration, we have defined two queues: citizens and priests. Capacity values are percentages, so each queue is guaranteed 50% of the cluster's resources.
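The Capacity Scheduler expects the percentage capacities of the queues directly under a parent to add up to 100, so it is worth checking the file before applying it. The sketch below sums the capacities of the queues directly under root. The root_capacity_sum function is a hypothetical helper, and it assumes each <value> sits on the line after its <name>, as in the file above.

```shell
# Hypothetical helper: sum the <value> of every
# yarn.scheduler.capacity.root.<queue>.capacity property in the file given as $1.
root_capacity_sum() {
  awk '
    /<name>yarn\.scheduler\.capacity\.root\.[^.<]+\.capacity<\/name>/ { want = 1; next }
    want && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")   # keep only the number
      sum += $0
      want = 0
    }
    END { print sum + 0 }
  ' "$1"
}

# Usage (path taken from this lab):
#   root_capacity_sum /home/hadoop/capacity-scheduler.xml
```

For the configuration in this step the result should be 100 (50 for citizens plus 50 for priests). Any other total suggests a queue is missing or a capacity was mistyped.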

Next, apply the new configuration by running the following command:

yarn scheduler --setConf /home/hadoop/capacity-scheduler.xml

Verify that the Capacity Scheduler is now active by running the yarn scheduler -getConf command again.
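On a stock Apache Hadoop installation where the Capacity Scheduler is already the active scheduler, queue changes are usually picked up by placing capacity-scheduler.xml in the Hadoop configuration directory and running yarn rmadmin -refreshQueues. The sketch below only prints those steps so you can review them first. The capacity_reload_steps function is a hypothetical helper, and /opt/hadoop/etc/hadoop is an assumed configuration directory that you should replace with your own.

```shell
# Hypothetical helper: print (do not run) the steps for reloading
# Capacity Scheduler queues.
#   $1 = path of the edited capacity-scheduler.xml
#   $2 = Hadoop configuration directory (falls back to HADOOP_CONF_DIR)
capacity_reload_steps() {
  src="$1"
  conf_dir="${2:-${HADOOP_CONF_DIR:-/etc/hadoop/conf}}"
  printf 'cp %s %s/capacity-scheduler.xml\n' "$src" "$conf_dir"
  printf 'yarn rmadmin -refreshQueues\n'
}

# The configuration directory below is an assumption for illustration.
capacity_reload_steps /home/hadoop/capacity-scheduler.xml /opt/hadoop/etc/hadoop
```

Note that refreshing queues reloads queue definitions only. Switching from another scheduler to the Capacity Scheduler is a yarn-site.xml change that generally requires a ResourceManager restart.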

Summary

In this lab, we learned how to configure and apply different scheduling policies in Hadoop YARN to manage resource allocation effectively. By mastering the Fair Scheduler and the Capacity Scheduler, Amenhotep can ensure fair distribution of resources among the kingdom's citizens and prioritize critical tasks performed by the priests.

Through this hands-on experience, we gained a deeper understanding of the powerful capabilities of Hadoop YARN and its ability to manage resources in a complex environment. By applying the knowledge gained from this lab, we can build efficient and fair resource management systems tailored to the specific needs of our organization.
