Explore the Node Manager's Role
In this step, you will learn about the role of the Node Manager in the Hadoop YARN architecture.
The Node Manager is a core component of the Hadoop YARN (Yet Another Resource Negotiator) framework. It manages the resources of an individual node within a Hadoop cluster. Each worker node in the cluster runs its own Node Manager instance, which registers with the Resource Manager, reports the node's status through periodic heartbeats, and launches and supervises the containers that run tasks on that node.
Here's how the Node Manager works:
- Node Registration: When a Node Manager starts up, it registers itself with the Resource Manager and reports the resources available on its node. By default these are memory and CPU virtual cores (vcores).
- Container Management: The Node Manager is responsible for creating and managing containers, which are isolated execution environments for tasks. Each container has a specific resource allocation defined by the Resource Manager.
- Task Execution: When the Resource Manager allocates a container on a node, the Node Manager starts that container and launches the task within it. The Node Manager monitors the task's execution and reports the container's status back to the Resource Manager.
- Resource Monitoring: The Node Manager continuously monitors the resource usage of each container and node, ensuring that tasks do not consume more resources than allocated.
- Health Monitoring: The Node Manager also monitors the health of the node itself, checking for issues like disk failures or network connectivity problems. If a node becomes unhealthy, the Node Manager can report this to the Resource Manager, which can then take appropriate actions, such as restarting or rescheduling tasks.
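The health check in the last bullet can be pictured with a small shell sketch. It is only an analogue of the idea, not the Node Manager's actual code. The helper names disk_used_pct and disk_state are hypothetical, /tmp is an arbitrary directory to inspect, and the 90 percent threshold is an assumption that mirrors the default of the yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage setting.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; they are not part of Hadoop.

# disk_used_pct DIR: print the used-space percentage (digits only) of the
# filesystem that holds DIR, taken from the POSIX-format output of df.
disk_used_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_state USED MAX: print "healthy" when USED is at most MAX,
# otherwise print "unhealthy".
disk_state() {
  if [ "$1" -le "$2" ]; then
    echo healthy
  else
    echo unhealthy
  fi
}

max_pct=90                      # assumed threshold, see the note above
used=$(disk_used_pct /tmp)      # /tmp is an arbitrary example directory
echo "/tmp is ${used}% full: $(disk_state "$used" "$max_pct")"
```

A real Node Manager runs checks of this kind on its configured local and log directories and includes the result in the health status it reports to the Resource Manager.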
To explore the Node Manager's role, first switch to the hadoop user:
su - hadoop
Next, we can check the status of the Node Manager by running the following command:
yarn node -status <Node-Id>
Tip: you can find the Node-Id by running the yarn node -list command.
This command displays a report for the selected Node Manager, including its address, its state, the resources used and available on the node, and the number of running containers. For example:
hadoop:~/ $ yarn node -status iZj6c4hvgdd6j6qljtbxoaZ:39885
2024-03-23 21:54:08,741 INFO client.DefaultNoHARMFailoverProxyProvider: Connecting to ResourceManager at /0.0.0.0:8032
2024-03-23 21:54:09,119 INFO conf.Configuration: resource-types.xml not found
2024-03-23 21:54:09,128 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
Node Report :
Node-Id : iZj6c4hvgdd6j6qljtbxoaZ:39885
Rack : /default-rack
Node-State : RUNNING
Node-Http-Address : iZj6c4hvgdd6j6qljtbxoaZ:8042
Last-Health-Update : Sat 23/Mar/24 09:52:56:762CST
...
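When only one field of the report is needed, for example in a script that checks whether a node is still RUNNING, the report can be filtered. The following is a minimal sketch: node_report_field is a hypothetical helper, not a yarn subcommand, and the sample text is copied from the report lines shown above so that the block runs without a cluster.

```shell
#!/usr/bin/env bash
# node_report_field FIELD: read a node report on stdin and print the value
# of the first line whose name (the text before " : ") equals FIELD.
# Hypothetical helper for illustration; it is not part of Hadoop.
node_report_field() {
  awk -v key="$1" '
    {
      idx = index($0, " : ")              # position of the name/value separator
      if (idx == 0) next                  # skip lines without a separator
      name = substr($0, 1, idx - 1)
      gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim surrounding whitespace
      if (name == key) {
        value = substr($0, idx + 3)
        gsub(/^[ \t]+|[ \t]+$/, "", value)
        print value
        exit
      }
    }'
}

# Sample lines copied from the node report shown in this step.
sample_report='Node Report :
Node-Id : iZj6c4hvgdd6j6qljtbxoaZ:39885
Rack : /default-rack
Node-State : RUNNING
Node-Http-Address : iZj6c4hvgdd6j6qljtbxoaZ:8042'

printf '%s\n' "$sample_report" | node_report_field Node-State
```

On a live cluster, the same helper could read the real report by piping the command into it: yarn node -status <Node-Id> | node_report_field Node-State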