Configuring Auxiliary Services for the NodeManager
As mentioned in the previous section, the YARN Node Manager can load auxiliary services, which are pluggable components that run inside the Node Manager process and support the application containers on that node. They are typically used for application-specific processing such as serving shuffle data, and can also support tasks such as logging and monitoring.
Identifying Auxiliary Services
The auxiliary services available to the Node Manager depend on the service classes present on its classpath. The related properties and their defaults are described in the yarn-default.xml file, which ships inside the Hadoop YARN JARs. Site-specific settings go in the yarn-site.xml file, which is typically located in the $HADOOP_HOME/etc/hadoop/ directory.
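Here are service names you are likely to encounter, with notes on how each one is enabled:

| Service Name | Description |
| --- | --- |
| mapreduce_shuffle | Provides the shuffle service for MapReduce applications. |
| spark_shuffle | Provides the shuffle service for Spark applications. |
| log_aggregation | Aggregates and stores the logs of application containers. Stock Hadoop builds this into the Node Manager and enables it with yarn.log-aggregation-enable, so it is not listed in yarn.nodemanager.aux-services. |
| timeline | Provides the timeline service for application monitoring and historical data. With Timeline Service v2, the Node Manager auxiliary service is named timeline_collector. |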
Configuring Auxiliary Services
To configure auxiliary services for the Node Manager, you need to modify the yarn-site.xml file on the worker nodes. Here's an example configuration:
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>
In this example, we've configured two auxiliary services: mapreduce_shuffle and spark_shuffle. Each service is associated with a specific class that implements the service's functionality. The spark_shuffle class is provided by Spark's YARN shuffle JAR, which must be on the Node Manager's classpath. Log aggregation is built into the Node Manager rather than loaded as an auxiliary service, so it is switched on with yarn.log-aggregation-enable instead of being listed in yarn.nodemanager.aux-services.
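The link between the service list and the per-service class properties can be checked mechanically before restarting anything. The following is a minimal sketch, not part of Hadoop: the function names are hypothetical, and it assumes the text of a complete yarn-site.xml file with a configuration root element. It is stricter than Hadoop itself, because it expects an explicit class property for every listed service even where yarn-default.xml may already supply one.

```python
import re
import xml.etree.ElementTree as ET

AUX_LIST_KEY = "yarn.nodemanager.aux-services"

# Hadoop's documentation restricts service names to letters, digits, and
# underscores, and does not allow a name to start with a digit.
VALID_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            props[name] = value
    return props


def check_aux_services(xml_text):
    """Return a list of problems found; an empty list means none were found."""
    props = load_properties(xml_text)
    listed = props.get(AUX_LIST_KEY, "")
    names = [n.strip() for n in listed.split(",") if n.strip()]
    problems = []
    for name in names:
        if not VALID_NAME.match(name):
            problems.append(f"invalid service name: {name}")
        # Hypothetical strict rule: require an explicit class for each service.
        if not props.get(f"{AUX_LIST_KEY}.{name}.class"):
            problems.append(f"no class configured for: {name}")
    return problems
```

To use it, read the contents of yarn-site.xml into a string and pass that string to check_aux_services. The check only confirms that a class name is configured; it cannot tell whether that class is actually present on the Node Manager's classpath.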
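After configuring the auxiliary services, restart the Node Manager on each worker node for the changes to take effect. The command below assumes a packaged installation that registers the Node Manager with systemd as hadoop-yarn-nodemanager; adjust the unit name to match your environment.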
sudo systemctl restart hadoop-yarn-nodemanager
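One way to confirm that the Node Manager came back up after the restart is to query its REST endpoint. The sketch below makes several assumptions: the web interface listens on the default port 8042, the response carries the nodeInfo fields described in the NodeManager REST API documentation, and the host name worker1 in the usage note is hypothetical. If an auxiliary service fails to load, the Node Manager usually does not start, so a refused connection is itself a useful signal.

```python
import json
import urllib.request


def node_info_url(host, port=8042):
    """Build the URL of the Node Manager's node information endpoint."""
    return f"http://{host}:{port}/ws/v1/node/info"


def parse_node_info(payload):
    """Extract a few fields from the JSON text returned by the endpoint."""
    info = json.loads(payload)["nodeInfo"]
    return {
        "healthy": bool(info.get("nodeHealthy")),
        "version": info.get("nodeManagerVersion"),
        "report": info.get("healthReport", ""),
    }


def fetch_node_info(host, port=8042, timeout=5.0):
    """Query a running Node Manager; raises URLError if it is unreachable."""
    request = urllib.request.Request(
        node_info_url(host, port),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_node_info(response.read().decode("utf-8"))
```

For example, calling fetch_node_info("worker1") on a machine that can reach the worker returns a small dictionary whose healthy entry should be True once the Node Manager has finished starting.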
By configuring these auxiliary services, you can extend the functionality of the Node Manager and provide additional capabilities to the application containers running on the worker nodes.