Introduction
Imagine you find yourself in the middle of a desert ruin, seeking guidance from a mythical figure known as the Disaster Oracle. The Oracle has foreseen a cataclysmic event that can be averted only by setting up the Hadoop HDFS infrastructure correctly. Your goal is to follow the Oracle's instructions and keep the data kingdom safe.
Initializing HDFS Configuration
In this step, you will start by configuring the Hadoop HDFS to prepare for data storage and processing.
Open the terminal and follow the steps below to get started.
Switch to the Hadoop user for proper permissions:
su - hadoop
Create a directory for storing HDFS data:
hdfs dfs -mkdir -p /home/hadoop/data
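The two commands above can be wrapped in a small idempotent helper that only creates the directory if it is missing. This is a sketch, assuming the hdfs CLI is on your PATH and you have already switched to the hadoop user; the function name is illustrative:

```shell
#!/usr/bin/env bash
# Sketch: create an HDFS directory only if it does not already exist.
# Assumes the `hdfs` CLI is on PATH and the current user has HDFS permissions.
ensure_hdfs_dir() {
  local dir="$1"
  # `hdfs dfs -test -d` exits 0 if the path exists and is a directory
  if hdfs dfs -test -d "$dir"; then
    echo "exists: $dir"
  else
    hdfs dfs -mkdir -p "$dir"
    echo "created: $dir"
  fi
}

# Usage: ensure_hdfs_dir /home/hadoop/data
```

Using `-test` first keeps re-runs of the lab from failing on an already-created directory.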
Uploading Data to HDFS
Next, you will upload sample data to the configured HDFS directory.
Create a local file with sample data:
echo 'Hello, Hadoop World!' > /tmp/sample.txt
Upload the local file to HDFS:
hdfs dfs -put /tmp/sample.txt /home/hadoop/data
Check if the file exists in HDFS:
hdfs dfs -ls /home/hadoop/data
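The upload-and-verify steps can be combined into a helper that fails loudly if the file never lands in HDFS. A minimal sketch, assuming the hdfs CLI is on PATH; the paths and function name are illustrative:

```shell
#!/usr/bin/env bash
# Sketch: upload a local file to HDFS, then confirm it arrived.
# Assumes the `hdfs` CLI is on PATH.
upload_and_verify() {
  local src="$1" dest_dir="$2"
  hdfs dfs -put -f "$src" "$dest_dir" || return 1   # -f overwrites an existing copy
  local name
  name=$(basename "$src")
  # `hdfs dfs -test -e` exits 0 if the path exists in HDFS
  if hdfs dfs -test -e "$dest_dir/$name"; then
    echo "uploaded: $dest_dir/$name"
  else
    echo "missing: $dest_dir/$name" >&2
    return 1
  fi
}

# Usage: upload_and_verify /tmp/sample.txt /home/hadoop/data
```

Verifying with `-test -e` rather than eyeballing `-ls` output makes the check scriptable.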
Data Replication Management
In this step, you will explore how HDFS handles data replication.
Check the replication status of the uploaded file:
hdfs fsck /home/hadoop/data/sample.txt -files -blocks -locations
Change the replication factor of the file to 2:
hdfs dfs -setrep 2 /home/hadoop/data/sample.txt
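After changing the factor, you can confirm it took effect: `hdfs dfs -stat '%r'` prints a file's current replication factor. A minimal polling sketch, assuming the hdfs CLI is on PATH; the function name and retry counts are illustrative:

```shell
#!/usr/bin/env bash
# Sketch: poll until a file's replication factor reaches the target.
# Assumes the `hdfs` CLI is on PATH; `%r` in `hdfs dfs -stat` is the replication factor.
wait_for_replication() {
  local path="$1" target="$2" tries="${3:-10}"
  local i rep
  for ((i = 0; i < tries; i++)); do
    rep=$(hdfs dfs -stat '%r' "$path")
    if [ "$rep" -ge "$target" ]; then
      echo "replication ok: $rep"
      return 0
    fi
    sleep 2   # give the NameNode time to schedule new replicas
  done
  echo "replication still $rep after $tries checks" >&2
  return 1
}

# Usage: wait_for_replication /home/hadoop/data/sample.txt 2
```

Note that `hdfs dfs -setrep -w 2 <path>` waits for re-replication on its own; the loop above is useful when you want a timeout instead of blocking indefinitely.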
Summary
In this lab, you worked through an immersive scenario in which the Disaster Oracle of the desert ruin guides you through setting up Hadoop HDFS. By following the steps above, you gained hands-on experience configuring HDFS, uploading data, and managing data replication. The lab is intended as a practical introduction to HDFS setup and to the key concepts and operations involved.