Introduction
Welcome to the magical carnival, where the extraordinary magician is ready to showcase the wonders of copying files in Hadoop's HDFS. In this enchanting scenario, the magician demonstrates how to copy files using Hadoop FS Shell commands, adding a magical touch to your Hadoop skills journey.
Copying Files Using Hadoop FS Shell
In this step, we will learn how to copy a local file into HDFS using the FS Shell -copyFromLocal command.
Switch to the hadoop user in the terminal:
    su - hadoop
Create a test file named source.txt in the /home/hadoop directory. Execute the following command:
    echo "This is a test file." > /home/hadoop/source.txt
Now, let's copy the local file source.txt to a new destination file named destination.txt on HDFS. Use the following command:
    hdfs dfs -copyFromLocal /home/hadoop/source.txt /destination.txt
Verify that the file has been copied successfully. You can list the files in / to confirm:
    hdfs dfs -ls /
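Listing the root directory shows that the file exists, but not that its contents arrived intact. The following is a minimal sketch of a stricter check, assuming the paths used in the steps above. The function name verify_hdfs_copy is hypothetical and not part of Hadoop; it relies only on the standard hdfs dfs -test and hdfs dfs -cat commands.

```shell
# verify_hdfs_copy is a hypothetical helper, not a Hadoop command.
# It checks that an HDFS file exists and matches a local file byte for byte.
# Usage: verify_hdfs_copy <local_path> <hdfs_path>
verify_hdfs_copy() {
  local local_path="$1"
  local hdfs_path="$2"

  # -test -e exits with status 0 only when the HDFS path exists.
  if ! hdfs dfs -test -e "$hdfs_path"; then
    echo "missing: $hdfs_path"
    return 1
  fi

  # Compare a checksum of the local bytes with one of the bytes streamed back from HDFS.
  local local_sum remote_sum
  local_sum="$(cksum < "$local_path")"
  remote_sum="$(hdfs dfs -cat "$hdfs_path" | cksum)"

  if [ "$local_sum" = "$remote_sum" ]; then
    echo "match: $hdfs_path"
  else
    echo "mismatch: $hdfs_path"
    return 1
  fi
}
```

For the file created in this step, the call would be verify_hdfs_copy /home/hadoop/source.txt /destination.txt. Streaming the file back with -cat is fine for a small test file; for large files it may be slow.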
Recursive File Copy with Hadoop FS Shell
In this step, we will enhance our file copying skills by copying directories recursively using the Hadoop FS Shell command.
Create a directory named source_dir in / and a subdirectory named subdir in /source_dir/. Execute the following commands:
    hdfs dfs -mkdir /source_dir
    hdfs dfs -mkdir /source_dir/subdir
Place a test file named file1.txt inside the subdir directory. Use the commands below:
    echo "Contents of file1" > /home/hadoop/file1.txt
    hdfs dfs -put /home/hadoop/file1.txt /source_dir/subdir/
Copy the entire source_dir directory to a new destination named destination_dir recursively. Try the following command:
    hdfs dfs -cp /source_dir/ /destination_dir
The command hdfs dfs -cp /source_dir/ /destination_dir has the following components:
hdfs dfs -cp: The HDFS FS Shell cp command, which copies files or directories. Directories are copied recursively, so no extra flag is needed.
/source_dir/: The path of the source directory. The whole directory, including subdir and file1.txt, is copied.
/destination_dir: The path of the target directory. If /destination_dir already exists, the copy is placed inside it as /destination_dir/source_dir.
In summary, this command copies /source_dir and everything beneath it to /destination_dir. By default, the copies do not keep the original timestamps, ownership, or permissions; add the -p option to preserve them.
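Two options of hdfs dfs -cp are worth knowing at this point: -f overwrites the destination if it already exists, and -p preserves timestamps, ownership, and permissions. The sketch below wraps them in a small function, assuming the directories from this step. The name copy_tree and its --preserve and --force switches are hypothetical conveniences, not Hadoop options.

```shell
# copy_tree is a hypothetical wrapper around `hdfs dfs -cp`, not a Hadoop command.
# Usage: copy_tree [--preserve] [--force] <source> <destination>
copy_tree() {
  local flags=""

  # Everything before the last two arguments is treated as an option.
  while [ "$#" -gt 2 ]; do
    case "$1" in
      --preserve) flags="$flags -p" ;;  # keep timestamps, ownership, permissions
      --force)    flags="$flags -f" ;;  # overwrite the destination if it exists
      *) echo "unknown option: $1" >&2; return 2 ;;
    esac
    shift
  done

  # $flags is left unquoted on purpose so that each flag becomes its own argument.
  hdfs dfs -cp $flags "$1" "$2"
}
```

Calling copy_tree --preserve --force /source_dir /destination_dir runs the same command as typing hdfs dfs -cp -p -f /source_dir /destination_dir directly; the wrapper only makes the intent of each flag easier to read in a script.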
Validate the recursive copy by listing the contents of the destination_dir directory:
    hdfs dfs -ls -R /destination_dir
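Reading a recursive listing by eye works for one file but scales poorly. As a rough sketch, the standard hdfs dfs -count command can be used to compare the two trees numerically. The helper name same_tree_counts is hypothetical, and equal counts are only a strong hint that the copy is complete, not proof that every byte matches.

```shell
# same_tree_counts is a hypothetical helper, not a Hadoop command.
# `hdfs dfs -count <path>` prints: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME.
# Usage: same_tree_counts <source_dir> <destination_dir>
same_tree_counts() {
  local src_counts dst_counts

  # Keep only the three numeric columns so the differing path names are ignored.
  src_counts="$(hdfs dfs -count "$1" | awk '{print $1, $2, $3}')"
  dst_counts="$(hdfs dfs -count "$2" | awk '{print $1, $2, $3}')"

  if [ -n "$src_counts" ] && [ "$src_counts" = "$dst_counts" ]; then
    echo "ok: dirs files bytes = $src_counts"
  else
    echo "differ: source=[$src_counts] destination=[$dst_counts]"
    return 1
  fi
}
```

For this step, the call would be same_tree_counts /source_dir /destination_dir. A non-zero exit status signals that the directory count, file count, or total size differs between the two trees.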
Summary
In this lab, we delved into the magical world of Hadoop HDFS, focusing on the hdfs dfs -copyFromLocal and hdfs dfs -cp commands. You copied a local file into HDFS, copied a directory tree recursively within HDFS, and verified each result with hdfs dfs -ls. Remember, practice makes perfect, and mastering these skills will empower you in your Hadoop journey.



