Recursive File Copy with Hadoop FS Shell
In this step, we will enhance our file copying skills by copying directories recursively using the Hadoop FS shell.
- Create a directory named `source_dir` in `/` and a subdirectory named `subdir` in `/source_dir/`. Execute the following commands:

  ```shell
  hdfs dfs -mkdir /source_dir
  hdfs dfs -mkdir /source_dir/subdir
  ```
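  Alternatively, `hdfs dfs -mkdir` accepts a `-p` flag that creates missing parent directories, much like the Unix `mkdir -p`. A minimal sketch that collapses both commands into one:

  ```shell
  # -p creates /source_dir automatically if it does not exist yet
  hdfs dfs -mkdir -p /source_dir/subdir
  ```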
- Place a test file named `file1.txt` inside the `subdir` directory. Use the commands below:

  ```shell
  echo "Contents of file1" > /home/hadoop/file1.txt
  hdfs dfs -put /home/hadoop/file1.txt /source_dir/subdir/
  ```
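  To confirm the upload, you can read the file straight back out of HDFS (a quick sanity check, not part of the original steps):

  ```shell
  # Print the file's contents from HDFS; it should echo the line written above
  hdfs dfs -cat /source_dir/subdir/file1.txt
  ```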
- Copy the entire `source_dir` directory to a new destination named `destination_dir` recursively. Try the following command:

  ```shell
  hdfs dfs -cp /source_dir /destination_dir
  ```

  The command `hdfs dfs -cp /source_dir /destination_dir` has the following components:

  - `hdfs dfs -cp`: invokes the Hadoop Distributed File System (HDFS) `cp` command, which copies files or directories.
  - `/source_dir`: the path of the source directory. Because the source is a directory, `cp` copies it along with every file and subdirectory beneath it.
  - `/destination_dir`: the path of the target directory the copy is written to.

  In summary, this command copies `/source_dir` and all of its contents to `/destination_dir`. Note that a plain `cp` does not preserve file attributes such as timestamps, ownership, and permissions; add the `-p` flag if you need them kept, as shown below.
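  A sketch of the attribute-preserving variant (the destination name `destination_dir_preserved` is hypothetical, chosen so it does not collide with the copy made above):

  ```shell
  # -p preserves timestamps, ownership, and permissions of the copied files
  # (changing ownership on the destination may require superuser privileges)
  hdfs dfs -cp -p /source_dir /destination_dir_preserved
  ```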
- Validate the recursive copy by listing the contents of the `destination_dir` directory:

  ```shell
  hdfs dfs -ls -R /destination_dir
  ```
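  The `-R` listing should mirror the source tree, showing `/destination_dir/subdir` and `/destination_dir/subdir/file1.txt`. As an extra check (an assumption, not part of the original steps), you can compare the HDFS checksums of the original file and its copy; with identical contents and default block settings they should match:

  ```shell
  # Checksums match when file contents (and block settings) are identical
  hdfs dfs -checksum /source_dir/subdir/file1.txt
  hdfs dfs -checksum /destination_dir/subdir/file1.txt
  ```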