Hadoop FS Shell cp


Introduction

Welcome to the magical carnival where the extraordinary magician is ready to showcase the wonders of Hadoop's HDFS with the copy skill. In this enchanting scenario, the magician aims to demonstrate how to copy files using the Hadoop FS Shell command, providing a magical touch to your Hadoop skills journey.


Skills Graph

    %%{init: {'theme':'neutral'}}%%
    flowchart RL
        hadoop(("`Hadoop`")) -.-> hadoop/HadoopHDFSGroup(["`Hadoop HDFS`"])
        hadoop/HadoopHDFSGroup -.-> hadoop/fs_cp("`FS Shell cp`")
        subgraph Lab Skills
            hadoop/fs_cp -.-> lab-271866{{"`Hadoop FS Shell cp`"}}
        end

Copying Files Using Hadoop FS Shell

In this step, we will learn how to copy files in Hadoop using the FS Shell cp command.

  1. Switch to the hadoop user in the terminal:

    su - hadoop
  2. Create a test file named source.txt in the /home/hadoop directory. Execute the following command:

    echo "This is a test file." > /home/hadoop/source.txt
  3. Now, let's copy the local file source.txt to HDFS as a new file named destination.txt. Use the following command:

    hdfs dfs -copyFromLocal /home/hadoop/source.txt /destination.txt
  4. Verify that the file has been copied successfully by listing the contents of the HDFS root directory /:

    hdfs dfs -ls /
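A listing confirms the file exists; to confirm its contents survived the trip, you can diff the two copies. The idea can be sketched against the local filesystem (no cluster needed); with a running cluster you would stream the HDFS file with hdfs dfs -cat and diff it against the local original. The temporary paths below are illustrative:

```shell
# Local sketch of verifying a copy byte-for-byte (illustrative paths).
# Against HDFS you would stream the remote copy instead:
#   hdfs dfs -cat /destination.txt | diff - /home/hadoop/source.txt
tmp=$(mktemp -d)
echo "This is a test file." > "$tmp/source.txt"
cp "$tmp/source.txt" "$tmp/destination.txt"
diff "$tmp/source.txt" "$tmp/destination.txt" && echo "copies match"
```

diff exits non-zero and prints the differing lines if the copy were corrupted, so the "copies match" message only appears on success.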

Recursive File Copy with Hadoop FS Shell

In this step, we will enhance our file copying skills by copying directories recursively using the Hadoop FS Shell command.

  1. Create a directory named source_dir in / and a subdirectory named subdir in /source_dir/. Execute the following commands:

    hdfs dfs -mkdir /source_dir
    hdfs dfs -mkdir /source_dir/subdir
  2. Place a test file named file1.txt inside the subdir directory. Use the command below:

    echo "Contents of file1" > /home/hadoop/file1.txt
    hdfs dfs -put /home/hadoop/file1.txt /source_dir/subdir/
  3. Copy the entire source_dir directory to a new destination named destination_dir. The cp command copies directories recursively by default:

    hdfs dfs -cp /source_dir /destination_dir

The command hdfs dfs -cp /source_dir /destination_dir has the following components:

  1. hdfs dfs -cp: invokes the Hadoop Distributed File System (HDFS) cp command, which copies files or directories within HDFS.
  2. /source_dir: the path of the source directory. Because the source is a directory, cp copies it together with all files and subdirectories beneath it.
  3. /destination_dir: the path of the target directory where the copy is placed.

In summary, this command copies /source_dir, with all of its files and subdirectories, to /destination_dir. Note that file attributes such as timestamps, ownership, and permissions are preserved only if you add the -p flag.
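HDFS cp carries over the source's timestamps, ownership, and permissions only when given the -p flag, and the local cp follows the same convention. A minimal local sketch of that behavior (the temporary paths are illustrative, and a cluster is not required):

```shell
# Local sketch: cp -p carries the source's timestamps to the copy,
# much as hdfs dfs -cp -p does within HDFS (illustrative paths).
tmp=$(mktemp -d)
echo "demo" > "$tmp/attr_src.txt"
touch -t 202001010000 "$tmp/attr_src.txt"   # pin a known mtime on the source
cp -p "$tmp/attr_src.txt" "$tmp/attr_dst.txt"
ls -l "$tmp"   # both entries report the same date
```

Without -p, the destination would instead get a fresh modification time from the moment of the copy.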

  4. Validate the recursive copy by listing the contents of the destination_dir directory recursively:

    hdfs dfs -ls -R /destination_dir
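The recursive-copy semantics mirror cp -r on a local filesystem, so the whole copy-then-verify flow of this step can be sketched locally (the temporary paths are illustrative; with a cluster you would use the hdfs dfs commands from the steps above):

```shell
# Local sketch of the recursive copy-and-verify flow (illustrative paths);
# hdfs dfs -cp handles directories recursively much like cp -r here.
tmp=$(mktemp -d)
mkdir -p "$tmp/source_dir/subdir"
echo "Contents of file1" > "$tmp/source_dir/subdir/file1.txt"
cp -r "$tmp/source_dir" "$tmp/destination_dir"
# diff -r walks both trees and reports any file that differs or is missing
diff -r "$tmp/source_dir" "$tmp/destination_dir" && echo "trees match"
```

diff -r plays the role of the hdfs dfs -ls -R check: it confirms that the nested subdir and file1.txt arrived intact at the destination.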

Summary

In this lab, we delved into the magical world of Hadoop HDFS, focusing on the hdfs dfs -copyFromLocal and hdfs dfs -cp commands. By creating engaging scenarios and providing hands-on practice, this lab aimed to deepen your understanding of file copying operations in Hadoop. Remember, practice makes perfect, and mastering these skills will empower you on your Hadoop journey.
