Transfer Files in Red Hat Enterprise Linux

Introduction

In this lab, you will gain practical experience in managing and transferring files efficiently and securely on a RHEL system. You will learn to create, list, and extract files from tar archives, including compressed archives, which are essential for packaging and backing up data.

Furthermore, this lab will guide you through securely transferring files using sftp for interactive file transfers and rsync for robust and efficient file synchronization, ensuring data integrity and security during network operations.

Create and List tar Archives

In this step, you will learn how to create and list tar archives. The tar utility is a powerful command-line tool used for archiving files and directories. It is commonly used for backups and transferring files.

The tar command requires an action option to specify what operation it should perform. Common action options include:

-c or --create: Creates a new archive.
-t or --list: Lists the contents of an archive.
-x or --extract: Extracts files from an archive.

Additionally, tar often uses general options to modify its behavior:

-v or --verbose: Displays the files being processed during the archiving or extraction.
-f or --file: Specifies the name of the archive file. This option must be followed immediately by the archive filename, so when you combine short options (as in -cvf), put f last.

Let's start by creating some sample files that we will archive. Navigate to your ~/project directory, which is your default working directory.

cd ~/project

Now, create a new directory named my_files and some sample text files inside it.

mkdir my_files
echo "This is file1 content." > my_files/file1.txt
echo "This is file2 content." > my_files/file2.txt
echo "This is file3 content." > my_files/file3.txt
ls my_files

You should see the three files listed:

file1.txt  file2.txt  file3.txt

Now, let's create a tar archive of the my_files directory. We will name the archive my_archive.tar.

tar -cvf my_archive.tar my_files

The output will show the files being added to the archive:

my_files/
my_files/file1.txt
my_files/file2.txt
my_files/file3.txt

You can verify that the archive file my_archive.tar has been created in your current directory:

ls

You should see my_archive.tar listed along with my_files:

my_archive.tar  my_files

Next, let's list the contents of the my_archive.tar file using the -t (list) and -f (file) options.

tar -tf my_archive.tar

The output will show the contents of the archive:

my_files/
my_files/file1.txt
my_files/file2.txt
my_files/file3.txt

If you want to see more detailed information, such as file permissions, ownership, and size, you can add the -v (verbose) option:

tar -tvf my_archive.tar

The output will be similar to this, providing more details about each archived item:

drwxr-xr-x labex/labex        0 2023-10-27 10:00 my_files/
-rw-r--r-- labex/labex       23 2023-10-27 10:00 my_files/file1.txt
-rw-r--r-- labex/labex       23 2023-10-27 10:00 my_files/file2.txt
-rw-r--r-- labex/labex       23 2023-10-27 10:00 my_files/file3.txt

Note that by default tar removes the leading / from absolute paths when archiving. The example above used a relative path, so nothing was stripped, but if you were to archive /etc/hosts, it would be stored as etc/hosts inside the tar file. This is a safety measure: the archive can be extracted under any directory without overwriting the original /etc/hosts file.
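The following is a minimal sketch of that behavior, assuming GNU tar (the version shipped with RHEL). The abs_demo directory and sample.txt file are hypothetical names created only for this illustration.

```shell
# Assumption: abs_demo is a throwaway directory used only for this sketch.
abs_demo=$(mktemp -d)
echo "sample" > "$abs_demo/sample.txt"

# Archive the file by its absolute path. tar prints a warning on stderr
# saying that it is removing the leading "/" from member names.
tar -cf "$abs_demo/abs.tar" "$abs_demo/sample.txt"

# List the archive: the stored member name has no leading "/".
member=$(tar -tf "$abs_demo/abs.tar")
echo "$member"
```

Because the stored name is relative, extracting this archive recreates the path underneath whatever directory you extract into.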

Extract Files from tar Archives

In this step, you will learn how to extract files from a tar archive. Extracting files is the process of taking the archived contents and placing them back into the file system.

The primary option for extracting files is -x or --extract. You will also typically use -f to specify the archive file and -v for verbose output to see which files are being extracted.

It is good practice to extract an archive into an empty directory so that you do not overwrite existing files or mix the extracted content with other files. Let's create a new directory called extracted_files in your ~/project directory.

cd ~/project
mkdir extracted_files

Next, navigate into the extracted_files directory. By default, tar extracts into the current working directory, so the archive contents will land here.

cd extracted_files

Now, let's extract the contents of my_archive.tar (which is located in the parent directory, ~/project) into the current extracted_files directory.

tar -xvf ../my_archive.tar

The output will show the files being extracted:

my_files/
my_files/file1.txt
my_files/file2.txt
my_files/file3.txt

After extraction, you can list the contents of the current directory (~/project/extracted_files) to verify that the my_files directory and its contents have been successfully extracted.

ls

You should see the my_files directory:

my_files

Now, let's check the contents of the my_files directory inside extracted_files:

ls my_files

You should see the original files:

file1.txt  file2.txt  file3.txt

You can also view the content of one of the extracted files to confirm its integrity:

cat my_files/file1.txt

The output should be:

This is file1 content.
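Checking one file with cat works for a small example. For a whole directory tree, a recursive diff is one way to compare the extracted copy with the original. The sketch below rebuilds a small version of the lab layout in a throwaway directory (verify_demo is a hypothetical name) so it can be run on its own.

```shell
# Assumption: verify_demo is a scratch directory created only for this sketch.
verify_demo=$(mktemp -d)
mkdir "$verify_demo/my_files" "$verify_demo/extracted_files"
echo "This is file1 content." > "$verify_demo/my_files/file1.txt"
echo "This is file2 content." > "$verify_demo/my_files/file2.txt"

# Create the archive, then extract it into the empty directory.
tar -cf "$verify_demo/my_archive.tar" -C "$verify_demo" my_files
tar -xf "$verify_demo/my_archive.tar" -C "$verify_demo/extracted_files"

# diff -r prints nothing and exits with status 0 when both trees match.
if diff -r "$verify_demo/my_files" "$verify_demo/extracted_files/my_files"; then
    verify_result="identical"
else
    verify_result="different"
fi
echo "Extracted copy is $verify_result"
```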

When a regular user extracts files, tar applies the current umask to the permissions stored in the archive, so the extracted files may end up with fewer permission bits than the originals. To restore the permissions exactly as they were archived, use the -p or --preserve-permissions option. This is particularly useful for executable scripts or configuration files where specific permissions are crucial. For the root user, GNU tar enables this behavior by default. For regular users, it's good practice to include -p whenever permission preservation matters.

For this lab, we will not demonstrate the -p option explicitly, as the default behavior is sufficient for our text files. However, keep this option in mind for future use cases.
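If you want to try the option anyway, the sketch below shows one way to observe the difference. It assumes GNU tar and GNU stat; perm_demo and run.sh are hypothetical names used only here.

```shell
# Assumption: perm_demo is a scratch directory created only for this sketch.
perm_demo=$(mktemp -d)
mkdir "$perm_demo/src" "$perm_demo/default" "$perm_demo/preserved"

# Create a script that is group-writable (mode 775) and archive it.
echo 'echo hello' > "$perm_demo/src/run.sh"
chmod 775 "$perm_demo/src/run.sh"
tar -cf "$perm_demo/perm.tar" -C "$perm_demo/src" run.sh

# Extract twice under umask 022: once without -p, once with -p.
# The subshells keep the umask change from affecting your session.
(umask 022; tar -xf  "$perm_demo/perm.tar" -C "$perm_demo/default")
(umask 022; tar -xpf "$perm_demo/perm.tar" -C "$perm_demo/preserved")

default_mode=$(stat -c %a "$perm_demo/default/run.sh")
preserved_mode=$(stat -c %a "$perm_demo/preserved/run.sh")
echo "without -p: $default_mode"
echo "with -p:    $preserved_mode"
```

As a regular user you would expect the first extraction to lose the group write bit, while the second keeps mode 775. As root, both extractions keep the archived mode, because -p is the default for root.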

Create Compressed tar Archives

In this step, you will learn how to create compressed tar archives. While tar itself only bundles files, it can integrate with compression utilities like gzip, bzip2, and xz to create smaller archive files. This is crucial for saving disk space and reducing transfer times.

The tar command provides specific options for different compression algorithms:

-z or --gzip: Uses gzip compression, resulting in a .tar.gz or .tgz suffix. This is the most widely used of the three methods and usually the fastest.
-j or --bzip2: Uses bzip2 compression, resulting in a .tar.bz2 or .tbz suffix. This generally offers better compression than gzip but is slower.
-J or --xz: Uses xz compression, resulting in a .tar.xz or .txz suffix. This provides the best compression ratio among the three but is the slowest.
-a or --auto-compress: When creating an archive, lets tar choose the compression program from the archive's suffix (e.g., .tar.gz implies gzip). This is convenient because you do not need to remember a separate option for each format.

Let's start by ensuring you are in your ~/project directory.

cd ~/project

First, we will create a gzip compressed archive of the my_files directory. We will name it my_archive.tar.gz.

tar -czvf my_archive.tar.gz my_files

The output will show the files being added and compressed:

my_files/
my_files/file1.txt
my_files/file2.txt
my_files/file3.txt

You can verify the creation of the compressed archive:

ls -lh my_archive.tar.gz

The -l option for ls produces a long listing, and -h prints file sizes in human-readable units. You will see output similar to this, showing the file size:

-rw-r--r-- 1 labex labex 180 Oct 27 10:00 my_archive.tar.gz

(Note: The exact size might vary slightly depending on the system and content, but it will be a small size for these small text files.)

Now, let's try creating an xz compressed archive, which typically offers better compression. We will name it my_archive.tar.xz.

tar -cJvf my_archive.tar.xz my_files

Again, the output will show the files being processed:

my_files/
my_files/file1.txt
my_files/file2.txt
my_files/file3.txt

Check the size of the xz archive:

ls -lh my_archive.tar.xz

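You will see output similar to the following. With input this small, the two archives are nearly the same size, and either one may come out slightly smaller because of format overhead. The better compression ratio of xz becomes visible with larger data.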

-rw-r--r-- 1 labex labex 168 Oct 27 10:00 my_archive.tar.xz
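The effect of compression is easier to see with a larger input. The sketch below generates a compressible text file and compares the size of an uncompressed, a gzip, and an xz archive. It assumes the gzip and xz utilities are installed; size_demo and numbers.txt are hypothetical names, and the exact sizes will depend on your system.

```shell
# Assumption: size_demo is a scratch directory created only for this sketch.
size_demo=$(mktemp -d)
mkdir "$size_demo/data"

# Generate roughly 100 KB of compressible text.
seq 1 20000 > "$size_demo/data/numbers.txt"

# Create the same archive three ways: uncompressed, gzip, and xz.
tar -cf  "$size_demo/data.tar"    -C "$size_demo" data
tar -czf "$size_demo/data.tar.gz" -C "$size_demo" data
tar -cJf "$size_demo/data.tar.xz" -C "$size_demo" data

# Compare the archive sizes in bytes.
tar_size=$(wc -c < "$size_demo/data.tar")
gz_size=$(wc -c < "$size_demo/data.tar.gz")
xz_size=$(wc -c < "$size_demo/data.tar.xz")
echo "tar:    $tar_size bytes"
echo "tar.gz: $gz_size bytes"
echo "tar.xz: $xz_size bytes"
```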

When extracting or listing a compressed archive, GNU tar (the version shipped with RHEL) detects the compression type automatically, so tar -xf is sufficient. You can still pass the matching option (-z, -j, or -J) explicitly to make clear what kind of archive the command expects. The -a (auto-compress) option applies only when creating an archive.

Let's try extracting my_archive.tar.gz into a new directory called extracted_gz.

mkdir extracted_gz
tar -xzvf my_archive.tar.gz -C extracted_gz

The -C option (change directory) tells tar to extract the files into the specified directory. This is a very useful option to avoid cluttering your current directory.

Verify the contents of extracted_gz:

ls extracted_gz/my_files

You should see:

file1.txt  file2.txt  file3.txt
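The -a option described earlier, and tar's automatic detection when reading an archive, can be sketched together as follows. This assumes GNU tar and the gzip utility; auto_demo and auto.tar.gz are hypothetical names used only for this illustration.

```shell
# Assumption: auto_demo is a scratch directory created only for this sketch.
auto_demo=$(mktemp -d)
mkdir "$auto_demo/my_files"
echo "This is file1 content." > "$auto_demo/my_files/file1.txt"

# With -a, tar chooses gzip because the archive name ends in .tar.gz.
tar -caf "$auto_demo/auto.tar.gz" -C "$auto_demo" my_files

# gzip -t succeeds only if the file is a valid gzip stream.
gzip -t "$auto_demo/auto.tar.gz" && echo "auto.tar.gz is gzip-compressed"

# No compression option is needed when listing or extracting.
auto_listing=$(tar -tf "$auto_demo/auto.tar.gz")
echo "$auto_listing"
```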

Transfer Files Securely with sftp

In this step, you will learn how to securely transfer files between systems using sftp (Secure File Transfer Program). sftp is an interactive file transfer program that uses SSH (Secure Shell) for secure communication, providing encryption and authentication. It is part of the OpenSSH suite.

For this lab, we will simulate a remote system by using the labex user on the same host as a "remote" user. This allows us to practice sftp commands without needing a separate virtual machine.

First, ensure you are in your ~/project directory.

cd ~/project

Let's create a file that we will "upload" to the simulated remote user's home directory.

echo "This file will be uploaded via sftp." > local_file.txt

Now, initiate an sftp session to the labex user on localhost.

sftp labex@localhost

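If this is your first connection to localhost, you will be asked to confirm the host key; type yes. You will then be prompted for the password for labex@localhost. Enter labex. The password is not displayed as you type.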

The authenticity of host 'localhost (127.0.0.1)' can't be established.
ED25519 key fingerprint is SHA256:....
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'localhost' (ED25519) to the list of known hosts.
labex@localhost's password:
Connected to localhost.
sftp>

You are now in the sftp interactive prompt.

Inside the sftp prompt, you can use various commands similar to a regular shell.

pwd: Shows the current working directory on the remote system.
lpwd: Shows the current working directory on your local system.
ls: Lists files on the remote system.
lls: Lists files on the local system.

Let's try them:

sftp> pwd
Remote working directory: /home/labex
sftp> lpwd
Local working directory: /home/labex/project
sftp> ls

(The ls command will show the contents of /home/labex on the remote side, which is your own home directory.)

Now, let's "upload" local_file.txt from your local ~/project directory to the remote labex user's home directory (/home/labex). Use the put command.

sftp> put local_file.txt
Uploading local_file.txt to /home/labex/local_file.txt
local_file.txt                               100%   37     0.0KB/s   00:00
sftp>

You can verify that the file was uploaded by listing the remote directory:

sftp> ls

You should see local_file.txt listed among the files in /home/labex.

Next, let's "download" a file from the remote system. We will download the .bashrc file from the remote labex user's home directory to your local ~/project directory. Use the get command.

sftp> get .bashrc
Fetching /home/labex/.bashrc to .bashrc
/home/labex/.bashrc                          100%  193     0.2KB/s   00:00
sftp>

You can verify the download by listing your local directory:

sftp> lls

You should see .bashrc listed in your local ~/project directory.

To exit the sftp session, use the exit or bye command.

sftp> exit

You will return to your regular shell prompt.
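The same put and get commands can also be run without the interactive prompt by placing them in a batch file and passing it to sftp with the -b option. The sketch below is an illustration under stated assumptions: batch mode cannot prompt for a password, so it assumes key-based SSH authentication to labex@localhost has already been set up, which this lab does not cover. The run_sftp_batch helper is a hypothetical name.

```shell
# Write the sftp commands to a temporary batch file, one per line.
batch_file=$(mktemp)
cat > "$batch_file" <<'EOF'
put local_file.txt
get .bashrc
bye
EOF

# Hypothetical helper: run a batch file against a given user@host.
# Assumption: key-based authentication is configured, because batch
# mode does not prompt for a password.
run_sftp_batch() {
    sftp -b "$1" "$2"
}
```

From ~/project, you would call it as run_sftp_batch "$batch_file" labex@localhost. In batch mode, sftp stops at the first command that fails, which makes it suitable for scripts.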

Synchronize Files Securely with rsync

In this step, you will learn how to synchronize files between systems using the rsync command. rsync is a powerful and versatile tool for copying and synchronizing files and directories, locally and remotely. Its key advantage is its ability to transfer only the differences between files, making it highly efficient for updates. Like sftp, rsync can use SSH for secure, encrypted transfers.

The most common options for rsync include:

-a or --archive: Equivalent to -rlptgoD. It recurses into directories and preserves symbolic links, permissions, modification times, group, owner, and device and special files. It's often referred to as "archive mode" and is highly recommended for most synchronization tasks.
-v or --verbose: Increases verbosity, showing more details about the transfer.
-z or --compress: Compresses file data during the transfer, which can speed up transfers over slow links.
-h or --human-readable: Outputs numbers in a human-readable format.
-n or --dry-run: Performs a trial run without making any changes. This is extremely useful for verifying what rsync will do before actually executing the command.

Let's start by ensuring you are in your ~/project directory.

cd ~/project

We will simulate a synchronization scenario by creating a source directory and a destination directory.

Create a source directory source_dir with some files:

mkdir source_dir
echo "Content of fileA" > source_dir/fileA.txt
echo "Content of fileB" > source_dir/fileB.txt
mkdir source_dir/subdir
echo "Content of subfile1" > source_dir/subdir/subfile1.txt

Create an empty destination directory dest_dir:

mkdir dest_dir

Now, let's perform a dry run to see what rsync would do when synchronizing source_dir to dest_dir. We will use the -avh options for archive mode, verbose output, and human-readable sizes, along with -n for the dry run.

rsync -avhn source_dir/ dest_dir/

Important Note on Trailing Slashes:

source_dir/: The trailing slash means "copy the contents of source_dir".
source_dir: No trailing slash means "copy source_dir itself into the destination".

The dry run lists the files that would be transferred. The output will be similar to the following; the byte counts on the summary lines will vary:

sending incremental file list
./
fileA.txt
fileB.txt
subdir/
subdir/subfile1.txt

sent 186 bytes  received 12 bytes  396.00 bytes/sec
total size is 66  speedup is 0.33 (DRY RUN)

Notice the (DRY RUN) at the end, indicating no actual changes were made.
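The trailing-slash rule noted above is easy to check in a throwaway directory. The sketch below copies the same source twice, once with and once without the trailing slash; slash_demo, with_slash, and without_slash are hypothetical names used only for this illustration.

```shell
# Assumption: slash_demo is a scratch directory created only for this sketch.
slash_demo=$(mktemp -d)
mkdir -p "$slash_demo/source_dir" "$slash_demo/with_slash" "$slash_demo/without_slash"
echo "Content of fileA" > "$slash_demo/source_dir/fileA.txt"

# Trailing slash: the contents of source_dir land directly in with_slash.
rsync -a "$slash_demo/source_dir/" "$slash_demo/with_slash/"

# No trailing slash: source_dir itself is created inside without_slash.
rsync -a "$slash_demo/source_dir" "$slash_demo/without_slash/"

ls -R "$slash_demo/with_slash" "$slash_demo/without_slash"
```

The listing shows fileA.txt directly under with_slash, and under without_slash/source_dir in the second case.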

Now, let's perform the actual synchronization. Remove the -n option.

rsync -avh source_dir/ dest_dir/

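The file list matches the dry run, without the (DRY RUN) tag. The byte counts on the summary lines will differ on your system, and a real run sends more data than a dry run because the file contents are actually copied: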

sending incremental file list
./
fileA.txt
fileB.txt
subdir/
subdir/subfile1.txt

sent 186 bytes  received 12 bytes  396.00 bytes/sec
total size is 66  speedup is 0.33

Verify that the files have been copied to dest_dir:

ls -R dest_dir

You should see:

dest_dir:
fileA.txt  fileB.txt  subdir

dest_dir/subdir:
subfile1.txt

Now, let's modify a file in source_dir and add a new file to see rsync's efficiency.

echo "Updated content for fileA" > source_dir/fileA.txt
echo "New file content" > source_dir/new_file.txt

Perform another dry run to see what rsync will transfer this time:

rsync -avhn source_dir/ dest_dir/

The output lists only the changed and new files, plus the top-level directory entry (./). As before, the byte counts will vary:

sending incremental file list
./
fileA.txt
new_file.txt

sent 128 bytes  received 12 bytes  280.00 bytes/sec
total size is 100  speedup is 0.71 (DRY RUN)

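This shows that rsync skips files whose size and modification time are unchanged: fileB.txt and subdir/subfile1.txt are not listed, so only fileA.txt and new_file.txt will be transferred.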

Now, perform the actual synchronization again:

rsync -avh source_dir/ dest_dir/

Verify the contents of dest_dir again:

ls -R dest_dir
cat dest_dir/fileA.txt
cat dest_dir/new_file.txt

You should see new_file.txt in dest_dir and fileA.txt should contain "Updated content for fileA".
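This step synchronized two local directories. The same command also works over SSH when the destination is written as user@host:path. The sketch below wraps the command in a small helper so the destination can be either form. The sync_dir function and the dest_remote directory are hypothetical names, and the remote form assumes that rsync is installed on both ends and that you can log in to labex@localhost over SSH, as in the sftp step.

```shell
# Hypothetical helper: copy the contents of a directory to a destination.
# The destination may be a local path or a remote user@host:path, in
# which case rsync connects over SSH.
sync_dir() {
    src_dir=$1
    dest=$2
    # Normalize to exactly one trailing slash so that the contents of
    # src_dir are copied, not the directory itself.
    rsync -avh "${src_dir%/}/" "$dest"
}
```

For example, sync_dir source_dir dest_dir/ repeats the local synchronization from this step, while sync_dir source_dir labex@localhost:/home/labex/dest_remote/ would send the same files over SSH and prompt for the labex password.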

Summary

In this lab, we gained practical experience in managing and transferring files on RHEL systems using essential command-line tools. We began with the tar utility, learning how to create, list, and extract archives, including gzip- and xz-compressed archives for efficient storage and transfer.

Subsequently, we explored secure file transfer methods. We utilized sftp for interactive and secure file transfers between systems, understanding its capabilities for uploading and downloading files. Finally, we delved into rsync, a powerful tool for synchronizing files and directories, highlighting its efficiency in handling incremental updates and ensuring data consistency across different locations.