Introduction
Welcome to the Linux File Comparison lab. In modern software development environments, comparing files is an essential skill for tracking changes, debugging issues, and maintaining code integrity. As a system administrator or developer, you frequently need to identify differences between configuration files, code versions, or data files.
In this lab, you will learn to use the diff command - a powerful Linux utility for comparing files line by line. The diff tool helps you identify exactly what has changed between file versions, which is crucial when updating configurations, reviewing code changes, or troubleshooting problems.
By mastering file comparison techniques, you'll be able to efficiently manage file versions, create patches, and ensure consistency across your development environments. This fundamental skill is valuable for anyone working with code, configuration files, or any text-based data that changes over time.
Understanding the diff Command
The diff command is a fundamental Linux utility used to compare the contents of files line by line. In this step, you will learn the basic syntax of the diff command and how to compare two simple text files.
Let's start by ensuring the diff utility is installed on your system. Open a terminal in the /home/labex/project directory and execute:
which diff
You should see output similar to:
/usr/bin/diff
This confirms that the diff command is available. If for any reason it's not installed, you could install it with:
sudo apt-get update && sudo apt-get install -y diffutils
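If you want to perform this check from a script rather than by eye, a minimal sketch using the shell's command -v builtin could look like this. The variable name diff_path is only illustrative and is not part of the lab.

```shell
# command -v prints the location of diff and returns a non-zero
# status when the command cannot be found on the PATH.
if diff_path=$(command -v diff); then
    echo "diff found at: $diff_path"
else
    echo "diff is not installed" >&2
fi
```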
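Now, let's create two simple text files to compare. The commands below write into /home/labex/project/files; if that directory does not exist yet, create it first with mkdir -p /home/labex/project/files. The files represent configuration settings: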
echo "## Configuration File for Robot Arm" > /home/labex/project/files/config1.txt
echo "motor_speed = 100" >> /home/labex/project/files/config1.txt
echo "acceleration = 20" >> /home/labex/project/files/config1.txt
echo "max_rotation = 180" >> /home/labex/project/files/config1.txt
Now create a second file with a small difference:
echo "## Configuration File for Robot Arm" > /home/labex/project/files/config2.txt
echo "motor_speed = 120" >> /home/labex/project/files/config2.txt
echo "acceleration = 20" >> /home/labex/project/files/config2.txt
echo "max_rotation = 180" >> /home/labex/project/files/config2.txt
Let's view both files to understand their contents:
cat /home/labex/project/files/config1.txt
This displays:
## Configuration File for Robot Arm
motor_speed = 100
acceleration = 20
max_rotation = 180
Now view the second file:
cat /home/labex/project/files/config2.txt
This displays:
## Configuration File for Robot Arm
motor_speed = 120
acceleration = 20
max_rotation = 180
Now, let's use the diff command to compare these two files:
diff /home/labex/project/files/config1.txt /home/labex/project/files/config2.txt
You should see output similar to:
2c2
< motor_speed = 100
---
> motor_speed = 120
This output tells us:
- 2c2 means that line 2 in the first file needs to be changed (c) to match line 2 in the second file
- < indicates the line from the first file
- > indicates the line from the second file
- The line with --- separates the two versions
The difference between the files is that the motor_speed value changed from 100 to 120.
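diff also reports its result through its exit status, which is handy in scripts: 0 means the files are identical, 1 means they differ, and 2 means something went wrong (for example, a missing file). The sketch below uses throwaway files in a temporary directory; the file names and the status_* variables are illustrative and not part of the lab.

```shell
# Work in a scratch directory so the lab files are left untouched.
tmp=$(mktemp -d)
printf 'motor_speed = 100\n' > "$tmp/a.txt"
printf 'motor_speed = 100\n' > "$tmp/same.txt"
printf 'motor_speed = 120\n' > "$tmp/b.txt"

# Record the exit status of each comparison; the normal output is discarded.
diff "$tmp/a.txt" "$tmp/same.txt"   > /dev/null 2>&1 && status_same=0 || status_same=$?
diff "$tmp/a.txt" "$tmp/b.txt"      > /dev/null 2>&1 && status_diff=0 || status_diff=$?
diff "$tmp/a.txt" "$tmp/no_such.txt" > /dev/null 2>&1 && status_err=0 || status_err=$?

echo "identical: $status_same, different: $status_diff, error: $status_err"
```

With GNU diff this should print identical: 0, different: 1, error: 2.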
Using Advanced diff Options
In the previous step, you used the basic diff command to compare two files. Now, let's explore some advanced options that make the output more readable and useful in different scenarios.
The Unified Format (-u option)
The unified format shows the differences in a more context-aware format and is widely used in software development. The -u option displays several lines of context around the differences.
Let's use the -u option to compare our files:
diff -u /home/labex/project/files/config1.txt /home/labex/project/files/config2.txt
You should see output similar to:
--- /home/labex/project/files/config1.txt 2023-01-01 00:00:00.000000000 +0000
+++ /home/labex/project/files/config2.txt 2023-01-01 00:00:00.000000000 +0000
@@ -1,4 +1,4 @@
 ## Configuration File for Robot Arm
-motor_speed = 100
+motor_speed = 120
 acceleration = 20
 max_rotation = 180
In this format:
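- Lines starting with - (minus) are in the first file but not in the second
- Lines starting with + (plus) are in the second file but not in the first
- Lines without a - or + marker are unchanged context
- The header shows which files are being compared
- The @@ -1,4 +1,4 @@ section indicates the line numbers being displayed: -1,4 means the hunk starts at line 1 of the first file and covers 4 lines, and +1,4 means the same for the second file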
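To see how the @@ header relates to the context lines, you can request a different amount of context with the -U option. The following is a small sketch that uses scratch files with made-up names; with zero context lines only the changed line remains, and the header shrinks accordingly.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'motor_speed = 100' 'acceleration = 20' 'max_rotation = 180' > "$tmp/old.txt"
printf '%s\n' 'motor_speed = 100' 'acceleration = 25' 'max_rotation = 180' > "$tmp/new.txt"

# -U0 requests zero lines of context around each change.
# diff exits with status 1 when the files differ, so "|| true" keeps
# scripts that run under "set -e" from stopping here.
hunks=$(diff -U0 "$tmp/old.txt" "$tmp/new.txt" || true)
echo "$hunks"
```

After the two header lines, the output should contain the hunk header @@ -2 +2 @@ followed by only the removed and added acceleration lines.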
The Side-by-Side Format (-y option)
The side-by-side format shows both files in parallel columns, making it easier to visualize differences:
diff -y /home/labex/project/files/config1.txt /home/labex/project/files/config2.txt
The output should look like:
## Configuration File for Robot Arm     ## Configuration File for Robot Arm
motor_speed = 100                     | motor_speed = 120
acceleration = 20                       acceleration = 20
max_rotation = 180                      max_rotation = 180
In this view:
- The | character in the middle indicates that the lines differ
- Lines that are identical appear in both columns without any marker
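For longer files, side-by-side output can become wide and noisy. GNU diff offers --suppress-common-lines to hide identical rows and -W to set the total output width. The sketch below assumes GNU diffutils and uses scratch files with illustrative names.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'motor_speed = 100' 'acceleration = 20' > "$tmp/left.txt"
printf '%s\n' 'motor_speed = 120' 'acceleration = 20' > "$tmp/right.txt"

# -y: side by side, -W 60: limit the output to 60 columns,
# --suppress-common-lines: print only the rows that differ.
changed=$(diff -y -W 60 --suppress-common-lines "$tmp/left.txt" "$tmp/right.txt" || true)
echo "$changed"
```

Only the motor_speed row should be printed, with the | marker between the two values.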
Ignoring White Space (-w option)
Sometimes you only want to compare the content without considering white space differences. The -w option ignores all white space when comparing lines.
Let's create a file with different spacing:
echo "## Configuration File for Robot Arm" > /home/labex/project/files/config3.txt
echo "motor_speed = 100 " >> /home/labex/project/files/config3.txt
echo "acceleration = 20" >> /home/labex/project/files/config3.txt
echo "max_rotation = 180" >> /home/labex/project/files/config3.txt
Now let's compare it with the first file, first without and then with the -w option:
diff /home/labex/project/files/config1.txt /home/labex/project/files/config3.txt
Because line 2 of config3.txt ends with an extra space, diff reports that line as changed even though the two versions look the same on screen. Now try:
diff -w /home/labex/project/files/config1.txt /home/labex/project/files/config3.txt
With the -w option, diff should show no differences since the only variations are in white space.
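A related option is -b, which ignores changes in the amount of white space but, unlike -w, still notices white space that appears where there was none before. The following sketch contrasts the two; the file names and result variables are illustrative.

```shell
tmp=$(mktemp -d)
printf 'motor_speed = 100\n'   > "$tmp/base.txt"
printf 'motor_speed  =  100\n' > "$tmp/wide.txt"    # extra spaces between tokens
printf 'motor_speed=100\n'     > "$tmp/tight.txt"   # no spaces at all

# -q only reports whether the files differ; the exit status tells us the result.
diff -q -b "$tmp/base.txt" "$tmp/wide.txt"  > /dev/null && b_wide=same  || b_wide=differ
diff -q -b "$tmp/base.txt" "$tmp/tight.txt" > /dev/null && b_tight=same || b_tight=differ
diff -q -w "$tmp/base.txt" "$tmp/tight.txt" > /dev/null && w_tight=same || w_tight=differ

echo "-b, extra spaces: $b_wide"
echo "-b, no spaces:    $b_tight"
echo "-w, no spaces:    $w_tight"
```

Here -b treats the file with extra spaces as equal but flags the one with no spaces, while -w accepts both.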
These advanced options make diff more versatile for different use cases and file types. By combining options, you can customize the output to suit your specific needs.
Creating and Applying Patch Files
Patch files are a way to distribute changes to text files. They contain the differences between two versions of a file, which can be applied to transform one version into another. This is especially useful when you need to share code changes with others or update configuration files across multiple systems.
Creating a Patch File
Let's create a patch file that captures the differences between our config1.txt and config2.txt files:
diff -u /home/labex/project/files/config1.txt /home/labex/project/files/config2.txt > /home/labex/project/files/config.patch
This command creates a patch file called config.patch using the unified diff format. Let's examine the contents of this patch file:
cat /home/labex/project/files/config.patch
You should see output similar to what you saw earlier with the diff -u command:
--- /home/labex/project/files/config1.txt 2023-01-01 00:00:00.000000000 +0000
+++ /home/labex/project/files/config2.txt 2023-01-01 00:00:00.000000000 +0000
@@ -1,4 +1,4 @@
 ## Configuration File for Robot Arm
-motor_speed = 100
+motor_speed = 120
 acceleration = 20
 max_rotation = 180
Applying a Patch File
Now, let's create a copy of config1.txt and apply the patch to update it:
cp /home/labex/project/files/config1.txt /home/labex/project/files/config1_copy.txt
To apply the patch, we use the patch command:
patch /home/labex/project/files/config1_copy.txt < /home/labex/project/files/config.patch
You should see output indicating that the patch was successfully applied:
patching file /home/labex/project/files/config1_copy.txt
Let's verify that the patched file now matches config2.txt:
cat /home/labex/project/files/config1_copy.txt
The output should be identical to config2.txt:
## Configuration File for Robot Arm
motor_speed = 120
acceleration = 20
max_rotation = 180
Let's confirm there are no differences between the patched file and config2.txt:
diff /home/labex/project/files/config1_copy.txt /home/labex/project/files/config2.txt
If there's no output, it means the files are identical, confirming that the patch was applied correctly.
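Before changing a file you care about, it can help to preview a patch and to know how to undo it. GNU patch provides --dry-run for the former and -R (reverse) for the latter. The sketch below assumes the patch command is installed and works on scratch copies with illustrative names, so the lab files are not touched.

```shell
tmp=$(mktemp -d)
printf '%s\n' 'motor_speed = 100' 'acceleration = 20' > "$tmp/old.txt"
printf '%s\n' 'motor_speed = 120' 'acceleration = 20' > "$tmp/new.txt"
# diff exits with status 1 when the files differ, which is expected here.
diff -u "$tmp/old.txt" "$tmp/new.txt" > "$tmp/change.patch" || true
cp "$tmp/old.txt" "$tmp/target.txt"

# Preview: report what would happen without modifying target.txt.
patch --dry-run "$tmp/target.txt" < "$tmp/change.patch"

# Apply the patch for real, then check that target.txt now matches new.txt.
applied=no
patch "$tmp/target.txt" < "$tmp/change.patch"
cmp -s "$tmp/target.txt" "$tmp/new.txt" && applied=yes

# Undo it again with -R and check that target.txt is back to old.txt.
reverted=no
patch -R "$tmp/target.txt" < "$tmp/change.patch"
cmp -s "$tmp/target.txt" "$tmp/old.txt" && reverted=yes

echo "applied: $applied, reverted: $reverted"
```

The same patch file therefore serves both to roll a change forward and to roll it back.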
Creating More Complex Patch Files
Let's create a more complex patch by modifying multiple lines in a new file:
cp /home/labex/project/files/config1.txt /home/labex/project/files/config4.txt
Now overwrite the copy with several changes (the first command uses >, so it replaces the copied contents):
echo "## Updated Configuration File for Robot Arm" > /home/labex/project/files/config4.txt
echo "motor_speed = 150" >> /home/labex/project/files/config4.txt
echo "acceleration = 25" >> /home/labex/project/files/config4.txt
echo "max_rotation = 270" >> /home/labex/project/files/config4.txt
echo "safety_limit = enabled" >> /home/labex/project/files/config4.txt
Now create a patch file for these changes:
diff -u /home/labex/project/files/config1.txt /home/labex/project/files/config4.txt > /home/labex/project/files/complex.patch
Let's look at this more complex patch:
cat /home/labex/project/files/complex.patch
You should see a single hunk headed @@ -1,4 +1,5 @@ in which all four original lines are removed (-) and five new lines are added (+), including the new safety_limit setting.
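One quick way to summarize a unified patch is to count its removed and added lines. The sketch below recreates the two configurations in a scratch directory (the temporary paths are illustrative) and uses grep to count lines that start with a single - or +, which skips the --- and +++ header lines. Treat it as a rough heuristic: it would miscount files whose own lines begin with - or +.

```shell
tmp=$(mktemp -d)
printf '%s\n' '## Configuration File for Robot Arm' 'motor_speed = 100' \
    'acceleration = 20' 'max_rotation = 180' > "$tmp/config1.txt"
printf '%s\n' '## Updated Configuration File for Robot Arm' 'motor_speed = 150' \
    'acceleration = 25' 'max_rotation = 270' 'safety_limit = enabled' > "$tmp/config4.txt"

# diff exits with status 1 when the files differ, which is expected here.
diff -u "$tmp/config1.txt" "$tmp/config4.txt" > "$tmp/complex.patch" || true

# Count body lines only: one leading - or + followed by a different character.
removed=$(grep -c '^-[^-]' "$tmp/complex.patch")
added=$(grep -c '^+[^+]' "$tmp/complex.patch")
echo "removed: $removed, added: $added"
```

For these two files the result should be removed: 4, added: 5.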
Patches are an efficient way to distribute changes and keep track of modifications to files. They are widely used in software development for sharing code changes, creating updates, and managing configurations.
Comparing Directories and Using Other Comparison Tools
In addition to comparing individual files, Linux provides tools for comparing entire directories and offers alternative comparison tools that may be better suited for certain scenarios.
Comparing Directories with diff
The diff command can also compare directories by using the -r (recursive) option.
Let's create two directories with some files to compare:
mkdir -p /home/labex/project/dir1
mkdir -p /home/labex/project/dir2
## Create files in the first directory
echo "This is file 1" > /home/labex/project/dir1/file1.txt
echo "This is file 2" > /home/labex/project/dir1/file2.txt
echo "This is file 3" > /home/labex/project/dir1/file3.txt
## Create similar files in the second directory with some differences
echo "This is file 1 - modified" > /home/labex/project/dir2/file1.txt
echo "This is file 2" > /home/labex/project/dir2/file2.txt
## Note: file3.txt is missing from dir2
echo "This is a new file" > /home/labex/project/dir2/file4.txt
Now, let's compare these directories:
diff -r /home/labex/project/dir1 /home/labex/project/dir2
You should see output similar to:
diff -r /home/labex/project/dir1/file1.txt /home/labex/project/dir2/file1.txt
1c1
< This is file 1
---
> This is file 1 - modified
Only in /home/labex/project/dir1: file3.txt
Only in /home/labex/project/dir2: file4.txt
This output shows:
- The content difference in file1.txt
- file3.txt exists only in dir1
- file4.txt exists only in dir2
- file2.txt is identical in both directories (so no difference is reported)
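When comparing large directory trees, the full line-by-line output can be overwhelming. Adding -q (brief) to -r reports only which files differ, not how. The sketch below rebuilds the same layout in a scratch directory; the temporary paths are illustrative.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/dir1" "$tmp/dir2"
echo "This is file 1" > "$tmp/dir1/file1.txt"
echo "This is file 2" > "$tmp/dir1/file2.txt"
echo "This is file 3" > "$tmp/dir1/file3.txt"
echo "This is file 1 - modified" > "$tmp/dir2/file1.txt"
echo "This is file 2" > "$tmp/dir2/file2.txt"
echo "This is a new file" > "$tmp/dir2/file4.txt"

# -r: recurse into subdirectories, -q: report only that files differ.
# diff exits with status 1 when differences exist, hence "|| true".
summary=$(diff -rq "$tmp/dir1" "$tmp/dir2" || true)
echo "$summary"
```

The summary should contain one "Files ... differ" line for file1.txt and one "Only in" line each for file3.txt and file4.txt.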
Using the diff3 Command
When you need to compare three files (for example, when merging changes from multiple sources), you can use the diff3 command.
Let's create a third configuration file with its own changes:
echo "## Configuration File for Robot Arm" > /home/labex/project/files/config5.txt
echo "motor_speed = 100" >> /home/labex/project/files/config5.txt
echo "acceleration = 30" >> /home/labex/project/files/config5.txt
echo "max_rotation = 180" >> /home/labex/project/files/config5.txt
Now use diff3 to compare all three files:
diff3 /home/labex/project/files/config1.txt /home/labex/project/files/config2.txt /home/labex/project/files/config5.txt
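The output format of diff3 is a bit more complex. Each block of differences begins with a ==== line; a trailing digit (for example ====2) names the one file that differs when the other two agree, while a bare ==== means all three differ. Within a block, prefixes such as 1:, 2:, and 3: identify which file each line range belongs to. Seeing how each file differs from the others is useful for resolving merge conflicts.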
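diff3 can also produce a merged result with the -m option when it is given the two edited versions and their common ancestor, in the order mine, older, yours. The sketch below uses a slightly longer, made-up configuration so that the two edits are well separated; edits on overlapping or neighboring lines may instead be reported as conflicts.

```shell
tmp=$(mktemp -d)
# Common ancestor (hypothetical five-line configuration).
printf '%s\n' 'motor_speed = 100' 'acceleration = 20' 'max_rotation = 180' \
    'grip_force = 5' 'safety_limit = enabled' > "$tmp/base.txt"
# One edit changes the first line...
sed 's/motor_speed = 100/motor_speed = 120/' "$tmp/base.txt" > "$tmp/mine.txt"
# ...and the other changes the last line.
sed 's/safety_limit = enabled/safety_limit = disabled/' "$tmp/base.txt" > "$tmp/yours.txt"

# -m writes the merged file to standard output; exit status 0 means no conflicts.
diff3 -m "$tmp/mine.txt" "$tmp/base.txt" "$tmp/yours.txt" > "$tmp/merged.txt" \
    && merge_status=clean || merge_status=conflict
echo "merge status: $merge_status"
cat "$tmp/merged.txt"
```

The merged file should carry both edits: the new motor_speed from one version and the new safety_limit from the other.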
Using the colordiff Command
The colordiff utility is a wrapper for diff that produces the same output but with colored syntax highlighting, making it easier to read.
Let's first install colordiff:
sudo apt-get update && sudo apt-get install -y colordiff
Now compare our files using colordiff:
colordiff /home/labex/project/files/config1.txt /home/labex/project/files/config2.txt
The output will be similar to the regular diff command but with color highlighting for added, removed, and changed lines.
Using the wdiff Command
The wdiff (word diff) command compares files on a word-by-word basis rather than line by line, which can be more useful for prose or documentation.
Let's install wdiff:
sudo apt-get update && sudo apt-get install -y wdiff
Let's create two files with sentence changes:
echo "The robot arm moves quickly and efficiently." > /home/labex/project/files/sentence1.txt
echo "The robot arm moves slowly but efficiently." > /home/labex/project/files/sentence2.txt
Now compare them with wdiff:
wdiff /home/labex/project/files/sentence1.txt /home/labex/project/files/sentence2.txt
You should see output highlighting the changed words:
The robot arm moves [-quickly and-] {+slowly but+} efficiently.
The different comparison tools in Linux serve various purposes and scenarios:
- diff for general file comparison
- diff -r for directory comparison
- diff3 for three-way comparison
- colordiff for color-highlighted output
- wdiff for word-by-word comparison
By choosing the appropriate tool for your specific needs, you can make file comparison more effective and efficient.
Summary
In this lab, you have learned how to effectively use file comparison tools in Linux, focusing on the versatile diff command. Here are the key skills you have acquired:
- Basic File Comparison: You learned how to use the basic diff command to identify differences between text files, helping you quickly spot changes in configuration files and code.
- Advanced Diff Options: You explored various options like unified format (-u), side-by-side comparison (-y), and ignoring white space (-w), each serving different comparison needs.
- Patch Files: You created and applied patch files, a crucial skill for distributing changes, updating systems, and contributing to software projects.
- Directory Comparison: You used the recursive option (-r) to compare entire directories, helping you identify differences across multiple files simultaneously.
- Alternative Comparison Tools: You were introduced to specialized tools like diff3 for three-way comparisons, colordiff for color-highlighted output, and wdiff for word-by-word comparison.
These file comparison skills are fundamental for system administration, software development, and configuration management. They allow you to track changes, debug issues, maintain version control, and ensure consistency across systems.
By mastering these tools, you have gained valuable capabilities that will enhance your efficiency when working with text files in any Linux environment.



