Squeezing Characters with tr
Another useful feature of the tr command is its ability to "squeeze" repeated characters, replacing consecutive occurrences of the same character with a single instance. This functionality is particularly valuable when dealing with text that contains excessive whitespace or other repeated characters.
The Squeeze Option in tr
To squeeze repeated characters using the tr command, you use the -s option followed by the set of characters you want to squeeze:
tr -s SET1
Where SET1 is the set of characters you want to squeeze.
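As a quick illustration of the syntax above, the following sketch squeezes only the characters listed in the set and leaves every other repeated character alone. The sample string and the variable name squeezed are made up for this example.

```shell
# Squeeze runs of 'e' and 'o' only; 'kk', the double space and '000' are not in the set.
squeezed=$(printf 'bookkeeper  1000\n' | tr -s 'eo')
echo "$squeezed"   # prints: bokkeper  1000
```

Because the space is not part of the set here, the two spaces survive, which shows that tr -s acts only on the characters you name.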
Creating a Sample Text File with Excessive Whitespace
Let's create a sample file with excessive whitespace that we can use to practice:
echo 'Error:   Too    much     whitespace.' > ~/project/whitespace.txt
This command creates a file named whitespace.txt in the ~/project directory (/home/labex/project in the lab environment). The text in the file contains runs of consecutive spaces between the words.
Squeezing Spaces in the Text
Now, let's use the tr command with the -s option to squeeze multiple spaces into single spaces:
tr -s ' ' < ~/project/whitespace.txt
When you run this command, you should see the following output:
Error: Too much whitespace.
Notice that the multiple spaces between words have been reduced to single spaces, making the text more readable.
Understanding the Command
Let's break down what happened:
- tr -s ' ' instructs the command to squeeze repeated occurrences of a space character into a single space.
- The < symbol redirects the content of ~/project/whitespace.txt as input to the tr command.
- The result is displayed on the terminal but not saved to the file.
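The points above can be checked with a small sketch. It assumes a scratch file created with mktemp (a hypothetical stand-in for whitespace.txt, so the lab files are left alone) and shows that tr writes the squeezed text to standard output while the file on disk keeps its original spacing.

```shell
# Hypothetical scratch file standing in for ~/project/whitespace.txt.
tmp=$(mktemp)
printf 'Error:   Too    much     whitespace.\n' > "$tmp"

# tr reads the file through stdin and writes the squeezed text to stdout.
squeezed=$(tr -s ' ' < "$tmp")
echo "$squeezed"

# The file itself is untouched: it still contains the runs of spaces.
original=$(cat "$tmp")
echo "$original"

rm -f "$tmp"
```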
Saving the Output to a New File
If you want to save the text with squeezed spaces to a new file, you can use output redirection:
tr -s ' ' < ~/project/whitespace.txt > ~/project/clean_whitespace.txt
To verify the content of the new file, use the cat command:
cat ~/project/clean_whitespace.txt
You should see:
Error: Too much whitespace.
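One pitfall worth noting: redirecting the output to the same file you are reading from does not work, because the shell empties the output file before tr reads it. A minimal sketch of a safer pattern, assuming a scratch directory from mktemp and a hypothetical .tmp suffix for the intermediate file:

```shell
# Hypothetical scratch copy so the lab files are left alone.
tmp_dir=$(mktemp -d)
file="$tmp_dir/whitespace.txt"
printf 'Error:   Too    much     whitespace.\n' > "$file"

# Avoid: tr -s ' ' < "$file" > "$file"
# The shell truncates the output file first, so tr would read an empty file.

# Write to an intermediate file, then replace the original only if tr succeeded.
tr -s ' ' < "$file" > "$file.tmp" && mv "$file.tmp" "$file"

result=$(cat "$file")
echo "$result"
rm -rf "$tmp_dir"
```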
Combining tr Operations
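The tr command can combine translation and squeezing, with one caveat. When -s is given together with two sets, tr translates first and then squeezes repeats of the characters in the second set, so tr -s 'a-z' 'A-Z' would collapse doubled letters rather than spaces. tr also accepts at most two sets, so adding a third set such as ' ' is rejected as an extra operand. To convert to uppercase and squeeze spaces, pipe one tr command into another:
tr 'a-z' 'A-Z' < ~/project/whitespace.txt | tr -s ' '
The first tr converts all lowercase letters to uppercase, and the second squeezes multiple spaces into single spaces.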
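To see how -s behaves when two sets are supplied, here is a small sketch comparing it with a two-stage pipeline. The sample string and the variable names both and piped are made up for this illustration.

```shell
# With -s and two sets, tr translates and then squeezes repeats of characters
# in the second set (here the uppercase letters). The spaces are not squeezed.
both=$(printf 'aabb  cc\n' | tr -s 'a-z' 'A-Z')
echo "$both"    # prints: AB  C

# Translating first and squeezing spaces in a second tr keeps doubled letters.
piped=$(printf 'aabb  cc\n' | tr 'a-z' 'A-Z' | tr -s ' ')
echo "$piped"   # prints: AABB CC
```

The pipeline is the safer choice whenever the characters to squeeze differ from the characters being translated.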
Creating a More Complex Example
Let's create a more complex example to practice with:
echo 'log   entry:    error  code   404    not  found' > ~/project/complex.txt
Now, let's use tr to convert all letters to uppercase and squeeze spaces:
tr 'a-z' 'A-Z' < ~/project/complex.txt | tr -s ' ' > ~/project/processed_complex.txt
To see the result:
cat ~/project/processed_complex.txt
You should see:
LOG ENTRY: ERROR CODE 404 NOT FOUND
Now you've learned how to use the tr command to squeeze repeated characters in text. This, combined with the translation and deletion capabilities you learned earlier, gives you a powerful toolkit for text manipulation in Linux.