How do `sort` and `uniq` work together?

The sort and uniq commands are commonly combined to remove duplicate lines from a text file. Each one handles a separate step:

  1. Sorting: The sort command arranges the lines of a file in order (by default, lexicographically according to the current locale). This matters because uniq only compares each line with the one immediately before it, so it only removes adjacent duplicates. Sorting the file first guarantees that identical lines end up next to each other.

  2. Removing Duplicates: The uniq command then reads the sorted output and collapses each run of identical adjacent lines into a single line, so every distinct line appears exactly once.
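
To see why the sorting step matters, here is a minimal sketch that runs uniq with and without sort on the same input. The variable names (sample, unsorted_result, sorted_result) are made up for illustration, and the input is fed through a pipe rather than a file so nothing on disk is touched.

```shell
# Sample input with duplicates that are NOT adjacent (illustrative data).
sample=$(printf '%s\n' apple banana apple orange banana kiwi)

# uniq alone: no two identical lines are neighbours, so nothing is removed.
unsorted_result=$(printf '%s\n' "$sample" | uniq)

# sort first: identical lines become neighbours and uniq collapses each run.
sorted_result=$(printf '%s\n' "$sample" | sort | uniq)

printf 'uniq only:\n%s\n\nsort | uniq:\n%s\n' "$unsorted_result" "$sorted_result"
```

The first result still contains all six lines, while the second contains only the four distinct ones.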

Example Usage

Suppose you have a file named data.txt with the following content:

apple
banana
apple
orange
banana
kiwi

You can use the following command to sort the file and remove duplicates:

sort data.txt | uniq

Output

The output will be:

apple
banana
kiwi
orange
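
As a side note, sort has a -u option that drops duplicates while sorting, which should give the same result as the pipeline when you only need the distinct lines. The sketch below compares the two on the same sample data; the variable names are hypothetical.

```shell
# Same sample lines as in data.txt, supplied inline (illustrative data).
sample=$(printf '%s\n' apple banana apple orange banana kiwi)

# Two ways to get the distinct lines in sorted order.
via_uniq=$(printf '%s\n' "$sample" | sort | uniq)
via_sort_u=$(printf '%s\n' "$sample" | sort -u)

# Report whether both approaches produced identical output.
if [ "$via_uniq" = "$via_sort_u" ]; then
    echo "same output"
else
    echo "outputs differ"
fi
```

The separate uniq step is still needed when you want its other options, such as -c for counting.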

Counting Occurrences

You can also count how many times each unique line appears by using the -c option with uniq:

sort data.txt | uniq -c

Output

The output will show the count of each unique line:

  2 apple
  2 banana
  1 kiwi
  1 orange
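
If you want the most frequent lines first, one common approach is to pipe the counts into a second sort that orders numerically (-n) in reverse (-r). This is a sketch on the same sample data with a hypothetical variable name (ranked). The amount of padding before each count can differ between uniq implementations, and the order of lines with equal counts is not something to rely on.

```shell
# Same sample lines as in data.txt, supplied inline (illustrative data).
sample=$(printf '%s\n' apple banana apple orange banana kiwi)

# Count each distinct line, then sort the counts from highest to lowest.
ranked=$(printf '%s\n' "$sample" | sort | uniq -c | sort -rn)

printf '%s\n' "$ranked"
```

Lines that occur twice (apple and banana) are listed before lines that occur once (kiwi and orange).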

This combination is a quick way to deduplicate lines or tally how often each line occurs, using only standard command-line tools.
