How does the 'uniq' command handle duplicate lines in text files?

The uniq command in Linux is used to filter out duplicate lines in text files. Here's how it works:

  1. Basic Functionality: By default, uniq collapses consecutive duplicate lines into a single occurrence. Only adjacent duplicates are detected; repeated lines that are separated by other lines are left untouched.
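A quick way to see the adjacent-only behavior (the file name /tmp/fruits.txt is just a throwaway example):

```shell
# Create a sample file where "apple" repeats both adjacently and non-adjacently.
printf 'apple\napple\nbanana\napple\n' > /tmp/fruits.txt

# Only the two consecutive "apple" lines are collapsed into one.
uniq /tmp/fruits.txt
# apple
# banana
# apple
```

The final "apple" survives because it is not adjacent to the first run of duplicates.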

  2. Sorting Requirement: To remove all duplicates regardless of their position in the file, you typically need to sort the file first using the sort command. For example:

    sort input.txt | uniq

  3. Options:
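To make this concrete (input.txt here is a hypothetical file with interleaved duplicates):

```shell
# Duplicates are not adjacent, so plain uniq would miss them.
printf 'b\na\nb\na\n' > /tmp/input.txt

# Sorting first groups identical lines together, so uniq removes all duplicates.
sort /tmp/input.txt | uniq
# a
# b
```

Note that `sort -u` produces the same result in a single step, though the `sort | uniq` pipeline is still useful when you need uniq's extra options such as `-c`.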

    • -c: Counts occurrences of each unique line and displays the count.
    • -d: Shows only the duplicate lines.
    • -u: Displays only unique lines (lines that appear exactly once).
    • -i: Ignores case when comparing lines.
    • -f N: Skips the first N fields when comparing lines.
    • -s N: Skips the first N characters when comparing lines.
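The most commonly combined options above can be demonstrated with a small sample file (/tmp/pets.txt is just an illustrative name):

```shell
# Sample data: "cat" appears twice (adjacently), "dog" and "bird" once each.
printf 'cat\ncat\ndog\nbird\n' > /tmp/pets.txt

# -c: prefix each line with its occurrence count (exact column padding
# varies between implementations).
sort /tmp/pets.txt | uniq -c

# -d: print only lines that appear more than once consecutively.
uniq -d /tmp/pets.txt
# cat

# -u: print only lines that appear exactly once.
uniq -u /tmp/pets.txt
# dog
# bird
```

Pairing `sort | uniq -c` with a second `sort -rn` is a classic one-liner for frequency-ranking lines, e.g. counting the most common entries in a log file.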

These features make uniq a powerful tool for data processing and analysis in text files.
