awk is a text processing tool on Unix/Linux used for pattern scanning and processing. It reads input one line at a time, splits each line into fields on a separator (by default, runs of spaces and tabs), and then performs actions on those fields.
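To make the field splitting concrete, here is a minimal sketch. The input lines and the variable names (default_split, colon_split) are made up for illustration; -F is the standard awk option for choosing a different field separator.

```shell
# Default splitting: runs of spaces and tabs separate the fields.
default_split=$(printf 'alice   30\tparis\n' | awk '{ print $2 }')
echo "$default_split"

# -F sets a different field separator, here a colon.
colon_split=$(printf 'alice:30:paris\n' | awk -F ':' '{ print $3 }')
echo "$colon_split"
```

The first command prints 30 and the second prints paris.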
Basic Structure
The basic syntax of an awk command is:
awk 'pattern { action }' input_file
- pattern: A condition that determines when the action is executed. If omitted, the action is applied to every line.
- action: One or more statements to execute when the pattern matches. If omitted, awk prints the matching line.
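Either part can be left out, which the following sketch tries to show. The sample data and variable names are hypothetical and exist only for this illustration.

```shell
# Hypothetical sample data: a name and a score on each line.
data='ann 72
bob 45
cy 90'

# Pattern only: with no action, awk prints each matching line in full.
pattern_only=$(printf '%s\n' "$data" | awk '$2 > 50')
echo "$pattern_only"

# Action only: with no pattern, the action runs for every line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')
echo "$action_only"
```

The first command prints the lines for ann and cy; the second prints all three names.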
Example
Here’s a simple example:
awk '{ print $1 }' filename.txt
This command prints the first field of each line in filename.txt.
Common Features
- Field Variables: $1, $2, ..., $NF represent the first, second, ..., and last fields of the current line.
- Built-in Variables:
  - NR: Number of records (lines) processed.
  - NF: Number of fields in the current record.
- String Functions: Functions like length(), toupper(), tolower(), etc., can manipulate strings.
- Control Structures: awk supports control structures like if, for, and while, allowing for complex processing.
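The features listed above can be tried out with a sketch like the one below. The sample data and the shell variable names are invented for the example; the awk variables and functions used (NR, NF, $NF, toupper, length) are the standard ones named in the list.

```shell
# Hypothetical sample data with a varying number of fields per line.
data='red apple
green pear fresh
blue'

# NR is the current line number, NF the field count, $NF the last field.
counts=$(printf '%s\n' "$data" | awk '{ print NR, NF, $NF }')
echo "$counts"

# String functions: upper-case the first field and report its length.
strings=$(printf '%s\n' "$data" | awk '{ print toupper($1), length($1) }')
echo "$strings"

# Control structures: a for loop walks the fields, an if keeps the long ones.
long_words=$(printf '%s\n' "$data" | awk '{
    for (i = 1; i <= NF; i++)
        if (length($i) > 4)
            print $i
}')
echo "$long_words"
```

For this input, the first command prints "1 2 apple", "2 3 fresh", and "3 1 blue" on separate lines; the last one prints apple, green, and fresh.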
Example with Conditions
awk '$3 > 50 { print $1, $3 }' filename.txt
This command prints the first and third fields of lines where the third field is greater than 50.
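To see the condition at work without creating a file, the same program can be fed a few lines on standard input. The data below is a hypothetical stand-in for filename.txt (name, department, score), not real content.

```shell
# Hypothetical stand-in for filename.txt: name, department, score.
sample='ann sales 72
bob ops 45
cy sales 90
dee ops 50'

# Same program as above, reading from standard input instead of a file.
high_scores=$(printf '%s\n' "$sample" | awk '$3 > 50 { print $1, $3 }')
echo "$high_scores"
```

Only ann and cy are printed. The line for dee is skipped because 50 is not greater than 50.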
Conclusion
awk is versatile and can be used for a variety of tasks, including data extraction, reporting, and transformation. Because a complete program often fits on a single command line, it is a valuable tool for shell scripting and data manipulation.
