Leveraging AWK Conditions and Filters
One of AWK's most useful features is its ability to apply conditions and filters to the data it processes. You can act only on the lines that meet specific criteria, which makes AWK well suited to data manipulation and analysis.
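The simplest AWK condition is a pattern placed in front of an action block: the action runs only for lines where the pattern is true. Inside an action block, AWK also supports if-else statements similar to those found in other programming languages. Here's an example that prints the third field of a CSV file only if the first field matches "John":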
$ cat data.csv
John,Doe,35,New York
Jane,Doe,30,Los Angeles
Bob,Smith,45,Chicago
$ awk -F, '$1 == "John" {print $3}' data.csv
35
In this example, $1 == "John" is the condition, which checks whether the first field of each line is equal to "John". If the condition is true, the action {print $3} is executed, printing the third field.
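The same kind of test can also be written as an explicit if-else statement inside an action block, which is useful when every line should produce some output. The following is a minimal sketch, assuming the sample data.csv shown earlier (recreated here so the snippet runs on its own); the age threshold of 35 and the output wording are arbitrary choices for illustration:

```shell
# Recreate the sample file used in this section.
cat > data.csv <<'EOF'
John,Doe,35,New York
Jane,Doe,30,Los Angeles
Bob,Smith,45,Chicago
EOF

# No pattern before the block, so the action runs for every line;
# the if-else inside it decides what to print for each one.
awk -F, '{
    if ($3 >= 35)
        print $1, "is 35 or older"
    else
        print $1, "is under 35"
}' data.csv
# John is 35 or older
# Jane is under 35
# Bob is 35 or older
```

The comma in print separates the two values with the output field separator, which is a single space by default.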
AWK also provides logical operators, && (and), || (or), and ! (not), that can be used to build more complex conditions. For example, you can print the third field if the first field is "John" and the second field is "Doe":
$ awk -F, '$1 == "John" && $2 == "Doe" {print $3}' data.csv
35
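The other two operators work the same way. As a rough sketch, again assuming the sample data.csv from earlier (recreated here so the snippet is self-contained), the first command uses || to match either of two names and the second uses ! to select lines whose second field is not "Doe":

```shell
# Recreate the sample file used in this section.
cat > data.csv <<'EOF'
John,Doe,35,New York
Jane,Doe,30,Los Angeles
Bob,Smith,45,Chicago
EOF

# OR: print name and age when the first field is "John" or "Bob".
awk -F, '$1 == "John" || $1 == "Bob" {print $1, $3}' data.csv
# John 35
# Bob 45

# NOT: print name and city when the second field is not "Doe".
awk -F, '!($2 == "Doe") {print $1, $4}' data.csv
# Bob Chicago
```

The parentheses around $2 == "Doe" make it explicit that ! negates the whole comparison; writing $2 != "Doe" would be an equivalent alternative here.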
Patterns in AWK act as filters that select which lines of text are processed. BEGIN and END are special patterns: they do not match any input line, and their actions run before the first line is processed and after the last line is processed, respectively. Here's an example that prints a header before the data is printed:
$ awk -F, 'BEGIN {print "Name,Age,City"} {print $1","$3","$4}' data.csv
Name,Age,City
John,35,New York
Jane,30,Los Angeles
Bob,45,Chicago
In this example, the BEGIN {print "Name,Age,City"} part of the script is executed before the first line of the file is processed, printing the header. The {print $1","$3","$4} part of the script is then executed for each line of the file, printing the first, third, and fourth fields separated by commas.
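An END block is typically used to report something accumulated while the lines were being read. The sketch below assumes the same sample data.csv (recreated here so it runs standalone); the variable names total and count and the "Average age" label are illustrative choices, not part of the original example:

```shell
# Recreate the sample file used in this section.
cat > data.csv <<'EOF'
John,Doe,35,New York
Jane,Doe,30,Los Angeles
Bob,Smith,45,Chicago
EOF

awk -F, '
BEGIN { print "Name,Age" }                        # runs once, before any input
      { print $1 "," $3; total += $3; count++ }   # runs for every line
END   { if (count > 0) printf "Average age: %.1f\n", total / count }  # runs once, after all input
' data.csv
# Name,Age
# John,35
# Jane,30
# Bob,45
# Average age: 36.7
```

The count > 0 check avoids a division by zero if the input file happens to be empty.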
By combining conditions, logical operators, and the BEGIN and END patterns, you can write compact text-processing scripts that automate a wide range of data manipulation tasks.