Introduction
AWK is a text-processing and data-manipulation language that is particularly useful for structured, column-oriented data. Named after its creators (Aho, Weinberger, and Kernighan), AWK excels at pattern scanning, field processing, and report generation.
Why Learn AWK?
- Process column-based data efficiently
- Perform mathematical operations on text data
- Generate formatted reports
- Transform data between different formats
- Combine with other Unix tools in pipelines
Basic AWK Syntax
The basic syntax for AWK is:
awk 'pattern { action }' input-file
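The three forms this syntax allows can be sketched as follows. This is a minimal illustration rather than one of the tutorial's own examples: the three-line input and the variable names are made up, and the input is piped in instead of being read from a file.

```shell
# Hypothetical three-line input used only for this illustration.
input='apple 3
banana 12
cherry 7'

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$input" | awk '$2 > 5')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern and action: print the first field of matching lines only.
both=$(printf '%s\n' "$input" | awk '$2 > 5 { print $1 }')

printf '%s\n' "$pattern_only" "---" "$action_only" "---" "$both"
```

Either part may be omitted: a missing pattern matches every line, and a missing action defaults to printing the whole line.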
AWK Cheat Sheet
| Concept | Description |
|---|---|
| `$0` | Entire line |
| `$1`, `$2`… | Specific fields |
| `NR` | Number of records read so far (the current line number) |
| `NF` | Number of Fields in current line |
| `FS` | Field Separator (default: whitespace) |
| `OFS` | Output Field Separator |
| `BEGIN` | Pre-processing block |
| `END` | Post-processing block |
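As a rough sketch of how these pieces fit together, the snippet below uses most of the cheat-sheet entries in one program. The colon-separated input and the variable name `summary` are assumptions made for illustration only.

```shell
# Hypothetical colon-separated input used only for this illustration.
summary=$(printf 'a:b:c\nd:e\n' | awk '
  BEGIN { FS = ":"; OFS = "-" }   # runs once, before any input is read
  { print NR, NF, $1, $NF }       # record number, field count, first and last field
  END { print "records", NR }     # runs once, after all input has been read
')
printf '%s\n' "$summary"
# Prints:
# 1-3-a-c
# 2-2-d-e
# records-2
```

Because the items passed to `print` are separated by commas, AWK joins them with `OFS`, which is why the output uses `-` rather than spaces.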
Sample Data Setup
Let’s create test files for our examples:
echo "Name,Department,Salary
Alice,Engineering,65000
Bob,Sales,58000
Carol,Marketing,62000
David,Engineering,72000" > employees.csv
echo "1 Apple 2.99
2 Banana 1.50
3 Orange 3.25
4 Grape 4.75" > prices.txt
Basic Examples
Example 1: Print Entire File
awk '{print}' employees.csv
Output:
Name,Department,Salary
Alice,Engineering,65000
Bob,Sales,58000
Carol,Marketing,62000
David,Engineering,72000
Example 2: Print Specific Fields
awk -F, '{print $1, $3}' employees.csv
Output:
Name Salary
Alice 65000
Bob 58000
Carol 62000
David 72000
Example 3: Filter Rows by Condition
awk -F, '$3 > 60000' employees.csv
Output:
Name,Department,Salary
Alice,Engineering,65000
Carol,Marketing,62000
David,Engineering,72000
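Note that the header row also passes the filter in Example 3: `Salary` is not a number, so AWK compares it to `60000` as a string, and `S` sorts after `6`. A minimal sketch of one way to exclude it, assuming the header is always the first line, is to add `NR > 1` to the pattern. The variable name `high_earners` is only for illustration, and the sample file is recreated so the snippet runs on its own.

```shell
# Recreate the sample file so this snippet is self-contained.
cat > employees.csv <<'EOF'
Name,Department,Salary
Alice,Engineering,65000
Bob,Sales,58000
Carol,Marketing,62000
David,Engineering,72000
EOF

# NR > 1 skips the header, so only data rows reach the numeric comparison.
high_earners=$(awk -F, 'NR > 1 && $3 > 60000' employees.csv)
printf '%s\n' "$high_earners"
# Prints:
# Alice,Engineering,65000
# Carol,Marketing,62000
# David,Engineering,72000
```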
Example 4: Add Line Numbers
awk '{print NR, $0}' prices.txt
Output:
1 1 Apple 2.99
2 2 Banana 1.50
3 3 Orange 3.25
4 4 Grape 4.75
Example 5: Calculate Total Salary
awk -F, 'NR>1 {sum+=$3} END {print "Total Salary:", sum}' employees.csv
Output:
Total Salary: 257000
Example 6: Formatted Report with a BEGIN Block
awk -F, 'BEGIN {print "Employee Report\n=============="}
NR==1 {print "Name: " $1; next}
{print "Name: " $1 ", Department: " $2}' employees.csv
Output:
Employee Report
==============
Name: Name
Name: Alice, Department: Engineering
Name: Bob, Department: Sales
Name: Carol, Department: Marketing
Name: David, Department: Engineering
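For reports that need aligned columns, `printf` gives more control than `print`. The following is a minimal sketch, not the original author's report: the column widths, the `Total` row, and the variable name `report` are assumptions chosen for illustration.

```shell
# Recreate the sample file so this snippet is self-contained.
cat > employees.csv <<'EOF'
Name,Department,Salary
Alice,Engineering,65000
Bob,Sales,58000
Carol,Marketing,62000
David,Engineering,72000
EOF

report=$(awk -F, '
  NR == 1 { next }                                      # skip the header row
  { printf "%-8s %-12s %6d\n", $1, $2, $3; sum += $3 }  # left-align text, right-align numbers
  END { printf "%-8s %-12s %6d\n", "Total", "", sum }   # summary row in the same layout
' employees.csv)
printf '%s\n' "$report"
```

Unlike `print`, `printf` does not append a newline automatically, so each format string ends with `\n`.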
Example 7: Field Calculations
awk '{total = $1 * $3; print $2, "total:", total}' prices.txt
Output:
Apple total: 2.99
Banana total: 3
Orange total: 9.75
Grape total: 19
Advanced Examples
Example 8: Pattern Matching with Regular Expressions
awk -F, '/Engineer/ {print $1 " is an engineer"}' employees.csv
Output:
Alice is an engineer
David is an engineer
Example 9: Using Associative Arrays (Count Department Members)
awk -F, 'NR>1 {dept[$2]++} END {for (d in dept) print d ":", dept[d]}' employees.csv
Output (the iteration order of `for (d in dept)` is unspecified, so the lines may appear in a different order):
Marketing: 1
Sales: 1
Engineering: 2
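Since the iteration order of an AWK array is not guaranteed, one common approach is to pipe the result through `sort`. The sketch below also keeps a second array to sum salaries per department. The array names `count` and `total` and the variable `dept_summary` are chosen here for illustration.

```shell
# Recreate the sample file so this snippet is self-contained.
cat > employees.csv <<'EOF'
Name,Department,Salary
Alice,Engineering,65000
Bob,Sales,58000
Carol,Marketing,62000
David,Engineering,72000
EOF

# Count staff and sum salaries per department, then sort for a stable order.
dept_summary=$(awk -F, '
  NR > 1 { count[$2]++; total[$2] += $3 }
  END { for (d in count) print d, count[d], total[d] }
' employees.csv | sort)
printf '%s\n' "$dept_summary"
# Prints:
# Engineering 2 137000
# Marketing 1 62000
# Sales 1 58000
```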
Example 10: Multi-file Processing
awk '{print FILENAME, NR, $0}' employees.csv prices.txt
Output:
employees.csv 1 Name,Department,Salary
employees.csv 2 Alice,Engineering,65000
employees.csv 3 Bob,Sales,58000
employees.csv 4 Carol,Marketing,62000
employees.csv 5 David,Engineering,72000
prices.txt 6 1 Apple 2.99
prices.txt 7 2 Banana 1.50
prices.txt 8 3 Orange 3.25
prices.txt 9 4 Grape 4.75
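As the output shows, `NR` keeps counting across files. When a per-file line number is needed, the built-in `FNR` resets to 1 at the start of each file. A minimal sketch that prints both counters for the first line of each file (the variable name `per_file` is only for illustration):

```shell
# Recreate both sample files so this snippet is self-contained.
cat > employees.csv <<'EOF'
Name,Department,Salary
Alice,Engineering,65000
Bob,Sales,58000
Carol,Marketing,62000
David,Engineering,72000
EOF
cat > prices.txt <<'EOF'
1 Apple 2.99
2 Banana 1.50
3 Orange 3.25
4 Grape 4.75
EOF

# FNR == 1 is true for the first line of every input file.
per_file=$(awk 'FNR == 1 { print FILENAME, "NR=" NR, "FNR=" FNR }' employees.csv prices.txt)
printf '%s\n' "$per_file"
# Prints:
# employees.csv NR=1 FNR=1
# prices.txt NR=6 FNR=1
```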
Built-in Functions
String Functions
awk '{print toupper($2), length($2)}' prices.txt
Output:
APPLE 5
BANANA 6
ORANGE 6
GRAPE 5
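A few other standard string functions are sketched below: `substr`, `index`, and `gsub`. The single-word input and the variable names are made up for illustration.

```shell
# Hypothetical one-word input used only for this illustration.
string_demo=$(printf 'Engineering\n' | awk '{
  print substr($1, 1, 3)       # first three characters
  print index($1, "gin")       # 1-based position of "gin", or 0 if absent
  word = $1                    # work on a copy so the field itself is untouched
  n = gsub(/e/, "E", word)     # replace every lowercase e; returns the count
  print n, word
}')
printf '%s\n' "$string_demo"
# Prints:
# Eng
# 3
# 2 EnginEEring
```

String positions in AWK start at 1, not 0, which is why `index` reports 3 here.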
Math Functions
awk '{print sqrt($3), int($3)}' prices.txt
Output:
1.72916 2
1.22474 1
1.80278 3
2.17945 4
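The square roots above are printed with AWK's default output format for non-integer numbers (`OFMT`, which defaults to `%.6g`). A minimal sketch of controlling the precision with `printf` instead, using made-up inline input and an illustrative variable name:

```shell
# Hypothetical two-line input used only for this illustration.
rounded=$(printf '2.99\n1.50\n' | awk '{ printf "%.2f %.3f\n", $1, sqrt($1) }')
printf '%s\n' "$rounded"
# Prints:
# 2.99 1.729
# 1.50 1.225
```

The `%.2f` format also keeps trailing zeros, so `1.50` is not shortened to `1.5` the way it would be after arithmetic followed by a plain `print`.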
Pro Tips
- Combine with other commands (here, print the user and CPU share of every process using more than 9% CPU):
ps aux | awk '$3 > 9 {print $1, $3}'
- Create AWK scripts:
awk -f script.awk data.txt
- Use for data transformation (assigning `$1=$1` forces AWK to rebuild the line with the new `OFS`):
awk -F, 'BEGIN {OFS="|"} {$1=$1; print}' employees.csv
Output:
Name|Department|Salary
Alice|Engineering|65000
Bob|Sales|58000
Carol|Marketing|62000
David|Engineering|72000
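To make the `awk -f` tip concrete, here is a minimal sketch of a script file. The contents of `script.awk` are hypothetical (the original does not show them), and the script is run against `employees.csv` rather than `data.txt` so the result can be checked against the sample data.

```shell
# Recreate the sample file so this snippet is self-contained.
cat > employees.csv <<'EOF'
Name,Department,Salary
Alice,Engineering,65000
Bob,Sales,58000
Carol,Marketing,62000
David,Engineering,72000
EOF

# Hypothetical script: print the names of everyone in Engineering.
cat > script.awk <<'EOF'
# Setting FS in BEGIN replaces the -F, option on the command line.
BEGIN { FS = "," }
$2 == "Engineering" { print $1 }
EOF

engineers=$(awk -f script.awk employees.csv)
printf '%s\n' "$engineers"
# Prints:
# Alice
# David
```

Keeping the program in a file avoids shell quoting issues and leaves room for comments as the logic grows.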