AWK is a programming language built for pattern-driven text processing. It was created at Bell Labs in 1977, and its name comes from the surnames of its three authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
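# Print the first and third fields of each line.
awk '{print $1, $3}' filename
# Print labeled name and salary fields from a comma-separated file.
awk -F',' '{print "Name: " $1, "Salary: " $3}' data.csv
# Print every block of lines from a start_pattern match through the next end_pattern match.
awk '/start_pattern/,/end_pattern/' filename
# Print the fields of each line in reverse order (printf gets a fixed format, so a % in the data is harmless).
awk -F',' '{for(i=NF;i>=1;i--) printf "%s ", $i; print ""}' filename
# Replace every space with a tab.
awk '{gsub(/ /,"\t"); print}' input.txt > output.txt
# Print lines that have more than three fields.
awk 'NF > 3' filename
# Add up the size column of ls -l and report the total in KB.
ls -l /path/to/directory | awk '{total += $5} END {print "Total Size:", total/1024, "KB"}'
# Print lines whose third field is greater than 50.
awk '$3 > 50 {print $0}' filename
# Print, then count, lines whose fourth field is empty.
awk -F',' '$4 == ""' filename
awk -F',' '$4 == ""' filename | wc -l
# Print, then count, the second occurrence of each repeated value in field 4 (one line per duplicated value).
awk -F ',' '{if (++seen[$4] == 2) print}' filename
awk -F ',' '{if (++seen[$4] == 2) print}' filename | wc -l
# Count the lines in a file.
awk 'END {print NR}' filename
# Print lines that match a pattern.
awk '/pattern/ {print $0}' filename
# List the distinct values of the first field.
awk '{print $1}' filename | sort | uniq
# Print each distinct line once, keeping the original order.
awk '!seen[$0]++' filename.csv
# Print only the repeat occurrences of lines already seen.
awk 'seen[$0]++' filename.csv
# Replace old_text with new_text on every line.
awk '{gsub(/old_text/, "new_text"); print}' filename
# Show the user, PID, and command columns of ps aux in aligned columns.
ps aux | awk '{printf "%-10s %-10s %-20s\n", $1, $2, $11}'
# Print the username, UID, and shell from /etc/passwd.
awk -F: '{print "Username: " $1, "UID: " $3, "Shell: " $NF}' /etc/passwd
# Sum the second field over lines where it is an unsigned integer.
awk '{if ($2 ~ /^[0-9]+$/) sum += $2} END {print "Sum:", sum + 0}' filename
# Strip double quotes from the first field of every file in a directory and save the sorted unique values.
awk -F',' '{gsub(/"/, "", $1); print $1}' some_directory/* | sort | uniq > clean.csv
# Split ledger_test.csv into lines absent from, and lines present in, product_test.csv.
awk 'FNR==NR {seen[$0]=1; next} !seen[$0]' product_test.csv ledger_test.csv > not_found_test.txt
awk 'FNR==NR {seen[$0]=1; next} seen[$0]' product_test.csv ledger_test.csv > found_test.txt
# Count lines whose fourth field is empty or only spaces (count + 0 prints 0 when nothing matches).
awk -F',' '$4 ~ /^ *$/ {count++} END {print count + 0}' "test.csv"
# Extract column 4: sorted, sorted and distinct, and only the values that occur more than once.
awk -F',' '{print $4}' filename.csv | sort > new_file.txt
awk -F',' '{print $4}' filename.csv | sort | uniq > uniques.txt
awk -F',' '{print $4}' filename.csv | sort | uniq -d > duplicates.txt
# Count blank values in column 4 of filename.csv.
awk -F',' '$4 ~ /^ *$/ {count++} END {print count + 0}' filename.csv
# Sum, maximum, minimum, and mean of column n; n=1 is only an example value (an unset n would make $n mean $0).
awk -v n=1 '{ sum += $n } END { print sum }' data.txt
awk -v n=1 'NR == 1 { max = $n } { if ($n > max) max = $n } END { print max }' data.txt
awk -v n=1 'NR == 1 { min = $n } { if ($n < min) min = $n } END { print min }' data.txt
awk -v n=1 '{ sum += $n } END { if (NR) print sum / NR }' data.txt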
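The four column-statistics one-liners above can be folded into a single pass over the data. The following is a minimal sketch under stated assumptions: the three input records are invented for illustration, and the variable name `col` is a hypothetical stand-in for the column number.

```shell
# Sample records are made up for illustration; column 2 holds the numbers.
stats=$(printf '%s\n' 'a 10' 'b 30' 'c 20' |
  awk -v col=2 '
    NR == 1 { min = max = $col }          # seed min and max from the first record
    { sum += $col
      if ($col < min) min = $col
      if ($col > max) max = $col }
    # Guard against empty input before dividing by NR.
    END { if (NR > 0) printf "sum=%d min=%d max=%d mean=%.2f\n", sum, min, max, sum / NR }
  ')
echo "$stats"
```

Passing the column with `-v` keeps the awk program itself unchanged when a different column is needed.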
| Operation | Operators | Example | Meaning |
|---|---|---|---|
| assignment | = += -= *= /= %= ^= | x *= 2 | x = x * 2 |
| conditional | ?: | x ? y : z | If x is true, then y; else z |
| logical OR | \|\| | x \|\| y | 1 if x or y is true; 0 otherwise |
| logical AND | && | x && y | 1 if x and y are true; 0 otherwise |
| array membership | in | i in a | 1 if a[i] exists; 0 otherwise |
| matching | ~ !~ | $1 ~ /x/ | 1 if the first field contains an x; 0 otherwise |
| relational | < <= > >= == != | x == y | 1 if x equals y; 0 otherwise |
| concatenation | – | "a" "bc" | "abc"; there is no explicit concatenation operator |
| add, subtract | + - | x + y | Sum of x and y |
| multiply, divide, mod | * / % | x % y | Remainder of x divided by y |
| unary plus and minus | + - | -x | Negative x |
| logical NOT | ! | !$1 | 1 if $1 is zero or null; 0 otherwise |
| exponentiation | ^ | x ^ y | x^y |
| increment, decrement | ++ -- | ++x, x++ | Add 1 to x |
| field | $ | $i + 1 | Value of the ith field, plus 1 |
| grouping | ( ) | ($i)++ | Add 1 to the value of the ith field |
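A few of the operators in this table are easy to misread, so here is a short sketch that exercises them inside a `BEGIN` block. The variable names (`s`, `a`, `has`, `m`, `x`) are arbitrary and chosen only for this illustration.

```shell
ops=$(awk 'BEGIN {
  s = "a" "bc"                      # concatenation by juxtaposition
  a["k"] = 1
  has = ("k" in a) ? "yes" : "no"   # array membership plus the conditional operator
  m = (s ~ /^ab/)                   # a match expression yields 1 or 0
  x = 7; x %= 4                     # compound assignment: x is now 3
  print s, has, m, x, 2 ^ 10
}')
echo "$ops"
```

Parenthesizing `"k" in a` and the match expression keeps the precedence explicit.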
| Variable | Meaning | Default |
|---|---|---|
| ARGC | Number of command line arguments | – |
| ARGV | Array of command line arguments | – |
| FILENAME | Name of current input file | – |
| FNR | Record number in current file | – |
| FS | Controls the input field separator | one space |
| NF | Number of fields in current record | – |
| NR | Number of records read so far | – |
| OFMT | Output format for numbers | %.6g |
| OFS | Output field separator | one space |
| ORS | Output record separator | \n |
| RLENGTH | Length of string matched by match function | – |
| RS | Controls the input record separator | \n |
| RSTART | Start of string matched by match function | – |
| SUBSEP | Subscript separator | \034 |
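To see several of these variables at work, the sketch below sets `FS` and `OFS` in a `BEGIN` block and prints `NR`, `NF`, the first field, and the last field for each record. The two input records are invented for illustration.

```shell
# Input is made up: two comma-separated records of different lengths.
vars=$(printf 'alice,30,nyc\nbob,25\n' |
  awk 'BEGIN { FS = ","; OFS = " | " }   # input and output field separators
       { print NR, NF, $1, $NF }')       # record number, field count, first and last field
echo "$vars"
```

Because `print` joins its comma-separated arguments with `OFS`, changing that one variable reformats every output line.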
| Character | Description |
|---|---|
| \ | Used in an escape sequence to match a special symbol (e.g., \t matches a tab and \* matches * literally) |
| ^ | Matches the beginning of a string |
| $ | Matches the end of a string |
| . | Matches any single character |
| [ABDU] | Matches either character A, B, D, or U; may include ranges like [a-eB-R] |
| A\|B | Matches A or B |
| DF | Matches D immediately followed by an F |
| R* | Matches zero or more Rs |
| R+ | Matches one or more Rs |
| R? | Matches a null string or R |
| NR==10, NR==25 | Range pattern, not a regular expression: matches every record from the 10th through the 25th |
| \b | Backspace |
| \f | Form feed |
| \n | Newline (line feed) |
| \r | Carriage return |
| \t | Tab |
| \ddd | Octal value ddd, where ddd is 1 to 3 digits between 0 and 7 |
| \c | Any other character literally (e.g., \\ for backslash, \" for ", \* for *, and so on) |
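The sketch below tries a handful of the metacharacters from this table, plus a range pattern, against four invented input lines. The labels printed before each match (`star:`, `escaped:`, and so on) are only there to show which rule fired.

```shell
# Input lines are made up to trigger one rule each.
matched=$(printf '%s\n' 'DF' 'DRRF' 'A*B' 'xyz' |
  awk '/^DR*F$/     { print "star:", $0 }      # D, zero or more Rs, then F
       /A\*B/       { print "escaped:", $0 }   # backslash makes * literal
       /^[x-z]+$/   { print "class:", $0 }     # one or more characters in the range x-z
       NR==2, NR==3 { print "range:", $0 }')   # range pattern over record numbers
echo "$matched"
```

Rules run in program order for every record, so a record that satisfies two patterns is printed twice.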
| Operator | Description |
|---|---|
| < | Less than |
| <= | Less than or equal to |
| == | Equal to |
| != | Not equal to |
| >= | Greater than or equal to |
| > | Greater than |
| ~ | Matched by (used when comparing strings) |
| !~ | Not matched by (used when comparing strings) |
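One detail worth seeing in action is that `==` compares numerically when a field looks like a number and the other operand is numeric, but compares as strings when the other operand is a string constant. The following sketch, using a single invented input line, shows that alongside the two match operators.

```shell
cmp=$(echo '10.0 abc' | awk '{
  print ($1 == 10)      # numeric comparison: the field looks like a number
  print ($1 == "10")    # string comparison: "10.0" differs from "10"
  print ($2 ~ /^a/)     # matched by
  print ($2 !~ /^a/)    # not matched by
}')
echo "$cmp"
```

Each comparison is wrapped in parentheses so that `print` treats it as a single expression.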
| Variable | Meaning |
|---|---|
| r | Represents a regular expression |
| s and t | Represent string expressions |
| n and p | Integers |
| Function | Description |
|---|---|
| gsub(r,s) | Substitute s for r globally in $0; return number of substitutions made |
| gsub(r,s,t) | Substitute s for r globally in string t; return number of substitutions made |
| index(s,t) | Return the first position of string t in s, or 0 if t is not present |
| length(s) | Return the number of characters in s |
| match(s,r) | Test whether s contains a substring matched by r; return index or 0; sets RSTART and RLENGTH |
| split(s,a) | Split s into array 'a' on FS; return the number of fields |
| split(s,a,fs) | Split s into array 'a' on the field separator fs; return the number of fields |
| sprintf(fmt,expr-list) | Return expr-list formatted according to the format string fmt |
| sub(r,s) | Substitute s for the leftmost longest substring of $0 matched by r; return the number of substitutions made |
| sub(r,s,t) | Substitute s for the leftmost longest substring of t matched by r; return the number of substitutions made |
| substr(s,p) | Return the suffix of s starting at position p |
| substr(s,p,n) | Return the substring of s of length n starting at position p |
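As a rough illustration of how these functions combine, the sketch below pulls apart an invented string. The sample text and the variable names (`s`, `d`, `pos`, `word`, `t`, `c`, `date`) are assumptions made for this example only.

```shell
funcs=$(awk 'BEGIN {
  s = "2022-07-11 release notes"
  n = split(substr(s, 1, 10), d, "-")      # d[1]="2022", d[2]="07", d[3]="11"
  pos = index(s, "release")                # position of the first occurrence
  if (match(s, /[a-z]+/)) word = substr(s, RSTART, RLENGTH)
  t = s; c = gsub(/e/, "E", t)             # gsub edits t in place and returns the count
  date = sprintf("%s/%s/%s", d[2], d[3], d[1])
  print n, d[3], pos, word, length(s), c
  print date
}')
echo "$funcs"
```

Copying `s` into `t` before calling `gsub` leaves the original string untouched, since `sub` and `gsub` modify their target in place.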
Last Updated: July 11, 2022