I/O Operations in AWK
Master input and output in AWK - reading records and fields, getline, formatted printing, file redirection, and pipes with Docker-ready examples
Input and output are not a library feature bolted onto AWK — they are AWK. The entire language is built around an implicit loop that reads input one record at a time, splits it into fields, and runs your pattern-action rules against it. Where most languages make you open a file, loop over its lines, and split each one, AWK does all of that before your code even runs.
That data-driven design means “I/O” in AWK covers a lot of ground: how records and fields arrive, how to read input explicitly with getline, how to format output precisely with printf, and how to send results to files and external commands using redirection and pipes. Because AWK’s typing is dynamic and weak, values flow between numbers and strings automatically as you read and print them.
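As a small illustration of that automatic conversion, the one-liner below treats the same field first as a number and then as a string. It is only a sketch; the input text is made up for the example.

```shell
# $1 is "10": the + operator uses it as a number,
# while placing it next to "5" concatenates it as a string.
echo "10 apples" | awk '{ print $1 + 5, $1 "5" }'
# prints: 15 105
```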
In this tutorial you’ll go beyond the single print statement from Hello World and learn the full I/O toolkit: the implicit record loop, getline, formatted output, writing to multiple files, and piping output through Unix commands. Every example runs unchanged in a tiny Alpine Linux container.
Formatted Output: print vs printf
print is convenient — it joins its arguments with the output field separator (a space by default) and appends a newline. When you need column alignment or numeric precision, printf gives you C-style format control and adds no automatic newline, so you place the \n yourself.
Create a file named output.awk:
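A minimal sketch of such a script, written here as a shell heredoc so the file and the run command sit together. The literal strings and field widths are assumptions chosen to be consistent with the expected output shown later in this tutorial; the alignment padding you see depends on those widths.

```shell
cat > output.awk <<'EOF'
BEGIN {
    # print appends a newline; commas join arguments with OFS (a space by default)
    print "Line one"
    print "Multiple", "fields", "joined"

    # printf adds no newline, so each format string ends with an explicit \n
    printf "Name: %-10s Age: %3d\n", "Alice", 30
    printf "Pi is about %.2f\n", 3.14159
    printf "Hex: %x, Octal: %o\n", 255, 8
}
EOF
awk -f output.awk
```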
The format specifiers mirror C: %-10s left-justifies a string in a 10-character field, %3d right-justifies an integer in a 3-character field, %.2f prints a float with two decimals, and %x / %o print in hexadecimal and octal.
Reading Records and Fields
The heart of AWK I/O is the implicit main loop. Any rule without BEGIN or END runs once per input record (by default, once per line). AWK auto-splits each record into fields $1, $2, … and tracks NR (record number) and NF (number of fields). BEGIN runs before the first record; END runs after the last.
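To see the three kinds of rule and the NR and NF variables in isolation, a tiny sketch can help. The file name counts.awk and the two-line input are hypothetical, chosen only for this illustration.

```shell
cat > counts.awk <<'EOF'
BEGIN { print "start" }            # before the first record
{ print NR, NF, $1 }               # once per record: record number, field count, first field
END { print "records:", NR }       # after the last record
EOF
printf 'a b c\nd e\n' | awk -f counts.awk
# prints:
#   start
#   1 3 a
#   2 2 d
#   records: 2
```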
First, create the input data. Create a file named people.txt:
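The rows can be read off the expected output shown later in this tutorial. A minimal sketch, assuming the columns (name, age, role) are separated by single spaces:

```shell
cat > people.txt <<'EOF'
Alice 30 Engineer
Bob 25 Designer
Carol 35 Manager
EOF
```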
Now create a file named fields.awk:
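One way such a script might look is sketched below. The column widths and the 23-dash rule are assumptions picked to be consistent with the expected output; the sample data is recreated on the first line so the snippet runs on its own.

```shell
# Same three sample rows the tutorial uses
printf 'Alice 30 Engineer\nBob 25 Designer\nCarol 35 Manager\n' > people.txt

cat > fields.awk <<'EOF'
BEGIN {
    # Runs once, before any input is read
    printf "%-10s %3s %s\n", "NAME", "AGE", "ROLE"
    print "-----------------------"
}
{
    # Runs once per record; $1, $2, $3 are the auto-split fields
    printf "%-10s %3d %s\n", $1, $2, $3
}
END {
    # Runs once, after the last record; NR holds the final record count
    print "-----------------------"
    print "Total records: " NR
}
EOF
awk -f fields.awk people.txt
```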
Notice there is no loop and no file-open call. AWK opens people.txt, feeds it through the main rule line by line, and exposes each column as a field automatically.
Reading Input with getline
Sometimes you need to pull input yourself rather than relying on the implicit loop, for example to read a value on demand. Plain getline var reads the next record from the main input (the files named on the command line, or standard input when there are none) into var and returns 1 on success, 0 at end of input, and -1 on error. Unlike a bare getline, it leaves $0 and NF untouched.
Create a file named greet.awk:
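A minimal sketch of such a script follows. The variable name and the fallback message for empty input are assumptions; the prompt and greeting are taken from the expected output.

```shell
cat > greet.awk <<'EOF'
BEGIN {
    printf "Enter your name: "      # no trailing newline, so the reply stays on this line
    if ((getline name) > 0)         # 1 = record read, 0 = end of input, -1 = error
        print "Hello, " name "!"
    else
        print "\nNo input received."
}
EOF
echo "Ada" | awk -f greet.awk
# prints: Enter your name: Hello, Ada!
```

The parentheses around getline name keep the comparison with 0 unambiguous, since > can also mean output redirection in AWK.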
The prompt uses printf (no trailing newline) so it stays on the same line as the response. Because we pipe input into the program, this runs deterministically without any interactive typing.
Writing to Files with Redirection
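print and printf can redirect their output to a file with > "file" or >> "file". With >, AWK truncates the file once, when it first opens it; later writes to the same file during that run are added after the earlier ones. With >>, any existing content is kept and new output is appended. AWK opens the file on first use and keeps it open, so you should close() it when done, especially inside a loop or a long-running program. This makes routing records to different files trivial.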
Create a file named split_ages.awk:
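The sketch below is one way to write it, consistent with the expected output. Placing the close() calls in an END rule is an assumption about the layout; the sample data is recreated first so the snippet runs on its own.

```shell
# Same three sample rows the tutorial uses
printf 'Alice 30 Engineer\nBob 25 Designer\nCarol 35 Manager\n' > people.txt

cat > split_ages.awk <<'EOF'
$2 >= 30 { print $1 > "senior.txt" }    # age 30 and over
$2 <  30 { print $1 > "junior.txt" }    # under 30
END {
    # Close both files so their contents are flushed to disk
    close("senior.txt")
    close("junior.txt")
    print "Wrote senior.txt and junior.txt"
}
EOF
awk -f split_ages.awk people.txt
cat senior.txt junior.txt
```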
Here the patterns $2 >= 30 and $2 < 30 decide which file each name lands in — the pattern-action model doubles as your I/O routing logic.
Piping Output to Commands
AWK integrates seamlessly with Unix pipelines. print ... | "command" sends output into the standard input of an external command, letting you offload work like sorting or formatting. As with files, close the pipe with close() to flush it and let the command finish.
Create a file named report.awk:
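A minimal sketch of such a script, assuming the same people.txt data (recreated here so the snippet runs on its own). Note that close() must be given exactly the same command string that was used after the pipe symbol.

```shell
# Same three sample rows the tutorial uses
printf 'Alice 30 Engineer\nBob 25 Designer\nCarol 35 Manager\n' > people.txt

cat > report.awk <<'EOF'
{
    # Age first, so sort -n can order the lines numerically
    print $2, $1 | "sort -n"
}
END {
    # Closing the pipe sends end-of-input to sort and waits for it to finish
    close("sort -n")
}
EOF
awk -f report.awk people.txt
```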
This reformats each record to put age first, then streams every line through sort -n so the final output is ordered numerically by age — all without a temporary file.
Running with Docker
Alpine Linux ships with AWK built in, so no extra setup is needed. Note the -i flag on the getline example: it keeps standard input open so the container can receive piped data.
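One way to run the examples is through a small shell wrapper around docker run. This is a sketch under stated assumptions, not necessarily the tutorial's exact commands: the helper name awk_in_docker, the alpine:latest image tag, and the /work mount point are all illustrative.

```shell
# Hypothetical helper: run awk in a throwaway Alpine container with the
# current directory mounted at /work (image tag and mount path are assumptions).
# -i keeps standard input open, which the getline example needs; it is
# harmless for the other scripts.
awk_in_docker() {
    docker run --rm -i -v "$PWD":/work -w /work alpine:latest awk "$@"
}

# Usage, from the directory holding the scripts and people.txt:
#   awk_in_docker -f output.awk
#   awk_in_docker -f fields.awk people.txt
#   echo "Ada" | awk_in_docker -f greet.awk
#   awk_in_docker -f split_ages.awk people.txt && cat senior.txt junior.txt
#   awk_in_docker -f report.awk people.txt
```

Because the working directory is mounted into the container, files written by split_ages.awk appear on the host, so the cat command can run outside Docker.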
Expected Output
Formatted output (output.awk):
Line one
Multiple fields joined
Name: Alice Age: 30
Pi is about 3.14
Hex: ff, Octal: 10
Reading records and fields (fields.awk):
NAME AGE ROLE
-----------------------
Alice 30 Engineer
Bob 25 Designer
Carol 35 Manager
-----------------------
Total records: 3
Reading input with getline (greet.awk):
Enter your name: Hello, Ada!
Writing to files (split_ages.awk, then cat senior.txt junior.txt):
Wrote senior.txt and junior.txt
Alice
Carol
Bob
Piping output to a command (report.awk):
25 Bob
30 Alice
35 Carol
Key Concepts
- I/O is the paradigm: AWK’s implicit main loop reads input record by record and auto-splits it into fields; you rarely open or loop over a file manually.
- print vs printf: print adds a newline and joins arguments with OFS; printf gives C-style format control and adds no newline of its own.
- getline reads on demand: plain getline var pulls the next record from the main input and returns 1 (success), 0 (end of input), or -1 (error).
- Redirection routes output: print > "file" truncates and print >> "file" appends; the file stays open until you close() it.
- Pipes connect to Unix: print | "command" streams output into another program’s standard input, making AWK a natural pipeline citizen.
- Always close(): closing files and pipes flushes buffers and lets downstream commands complete, which matters inside loops and long-running scripts.
- Patterns drive I/O: because conditions like $2 >= 30 are patterns, your I/O destinations can be chosen declaratively by the pattern-action model itself.