Intermediate

I/O Operations in AWK

Master input and output in AWK - reading records and fields, getline, formatted printing, file redirection, and pipes with Docker-ready examples

Input and output are not a library feature bolted onto AWK — they are AWK. The entire language is built around an implicit loop that reads input one record at a time, splits it into fields, and runs your pattern-action rules against it. Where most languages make you open a file, loop over its lines, and split each one, AWK does all of that before your code even runs.

That data-driven design means “I/O” in AWK covers a lot of ground: how records and fields arrive, how to read input explicitly with getline, how to format output precisely with printf, and how to send results to files and external commands using redirection and pipes. Because AWK’s typing is dynamic and weak, values flow between numbers and strings automatically as you read and print them.

In this tutorial you’ll go beyond the single print statement from Hello World and learn the full I/O toolkit: the implicit record loop, getline, formatted output, writing to multiple files, and piping output through Unix commands. Every example runs unchanged in a tiny Alpine Linux container.
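As a quick illustration of the implicit loop and the automatic string-to-number conversion described above, here is a minimal sketch. The two-line input and the variable name total are made up for the example; nothing here comes from the tutorial's own files.

```shell
# Hypothetical two-line input; no file is opened or looped over explicitly
out=$(printf 'apples 3\npears 4\n' | awk '
    { total += $2 }                 # $2 arrives as text and is used as a number
    END { print "total:", total }   # the number is turned back into text for output
')
printf '%s\n' "$out"                # prints: total: 7
```

The main rule runs once per line without any loop in the program, and total never needs a declaration or a cast.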

Formatted Output: print vs printf

print is convenient: it joins its arguments with the output field separator OFS (a space by default) and appends the output record separator ORS (a newline by default). When you need column alignment or numeric precision, printf gives you C-style format control and adds no automatic newline, so you place the \n yourself.

Create a file named output.awk:

BEGIN {
    # print separates arguments with OFS (a space) and adds a newline
    print "Line one"
    print "Multiple", "fields", "joined"

    # printf gives precise control; you supply the newline
    printf "Name: %-10s Age: %3d\n", "Alice", 30
    printf "Pi is about %.2f\n", 3.14159
    printf "Hex: %x, Octal: %o\n", 255, 8
}

The format specifiers mirror C: %-10s left-justifies a string in a 10-character field, %3d right-justifies an integer in a 3-character field, %.2f prints a float with two decimals, and %x / %o print in hexadecimal and octal.
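One detail that is easy to miss is that print only inserts the separator between comma-separated arguments. The sketch below, which uses made-up values, changes OFS to make that visible and adds one more printf line with width, precision, and zero-padding combined.

```shell
out=$(awk 'BEGIN {
    OFS = "-"                  # separator that print inserts between arguments
    print "a", "b", "c"        # commas: arguments are joined with OFS
    print "a" "b" "c"          # no commas: plain concatenation, OFS is not used
    printf "%5.1f|%-5s|%05d\n", 3.14159, "ab", 42
}')
printf '%s\n' "$out"
# prints:
# a-b-c
# abc
#   3.1|ab   |00042
```

In the last line, %5.1f right-justifies the rounded value in five characters, %-5s pads the string on the right, and %05d pads the integer with leading zeros.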

Reading Records and Fields

The heart of AWK I/O is the implicit main loop. Any rule without BEGIN or END runs once per input record (by default, once per line). AWK auto-splits each record into fields $1, $2, … and tracks NR (record number) and NF (number of fields). BEGIN runs before the first record; END runs after the last.

First, create the input data. Create a file named people.txt:

Alice 30 Engineer
Bob 25 Designer
Carol 35 Manager

Now create a file named fields.awk:

# BEGIN runs once, before any input is read
BEGIN {
    printf "%-8s %-4s %s\n", "NAME", "AGE", "ROLE"
    print "-----------------------"
}

# This block runs for every input record (line)
{
    printf "%-8s %-4s %s\n", $1, $2, $3
}

# END runs once, after all input is consumed
END {
    print "-----------------------"
    print "Total records:", NR
}

Notice there is no loop and no file-open call. AWK opens people.txt, feeds it through the main rule line by line, and exposes each column as a field automatically.
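NF and the last field deserve a short demonstration of their own, since fields.awk only prints NR. The following sketch assumes a hypothetical input whose second line is missing a column.

```shell
out=$(printf 'Alice 30 Engineer\nBob 25\n' | awk '{
    # NR = current record number, NF = field count, $NF = the last field
    print NR ": " NF " fields, last = " $NF
}')
printf '%s\n' "$out"
# prints:
# 1: 3 fields, last = Engineer
# 2: 2 fields, last = 25
```

Because NF is recomputed for every record, $NF always refers to the final column even when lines have different widths.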

Reading Input with getline

Sometimes you need to pull input yourself rather than relying on the implicit loop, for example to read a value on demand. Plain getline var reads the next record from the main input stream (the files named on the command line, or standard input when there are none) into var, increments NR, and returns 1 on success, 0 at end of input, and -1 on error.

Create a file named greet.awk:

BEGIN {
    printf "Enter your name: "
    if ((getline name) > 0)
        printf "Hello, %s!\n", name
}

The prompt uses printf (no trailing newline) so it stays on the same line as the response. Because we pipe input into the program, this runs deterministically without any interactive typing.
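To make the return values concrete, the sketch below reads a hypothetical two-line input with getline and keeps the status of each call. The variable names status, first, and name are chosen for the example.

```shell
out=$(printf 'Ada\nGrace\n' | awk 'BEGIN {
    status = (getline first)                 # 1: one record was read into first
    print status, first
    while ((status = (getline name)) > 0)    # stop on 0 (end) or -1 (error)
        print "also read:", name
    print "final status:", status
}')
printf '%s\n' "$out"
# prints:
# 1 Ada
# also read: Grace
# final status: 0
```

Comparing against 0 explicitly is the safer habit: a bare while (getline name) treats -1 as true, so an error would turn the loop into an endless one.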

Writing to Files with Redirection

print and printf can redirect their output to a file with > "file" (truncate) or >> "file" (append). The truncation happens only once, when AWK first opens the file; later writes to the same name in the same run are added after the earlier ones. AWK keeps the file open, so you should close() it when done, especially inside a loop or a long-running program. This makes routing records to different files trivial.

Create a file named split_ages.awk:

# Route each person to a file based on their age
$2 >= 30 { print $1 > "senior.txt" }
$2 <  30 { print $1 > "junior.txt" }

END {
    close("senior.txt")
    close("junior.txt")
    print "Wrote senior.txt and junior.txt"
}

Here the patterns $2 >= 30 and $2 < 30 decide which file each name lands in — the pattern-action model doubles as your I/O routing logic.
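The difference between > and >> only shows up when a file is opened, which the following sketch tries to make visible. The temporary file and the awk variable log_file are assumptions for the example and are not part of the tutorial's scripts.

```shell
logfile=$(mktemp)
printf 'stale line\n' > "$logfile"     # pre-existing content that should disappear

awk -v log_file="$logfile" 'BEGIN {
    print "first"  > log_file    # file is opened here and truncated once
    print "second" > log_file    # same open file: written after "first"
    close(log_file)
    print "third" >> log_file    # reopened in append mode, nothing is lost
    close(log_file)
}'

result=$(cat "$logfile")
printf '%s\n' "$result"
# prints:
# first
# second
# third
rm -f "$logfile"
```

Had the third print used > after the close(), the file would have been truncated again and only "third" would remain.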

Piping Output to Commands

AWK integrates seamlessly with Unix pipelines. print ... | "command" sends output into the standard input of an external command, letting you offload work like sorting or formatting. As with files, close the pipe with close() to flush it and let the command finish.

Create a file named report.awk:

# Send "age name" pairs into an external sort command
{
    print $2, $1 | "sort -n"
}

END {
    close("sort -n")
}

This reformats each record to put age first, then streams every line through sort -n so the final output is ordered numerically by age — all without a temporary file.
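close() identifies a pipe by the exact command string that opened it, so one common way to keep the two in sync is to hold the command in a variable. This is a sketch under that assumption; the sort options and the inline input are chosen for the example.

```shell
out=$(printf 'Alice 30\nBob 25\nCarol 35\n' | awk '
BEGIN { cmd = "sort -k2,2nr" }    # example choice: highest age first
      { print $1, $2 | cmd }      # every record goes into the same pipe
END   { close(cmd) }              # same string, so the same pipe is closed
')
printf '%s\n' "$out"
# prints:
# Carol 35
# Alice 30
# Bob 25
```

If the string passed to close() differs from the one used with print, even by a space, the pipe is not closed by that call and only finishes when the program exits.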

Running with Docker

Alpine Linux ships with an AWK implementation built in (BusyBox awk), so no extra setup is needed. Note the -i flag on the getline example: it keeps standard input open so the container can receive piped data.

# Pull the official image
docker pull alpine:latest

# 1. Formatted output
docker run --rm -v "$(pwd)":/app -w /app alpine:latest awk -f output.awk

# 2. Reading records and fields from a file
docker run --rm -v "$(pwd)":/app -w /app alpine:latest awk -f fields.awk people.txt

# 3. getline reading from standard input (-i keeps stdin open)
echo "Ada" | docker run --rm -i -v "$(pwd)":/app -w /app alpine:latest awk -f greet.awk

# 4. Writing to multiple files, then inspect them
docker run --rm -v "$(pwd)":/app -w /app alpine:latest awk -f split_ages.awk people.txt
docker run --rm -v "$(pwd)":/app -w /app alpine:latest cat senior.txt junior.txt

# 5. Piping output through an external command
docker run --rm -v "$(pwd)":/app -w /app alpine:latest awk -f report.awk people.txt

Expected Output

Formatted output (output.awk):

Line one
Multiple fields joined
Name: Alice      Age:  30
Pi is about 3.14
Hex: ff, Octal: 10

Reading records and fields (fields.awk):

NAME     AGE  ROLE
-----------------------
Alice    30   Engineer
Bob      25   Designer
Carol    35   Manager
-----------------------
Total records: 3

Reading input with getline (greet.awk):

Enter your name: Hello, Ada!

Writing to files (split_ages.awk, then cat senior.txt junior.txt):

Wrote senior.txt and junior.txt
Alice
Carol
Bob

Piping output to a command (report.awk):

25 Bob
30 Alice
35 Carol

Key Concepts

  • I/O is the paradigm: AWK’s implicit main loop reads input record by record and auto-splits it into fields; you rarely open or loop over a file manually.
  • print vs printf: print adds a newline and joins arguments with OFS; printf gives C-style format control and adds no newline of its own.
  • getline reads on demand: plain getline var pulls the next record from the main input (standard input when no files are named) and returns 1 (success), 0 (end of input), or -1 (error).
  • Redirection routes output: print > "file" truncates and print >> "file" appends; the file stays open until you close() it.
  • Pipes connect to Unix: print | "command" streams output into another program’s standard input, making AWK a natural pipeline citizen.
  • Always close(): closing files and pipes flushes buffers and lets downstream commands complete, which matters inside loops and long-running scripts.
  • Patterns drive I/O: because conditions like $2 >= 30 are patterns, your I/O destinations can be chosen declaratively by the pattern-action model itself.

Running Today

All examples can be run using Docker:

docker pull alpine:latest