Functions in AWK
Learn how to define and use functions in AWK - parameters, return values, variable scope, recursion, and built-in functions with Docker-ready examples
AWK is built around the pattern-action model, but as programs grow you need a way to package reusable logic. AWK gives you two kinds of functions: a rich set of built-in functions for string and math work, and user-defined functions you write yourself. Together they let you keep pattern-action rules small and readable while pushing the real work into named, testable pieces.
User-defined functions arrived in the 1985 revision of AWK and have been part of every implementation since - including the lightweight BusyBox AWK that ships in Alpine Linux. They look familiar if you know C, but AWK adds one famous quirk: there is no local keyword. Instead, extra parameters serve as local variables, a convention that surprises newcomers but quickly becomes second nature.
In this tutorial you’ll learn how to define functions, pass parameters and return values, manage the boundary between local and global scope, write recursive functions, and combine your own functions with AWK’s built-ins inside real record processing.
Defining and Calling Functions
A user-defined function uses the function keyword, a name, a parameter list, and a body in braces. The return statement sends a value back to the caller. Functions can be defined anywhere in the program - before or after the rules that use them.
Create a file named functions.awk:
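One way this file might look, as a minimal sketch rather than a definitive listing: greet and max are the functions discussed in the notes that follow, while the area function and the literal arguments are assumptions chosen to reproduce the Expected Output section. The program is wrapped in a shell here-document so it can be pasted straight into a terminal.

```shell
cat > functions.awk <<'EOF'
# Build a greeting by placing strings next to each other (concatenation).
function greet(name) {
    return "Hello, " name "!"
}

# Return the larger of two values using the ternary operator.
function max(a, b) {
    return (a > b) ? a : b
}

# Assumed helper: area of a rectangle.
function area(width, height) {
    return width * height
}

BEGIN {
    print greet("AWK")
    print "Max of 7 and 12: " max(7, 12)
    print "Area of 4 x 5: " area(4, 5)
}
EOF
```

Because the program contains only a BEGIN block, it can be run with awk -f functions.awk and no input file.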
A few things to notice:
- greet builds its result with string concatenation - in AWK you just place values adjacent to each other, no + operator.
- max uses the ternary operator ?: as a compact if/else.
- There is no return-type declaration. AWK is dynamically and weakly typed, so a function can return a string in one call and a number in another.

Tip: When calling a user-defined function, there must be no space between the function name and the opening parenthesis (max(7, 12), not max (7, 12)). A space can confuse AWK into reading it as string concatenation.
Local vs Global Scope
This is the single most important thing to understand about AWK functions. Any variable that is not in the parameter list is global - shared by every function and every rule. To get a local variable, you list it as an extra parameter that the caller simply never passes. By convention these “local” parameters are separated from the real ones with extra spaces.
Create a file named scope.awk:
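A sketch of what this file might contain, assuming the function names bump_total and add_tax and the parameter list described in this section. The amounts (10, 10, 100) and the 8% rate are assumptions picked to match the Expected Output section, as is the ternary that prints the word undefined for an empty value.

```shell
cat > scope.awk <<'EOF'
# total is not in the parameter list, so it is a global variable.
function bump_total(n) {
    total += n
}

# rate and result are "local": extra parameters the caller never passes.
function add_tax(amount,    rate, result) {
    rate = 0.08                       # assumed tax rate
    result = amount + amount * rate
    return result
}

BEGIN {
    bump_total(10)
    bump_total(10)
    print "Total (global): " total
    print "Price with tax: " add_tax(100)
    # The global rate was never assigned, so it is still empty here.
    print "rate outside function: " (rate == "" ? "undefined" : rate)
}
EOF
```

The call add_tax(100) supplies only amount, which leaves rate and result free to act as scratch variables inside the function.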
Here total is global, so the two calls to bump_total accumulate to 20. Inside add_tax, rate is a local parameter: the value assigned there never escapes the function, so the global rate referenced in BEGIN is still unset and the script reports it as undefined (an unset AWK variable evaluates to the empty string "", or to 0 in a numeric context).
Convention: The extra spaces before local parameters (amount,    rate, result) are purely cosmetic, but they signal intent: everything after the gap is a working variable, not something the caller supplies.
Recursion
Because each call gets its own copies of its scalar parameters, AWK functions can call themselves. Recursion is the natural way to express problems like factorials and Fibonacci numbers.
Create a file named recursion.awk:
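A minimal sketch of such a file, assuming the function names factorial and fib used in this section. The base cases and the way the separator is printed are assumptions chosen to reproduce the Expected Output section.

```shell
cat > recursion.awk <<'EOF'
# n! = n * (n-1)!, with 0! and 1! both equal to 1.
function factorial(n) {
    if (n <= 1)
        return 1
    return n * factorial(n - 1)
}

# fib(0) = 0, fib(1) = 1, fib(n) = fib(n-1) + fib(n-2).
function fib(n) {
    if (n < 2)
        return n
    return fib(n - 1) + fib(n - 2)
}

BEGIN {
    print "5! = " factorial(5)
    print "10! = " factorial(10)
    print "First 10 Fibonacci numbers:"
    for (i = 0; i < 10; i++)
        printf "%s%d", (i > 0 ? " " : ""), fib(i)   # no newline here
    print ""                                         # finish the line
}
EOF
```

This doubly recursive fib is fine for small inputs, but the number of calls grows exponentially with n, so an iterative loop is a better fit for large values.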
factorial multiplies on the way back up the call stack; fib branches into two recursive calls. The final print "" emits a newline after the printf loop, which deliberately leaves no newline of its own.
Built-in Functions
AWK ships with a large library of built-in functions so you rarely need to reinvent common operations. You can freely mix them with your own functions.
Create a file named builtins.awk:
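A sketch of how this file might exercise the built-ins covered in this section. The sample string "Code Archaeology" and the numeric arguments are inferred from the Expected Output section, and the variable names text, count, words, and id are assumptions.

```shell
cat > builtins.awk <<'EOF'
BEGIN {
    text = "Code Archaeology"

    # String functions
    print "Length: " length(text)
    print "Uppercase: " toupper(text)
    print "Lowercase: " tolower(text)
    print "Substring: " substr(text, 1, 4)    # positions start at 1

    # split fills the words array and returns the number of pieces
    count = split(text, words, " ")
    print "Word count: " count
    print "First word: " words[1]

    # Math functions
    print "Square root of 144: " sqrt(144)
    print "Truncated 7.9: " int(7.9)

    # sprintf returns the formatted string instead of printing it
    id = sprintf("[%05d]", 42)
    print "Formatted ID: " id
}
EOF
```

Note that substr counts characters from 1, not 0, and that int truncates toward zero rather than rounding.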
Key built-ins shown here:
- length(s), substr(s, start, len), toupper/tolower - core string tools.
- split(s, arr, sep) - fills arr and returns the number of pieces. Arrays are passed by reference, so words is populated for you.
- sqrt and int - a sample of the math library (others include sin, cos, exp, log, rand).
- sprintf - the same formatting as printf, but it returns the string instead of printing it.
Functions in Record Processing
Functions shine when they’re called from pattern-action rules, keeping the rules themselves short. This example reads a file of names and scores, classifies each one, and reports a class average - building an array as it goes and passing it to a function in the END block.
Create a file named scores.txt:
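The names and scores below are taken from the Expected Output section. The layout, one record per line with the name and score separated by a single space, is an assumption.

```shell
cat > scores.txt <<'EOF'
Alice 92
Bob 78
Carol 85
Dave 64
EOF
```

With AWK's default field splitting, the name lands in $1 and the score in $2.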
Create a file named scores.awk:
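A sketch of one possible version, assuming a space-separated input file with the name in $1 and the score in $2. The average function and its local parameters i and sum follow the description in this section. The grade function name and its cutoffs (90, 80, 70, with everything lower graded F) are assumptions chosen so the result matches the Expected Output section.

```shell
cat > scores.awk <<'EOF'
# Assumed grading scale: A >= 90, B >= 80, C >= 70, otherwise F.
function grade(score) {
    if (score >= 90) return "A"
    if (score >= 80) return "B"
    if (score >= 70) return "C"
    return "F"
}

# Average the first n elements of arr; i and sum are local variables.
function average(arr, n,    i, sum) {
    for (i = 1; i <= n; i++)
        sum += arr[i]
    return (n > 0) ? sum / n : 0
}

# Main rule: runs once for every input record.
{
    scores[NR] = $2
    print $1 ": " $2 " (" grade($2) ")"
}

END {
    printf "Class average: %.1f\n", average(scores, NR)
}
EOF
```

Unlike the earlier examples, this program needs input, so it is run as awk -f scores.awk scores.txt.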
The main rule runs once per input line: it stores the score using the record number NR as the array key, then prints the name and letter grade. After all input is consumed, the END block passes the whole scores array (by reference) and the record count to average. Note how i and sum are local parameters of average, so they don’t pollute the global namespace.
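The difference between array and scalar parameters can also be seen in isolation. The following is a minimal sketch; the file name byref.awk and the function names change_scalar and change_array are hypothetical and exist only for this illustration.

```shell
cat > byref.awk <<'EOF'
# Scalars are passed by value: only the local copy changes.
function change_scalar(x) {
    x = 99
}

# Arrays are passed by reference: the caller's array changes.
function change_array(arr) {
    arr["key"] = "changed"
}

BEGIN {
    n = 1
    data["key"] = "original"

    change_scalar(n)
    change_array(data)

    print "n: " n              # prints "n: 1"
    print "data: " data["key"] # prints "data: changed"
}
EOF
```

This is why a function like average can read a large array cheaply, and also why a helper that modifies an array parameter affects the caller's data.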
Running with Docker
Run each example with the Alpine image, which includes BusyBox AWK:
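A sketch of how those runs might be wrapped, assuming Docker is installed and the example files sit in the current directory. The helper name run_awk, the /app mount point, and the alpine:latest tag are assumptions rather than required values.

```shell
# Hypothetical helper: run one AWK script inside a throwaway Alpine container.
# Usage: run_awk SCRIPT [INPUT_FILE...]
run_awk() {
    script=$1
    shift
    # --rm removes the container afterwards; -v mounts the current
    # directory at /app and -w makes it the working directory.
    docker run --rm -v "$(pwd)":/app -w /app alpine:latest \
        awk -f "$script" "$@"
}

# Example invocations (each one needs a running Docker daemon):
#   run_awk functions.awk
#   run_awk scope.awk
#   run_awk recursion.awk
#   run_awk builtins.awk
#   run_awk scores.awk scores.txt
```

The same scripts should also run with a locally installed awk, for example awk -f functions.awk.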
Expected Output
Running functions.awk:
Hello, AWK!
Max of 7 and 12: 12
Area of 4 x 5: 20
Running scope.awk:
Total (global): 20
Price with tax: 108
rate outside function: undefined
Running recursion.awk:
5! = 120
10! = 3628800
First 10 Fibonacci numbers:
0 1 1 2 3 5 8 13 21 34
Running builtins.awk:
Length: 16
Uppercase: CODE ARCHAEOLOGY
Lowercase: code archaeology
Substring: Code
Word count: 2
First word: Code
Square root of 144: 12
Truncated 7.9: 7
Formatted ID: [00042]
Running scores.awk with scores.txt:
Alice: 92 (A)
Bob: 78 (C)
Carol: 85 (B)
Dave: 64 (F)
Class average: 79.8
Key Concepts
- function name(params) { ... } defines a function; call it with no space before the parenthesis.
- There is no local keyword - extra parameters act as local variables, conventionally set off with extra spaces.
- Any variable not in the parameter list is global, shared across all functions and rules.
- return sends a value back; without it, a function returns the empty string / zero.
- Recursion works because each call gets its own copies of its parameters.
- Arrays are passed by reference, so functions like split and your own helpers can fill or read them in place.
- Built-in functions (length, substr, split, toupper, sqrt, sprintf, …) cover most common string and math tasks - reach for them before writing your own.
- Functions keep pattern-action rules small, moving complex logic out of the implicit main loop.