Intermediate

Functions in Assembly

Learn how functions work in x86 assembly with CALL/RET, the stack, calling conventions, and recursion using NASM and Docker

In high-level languages, a function is a built-in abstraction: you write def, fn, or void name() and the compiler arranges everything behind the scenes. In assembly, there is no function keyword. A “function” is simply a label you can jump to and return from — and you are responsible for every detail the compiler would normally handle: where arguments live, where the return value goes, which registers must be preserved, and how the stack is managed.

The two instructions that make this work are call and ret. call label pushes the address of the next instruction onto the stack and jumps to label. ret pops that saved address back off the stack and jumps to it, resuming execution right after the original call. This push/pop dance is why functions can be nested and recursive — each call stacks a return address, and each ret unwinds one.

Because assembly is untyped and low-level, “calling conventions” are agreements, not rules enforced by the language. This tutorial uses simple, consistent conventions: arguments and return values in registers for the early examples, then the stack-frame style (ebp/esp) that C compilers use. All examples use 32-bit x86 with Linux int 0x80 syscalls, matching the Hello World page.

Calling a Subroutine with CALL and RET

The simplest function takes no arguments and returns nothing: it just does work and returns. Here greet is a non-leaf function, because it calls print_string, while print_string is a reusable leaf helper (it calls nothing but the kernel) that performs the actual sys_write.

Create a file named functions.asm:

section .data
    intro   db "Calling a function...", 10
    introlen equ $ - intro
    hello   db "Hello from inside the function!", 10
    hellolen equ $ - hello

section .text
    global _start

_start:
    mov ecx, intro          ; pointer to string
    mov edx, introlen       ; length
    call print_string

    call greet              ; call our function...
    call greet              ; ...and call it again

    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit code 0
    int 0x80

; greet -- prints the hello message, takes no arguments
greet:
    mov ecx, hello
    mov edx, hellolen
    call print_string
    ret                     ; pop return address, jump back to caller

; print_string -- ecx = pointer, edx = length
print_string:
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    int 0x80
    ret

Notice that greet itself calls print_string. This nesting works because each call pushes its own return address: the ret inside print_string returns into greet, and the ret inside greet returns into _start. The stack keeps the two return addresses in order automatically.
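
To make the return-address mechanics concrete, here is a minimal sketch that imitates call and ret by hand using push, jmp, and pop. The file name call_by_hand.asm and the labels resume and subroutine are illustrative assumptions, not part of the original examples. Unlike a real ret, this version overwrites eax, so treat it as a demonstration rather than a replacement.

```nasm
; call_by_hand.asm -- hypothetical file name, for illustration only
section .data
    msg     db "back from the subroutine", 10
    msglen  equ $ - msg

section .text
    global _start

_start:
    push dword resume       ; what call does, step 1: push the return address
    jmp subroutine          ; what call does, step 2: jump to the target
resume:
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    mov ecx, msg
    mov edx, msglen
    int 0x80

    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit code 0
    int 0x80

subroutine:
    pop eax                 ; what ret does, step 1: pop the return address
    jmp eax                 ; what ret does, step 2: jump to it (clobbers eax)
```

Assembled like the other examples, this should print the message once: execution reaches resume only because subroutine popped that address off the stack and jumped to it.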

Passing Arguments and Returning Values in Registers

Real functions need inputs and outputs. The fastest convention is to pass arguments in registers and return the result in eax (the traditional “accumulator” register). Here add_numbers reads two operands and leaves the sum in eax, and a print_int helper converts that integer to text and prints it.

Create a file named arguments.asm:

section .bss
    numbuf  resb 12         ; scratch buffer for the digits

section .text
    global _start

_start:
    mov eax, 7              ; first argument
    mov ebx, 5              ; second argument
    call add_numbers        ; result returned in eax
    call print_int          ; prints the value in eax

    mov eax, 1              ; sys_exit
    xor ebx, ebx
    int 0x80

; add_numbers -- eax + ebx, result in eax
add_numbers:
    add eax, ebx
    ret

; print_int -- prints the unsigned integer in eax, followed by a newline
;   clobbers eax, ebx, ecx, edx, edi
print_int:
    mov edi, numbuf + 11    ; work backwards from the end of the buffer
    mov byte [edi], 10      ; place a trailing newline
    mov ecx, 10             ; divisor
.loop:
    xor edx, edx            ; clear high half before dividing
    div ecx                 ; edx:eax / 10 -> quotient in eax, remainder in edx
    add dl, '0'             ; turn remainder (0-9) into an ASCII digit
    dec edi
    mov [edi], dl           ; store the digit
    test eax, eax           ; more digits left?
    jnz .loop
    mov ecx, edi            ; ecx = start of the digit string
    mov edx, numbuf + 12    ; edx = length = end - start
    sub edx, edi
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    int 0x80
    ret

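print_int shows why assembly functions take effort: there is no built-in way to print a number. We repeatedly divide by 10, collecting remainders as ASCII digits from right to left in a buffer, then write the buffer with a single syscall. For the value 12, the first division leaves remainder 2 and the second leaves remainder 1, so the buffer ends up holding "12" followed by the newline. div is unsigned division: it divides the 64-bit value edx:eax by the operand, so we must zero edx before each div. Otherwise whatever was left in edx would be treated as the high 32 bits of the dividend, giving a wrong quotient or, if the quotient no longer fits in eax, a divide error.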

Stack Frames and the C Calling Convention

32-bit x86 has only a handful of general-purpose registers, so most 32-bit compiled code passes arguments on the stack and reaches them through a stack frame. In the cdecl convention, the caller pushes arguments (right to left) and cleans them up afterward; the callee sets up a frame with ebp so it can reference arguments by fixed offsets.

Create a file named stack_frame.asm:

section .bss
    numbuf  resb 12

section .text
    global _start

_start:
    push 20                 ; second argument (pushed first)
    push 22                 ; first argument
    call sum                ; result in eax
    add esp, 8              ; caller cleans up the two pushed args
    call print_int

    mov eax, 1
    xor ebx, ebx
    int 0x80

; sum -- adds two arguments passed on the stack
;   [ebp+8]  = first argument
;   [ebp+12] = second argument
sum:
    push ebp                ; save the caller's frame pointer
    mov ebp, esp            ; establish our own frame
    mov eax, [ebp+8]        ; load first argument
    add eax, [ebp+12]       ; add second argument
    pop ebp                 ; restore the caller's frame pointer
    ret

; print_int -- prints the unsigned integer in eax, followed by a newline
print_int:
    mov edi, numbuf + 11
    mov byte [edi], 10
    mov ecx, 10
.loop:
    xor edx, edx
    div ecx
    add dl, '0'
    dec edi
    mov [edi], dl
    test eax, eax
    jnz .loop
    mov ecx, edi
    mov edx, numbuf + 12
    sub edx, edi
    mov eax, 4
    mov ebx, 1
    int 0x80
    ret

Once the prologue in sum has run, the stack holds (from the top): the caller's saved ebp at [ebp], the return address at [ebp+4], the first argument at [ebp+8], and the second at [ebp+12]. The push ebp / mov ebp, esp prologue and the matching pop ebp epilogue are the standard frame setup you will see in most unoptimized compiler-generated code; optimizing compilers often omit the frame pointer and address everything relative to esp.
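
The same frame can also hold local variables at negative offsets from ebp. The sketch below is one way to show this, under a few assumptions: the file name locals.asm and the function sum_squares are hypothetical, and the result is reported through the process exit status instead of print_int to keep the listing short.

```nasm
; locals.asm -- hypothetical file name, for illustration only
section .text
    global _start

_start:
    push 3                  ; second argument (pushed first)
    push 4                  ; first argument
    call sum_squares        ; eax = 4*4 + 3*3
    add esp, 8              ; caller cleans up the two pushed args
    mov ebx, eax            ; use the result as the exit status
    mov eax, 1              ; sys_exit
    int 0x80

; sum_squares -- hypothetical example: returns a*a + b*b in eax
;   [ebp+8]  = a
;   [ebp+12] = b
;   [ebp-4]  = local scratch slot
sum_squares:
    push ebp                ; save the caller's frame pointer
    mov ebp, esp            ; establish our own frame
    sub esp, 4              ; reserve one 4-byte local; esp moves, ebp stays put
    mov eax, [ebp+8]        ; load a
    imul eax, eax           ; a*a
    mov [ebp-4], eax        ; park a*a in the local slot
    mov eax, [ebp+12]       ; load b
    imul eax, eax           ; b*b
    add eax, [ebp-4]        ; a*a + b*b
    mov esp, ebp            ; release the local
    pop ebp                 ; restore the caller's frame pointer
    ret
```

If you assemble and run the binary directly in a shell, echo $? afterwards should report 25. Because every access goes through ebp, the argument offsets stay the same even though sub esp, 4 moved the stack pointer.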

Recursion with the Stack

Because every call saves its own return address and each invocation can push its own data, recursion works naturally. The classic example is factorial: factorial(n) = n * factorial(n - 1), with factorial(1) = 1. Each recursive call saves n on the stack before descending, then multiplies on the way back up.

Create a file named recursion.asm:

section .bss
    numbuf  resb 12

section .text
    global _start

_start:
    mov eax, 5              ; compute 5!
    call factorial          ; result in eax
    call print_int          ; prints 120

    mov eax, 1
    xor ebx, ebx
    int 0x80

; factorial -- n in eax, returns n! in eax (clobbers ebx)
factorial:
    cmp eax, 1
    jle .base               ; if n <= 1, return 1
    push eax                ; save n on the stack
    dec eax                 ; compute n - 1
    call factorial          ; eax = (n - 1)!
    pop ebx                 ; restore n into ebx
    imul eax, ebx           ; eax = n * (n - 1)!
    ret
.base:
    mov eax, 1
    ret

; print_int -- prints the unsigned integer in eax, followed by a newline
print_int:
    mov edi, numbuf + 11
    mov byte [edi], 10
    mov ecx, 10
.loop:
    xor edx, edx
    div ecx
    add dl, '0'
    dec edi
    mov [edi], dl
    test eax, eax
    jnz .loop
    mov ecx, edi
    mov edx, numbuf + 12
    sub edx, edi
    mov eax, 4
    mov ebx, 1
    int 0x80
    ret

Each level of recursion pushes its own copy of n. When the base case is reached, the ret instructions unwind the call chain, and each level multiplies its saved n into the accumulating result. The stack is doing the bookkeeping a high-level language would hide from you — here it is fully visible.
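
For comparison, here is a sketch of the same recursion written in the cdecl stack-frame style from the previous section. It is a variation, not the version used above: the file name recursion_cdecl.asm is hypothetical, n arrives on the stack instead of in eax, and the result is returned as the exit status to keep the listing short.

```nasm
; recursion_cdecl.asm -- hypothetical file name, for illustration only
section .text
    global _start

_start:
    push 5                  ; argument: n = 5
    call factorial          ; eax = 5!
    add esp, 4              ; caller removes the argument
    mov ebx, eax            ; use the result as the exit status
    mov eax, 1              ; sys_exit
    int 0x80

; factorial -- n at [ebp+8], returns n! in eax
factorial:
    push ebp                ; save the caller's frame pointer
    mov ebp, esp            ; establish our own frame
    mov eax, [ebp+8]        ; load n
    cmp eax, 1
    jle .base               ; if n <= 1, return 1
    dec eax                 ; n - 1
    push eax                ; pass n - 1 as the argument
    call factorial          ; eax = (n - 1)!
    add esp, 4              ; remove the argument we pushed
    imul eax, [ebp+8]       ; eax = n * (n - 1)!; n is still in our frame
    pop ebp                 ; restore the caller's frame pointer
    ret
.base:
    mov eax, 1
    pop ebp                 ; restore the caller's frame pointer
    ret
```

Run directly in a shell, echo $? should report 120. Each invocation finds its own n at [ebp+8], so there is no need for an explicit push eax / pop ebx pair, and ebx is left untouched.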

Running with Docker

Each example is a complete, standalone program. Assemble and run them with the same NASM image used throughout this series:

# Pull the NASM assembly image
docker pull esolang/x86asm-nasm:latest

# Run the basic CALL/RET example
docker run --rm -v "$(pwd)":/code -w /code esolang/x86asm-nasm x86asm-nasm functions.asm

# Run the register-argument example
docker run --rm -v "$(pwd)":/code -w /code esolang/x86asm-nasm x86asm-nasm arguments.asm

# Run the stack-frame example
docker run --rm -v "$(pwd)":/code -w /code esolang/x86asm-nasm x86asm-nasm stack_frame.asm

# Run the recursion example
docker run --rm -v "$(pwd)":/code -w /code esolang/x86asm-nasm x86asm-nasm recursion.asm

Expected Output

functions.asm:

Calling a function...
Hello from inside the function!
Hello from inside the function!

arguments.asm:

12

stack_frame.asm:

42

recursion.asm:

120

Key Concepts

  • call and ret are the whole mechanism: call pushes the return address and jumps; ret pops it and jumps back. There is no function keyword — a function is just a reachable label.
  • The stack enables nesting and recursion: each call stores its own return address, so functions can call other functions (and themselves) without losing their place.
  • Calling conventions are agreements, not rules: assembly won’t enforce where arguments go. We used registers for speed and eax for return values, then the cdecl stack-frame style for compatibility with compiled code.
  • Stack frames use ebp as an anchor: the push ebp / mov ebp, esp prologue lets a function reference arguments at fixed offsets ([ebp+8], [ebp+12]) even as esp moves.
  • Caller vs. callee responsibilities: in cdecl, the caller pushes arguments and cleans them up afterward (add esp, 8), while the callee preserves and restores ebp (and, in full cdecl, also ebx, esi, and edi if it uses them).
  • Leaf vs. non-leaf functions: a leaf function calls nothing else and needs minimal setup; a non-leaf function must protect any registers and return data it relies on across nested calls.
  • You build your own abstractions: even printing an integer requires a hand-written routine (print_int) — assembly gives you total control precisely because it gives you nothing for free.
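
The leaf vs. non-leaf point can be sketched with a small program. Everything here is an illustrative assumption: the file name repeat.asm, the non-leaf function repeat_line, and the leaf helper print_line. The non-leaf function keeps a loop counter in ecx, which the leaf overwrites, so it saves the counter on the stack around each nested call.

```nasm
; repeat.asm -- hypothetical file name, for illustration only
section .data
    line    db "tick", 10
    linelen equ $ - line

section .text
    global _start

_start:
    mov ecx, 3              ; argument: how many times to print
    call repeat_line

    mov eax, 1              ; sys_exit
    xor ebx, ebx            ; exit code 0
    int 0x80

; repeat_line -- non-leaf: prints the line ecx times (assumes ecx >= 1)
repeat_line:
.again:
    push ecx                ; print_line overwrites ecx, so save the counter
    call print_line
    pop ecx                 ; restore the counter
    dec ecx
    jnz .again
    ret

; print_line -- leaf: writes the line to stdout; clobbers eax, ebx, ecx, edx
print_line:
    mov eax, 4              ; sys_write
    mov ebx, 1              ; stdout
    mov ecx, line
    mov edx, linelen
    int 0x80
    ret
```

This should print tick three times. Removing the push ecx / pop ecx pair would leave the address of line in ecx after the first call, so the loop would no longer count down from 3.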
