GNU Find
The GNU implementation of the Unix find utility — a declarative, expression-based language for locating files and acting on them, distributed as part of GNU findutils.
Created by David MacKenzie, Eric B. Decker, Jay Plett, Tim Wood (and later James Youngman)
GNU Find is the GNU Project’s implementation of the Unix find(1) utility — a small, declarative, expression-based language for walking a filesystem tree, testing files against predicates, and acting on the matches. It is distributed as part of the GNU findutils package, alongside xargs, locate, and updatedb, and is the default find implementation on essentially every GNU/Linux distribution.
The `find` language (its predicate grammar, primaries such as `-name` and `-mtime`, and the operator precedence of `-a`, `-o`, and `!`) was defined by the original Unix `find` from 1974 and later codified by POSIX. This page is specifically about the GNU implementation of that language: its history, the extensions it adds beyond POSIX, and the role it plays in the GNU userland.
Origins
The original find utility was written by Dick Haight at Bell Labs and shipped in Version 5 Unix in 1974. It introduced the now-familiar shape of the tool: a starting path, followed by an expression made up of primaries (each prefixed by -) joined by Boolean operators. From the start, find was unusual in the Unix toolbox: instead of taking flags and producing output for another tool to parse, it accepted a tiny embedded language and acted on each file as the expression evaluated.
The GNU Project needed a freely licensed replacement for the proprietary AT&T find, and work on a GNU implementation began in the late 1980s. According to the findutils manual, GNU find was originally written by Eric B. Decker, with enhancements by David MacKenzie, Jay Plett, and Tim Wood. The code was bundled with the GNU xargs and locate/updatedb programs into the findutils package, which has been part of the GNU system ever since. James Youngman has been the long-time maintainer through the 4.x series.
The Expression Language
A find invocation is structured as:
find [path...] [expression]
The expression is a sequence of primaries combined with operators. The grammar is declarative — you describe which files you want and what should happen to them, and find walks the tree and evaluates the expression for each one. The major primary categories are:
| Category | Examples | Purpose |
|---|---|---|
| Name tests | -name, -iname, -path, -regex, -iregex | Match the filename or full path against a pattern |
| Type tests | -type f, -type d, -type l | Match by file type (regular, directory, symlink, etc.) |
| Metadata tests | -size, -perm, -user, -group, -inum, -links | Match by file metadata |
| Time tests | -mtime, -atime, -ctime, -newer, -newerXY | Match by modification, access, or change time |
| Tree control | -maxdepth, -mindepth, -prune, -xdev | Control how the directory tree is traversed |
| Actions | -print, -print0, -printf, -exec, -execdir, -delete, -ls, -quit | Do something with matched files |
| Operators | -a (and, implicit), -o (or), ! (not), ( ... ) | Combine primaries |
Each primary returns true or false; find short-circuits the expression and, as a side effect of evaluating actions, performs the requested operation. The implicit -a between adjacent primaries and the higher precedence of -a over -o are subtle enough that GNU find’s documentation devotes a substantial section to operator precedence and side-effect ordering.
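The precedence rule is easiest to see with a small experiment. The sketch below builds a throwaway directory (all file names are made up for illustration) and compares an ungrouped `-o` expression with its parenthesized form.

```shell
# Scratch tree; every name below is made up for illustration.
demo=$(mktemp -d)
touch "$demo/a.c" "$demo/b.h" "$demo/notes.txt"
mkdir "$demo/dir.c"

# Without parentheses this parses as:
#   -name '*.c'  -o  ( -name '*.h' -a -type f )
# so the directory dir.c is matched by the left-hand branch.
ungrouped=$(find "$demo" -name '*.c' -o -name '*.h' -type f | sort)

# Escaped parentheses make -type f apply to both name tests.
grouped=$(find "$demo" \( -name '*.c' -o -name '*.h' \) -type f | sort)

printf 'ungrouped:\n%s\n\ngrouped:\n%s\n' "$ungrouped" "$grouped"
rm -rf "$demo"
```

The ungrouped form lists three paths, including the directory `dir.c`; the grouped form lists only the two regular files. The parentheses are escaped so that the shell passes them through to `find` unchanged.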
GNU Extensions Beyond POSIX
GNU find is largely a superset of POSIX find, but its extensions have become so widely depended-upon that scripts written against “find” in practice often mean “GNU find.” Major GNU-specific features include:
- `-print0` / `-files0-from`: NUL-terminated output and input. `-print0` is designed to interoperate safely with `xargs -0` even when filenames contain spaces, newlines, or shell metacharacters; `-files0-from` reads a NUL-separated list of starting points from a file.
- `-regex` and `-iregex`: full-path regular-expression matching (with selectable regex dialects via `-regextype`), in addition to POSIX glob-style `-name`.
- `-iname` / `-ipath`: case-insensitive variants of `-name` and `-path`.
- `-printf`: a `printf`-style action with format specifiers for nearly every `stat(2)` field: `%p` (path), `%f` (basename), `%s` (size), `%T@` (mtime as Unix epoch), `%y` (file type letter), and many more.
- `-delete`: remove the matched file as a primary, avoiding the overhead and quoting hazards of `-exec rm`.
- `-execdir` / `-okdir`: like `-exec` / `-ok`, but executed with the working directory set to the directory containing the matched file, designed to avoid race conditions when symlinks in ancestor directories could be swapped out during traversal.
- `-newerXY`: a generalized newer-than comparison in which `X` (the timestamp of the file being tested) is one of `a`, `B`, `c`, or `m` (access, birth, change, or modification time) and `Y` (the reference) is one of those letters or `t`, which treats the reference argument as a literal time string. This is far more flexible than POSIX `-newer`.
- `-quit`: exit as soon as it is evaluated, which stops the search after the first match; useful for "does any file matching X exist?" probes.
- Symlink-handling global options (`-H`, `-L`, `-P`): options that select how symlinks are followed during the walk. POSIX specifies `-H` and `-L`; GNU find implements them consistently with POSIX, accepts `-P` to request the default explicitly, and preserves the traditional non-following behavior by default.
The combination of -print0 and xargs -0 in particular is widely regarded as one of the few fully robust ways to pass arbitrary filenames between processes in a shell pipeline (the main alternative, -exec ... {} +, avoids the pipe altogether), and it is an idiom that GNU findutils popularized.
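To make the filename-safety point concrete, here is a minimal sketch that assumes GNU find and xargs are on the PATH. The scratch files are hypothetical; one name deliberately contains a newline.

```shell
demo=$(mktemp -d)
# Awkward but legal names: one with a space, one with an embedded newline.
touch "$demo/plain.txt" "$demo/with space.txt" "$demo/$(printf 'two\nlines.txt')"

# Newline-delimited output: the third name is split across two lines.
naive=$(find "$demo" -type f -name '*.txt' | wc -l)

# NUL-delimited output: xargs -0 hands each name over intact, and the
# inner sh simply reports how many arguments it received.
safe=$(find "$demo" -type f -name '*.txt' -print0 | xargs -0 sh -c 'echo "$#"' sh)

# -printf picks stat fields directly: type letter, size in bytes, basename.
report=$(find "$demo" -type f -name 'plain.txt' -printf '%y %s %f\n')

echo "lines seen: $naive, files seen: $safe, report: $report"
rm -rf "$demo"
```

A line-oriented consumer sees four "files" where only three exist, while the NUL-delimited pipeline counts three.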
Implementation and Optimizer
GNU find is more than a straightforward expression interpreter. The implementation includes a small cost-based optimizer that reorders the predicates in an expression, where doing so is semantically safe, to evaluate cheap tests (such as name matching) before expensive ones (such as stat-requiring metadata tests or -exec invocations). The optimizer respects side-effect ordering — actions and primaries with observable effects are never reordered past one another — but can, for example, hoist a -name '*.c' test ahead of a -mtime -1 test that would otherwise force a stat on every file.
Users can inspect optimizer decisions through the -D opt debug option, and other -D categories expose the expression tree, stat calls, directory-search decisions, and per-predicate success rates. This is unusual for a Unix command-line tool and reflects the fact that find is heavily used in performance-sensitive automation.
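A minimal way to look at the optimizer's diagnostics, assuming GNU find: the debug text goes to standard error and its format is not a stable interface, so this sketch only captures it to a log file (a hypothetical path next to the scratch directory) and checks that the ordinary results are unaffected.

```shell
demo=$(mktemp -d)
touch "$demo/main.c" "$demo/README"

# -D must come before the starting points. Diagnostics are written to
# stderr, so the normal result list on stdout stays clean.
matches=$(find -D opt "$demo" -mtime -1 -name '*.c' 2>"$demo.opt.log")
echo "$matches"

# To read the optimizer's notes, inspect the log by hand:
#   cat "$demo.opt.log"
rm -rf "$demo" "$demo.opt.log"
```

The command line places `-mtime -1` before `-name '*.c'`; the captured log is where any reordering of the two tests would be reported.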
findutils: The Companion Programs
GNU find ships inside the larger findutils package, which also provides:
- `xargs`: read items from standard input and execute a command with them as arguments. The `find ... -print0 | xargs -0 ...` pairing is the canonical way to apply a command to every file matching a `find` expression, especially when `-exec` would invoke the command once per file rather than in batches.
- `locate`: a fast filename lookup tool that queries a pre-built database rather than walking the filesystem. Useful for interactive name searches across very large trees.
- `updatedb`: builds the database used by `locate`, typically scheduled by cron or a systemd timer.
These four tools share code, a manual, and a maintenance team, and are collectively what most users mean when they refer to “GNU findutils.”
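The batching difference between `-exec ... \;` and `xargs` can be observed by counting how many times the command is started. This is an illustrative sketch with made-up file names, assuming GNU find and xargs.

```shell
demo=$(mktemp -d)
touch "$demo/1.log" "$demo/2.log" "$demo/3.log"

# -exec ... \; starts one process per file, so three lines are printed.
per_file=$(find "$demo" -name '*.log' -exec sh -c 'echo "$#"' sh {} \; | wc -l)

# xargs -0 packs the same names into as few invocations as will fit,
# so a single line is printed for this small tree.
batched=$(find "$demo" -name '*.log' -print0 | xargs -0 sh -c 'echo "$#"' sh | wc -l)

echo "invocations with -exec \\;: $per_file, with xargs: $batched"
rm -rf "$demo"
```

For three files the difference is trivial, but on trees with many thousands of matches the per-file process startup cost tends to dominate the run time.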
Standardization
The find utility is standardized by IEEE Std 1003.1 (POSIX) and the Single UNIX Specification. POSIX defines:
- The expression grammar and operator precedence.
- A core set of primaries: `-name`, `-type`, `-print`, `-exec` (with `\;` and `+` termination), `-perm`, `-size`, `-newer`, `-mtime`, `-atime`, `-ctime`, `-user`, `-group`, `-links`, `-prune`, `-xdev`, `-depth`, and a handful of others.
- Behavior in the presence of symlinks (via the `-H` and `-L` options).
GNU find conforms to POSIX when invoked in a POSIX-conforming environment (for example, with the POSIXLY_CORRECT environment variable set), while exposing its full set of extensions by default. This dual behavior is shared with most other GNU utilities and is one of the reasons GNU userland behavior can differ subtly from BSD or Solaris userland behavior in portability-sensitive shell scripts.
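When a script has to stay inside the POSIX subset, GNU-only actions usually have a standard spelling. The sketch below, with hypothetical file names, shows the GNU `-delete` form as a comment and runs the POSIX equivalent built on `-exec ... {} +`.

```shell
demo=$(mktemp -d)
touch "$demo/a.tmp" "$demo/b.tmp" "$demo/keep.txt"

# GNU-only spelling:
#   find "$demo" -type f -name '*.tmp' -delete
# POSIX spelling: '+' batches the matched names onto one rm command line.
find "$demo" -type f -name '*.tmp' -exec rm -f {} +

remaining=$(find "$demo" -type f | wc -l)
echo "files left: $remaining"
rm -rf "$demo"
```

Only `keep.txt` survives, so the count is 1. The `+` terminator gives roughly the batching that `xargs` provides, without a pipe.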
Comparison with Other Implementations
GNU find is one of several actively maintained find implementations:
- BSD find (FreeBSD, OpenBSD, NetBSD, macOS): POSIX-conformant with a different set of extensions. Notably, BSD find supports `-x` as a shorter spelling of `-xdev`, and macOS's `find` lacks `-printf` and `-regextype`, which trips up scripts written on Linux.
- BusyBox find: a stripped-down implementation common on embedded Linux and Alpine-based container images; supports a useful subset of GNU primaries but omits the optimizer and several less-common predicates.
- Toybox find: similar to BusyBox in scope; ships in Android.
- fd-find (`fd`): a Rust-language alternative that searches the current directory recursively by default, skips hidden files and paths excluded by `.gitignore` unless told otherwise, and uses a regex-first interface; not a `find` replacement in the POSIX sense but commonly chosen for interactive use.
For automation on mainstream GNU/Linux distributions, GNU find remains the reference: scripts that target those systems can assume GNU semantics, while scripts that must also run on macOS, *BSD, or BusyBox-based images typically restrict themselves to the POSIX subset that all implementations support.
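A script that wants GNU extensions when available can probe for them at run time. The helper name `is_gnu_find` below is hypothetical; the check relies only on the fact that GNU find prints a banner containing "GNU findutils" for `--version`, while other implementations typically reject the option.

```shell
# Hypothetical helper: succeed if the find on PATH identifies itself as GNU.
is_gnu_find() {
    find --version 2>/dev/null | grep -q 'GNU findutils'
}

if is_gnu_find; then
    flavor=gnu      # safe to use -printf, -regextype, -newerXY, ...
else
    flavor=other    # stay within the POSIX subset
fi
echo "find flavor: $flavor"
```

Branching on a probe like this keeps the GNU-specific code path explicit rather than letting a missing primary surface as an error midway through a job.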
Current Status
GNU findutils is actively maintained under James Youngman, with version 4.10.0 released in 2024 as the most recent stable version as of 2026. Development happens on Savannah, the GNU Project’s source-hosting infrastructure, and the package follows the typical GNU release cadence — slow, deliberate, and tightly synchronized with the rest of the GNU userland. Recent work has focused on portability fixes, security-conscious defaults, improved Unicode-locale handling, and continued conformance updates as POSIX revisions are finalized.
In day-to-day use, GNU find is one of the most universally-deployed pieces of free software: present on every GNU/Linux system, invoked from countless shell scripts, build systems, container-image build pipelines, and CI/CD jobs. Like many of the GNU utilities, its ubiquity is so taken-for-granted that its absence — for example, on a stripped-down Alpine container that ships only BusyBox find — becomes visible mostly as confusing error messages from scripts that quietly assumed the GNU extensions were available.
Why GNU Find Matters
- It is the reference `find` for the free Unix world. Almost every Linux script, Dockerfile, and CI pipeline that touches the filesystem invokes GNU find directly or indirectly.
- Its extensions defined modern `find` idioms. `-print0`/`xargs -0`, `-printf`, `-execdir`, `-delete`, and `-regex` are extensions that GNU find introduced or popularized, and they became the de facto baseline expected by Linux automation even though they are not part of POSIX.
- It is one of the few small declarative languages most programmers use daily. Filtering and acting on files through a predicate expression is, in spirit, closer to a database query language than to a typical Unix command-line interface, and `find` predates `awk(1)`'s pattern-action model only by a few years.
- It exemplifies the GNU philosophy of “compatible superset.” GNU find behaves like POSIX find when asked to, and like a richer, more featureful tool by default, a design pattern that the GNU Project applied throughout coreutils, findutils, grep, sed, and beyond.
For a tool whose surface job is “list files matching a pattern,” GNU find is a remarkable piece of software: a small, durable language that has outlived nearly every workflow it was ever embedded in, and continues to be the unglamorous workhorse of filesystem automation on the world’s most popular operating-system kernel.
Notable Uses & Legacy
GNU/Linux Distributions
GNU find is the default `find(1)` implementation on essentially every GNU/Linux distribution — Debian, Ubuntu, Fedora, Arch, openSUSE, and their derivatives — shipped as part of the `findutils` package. System administration scripts, package post-install hooks, and cron jobs across the Linux ecosystem rely on its GNU-specific extensions such as `-printf`, `-regex`, `-delete`, and `-execdir`.
Build Systems (Make, CMake, Autotools)
Generated Makefiles and shell-driven build scripts use `find` to enumerate sources, locate generated files for cleaning, and feed file lists into `xargs`. Patterns like `find . -name '*.o' -delete` or `find src -name '*.c' -print0 | xargs -0 ...` are ubiquitous across the GNU build ecosystem.
System Administration and Backup Scripts
Administrators use `find` for log rotation (`find /var/log -mtime +30 -delete`), file-permission audits (`find / -perm -4000 -type f`), backup file selection by mtime, and bulk-permission repairs. The `-newer` and `-newerXY` predicates underpin incremental-backup tooling that selects files modified since a reference time.
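Because `-delete` is irreversible, a common habit is to run the same expression with `-print` first. The sketch below uses a scratch directory with made-up file names and assumes GNU find for `-delete` and `-newermt`.

```shell
demo=$(mktemp -d)
touch "$demo/fresh.log"
touch -t 200001010000 "$demo/stale.log"   # mtime set to 1 Jan 2000

# Dry run: show what the age test selects before removing anything.
doomed=$(find "$demo" -type f -name '*.log' -mtime +30 -print)

# Same expression with -delete. Keep the tests BEFORE the action:
# a leading -delete would be evaluated for every file in the tree.
find "$demo" -type f -name '*.log' -mtime +30 -delete

# -newermt: files modified after a literal timestamp.
recent=$(find "$demo" -type f -newermt '2001-01-01' -print)
left=$(find "$demo" -type f | wc -l)

echo "deleted: ${doomed##*/}; still present: ${recent##*/}"
rm -rf "$demo"
```

The dry run names `stale.log`, and after the deletion only `fresh.log` remains.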
Container Image Tooling
Dockerfiles and OCI build pipelines routinely use `find` to strip documentation, clear caches, fix permissions, and trim image size, e.g. `find /var/cache/apt -type f -delete` or `find / -name '*.pyc' -delete`. Alpine, Debian-slim, and most distro base images include a `find` implementation (BusyBox find in Alpine's case), and image-build scripts routinely depend on it.
Coreutils and findutils Test Suites
The GNU coreutils and findutils projects themselves use `find` extensively in their own test infrastructure to discover and validate test inputs and outputs, making `find` a self-bootstrapping component of the GNU userland.
CI/CD Pipelines
GitHub Actions, GitLab CI, Jenkins, and similar systems rely on `find` for artifact collection (`find build -name '*.log'`), workspace cleanup, and selective caching. Because `find` is part of the POSIX-mandated base utilities, it is one of the few tools a pipeline can assume is present on any Linux runner.
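In a pipeline step, the useful question is often just whether any matching artifact exists. One way to sketch that, assuming GNU find for `-quit` and using a hypothetical build layout:

```shell
demo=$(mktemp -d)
mkdir -p "$demo/build/sub"
touch "$demo/build/sub/test.log"

# -print -quit stops at the first match; grep -q turns "was anything
# printed?" into an exit status the pipeline step can branch on.
if find "$demo/build" -name '*.log' -print -quit | grep -q .; then
    has_logs=yes
else
    has_logs=no
fi

if find "$demo/build" -name '*.core' -print -quit | grep -q .; then
    has_cores=yes
else
    has_cores=no
fi

echo "logs: $has_logs, core dumps: $has_cores"
rm -rf "$demo"
```

The check goes through `grep` because `find` generally exits successfully even when nothing matches, so its own exit status does not answer the question.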
Running Today
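Run examples using the official Alpine Docker image. Note that the find shipped in this image is BusyBox find, not GNU find; installing the findutils package inside the container (apk add findutils) provides the GNU implementation.
docker pull alpine:latest
Example usage:
docker run --rm -v "$(pwd)":/work -w /work alpine:latest find . -type f -name '*.txt'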