PDL (Perl Data Language)
An array-programming extension that gives Perl fast, compact, multidimensional numeric data structures — a free, scriptable alternative to IDL and MATLAB built for scientific computing.
Created by Karl Glazebrook, with Jarle Brinchmann, Tuomas Lukka, and Christian Soeller
PDL — the Perl Data Language — is an extension that turns Perl into a capable environment for scientific and bulk numeric computing. Its core idea is to give ordinary Perl 5 the ability to compactly store and quickly manipulate the large, N-dimensional numeric arrays that are the everyday currency of science: images, spectra, time series, simulation grids, and the like. In doing so, PDL fills the same niche for Perl that NumPy fills for Python or that the commercial Interactive Data Language (IDL) and MATLAB fill in their respective worlds.
History & Origins
PDL was created by Karl Glazebrook, a British astronomer specializing in galaxy formation, who prototyped it early in 1996. The motivation was practical and a little rebellious: scientists doing data analysis resented being pushed into more limited or expensive proprietary environments, and Glazebrook wanted Perl — already beloved for text and “glue” work — to handle serious numeric arrays without giving up Perl’s convenience. The result was an open-source, Perl-based alternative to the commercial IDL.
A small core team quickly coalesced around the project, including Jarle Brinchmann, Tuomas Lukka, and Christian Soeller, who helped turn the early prototype into a real system with a code-generation layer, an interactive shell, and graphics output. PDL reached a wider audience in 1997 through write-ups such as “PDL: The Perl Data Language” in The Perl Journal (Spring 1997) and coverage in Dr. Dobb’s.
A note on the year. PDL is sometimes dated to 1997 because that is when the best-known introductory articles appeared. The language itself, however, was prototyped and in beta use in 1996, which is the first-appearance year used on this page; 1997 is treated as the year it gained wider visibility.
Design Philosophy
PDL is built around array programming (sometimes called vectorization): instead of looping element-by-element in interpreted Perl, you express operations over whole arrays at once, and the heavy lifting happens in compiled C. A few principles define the language:
- Perl convenience, C speed. PDL keeps Perl’s syntax, modules, and ecosystem, while array operations execute in tight compiled loops rather than the Perl interpreter.
- N-dimensional first. The central data type is the ndarray (historically nicknamed the piddle), an efficient, densely packed multidimensional array with an explicit numeric element type.
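- Flexible indexing and “threading.” Operations defined on lower-dimensional pieces automatically broadcast across higher dimensions, giving great flexibility in how data is sliced, indexed, and combined. PDL has traditionally called this threading; recent releases use the term broadcasting for the same mechanism, which is unrelated to operating-system threads.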
- Multiple styles welcome. PDL supports imperative code, a functional style, and a pipeline style where operations are chained together over data.
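The whole-array style and the threading behaviour described above can be sketched as follows. This is a minimal illustration that assumes the PDL module is installed from CPAN; the variable names ($m, $v, and so on) are invented for the example and do not come from any particular program.

```perl
use strict;
use warnings;
use PDL;

# A 2-D ndarray with dims (3, 2): dim 0 has 3 elements, dim 1 has 2.
my $m = pdl([[1, 2, 3], [4, 5, 6]]);

# A 1-D ndarray of length 3.
my $v = pdl([10, 20, 30]);

# Whole-array arithmetic: no explicit Perl loop over elements.
my $doubled = $m * 2;          # values: [[2 4 6] [8 10 12]]

# Threading: the 1-D $v is applied to every row of the 2-D $m.
my $shifted = $m + $v;         # values: [[11 22 33] [14 25 36]]

# sumover is defined on 1-D input (it sums along dim 0) and is
# threaded over the remaining dimension, giving one total per row.
my $row_totals = $m->sumover;  # values: [6 15]

print "doubled:    $doubled\n";
print "shifted:    $shifted\n";
print "row totals: $row_totals\n";
```

The same code works unchanged if $m gains a third or fourth dimension, which is the practical payoff of defining operations on low-dimensional pieces and letting the engine handle the rest.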
Key Features
- The ndarray data type for compact, fast storage and manipulation of large numeric datasets, supporting most native C numeric types and, in modern versions, C99 complex numbers.
- The perldl / pdl2 interactive shells, which let you experiment, compute, slice, and plot at a prompt without writing a full program, much like a MATLAB or IDL session.
- Device-independent graphics through multiple back ends: PGPLOT, PLplot, Gnuplot, and Prima for 2D, and Gnuplot and OpenGL for 3D.
- Rich data I/O, including scientific and image formats such as FITS, NetCDF, and GRIB alongside JPEG, PNG, GIF, and ASCII tables.
- The PP (PDL::PP) metalanguage, a code generator that makes it straightforward to write new vectorized C routines that plug into PDL’s threading engine — the mechanism much of PDL itself is built with.
- Statistics and machine-learning modules, providing routines such as linear and logistic regression, PCA, ANOVA, and k-means clustering on top of the array core.
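To make the "explicit numeric element type" of an ndarray concrete, here is a small sketch, assuming PDL is installed. It uses only core constructors and methods (zeroes, float, double, type, dims, slice, at); the names $img, $f, $d, and $corner are hypothetical.

```perl
use strict;
use warnings;
use PDL;

# A 4x3 array of unsigned bytes, e.g. a tiny greyscale image.
my $img = zeroes(byte, 4, 3);

# A single-precision array, then a double-precision copy of it.
my $f = float([1.5, 2.5, 3.5]);
my $d = $f->double;

# Each ndarray carries its element type and its dimensions.
printf "types: %s %s %s\n", $img->type, $f->type, $d->type;
printf "dims:  %s\n", join(' x ', $img->dims);

# A slice is a view onto the parent; assigning through it with .=
# changes the corresponding elements of $img itself.
my $corner = $img->slice("0:1,0:1");
$corner .= 255;

print $img, "\n";
```

Choosing byte rather than double for image-like data cuts memory use by a factor of eight, which is the kind of control the typed array model is meant to provide.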
A flavour of the syntax
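A short illustrative sketch of typical PDL code, assuming the module is installed from CPAN. The variable names and the particular calculation are chosen here purely for illustration.

```perl
use strict;
use warnings;
use PDL;

# sequence(5) builds the ndarray 0, 1, 2, 3, 4.
my $x = sequence(5);

# Arithmetic applies to every element at once: 1, 2, 5, 10, 17.
my $y = $x ** 2 + 1;

# Select the elements that satisfy a condition: 5, 10, 17.
my $large = $y->where($y > 4);

# Reduce the whole array to a single number (here 7).
my $mean = $y->avg;

print "y     = $y\n";
print "large = $large\n";
print "mean  = $mean\n";
```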
The Perl heritage is obvious in the use statement and the $ sigils, but the array-at-a-time arithmetic — and the absence of explicit loops — is pure PDL.
Evolution
PDL has had an unusually long, steady life for a niche scientific tool. After its 1996 origin and late-1990s build-out, it settled into a long maintenance and refinement phase, distributed through CPAN and hosted for many years on SourceForge (the 2.4.x series was active into the early 2010s — PDL 2.4.10 shipped in February 2012). Development later moved to GitHub under the community-run PDLPorters organization, which modernized the build system, testing, and contribution process.
| Milestone | Approx. date |
|---|---|
| First prototyped by Karl Glazebrook | 1996 |
| Introduced in The Perl Journal / Dr. Dobb’s | 1997 |
| 2.4.x series in active development (e.g. 2.4.10) | 2012 |
| Migration to GitHub / PDLPorters | mid-2010s |
| PDL 2.098 | January 2025 |
| PDL 2.104 | April 2026 |
In recent years, maintenance has been led by Ed J (ETJ), with regular releases that have broadened numeric type support (including complex numbers) and kept the toolchain current with modern Perl. PDL is distributed under the same dual GNU General Public License / Artistic License terms as Perl itself.
Current Relevance
PDL remains an actively maintained project rather than a historical curiosity: releases continue on a regular cadence, and the code lives in a modern GitHub workflow. Its primary audience has always been scientists and engineers — particularly in astronomy, where it was born — who want serious numeric computing without leaving Perl. In a landscape now dominated by Python’s NumPy/SciPy stack and by Julia and R, PDL occupies a smaller but durable niche: the natural choice when a workflow is already steeped in Perl and needs fast array math, or when someone simply prefers Perl’s idioms for data wrangling.
Why It Matters
PDL is a notable example of a community extending a general-purpose language into a whole new domain. Long before “array programming in a scripting language” became mainstream through NumPy, PDL demonstrated that you could marry a high-level, expressive language with compiled-speed multidimensional arrays and a comfortable interactive shell. Born out of astronomers’ frustration with closed, costly tools, it embodied the open-source ethos of building the environment you wished you had — and it has kept that promise alive for three decades, giving the Perl world a credible, free path to scientific computing.
Notable Uses & Legacy
Observational astronomy
PDL grew out of astronomical data analysis and remains used by astronomers for manipulating images, spectra, and large multidimensional data cubes, often via the FITS file format common in the field.
Scientific data processing and modeling
Researchers use PDL for bulk numeric processing, computer modeling of physical systems, image processing, and signal analysis where Perl's flexibility is combined with compiled-C speed.
Interactive numerical exploration
The perldl / pdl2 interactive shells let scientists and engineers do calculations, slicing, and plotting at a prompt without writing full programs, in the style of MATLAB or IDL sessions.
Visualization and plotting
PDL's graphics bindings (PGPLOT, PLplot, Gnuplot, OpenGL, and Prima) are used to produce 2D and 3D scientific plots and visualizations directly from in-memory arrays.
Statistics and machine learning
Add-on modules provide statistical and machine-learning routines — among them linear and logistic regression, PCA, ANOVA, and k-means clustering — on top of PDL's array engine.