CMS Pipelines
A record-oriented dataflow programming tool for IBM VM/CMS that extended the Unix pipe concept into the mainframe world.
Created by John P. Hartmann (IBM Denmark)
CMS Pipelines is a dataflow programming tool created by John P. Hartmann at IBM Denmark for the VM/CMS mainframe operating system. Inspired by Unix pipes but designed for the record-oriented world of IBM mainframes, CMS Pipelines extends the pipeline concept far beyond what Unix shells offer — supporting multi-stream pipelines, dynamic topology, and over 200 built-in processing stages. Sometimes referred to as “Hartmann Pipelines” in honor of its creator, the tool remains part of IBM’s z/VM operating system today.
History & Origins
John P. Hartmann, a Danish engineer at IBM Danmark A/S, began developing CMS Pipelines around 1980. The tool was first marketed as a separate IBM product in Europe around 1985, with worldwide distribution by 1989. Hartmann’s motivation was to bring the elegance of Unix’s pipeline concept to IBM’s VM/CMS environment — but the fundamental differences between Unix’s byte-stream I/O and the mainframe’s record-oriented architecture meant that a simple port would not suffice. What emerged was something Hartmann described as “recognizably different” from Unix pipes.
During the late 1980s, Hartmann presented CMS Pipelines at major IBM user conferences, including a SHARE conference in 1988. IBM integrated CMS Pipelines into VM/ESA as a standard component in late 1991. In 1992, Hartmann presented the paper “Pipelines: How CMS Got Its Plumbing Fixed” at the 3rd Annual REXX Symposium, articulating the design philosophy and technical innovations of the tool. The integration made it available to all VM/CMS users without requiring a separate product purchase.
Design Philosophy
CMS Pipelines was conceived as an “executable specification language” — a way to express data processing problems as dataflow graphs rather than procedural code. The key insight was that many data processing tasks are naturally described as a series of transformations on a stream of records, and translating such a dataflow design into procedural code is error-prone and obscures the intent.
Record-Oriented Processing
The most fundamental difference between CMS Pipelines and Unix pipes is the unit of data transfer. Unix pipes pass undifferentiated byte streams, requiring programs to scan for line delimiters to find record boundaries. CMS Pipelines passes discrete records, matching the record-oriented nature of IBM mainframe filesystems and I/O devices. This eliminates buffering overhead and delimiter scanning.
Lock-Step Execution
Unlike Unix pipes, which use kernel buffers between stages, CMS Pipelines operates in lock-step: a stage writes a record, and that record is immediately available to the next stage. The pipeline has its own lightweight dispatcher that coordinates the execution of all stages, managing startup, resource allocation, and pacing.
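The lock-step model can be approximated with Python generators, where the consumer pulls one record at a time and the producer advances only far enough to supply it. This is an illustrative analogue, not CMS Pipelines code; the stage names here are invented for the sketch:

```python
# Illustrative analogue of lock-step, record-at-a-time dataflow using
# Python generators; the generator protocol stands in for the pipeline
# dispatcher. Stage names (reader, upcase) are invented for this sketch.

def reader(records):
    for rec in records:
        yield rec            # hand over one discrete record, no buffering

def upcase(stream):
    for rec in stream:
        yield rec.upper()    # transform each record the moment it arrives

# Driving the pipeline: each record flows through both stages before
# the next one is produced.
result = list(upcase(reader(["alpha", "beta", "gamma"])))
```

Because nothing buffers between stages, a record written by `reader` is consumed by `upcase` before `reader` resumes, mirroring the lock-step pacing described above.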
Multi-Stream Pipelines
Perhaps the most powerful extension beyond Unix pipes is support for multiple input and output streams. A pipeline stage can participate in several pipelines concurrently, reading from and writing to multiple streams. This enables complex data routing topologies that would require temporary files or elaborate workarounds in a Unix shell.
Key Features
Built-In Stages
CMS Pipelines includes over 200 built-in stages (programs) that implement common data processing operations and interface with VM/CMS devices and services. These stages cover:
- File I/O: Reading and writing CMS files, with shorthand aliases (< for diskread, > for diskwrite)
- Filtering: Selecting, rejecting, and transforming records based on content or position
- Sorting and aggregation: Ordering records and computing statistics
- String manipulation: Splitting, joining, padding, and reformatting record content
- System interfaces: Accessing VM/CMS system services, consoles, and devices
User-Defined Stages in REXX
Users can write custom pipeline stages in REXX, the standard scripting language for VM/CMS. REXX programs can read from and write to pipeline streams through a dedicated interface, enabling arbitrarily complex processing logic within the pipeline framework.
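A REXX stage typically loops reading a record from its input stream, transforming it, and writing the result to its output stream. A rough Python analogue of that read/transform/write loop (the function name and the generator-based protocol are inventions of this sketch, not the actual REXX interface):

```python
# Conceptual Python analogue (not REXX) of a user-defined stage: read
# each record from the input stream, transform it, and write it to the
# output stream. The reverse_stage name is invented for illustration.

def reverse_stage(input_stream):
    for record in input_stream:   # analogous to reading the next record
        yield record[::-1]        # analogous to writing a record out

out = list(reverse_stage(iter(["abc", "hello"])))
```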
Pipeline Notation
The notation is deliberately minimal:
stage1 | stage2 | stage3
The | character separates stages within a pipeline (as in Unix). An end character, commonly ?, separates multiple pipelines within one specification. Labels use : as a separator. This concise syntax allows complex data processing to be expressed in a single command line.
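As a toy illustration of how minimal the notation is, splitting a specification on the stage separator already yields its stages. Python is used here purely for demonstration; real CMS Pipelines parsing also handles end characters, labels, and stage operands:

```python
# Toy illustration: break a pipeline specification into its stages by
# splitting on the | separator. This deliberately ignores end characters,
# labels, and quoting, which a real parser must handle.

spec = "< input file | locate /ERROR/ | > error log"
stages = [s.strip() for s in spec.split("|")]
```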
Dynamic Topology
A running stage can dynamically redefine the pipeline topology — replacing itself with another pipeline segment or inserting additional stages before or after itself. This provides a level of runtime flexibility not found in traditional pipe implementations.
Subroutine Pipelines
A stage can define a subroutine pipeline within a larger pipeline, enabling modular composition of pipeline logic.
Example
A simple pipeline that reads a file, selects lines containing a pattern, and writes matching lines to another file:
< input file | locate /ERROR/ | > error log
A multi-pipeline example that splits records between two output files:
< input file | a: locate /WARNING/ | > warnings
? a: | > non_warnings
In the second example, the locate stage sends matching records to its primary output (the warnings file) and non-matching records to its secondary output (the non-warnings file), using the label a: to connect the secondary stream.
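The record routing in the second example can be mimicked in Python to show what the primary and secondary streams carry. This is a sketch, not the actual implementation; the function name and list-based streams are inventions of this illustration:

```python
# Sketch of locate-style routing: records containing the pattern go to
# the primary stream, all others to the secondary stream. Names and
# sample data are invented for illustration.

def locate(records, pattern):
    primary, secondary = [], []
    for rec in records:
        (primary if pattern in rec else secondary).append(rec)
    return primary, secondary

warnings, non_warnings = locate(
    ["WARNING: disk full", "job started", "WARNING: cpu high"],
    "WARNING",
)
```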
Evolution
After its mid-1980s release as a standalone product, CMS Pipelines became a standard part of VM/ESA in late 1991. The tool continued to evolve through the 1990s, with version 1.1.10 shipped with VM/ESA 2.3 in 1997. After that point, the version included in the base VM product was functionally frozen, though a more current version remained available separately.
In 1995, IBM released a TSO implementation called BatchPipeWorks as part of BatchPipes/MVS, bringing Hartmann’s pipeline concept to the MVS operating system environment.
Starting with z/VM 6.4 in November 2016, IBM re-included the current level of CMS Pipelines in the base z/VM product. The most recent stable release is version 1.1.12 sublevel 0012, dated June 2020. CMS Pipelines ships with z/VM through the current release (z/VM 7.4, released September 2024).
Relationship to Unix Pipes
CMS Pipelines was explicitly inspired by Unix pipes but diverged significantly:
| Feature | Unix Pipes | CMS Pipelines |
|---|---|---|
| Data unit | Byte stream | Discrete records |
| Buffering | Kernel buffers between stages | Lock-step, no buffering |
| Streams per stage | One input, one output | Multiple inputs and outputs |
| Topology | Linear | Branching, merging, dynamic |
| Built-in programs | OS utilities (grep, sort, etc.) | 200+ specialized stages |
| Stage language | Any executable | Built-in stages + REXX |
Current Relevance
CMS Pipelines remains an active component of IBM’s z/VM operating system. While z/VM is a niche platform compared to Linux or Windows, it serves critical roles in enterprise computing — particularly as the hypervisor for running thousands of Linux guests on IBM Z mainframes. In environments where z/VM’s CMS is used for system administration, CMS Pipelines continues to be a daily tool.
The Marist College VM community hosts the CMS/TSO Pipelines Runtime Library Distribution, and open-source reimplementations in Java and Swift demonstrate ongoing interest in Hartmann’s design beyond the mainframe world.
Why It Matters
CMS Pipelines represents one of the most ambitious extensions of the Unix pipe concept. While Unix pipes remain limited to linear, byte-stream processing, Hartmann demonstrated that the dataflow paradigm could be extended to support multi-stream, record-oriented processing with dynamic topology — all while maintaining the conceptual simplicity that makes pipes intuitive. The tool showed that the gap between a dataflow design and its executable implementation could be nearly eliminated, turning pipeline notation into an executable specification rather than just a convenience.
For anyone studying the history of dataflow programming, stream processing, or the influence of Unix concepts on other operating systems, CMS Pipelines is an essential case study in how a good idea can be adapted and extended for a fundamentally different computing environment.
Notable Uses & Legacy
IBM VM/CMS System Administration
CMS Pipelines is a standard system administration and data processing tool on VM/CMS and z/VM mainframes, used for file manipulation, log analysis, and automation.
Enterprise Mainframe Data Processing
Financial institutions, government agencies, and large enterprises running IBM mainframes use CMS Pipelines for batch data transformation and reporting.
BatchPipes/MVS on TSO
IBM ported the pipeline concept to the TSO/MVS environment as BatchPipeWorks, extending the reach of Hartmann's design to MVS-based mainframe shops.
SHARE User Group Community
CMS Pipelines has been a regular topic at SHARE conferences, the primary IBM mainframe user group, with dedicated sessions and workshops spanning decades.