Est. 1994 Advanced

MPI (Message Passing Interface)

The community-defined standard that became the lingua franca of parallel supercomputing — a portable message-passing interface letting thousands or millions of processes coordinate across the world's largest machines.

Created by The MPI Forum (early drafts by Jack Dongarra, Tony Hey, and David W. Walker)

Paradigm: Parallel / distributed-memory message passing; SPMD model
Typing: Library specification (no language of its own); messages are typed via the MPI_Datatype system, bound into C and Fortran
First Appeared: 1994
Latest Version: MPI 5.0 (approved June 2025)

MPI, the Message Passing Interface, is not a programming language in the ordinary sense — it is a standardized specification for a library of routines that processes use to send messages to one another. Yet within high-performance computing it functions like a universal language: it is the common vocabulary in which scientific codes describe how thousands, or even millions, of cooperating processes exchange data while running on the world’s largest supercomputers. For three decades it has been the dominant model for programming distributed-memory parallel machines, and an enormous body of scientific software is written to its interface.

MPI was defined not by a single company or individual but by the MPI Forum, an open community of vendors, national laboratories, and academic researchers who set out in the early 1990s to end the chaos of incompatible, vendor-specific message-passing systems. The result — first released as MPI 1.0 in June 1994 — was a portable interface that a program could be written against once and then run on machines from many different manufacturers.

History & Origins

In the early 1990s, parallel computing was fragmented. Each parallel-machine vendor shipped its own message-passing library — Intel had NX, IBM and others had their own systems — and portable research efforts such as PVM (Parallel Virtual Machine, from Oak Ridge National Laboratory), p4, Express, PARMACS, Zipcode, and Chameleon each offered different abstractions. A program written for one platform usually had to be substantially rewritten to run on another. The community wanted a single, portable standard.

The effort coalesced at a workshop in Williamsburg, Virginia, in April 1992, where the idea of an open standardization body, the MPI Forum, took shape. That November, Jack Dongarra, Tony Hey, and David W. Walker circulated a preliminary draft proposal nicknamed “MPI1.” The Forum began meeting regularly in January 1993, eventually drawing participants from approximately 40 organizations, and presented a draft at the Supercomputing ’93 conference in November 1993. After a public comment period, MPI 1.0 was released in June 1994.

The design consciously borrowed from its predecessors: MPI’s portability-first philosophy and many of its concepts were strongly influenced by PVM, while ideas about communicators, datatypes, and collective operations drew on systems like p4, Express, Zipcode, and PARMACS. Crucially, where PVM emphasized building a virtual machine and dynamic process management, MPI prioritized portable high performance across many vendors’ hardware — a difference in goals that shaped many of its design choices.

Design Philosophy

MPI’s design rests on a few durable principles.

Portability without sacrificing performance

The Forum’s central goal was that a correct MPI program should run unmodified on any conforming implementation, from a laptop to the largest supercomputer — while still allowing each vendor to map the standard onto their hardware as efficiently as possible. The specification defines what the routines do and the semantics they guarantee, but deliberately leaves the how to implementers.

Explicit, programmer-controlled communication

MPI follows an explicit message-passing model, typically used in a Single Program, Multiple Data (SPMD) style: every process runs the same program but operates on its own portion of the data and decides, in code, exactly when and what to communicate. There is no shared memory across processes and no hidden communication. This explicitness is demanding but gives the programmer precise control over data movement — often the deciding factor in parallel performance.

Safety through communicators

A signature MPI concept is the communicator, a named context that groups a set of processes and isolates their messages. Because a library can be given its own communicator, its messages can never accidentally collide with the application’s — a feature that made it safe to build large, composable parallel software stacks.

A typed message system

Although MPI itself adds no new programming language, it defines a rich system of datatypes (MPI_Datatype). Beyond basic types like MPI_INT and MPI_DOUBLE, programmers can build derived datatypes describing strided, structured, or non-contiguous data layouts, so the library can move complex data structures efficiently and portably between machines that may even differ in byte order.

Key Features

  • Point-to-point communication — blocking and nonblocking sends and receives (MPI_Send, MPI_Recv, MPI_Isend, MPI_Irecv) between specific process pairs.
  • Collective communication — coordinated group operations such as MPI_Bcast (broadcast), MPI_Reduce, MPI_Allreduce, MPI_Scatter, MPI_Gather, and barriers, with nonblocking variants added in MPI-3.0.
  • Communicators and groups — logical scoping of processes and message contexts, including Cartesian and graph topologies that map processes onto the structure of a problem.
  • Derived datatypes — descriptions of complex, non-contiguous data for efficient transfer.
  • One-sided communication (RMA) — remote memory access (MPI_Put, MPI_Get, MPI_Accumulate) introduced in MPI-2 and overhauled in MPI-3.0, where one process accesses another’s memory without an explicit matching receive.
  • Parallel I/O (MPI-IO) — coordinated access by many processes to shared files, added in MPI-2.
  • Dynamic process management — spawning and connecting processes at runtime (MPI-2).
  • Persistent collectives, partitioned communication, large-count routines, and sessions, all added in MPI-4.0 to address modern, accelerator-rich, multithreaded systems (persistent point-to-point requests date back to MPI-1).

A minimal “hello” in C illustrates the model:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* my id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total processes */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();
    return 0;
}

Every process runs this same program; MPI_COMM_WORLD is the default communicator containing all of them, and each discovers its own rank (identity) and the total size of the job.

Evolution

MPI grew through a deliberate, consensus-driven sequence of standards rather than the whims of a single owner:

| Version | Year | Highlights |
| --- | --- | --- |
| MPI 1.0 | 1994 | Core point-to-point and collective model, communicators, datatypes |
| MPI 1.1 | 1995 | Clarifications and corrections |
| MPI-2.0 | 1997 | Parallel I/O, one-sided communication, dynamic processes, C++ bindings |
| MPI 2.1 | 2008 | Combined MPI-1 and MPI-2 into one document |
| MPI 2.2 | 2009 | Refinements; C++ bindings deprecated |
| MPI-3.0 | 2012 | Nonblocking collectives, revamped RMA, Fortran 2008 bindings; C++ bindings removed |
| MPI-3.1 | 2015 | Corrections, nonblocking collective I/O |
| MPI-4.0 | 2021 | Large counts, persistent collectives, partitioned communication, sessions |
| MPI-4.1 | 2023 | Clarifications and incremental additions |
| MPI-5.0 | 2025 | Standardized application binary interface (ABI) |

One notable course-correction was the C++ bindings: introduced in MPI-2.0, they proved to add little over the C interface while complicating maintenance and binary distribution (in part because C++ name mangling is not standardized), so they were deprecated in MPI-2.2 and removed in MPI-3.0. Most C++ programs simply call the C bindings or use third-party wrappers such as Boost.MPI.

The standard is brought to life by its implementations. The first and most influential is MPICH, developed at Argonne National Laboratory and Mississippi State University starting in 1992; it serves as a reference and a foundation that many vendors build upon. LAM/MPI (from the Ohio Supercomputer Center) was another early open implementation, and Open MPI was later formed by merging the FT-MPI, LA-MPI, LAM/MPI, and PACX-MPI projects. MVAPICH (Ohio State University) specialized in high-speed InfiniBand networks, and vendors including HPE/Cray, Intel, Microsoft, and NEC ship their own tuned, often MPICH-derived, implementations.

Current Relevance

More than thirty years after its debut, MPI remains the dominant programming model for distributed-memory high-performance computing. All three U.S. exascale systems (Frontier, El Capitan, and Aurora) rely on MPICH-derived MPI implementations, and nearly every application in the U.S. Exascale Computing Project is built on MPI, typically paired with on-node parallelism such as OpenMP or CUDA in an “MPI+X” hybrid style. In recognition of this foundational role, the MPICH team at Argonne received the 2024 ACM Software System Award.

The interface also reaches well beyond C and Fortran. Bindings such as mpi4py (Python), Boost.MPI (C++), MPJ Express and mpiJava (Java), Rmpi/pbdMPI (R), and others bring MPI’s communication model to a wide range of languages, including modern data-science and machine-learning workflows that need to scale across many nodes. The 2025 standardization of an ABI in MPI-5.0 is aimed squarely at the future, making it possible to build software once and run it against any conforming implementation.

Why It Matters

MPI is one of computing’s great success stories of standardization by community. Faced with a Tower of Babel of incompatible parallel libraries, a diverse coalition of competitors and researchers agreed on a common interface — and in doing so unlocked decades of portable, reusable scientific software. Codes written against MPI in the 1990s can still run, and scale far further, on machines their authors could never have imagined. From climate prediction and drug discovery to astrophysics and engineering, the simulations that underpin much of modern science speak MPI. Its endurance demonstrates that a carefully designed, vendor-neutral interface can outlast the hardware it was created for, becoming the stable foundation on which an entire field is built.


Sources: Message Passing Interface — Wikipedia, MPI Documents — MPI Forum, MPI: A Message-Passing Interface Standard Version 5.0, Cornell Virtual Workshop: MPI History and Evolution, MPICH: A High-Performance, Portable Implementation of MPI — Argonne National Laboratory, Argonne team honored for powering the communication backbone of supercomputing, Goals Guiding Design: PVM and MPI (Gropp & Lusk).

Timeline

1992
A workshop on standards for message passing is held in Williamsburg, Virginia (April 29-30, 1992), and the MPI Forum forms as an open-membership body. In November 1992 Jack Dongarra, Tony Hey, and David W. Walker circulate a preliminary draft proposal known as 'MPI1'.
1993
The MPI Forum begins regular meetings in January 1993, drawing participants from approximately 40 organizations across academia, national laboratories, and industry. A draft of the standard is presented at the Supercomputing '93 conference in November 1993.
1994
MPI 1.0 is released in June 1994 after a public comment period, defining the core point-to-point and collective communication model, communicators, and derived datatypes, with bindings for C and Fortran.
1995
MPI 1.1 is released in June 1995, correcting errors and clarifying ambiguities in the original document rather than adding major new functionality.
1997
MPI-2.0 is released (July 1997), greatly expanding the standard with parallel I/O (MPI-IO), one-sided (remote memory access) communication, dynamic process management, and C++ bindings.
2008
MPI 2.1 is released (June 2008), consolidating the MPI-1 and MPI-2 documents into a single combined standard.
2009
MPI 2.2 is released (September 2009) with further clarifications and small additions; the C++ bindings are deprecated in this version.
2012
MPI-3.0 is released (September 2012), adding nonblocking collective operations, a substantially revised one-sided communication interface, neighborhood collectives, and Fortran 2008 bindings. The little-used C++ bindings are removed.
2015
MPI-3.1 is released (June 2015), refining the standard with corrections and improvements such as nonblocking collective I/O.
2021
MPI-4.0 is approved (June 2021), introducing large-count routines (for element counts beyond the 2^31 - 1 limit of a 32-bit signed int count argument), persistent collective operations, partitioned communication, and sessions.
2023
MPI-4.1 is approved (November 2023), with clarifications and incremental additions building on the MPI-4.0 feature set.
2025
MPI-5.0 is approved (June 2025), notably standardizing a stable application binary interface (ABI) so that programs can switch between conforming MPI implementations without recompilation.

Notable Uses & Legacy

Exascale supercomputers (Frontier, El Capitan, Aurora)

The first U.S. exascale systems run MPICH-derived MPI implementations as the communication backbone for their largest simulations, coordinating across hundreds of thousands of CPU cores and tens of thousands of GPUs.

Climate and weather modeling

Large-scale community models such as those used for numerical weather prediction and climate research are parallelized with MPI, distributing the simulation grid across many nodes of a cluster or supercomputer.

Molecular dynamics (GROMACS, LAMMPS, NAMD)

Widely used molecular-dynamics packages rely on MPI to decompose atomic systems across processes, enabling biomolecular and materials simulations that span thousands of cores.

Computational fluid dynamics and engineering

CFD solvers and finite-element codes used in aerospace, automotive, and energy engineering use MPI domain decomposition to scale simulations of turbulence, combustion, and structural mechanics.

U.S. Department of Energy / Exascale Computing Project

Nearly all application projects in the DOE Exascale Computing Project build on MPI, often combined with on-node parallelism such as OpenMP or CUDA, making it foundational to U.S. scientific computing infrastructure.

mpi4py in scientific Python

Through the mpi4py bindings, MPI powers distributed-memory parallelism in Python-based scientific and machine-learning workflows, bringing supercomputer-scale communication to a high-level language.

Language Influence

Influenced By

PVM, p4, Express, Zipcode, PARMACS

Influenced

mpi4py, Boost.MPI, MPJ Express

Running Today

Run examples using the official Docker image:

docker pull