S3
The simple, informal object system of the S language, built on generic functions and class attributes and still the most widely used OOP style in R today
Created by John Chambers and the Statistical Models in S team (Bell Laboratories)
S3 is the original, deliberately lightweight object-oriented system of the S statistical computing language, and it remains the most widely used object system in S’s open-source successor, R. Rather than relying on formal class declarations, S3 builds objects out of ordinary data with an added class attribute, and it organizes behavior through generic functions that dispatch to class-specific methods. Its informality is the point: S3 lets analysts attach polished, class-aware behavior — printing, summarizing, plotting, predicting — to their data with very little ceremony.
History & Origins
The S language was created at Bell Laboratories beginning in 1976 by John Chambers and colleagues as an interactive environment for data analysis and graphics. By version 3 of S, documented in the 1988 “Blue Book” (The New S Language), the language had been redesigned around functions and a notion of objects carrying a class attribute. This version number is the source of the name later applied to its object system: S3.
The class-and-method framework that the name now denotes was brought together and documented in 1992 in Statistical Models in S — the “White Book” — edited by John Chambers and Trevor Hastie. Faced with presenting a whole family of statistical models (linear models, generalized linear models, tree-based models, and more), the authors adopted a uniform design: a class of objects for each kind of model, generic functions for the computations performed on those objects, and a method of each generic for each class. An implementation of generic functions and method dispatch was described in an appendix to that book, establishing the conventions that R programmers still follow.
The label “S3” itself is essentially a retronym. It came into common use after the more formal S4 system arrived in 1998, when the community needed a name to distinguish the older, informal class mechanism (from S version 3) from the new one.
Design Philosophy
S3 embodies a “convention over configuration” philosophy that predates that slogan by decades:
- Objects are data plus a label. An S3 object is just an existing R value (a vector, list, or other structure) with a class attribute naming one or more classes. There is no separate class definition step.
- Behavior lives in generic functions. Instead of methods belonging to classes, a generic function such as print or summary is responsible for choosing which method to run based on the class of its argument.
- Naming is the dispatch mechanism. Methods follow the simple convention generic.class (for example print.lm or summary.data.frame). Defining a method is as easy as writing a function with the right name.
- Minimal ceremony, maximum reach. Because so little is required to participate, almost any object can be given rich, idiomatic behavior, which is why S3 spread across the entire ecosystem.
This flexibility is also S3’s main weakness: there is no enforcement that an object actually has the structure its class implies, so correctness depends on discipline and convention rather than on the system itself.
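The points above can be sketched in a few lines of R. This is a minimal illustration under stated assumptions: the class money, the constructor new_money(), and the generic describe() are hypothetical names invented for this example, not part of base R.

```r
# Hypothetical class "money": a plain numeric vector plus a class label.
# No class definition step is needed; the constructor is only a convention.
new_money <- function(amount, currency = "USD") {
  stopifnot(is.numeric(amount))
  structure(amount, currency = currency, class = "money")
}

# A generic is an ordinary function that hands control to UseMethod().
describe <- function(x, ...) UseMethod("describe")

# A method is an ordinary function named generic.class.
describe.money <- function(x, ...) {
  sprintf("%.2f %s", as.numeric(x), attr(x, "currency"))
}

price <- new_money(12.5)
describe(price)  # "12.50 USD"

# Nothing enforces structure: any value can be labelled "money",
# even one that the methods for that class cannot sensibly handle.
bogus <- structure("not a number", class = "money")
inherits(bogus, "money")  # TRUE
```

The last two lines show the weakness in practice: the label is accepted without any check, so a constructor that validates its input is a matter of discipline rather than something the system requires.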
Key Features
- Class attribute: An object’s class is a character vector, e.g. class(x) <- c("subclass", "superclass"). The vector order encodes an inheritance chain.
- Generic functions and UseMethod(): A generic is typically a one-line function that calls UseMethod("generic"). Dispatch constructs candidate method names of the form generic.class for each class in the object’s class vector.
- Default methods: A special pseudo-class, default, provides a fallback method (generic.default) used when no class-specific method matches.
- Inheritance via NextMethod(): When an object has several classes, a method can call NextMethod() to delegate to the method for the next class in the chain, giving a simple form of inheritance.
- Single dispatch: Standard S3 dispatch is based on the class of the first argument (with special handling for the arguments of primitive binary operators), in contrast to the multiple dispatch later offered by S4.
A small illustrative example in R:
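What follows is a minimal sketch rather than a definitive implementation. The generic speak() and the classes dog and animal are hypothetical names chosen for illustration; the code exercises the class vector, UseMethod(), a default method, and NextMethod().

```r
# Generic: a one-line function that hands control to UseMethod().
speak <- function(x, ...) UseMethod("speak")

# Fallback method, used when no class-specific method matches.
speak.default <- function(x, ...) "(no sound defined)"

# Method for the hypothetical parent class "animal".
speak.animal <- function(x, ...) "generic animal noise"

# Method for the hypothetical subclass "dog". NextMethod() delegates to the
# method for the next class in the class vector, here speak.animal.
speak.dog <- function(x, ...) {
  paste("Woof, then", NextMethod())
}

# The class vector is ordered from most specific to least specific.
rex <- structure(list(name = "Rex"), class = c("dog", "animal"))
tom <- structure(list(name = "Tom"), class = "animal")

speak(rex)  # "Woof, then generic animal noise"
speak(tom)  # "generic animal noise"
speak(42)   # "(no sound defined)": no method for numbers, so the default runs
```

Only the first argument's class drives the choice of method in each call, which is the single-dispatch behavior described in the list above.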
Evolution
S3’s most important evolution is that it was carried wholesale into R. When Ross Ihaka and Robert Gentleman began R at the University of Auckland in 1993 as a free, largely S-compatible language, they adopted the S3 model of classes and generic functions. With R’s first stable release, R 1.0.0, in 2000, S3 became the default object system used throughout the language.
The S language itself moved on to a more formal system, S4, introduced in 1998 in Programming with Data (the “Green Book”). S4 added formal class definitions with declared slots, validity checking, and multiple dispatch. R supports S4 as well and later added reference classes (informally “R5”), while the wider community contributed packages such as R6. Yet despite these more rigorous alternatives, S3 has never been displaced from everyday use. More recent work, notably Hadley Wickham’s book Advanced R and packages like vctrs, has focused on documenting good S3 practice and providing helpers for building robust S3 classes rather than on replacing the system.
Current Relevance
S3 is, by a wide margin, the most common object system in R today. The objects that R users handle constantly — data.frame, factor, the fitted models returned by lm() and glm(), dates and times, and countless package-defined types — are S3 objects. Every time a user types the name of such an object and sees a nicely formatted result, an S3 print method is responsible. The same pattern underlies summary(), plot(), predict(), and format().
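This can be checked in a stock R session. The sketch below uses only functions from the base and stats packages; the data values are made up for illustration.

```r
# Toy data and a fitted linear model (values are arbitrary).
df  <- data.frame(x = 1:3, y = c(2.1, 3.9, 6.2))
fit <- lm(y ~ x, data = df)

# Each of these everyday objects carries an S3 class attribute.
class(df)                    # "data.frame"
class(fit)                   # "lm"
class(factor(c("a", "b")))   # "factor"
class(Sys.Date())            # "Date"

# Typing an object's name at the console auto-prints it; for S3 objects that
# goes through print(), which dispatches to the method for the object's class.
# getS3method() looks such a method up, including ones a package registers
# without exporting them.
is.function(getS3method("print", "data.frame"))  # TRUE
is.function(getS3method("print", "lm"))          # TRUE
```

Calling methods(class = "lm") in the same session lists the other generics that have a method for fitted linear models.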
The tidyverse — including ggplot2, dplyr, and tibble — is built extensively on S3, and the overwhelming majority of packages on CRAN define S3 classes and methods. For most R programmers, S3 is not a historical curiosity but the first and most frequently used way to make their own data types behave like first-class citizens of the language.
Why It Matters
S3 demonstrated that object orientation does not require heavyweight class hierarchies or rigid declarations to be useful. By reducing an “object” to data with a class label and routing behavior through generic functions, it made it trivial for statisticians and data analysts — not just software engineers — to extend the language with their own well-behaved types. That low barrier to entry is precisely what allowed an entire ecosystem of statistical software to grow around a consistent set of generics.
Born from the practical need to present many statistical models through one uniform interface in the 1992 White Book, S3 has proven remarkably durable. More than three decades on, it remains woven into the daily experience of what is reportedly a community of millions of R users, a quiet but pervasive legacy of the design choices made in Bell Labs’ statistics research group.
Notable Uses & Legacy
Base R
Core R objects such as data.frame, factor, and the results of lm() and glm() are S3 objects; generics like print(), summary(), and plot() dispatch on their class
CRAN package ecosystem
The vast majority of R packages define and extend S3 classes and methods, making S3 the de facto object system of the R community
Tidyverse (ggplot2, dplyr, tibble)
Widely used data-science packages rely heavily on S3 classes and method dispatch to provide consistent print, format, and computation behavior
Statistical modeling functions
Model-fitting functions return S3 objects whose summary(), predict(), and plot() methods give analysts a uniform interface across regression, GLMs, and tree-based models