Est. 1976 Intermediate

SAS

A procedural, fourth-generation language and analytics environment for data management and statistical analysis, built around the DATA step and an extensive library of procedures

Created by Anthony Barr, James Goodnight, John Sall, and Jane Helwig (SAS Institute)

Paradigm Procedural, statistical; fourth-generation data-step and procedure-based language
Typing Dynamic, weak; two fundamental data types (numeric and character)
First Appeared 1976
Latest Version SAS 9.4 Maintenance 9 (2025); SAS Viya on a continuous cloud release cadence

SAS — originally an acronym for Statistical Analysis System — is a procedural, fourth-generation programming language and a broad commercial software environment for data management, advanced analytics, and reporting. More than a single language, SAS is an integrated platform: a programming language wrapped around a vast library of pre-built statistical and data-processing procedures, all designed so that an analyst can read raw data, transform it, run rigorous statistical methods on it, and produce publication-ready output without leaving the system. For decades it has been the backbone of analytics in regulated industries where correctness, validation, and reproducibility matter more than novelty.

History & Origins

SAS grew out of an academic effort at North Carolina State University. Beginning around 1966, Anthony Barr was hired to write software for analysis of variance and regression on IBM System/360 mainframes, work funded by a consortium of agricultural experiment stations across the southern United States that needed a common tool for analyzing crop and agricultural data. Barr designed the fundamental structure and language; James Goodnight joined in 1968 and contributed the statistical engine, including general linear modeling.

Early releases circulated under year-based names — a limited SAS 71, followed by SAS 72, the first broadly distributed version, which introduced staples like the MERGE statement and missing-data handling. As demand grew beyond the original consortium, the founders — Barr, Goodnight, John Sall, and Jane Helwig, who wrote the first documentation — spun the project out of the university. SAS Institute Inc. was incorporated in 1976, the year most commonly cited as SAS’s debut as a commercial product. The company settled in North Carolina’s Research Triangle, eventually in Cary, where it remains headquartered today.

From there SAS grew into one of the most successful privately held software companies in the world. Goodnight, still the company’s CEO, and Sall together own the firm, which has famously remained private rather than going public.

Design Philosophy

SAS is built on a deliberately pragmatic, analyst-centered philosophy: give domain experts — statisticians, epidemiologists, actuaries — a language in which data preparation, analysis, and reporting are first-class and tightly integrated, and ship batteries-included procedures so that users call validated methods rather than reimplementing algorithms.

This philosophy shows up in the two-part rhythm of nearly all SAS programs:

  • The DATA step is SAS’s data-engineering workhorse. It reads, transforms, joins, and reshapes data, executing an implicit loop that processes one observation (row) at a time through an in-memory buffer known as the Program Data Vector. The DATA step blends executable logic with declarative statements and gives the programmer fine-grained control over how each record is built.
  • The PROC step invokes one of hundreds of pre-built procedures: PROC MEANS, PROC FREQ, PROC REG, PROC GLM, PROC SQL, and many more. Rather than writing a regression from scratch, the analyst calls the procedure and supplies options.
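
The implicit loop is easier to see when the Program Data Vector is written to the log on every pass. The following is a minimal sketch rather than an excerpt from SAS documentation: the dataset name demo and the derived variable doubled are illustrative assumptions.

```sas
/* Sketch: watch the implicit loop fill the Program Data Vector (PDV). */
/* The names demo and doubled are hypothetical, chosen for illustration. */
data demo;
   /* Each pass of the implicit loop reads one data line into the PDV */
   input region $ amount;

   /* Executable logic runs once per observation */
   doubled = amount * 2;

   /* PUT _ALL_ writes every PDV variable to the log, including the
      automatic variables _N_ (iteration counter) and _ERROR_ */
   put _all_;

   /* At the end of each pass the PDV contents are written to demo
      automatically; no explicit loop or OUTPUT statement is needed */
   datalines;
North 1200
South 950
;
run;

/* A PROC step then consumes the dataset the DATA step built */
proc print data=demo;
run;
```

Running this should produce one log line per input record, with _N_ increasing on each pass, which is the row-at-a-time behavior the DATA step bullet describes.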

Because the procedures are tested, documented, and stable across decades, organizations in regulated fields can trust and audit their results — a major reason SAS became entrenched where validation is mandatory.

Key Features

Feature | What it provides
DATA step | Row-by-row data manipulation via the implicit-loop Program Data Vector model
PROC library | Hundreds of validated statistical, data-management, and reporting procedures
SAS macro language | Text-substitution metaprogramming (%MACRO, & macro variables) for code generation and reuse
ODS (Output Delivery System) | Renders output to HTML, PDF, RTF, Excel, and more
PROC SQL | Embedded SQL for relational querying inside SAS
Backward compatibility | Decades-old SAS programs typically still run on current releases
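
Several of these features can be combined in a few lines. The sketch below is illustrative only: the macro name big_sales, the macro variable threshold, and the output file sales_report.html are assumptions made for this example, and the file path may need adjusting to a writable location in a given SAS environment.

```sas
/* Hypothetical input table with the same shape as the article's sample data */
data sales;
   input region $ amount;
   datalines;
North 1200
South 950
East 1430
West 1100
;
run;

/* A macro variable: plain text that is substituted before the code runs */
%let threshold = 1000;

/* A macro that generates a PROC SQL query for the table and cutoff passed in.
   big_sales, table=, and cutoff= are illustrative names. */
%macro big_sales(table=, cutoff=);
   proc sql;
      select region, amount
      from &table
      where amount > &cutoff
      order by amount desc;
   quit;
%mend big_sales;

/* ODS routes the query result to an HTML file (assumed path) */
ods html file="sales_report.html";
%big_sales(table=sales, cutoff=&threshold)
ods html close;
```

The macro processor resolves &table and &cutoff into text first, so what SAS actually executes is an ordinary PROC SQL step; ODS then decides where the printed result goes.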

SAS’s type system is intentionally minimal: variables are either numeric or character, with no rich type hierarchy. This simplicity, combined with the implicit data-flow model of the DATA step, makes SAS approachable for analysts who are not professional programmers while still scaling to very large datasets.
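
A short sketch of the two-type model follows. The dataset and variable names are hypothetical. The date example relies on a fact not stated above, namely that SAS stores dates as numeric day counts from 1 January 1960.

```sas
/* Sketch of the two fundamental types; all names here are illustrative */
data types_demo;
   length name $ 8;          /* character variable, up to 8 bytes */
   name  = 'North';
   count = 42;               /* numeric variable */
   no_value = .;             /* a period is the numeric missing value */

   /* Dates are numeric too: a count of days since 1 January 1960,
      displayed as a date only because of the format attached */
   start = '01JAN2020'd;
   format start date9.;

   /* Weak typing: the character '5' is converted automatically in
      arithmetic, and SAS records the conversion as a note in the log */
   total = '5' + 1;

   /* Explicit conversion: INPUT for character to numeric,
      PUT for numeric to character */
   as_num  = input('1200', 8.);
   as_char = put(count, 8.);
run;

/* PROC CONTENTS lists each variable with its type (Num or Char) */
proc contents data=types_demo;
run;
```

Every variable in the PROC CONTENTS listing is reported as either Num or Char, including the date, which is the minimal type system the paragraph above describes.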

A Taste of the Language

A small program reads some data and summarizes it, illustrating the DATA-step / PROC-step pairing:

data sales;
   input region $ amount;
   datalines;
North 1200
South  950
East  1430
West  1100
;
run;

proc means data=sales mean sum;
   var amount;
run;

The data step builds a dataset named sales one observation at a time; the proc means step then computes summary statistics over it. The $ marks region as a character variable — one of SAS’s only two fundamental types.

Evolution

SAS evolved from year-named mainframe releases (SAS 71, 72, 76, 79, 82) into a sequentially versioned, multi-platform product. Version 6, beginning in the mid-1980s, ushered in a long, portable era in which the macro language matured and SAS was ported across many operating environments. SAS 8 (1999) introduced the Output Delivery System and the Enterprise Guide GUI, broadening SAS beyond pure programming. The SAS 9 platform (launched 2002, with rollout through 2004) added a metadata-server architecture and multithreaded procedures, and SAS 9.4 (2013) became the durable foundation of the modern 9.x line, still receiving maintenance releases — the latest being 9.4M9 in 2025.

The most significant recent shift is SAS Viya, launched in 2016: a cloud-native platform built for distributed, in-memory analytics and designed to interoperate with open-source languages, with its container- and Kubernetes-based architecture arriving in later releases. Viya represents SAS’s strategic answer to the rise of Python and R, letting those languages drive SAS’s analytics engine while retaining the governance and support enterprises expect.

Current Relevance

SAS Institute remains privately held and is among the largest privately owned software companies in the world, with annual revenue measured in the billions and a global workforce of more than ten thousand. Its position in the market is a study in contrasts. In academia and the broader data-science community, open-source R and Python have steadily displaced SAS, helped by zero licensing cost and enormous library ecosystems. Yet in regulated enterprise settings (pharmaceutical FDA submissions, banking risk and compliance, government statistics, insurance) SAS retains a deep moat, because the cost of switching away from validated, audited, decades-stable analytics pipelines is enormous.

SAS’s modernization strategy centers on Viya and on embracing rather than resisting open source: analysts can write Python or R against SAS’s compute engine, blending familiar tooling with enterprise-grade scalability and governance.

Why It Matters

SAS demonstrated, earlier and more thoroughly than almost any other system, that statistical computing could be packaged as a dependable industrial product. By integrating data wrangling, analysis, and reporting behind a consistent language and a library of validated procedures, it made sophisticated analytics accessible to domain experts and trustworthy enough for life-or-death decisions in medicine and high-stakes decisions in finance. Its DATA-step model, its macro system, and its procedure-driven workflow shaped how a generation of analysts thought about working with data. Even as open-source tools reshape the field, SAS’s influence — and its entrenched presence in the world’s most regulated industries — endures.

Timeline

1966
Development begins at North Carolina State University, where Anthony Barr starts building statistical analysis software for IBM System/360 mainframes; the project is funded by a consortium of agricultural experiment stations across the southern United States
1968
James Goodnight joins the project and contributes statistical routines, including general linear modeling, becoming a co-leader of the effort
1972
SAS 72, the first widely distributed full release, introduces features such as the MERGE statement and handling for missing data (an earlier limited release, SAS 71, preceded it)
1976
SAS Institute Inc. is incorporated as a private company in Raleigh, North Carolina, moving the project off campus; this is the commonly cited 'first appearance' year for SAS as a product
1985
Version 6 begins a long-lived, multi-host era; the SAS macro language matures and the software is ported across a wide range of operating environments
1989
JMP, a graphical statistical-discovery product led by John Sall, is introduced (initially for the Apple Macintosh)
1999
SAS 8 is released, introducing the Output Delivery System (ODS) for flexible reporting and the SAS Enterprise Guide point-and-click interface
2002
The SAS 9 platform launches (with broad rollout continuing through 2004), introducing a metadata-server-based architecture and multithreaded procedures
2013
SAS 9.4 is released (July), the foundation of the long-running 9.x line still maintained today
2016
SAS Viya launches, a cloud-native, microservices- and container-based analytics platform designed to interoperate with open-source languages such as Python and R
2025
SAS 9.4 Maintenance 9 (9.4M9) is released, continuing support for the traditional platform while SAS Viya advances on a rolling cloud release cadence

Notable Uses & Legacy

Pharmaceutical & clinical trials

SAS is the de facto standard for analyzing clinical-trial data and preparing regulatory submissions to the U.S. FDA, where validated, reproducible procedures and CDISC data standards are central to the approval process

Banking & financial services

Banks use SAS for credit scoring, risk modeling, regulatory compliance reporting, and fraud detection across large transactional datasets

Government & public sector

Statistical agencies, tax authorities, and census organizations use SAS for large-scale data processing, official statistics, and fraud or tax-evasion detection

Healthcare & life sciences

Health outcomes research, epidemiology, and payer/provider analytics rely on SAS for managing and analyzing complex, sensitive datasets

Insurance

Actuaries and insurers use SAS for pricing models, reserving, and predictive risk analytics

Academia & institutional research

Universities have long used SAS for statistics instruction and institutional research, though open-source tools have eroded its academic dominance in recent years

Language Influence

Influenced By

Influenced

JMP

Running Today

Run examples using the official Docker image:

docker pull