SQL
A declarative, special-purpose language for querying and managing data in relational databases, and one of the most widely used programming languages in the world
Created by Donald D. Chamberlin and Raymond F. Boyce (IBM)
SQL (Structured Query Language) is a declarative, special-purpose language for defining, querying, and manipulating data held in relational databases. Rather than describing how to fetch data step by step, an SQL statement describes what result is wanted—which rows, which columns, joined and filtered and aggregated in a particular way—and leaves it to the database’s query optimizer to work out an efficient execution plan. This separation of intent from mechanism, combined with a readable, English-like syntax, has made SQL one of the most widely used and longest-lived programming languages in the world. First designed at IBM in 1974, it remains the common tongue of data more than half a century later.
History & Origins
SQL grew directly out of the relational model of data, which Edgar F. “Ted” Codd introduced in a landmark 1970 paper written at IBM’s San Jose Research Laboratory. Codd proposed organizing data into relations—tables of rows and columns—and manipulating it with operations grounded in mathematical logic. To turn that theory into something practitioners could use, IBM built an experimental relational database system called System R.
In 1974, two IBM researchers, Donald D. Chamberlin and Raymond F. Boyce, designed a query language for System R that they called SEQUEL (Structured English Query Language). Their goal was a language accessible to people who were not trained mathematicians: queries should read something like structured English while still resting on the rigor of the relational model. Boyce, whose name also survives in Boyce–Codd normal form, died tragically young in 1974, not long after the language’s initial design.
The name SEQUEL was later dropped, reportedly for trademark reasons—it is said to have conflicted with a mark held by the UK firm Hawker Siddeley—and the language was renamed SQL. Although it is often pronounced “sequel” for that historical reason, the letters officially stand for Structured Query Language. System R proved that a relational database driven by such a language was practical, and its influence quickly spread beyond IBM.
Standardization
SQL’s real triumph was becoming a standard rather than a single vendor’s product. By the late 1970s, commercial systems were appearing: Relational Software, Inc.—soon renamed Oracle—shipped Oracle Version 2 in 1979, generally regarded as the first commercially available SQL relational database. As competing implementations multiplied, the need for a common definition grew.
- SQL-86 — the first ANSI standard (ANSI X3.135-1986), adopted by ISO the following year.
- SQL-89 — a modest revision adding integrity constraints.
- SQL-92 (SQL2) — a major expansion of the language; still a common reference point for “standard SQL.”
- SQL:1999 (SQL3) — recursive queries, triggers, and object-relational features.
- SQL:2003 — window functions, standardized identity/sequence generation, and SQL/XML.
- SQL:2006 / 2008 / 2011 — further refinements, including improved XML support and temporal (system-versioned) tables.
- SQL:2016 — JSON functions for building and querying JSON held in string columns, plus row pattern matching.
- SQL:2023 — the current edition (ISO/IEC 9075:2023), adding property graph queries (SQL/PGQ) and a dedicated JSON data type.
In practice, every database vendor implements a dialect: a large common core defined by the standard plus proprietary extensions. Portability across systems is real but rarely perfect.
Design Philosophy
SQL is fundamentally declarative and set-oriented. You state the shape of the answer you want, and the database’s optimizer decides how to compute it—which indexes to use, in what order to join tables, whether to scan or seek. This is a sharp contrast to the imperative loops of general-purpose languages, and it is the single most important idea to internalize when learning SQL: you operate on whole sets of rows at once, not one record at a time.
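The contrast between row-at-a-time and set-oriented thinking can be sketched with a small example. It runs the SQL through Python's built-in sqlite3 module purely so that it is executable; the products table and its data are hypothetical, invented for illustration.

```python
import sqlite3

# Hypothetical table and rows, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, category TEXT, price_cents INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [("pen", "office", 200), ("desk", "office", 15000), ("mug", "kitchen", 800)],
)

# Imperative style: pull every row into the application and loop over it.
loop_total = 0
for category, price_cents in conn.execute("SELECT category, price_cents FROM products"):
    if category == "office":
        loop_total += price_cents

# Declarative style: describe the result and let the engine compute it.
(sql_total,) = conn.execute(
    "SELECT SUM(price_cents) FROM products WHERE category = 'office'"
).fetchone()

# Set-oriented change: one statement updates every matching row at once.
cursor = conn.execute(
    "UPDATE products SET price_cents = price_cents + 100 WHERE category = 'office'"
)
rows_changed = cursor.rowcount

print(loop_total, sql_total, rows_changed)  # 15200 15200 2
```

Both approaches give the same total, but only the SQL versions leave the choice of access path to the database engine.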
The language is organized into a few sublanguages that reflect different jobs:
- DQL / DML (Data Query & Manipulation) — SELECT, INSERT, UPDATE, DELETE for reading and changing data.
- DDL (Data Definition) — CREATE, ALTER, DROP for defining tables, views, indexes, and constraints.
- DCL (Data Control) — GRANT, REVOKE for permissions.
- TCL (Transaction Control) — COMMIT, ROLLBACK, SAVEPOINT for grouping changes into atomic transactions.
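As a rough illustration of how these sublanguages fit together, the sketch below runs one statement from most categories against an in-memory SQLite database through Python's sqlite3 module. The accounts table is hypothetical. SQLite has no user accounts, so the DCL statements appear only in a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a table (hypothetical schema for illustration).
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, owner TEXT NOT NULL, balance INTEGER NOT NULL)"
)

# DML: add, change, and remove rows.
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('ada', 100), ('bob', 50), ('cy', 0)")
conn.execute("UPDATE accounts SET balance = balance + 25 WHERE owner = 'bob'")
conn.execute("DELETE FROM accounts WHERE balance = 0")

# TCL: make the changes above permanent.
conn.commit()

# DQL: read the data back.
rows = conn.execute("SELECT owner, balance FROM accounts ORDER BY owner").fetchall()
print(rows)  # [('ada', 100), ('bob', 75)]

# DCL is not available in SQLite; on a server database it might look like:
#   GRANT SELECT ON accounts TO some_role;
#   REVOKE SELECT ON accounts FROM some_role;

# DDL again: change and then remove the table definition.
conn.execute("ALTER TABLE accounts ADD COLUMN note TEXT")
conn.execute("DROP TABLE accounts")
remaining = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
```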
Underpinning all of this is the promise of ACID transactions (Atomicity, Consistency, Isolation, Durability), which lets many users read and write the same data concurrently without corrupting it.
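Atomicity, the first of these properties, can be sketched with a funds transfer that either completes fully or leaves no trace. The example assumes SQLite via Python's sqlite3 module; the accounts table and the transfer helper are hypothetical names introduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "owner TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("ada", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move funds in one transaction; return True if it committed."""
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


ok = transfer(conn, "ada", "bob", 30)
failed = transfer(conn, "ada", "bob", 500)
balances = dict(conn.execute("SELECT owner, balance FROM accounts"))
print(ok, failed, balances)  # True False {'ada': 70, 'bob': 80}
```

In the failed transfer the credit to the receiver runs first and succeeds, yet it is rolled back along with the rejected debit, which is the all-or-nothing behavior atomicity promises.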
Key Features
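A SELECT statement is the heart of the language. It combines several clauses that map closely to relational operations: SELECT chooses the columns to return (projection), FROM names the source tables, WHERE filters rows (selection), GROUP BY and HAVING form and filter groups, and ORDER BY sorts the result.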
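A query that uses each of the main SELECT clauses might look like the following sketch. It is run through Python's sqlite3 module so that it is executable; the orders table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, status TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ada", "paid", 40),
        ("ada", "paid", 70),
        ("bob", "paid", 20),
        ("bob", "cancelled", 90),
        ("cy", "paid", 60),
    ],
)

query = """
SELECT customer, SUM(amount) AS total   -- columns to return
FROM orders                             -- source table
WHERE status = 'paid'                   -- filter rows before grouping
GROUP BY customer                       -- one group per customer
HAVING SUM(amount) >= 50                -- filter whole groups
ORDER BY total DESC                     -- sort the final result
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('ada', 110), ('cy', 60)]
```

The clauses are written in this fixed order, although the database is free to evaluate them in whatever way its optimizer finds cheapest.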
Beyond basic queries, standard SQL offers:
- Joins — combining rows from multiple tables (INNER, LEFT, RIGHT, FULL OUTER, CROSS).
- Aggregation and grouping — GROUP BY with functions like SUM, COUNT, AVG, MIN, MAX.
- Subqueries and common table expressions (CTEs) — including recursive CTEs for hierarchical and graph-like data.
- Window functions — ROW_NUMBER(), RANK(), SUM() OVER (...) and similar, which compute values across related rows without collapsing them.
- Views — named, stored queries that behave like virtual tables.
- Constraints — PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, and NOT NULL to enforce data integrity.
- Transactions — grouping statements so they succeed or fail as a unit.
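Three of these features (an outer join with aggregation, a recursive CTE, and window functions) are sketched below against a small hypothetical schema. The example assumes Python's sqlite3 module linked against SQLite 3.25 or later, since older SQLite versions lack window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE employees (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    salary INTEGER NOT NULL CHECK (salary > 0),
    dept_id INTEGER REFERENCES departments(id),
    manager_id INTEGER REFERENCES employees(id)
);
INSERT INTO departments VALUES (1, 'eng'), (2, 'ops'), (3, 'legal');
INSERT INTO employees VALUES
    (1, 'ada', 300, 1, NULL),
    (2, 'bob', 200, 1, 1),
    (3, 'cy',  100, 2, 2);
""")

# LEFT JOIN with aggregation: every department, even one with no staff.
headcount = conn.execute("""
    SELECT d.name, COUNT(e.id)
    FROM departments AS d
    LEFT JOIN employees AS e ON e.dept_id = d.id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()

# Recursive CTE: walk the reporting chain down from the top manager.
chain = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, c.depth + 1
        FROM employees AS e
        JOIN chain AS c ON e.manager_id = c.id
    )
    SELECT name, depth FROM chain ORDER BY depth
""").fetchall()

# Window functions: rank and total computed without collapsing rows.
ranked = conn.execute("""
    SELECT name,
           salary,
           RANK() OVER (ORDER BY salary DESC) AS pay_rank,
           SUM(salary) OVER () AS payroll
    FROM employees
    ORDER BY pay_rank
""").fetchall()

print(headcount)  # [('eng', 2), ('legal', 0), ('ops', 1)]
print(chain)      # [('ada', 0), ('bob', 1), ('cy', 2)]
print(ranked)     # [('ada', 300, 1, 600), ('bob', 200, 2, 600), ('cy', 100, 3, 600)]
```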
SQL also has some famous rough edges, most notably its three-valued logic: comparisons involving NULL yield unknown rather than true or false, which surprises newcomers and requires IS NULL / IS NOT NULL tests.
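The effect of three-valued logic is easy to see in a small experiment. The sketch below uses Python's sqlite3 module with a hypothetical contacts table; the names helper is introduced here only to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?)",
    [("ada", "555-0100"), ("bob", None)],  # None is stored as NULL
)


def names(where):
    """Return names matching a WHERE clause (fixed literals only, not user input)."""
    sql = f"SELECT name FROM contacts WHERE {where} ORDER BY name"
    return [row[0] for row in conn.execute(sql)]


equals_null = names("phone = NULL")        # comparison is unknown for every row
not_equal = names("phone <> '555-0100'")   # bob's NULL phone is unknown, not true
is_null = names("phone IS NULL")
is_not_null = names("phone IS NOT NULL")
(null_eq_null,) = conn.execute("SELECT NULL = NULL").fetchone()

print(equals_null, not_equal, is_null, is_not_null, null_eq_null)
# [] [] ['bob'] ['ada'] None
```

A WHERE clause keeps only rows for which the condition is true, so rows where it evaluates to unknown are silently dropped, which is why the first two queries return nothing.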
Evolution
Over five decades SQL has steadily absorbed features that once seemed alien to a “relational” language. SQL:1999 brought recursive queries and object-relational types; SQL:2003 added window functions that transformed analytical querying; more recent editions added JSON support (SQL:2016) and property-graph querying (SQL:2023), acknowledging that relational systems increasingly need to handle semi-structured and connected data.
Each major database has also grown a procedural companion language—Oracle’s PL/SQL, Microsoft and Sybase’s Transact-SQL, PostgreSQL’s PL/pgSQL—that wraps SQL statements in variables, loops, and control flow for stored procedures and triggers. Meanwhile, the “NoSQL” movement of the 2000s and 2010s challenged SQL’s dominance for certain workloads, only for many of those systems to later add SQL-like query layers, and for “NewSQL” databases to reassert relational guarantees at scale.
Current Relevance
SQL is not a legacy curiosity; it is core infrastructure. It powers the databases behind an enormous share of the world’s websites, mobile apps, and enterprise systems, and it consistently ranks among the most-used languages in large developer surveys. Cloud data warehouses such as Snowflake, Google BigQuery, and Amazon Redshift use SQL as their primary interface, and big-data engines like Apache Spark and Apache Flink offer SQL layers so that analysts can query distributed datasets with familiar syntax. Learning SQL remains one of the highest-leverage skills for anyone who works with data, from software engineers to analysts and data scientists.
Why It Matters
SQL demonstrated that a declarative, standardized language could sit atop a rigorous mathematical model and still be usable by ordinary practitioners. It decoupled the question from the machinery of answering it, enabling decades of improvement in query optimizers and storage engines without breaking the queries people had already written. Few technologies from the 1970s remain not just alive but genuinely central to modern computing; SQL is one of them, and its blend of theoretical foundations, standardization, and practical staying power makes it one of the most consequential languages in the history of software.
Notable Uses & Legacy
Relational database systems
SQL is the query and data-definition language for essentially every major relational database, including Oracle Database, Microsoft SQL Server, MySQL, MariaDB, PostgreSQL, IBM Db2, and SQLite.
Web and application backends
A large share of dynamic websites and business applications store their data in relational databases and use SQL (often through an ORM or query builder) to read and write it.
Banking, finance, and enterprise systems
Transactional systems in banking, insurance, retail, and enterprise resource planning rely on SQL databases for their ACID guarantees, which keep data correct under concurrent access.
Analytics and data warehousing
Business intelligence and analytics platforms, including cloud warehouses such as Amazon Redshift, Google BigQuery, and Snowflake, expose SQL as the primary interface for querying very large datasets.
Big data and streaming engines
SQL-based dialects and layers such as Apache Hive's HiveQL, Spark SQL, and Apache Flink SQL bring familiar declarative querying to distributed batch and stream processing.