Est. 1974 | Beginner

SQL

A declarative, special-purpose language for querying and managing data in relational databases, and one of the most widely used programming languages in the world

Created by Donald D. Chamberlin and Raymond F. Boyce (IBM)

Paradigm: Declarative; special-purpose (domain-specific) query language for relational data
Typing: Static, strong (values constrained by column data types and domains); exact type system varies by implementation
First Appeared: 1974
Latest Version: SQL:2023 (ISO/IEC 9075:2023)

SQL (Structured Query Language) is a declarative, special-purpose language for defining, querying, and manipulating data held in relational databases. Rather than describing how to fetch data step by step, an SQL statement describes what result is wanted—which rows, which columns, joined and filtered and aggregated in a particular way—and leaves it to the database’s query optimizer to work out an efficient execution plan. This separation of intent from mechanism, combined with a readable, English-like syntax, has made SQL one of the most widely used and longest-lived programming languages in the world. First designed at IBM in 1974, it remains the common tongue of data more than half a century later.

History & Origins

SQL grew directly out of the relational model of data, which Edgar F. “Ted” Codd introduced in a landmark 1970 paper written at IBM’s San Jose Research Laboratory. Codd proposed organizing data into relations—tables of rows and columns—and manipulating it with operations grounded in mathematical logic. To turn that theory into something practitioners could use, IBM built an experimental relational database system called System R.

In 1974, two IBM researchers, Donald D. Chamberlin and Raymond F. Boyce, designed a query language for System R that they called SEQUEL (Structured English Query Language). Their goal was a language accessible to people who were not trained mathematicians: queries should read something like structured English while still resting on the rigor of the relational model. Boyce, whose name also survives in Boyce–Codd normal form, died tragically young in 1974, not long after the language’s initial design.

The name SEQUEL was later dropped, reportedly for trademark reasons (it is said to have conflicted with a mark held by the UK firm Hawker Siddeley), and the language was renamed SQL. That history is why many people still pronounce it “sequel”; the letters are usually expanded as Structured Query Language. System R proved that a relational database driven by such a language was practical, and its influence quickly spread beyond IBM.

Standardization

SQL’s real triumph was becoming a standard rather than a single vendor’s product. By the late 1970s, commercial systems were appearing: Relational Software, Inc.—soon renamed Oracle—shipped Oracle Version 2 in 1979, generally regarded as the first commercially available SQL relational database. As competing implementations multiplied, the need for a common definition grew.

  • SQL-86 — the first ANSI standard (ANSI X3.135-1986), adopted by ISO the following year.
  • SQL-89 — a modest revision adding integrity constraints.
  • SQL-92 (SQL2) — a major expansion of the language; still a common reference point for “standard SQL.”
  • SQL:1999 (SQL3) — recursive queries, triggers, and object-relational features.
  • SQL:2003 — window functions, standardized identity/sequence generation, and SQL/XML.
  • SQL:2006 / 2008 / 2011 — further refinements, including improved XML support and temporal (system-versioned) tables.
  • SQL:2016 — native JSON support and row pattern matching.
  • SQL:2023 — the current edition (ISO/IEC 9075:2023), adding property graph queries (SQL/PGQ) and a dedicated JSON data type.

In practice, every database vendor implements a dialect: a large common core defined by the standard plus proprietary extensions. Portability across systems is real but rarely perfect.

Design Philosophy

SQL is fundamentally declarative and set-oriented. You state the shape of the answer you want, and the database’s optimizer decides how to compute it—which indexes to use, in what order to join tables, whether to scan or seek. This is a sharp contrast to the imperative loops of general-purpose languages, and it is the single most important idea to internalize when learning SQL: you operate on whole sets of rows at once, not one record at a time.
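As a rough illustration of the set-oriented idea, the sketch below computes the same total twice: once with an application-side loop and once with a single declarative statement. It uses Python's built-in sqlite3 module and a hypothetical employees table invented for this example; other databases would behave similarly but are not shown.

```python
import sqlite3

# In-memory database with a small, made-up employees table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, department TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employees (department, salary) VALUES (?, ?)",
    [("sales", 100.0), ("sales", 200.0), ("support", 300.0)],
)

# Imperative style: pull every row into the application and filter one at a time.
loop_total = 0.0
for department, salary in conn.execute("SELECT department, salary FROM employees"):
    if department == "sales":
        loop_total += salary

# Set-oriented style: describe the result and let the engine decide how to get it.
(set_total,) = conn.execute(
    "SELECT SUM(salary) FROM employees WHERE department = 'sales'"
).fetchone()

# One statement also modifies a whole set of rows at once.
cur = conn.execute(
    "UPDATE employees SET salary = salary * 2 WHERE department = 'sales'"
)
rows_changed = cur.rowcount

print(loop_total, set_total, rows_changed)
```

Both approaches give the same total here, but only the second leaves the choice of access path (index, scan, join order) to the database.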

The language is organized into a few sublanguages that reflect different jobs:

  • DQL / DML (Data Query & Manipulation): SELECT, INSERT, UPDATE, DELETE for reading and changing data.
  • DDL (Data Definition): CREATE, ALTER, DROP for defining tables, views, indexes, and constraints.
  • DCL (Data Control): GRANT, REVOKE for permissions.
  • TCL (Transaction Control): COMMIT, ROLLBACK, SAVEPOINT for grouping changes into atomic transactions.
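The sublanguages listed above can be seen side by side in a short session. This is a minimal sketch using Python's sqlite3 module with a hypothetical accounts table; it assumes SQLite's dialect, which has no user accounts, so the DCL statements (GRANT, REVOKE) are left out rather than simulated.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# BEGIN / ROLLBACK statements below are issued explicitly by this script.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: define a table and an index.
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL)"
)
conn.execute("CREATE INDEX idx_accounts_owner ON accounts (owner)")

# DML: insert and update rows.
conn.execute("INSERT INTO accounts (owner, balance) VALUES ('ada', 100), ('bob', 50)")
conn.execute("UPDATE accounts SET balance = balance + 25 WHERE owner = 'bob'")

# TCL: start a transaction, make a change, then discard it as a unit.
conn.execute("BEGIN")
conn.execute("DELETE FROM accounts")
conn.execute("ROLLBACK")

# DQL: read the data back; the DELETE above left no trace.
rows = conn.execute("SELECT owner, balance FROM accounts ORDER BY owner").fetchall()
print(rows)

# DCL (GRANT / REVOKE) is omitted: SQLite has no users or privileges to manage.
```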

Underpinning all of this is the promise of ACID transactions (Atomicity, Consistency, Isolation, Durability), which lets many users read and write the same data concurrently without corrupting it.
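The atomicity part of that promise can be sketched with a toy money transfer. The accounts table, the CHECK constraint, and the transfer helper are all assumptions made for this example, and an in-memory SQLite database says nothing about isolation or durability; the point is only that a failed transaction leaves no partial changes behind.

```python
import sqlite3

# Autocommit mode, so BEGIN / COMMIT / ROLLBACK are controlled explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts ("
    " owner TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('ada', 100), ('bob', 50)")


def transfer(conn, src, dst, amount):
    """Hypothetical helper: move `amount` from src to dst, all or nothing."""
    conn.execute("BEGIN")
    try:
        # Credit first on purpose, so a failed debit leaves work to undo.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE owner = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE owner = ?", (amount, src)
        )
        conn.execute("COMMIT")
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft: undo the credit as well.
        conn.execute("ROLLBACK")
        return False


ok = transfer(conn, "ada", "bob", 30)       # succeeds
failed = transfer(conn, "ada", "bob", 500)  # would overdraw ada, so it is rolled back
balances = dict(conn.execute("SELECT owner, balance FROM accounts"))
print(ok, failed, balances)
```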

Key Features

A SELECT statement is the heart of the language, combining several clauses that map closely to relational operations:

SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM   employees
WHERE  active = TRUE
GROUP BY department
HAVING COUNT(*) > 5
ORDER BY avg_salary DESC;
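To try the query above without a database server, it can be run against a throwaway SQLite database from Python. The table layout and the sample rows below are invented for illustration, and because older SQLite versions lack the TRUE keyword, the sketch compares active against 1 instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (name TEXT, department TEXT, salary REAL, active INTEGER)"
)

# Made-up data: 6 active engineers, 7 active sales staff plus 1 inactive,
# and 2 support staff (too few to pass the HAVING filter).
rows = (
    [(f"eng{i}", "engineering", 120.0, 1) for i in range(6)]
    + [(f"sales{i}", "sales", 80.0, 1) for i in range(7)]
    + [("sales_old", "sales", 500.0, 0)]
    + [(f"sup{i}", "support", 60.0, 1) for i in range(2)]
)
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", rows)

query = """
SELECT department, COUNT(*) AS headcount, AVG(salary) AS avg_salary
FROM   employees
WHERE  active = 1          -- SQLite stores booleans as 0/1
GROUP BY department
HAVING COUNT(*) > 5
ORDER BY avg_salary DESC;
"""
result = conn.execute(query).fetchall()
for department, headcount, avg_salary in result:
    print(department, headcount, avg_salary)
```

The inactive row is removed by WHERE before grouping, while the support department is removed by HAVING after grouping, which is the practical difference between the two clauses.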

Beyond basic queries, standard SQL offers:

  • Joins: combining rows from multiple tables (INNER, LEFT, RIGHT, FULL OUTER, CROSS).
  • Aggregation and grouping: GROUP BY with functions like SUM, COUNT, AVG, MIN, MAX.
  • Subqueries and common table expressions (CTEs): including recursive CTEs for hierarchical and graph-like data.
  • Window functions: ROW_NUMBER(), RANK(), SUM() OVER (...) and similar, which compute values across related rows without collapsing them.
  • Views: named, stored queries that behave like virtual tables.
  • Constraints: PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, and NOT NULL to enforce data integrity.
  • Transactions: grouping statements so they succeed or fail as a unit.
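Three of the features in this list (an outer join, a recursive CTE, and a window function) are sketched below against a small hypothetical schema. The example assumes Python's sqlite3 module backed by SQLite 3.25 or newer, which is when SQLite gained window functions; the table and column names are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE employees (
    id            INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    salary        INTEGER NOT NULL CHECK (salary > 0),
    department_id INTEGER REFERENCES departments (id),
    manager_id    INTEGER REFERENCES employees (id)
);
INSERT INTO departments VALUES (1, 'engineering'), (2, 'sales'), (3, 'legal');
INSERT INTO employees VALUES
    (1, 'ada', 300, 1, NULL),
    (2, 'bob', 200, 1, 1),
    (3, 'cy',  200, 2, 1),
    (4, 'dee', 100, 2, 3);
""")

# LEFT JOIN: every department appears, even one with no employees.
headcounts = conn.execute("""
    SELECT d.name, COUNT(e.id)
    FROM departments AS d
    LEFT JOIN employees AS e ON e.department_id = d.id
    GROUP BY d.name
    ORDER BY d.name
""").fetchall()

# Recursive CTE: walk the management chain from the top down.
chain = conn.execute("""
    WITH RECURSIVE reports (id, name, depth) AS (
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        SELECT e.id, e.name, r.depth + 1
        FROM employees AS e
        JOIN reports AS r ON e.manager_id = r.id
    )
    SELECT name, depth FROM reports ORDER BY depth, name
""").fetchall()

# Window function: rank salaries within each department, keeping every row.
ranked = conn.execute("""
    SELECT name, RANK() OVER (PARTITION BY department_id ORDER BY salary DESC)
    FROM employees
    ORDER BY department_id, salary DESC
""").fetchall()

print(headcounts, chain, ranked, sep="\n")
```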

SQL also has some famous rough edges, most notably its three-valued logic: comparisons involving NULL yield unknown rather than true or false, which surprises newcomers and requires IS NULL / IS NOT NULL tests.
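A small experiment makes the NULL behaviour concrete. The sketch below uses Python's sqlite3 module with a hypothetical people table; the names helper exists only to shorten the example and interpolates its argument directly, which would be unsafe with untrusted input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, manager TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("ada", None), ("bob", "ada"), ("cy", "bob")],  # None is stored as NULL
)


def names(where):
    """Illustrative helper: names of rows matching a hard-coded WHERE clause."""
    sql = f"SELECT name FROM people WHERE {where} ORDER BY name"
    return [name for (name,) in conn.execute(sql)]


equals_null = names("manager = NULL")      # unknown for every row, so nothing matches
not_ada = names("manager <> 'ada'")        # the row with a NULL manager is also skipped
is_null = names("manager IS NULL")         # the correct test for missing values
is_not_null = names("manager IS NOT NULL")

# NULL = NULL is itself unknown; sqlite3 surfaces that NULL result as None.
(null_eq_null,) = conn.execute("SELECT NULL = NULL").fetchone()

print(equals_null, not_ada, is_null, is_not_null, null_eq_null)
```

WHERE keeps only rows for which the condition is true, so rows whose comparison evaluates to unknown are dropped just like rows where it is false.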

Evolution

Over five decades SQL has steadily absorbed features that once seemed alien to a “relational” language. SQL:1999 brought recursive queries and object-relational types; SQL:2003 added window functions that transformed analytical querying; more recent editions added JSON support (SQL:2016) and property-graph querying (SQL:2023), acknowledging that relational systems increasingly need to handle semi-structured and connected data.

Each major database has also grown a procedural companion language—Oracle’s PL/SQL, Microsoft and Sybase’s Transact-SQL, PostgreSQL’s PL/pgSQL—that wraps SQL statements in variables, loops, and control flow for stored procedures and triggers. Meanwhile, the “NoSQL” movement of the 2000s and 2010s challenged SQL’s dominance for certain workloads, only for many of those systems to later add SQL-like query layers, and for “NewSQL” databases to reassert relational guarantees at scale.

Current Relevance

SQL is not a legacy curiosity; it is core infrastructure. It powers the databases behind an enormous share of the world’s websites, mobile apps, and enterprise systems, and it consistently ranks among the most-used languages in large developer surveys. Cloud data warehouses such as Snowflake, Google BigQuery, and Amazon Redshift use SQL as their primary interface, and big-data engines like Apache Spark and Apache Flink offer SQL layers so that analysts can query distributed datasets with familiar syntax. Learning SQL remains one of the highest-leverage skills for anyone who works with data, from software engineers to analysts and data scientists.

Why It Matters

SQL demonstrated that a declarative, standardized language could sit atop a rigorous mathematical model and still be usable by ordinary practitioners. It decoupled the question from the machinery of answering it, enabling decades of improvement in query optimizers and storage engines without breaking the queries people had already written. Few technologies from the 1970s remain not just alive but genuinely central to modern computing; SQL is one of them, and its blend of theoretical foundations, standardization, and practical staying power makes it one of the most consequential languages in the history of software.

Timeline

1970
Edgar F. Codd, working at IBM's San Jose Research Laboratory, publishes 'A Relational Model of Data for Large Shared Data Banks', laying the theoretical foundation for relational databases and the query languages built on them
1974
Donald D. Chamberlin and Raymond F. Boyce at IBM San Jose design SEQUEL (Structured English Query Language) as the query language for the System R relational database prototype
1977
Around this period SEQUEL is renamed SQL, reportedly because 'SEQUEL' conflicted with a trademark of the UK firm Hawker Siddeley; a revised design, sometimes called SEQUEL/2, evolves into the language that becomes SQL
1979
Relational Software, Inc. (later Oracle Corporation) ships Oracle Version 2, generally regarded as the first commercially available SQL-based relational database management system
1986
ANSI publishes the first SQL standard (ANSI X3.135-1986), commonly called SQL-86, formalizing the language after years of vendor implementations
1987
The International Organization for Standardization (ISO) adopts SQL as an international standard (ISO 9075)
1992
SQL-92 (SQL2) is published, a large and influential revision that greatly expanded the language and became the baseline many implementations still reference
1999
SQL:1999 (SQL3) adds recursive queries (common table expressions), triggers, user-defined types, and other object-relational features
2003
SQL:2003 introduces window functions, standardized sequences and auto-generated identity columns, and SQL/XML for working with XML data
2016
SQL:2016 adds native JSON support, row pattern recognition (MATCH_RECOGNIZE), and polymorphic table functions
2023
SQL:2023 (ISO/IEC 9075:2023), the ninth edition of the standard, is published, adding the SQL/PGQ property graph query part and a dedicated JSON data type

Notable Uses & Legacy

Relational database systems

SQL is the query and data-definition language for essentially every major relational database, including Oracle Database, Microsoft SQL Server, MySQL, MariaDB, PostgreSQL, IBM Db2, and SQLite

Web and application backends

A large share of dynamic websites and business applications store their data in relational databases and use SQL (often through an ORM or query builder) to read and write it

Banking, finance, and enterprise systems

Transactional systems in banking, insurance, retail, and enterprise resource planning rely on SQL databases for their ACID guarantees, ensuring correctness under concurrent access

Analytics and data warehousing

Business intelligence and analytics platforms—including cloud warehouses such as Amazon Redshift, Google BigQuery, and Snowflake—expose SQL as the primary interface for querying very large datasets

Big data and streaming engines

SQL-based dialects and layers such as Apache Hive's HiveQL, Spark SQL, and Apache Flink SQL bring familiar declarative querying to distributed batch and stream processing

Language Influence

Influenced

PL/SQL, Transact-SQL, LINQ, HiveQL, PL/pgSQL
