Est. 2011 Intermediate

Cypher

A declarative graph query language designed for querying and updating property graphs, originally created for Neo4j.

Created by Andres Taylor (Neo Technology / Neo4j, Inc.)

Paradigm: Declarative, Query
Typing: Dynamic
First Appeared: 2011
Latest Version: openCypher 9 / GQL (ISO/IEC 39075:2024)

Cypher is a declarative query language designed specifically for querying, updating, and administering property graph databases. Created at Neo Technology (now Neo4j, Inc.), Cypher pioneered the use of ASCII art-style syntax to express graph patterns, making graph queries intuitive and readable. Its influence extends far beyond Neo4j — Cypher’s design became the primary foundation for GQL, the first new ISO database query language standard since SQL was standardized in 1987.

History & Origins

Cypher was created in 2011 by Andres Taylor, a software engineer at Neo Technology, the company behind the Neo4j graph database. Neo4j 1.0 had been released in February 2010, but at that time users queried the database through a Java traversal API — powerful but verbose and requiring programming expertise.

The motivation for Cypher was clear: graph databases needed a declarative query language analogous to what SQL provides for relational databases. Rather than writing imperative code to traverse nodes and relationships, users should be able to describe the patterns they were looking for and let the database engine figure out how to find them.

Taylor drew inspiration from several existing languages. SQL provided the overall declarative structure and familiar keywords such as WHERE and ORDER BY. SPARQL, the W3C query language for RDF graphs, demonstrated that declarative graph querying was viable. Haskell and Python influenced Cypher’s list semantics, including the list comprehensions used for inline collection transformations.

The openCypher Initiative

By 2015, Cypher had become the de facto standard for property graph querying, but it remained proprietary to Neo4j. Recognizing the value of an open standard, Neo4j launched the openCypher project in October 2015, releasing Cypher’s grammar, language specification, and a Technology Compatibility Kit (TCK) under the Apache 2.0 license.

This move enabled other graph database vendors to implement Cypher compatibility. Amazon Neptune, Memgraph, SAP HANA Graph, and RedisGraph (EOL announced in 2023) all adopted openCypher as a supported query interface.

From openCypher to GQL

The openCypher initiative catalyzed a broader effort to create an international standard for graph query languages. In September 2019, ISO/IEC approved the GQL (Graph Query Language) project proposal. In April 2024, GQL was published as ISO/IEC 39075:2024 — a landmark moment as the first new ISO database query language since SQL was standardized in 1987. GQL is heavily based on Cypher’s syntax and semantics, ensuring that Cypher’s design philosophy lives on in the international standard.

Design Philosophy

Cypher’s design rests on several core principles:

  • Declarative over imperative: Users describe what data they want, not how to retrieve it. The query optimizer determines execution strategy.
  • Visual pattern matching: Graph patterns are expressed using ASCII art — parentheses () for nodes and arrows -[]-> for relationships — making queries visually mirror the graph structure.
  • Readability first: Cypher prioritizes human readability over brevity, designed to be understandable by domain experts and analysts, not just programmers.
  • Familiarity through SQL: By deliberately borrowing SQL’s keyword vocabulary and clause structure, Cypher reduces the learning curve for the millions of developers already fluent in SQL.
  • Graph-native: Unlike attempts to bolt graph capabilities onto SQL, Cypher was designed from the ground up for the property graph data model.
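One way to see the declarative principle at work is to ask the planner what it intends to do. The sketch below assumes a Neo4j-style implementation, where prefixing a query with EXPLAIN returns the execution plan without running the query (PROFILE runs it and reports actual row counts). The Person label, KNOWS type, and name property are illustrative.

```cypher
// Show the plan the optimizer chose, without executing the query.
EXPLAIN
MATCH (p:Person {name: 'Alice'})-[:KNOWS]->(friend)
RETURN friend.name
```

With an index on the name property, the plan would typically start from an index seek rather than a label scan. The query text stays the same either way.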

Key Features

ASCII Art Pattern Syntax

Cypher’s most distinctive feature is its visual pattern matching syntax:

// Find friends of friends
MATCH (person:Person {name: 'Alice'})-[:KNOWS]->(friend)-[:KNOWS]->(foaf)
RETURN foaf.name

Nodes are represented by parentheses (), relationships by square brackets within arrows -[]->, and labels/types follow a colon. This syntax makes graph patterns immediately recognizable.
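The same notation covers relationship direction and relationship variables. The following sketch shows two separate queries. The since property is a hypothetical example and evaluates to null on relationships that lack it.

```cypher
// Undirected pattern: matches KNOWS relationships in either direction.
MATCH (a:Person {name: 'Alice'})-[:KNOWS]-(b:Person)
RETURN b.name;

// Incoming direction, with the relationship bound to the variable r.
MATCH (a:Person {name: 'Alice'})<-[r:KNOWS]-(b:Person)
RETURN b.name, r.since;
```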

Core Clauses

Clause           Purpose
MATCH            Declarative pattern matching against the graph
WHERE            Filter results based on predicates
RETURN           Specify which data to return
CREATE           Create new nodes and relationships
MERGE            Idempotent creation: create only if not already present
DELETE           Remove nodes and relationships
SET              Update properties on nodes and relationships
WITH             Chain query parts, enabling query pipelining
UNWIND           Transform lists into rows for further processing
OPTIONAL MATCH   Left outer join equivalent for graph patterns
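Several of these clauses are usually combined in a single query. The following is a minimal sketch, not a canonical recipe. The names, the Person label, and the updatedAt property are assumptions made for illustration.

```cypher
// Turn a list of names into one row per name.
UNWIND ['Alice', 'Bob', 'Carol'] AS name
// Match an existing Person with this name, or create one if none exists.
MERGE (p:Person {name: name})
// Record when the node was last touched (milliseconds since the epoch).
SET p.updatedAt = timestamp()
// Pass p on to the next query part.
WITH p
// Keep people without outgoing KNOWS relationships; friend is null for them.
OPTIONAL MATCH (p)-[:KNOWS]->(friend:Person)
RETURN p.name AS person, count(friend) AS knownPeople
ORDER BY person
```

Because count() ignores null values, people with no outgoing KNOWS relationships are reported with knownPeople = 0 rather than being dropped. Running the query twice does not duplicate the Person nodes, since MERGE only creates what it cannot match.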

Variable-Length Path Matching

Cypher supports variable-length paths for traversing unknown depths:

// Find all people within 1 to 5 hops of Alice via KNOWS relationships
MATCH path = (alice:Person {name: 'Alice'})-[:KNOWS*1..5]->(distant)
RETURN distant.name, length(path) AS distance
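A variable-length pattern returns one row per matching path, so a person reachable by several routes appears more than once. The sketch below reuses the same assumed Person/KNOWS data and keeps only the shortest distance for each name.

```cypher
MATCH path = (alice:Person {name: 'Alice'})-[:KNOWS*1..5]->(distant)
// distant.name is the implicit grouping key; min() collapses multiple paths.
// Note: two different people who share a name are grouped together here.
RETURN distant.name AS name, min(length(path)) AS distance
ORDER BY distance
```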

Aggregation and Collection

// Count friends per person and collect their names
MATCH (p:Person)-[:KNOWS]->(friend)
RETURN p.name, count(friend) AS friendCount, collect(friend.name) AS friendNames
ORDER BY friendCount DESC
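Cypher has no GROUP BY clause: the non-aggregated expressions in RETURN or WITH (p.name in the query above) act as the grouping keys. To filter on an aggregate, the role HAVING plays in SQL, one common approach is to aggregate in WITH and then filter with WHERE. This is a sketch, with the threshold of 3 chosen arbitrarily.

```cypher
MATCH (p:Person)-[:KNOWS]->(friend)
WITH p, count(friend) AS friendCount   // p is the grouping key
WHERE friendCount >= 3                 // applied after aggregation
RETURN p.name, friendCount
ORDER BY friendCount DESC
```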

List Comprehensions

Cypher supports list comprehensions, a feature influenced by Haskell and Python, for inline transformations:

RETURN [x IN range(1, 10) WHERE x % 2 = 0 | x * x] AS evenSquares
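Because range() includes both endpoints, the query above evaluates to [4, 16, 36, 64, 100]. The filter (WHERE) and the projection (the part after |) are each optional, as this small sketch shows.

```cypher
RETURN [x IN range(1, 5) | x * 2] AS doubled,      // projection only
       [x IN range(1, 5) WHERE x > 3] AS filtered  // filter only
```

This returns doubled = [2, 4, 6, 8, 10] and filtered = [4, 5].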

Schema and Constraints

While property graphs are inherently schema-flexible, Cypher supports optional schema enforcement:

// Ensure unique email addresses
CREATE CONSTRAINT unique_email FOR (u:User) REQUIRE u.email IS UNIQUE

// Require that all Person nodes have a name
CREATE CONSTRAINT person_name FOR (p:Person) REQUIRE p.name IS NOT NULL
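The FOR ... REQUIRE form shown above is the syntax used by recent Neo4j versions. Other openCypher implementations may differ, and in Neo4j, property existence constraints are an Enterprise Edition feature. Under those assumptions, related schema commands look roughly like this. The index name person_name_idx is made up for illustration.

```cypher
// Create an index to speed up lookups by name, unless it already exists.
CREATE INDEX person_name_idx IF NOT EXISTS FOR (p:Person) ON (p.name);

// List the constraints currently defined in the database.
SHOW CONSTRAINTS;

// Remove a constraint by name, without failing if it is absent.
DROP CONSTRAINT unique_email IF EXISTS;
```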

Cypher vs. Other Graph Query Languages

Feature            Cypher                    Gremlin                  SPARQL
Paradigm           Declarative               Imperative (traversal)   Declarative
Data model         Property graph            Property graph           RDF (triples)
Syntax style       SQL-like with ASCII art   Method chaining          Pattern-based with URIs
Pattern matching   Visual, first-class       Manual traversal steps   Triple patterns
Learning curve     Moderate (SQL-like)       Steep                    Moderate
Standardization    openCypher / GQL (ISO)    Apache TinkerPop         W3C Recommendation

Evolution

Cypher has evolved significantly through Neo4j’s major releases:

  • Neo4j 2.0 (2013): Introduced node labels and schema indexing, allowing queries like MATCH (p:Person) instead of scanning all nodes
  • Neo4j 3.0 (2016): Added the Bolt binary protocol, providing a more efficient wire format for Cypher queries compared to the REST API
  • Neo4j 4.0 (2020): Brought multi-database support, allowing a single Neo4j instance to host multiple databases queried independently with Cypher
  • Neo4j 5.0 (2022): Introduced “Cypher 5” with improved subquery support via CALL {} blocks and enhanced type system capabilities; quantified path patterns followed in a later 5.x release
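The two queries below sketch subqueries and quantified patterns. They assume a Neo4j 5.x server recent enough to support quantified path patterns, and the Person/KNOWS data model is illustrative.

```cypher
// Subquery evaluated once per incoming row; WITH p imports the outer variable.
MATCH (p:Person)
CALL {
  WITH p
  MATCH (p)-[:KNOWS]->(f:Person)
  RETURN count(f) AS friendCount
}
RETURN p.name, friendCount;

// Quantified relationship: between 1 and 5 KNOWS hops, written as {1,5}.
MATCH (alice:Person {name: 'Alice'})-[:KNOWS]->{1,5}(distant:Person)
RETURN DISTINCT distant.name;
```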

Current Relevance

Cypher and graph databases are experiencing growing adoption driven by several trends:

  • Knowledge graphs and AI: Graph databases queried with Cypher are increasingly used to build knowledge graphs that power retrieval-augmented generation (RAG) and other AI applications
  • Fraud detection: Financial institutions use graph pattern matching to identify complex fraud rings that are invisible in relational queries
  • Network analysis: Telecommunications and IT infrastructure companies model and query their networks as graphs
  • Recommendation engines: E-commerce and media companies leverage graph traversals for personalized recommendations

Neo4j reports a large developer community and significant enterprise adoption, including many Fortune 100 companies. The openCypher specification continues to evolve, with the 2024.1 release beginning the alignment process toward the GQL ISO standard.

Multiple graph database products now support Cypher or openCypher, including Amazon Neptune, Memgraph, and SAP HANA Graph, making it one of the most widely implemented property graph query languages.

Why It Matters

Cypher’s impact on the database world extends well beyond Neo4j:

  1. Made graph databases accessible: Before Cypher, querying graph databases required programming expertise. Cypher’s declarative, visual syntax opened graph querying to analysts, data scientists, and domain experts.

  2. Established the property graph query paradigm: Cypher demonstrated that SQL’s declarative approach could be successfully adapted for graph data, influencing the entire graph database industry.

  3. Catalyzed international standardization: Cypher’s success and the openCypher initiative directly led to GQL becoming an ISO standard, giving graph databases the same kind of standardized query language that SQL provides for relational databases.

  4. Proved visual syntax works: The ASCII art pattern syntax ()-[]->() has become iconic in the graph database world, showing that programming language syntax can be both expressive and visually intuitive.

Cypher transformed graph databases from a niche technology requiring specialized programming skills into an accessible tool for anyone who can write a SQL query — and in doing so, helped establish graph querying as a first-class discipline in the database world.

Timeline

2010: Neo4j 1.0 released by Neo Technology; Cypher not yet included
2011: Cypher created by Andres Taylor at Neo Technology and introduced into Neo4j
2013: Neo4j 2.0 released with schema indexing and node labels added to Cypher
2015: openCypher project launched, open-sourcing Cypher under the Apache 2.0 license
2016: Neo4j 3.0 released, introducing the Bolt binary protocol for Cypher queries
2017: First openCypher Implementers Meeting (oCIM) held in Walldorf, Germany
2019: ISO/IEC approves the GQL (Graph Query Language) project proposal, building on Cypher
2020: Neo4j 4.0 released with multi-database support and Cypher enhancements
2022: Neo4j 5.0 released with Cypher 5 language version
2024: GQL published as ISO/IEC 39075:2024 in April, the first new ISO database query language since SQL in 1987

Notable Uses & Legacy

ICIJ (Panama Papers / Paradise Papers)

The International Consortium of Investigative Journalists used Neo4j and Cypher to analyze hundreds of thousands of offshore entities, uncovering financial connections across 200+ countries.

PayPal

Reportedly uses graph analysis with Cypher for real-time fraud detection, identifying suspicious transaction patterns across their payment network.

Walmart

Reportedly employs Neo4j with Cypher queries for supply chain management and optimization across their global logistics network.

Airbus

Reportedly uses Neo4j and Cypher to manage complex engineering relationships and supply chain data across aircraft manufacturing.

Comcast

Reportedly leverages graph queries with Cypher for network infrastructure management and customer relationship analysis.

Language Influence

Influenced By

SQL, SPARQL, Haskell, Python

Influenced

GQL (ISO/IEC 39075:2024), Amazon Neptune, openCypher, Memgraph, RedisGraph

Running Today

Run examples using the official Docker image:

docker pull neo4j:latest

Example usage:

docker run --rm --publish=7474:7474 --publish=7687:7687 --env=NEO4J_AUTH=none neo4j:latest
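With the container running, port 7474 serves the browser interface at http://localhost:7474 and port 7687 serves the Bolt protocol. NEO4J_AUTH=none disables authentication, which is suitable only for local experiments. A first session might look like the sketch below. The sample data is made up for illustration, and the two statements can be run one at a time if the client does not accept several statements at once.

```cypher
// Create three people connected by two KNOWS relationships.
CREATE (alice:Person {name: 'Alice'})-[:KNOWS]->(bob:Person {name: 'Bob'}),
       (bob)-[:KNOWS]->(carol:Person {name: 'Carol'});

// Friends of friends of Alice.
MATCH (:Person {name: 'Alice'})-[:KNOWS]->()-[:KNOWS]->(foaf)
RETURN foaf.name;
```

On a previously empty database, the second query returns a single row, Carol. Because the container was started with --rm and no mounted volume, the data is discarded when the container stops.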