Cypher
A declarative graph query language designed for querying and updating property graphs, originally created for Neo4j.
Created by Andres Taylor (Neo Technology / Neo4j, Inc.)
Cypher is a declarative query language designed specifically for querying, updating, and administering property graph databases. Created at Neo Technology (now Neo4j, Inc.), Cypher pioneered the use of ASCII art-style syntax to express graph patterns, making graph queries intuitive and readable. Its influence extends far beyond Neo4j — Cypher’s design became the primary foundation for GQL, the first new ISO database query language standard since SQL was standardized in 1987.
History & Origins
Cypher was created in 2011 by Andres Taylor, a software engineer at Neo Technology, the company behind the Neo4j graph database. Neo4j 1.0 had been released in February 2010, but at that time users queried the database through a Java traversal API — powerful but verbose and requiring programming expertise.
The motivation for Cypher was clear: graph databases needed a declarative query language analogous to what SQL provides for relational databases. Rather than writing imperative code to traverse nodes and relationships, users should be able to describe the patterns they were looking for and let the database engine figure out how to find them.
Taylor drew inspiration from several existing languages. SQL provided the overall declarative structure and familiar keywords like WHERE and ORDER BY. SPARQL, the W3C query language for RDF graphs, demonstrated that declarative graph querying was viable. Haskell and Python influenced Cypher’s list semantics, with Python’s list comprehension syntax adopted for inline collection transformations.
The openCypher Initiative
By 2015, Cypher had become the de facto standard for property graph querying, but it remained proprietary to Neo4j. Recognizing the value of an open standard, Neo4j launched the openCypher project in October 2015, releasing Cypher’s grammar, language specification, and a Technology Compatibility Kit (TCK) under the Apache 2.0 license.
This move enabled other graph database vendors to implement Cypher compatibility. Amazon Neptune, Memgraph, SAP HANA Graph, and RedisGraph (EOL announced in 2023) all adopted openCypher as a supported query interface.
From openCypher to GQL
The openCypher initiative catalyzed a broader effort to create an international standard for graph query languages. In September 2019, ISO/IEC approved the GQL (Graph Query Language) project proposal. In April 2024, GQL was published as ISO/IEC 39075:2024 — a landmark moment as the first new ISO database query language since SQL was standardized in 1987. GQL is heavily based on Cypher’s syntax and semantics, ensuring that Cypher’s design philosophy lives on in the international standard.
Design Philosophy
Cypher’s design rests on several core principles:
- Declarative over imperative: Users describe what data they want, not how to retrieve it. The query optimizer determines execution strategy.
- Visual pattern matching: Graph patterns are expressed using ASCII art, with parentheses () for nodes and arrows -[]-> for relationships, so that queries visually mirror the graph structure.
- Readability first: Cypher prioritizes human readability over brevity, designed to be understandable by domain experts and analysts, not just programmers.
- Familiarity through SQL: By deliberately borrowing SQL’s keyword vocabulary and clause structure, Cypher reduces the learning curve for the millions of developers already fluent in SQL.
- Graph-native: Unlike attempts to bolt graph capabilities onto SQL, Cypher was designed from the ground up for the property graph data model.
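One way to see the declarative principle in practice is to ask the database how it intends to run a query. The sketch below assumes Neo4j, where a query can be prefixed with EXPLAIN to show the planned operators without executing it; the Person label and KNOWS relationship type are illustrative names, not a required schema.

```cypher
// EXPLAIN asks the planner for its execution plan; the query itself is not run.
// Person and KNOWS are assumed example names.
EXPLAIN
MATCH (p:Person {name: 'Alice'})-[:KNOWS]->(friend)
RETURN friend.name;
```

The query text stays the same whether or not an index exists on the name property; only the plan chosen by the optimizer changes.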
Key Features
ASCII Art Pattern Syntax
Cypher’s most distinctive feature is its visual pattern matching syntax:
// Find friends of friends
MATCH (person:Person {name: 'Alice'})-[:KNOWS]->(friend)-[:KNOWS]->(foaf)
RETURN foaf.name
Nodes are represented by parentheses (), relationships by square brackets within arrows -[]->, and labels/types follow a colon. This syntax makes graph patterns immediately recognizable.
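A few variations on this notation may make it more concrete. The sketch below is illustrative only: the since property on KNOWS relationships is an assumption made for the example and does not appear elsewhere in this article.

```cypher
// Bind the relationship to a variable (r) so its properties can be read.
// The `since` property is an assumed example property.
MATCH (a:Person)-[r:KNOWS]->(b:Person)
WHERE r.since >= 2020
RETURN a.name, b.name, r.since;

// Omit the arrowhead to match KNOWS in either direction.
// () is an anonymous node: it must exist but is not given a name.
MATCH (:Person {name: 'Alice'})-[:KNOWS]-()
RETURN count(*) AS connections;
```

The two statements are independent queries; the semicolons separate them when pasted into a shell or browser session.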
Core Clauses
| Clause | Purpose |
|---|---|
| MATCH | Declarative pattern matching against the graph |
| WHERE | Filter results based on predicates |
| RETURN | Specify which data to return |
| CREATE | Create new nodes and relationships |
| MERGE | Idempotent creation — create only if not already present |
| DELETE | Remove nodes and relationships |
| SET | Update properties on nodes and relationships |
| WITH | Chain query parts, enabling query pipelining |
| UNWIND | Transform lists into rows for further processing |
| OPTIONAL MATCH | Left outer join equivalent for graph patterns |
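Several of these clauses are commonly combined in a single query. The following is a minimal sketch of an upsert followed by a read, assuming a hypothetical list of input maps and a city property that are used here only for illustration.

```cypher
// Hypothetical input: one map per person to create or update.
UNWIND [{name: 'Alice', city: 'Oslo'}, {name: 'Bob', city: 'Bergen'}] AS row
// MERGE matches an existing Person with this name, or creates one if none exists.
MERGE (p:Person {name: row.name})
// SET writes the property whether the node was matched or created.
SET p.city = row.city
// WITH hands p over to the next part of the query.
WITH p
// OPTIONAL MATCH keeps p even when it has no outgoing KNOWS relationship;
// friend is then null, and count(friend) evaluates to 0.
OPTIONAL MATCH (p)-[:KNOWS]->(friend)
RETURN p.name AS name, p.city AS city, count(friend) AS friendCount
ORDER BY name;
```

Running the statement a second time should not create duplicate Person nodes, which is the idempotent behaviour the table attributes to MERGE.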
Variable-Length Path Matching
Cypher supports variable-length paths for traversing unknown depths:
// Find all people within 1 to 5 hops of Alice via KNOWS relationships
MATCH path = (alice:Person {name: 'Alice'})-[:KNOWS*1..5]->(distant)
RETURN distant.name, length(path) AS distance
Aggregation and Collection
// Count friends per person and collect their names
MATCH (p:Person)-[:KNOWS]->(friend)
RETURN p.name, count(friend) AS friendCount, collect(friend.name) AS friendNames
ORDER BY friendCount DESC
List Comprehensions
Cypher supports list comprehensions, a feature borrowed from Python, for inline transformations:
RETURN [x IN range(1, 10) WHERE x % 2 = 0 | x * x] AS evenSquares
Schema and Constraints
While property graphs are inherently schema-flexible, Cypher supports optional schema enforcement:
// Ensure unique email addresses
CREATE CONSTRAINT unique_email FOR (u:User) REQUIRE u.email IS UNIQUE
// Require that all Person nodes have a name
CREATE CONSTRAINT person_name FOR (p:Person) REQUIRE p.name IS NOT NULL
Cypher vs. Other Graph Query Languages
| Feature | Cypher | Gremlin | SPARQL |
|---|---|---|---|
| Paradigm | Declarative | Imperative (traversal) | Declarative |
| Data model | Property graph | Property graph | RDF (triples) |
| Syntax style | SQL-like with ASCII art | Method chaining | Pattern-based with URIs |
| Pattern matching | Visual, first-class | Manual traversal steps | Triple patterns |
| Learning curve | Moderate (SQL-like) | Steep | Moderate |
| Standardization | openCypher / GQL (ISO) | Apache TinkerPop | W3C Recommendation |
Evolution
Cypher has evolved significantly through Neo4j’s major releases:
- Neo4j 2.0 (2013): Introduced node labels and schema indexing, allowing queries like MATCH (p:Person) instead of scanning all nodes
- Neo4j 3.0 (2016): Added the Bolt binary protocol, providing a more efficient wire format for Cypher queries compared to the REST API
- Neo4j 4.0 (2020): Brought multi-database support, allowing a single Neo4j instance to host multiple databases queried independently with Cypher
- Neo4j 5.0 (2022): Introduced “Cypher 5” with improved subquery support via CALL {} blocks, quantified path patterns, and enhanced type system capabilities
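The two Neo4j 5 features named in the last item can be sketched as follows. This is a hedged illustration rather than a version-exact reference: the forms accepted depend on the Neo4j release in use, and the Person/KNOWS names are assumed example data.

```cypher
// CALL { ... } subquery: the body runs once for each incoming row.
// `WITH p` imports the outer variable p into the subquery.
MATCH (p:Person)
CALL {
  WITH p
  MATCH (p)-[:KNOWS]->(friend)
  RETURN count(friend) AS friendCount
}
RETURN p.name, friendCount;

// Quantified relationship: between 1 and 5 KNOWS hops starting from Alice.
MATCH (alice:Person {name: 'Alice'})-[:KNOWS]->{1,5}(distant:Person)
RETURN DISTINCT distant.name;
```

The second query expresses a similar traversal to the older [:KNOWS*1..5] notation, using the {1,5} quantifier style that GQL also adopts.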
Current Relevance
Cypher and graph databases are experiencing growing adoption driven by several trends:
- Knowledge graphs and AI: Graph databases queried with Cypher are increasingly used to build knowledge graphs that power retrieval-augmented generation (RAG) and other AI applications
- Fraud detection: Financial institutions use graph pattern matching to identify complex fraud rings that are invisible in relational queries
- Network analysis: Telecommunications and IT infrastructure companies model and query their networks as graphs
- Recommendation engines: E-commerce and media companies leverage graph traversals for personalized recommendations
Neo4j reports a large developer community and significant enterprise adoption, including many Fortune 100 companies. The openCypher specification continues to evolve, with the 2024.1 release beginning the alignment process toward the GQL ISO standard.
Multiple graph database products now support Cypher or openCypher, including Amazon Neptune, Memgraph, and SAP HANA Graph, establishing it as the most widely implemented property graph query language.
Why It Matters
Cypher’s impact on the database world extends well beyond Neo4j:
Made graph databases accessible: Before Cypher, querying graph databases required programming expertise. Cypher’s declarative, visual syntax opened graph querying to analysts, data scientists, and domain experts.
Established the property graph query paradigm: Cypher demonstrated that SQL’s declarative approach could be successfully adapted for graph data, influencing the entire graph database industry.
Catalyzed international standardization: Cypher’s success and the openCypher initiative directly led to GQL becoming an ISO standard, giving graph databases the same kind of standardized query language that SQL provides for relational databases.
Proved visual syntax works: The ASCII art pattern syntax ()-[]->() has become iconic in the graph database world, showing that programming language syntax can be both expressive and visually intuitive.
Cypher transformed graph databases from a niche technology requiring specialized programming skills into an accessible tool for anyone who can write a SQL query — and in doing so, helped establish graph querying as a first-class discipline in the database world.
Notable Uses & Legacy
ICIJ (Panama Papers / Paradise Papers)
The International Consortium of Investigative Journalists used Neo4j and Cypher to analyze hundreds of thousands of offshore entities, uncovering financial connections across 200+ countries.
PayPal
Reportedly uses graph analysis with Cypher for real-time fraud detection, identifying suspicious transaction patterns across their payment network.
Walmart
Reportedly employs Neo4j with Cypher queries for supply chain management and optimization across their global logistics network.
Airbus
Reportedly uses Neo4j and Cypher to manage complex engineering relationships and supply chain data across aircraft manufacturing.
Comcast
Reportedly leverages graph queries with Cypher for network infrastructure management and customer relationship analysis.
Running Today
Run examples using the official Docker image:
docker pull neo4j:latest
Example usage:
docker run --rm --publish=7474:7474 --publish=7687:7687 --env=NEO4J_AUTH=none neo4j:latest
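Once the container is running, queries can be entered in the Neo4j Browser, which should be reachable on the published port 7474 (http://localhost:7474). As a quick check, the sketch below loads a small, hypothetical data set and runs a friends-of-friends query against it; it assumes an otherwise empty database.

```cypher
// Hypothetical sample data: three people linked in a KNOWS chain.
CREATE (alice:Person {name: 'Alice'}),
       (bob:Person {name: 'Bob'}),
       (carol:Person {name: 'Carol'}),
       (alice)-[:KNOWS]->(bob),
       (bob)-[:KNOWS]->(carol);

// Friends of friends of Alice; with only the data above, this returns Carol.
MATCH (:Person {name: 'Alice'})-[:KNOWS]->()-[:KNOWS]->(foaf)
RETURN foaf.name;
```

Because the container is started with --rm and no mounted volume, the data is discarded when the container stops.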