SNOBOL
A pioneering string manipulation language that introduced pattern matching as a first-class programming concept
Created by David J. Farber, Ralph Griswold, Ivan P. Polonsky
SNOBOL (StriNg Oriented and symBOlic Language) is a pioneering programming language developed at Bell Labs in the early 1960s. It introduced pattern matching as a first-class programming concept, fundamentally influencing how we think about text processing and string manipulation in programming languages today.
History & Origins
SNOBOL was created at AT&T Bell Telephone Laboratories by David J. Farber, Ralph Griswold, and Ivan P. Polonsky, beginning in 1962. The initial motivation was to create a tool for symbolic manipulation of polynomials, but the language quickly evolved into a general-purpose tool for string manipulation.
The SNOBOL Family
Unlike most languages that evolved gradually, SNOBOL went through several distinct versions that were essentially different languages:
- SNOBOL (1962): The original implementation on IBM 7090
- SNOBOL2 (1963): Added improved pattern matching
- SNOBOL3 (1965): Introduced user-defined functions and better I/O
- SNOBOL4 (1967): The definitive version, still in use today
SNOBOL4 is so different from its predecessors that it’s essentially a new language. When people say “SNOBOL” today, they almost always mean SNOBOL4.
Why SNOBOL Was Revolutionary
In the early 1960s, most programming was numerical - FORTRAN dominated, and strings were second-class citizens. SNOBOL changed this by:
- Making strings the primary data type - Not just arrays of characters, but genuine strings
- Introducing patterns as data - Patterns could be constructed, stored in variables, and combined
- Success/failure-based control flow - Every statement either succeeded or failed, driving program flow
- Dynamic typing - Variables could hold any type at any time
Core Concepts
Pattern Matching
SNOBOL’s most distinctive feature is pattern matching. Patterns aren’t just regular expressions: they’re first-class values that can be constructed at run time, stored in variables, and combined into larger patterns.
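As a minimal sketch of patterns as values, the program below builds a small pattern, stores it in a variable, and reuses it inside a larger one. The variable names (DIGITS, DATEPAT, YEAR, and so on) and the sample text are illustrative assumptions, not part of the original article.

```snobol
* Build a pattern, store it in a variable, and reuse it in a larger pattern.
        DIGITS  = SPAN('0123456789')
        DATEPAT = DIGITS . YEAR '-' DIGITS . MONTH '-' DIGITS . DAY
        LINE    = 'Meeting on 2024-03-15 at noon'
* Match LINE against the stored pattern; branch to NOMATCH on failure.
        LINE DATEPAT                              :F(NOMATCH)
        OUTPUT = 'Year: ' YEAR ', month: ' MONTH ', day: ' DAY   :(END)
NOMATCH OUTPUT = 'No date found'
END
```

With the sample text above, this should print `Year: 2024, month: 03, day: 15`. The `.` operator assigns the text matched by each sub-pattern once the whole match has succeeded.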
Success and Failure
Every SNOBOL statement either succeeds or fails. This result controls program flow through the goto field at the end of the statement: :S(label) means “on success, go to label”, and :F(label) means “on failure, go to label”, so :F(DONE) reads “on failure, go to DONE”. You can combine them: :S(SUCCESS)F(FAILURE).
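The control flow described here can be sketched with a short loop. This example assumes input arrives on standard input and relies on the fact that reading INPUT fails at end of file; the labels LOOP and DONE and the variable COUNT are illustrative names.

```snobol
* Count input lines. Reading INPUT fails at end of file,
* which sends control to DONE through the :F goto.
        COUNT = 0
LOOP    LINE = INPUT                              :F(DONE)
        COUNT = COUNT + 1                         :(LOOP)
DONE    OUTPUT = 'Lines read: ' COUNT
END
```

There is no loop keyword here: the unconditional goto `:(LOOP)` repeats the read, and the failure goto ends it.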
Dynamic Typing and Assignment
Variables in SNOBOL have no declared type and can hold a value of any type, including strings, numbers, patterns, and tables.
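A small sketch of dynamic typing: the same variable is assigned an integer, a string, a pattern, and a table in turn, and the built-in DATATYPE function reports what it currently holds. The variable name X is arbitrary, and the exact spelling of the type names printed may vary between implementations.

```snobol
* One variable, four different types over its lifetime.
        X = 42
        OUTPUT = 'X holds ' DATATYPE(X)
        X = 'forty-two'
        OUTPUT = 'X holds ' DATATYPE(X)
        X = SPAN('0123456789')
        OUTPUT = 'X holds ' DATATYPE(X)
        X = TABLE()
        OUTPUT = 'X holds ' DATATYPE(X)
END
```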
The Statement Format
SNOBOL statements have a distinctive fixed layout, reminiscent of assembly language:
label subject pattern = replacement :goto
Where:
- label - Optional name for goto targets
- subject - The string to match against
- pattern - What to look for
- replacement - What to replace matched text with
- goto - Where to go based on success/failure
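To make the five parts concrete, here is a hedged example in which a single statement uses all of them. The label NEXT, the variable TEXT, and the sample sentence are assumptions chosen for illustration.

```snobol
        TEXT = 'the cat sat on the mat'
* label  subject  pattern = replacement           goto
NEXT    TEXT     'the'   = 'a'                    :S(NEXT)
        OUTPUT = TEXT
END
```

The statement labelled NEXT replaces the first occurrence of `the` in TEXT with `a` and, on success, jumps back to itself. When no occurrence remains the match fails, control falls through, and the program should print `a cat sat on a mat`.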
Language Features
Built-in Patterns
SNOBOL4 includes powerful built-in patterns:
| Pattern | Meaning |
|---|---|
| ARB | Match arbitrary characters (shortest) |
| REM | Match remainder of string |
| LEN(n) | Match exactly n characters |
| SPAN(s) | Match longest run of characters in s |
| BREAK(s) | Match up to (not including) any character in s |
| ANY(s) | Match any single character in s |
| NOTANY(s) | Match any character not in s |
| TAB(n) | Match to position n |
| RTAB(n) | Match to n positions from end |
| POS(n) | Succeed if at position n |
| RPOS(n) | Succeed if n positions from end |
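Several of the built-in patterns in the table can be combined to split a simple `key = value` line. This is a sketch only; the input line and the variable names KEY and VALUE are assumptions for the example.

```snobol
* Split a line of the form "key = value" using POS, BREAK, SPAN and REM.
        LINE = 'name = Ralph Griswold'
        LINE POS(0) BREAK(' =') . KEY SPAN(' =') REM . VALUE   :F(BAD)
        OUTPUT = 'key:   ' KEY
        OUTPUT = 'value: ' VALUE                  :(END)
BAD     OUTPUT = 'not a key/value line'
END
```

POS(0) anchors the match at the start of the line, BREAK(' =') collects the key up to the first blank or equals sign, SPAN(' =') skips the separator, and REM takes everything that is left. For the sample line the output should be `key:   name` followed by `value: Ralph Griswold`.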
Pattern Operators
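Patterns can be combined with operators: writing two patterns next to each other (separated by a blank) concatenates them, | chooses between alternatives, . assigns the matched text to a variable once the whole match succeeds, and $ assigns it immediately during matching.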
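The following sketch combines patterns with alternation, concatenation, and conditional assignment. All names (COLOR, ANIMAL, PHRASE, C, A) and the sample sentence are illustrative assumptions.

```snobol
* Alternation (|), concatenation (blank) and conditional assignment (.)
        COLOR    = 'red' | 'green' | 'blue'
        ANIMAL   = 'cat' | 'dog'
        PHRASE   = COLOR . C ' ' ANIMAL . A
        SENTENCE = 'I saw a green dog today'
        SENTENCE PHRASE                           :F(NONE)
        OUTPUT = 'color=' C ' animal=' A          :(END)
NONE    OUTPUT = 'no match'
END
```

For the sample sentence this should print `color=green animal=dog`. Because `.` binds more tightly than concatenation, and concatenation more tightly than `|`, no parentheses are needed in this example.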
Tables (Associative Arrays)
SNOBOL4 introduced associative arrays, which it calls “tables”. A table maps arbitrary keys to values and grows as new entries are assigned.
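A minimal sketch of table usage follows. The table name AGE and its keys are made up for the example; angle brackets are the traditional SNOBOL4 subscript syntax.

```snobol
* Create a table, store and update entries, then test for a missing key.
        AGE = TABLE()
        AGE<'alice'> = 30
        AGE<'bob'>   = 25
        AGE<'alice'> = AGE<'alice'> + 1
        OUTPUT = 'alice is ' AGE<'alice'>
        OUTPUT = 'bob is ' AGE<'bob'>
* An entry that was never assigned holds the null string,
* and IDENT with a single argument succeeds when that argument is null.
        IDENT(AGE<'carol'>)                       :F(END)
        OUTPUT = 'carol has no entry'
END
```

This should print `alice is 31`, `bob is 25`, and `carol has no entry`.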
User-Defined Functions
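Functions are declared at run time by calling DEFINE with a prototype string. The body begins at the label with the same name as the function, the return value is assigned to a variable of that name, and the body ends by branching to RETURN (success) or FRETURN (failure).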
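As a hedged illustration, the sketch below defines a function that returns the first run of letters in a string and fails when there is none. FIRSTWORD, LETTERS, and the label FWEND are hypothetical names chosen for this example.

```snobol
        LETTERS = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
* Declare the function, then jump over its body so it is not run inline.
        DEFINE('FIRSTWORD(S)')                    :(FWEND)
* Body: assign the first run of letters to the function's own name.
* Return normally on a match, signal failure otherwise.
FIRSTWORD S SPAN(LETTERS) . FIRSTWORD             :S(RETURN)F(FRETURN)
FWEND   OUTPUT = FIRSTWORD('  hello, world')
* The next call fails, so the assignment is skipped and :S(END) is not taken.
        OUTPUT = FIRSTWORD('12345')               :S(END)
        OUTPUT = 'no word found'
END
```

The first call should print `hello`. The second call fails, so the program prints `no word found` instead. A function's failure propagates to the calling statement like any other failure.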
SNOBOL4 vs Modern Languages
Compared to Regular Expressions
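SNOBOL patterns are more powerful than traditional regular expressions: a pattern can refer to itself recursively, so it can match nested structures such as balanced parentheses, which classical regular expressions cannot describe.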
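One way to see the difference is a self-referential pattern. The sketch below uses the unary `*` operator, which defers evaluation of NESTED until match time, so the pattern can mention itself. The names NESTED, TEXT, and GROUP are assumptions for the example.

```snobol
* A pattern that refers to itself: parentheses nested to any depth.
        NESTED = '(' ARBNO(*NESTED | NOTANY('()')) ')'
        TEXT   = 'f(a(b)c)'
        TEXT NESTED . GROUP                       :F(NONE)
        OUTPUT = 'Matched: ' GROUP                :(END)
NONE    OUTPUT = 'No balanced group'
END
```

For the sample text this should print `Matched: (a(b)c)`. SNOBOL4 also provides a built-in BAL pattern for balanced strings. The explicit version is shown here only to illustrate recursion.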
Compared to Perl
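Perl borrowed heavily from SNOBOL’s text processing philosophy, building string matching and substitution directly into the language.
SNOBOL patterns could nevertheless do things that Perl regular expressions long could not, such as recursive matching; Perl gained recursive patterns only in version 5.10.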
Compared to Icon
Ralph Griswold created Icon as SNOBOL’s successor, keeping the power but modernizing the syntax:
# Icon - more conventional syntax
every word := !words do
    write(word)
Code Examples
Word Frequency Counter
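The original listing for this example is not available, so what follows is a reconstruction sketch rather than the author's program. It reads text from standard input, lower-cases each word, and counts occurrences in a table. All variable names and labels are assumptions.

```snobol
* Word frequency counter: a sketch, not the original listing.
        UPPER = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
        LOWER = 'abcdefghijklmnopqrstuvwxyz'
* Skip to the next letter, then capture the run of letters as W.
        WORD  = BREAK(UPPER LOWER) SPAN(UPPER LOWER) . W
        COUNT = TABLE()
READ    LINE = INPUT                              :F(REPORT)
* Remove one word from the front of LINE; read the next line when none remain.
NEXTW   LINE WORD =                               :F(READ)
        W = REPLACE(W, UPPER, LOWER)
        COUNT<W> = COUNT<W> + 1                   :(NEXTW)
* Convert the table to a two-column array (key, value) for printing.
REPORT  RESULT = CONVERT(COUNT, 'ARRAY')          :F(EMPTY)
        I = 0
PRINT   I = I + 1
* An out-of-range subscript fails, which ends the loop.
        OUTPUT = RESULT<I,1> ': ' RESULT<I,2>     :S(PRINT)F(END)
EMPTY   OUTPUT = 'no words found'
END
```

The order in which entries are printed depends on how the implementation converts the table, so no particular ordering should be assumed.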
Simple Calculator
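The original listing is missing here as well. The sketch below is one possible minimal calculator, assuming each input line has the form `integer operator integer` with non-negative integers and one of `+ - * /`. The names EXPR, NUM, SP, A, B, OP, and R are illustrative.

```snobol
* Simple calculator: a sketch that evaluates lines such as "6 * 7".
        NUM  = SPAN('0123456789')
        SP   = SPAN(' ') | ''
        EXPR = POS(0) SP NUM . A SP ANY('+-*/') . OP SP NUM . B SP RPOS(0)
READ    LINE = INPUT                              :F(END)
        LINE EXPR                                 :F(BAD)
* IDENT returns the null string on success and fails otherwise,
* so each statement computes R only for its own operator.
        R = IDENT(OP, '+') A + B                  :S(SHOW)
        R = IDENT(OP, '-') A - B                  :S(SHOW)
        R = IDENT(OP, '*') A * B                  :S(SHOW)
* NE(B, 0) guards against division by zero before the division runs.
        R = IDENT(OP, '/') NE(B, 0) A / B         :S(SHOW)F(BAD)
SHOW    OUTPUT = A ' ' OP ' ' B ' = ' R           :(READ)
BAD     OUTPUT = 'cannot evaluate: ' LINE         :(READ)
END
```

Given the input line `6 * 7`, this should print `6 * 7 = 42`. Division here is integer division, and anything that does not fit the assumed format is reported rather than evaluated.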
Modern SNOBOL
CSNOBOL4
Phil Budne’s CSNOBOL4 is the most actively maintained implementation. It’s a faithful port of the original Bell Labs SNOBOL4 to modern systems:
- Runs on any system with a C compiler
- Includes all SNOBOL4 features plus SPITBOL extensions
- Available in Debian/Ubuntu repositories
- Active development continues
SPITBOL
SPITBOL (Speedy Implementation of SNOBOL) was created for performance:
- Compiles to native code
- Significantly faster than interpreted SNOBOL4
- Macro SPITBOL is open source
- Used when SNOBOL4’s power is needed with better performance
Running SNOBOL Today
SNOBOL4 is readily available today through the implementations described above.
The SNOBOL Legacy
Influence on Programming Languages
SNOBOL’s ideas appear throughout modern programming:
- Pattern Matching: Influenced Perl, AWK, Icon, and modern language pattern matching
- Associative Arrays: Tables pioneered what became hash maps/dictionaries everywhere
- String as First-Class: Set the standard for treating strings as fundamental
- Success/Failure Flow: Influenced Icon’s goal-directed evaluation and anticipated the backtracking search later seen in Prolog
The Griswold Connection
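Ralph Griswold, one of SNOBOL’s creators, went on to create or lay the groundwork for:
- SL5: An intermediate attempt between SNOBOL and Icon
- Icon: A modern language with SNOBOL’s power and cleaner syntax
- Unicon: Icon’s object-oriented successor, developed later by Clinton Jeffery and collaborators rather than by Griswold himself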
This family tree shows how SNOBOL’s ideas evolved into more accessible forms.
Learning SNOBOL
Key Mental Shifts
Coming from modern languages, you’ll need to adjust your thinking:
- Every statement has a result - Success or failure, not just a value
- Patterns are data - You build them up like any other data structure
- Goto is normal - SNOBOL predates structured programming
- Columns matter - Labels in column 1, statements start after a space/tab
Common Gotchas
- Statements must start with whitespace (column 1 is for labels only)
- An asterisk (*) in column 1 makes the line a comment
- Whitespace in patterns is significant
- String comparison is done with IDENT() and DIFFER(), not =
Why Learn SNOBOL?
Even if you never use SNOBOL in production:
- Understand Pattern Matching Origins: See one of the earliest languages built around pattern matching
- Different Paradigm: Success/failure-driven programming is a unique perspective
- Appreciate Modern Features: Understand why we have hash maps and string operations
- Historical Context: Understand the evolution from SNOBOL to Icon to modern languages
SNOBOL represents a fascinating branch of programming language evolution - one where text manipulation and pattern matching were the primary concerns, not numerical computation. Its influence echoes through every regex you write and every dictionary you use.
Learning Resources
Online
- SNOBOL4 Resources - https://www.regressive.org/snobol4/
- CSNOBOL4 - https://www.regressive.org/snobol4/csnobol4/
- SNOBOL4.org - http://www.snobol4.com/
Books
- The SNOBOL4 Programming Language by Griswold, Poage, and Polonsky (1971) - The definitive reference
- A SNOBOL4 Primer by Griswold and Griswold - Gentler introduction
- Algorithms in SNOBOL4 by Gimpel - Advanced techniques
Notable Uses & Legacy
Text Processing at Bell Labs
SNOBOL was extensively used at Bell Labs for text processing, document analysis, and compiler construction in the 1960s and 1970s.
Natural Language Processing Research
Early NLP research used SNOBOL for parsing and analyzing natural language text due to its powerful pattern matching.
Humanities Computing
SNOBOL was widely adopted in digital humanities for concordance generation, stylistic analysis, and text processing in the 1970s-1980s.
Compiler Writing
The pattern matching capabilities made SNOBOL popular for writing compilers and language processors.
Teaching
SNOBOL was taught at many universities in the 1970s as an example of a non-procedural approach to programming.
Running Today
Run examples using the official Docker image:
docker pull esolang/snobol:latest
Example usage:
docker run --rm -v "$(pwd)":/app -w /app esolang/snobol:latest snobol hello.sno