Est. 1962 Intermediate

SNOBOL

A pioneering string manipulation language that introduced pattern matching as a first-class programming concept

Created by David J. Farber, Ralph Griswold, Ivan P. Polonsky

Paradigm: Pattern-directed, Imperative
Typing: Dynamic, Weak
First Appeared: 1962
Latest Version: SNOBOL4 (1967), CSNOBOL4 2.3 (2023)

SNOBOL (StriNg Oriented and symBOlic Language) is a pioneering programming language developed at Bell Labs in the early 1960s. It introduced pattern matching as a first-class programming concept, fundamentally influencing how we think about text processing and string manipulation in programming languages today.

History & Origins

SNOBOL was created at AT&T Bell Telephone Laboratories by David J. Farber, Ralph Griswold, and Ivan P. Polonsky, beginning in 1962. The initial motivation was to create a tool for symbolic manipulation of polynomials, but the language quickly evolved into a general-purpose tool for string manipulation.

The SNOBOL Family

Unlike most languages that evolved gradually, SNOBOL went through several distinct versions that were essentially different languages:

  • SNOBOL (1962): The original implementation on IBM 7090
  • SNOBOL2 (1963): Added improved pattern matching
  • SNOBOL3 (1965): Introduced user-defined functions and better I/O
  • SNOBOL4 (1967): The definitive version, still in use today

SNOBOL4 is so different from its predecessors that it’s essentially a new language. When people say “SNOBOL” today, they almost always mean SNOBOL4.

Why SNOBOL Was Revolutionary

In the early 1960s, most programming was numerical - FORTRAN dominated, and strings were second-class citizens. SNOBOL changed this by:

  1. Making strings the primary data type - Not just arrays of characters, but genuine strings
  2. Introducing patterns as data - Patterns could be constructed, stored in variables, and combined
  3. Success/failure-based control flow - Every statement either succeeded or failed, driving program flow
  4. Dynamic typing - Variables could hold any type at any time

Core Concepts

Pattern Matching

SNOBOL’s most distinctive feature is pattern matching. Patterns aren’t just regular expressions - they’re first-class values that can be built from components, stored in variables, and used directly in matching statements:

*       Building patterns from components
        VOWEL = ANY('aeiouAEIOU')
        CONSONANT = ANY('bcdfghjklmnpqrstvwxyzBCDFGHJKLMNPQRSTVWXYZ')
        SYLLABLE = CONSONANT VOWEL CONSONANT

*       Using patterns in matching
        "cat" SYLLABLE                            :S(MATCHED)F(NOT_MATCHED)

Success and Failure

Every SNOBOL statement either succeeds or fails. This result controls program flow through goto labels:

*       Read input until end of file
LOOP    LINE = INPUT                              :F(DONE)
        OUTPUT = LINE                             :(LOOP)
DONE
END

The :F(DONE) means “on failure, go to DONE”. The :S(label) means “on success, go to label”. You can combine them: :S(SUCCESS)F(FAILURE).
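
As a minimal sketch of combining both branches, the fragment below tests a string for a substring and jumps one way or the other. The variable LINE and the labels FOUND and CLEAN are names chosen for illustration only.

```snobol
*       Branch on whether a pattern match succeeds
        LINE = 'error: disk full'
        LINE 'error'                              :S(FOUND)F(CLEAN)
FOUND   OUTPUT = 'Problem reported: ' LINE        :(END)
CLEAN   OUTPUT = 'No problems found'
END
```

Run as written, this should print "Problem reported: error: disk full". Change the string so it no longer contains error and control goes to CLEAN instead.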

Dynamic Typing and Assignment

Variables in SNOBOL have no declared type and can hold anything:

        X = 42                     ;* Integer
        X = "Hello"                ;* Now a string
        X = ANY('abc')             ;* Now a pattern
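
One way to watch the type change is to ask for it with the built-in DATATYPE function. This is only a sketch, and the values assigned to X are arbitrary.

```snobol
*       Report the current type of X after each assignment
        X = 42
        OUTPUT = DATATYPE(X)
        X = 'Hello'
        OUTPUT = DATATYPE(X)
        X = ANY('abc')
        OUTPUT = DATATYPE(X)
END
```

CSNOBOL4 should report INTEGER, STRING, and PATTERN in turn. Other implementations may vary the letter case.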

The Statement Format

SNOBOL statements have a distinctive format inherited from assembly language:

label   subject pattern = replacement   :goto

Where:

  • label - Optional name for goto targets
  • subject - The string to match against
  • pattern - What to look for
  • replacement - What to replace matched text with
  • goto - Where to go based on success/failure
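
To make the fields concrete, here is a small sketch in which one statement uses all five. The names TEXT and AGAIN and the sample string are illustrative assumptions, not part of the language.

```snobol
*       label   subject  pattern  = replacement   :goto
        TEXT = 'red fish, red boat'
AGAIN   TEXT     'red'    = 'blue'                :S(AGAIN)
        OUTPUT = TEXT
END
```

Each pass replaces the first remaining 'red'. When none is left, the statement fails, control falls through, and the program prints "blue fish, blue boat".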

Language Features

Built-in Patterns

SNOBOL4 includes powerful built-in patterns:

Pattern     Meaning
ARB         Match arbitrary characters (shortest)
REM         Match remainder of string
LEN(n)      Match exactly n characters
SPAN(s)     Match longest run of characters in s
BREAK(s)    Match up to (not including) any character in s
ANY(s)      Match any single character in s
NOTANY(s)   Match any character not in s
TAB(n)      Match to position n
RTAB(n)     Match to n positions from end
POS(n)      Succeed if at position n
RPOS(n)     Succeed if n positions from end
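
Several of these are usually combined in one pattern. The sketch below splits a 'key=number' record using POS, BREAK, LEN, SPAN, and RPOS. The record format and the names DIGITS, KEYVAL, REC, KEY, NUM, and BAD are assumptions made up for the example.

```snobol
*       Split a record of the form key=number
        DIGITS = SPAN('0123456789')
        KEYVAL = POS(0) BREAK('=') . KEY LEN(1) DIGITS . NUM RPOS(0)
        REC = 'width=1024'
        REC KEYVAL                                :F(BAD)
        OUTPUT = KEY ' -> ' NUM                   :(END)
BAD     OUTPUT = 'Malformed record'
END
```

POS(0) and RPOS(0) pin the match to the whole record, so trailing junk such as 'width=1024px' would take the BAD branch. The sample record prints "width -> 1024".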

Pattern Operators

Patterns can be combined:

*       Concatenation (adjacent patterns)
        WORD = ANY(&UCASE) SPAN(&LCASE)   ;* Capital, then lowercase run

*       Alternation (either/or)
        DIGIT = ANY('0123456789')
        SIGN = ANY('+-') | ''             ;* Optional sign

*       Conditional assignment during match
        "Hello World" BREAK(' ') . FIRST ' ' REM . REST
*       FIRST = "Hello", REST = "World"

Tables (Associative Arrays)

SNOBOL4 introduced associative arrays, which it calls “tables”:

        COUNT = TABLE()
        COUNT<'apple'> = COUNT<'apple'> + 1
        COUNT<'banana'> = COUNT<'banana'> + 1
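
A slightly fuller sketch shows values being read back, including a key that was never stored. The table name STOCK and its keys are hypothetical.

```snobol
*       Tally items, then read the counts back
        STOCK = TABLE()
        STOCK<'apple'> = STOCK<'apple'> + 3
        STOCK<'apple'> = STOCK<'apple'> + 2
        OUTPUT = 'apple: ' STOCK<'apple'>
*       A key that was never stored yields the null string
        OUTPUT = 'pear: [' STOCK<'pear'> ']'
END
```

The null string counts as zero in arithmetic, which is why the first increment needs no initialisation. The program prints "apple: 5" and then "pear: []".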

User-Defined Functions

        DEFINE('FACTORIAL(N)')                    :(END.FACTORIAL)
FACTORIAL
        FACTORIAL = EQ(N,0) 1                     :S(RETURN)
        FACTORIAL = N * FACTORIAL(N - 1)          :(RETURN)
END.FACTORIAL
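
The definition above never calls the function. As a complete sketch of the same idiom, the program below defines and calls a second function; LARGER and its label LARGER_END are hypothetical names. DEFINE registers the prototype, the goto skips over the body, and the function returns whatever was assigned to its own name.

```snobol
*       Define a function, jump past its body, then call it
        DEFINE('LARGER(A,B)')                     :(LARGER_END)
LARGER  LARGER = GT(A,B) A                        :S(RETURN)
        LARGER = B                                :(RETURN)
LARGER_END
        OUTPUT = LARGER(3,7)
END
```

GT(A,B) fails when A is not greater than B, so the first assignment is skipped and the second returns B. Here the program prints 7.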

SNOBOL4 vs Modern Languages

Compared to Regular Expressions

SNOBOL patterns are more powerful than traditional regular expressions:

*       SNOBOL can match balanced parentheses - classical regex cannot
*       (unary * defers evaluation so the pattern can refer to itself)
        BALANCED = '(' ARBNO(NOTANY('()') | *BALANCED) ')'

*       SNOBOL can capture during matching
        "John Smith" BREAK(' ') . FIRST ' ' REM . LAST

Compared to Perl

Perl borrowed heavily from SNOBOL’s text processing philosophy:

# Perl regex
$text =~ /(\w+)\s+(\w+)/;

*       SNOBOL near-equivalent (letters only; \w also matches digits and _)
        TEXT SPAN(&LCASE &UCASE) . WORD1 SPAN(' ') SPAN(&LCASE &UCASE) . WORD2

SNOBOL patterns can also do things that classical regular expressions cannot, like recursive matching. Perl itself only gained recursive regex constructs in version 5.10.
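
To see recursive matching at work, the sketch below anchors a self-referencing pattern to the whole subject and tries it on two strings. The names NESTED, WHOLE, and NO1 and both test strings are illustrative assumptions. The unary * defers evaluation of NESTED until match time, which is what makes the self-reference possible.

```snobol
*       A pattern that refers to itself through deferred evaluation
        NESTED = '(' ARBNO(NOTANY('()') | *NESTED) ')'
        WHOLE = POS(0) NESTED RPOS(0)
        '(a(b)c)' WHOLE                           :F(NO1)
        OUTPUT = '(a(b)c) is balanced'
NO1     '(a(b' WHOLE                              :S(END)
        OUTPUT = '(a(b is not balanced'
END
```

The first string matches and the second does not, so both OUTPUT statements run.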

Compared to Icon

Ralph Griswold created Icon as SNOBOL’s successor, keeping the power but modernizing the syntax:

# Icon - more conventional syntax
every word := !words do
    write(word)

*       SNOBOL - statement-based syntax
LOOP    I = I + 1
        OUTPUT = WORDS<I>                         :S(LOOP)

Code Examples

Word Frequency Counter

*       Count word frequencies in input
        WORD = SPAN(&LCASE &UCASE)
        COUNT = TABLE(100)
*
READ    LINE = INPUT                              :F(PRINT)
NEXT    LINE WORD . W =                           :F(READ)
        COUNT<W> = COUNT<W> + 1                   :(NEXT)
*
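*       Convert the table to a two-column array and print each row
PRINT   RESULT = CONVERT(COUNT, 'ARRAY')          :F(END)
        I = 0
SHOW    I = I + 1
        OUTPUT = RESULT<I,1> ': ' RESULT<I,2>     :S(SHOW)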
END

Simple Calculator

*       Parse and evaluate simple arithmetic
        DIGITS = SPAN('0123456789')
        &ANCHOR = 1
*
READ    OUTPUT = 'Enter expression: '
        EXPR = INPUT                              :F(END)
        EXPR DIGITS . A '+' DIGITS . B            :F(TRY_SUB)
        OUTPUT = A + B                            :(READ)
TRY_SUB EXPR DIGITS . A '-' DIGITS . B            :F(ERROR)
        OUTPUT = A - B                            :(READ)
ERROR   OUTPUT = 'Invalid expression'             :(READ)
END

Modern SNOBOL

CSNOBOL4

Phil Budne’s CSNOBOL4 is the most actively maintained implementation. It’s a faithful port of the original Bell Labs SNOBOL4 to modern systems:

  • Runs on any system with a C compiler
  • Includes all SNOBOL4 features plus SPITBOL extensions
  • Available in Debian/Ubuntu repositories
  • Active development continues

SPITBOL

SPITBOL (Speedy Implementation of SNOBOL) was created for performance:

  • Compiles to native code
  • Significantly faster than interpreted SNOBOL4
  • Macro SPITBOL is open source
  • Used when SNOBOL4’s power is needed with better performance

Running SNOBOL Today

SNOBOL4 is readily available:

# Debian/Ubuntu
apt install snobol4

# Docker (via esolang-box)
docker run --rm -v "$(pwd)":/app -w /app esolang/snobol snobol program.sno

# From source (CSNOBOL4)
# https://www.regressive.org/snobol4/

The SNOBOL Legacy

Influence on Programming Languages

SNOBOL’s ideas appear throughout modern programming:

  • Pattern Matching: Influenced Perl, AWK, Icon, and modern language pattern matching
  • Associative Arrays: Tables pioneered what became hash maps/dictionaries everywhere
  • String as First-Class: Set the standard for treating strings as fundamental
  • Success/Failure Flow: Influenced Icon’s goal-directed evaluation, and has parallels in Prolog’s backtracking

The Griswold Connection

Ralph Griswold, one of SNOBOL’s creators, went on to design:

  • SL5: An intermediate attempt between SNOBOL and Icon
  • Icon: A modern language with SNOBOL’s power and cleaner syntax

Icon in turn led to Unicon, an object-oriented successor developed by Clinton Jeffery and others rather than by Griswold himself.

This family tree shows how SNOBOL’s ideas evolved into more accessible forms.

Learning SNOBOL

Key Mental Shifts

Coming from modern languages, you’ll need to adjust your thinking:

  1. Every statement has a result - Success or failure, not just a value
  2. Patterns are data - You build them up like any other data structure
  3. Goto is normal - SNOBOL predates structured programming
  4. Columns matter - Labels in column 1, statements start after a space/tab

Common Gotchas

  • Statements must start with whitespace (column 1 is for labels only)
  • The * in column 1 makes a comment
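  • Whitespace is significant: a blank between two values means concatenation, and binary operators such as = and + need a space on each side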
  • String comparison is with IDENT() and DIFFER(), not =
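
The last gotcha trips up most newcomers, so here is a minimal sketch. ANSWER and OTHER are hypothetical names. IDENT succeeds when its two arguments are identical and DIFFER succeeds when they are not; both results feed the usual success/failure gotos.

```snobol
*       = always assigns; IDENT and DIFFER do the comparing
        ANSWER = 'yes'
        IDENT(ANSWER, 'yes')                      :F(OTHER)
        OUTPUT = 'Answer is yes'
OTHER   DIFFER(ANSWER, 'no')                      :F(END)
        OUTPUT = 'Answer is not no'
END
```

Both tests succeed here, so both messages are printed.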

Why Learn SNOBOL?

Even if you never use SNOBOL in production:

  1. Understand Pattern Matching Origins: See an early and influential take on pattern matching that developed separately from regular expressions
  2. Different Paradigm: Success/failure-driven programming is a unique perspective
  3. Appreciate Modern Features: Understand why we have hash maps and string operations
  4. Historical Context: Understand the evolution from SNOBOL to Icon to modern languages

SNOBOL represents a fascinating branch of programming language evolution - one where text manipulation and pattern matching were the primary concerns, not numerical computation. Its influence echoes through every regex you write and every dictionary you use.

Learning Resources

Books

  • The SNOBOL4 Programming Language by Griswold, Poage, and Polonsky (1971) - The definitive reference
  • A SNOBOL4 Primer by Griswold and Griswold - Gentler introduction
  • Algorithms in SNOBOL4 by Gimpel - Advanced techniques

Timeline

  • 1962: SNOBOL (StriNg Oriented and symBOlic Language) created at Bell Labs by Farber, Griswold, and Polonsky
  • 1963: SNOBOL2 released with improved pattern matching capabilities
  • 1965: SNOBOL3 released with user-defined functions and improved I/O
  • 1967: SNOBOL4 released - the definitive version with patterns as first-class data types
  • 1971: SPITBOL (Speedy Implementation of SNOBOL) created for performance-critical applications
  • 1977: Ralph Griswold begins work on Icon as SNOBOL's successor at University of Arizona
  • 1986: Phil Budne begins CSNOBOL4, a portable C implementation
  • 1990s: SNOBOL use declines as Perl and AWK gain popularity for text processing
  • 2023: CSNOBOL4 2.3 released, maintaining active development

Notable Uses & Legacy

Text Processing at Bell Labs

SNOBOL was extensively used at Bell Labs for text processing, document analysis, and compiler construction in the 1960s and 1970s.

Natural Language Processing Research

Early NLP research used SNOBOL for parsing and analyzing natural language text due to its powerful pattern matching.

Humanities Computing

SNOBOL was widely adopted in digital humanities for concordance generation, stylistic analysis, and text processing in the 1970s-1980s.

Compiler Writing

The pattern matching capabilities made SNOBOL popular for writing compilers and language processors.

Teaching

SNOBOL was taught at many universities in the 1970s as an example of a non-procedural approach to programming.

Language Influence

Influenced By

COMIT, Fortran

Influenced

Icon, AWK, Perl, Lua

Running Today

Run examples using the esolang-box Docker image:

docker pull esolang/snobol:latest

Example usage:

docker run --rm -v "$(pwd)":/app -w /app esolang/snobol:latest snobol hello.sno
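
The command assumes a file named hello.sno in the current directory, whose contents are not shown. A minimal candidate, offered only as a sketch, is the classic two-line program:

```snobol
*       hello.sno - print a greeting and stop
        OUTPUT = 'Hello, World!'
END
```

The statement is indented because column 1 is reserved for labels, and END must sit in column 1 to close the program.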
