Procedural Content Generation: Grammars, Search, Constraint Satisfaction, & Wave Function Collapse
Rule-Based PCG: Recap & Limitations
- Previously covered: simple rule systems can generate impressive outputs, e.g.
- Intricate height-map terrain
- Cellular-automata caverns
- Fundamental issue: they give no formal guarantees about desirable global
properties
- Example: a terrain might look convincing yet be unnavigable for the
player or AI
- Fixes require ad-hoc rule tweaks → hard to prove they work for all
outcomes
- Motivating question → How do we add guarantees while keeping generative
power? ⇒ Enter grammars & L-systems
Generative Grammars
- Historical roots: Noam Chomsky (1960s)
- Studied the formal structure of natural language
- Abstract view: a language = symbols + production rules specifying legal
arrangements
- Core components
- Symbols / variables – placeholders (e.g., nouns, limbs, tiles)
- Terminals / constants – atomic tokens that appear in the final string
- Production rules – rewrite one symbol sequence into another
- Axiom / start symbol – initial string (seed)
- Repeated rule application ⇒ large space of outputs
- Flexibility lets us model
- Sentences → natural-language generation
- Limbs → creature or avatar synthesis
- Tiles/rooms → level layouts
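As a tiny illustration of the components above, the rewrite loop can be sketched in Python; the rule set (LEVEL, ROOM, CORRIDOR and their expansions) is hypothetical, not taken from any real game:

```python
import random

# Non-terminals (keys of RULES) expand via production rules; everything
# else is a terminal that appears in the final output.
RULES = {
    "LEVEL": [["ROOM", "CORRIDOR", "ROOM"]],
    "ROOM": [["treasure_room"], ["monster_room"], ["empty_room"]],
    "CORRIDOR": [["hallway"], ["hallway", "trap"]],
}

def expand(symbols, rng):
    """Rewrite non-terminals left to right until only terminals remain."""
    out = []
    for sym in symbols:
        if sym in RULES:
            out.extend(expand(rng.choice(RULES[sym]), rng))
        else:
            out.append(sym)  # terminal: keep as-is
    return out

# Axiom = ["LEVEL"]; repeated rule application yields one of many layouts.
level = expand(["LEVEL"], random.Random(0))
print(level)
```

Different seeds explore the space of strings the grammar can derive, which is exactly the "large space of outputs" the notes describe.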
Strengths & Weaknesses
- Pros
- Highly general-purpose
- Capable of high-quality, hierarchical structure
- Fast to evaluate once rules exist
- Designer retains full control via authored rules/components
- Cons
- Heavy authorial burden – every piece & relationship must be encoded
- Debugging nightmare – to test a fix you must, in theory, sample many
outputs
- Risk of over-constraining (no variety) or under-constraining
(nonsensical outputs)
Case Study: Spelunky Level Generation
- Each level begins as a grid of rooms (4×4 in classic Spelunky)
- Designer-made templates (e.g., Landing, Corridor, Drop) are the
building blocks
- Generation algorithm
- Place templates one at a time
- When choosing a template, enforce neighborhood constraints (edges must
align, entrance/exit connectivity, etc.)
- Resulting macro-layout later filled with finer entities (traps, gold,
monsters)
- Effect:
- Combines designer authorship (templates) + grammatical positioning rules
- Guarantees a fully traversable level while preserving theme variety
L-Systems (Lindenmayer Systems)
- Special grammar where every symbol is rewritten simultaneously each
timestep ⇒ good for growth simulations
- Formal tuple: G = (V, S, ω, P)
- V = variables (rewriteable, draw commands optional)
- S = constants (no rewrite, usually turtle commands like +, -, [, ])
- ω = axiom (initial string)
- P = production rules applied in parallel
- Classic binary-tree example
- Variables: 0 (leaf), 1 (branch segment)
- Constants: [ and ] (push/pop turtle state → branch)
- Rules: 1 → 11, 0 → 1[0]0
- Axiom: 0
- After several iterations, yields branching structure with self-similar
fractal detail
- Fractal plant (angle 25°) rules: X → F+[[X]-X]-F[-X]+X, F → FF
- F = draw forward, +/- = rotate ±angle, [ ] = push/pop
- Significance
- Concise rule set ⇒ visually rich organic forms
- Widely used for foliage, roads, lightning, coral, etc.
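The parallel-rewrite step is only a few lines of Python. This sketch uses the binary-tree rules from the notes (1 → 11, 0 → 1[0]0, axiom 0); constants like [ and ] simply map to themselves:

```python
def lsystem(axiom, rules, iterations):
    """Apply all production rules in parallel each timestep."""
    s = axiom
    for _ in range(iterations):
        # Every symbol is rewritten simultaneously; symbols without a
        # rule (the constants) are copied through unchanged.
        s = "".join(rules.get(ch, ch) for ch in s)
    return s

tree = lsystem("0", {"1": "11", "0": "1[0]0"}, 3)
print(tree)  # → 1111[11[1[0]0]1[0]0]11[1[0]0]1[0]0
```

Feeding the resulting string to a turtle renderer (push state on [, pop on ]) draws the branching structure.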
Search-Based Procedural Content Generation (SB-PCG)
- Idea: skip explicit rules; instead define a fitness / heuristic that
scores output quality and let a search algorithm find high-scoring artifacts
- Common algorithms
- Hill Climbing
- Simulated Annealing
- Genetic Algorithms / Evolutionary Search
- Reference: Togelius et al., “Search-Based Procedural Content Generation”
Heuristics & Energy Analogy
- Model generation as an optimization problem
- Energy ↔ negative fitness (lower is better), or vice-versa
- Need
- Representation of content (parameter vector, tile grid, etc.)
- Neighbor function (small random tweak) ⇒ defines search space graph
- Fitness/energy function
Hill Climbing
- Greedy ascent (or descent) along best neighboring state
- Characteristics
- Simple, fast
- Local maxima/minima trap risk
- Mitigation
- Multiple random restarts
- Momentum variants (keep moving even if small decline, hoping to pass
narrow valleys)
- Visualization: “fitness landscape” with peaks; algorithm follows gradient
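A minimal sketch of greedy ascent plus the random-restart mitigation, over a made-up 1-D fitness landscape with a local peak (value 5 at index 2) and a global peak (value 9 at index 9):

```python
def hill_climb(fitness, state, neighbors):
    """Greedy ascent: repeatedly move to the best neighbor while it improves."""
    while True:
        best = max(neighbors(state), key=fitness)
        if fitness(best) <= fitness(state):
            return state  # local maximum: no neighbor is better
        state = best

def restarts(fitness, starts, neighbors):
    """Mitigation from the notes: climb from several starts, keep the best."""
    return max((hill_climb(fitness, s, neighbors) for s in starts), key=fitness)

# Toy landscape: index = state, value = fitness (made up for illustration).
landscape = [0, 1, 5, 1, 0, 2, 4, 6, 8, 9, 7]
fitness = lambda x: landscape[x]
neighbors = lambda x: [max(0, x - 1), min(len(landscape) - 1, x + 1)]

print(restarts(fitness, [0, 5], neighbors))  # start 0 gets trapped at 2; start 5 reaches 9
```

The single-start run from index 0 illustrates the local-maximum trap; adding the second start recovers the global peak.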
Simulated Annealing (SA)
- Metallurgy inspiration: while hot, explore; as the system cools, exploit
- Pseudocode essence
- state ← random initial
- For each step, with temperature T decreasing per the cooling schedule
- new ← random neighbor of state
- If ΔE = E(new) − E(state) < 0, accept (better)
- Else accept with probability e^(−ΔE / T)
- Track best-ever state as fallback
- Properties
- Can (in theory) reach global optimum if cooling schedule is
slow enough
- Bad schedule ⇒ random walk or premature freezing
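The pseudocode above can be sketched as follows; the energy function, cooling rate, and step size are illustrative choices, not canonical values:

```python
import math
import random

def anneal(energy, state, neighbor, t0=10.0, cooling=0.95, steps=500, seed=0):
    """Accept improvements always; accept worse states with probability
    exp(-dE / T); temperature decays geometrically each step."""
    rng = random.Random(seed)
    best, t = state, t0
    for _ in range(steps):
        cand = neighbor(state, rng)
        d_e = energy(cand) - energy(state)
        if d_e < 0 or rng.random() < math.exp(-d_e / t):
            state = cand
        if energy(state) < energy(best):
            best = state  # track best-ever state as fallback
        t *= cooling
    return best

# Made-up two-well landscape: minima near x ≈ ±1.4, the negative one deeper.
energy = lambda x: x**4 - 4 * x**2 + x
step = lambda x, rng: x + rng.uniform(-1, 1)
result = anneal(energy, 0.0, step)
print(round(result, 2))
```

With a hot start the walk crosses the barrier between wells; freezing too fast (e.g., `cooling=0.5`) tends to strand it in whichever well it occupies first.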
Genetic Algorithms (GA)
- Population-based; loosely mirrors biological evolution
- Key steps each generation
- Selection / Reduce – keep top individuals
- Crossover – pair parents (prob. ∝ fitness) and recombine their genetic
representation to create offspring until the population returns to size Y
(where Y > X, the number kept by selection)
- Mutation – with low probability, randomly alter parts of an individual
- Exploration vs exploitation
- Mutation ⇒ exploration (avoids local traps, but pure mutation = random
search)
- Crossover ⇒ exploitation (refines existing good traits, but alone may
converge to suboptimal)
- Pros & Cons
- Moderate authoring burden (representation + fitness)
- Tunable balance between search breadth/depth
- Requires experience to choose mutation/crossover rates and to map
genotype (bitstring, graph) to phenotype (level, texture)
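A sketch of the generation loop on a toy OneMax problem (maximize the number of 1-bits in a bitstring), standing in for a real genotype-to-phenotype mapping; the population size, keep count, and mutation rate are arbitrary illustrative values:

```python
import random

def evolve(fitness, population, generations=40, keep=4, mut_rate=0.1, seed=0):
    """Selection, one-point crossover, and bit-flip mutation over bitstrings."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        # Selection / Reduce: keep the top `keep` individuals (elitism).
        parents = sorted(population, key=fitness, reverse=True)[:keep]
        population = parents[:]
        # Crossover: recombine random parent pairs until the population refills.
        while len(population) < size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit with low probability.
            child = [bit ^ 1 if rng.random() < mut_rate else bit for bit in child]
            population.append(child)
    return max(population, key=fitness)

rng = random.Random(1)
pop = [[rng.randint(0, 1) for _ in range(16)] for _ in range(20)]
best = evolve(sum, pop)  # fitness = number of ones
print(sum(best))
```

Because the kept parents survive each generation unchanged, the best fitness in the population never decreases.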
Comparing Broad PCG Families
- Rule-based: Simple local rules, emergent complexity; hard to control
- Generative grammars: High quality, explicit control; heavy design & debugging effort
- Search-based: Minimal authorship (heuristic only); heuristic design is unintuitive for many
Constraint Satisfaction Problems (CSP)
- Alternative framing: specify constraints that must hold; solver finds
any assignment satisfying them
- Terminology
- Variables / slots – empty placeholders (tiles, names, pixels)
- Domains / values – options each variable can take
- Constraints – relations over subsets of variables that restrict joint
assignments (e.g., adjacency, count limits, connectivity)
- Trivial example: name generator
- Variables: the name slots to fill
- Domain: list of names
- Constraint: initial letters must differ
- General solving loop (resembles Wave Function Collapse)
- Initially each variable's domain = all values
- Observe / Collapse – pick variable, assign one value (via heuristic or
entropy measure)
- Propagate – remove incompatible values from neighbors via constraints
- Repeat until every domain has size 1 (solution) or contradiction ⇒ backtrack
- Relation to search
- CSP = search with aggressive pruning (propagation) ⇒ typically faster than
naive heuristic search
Worked Example: 4×4 Dungeon Room
- Domain values: {1 Blank, 2 Wall, 3 Enemy, 4 Treasure}
- Constraints
- Path must exist from entrance to all treasures/enemies
- ≤1 treasure overall
- ≤2 enemies overall
- Enemies cannot be orthogonally adjacent
- Stepwise collapse/propagation illustrated: domains shrink from {1-4} to singletons
- Insight: If output looks bad, add or tweak constraints (vs tweaking heuristic weight)
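A simplified sketch of the observe/propagate loop for this room, enforcing only the enemy-adjacency constraint (the path-existence and count constraints are omitted for brevity):

```python
BLANK, WALL, ENEMY, TREASURE = 1, 2, 3, 4
N = 4  # 4x4 room

# Initially every cell's domain holds all four values.
domains = {(r, c): {BLANK, WALL, ENEMY, TREASURE}
           for r in range(N) for c in range(N)}

def neighbors(r, c):
    """Orthogonal neighbors inside the grid."""
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= r + dr < N and 0 <= c + dc < N:
            yield (r + dr, c + dc)

def collapse(cell, value):
    """Observe: assign one value, then propagate the adjacency constraint."""
    domains[cell] = {value}
    if value == ENEMY:
        for nb in neighbors(*cell):
            domains[nb].discard(ENEMY)  # propagation prunes neighbor domains

collapse((1, 1), ENEMY)
print(sorted(domains[(1, 2)]))  # → [1, 2, 4]: ENEMY pruned from orthogonal neighbor
print(sorted(domains[(2, 2)]))  # → [1, 2, 3, 4]: diagonal neighbor untouched
```

A full solver would repeat collapse/propagate until every domain is a singleton, backtracking if any domain ever empties.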
Wave Function Collapse (WFC)
- Invented by Max Gumin (2016); popularised by gif demos & games like Caves of Qud
- Key insight: auto-derive constraints from a sample image or tilemap →
lowers authorial burden further
Big Ideas
- Pattern extraction – scan input for every N×N tile patch
(commonly 3×3). Each unique patch = a pattern; frequency counts stored
- Legal overlaps – for every relative offset (up, down, left, right), record
which pattern pairs appear in the sample ⇒ propagator matrix
High-Level Algorithm
PatternsFromSample()   # derive variables & domains
BuildPropagator()      # precompute allowed adjacencies
while !Finished:
    Observe()          # choose lowest-entropy cell, collapse to pattern
    Propagate()        # eliminate incompatible neighbors (classic CSP)
Return output          # visualize by stitching collapsed patterns
- Observe step often chooses value via weighted random (weights = pattern
frequencies), giving similarity to input
- Larger neighborhood size ⇒ output more closely mimics source, but
computation is heavier and variety lower
Novelty vs Classic CSP
- Novelty: automatic constraint learning from example + entropy-driven
observation order ("wave" = probability field over patterns)
- Under the hood: pure CSP propagation (arc consistency, backtracking on
contradiction)
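The entropy-driven observe step can be sketched as follows; the pattern names and frequency counts are made up for illustration:

```python
import math

def shannon_entropy(weights):
    """H = -sum(p_i * log p_i) over the patterns still allowed in a cell."""
    total = sum(weights)
    return -sum((w / total) * math.log(w / total) for w in weights)

def pick_cell(cell_domains, pattern_weights):
    """Observe: choose the uncollapsed cell with lowest non-zero entropy."""
    best, best_h = None, float("inf")
    for cell, allowed in cell_domains.items():
        if len(allowed) <= 1:
            continue  # already collapsed; entropy is zero
        h = shannon_entropy([pattern_weights[p] for p in allowed])
        if h < best_h:
            best, best_h = cell, h
    return best

# Hypothetical frequencies from a sample: pattern A seen 4x, B 2x, C 1x.
weights = {"A": 4, "B": 2, "C": 1}
cells = {(0, 0): {"A"}, (0, 1): {"A", "B"}, (1, 0): {"A", "B", "C"}}
print(pick_cell(cells, weights))  # → (0, 1): fewest remaining possibilities
```

After the chosen cell collapses (weighted random over its allowed patterns), propagation prunes the neighbors and the loop repeats.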
CSP versus Other Techniques
Pros
- Constraints more intuitive to author than numerical heuristics
- Often faster than search thanks to propagation
- WFC shows constraints can even be learned, reducing design labor
Cons
- Requires content be expressible in formal logic
- Propagation cost > simple grammar rewrite (though usually acceptable)
Practical Guidance for Designers / Engineers
- Choose approach based on
- Desired authorial control vs procedural surprise
- Team’s familiarity with formal grammars, optimization, or logic
- Performance constraints (mobile vs PC, runtime vs offline pre-compute)
- Common workflow mix
- Prototype with WFC (fast results) → analyze failures → add explicit
constraints or grammar tweaks for edge cases
- Combine grammar for macro layout + search/CSP to ensure
playability, difficulty pacing
- Remember debugging strategies
- Grammars: unit-test components & derivations
- Search: log fitness curve, visualize neighborhoods
- CSP: visualize domain sizes, heat-map remaining possibilities
Key Equations & Notation Recap
- Simulated annealing acceptance: P(accept worse) = e^(−ΔE / T), where
ΔE = E(new) − E(state)
- Entropy used in WFC (Shannon): H = −Σᵢ pᵢ log pᵢ
(select cell with lowest non-zero H)
- L-system definition tuple: G = (V, S, ω, P)
Further Resources & References
- Search-Based PCG survey – Togelius, Shaker, Nelson (ACM TOG)
- AI for Games (Millington & Funge) – Figures 7.2, 8.14, 8.16, 8.20 (mazes,
MST rooms)
- Game Maker’s Toolkit video on Spelunky PCG (1:36 mark)
- WaveFunctionCollapse GitHub – https://github.com/mxgmn/WaveFunctionCollapse
- Procedural generation talks by Jeff Wilson (Georgia Tech)