Notes on Irrevocable and Adversarial Search (FIT3080)
Topic focus: Irrevocable search algorithms and adversarial search in AI, with practical examples and complexity considerations.
Learning objectives (overview)
Problem formulation and problem-solving agents
Search algorithms and strategies
Tentative algorithms: Backtrack (Ch. 5.1, 5.3, 5.4)
Tree and graph search strategies
Uninformed: BFS, UCS, DFS, DLS, IDS (Ch. 3.1–3.4.4, 3.4.6)
Informed: Greedy best-first search, A, A* (Ch. 3.5.1–3.5.4)
Irrevocable algorithms: Informed variants like Hill climbing, Local beam search, Simulated annealing, Genetic algorithms (Ch. 4.1)
Adversarial search algorithms (Ch. 6.1–6.3)
Optimal decisions via Minimax and α-β pruning
Irrevocable Search Algorithms
Local search overview
Keep a single current state and attempt to improve it
Often used for optimization where the goal is a feasible configuration
State space: set of complete configurations
Example application: constraint satisfaction problems (e.g., n-queens)
Local search examples and concepts
n-Queens problem: place n queens on an n×n board so that no two attack each other (row/column/diagonal).
Hill climbing and Local Beam Search are described as alternative irrevocable methods.
Hill climbing and its procedure (Hill Climbing(current-state))
1. If current-state = goal-state, return it.
2. Otherwise, repeat until a solution is found or no operators remain:
a) Select an operator not yet applied to current-state and apply it to generate new-state
b) Evaluate new-state:
i) If new-state = goal-state, return it
ii) Else if new-state is better than current-state, set current-state ← new-state
Variant noted: Steepest ascent hill-climbing selects the best operator.
Key limitation: can get stuck in local maxima.
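The steepest-ascent variant above can be sketched in Python; the helper names (neighbors, value, is_goal) and the toy integer example are illustrative assumptions, not from the slides:

```python
def hill_climb(state, neighbors, value, is_goal):
    """Steepest-ascent hill climbing: repeatedly move to the best
    neighbor, stopping at a goal state or a local maximum."""
    while True:
        if is_goal(state):
            return state
        best = max(neighbors(state), key=value, default=None)
        if best is None or value(best) <= value(state):
            return state  # stuck: local maximum (or no operators left)
        state = best

# Toy example: maximize -(x - 3)^2 by integer steps of +/-1
result = hill_climb(
    0,
    neighbors=lambda x: [x - 1, x + 1],
    value=lambda x: -(x - 3) ** 2,
    is_goal=lambda x: x == 3,
)
```

On this convex toy landscape the climb reaches the goal; on a landscape with local maxima the same loop would stop early, which is the limitation noted above.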
Hill climbing example: 8-puzzle visual intuition
Objective: minimize a heuristic such as the number of tiles out of place.
Example heuristic values shown as f = - (number of tiles out of place).
Demonstrates progression from initial state toward improving states, with potential stagnation at local maxima.
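The slide's tiles-out-of-place heuristic can be computed directly; encoding the board as a flat tuple with 0 for the blank is an assumption for illustration:

```python
def tiles_out_of_place(state, goal):
    """Count non-blank tiles not in their goal position (8-puzzle).
    States are length-9 tuples read row by row; 0 denotes the blank."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

goal = (1, 2, 3, 4, 5, 6, 7, 8, 0)
start = (1, 2, 3, 4, 5, 6, 0, 7, 8)
f = -tiles_out_of_place(start, goal)  # two tiles misplaced, so f = -2
```

Hill climbing then prefers successors with higher f, i.e., fewer misplaced tiles.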
Local Beam Search
Maintains k states instead of a single current state.
Initialize with k randomly generated states.
Each iteration: generate all successors of all k states; if any successor is a goal, return it.
Else, select the k best successors from the complete list and repeat.
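The loop above can be sketched as follows; the generator/successor/value helpers and the integer toy problem are illustrative assumptions:

```python
import heapq
import random

def local_beam_search(k, random_state, successors, value, is_goal, iters=100):
    """Keep k states; each round, expand all of them and retain the
    k highest-valued successors. Goals are detected among successors,
    matching the slide's description."""
    states = [random_state() for _ in range(k)]
    for _ in range(iters):
        pool = [s2 for s in states for s2 in successors(s)]
        for s2 in pool:
            if is_goal(s2):
                return s2
        if not pool:
            break
        states = heapq.nlargest(k, pool, key=value)
    return max(states, key=value)

# Toy run: walk integers toward 10
random.seed(0)
best = local_beam_search(
    3,
    random_state=lambda: random.randint(0, 5),
    successors=lambda x: [x - 1, x + 1],
    value=lambda x: -abs(x - 10),
    is_goal=lambda x: x == 10,
)
```

Note the contrast with running k independent hill climbers: here the k survivors are chosen from the *combined* successor pool, so the beam concentrates on the most promising region.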
Simulated Annealing (SA)
Inspiration from physical annealing processes.
Idea: escape local maxima/minima by allowing worse moves with decreasing frequency over time (temperature T).
Annealing schedule: how T is lowered over time.
SA procedure (current-state):
1. If current-state = goal-state, return it
2. BestSoFar ← current-state
3. Initialize T according to the schedule
4. Until no more operators can be applied do:
a) Generate new-state from an untried operator
b) Compute ΔE = Value(current-state) − Value(new-state)
i) If new-state = goal-state, return it
ii) Else if ΔE < 0 (new-state better than current-state), set current-state ← new-state; if new-state better than BestSoFar, update BestSoFar
iii) Else, with probability Pr = e^{−ΔE / T}, set current-state ← new-state
c) Revise T according to schedule
d) If T = 0, return BestSoFar
The examples here frame SA as maximization (higher Value is better), matching the ΔE definition above.
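A minimal sketch of the acceptance rule, simplified to pick a random neighbor rather than track untried operators; the neighbor/value/schedule helpers and the toy problem are illustrative assumptions:

```python
import math
import random

def simulated_annealing(state, neighbor, value, schedule):
    """Maximize value(state). Better moves are always taken; worse
    moves are accepted with probability exp(-dE / T), where
    dE = value(current) - value(new). Returns the best state seen."""
    best = state
    t = 0
    while True:
        T = schedule(t)
        if T <= 0:
            return best  # schedule exhausted: return BestSoFar
        new = neighbor(state)
        dE = value(state) - value(new)
        if dE < 0 or random.random() < math.exp(-dE / T):
            state = new
            if value(state) > value(best):
                best = state
        t += 1

# Toy run: maximize -(x - 5)^2 with a linear cooling schedule
random.seed(1)
best = simulated_annealing(
    0,
    neighbor=lambda x: x + random.choice([-1, 1]),
    value=lambda x: -(x - 5) ** 2,
    schedule=lambda t: max(0.0, 2.0 - 0.005 * t),
)
```

Early on (large T) the exponent is small in magnitude, so even sharply worse moves are often accepted; as T falls, the walk behaves more and more like plain hill climbing.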
Properties of Simulated Annealing
If T decreases slowly enough, SA can find a global optimum with probability approaching 1.
Widely used in domains like VLSI layout and airline scheduling.
Genetic Algorithms (GAs)
Start with a population of k randomly generated states (chromosomes).
A chromosome is a string over a finite alphabet (often binary 0/1).
A successor is generated by combining two parent states (crossover).
Evaluation function (fitness): higher values indicate better states.
GA loop: produce next generation via selection, crossover, and mutation.
GA example: 8-Queens Problem (representation and fitness)
Chromosome: one gene per column (8 genes per chromosome)
Gene: row number (1–8) of the queen in that column
Fitness: number of non-attacking queen pairs (min = 0, max = 28)
Maximum possible: 8 \times 7 / 2 = 28 pairs
Example selection probabilities (illustrative): ~31%, ~29%, ~26%, ~14% for different chromosomes (as shown in slides).
Demonstrates initial population, selection, crossover, and evolution toward higher fitness.
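The 8-queens fitness function above can be written directly; representing a chromosome as a Python list of eight row numbers follows the slide's encoding, and the sample board is one well-known 8-queens solution:

```python
def fitness(chromosome):
    """Number of non-attacking queen pairs. chromosome[c] is the row
    (1-8) of the queen in column c; the maximum is 8*7/2 = 28."""
    n = len(chromosome)
    attacks = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_row = chromosome[i] == chromosome[j]
            # same column is impossible by construction (one gene per column)
            same_diag = abs(chromosome[i] - chromosome[j]) == j - i
            attacks += same_row or same_diag
    return n * (n - 1) // 2 - attacks

solution = [2, 4, 6, 8, 3, 1, 7, 5]  # a known 8-queens solution: fitness 28
```

Selection probabilities like the ~31%/~29%/~26%/~14% figures in the slides come from normalizing each chromosome's fitness by the population's total fitness.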
Summary of graph-search informedness (brief)
Informedness in graph-search uses f(n) = g(n) + h(n).
For A*: f(n) = g(n) + h(n) with constraints:
g(n) \ge g^*(n) (the cost g(n) of the path found to n is at least g^*(n), the cheapest cost of reaching n)
0 \le h(n) \le h^*(n)
Uninformed graph-search: BFS, UCS, DFS, DLS, IDS correspond to specific choices of g and h (e.g., h = 0 for BFS/UCS).
Informed graph-search: Greedy best-first search uses g(n) = 0,\ h(n) \ge 0 leading to a greedy approach.
A* uses a balance between g and h to achieve optimality under admissible heuristics.
Adversarial Search Algorithms
Real-world examples of adversarial play
Chess: Kasparov vs Deep Blue (1997). The slides cite on the order of 10^40 states for chess; Deep Blue searched roughly 30 billion positions per move, sometimes extending promising lines to deeper plies.
Go: AlphaGo (2016) – Go’s branching factor is enormous (up to 361 legal moves per ply on a 19×19 board), requiring advanced methods beyond brute-force search.
Types of games (context for AI search)
Perfect information vs imperfect information (deterministic and fully observable vs stochastic/partially observable)
2-player games; sequential; turn-based vs real-time; zero-sum vs non-zero-sum
The focus is on 2-player, sequential, perfect information, zero-sum games (e.g., chess, tic-tac-toe).
Searching game trees: conventions
Players are MAX and MIN; a MAX-favorable position has a positive value, MIN-favorable is negative.
Goal: find a winning strategy for MAX (MAX aims to maximize utility; MIN aims to minimize).
Example: Tic-tac-toe game tree (illustrative)
Root: initial board; moves lead to child nodes alternating MAX and MIN turns; leaves are terminal with utilities like +1 (win for MAX), 0 (draw), -1 (loss for MAX).
Game-tree concepts
Root: initial configuration
Ply: one level of moves (one move by one player)
Leaves/terminal: win, loss, draw or utility value
Frontier: current boundary of explored nodes
Each node has a minimax value or utility
Solving game trees
A solution labels the root with an outcome (win, loss, draw, or a utility value)
Each opponent move is a subproblem; all subproblems must be solved to solve the parent node
Winning strategy criteria (for MAX)
At a MIN node: every possible MIN move must lead to a position from which MAX can still force a win
At a MAX node: at least one MAX move must lead to a position from which MAX can force a win
Games vs search problems
Opponent is unpredictable: must account for many possible opponent replies
Time limits: not all games can be fully searched; select a good first move instead
Minimax search
Core idea: recursively evaluate game tree using backward induction
Key definitions:
S0: initial state
ACTIONS(s): set of moves from state s
RESULT(s, a): state after applying action a in s
IS-TERMINAL(s): true if s is a terminal state (the game is over)
UTILITY(s): payoff at terminal state s (from MAX’s perspective)
Minimax search pseudocode (conceptual):
MAX-VALUE(state): if IS-TERMINAL(state) return UTILITY(state)
for each a ∈ ACTIONS(state): v ← MIN-VALUE(RESULT(state, a)); track max v and best move
MIN-VALUE(state): similarly, but minimizes over MAX-VALUE(RESULT(state, a))
Depth-first exploration: complete to depth m; time complexity O(b^m); space complexity O(bm) when all successors along the current path are stored, or O(m) for standard backtracking
Illustrative 2-ply example shows how values propagate up the tree.
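The backward induction can be sketched in Python; the nested-list tree encoding and helper names are illustrative assumptions, with leaf values chosen to form a small 2-ply example in the style of the slides:

```python
def minimax(state, is_terminal, utility, actions, result, maximizing):
    """Backward induction over the full game tree; returns the minimax
    value of `state`, with utilities stated from MAX's perspective."""
    if is_terminal(state):
        return utility(state)
    values = [minimax(result(state, a), is_terminal, utility,
                      actions, result, not maximizing)
              for a in actions(state)]
    return max(values) if maximizing else min(values)

# A 2-ply tree: MAX chooses a branch, MIN then chooses a leaf
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = minimax(
    tree,
    is_terminal=lambda s: isinstance(s, int),
    utility=lambda s: s,
    actions=lambda s: range(len(s)),
    result=lambda s, a: s[a],
    maximizing=True,
)
```

MIN backs up 3, 2, and 2 for the three branches, so MAX's root value is 3: exactly the bottom-up propagation the 2-ply example illustrates.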
Minimax example: Tic-tac-toe evaluation
Evaluation function in the example: difference in available lines for MAX vs MIN
Example values illustrate how moves influence future opportunities (e.g., center control is strong, etc.)
Alpha-Beta pruning (α-β pruning)
Definitions:
α-value for a MAX node: best value found so far for MAX (lower bound on MAX’s value)
β-value for a MIN node: best value found so far for MIN (upper bound on MIN’s value)
Pruning rules:
α-cutoff: can prune below a MIN node if its β-value ≤ α-value of some MAX ancestor; set the MIN node’s value to its β-value
β-cutoff: can prune below a MAX node if its α-value ≥ β-value of some MIN ancestor; set the MAX node’s value to its α-value
Termination condition: once all successors of the start node have final backed-up values, the best first move is the successor with the highest value
Practice examples and figures show α and β propagation and when cutoffs occur
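The cutoff rules above can be sketched as a standard α-β recursion; the tree encoding and helper names are illustrative assumptions, reusing the same 2-ply leaf values as the minimax example:

```python
def alphabeta(state, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax with alpha-beta cutoffs. alpha = best value found so far
    for MAX (lower bound); beta = best for MIN (upper bound)."""
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)
    if maximizing:
        v = float("-inf")
        for c in kids:
            v = max(v, alphabeta(c, depth - 1, alpha, beta,
                                 False, children, evaluate))
            alpha = max(alpha, v)
            if alpha >= beta:
                break  # beta-cutoff: alpha >= beta of a MIN ancestor
        return v
    else:
        v = float("inf")
        for c in kids:
            v = min(v, alphabeta(c, depth - 1, alpha, beta,
                                 True, children, evaluate))
            beta = min(beta, v)
            if alpha >= beta:
                break  # alpha-cutoff: beta <= alpha of a MAX ancestor
        return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
root_value = alphabeta(tree, 2, float("-inf"), float("inf"), True,
                       children=lambda s: s if isinstance(s, list) else [],
                       evaluate=lambda s: s)
```

After the first branch establishes α = 3, the second branch is cut off as soon as MIN sees the leaf 2 (its β drops to 2 ≤ α), so the leaves 4 and 6 are never examined; the root value is still 3, the same as plain minimax.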
Move ordering and search efficiency
α-β effectiveness depends on ordering of examined states
With perfect ordering, time complexity can be O(b^(m/2)) – the depth of search effectively doubles
Dynamic ordering schemes improve pruning efficiency toward the theoretical limit
Resource limits and practical considerations
Time per move is limited; e.g., 100 seconds per move, exploring ~10^6 nodes per move
If terminal states are not reached within time, treat the frontier as a terminal node and rely on an evaluation function to estimate utility
Evaluation functions (for non-terminal searches)
From experience (e.g., chess):
Piece material values (pawn = 1, …, queen = 9)
Linear weighted sum of features: Eval = w1 f1 + w2 f2 + … + wn fn
Example features: difference in piece counts, positional heuristics, king safety, mobility, etc.
From preprocessing (e.g., endgame databases in chess): use pre-computed endgame data for the last few plies
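The linear weighted sum can be sketched with a material-only feature set; the knight/bishop/rook weights fill in conventional chess values beyond the slide's pawn = 1 and queen = 9, and are an assumption for illustration:

```python
# Conventional material values (weights w_i); only pawn=1 and queen=9
# come from the notes, the rest are standard assumptions.
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def evaluate(max_counts, min_counts):
    """Eval(s) = sum_i w_i * f_i(s), where each feature f_i is the
    difference in piece counts between MAX and MIN."""
    return sum(w * (max_counts.get(p, 0) - min_counts.get(p, 0))
               for p, w in WEIGHTS.items())

# Equal pawns, but MAX has an extra queen: material edge of +9 for MAX
score = evaluate({"pawn": 8, "queen": 1}, {"pawn": 8})
```

Real evaluation functions add positional features (king safety, mobility, etc.) as further weighted terms in the same sum.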
Search strategies (high-level naming and roles)
Depth-limited search with enhancements: quiescence search, singular extension
Iterative deepening search
Beam search (prune to n-best moves)
Two broad types:
Type A: wide but shallow
Type B: deep but narrow
Deterministic games in practice (highlights and landmarks)
Checkers (Chinook): ended the long reign of human world champion Marion Tinsley in 1994, using α-β search with a huge endgame database (~39 trillion endgame positions)
Chess: Deep Blue (1997) searched ~30 billion positions per move, depth often 14, extended up to 40 for promising lines; heuristic reductions reduced effective branching factor (EBF) to ~3 in practice
Othello: the Logistello program defeated world champion Takeshi Murakami 6-0 (1997)
Go: branching factor too large for brute-force α-β; 2016 AlphaGo used deep learning and search synergistically to beat the world champion 4-1
Summary of insights from adversarial search
Perfection is unattainable; we must approximate optimal play
These problems force us to think about what to think about (which nodes to keep or discard and to what depth)
Reading and references
Primary text: Russell, S. and Norvig, P., Artificial Intelligence – A Modern Approach (4th ed), Pearson
Chapter 4, Section 4.1 – Irrevocable search algorithms
Chapter 6, Sections 6.1–6.3 – Adversarial search algorithms
Next lecture topic
Bayesian Networks
Quick glossary of key equations and concepts
Heuristic search scoring: f(n) = g(n) + h(n)
g(n) \ge g^*(n)
0 \le h(n) \le h^*(n)
A* optimality requires admissible heuristics (h(n) never overestimates the true cost)
Simulated annealing acceptance probability: P(\text{accept}) = e^{-\Delta E / T}
Genetic algorithm fitness: higher fitness means better candidate; selection/crossover/mutation drive evolution toward high-fitness solutions
n-Queens fitness example: max fitness for 8-queens is 28 = 8 \times 7 / 2
Minimax complexity: O(b^m) time, O(bm) space for classical depth-first exploration; optimizations with α-β pruning aim to reduce explored nodes
Notation recap
MAX and MIN denote the two players
Terminal leaves correspond to win/loss/draw or utility values
Depth (ply) indicates how many moves ahead are considered
Branching factor b, depth m, leading to rough complexity estimates