Notes on Irrevocable and Adversarial Search (FIT3080)

  • Topic focus: Irrevocable search algorithms and adversarial search in AI, with practical examples and complexity considerations.

Learning objectives (overview)

  • Problem formulation and problem-solving agents

  • Search algorithms and strategies

    • Tentative algorithms: Backtrack (Ch. 5.1, 5.3, 5.4)

    • Tree and graph search strategies

    • Uninformed: BFS, UCS, DFS, DLS, IDS (Ch. 3.1–3.4.4, 3.4.6)

    • Informed: Greedy best-first search, Algorithm A, A* (Ch. 3.5.1–3.5.4)

    • Irrevocable algorithms: Informed variants like Hill climbing, Local beam search, Simulated annealing, Genetic algorithms (Ch. 4.1)

  • Adversarial search algorithms (Ch. 6.1–6.3)

    • Optimal decisions via Minimax and α-β pruning


Irrevocable Search Algorithms

  • Local search overview

    • Keep a single current state and attempt to improve it

    • Often used for optimization problems, where only the final configuration matters and the path to it is irrelevant

    • State space: set of complete configurations

    • Example application: constraint satisfaction (e.g., n-queens)

  • Local search examples and concepts

    • n-Queens problem: place n queens on an n×n board so that no two attack each other (row/column/diagonal).

    • Hill climbing and Local Beam Search are described as alternative irrevocable methods.

  • Hill climbing and its procedure (Hill Climbing(current-state))

    • 1. If current-state = goal-state, return it.

    • 2. Otherwise, repeat until a solution is found or no operators remain:

    • a) Select an operator not yet applied to current-state and apply it to generate new-state

    • b) Evaluate new-state:

      • i) If new-state = goal-state, return it

      • ii) Else if new-state is better than current-state, set current-state ← new-state

    • Variant noted: Steepest ascent hill-climbing selects the best operator.

    • Key limitation: can get stuck in local maxima.
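The procedure above can be collapsed into a short Python sketch of the steepest-ascent variant (the toy objective and neighbour functions below are illustrative, not from the slides):

```python
def hill_climb(state, neighbours, value, max_steps=1000):
    """Steepest-ascent hill climbing: move to the best neighbour
    until no neighbour improves on the current state."""
    for _ in range(max_steps):
        best = max(neighbours(state), key=value, default=state)
        if value(best) <= value(state):  # no improving move: local maximum
            return state
        state = best
    return state

# Toy example: maximise f(x) = -(x - 3)^2 over the integers.
f = lambda x: -(x - 3) ** 2
nbrs = lambda x: [x - 1, x + 1]
print(hill_climb(0, nbrs, f))  # climbs 0 -> 1 -> 2 -> 3 and stops at 3
```

Note that the loop stops as soon as no neighbour improves the current value, which is exactly how the algorithm gets trapped at a local maximum.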

  • Hill climbing example: 8-puzzle visual intuition

    • Objective: minimize a heuristic such as the number of tiles out of place.

    • Example heuristic values shown as f = - (number of tiles out of place).

    • Demonstrates progression from initial state toward improving states, with potential stagnation at local maxima.
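The tiles-out-of-place heuristic in the example can be computed as below (the board encoding, with 0 as the blank, is an illustrative choice):

```python
def tiles_out_of_place(state, goal):
    """8-puzzle heuristic: number of non-blank tiles not in their goal cell."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)   # 0 marks the blank
start = (1, 2, 3, 4, 5, 6, 0, 7, 8)
print(-tiles_out_of_place(start, goal))  # f = -2: tiles 7 and 8 are misplaced
```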

  • Local Beam Search

    • Maintains k states instead of a single current state.

    • Initialize with k randomly generated states.

    • Each iteration: generate all successors of all k states; if any successor is a goal, return it.

    • Else, select the k best successors from the complete list and repeat.
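The iteration above can be sketched directly (the integer toy problem is illustrative):

```python
import random

def local_beam_search(k, random_state, successors, value, is_goal, max_iters=100):
    """Keep k states; expand them all, then keep the k best successors."""
    states = [random_state() for _ in range(k)]
    for _ in range(max_iters):
        pool = [s2 for s in states for s2 in successors(s)]
        for s in pool:
            if is_goal(s):
                return s
        states = sorted(pool, key=value, reverse=True)[:k]
    return max(states, key=value)

# Toy example: reach x = 10 by stepping integers up or down.
random.seed(0)
result = local_beam_search(
    k=3,
    random_state=lambda: random.randint(0, 5),
    successors=lambda x: [x - 1, x + 1],
    value=lambda x: -abs(x - 10),
    is_goal=lambda x: x == 10,
)
print(result)  # 10
```

Unlike running k independent hill climbs, the k best successors are chosen from the *combined* pool, so information flows between the parallel searches.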

  • Simulated Annealing (SA)

    • Inspiration from physical annealing processes.

    • Idea: escape local maxima/minima by allowing worse moves with decreasing frequency over time (temperature T).

    • Annealing schedule: how T is lowered over time.

    • SA procedure (current-state):

    • 1. If current-state = goal-state, return it

    • 2. BestSoFar ← current-state

    • 3. Initialize T according to the schedule

    • 4. Until no more operators can be applied do:

      • a) Generate new-state from an untried operator

      • b) Compute ΔE = Value(current-state) − Value(new-state)

      • i) If new-state = goal-state, return it

      • ii) Else if ΔE < 0 (new-state better than current-state), set current-state ← new-state; if new-state better than BestSoFar, update BestSoFar

      • iii) Else, with probability Pr = e^{−ΔE / T}, set current-state ← new-state

      • c) Revise T according to schedule

      • d) If T = 0, return BestSoFar

    • As written, the procedure targets maximization problems (higher Value is better).
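A runnable sketch of the SA loop above, using a geometric annealing schedule (the schedule parameters and toy objective are illustrative choices, not from the slides):

```python
import math
import random

def simulated_annealing(state, neighbour, value, t0=10.0, cooling=0.95, t_min=1e-3):
    """SA for maximisation: accept worse moves with probability e^(-dE/T)."""
    best, t = state, t0
    while t > t_min:
        new = neighbour(state)
        d_e = value(state) - value(new)      # dE = Value(current) - Value(new)
        if d_e < 0 or random.random() < math.exp(-d_e / t):
            state = new                       # accept (always if better)
            if value(state) > value(best):
                best = state                  # track BestSoFar
        t *= cooling                          # revise T per the schedule
    return best

# Toy example: maximise f(x) = -(x - 7)^2 over the integers via a random walk.
random.seed(1)
f = lambda x: -(x - 7) ** 2
result = simulated_annealing(0, lambda x: x + random.choice([-1, 1]), f)
print(result, f(result))
```

Early on (high T) almost any move is accepted; as T falls, the acceptance probability e^{−ΔE/T} for worse moves shrinks toward zero and the search behaves like hill climbing.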

  • Properties of Simulated Annealing

    • If T decreases slowly enough, SA can find a global optimum with probability approaching 1.

    • Widely used in domains like VLSI layout and airline scheduling.

  • Genetic Algorithms (GAs)

    • Start with a population of k randomly generated states (chromosomes).

    • A chromosome is a string over a finite alphabet (often binary 0/1).

    • A successor is generated by combining two parent states (crossover).

    • Evaluation function (fitness): higher values indicate better states.

    • GA loop: produce next generation via selection, crossover, and mutation.

  • GA example: 8-Queens Problem (representation and fitness)

    • Chromosome: one gene per column (8 genes per chromosome)

    • Gene: row number (1–8) of the queen in that column

    • Fitness: number of non-attacking queen pairs (min = 0, max = 28)

    • Maximum possible: 8 × 7 / 2 = 28 pairs

    • Example selection probabilities (illustrative): ~31%, ~29%, ~26%, ~14% for different chromosomes (as shown in slides).

    • Demonstrates initial population, selection, crossover, and evolution toward higher fitness.
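A compact sketch of the GA loop for 8-queens under the representation above. The population size, mutation rate, and elitism (keeping the two best) are illustrative choices, not prescribed by the slides:

```python
import random

def fitness(chrom):
    """Number of non-attacking queen pairs (max 28 for 8 queens)."""
    n = len(chrom)
    attacks = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if chrom[i] == chrom[j] or abs(chrom[i] - chrom[j]) == j - i
    )
    return n * (n - 1) // 2 - attacks

def genetic_8queens(pop_size=50, mut_rate=0.1, max_gens=500):
    pop = [[random.randint(1, 8) for _ in range(8)] for _ in range(pop_size)]
    for _ in range(max_gens):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == 28:
            return pop[0]                      # perfect solution found
        new_pop = pop[:2]                      # elitism: keep the two best
        weights = [fitness(c) for c in pop]    # fitness-proportional selection
        while len(new_pop) < pop_size:
            p1, p2 = random.choices(pop, weights=weights, k=2)
            cut = random.randint(1, 7)         # one-point crossover
            child = p1[:cut] + p2[cut:]
            if random.random() < mut_rate:     # mutate one random gene
                child[random.randrange(8)] = random.randint(1, 8)
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

random.seed(0)
solution = genetic_8queens()
print(solution, fitness(solution))
```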

  • Summary of graph-search informedness (brief)

    • Informedness in graph-search uses f(n) = g(n) + h(n).

    • For A*: f(n) = g(n) + h(n) with constraints:

    • g(n) ≥ g*(n) (g, the cost of the path found to n, can only overestimate g*, the cheapest cost from the start to n)

    • 0 ≤ h(n) ≤ h*(n)

    • Uninformed graph-search: BFS, UCS, DFS, DLS, IDS correspond to specific choices of g and h (e.g., h = 0 for BFS/UCS).

    • Informed graph-search: Greedy best-first search uses g(n) = 0 and h(n) ≥ 0, so the ordering is driven purely by the heuristic (hence "greedy").

    • A* uses a balance between g and h to achieve optimality under admissible heuristics.
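These strategies can all be seen as one generic best-first search whose frontier-ordering function f is plugged in (the toy graph below and the function names are illustrative):

```python
import heapq

def best_first_search(start, successors, is_goal, f):
    """Generic best-first graph search; f(g, n) orders the frontier.

    f(g, n) = g          -> uniform-cost search
    f(g, n) = h(n)       -> greedy best-first search
    f(g, n) = g + h(n)   -> A*
    """
    frontier = [(f(0, start), 0, start, [start])]
    explored = set()
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if is_goal(node):
            return path
        if node in explored:
            continue
        explored.add(node)
        for succ, cost in successors(node):
            if succ not in explored:
                g2 = g + cost
                heapq.heappush(frontier, (f(g2, succ), g2, succ, path + [succ]))
    return None

# Toy graph: cheapest path A -> D; with h = 0 the search reduces to UCS.
graph = {'A': [('B', 1), ('C', 4)], 'B': [('D', 5)], 'C': [('D', 1)], 'D': []}
path = best_first_search('A', lambda n: graph[n], lambda n: n == 'D',
                         f=lambda g, n: g)     # h = 0
print(path)  # ['A', 'C', 'D'] with total cost 5
```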


Adversarial Search Algorithms

  • Real-world examples of adversarial play

    • Chess: Kasparov vs Deep Blue (1997) – the slides cite a state space of roughly 10^40 states in the context of processing power; Deep Blue searched roughly 30 billion positions per move, sometimes extending promising lines to deeper plies.

    • Go: AlphaGo (2016) – Go's branching factor is enormous (up to 361 legal moves per ply on the 19×19 board), requiring advanced methods beyond brute-force search.

  • Types of games (context for AI search)

    • Perfect information vs imperfect information (deterministic and fully observable vs stochastic/partially observable)

    • 2-player games; sequential; turn-based vs real-time; zero-sum vs non-zero-sum

    • The focus is on 2-player, sequential, perfect information, zero-sum games (e.g., chess, tic-tac-toe).

  • Searching game trees: conventions

    • Players are MAX and MIN; a MAX-favorable position has a positive value, MIN-favorable is negative.

    • Goal: find a winning strategy for MAX (MAX aims to maximize utility; MIN aims to minimize).

  • Example: Tic-tac-toe game tree (illustrative)

    • Root: initial board; moves lead to child nodes alternating MAX and MIN turns; leaves are terminal with utilities like +1 (win for MAX), 0 (draw), -1 (loss for MAX).

  • Game-tree concepts

    • Root: initial configuration

    • Ply: one level of moves (one move by one player)

    • Leaves/terminal: win, loss, draw or utility value

    • Frontier: current boundary of explored nodes

    • Each node has a minimax value or utility

  • Solving game trees

    • A solution labels the root with an outcome (win, loss, draw, or a utility value)

    • Each opponent move is a subproblem; all subproblems must be solved to solve the parent node

  • Winning strategy criteria (for MAX)

    • At a MIN node, every move MIN can make must lead to a position from which MAX can still force a win

    • At a MAX node, at least one MAX move must lead to a position from which MAX can force a win

  • Games vs search problems

    • Opponent is unpredictable: must account for many possible opponent replies

    • Time limits: not all games can be fully searched; select a good first move instead

  • Minimax search

    • Core idea: recursively evaluate game tree using backward induction

    • Key definitions:

    • S0: initial state

    • ACTIONS(s): set of moves from state s

    • RESULT(s, a): state after applying action a in s

    • IS-TERMINAL(s): true if s is a terminal (game-over) state

    • UTILITY(s): numeric payoff of terminal state s (from MAX's perspective in a zero-sum game)

    • Minimax search pseudocode (conceptual):

    • MAX-VALUE(state): if IS-TERMINAL(state) return UTILITY(state)

      • for each a ∈ ACTIONS(state): v ← MIN-VALUE(RESULT(state, a)); track max v and best move

    • MIN-VALUE(state): similarly, but minimizes over MAX-VALUE(RESULT(state, a))

    • Depth-first exploration: complete to depth m; time complexity O(b^m); space complexity O(bm) for DFS, or O(m) if successors are generated one at a time as in backtracking

    • Illustrative 2-ply example shows how values propagate up the tree.
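The MAX-VALUE/MIN-VALUE pseudocode collapses to one short recursive function. The 2-ply tree below (three MIN subtrees with leaf utilities) is the standard textbook example:

```python
def minimax(state, actions, result, is_terminal, utility, maximizing=True):
    """Backward-induction minimax value of state (MAX to move if maximizing)."""
    if is_terminal(state):
        return utility(state)
    values = [
        minimax(result(state, a), actions, result,
                is_terminal, utility, not maximizing)
        for a in actions(state)
    ]
    return max(values) if maximizing else min(values)

# 2-ply game tree as nested lists; leaves are utilities for MAX.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = minimax(
    tree,
    actions=lambda s: range(len(s)),
    result=lambda s, a: s[a],
    is_terminal=lambda s: not isinstance(s, list),
    utility=lambda s: s,
)
print(value)  # 3: MIN takes the minimum of each subtree (3, 2, 2); MAX takes the max
```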

  • Minimax example: Tic-tac-toe evaluation

    • Evaluation function in the example: difference in available lines for MAX vs MIN

    • Example values illustrate how moves influence future opportunities (e.g., center control is strong, etc.)

  • Alpha-Beta pruning (α-β pruning)

    • Definitions:

    • α-value for a MAX node: best value found so far for MAX (lower bound on MAX’s value)

    • β-value for a MIN node: best value found so far for MIN (upper bound on MIN’s value)

    • Pruning rules:

    • α-cutoff: can prune below a MIN node if its β-value ≤ α-value of some MAX ancestor; set the MIN node’s value to its β-value

    • β-cutoff: can prune below a MAX node if its α-value ≥ β-value of some MIN ancestor; set the MAX node’s value to its α-value

    • Termination condition: once all successors of the start node have final backed-up values, the best first move is the successor with the highest value

    • Practice examples and figures show α and β propagation and when cutoffs occur
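The cutoff rules can be added to the minimax recursion with two bookkeeping values. On the same 2-ply example tree, the pruned search returns the same root value while skipping leaves that cannot affect it:

```python
import math

def alphabeta(state, actions, result, is_terminal, utility,
              maximizing=True, alpha=-math.inf, beta=math.inf):
    """Minimax value with alpha-beta pruning (same answer, fewer nodes)."""
    if is_terminal(state):
        return utility(state)
    if maximizing:
        v = -math.inf
        for a in actions(state):
            v = max(v, alphabeta(result(state, a), actions, result,
                                 is_terminal, utility, False, alpha, beta))
            alpha = max(alpha, v)
            if alpha >= beta:    # beta-cutoff: a MIN ancestor will avoid this line
                break
        return v
    v = math.inf
    for a in actions(state):
        v = min(v, alphabeta(result(state, a), actions, result,
                             is_terminal, utility, True, alpha, beta))
        beta = min(beta, v)
        if beta <= alpha:        # alpha-cutoff: a MAX ancestor will avoid this line
            break
    return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
value = alphabeta(tree, lambda s: range(len(s)), lambda s, a: s[a],
                  lambda s: not isinstance(s, list), lambda s: s)
print(value)  # 3, identical to plain minimax
```

In the second subtree, for instance, the first leaf (2) drops β to 2 ≤ α = 3, so the remaining leaves 4 and 6 are never examined.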

  • Move ordering and search efficiency

    • α-β effectiveness depends on ordering of examined states

    • With perfect ordering, time complexity can be O(b^(m/2)) – the depth of search effectively doubles

    • Dynamic ordering schemes improve pruning efficiency toward the theoretical limit

  • Resource limits and practical considerations

    • Time per move is limited; e.g., 100 seconds per move, exploring ~10^6 nodes per move

    • If terminal states are not reached within time, treat the frontier as a terminal node and rely on an evaluation function to estimate utility

  • Evaluation functions (for non-terminal searches)

    • From experience (e.g., chess):

    • Piece material values (pawn = 1, …, queen = 9)

    • Linear weighted sum of features: Eval = w1 f1 + w2 f2 + … + wn fn

    • Example features: difference in piece counts, positional heuristics, king safety, mobility, etc.

    • From preprocessing (e.g., endgame databases in chess): use pre-computed endgame data for the last few plies
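The linear weighted sum is a one-liner in code; the feature vector and weights below are illustrative, using the conventional 1/3/3/5/9 material values:

```python
def linear_eval(features, weights):
    """Eval = w1*f1 + w2*f2 + ... + wn*fn (positive favours MAX)."""
    return sum(w * f for w, f in zip(weights, features))

# Example feature vector: material difference from MAX's perspective,
# ordered (pawns, knights, bishops, rooks, queens).
material_diff = [1, 0, -1, 0, 0]   # one pawn up, one bishop down
weights = [1, 3, 3, 5, 9]
print(linear_eval(material_diff, weights))  # 1*1 + 3*(-1) = -2
```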

  • Search strategies (high-level naming and roles)

    • Depth-limited search with enhancements: quiescence search, singular extension

    • Iterative deepening search

    • Beam search (prune to n-best moves)

    • Two broad types:

    • Type A: wide but shallow

    • Type B: deep but narrow

  • Deterministic games in practice (highlights and landmarks)

    • Checkers (Chinook): defeated world champion in 1990 using α-β search with a huge endgame database (~39 trillion endgame positions)

    • Chess: Deep Blue (1997) searched ~30 billion positions per move, depth often 14, extended up to 40 for promising lines; heuristic reductions reduced effective branching factor (EBF) to ~3 in practice

    • Othello: computers defeated world champions (1997) 6-0

    • Go: branching factor too large for brute-force α-β; 2016 AlphaGo used deep learning and search synergistically to beat the world champion 4-1

  • Summary of insights from adversarial search

    • Perfection is unattainable; we must approximate optimal play

    • These problems force us to think about what to think about (which nodes to keep or discard and to what depth)


Reading and references

  • Primary text: Russell, S. and Norvig, P., Artificial Intelligence – A Modern Approach (4th ed), Prentice Hall

    • Chapter 4, Section 4.1 – Irrevocable search algorithms

    • Chapter 6, Sections 6.1–6.3 – Adversarial search algorithms


Next lecture topic

  • Bayesian Networks


Quick glossary of key equations and concepts

  • Heuristic search scoring: f(n) = g(n) + h(n)

    • g(n) ≥ g*(n)

    • 0 ≤ h(n) ≤ h*(n)

  • A* optimality requires admissible heuristics (h(n) never overestimates the true cost)

  • Simulated annealing acceptance probability: P(accept) = e^{−ΔE / T}

  • Genetic algorithm fitness: higher fitness means better candidate; selection/crossover/mutation drive evolution toward high-fitness solutions

  • n-Queens fitness example: max fitness for 8-queens is 28 = 8 × 7 / 2

  • Minimax complexity: O(b^m) time, O(bm) space for classical DFS exploration; α-β pruning aims to reduce the number of explored nodes

  • Notation recap

    • MAX and MIN denote the two players

    • Terminal leaves correspond to win/loss/draw or utility values

    • Depth (ply) indicates how many moves ahead are considered

    • Branching factor b, depth m, leading to rough complexity estimates
