AI in Adversarial Games: Minimax, Alpha-Beta Pruning, and Stochastic Approaches

Zero-Sum Games

Definition: Games where one player's gain is exactly another player's loss.
Characteristics:
- Deterministic: Outcomes are precisely determined by actions.
- Two-player: Involve only two participants.
- Turn-taking: Players alternate moves.
- Perfect information: Both players have full knowledge of the game state (e.g., all pieces visible on a board).
Players: Typically referred to as Player MAX and Player MIN.
- Player MAX: Tries to maximize their gain.
- Player MIN: Tries to minimize Player MAX's gain (which is equivalent to maximizing their own gain).
Total Sum: The sum of the utilities for both players is always $0$ . There is no scenario where both players win (no win-win situation).
Examples within AI: Chess and Go are classic examples often studied.
Real-world Relevance: Concepts are also common in fields like politics and business.

Search Applied to Adversarial Games

Initial State: Describes the current board position or general status of the game.
Operators: Define the set of legal moves that each player (MAX and MIN) can make from a given state.
Terminal Nodes: These are leaf nodes in the game tree, signifying that the game has concluded.
Utility Function (Payoff Function):
- Assigns a numerical value to the outcome of a game, not individual moves.
- Represents the value of the terminal state for Player MAX.
- Example (Tic-Tac-Toe): Utility values are typically $-1$ (MIN wins), $0$ (draw), or $1$ (MAX wins).
Game Tree: A representation of all possible sequences of moves in a game.
- Nodes represent game states.
- Edges represent moves.
- Levels alternate between Player MAX and Player MIN.

Game Tree Example - Tic-Tac-Toe

Illustrates how a game state (e.g., initial empty board for MAX) branches into possible moves, leading to subsequent states for the opposing player (MIN), and so on, until terminal states are reached. Each terminal state is assigned a utility value ( $-1, 0, +1$ ).

Minimax Algorithm

Objective: To find the optimal strategy for Player MAX, assuming both players play optimally.
Core Idea: MAX chooses the move that maximizes its utility, while MIN chooses the move that minimizes MAX's utility.
Process:
1. Search the tree to the end: Explore all possible moves until terminal nodes are reached.
2. Assign utility values: At terminal nodes, apply the utility function to determine their values.
3. Back-propagate values: Work backwards from terminal nodes up the tree.
  - At a MIN node, the value propagated up is the minimum of the values of its children (because MIN wants to minimize MAX's score).
  - At a MAX node, the value propagated up is the maximum of the values of its children (because MAX wants to maximize its own score).
4. Best Move for MAX: The optimal move for MAX from the initial state is the one that leads to the child node with the highest minimax value.
Recursion: The optimal strategy is determined by recursively examining the minimax value of each node.
Example (2-ply game): A diagram would show terminal nodes with values, MIN nodes taking the minimum of their children's values, and MAX nodes taking the maximum.

Minimax Algorithm (Formal)

function MINIMAX-SEARCH(game, state) returns an action
  player < game.TO-MOVE(state)
  value, move < MAX-VALUE(game, state, -∞, +∞) // Note: -∞, +∞ are used for Alpha-Beta, not strictly for Minimax here but helpful for context
  return move

function MAX-VALUE(game, state) returns a (utility, move) pair
  if game.IS-TERMINAL(state) then return game.UTILITY(state, player), null
  v < -∞
  for each a in game.ACTIONS(state) do
    v2, a2 < MIN-VALUE(game, game.RESULT(state, a))
    if v2 > v then
      v, move < v2, a
  return v, move

function MIN-VALUE(game, state) returns a (utility, move) pair
  if game.IS-TERMINAL(state) then return game.UTILITY(state, player), null
  v < +∞
  for each a in game.ACTIONS(state) do
    v2, a2 < MAX-VALUE(game, game.RESULT(state, a))
    if v2 < v then
      v, move < v2, a
  return v, move

Properties of Minimax

Complete: Yes, if the game tree is finite.
Optimal: Yes, it guarantees the best possible move against an optimal opponent (or an opponent employing the same strategy).
- This is because it exhaustively explores all possibilities to determine the absolute best outcome under perfect play.
Time Complexity: $O(b^m)$
- $b$ = branching factor (average number of legal moves from a state).
- $m$ = maximum depth of the game tree.
Space Complexity: $O(b \times m)$ (for depth-first exploration).
Feasibility: For complex games like chess, with $b \approx 35$ and $m \approx 100$ for reasonable games, an exact solution via pure Minimax is completely infeasible due to the astronomical number of states.

Alpha-Beta Pruning

Problem: Pure Minimax examines too many states, making it computationally expensive for deep game trees.
Solution: Alpha-Beta pruning simplifies the search space without compromising optimality.
Core Idea: Avoid searching branches of the game tree that can't possibly influence the final decision. It does this by keeping track of the best value already found for MAX (alpha) and for MIN (beta) along the current path.
- Alpha ( $\alpha$ ): The best (highest-value) choice found so far for the maximizing player (MAX) on the path from the root to the current node. Initialized to $-$ \infty. A MAX node updates its alpha value.
- Beta ( $\beta$ ): The best (lowest-value) choice found so far for the minimizing player (MIN) on the path from the root to the current node. Initialized to $+$ \infty. A MIN node updates its beta value.
Pruning Conditions:
- Beta Pruning: If at a MIN node, the value found for a child node ( $v$ ) is less than or equal to the current alpha value ( $\alpha \leq v$ ), then MAX would have already chosen an alternative path with a value of at least $\alpha$ . Therefore, this branch for MIN can be cut off (pruned) because it will not lead to a better outcome for MAX than what's already guaranteed.
- Alpha Pruning: If at a MAX node, the value found for a child node ( $v$ ) is greater than or equal to the current beta value ( $v \geq \beta$ ), then MIN would have already chosen an alternative path with a value of at most $\beta$ . Therefore, this branch for MAX can be cut off (pruned) because it will not lead to a worse outcome for MIN than what's already guaranteed.
Effect on Time Complexity: Can drastically reduce the number of nodes evaluated. In the best case (perfect move ordering), it reduces the complexity to $O(b^{m/2})$ . While still exponential, this is a significant improvement, effectively allowing the search to go twice as deep as pure Minimax in the same amount of time.

Stochastic Approaches

Definition: In contrast to deterministic games (like Chess or Go), stochastic games include elements of chance or randomness.
Characteristics:
- Uncertainty: Outcomes of actions are not always certain; they involve probabilities (e.g., dice rolls in Monopoly, drawing cards).
- Expected Utility: Instead of directly minimizing/maximizing a value, agents consider the expected utility of different outcomes, weighted by their probabilities.
Algorithms:
- Expectiminimax: An adaptation of the Minimax algorithm for games with chance nodes.
  - Chance Nodes: Represent points where random events occur. At these nodes, the value propagated up is the expected value (average) of its children's values, weighted by their probabilities.
  - MAX/MIN Nodes: Behave as in standard Minimax, choosing the maximum or minimum of their children's expected values.
Real-world Examples: Backgammon (dice rolls), poker (random cards), most board games involving dice.