How is a two-player deterministic game represented as a search problem?
By defining the initial state, actions, terminal test, and utility function. In zero-sum games, utilities are equal and opposite.
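A minimal Python sketch of these elements, using a toy "take 1 or 2 sticks, last stick wins" game as a stand-in (the NimGame name and the game itself are illustrative, not from the source):

# Illustrative game: a pile of sticks, players alternately remove 1 or 2,
# and whoever takes the last stick wins. Player 0 is MAX, player 1 is MIN.
class NimGame:
    def initial_state(self):
        return (5, 0)                       # (sticks remaining, player to move)

    def actions(self, state):
        sticks, _ = state
        return [a for a in (1, 2) if a <= sticks]

    def result(self, state, action):
        sticks, player = state
        return (sticks - action, 1 - player)

    def terminal_test(self, state):
        return state[0] == 0

    def utility(self, state):
        # Zero-sum: +1 if MAX took the last stick, -1 if MIN did.
        _, player_to_move = state
        return +1 if player_to_move == 1 else -1

g = NimGame()
print(g.actions(g.initial_state()))         # [1, 2]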
What is the minimax algorithm?
An algorithm that chooses moves to maximize the player's minimum guaranteed payoff, assuming the opponent plays optimally.
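A compact minimax sketch on the same toy stick game (function names are my own; player 0 is MAX, player 1 is MIN, and the player taking the last stick wins):

def actions(state):
    return [a for a in (1, 2) if a <= state[0]]

def result(state, a):
    return (state[0] - a, 1 - state[1])

def utility(state):
    return +1 if state[1] == 1 else -1       # the player who just moved won

def minimax_value(state):
    if state[0] == 0:                        # terminal test
        return utility(state)
    values = [minimax_value(result(state, a)) for a in actions(state)]
    return max(values) if state[1] == 0 else min(values)

def minimax_decision(state):
    # MAX picks the action whose resulting state has the highest minimax value.
    return max(actions(state), key=lambda a: minimax_value(result(state, a)))

print(minimax_decision((5, 0)))              # 2: take two, leave a losing pile of 3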
What are the properties of minimax?
Complete (if the game tree is finite), optimal (against an optimal opponent); time complexity O(b^m), space complexity O(bm) with depth-first exploration.
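Rough arithmetic behind the O(b^m) bound, using the usual textbook chess estimates b ≈ 35 and m ≈ 100 (illustrative, not exact):

import math
b, m = 35, 100                               # rough branching factor and game length for chess
print(f"b^m is about 10^{m * math.log10(b):.0f} nodes")   # ~10^154: exact search is infeasible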
What is a cutoff test and evaluation function in game search?
A cutoff test ends the search early (e.g. depth limit), and an evaluation function estimates utility at cutoff states using heuristics.
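A hedged sketch of depth-limited search with a heuristic evaluation at the cutoff, again on the toy stick game (the heuristic here is invented purely for demonstration):

def evaluate(state):
    # Heuristic guess from MAX's point of view: piles that are multiples of 3
    # are bad for the player to move in this toy game.
    sticks, player = state
    mover_losing = (sticks % 3 == 0)
    if player == 0:                          # MAX to move
        return -0.5 if mover_losing else 0.5
    return 0.5 if mover_losing else -0.5     # MIN to move

def h_minimax(state, depth, limit):
    sticks, player = state
    if sticks == 0:                          # terminal test
        return 1 if player == 1 else -1
    if depth >= limit:                       # cutoff test: stop at the depth limit
        return evaluate(state)               # heuristic estimate of the utility
    children = [h_minimax((sticks - a, 1 - player), depth + 1, limit)
                for a in (1, 2) if a <= sticks]
    return max(children) if player == 0 else min(children)

print(h_minimax((21, 0), 0, 4))              # shallow search guided by the heuristic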
Why are monotonic transformations acceptable in deterministic games?
They preserve the order of utility values, so the relative preference between moves is unchanged.
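A tiny numeric check of this claim (the payoff values and the cubing transform are arbitrary): the argmax, and therefore the chosen move, is unchanged.

# Utility of the position reached by each of MAX's moves (numbers invented).
move_values = {"a": 3, "b": 7, "c": 5}
f = lambda x: x ** 3                         # monotonic transformation
best_raw = max(move_values, key=move_values.get)
best_transformed = max(move_values, key=lambda m: f(move_values[m]))
print(best_raw, best_transformed)            # both "b": the decision is unchanged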
What is alpha-beta pruning?
An optimization of minimax that prunes branches that cannot affect the final decision, using α (the best value found so far for MAX along the path) and β (the best value found so far for MIN along the path).
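A hedged alpha-beta sketch on the toy stick game (names are mine): α tracks the best value MAX can already guarantee, β the best value MIN can already guarantee, and a branch is pruned as soon as α ≥ β.

def alphabeta(state, alpha=float("-inf"), beta=float("inf")):
    sticks, player = state
    if sticks == 0:                          # terminal: the previous mover won
        return 1 if player == 1 else -1
    if player == 0:                          # MAX node
        value = float("-inf")
        for a in (1, 2):
            if a > sticks:
                continue
            value = max(value, alphabeta((sticks - a, 1), alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:                # MIN would never allow this node
                break                        # prune the remaining actions
        return value
    value = float("inf")                     # MIN node
    for a in (1, 2):
        if a > sticks:
            continue
        value = min(value, alphabeta((sticks - a, 0), alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:                    # MAX already has something better
            break
    return value

print(alphabeta((7, 0)))                     # same value plain minimax returns, with pruning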
What is the time complexity of minimax with alpha-beta pruning under perfect ordering?
O(b^(m/2)), effectively doubling the depth that can be searched.
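A back-of-the-envelope comparison, assuming the usual chess estimate b ≈ 35 and a few illustrative depths:

import math
b = 35                                       # rough chess branching factor
for m in (8, 12, 16):
    plain = m * math.log10(b)                # log10 of b**m
    pruned = (m / 2) * math.log10(b)         # log10 of b**(m/2) with perfect ordering
    print(f"depth {m}: ~10^{plain:.0f} vs ~10^{pruned:.0f} nodes")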
What strategy was used by Chinook in checkers?
Chinook used alpha-beta pruning with an endgame database defining perfect play for all positions with 8 or fewer pieces, about 444 billion positions.
How did Deep Blue outperform Kasparov in chess?
Deep Blue used alpha-beta pruning, sophisticated evaluation functions, and massive parallel hardware, extending some lines of search up to 40 ply.
What made Go especially difficult for traditional AI?
Its large branching factor (b > 300) and pattern-recognition complexity made conventional search ineffective.
What is the expectiminimax algorithm?
An extension of minimax for games with chance nodes, computing the expected utility over probabilistic outcomes.
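A small expectiminimax sketch on an invented game tree with MAX and chance nodes (all numbers and structure are made up for illustration):

# Nodes are either numeric leaves (utilities) or (kind, children) tuples.
# For "chance" nodes, children are (probability, subtree) pairs.
def expectiminimax(node):
    if isinstance(node, (int, float)):       # terminal utility
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)

# MAX chooses between two 50/50 gambles (payoffs invented).
tree = ("max", [
    ("chance", [(0.5, 3), (0.5, 5)]),        # expected value 4.0
    ("chance", [(0.5, 0), (0.5, 10)]),       # expected value 5.0
])
print(expectiminimax(tree))                  # 5.0: MAX prefers the second gamble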
Why can't expectiminimax use arbitrary utility transformations?
Exact utility values matter when averaging over chance-node outcomes; only positive linear transformations preserve optimal behavior, whereas deterministic minimax tolerates any monotonic transformation.
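A tiny numeric check of this point (the gambles and the transforms are arbitrary): a monotonic but non-linear transform can flip which chance node has the higher expected value, while a positive linear transform cannot.

# Two 50/50 gambles (payoffs invented). Under raw payoffs, A is better; squaring
# the payoffs makes B look better, while a positive linear transform keeps A better.
A = [4, 4]                                   # expected value 4.0
B = [0, 7]                                   # expected value 3.5
ev = lambda xs: sum(xs) / len(xs)

square = lambda x: x ** 2                    # monotonic, not linear
linear = lambda x: 2 * x + 7                 # positive linear

print(ev(A) > ev(B))                                              # True
print(ev([square(x) for x in A]) > ev([square(x) for x in B]))    # False: preference flipped
print(ev([linear(x) for x in A]) > ev([linear(x) for x in B]))    # True: preference preserved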
How did AlphaGo defeat human champions?
By combining deep learning, reinforcement learning, and Monte Carlo tree search, exploiting patterns and evaluation learning.
What is TD-Gammon and how did it perform?
A backgammon program using depth-2 search with a learned evaluation function; performed at near world-champion level.
1. Random moves (slow learning)
2. Game-specific heuristics
3. Learned evaluation policies (e.g. from self-play)