State space
The set of all possible states (and thus candidate solutions) of a given problem.
Problem space
The problem domain where the knowledge is represented.
Solution
The answer for the problem that can be searched for in the space.
Blind search
Algorithms that search through the entire problem space for a solution without using any domain knowledge.
Backtracking
A blind search algorithm that searches every possible combination. Recursive and uses a depth-first approach.
Brute force search
Another name for backtracking search.
Exhaustive search
Another name for backtracking search.
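A minimal sketch of recursive backtracking may help here: it tries extensions of a partial solution depth-first and backs up when a branch cannot lead to a solution. The callables `choices`, `is_valid`, and `is_solution` are illustrative placeholders, not part of any fixed API.

```python
# Backtracking sketch: recursively tries every combination of choices, depth-first,
# and backs up when a partial assignment cannot lead to a solution.
# `choices`, `is_valid`, and `is_solution` are illustrative placeholders.

def backtrack(partial, choices, is_valid, is_solution):
    if is_solution(partial):
        return partial                      # found a complete solution
    for choice in choices(partial):
        if is_valid(partial + [choice]):    # skip obviously invalid extensions
            result = backtrack(partial + [choice], choices, is_valid, is_solution)
            if result is not None:
                return result               # propagate the first solution found
    return None                             # dead end: backtrack to the caller
```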
Breadth-first search
Systematically searches for a solution along all nodes on one level before considering nodes at the next level.
Guaranteed to find the shortest path (in number of edges).
Time complexity of BFS
O(b^d)
(b = branching factor; d = depth; Exponential).
Space complexity of BFS
O(b^d)
(b = branching factor; d = depth; Exponential).
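A minimal sketch of BFS over an adjacency-list graph (a plain dict mapping node to neighbors, assumed for illustration):

```python
from collections import deque

# Breadth-first search sketch: explores all nodes at the current level before
# moving to the next, so the first path that reaches the goal uses the fewest edges.

def bfs(graph, start, goal):
    frontier = deque([[start]])          # FIFO queue of paths
    visited = {start}
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path                  # shortest path in number of edges
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                frontier.append(path + [neighbor])
    return None                          # goal unreachable
```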
Depth-first search
Looks for a solution along each branch of the problem space, following it to its full depth before backtracking to try another branch.
Blind alley
A situation where DFS goes deeper into the search space without finding anything.
Time complexity of DFS
O(b^d)
(b = branching factor; d = depth; Exponential).
Space complexity of DFS
O(d)
(d = depth; linear)
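A matching recursive DFS sketch, using the same illustrative adjacency-list representation as the BFS example above:

```python
# Depth-first search sketch: follows one branch to its full depth before backing up.
# The only extra storage is the recursion stack and path, hence linear space.

def dfs(graph, node, goal, visited=None):
    if visited is None:
        visited = set()
    if node == goal:
        return [node]
    visited.add(node)
    for neighbor in graph.get(node, []):
        if neighbor not in visited:
            subpath = dfs(graph, neighbor, goal, visited)
            if subpath is not None:
                return [node] + subpath
    return None                          # blind alley: back up and try another branch
```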
Uniform cost search
Uses the lowest cumulative COST to find a path from the source to the destination.
Special case of A* where heuristic is constant.
Priority queue
The structure used to store the frontier in uniform cost search.
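A sketch of uniform cost search with a priority-queue frontier, assuming a weighted adjacency list of the form node -> [(neighbor, cost), ...]:

```python
import heapq

# Uniform cost search sketch: the frontier is a priority queue ordered by the
# cumulative path cost g(n), so the cheapest known path is always expanded first.

def uniform_cost_search(graph, start, goal):
    frontier = [(0, start, [start])]     # (cumulative cost, node, path)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)   # lowest cumulative cost first
        if node == goal:
            return cost, path
        if node in explored:
            continue
        explored.add(node)
        for neighbor, step_cost in graph.get(node, []):
            if neighbor not in explored:
                heapq.heappush(frontier, (cost + step_cost, neighbor, path + [neighbor]))
    return None
```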
Iterative deepening depth first search (ID-DFS)
Hybridizes BFS and DFS by repeatedly running depth-limited DFS with an increasing depth limit.
Time complexity of ID-DFS
O(b^d)
(b = branching factor; d = depth; Exponential)
Space complexity of ID-DFS
O(bd)
(b = branching factor; d = depth)
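A sketch of iterative deepening built from a depth-limited DFS subroutine; the `max_depth` cap is an illustrative safeguard, not part of the algorithm itself:

```python
# Iterative deepening DFS sketch: runs depth-limited DFS with an increasing cutoff,
# combining BFS-like completeness with DFS-like O(bd) memory use.

def depth_limited_dfs(graph, node, goal, limit):
    if node == goal:
        return [node]
    if limit == 0:
        return None                      # cutoff reached, stop descending
    for neighbor in graph.get(node, []):
        subpath = depth_limited_dfs(graph, neighbor, goal, limit - 1)
        if subpath is not None:
            return [node] + subpath
    return None

def iterative_deepening_dfs(graph, start, goal, max_depth=50):
    for limit in range(max_depth + 1):   # deepen the cutoff one level at a time
        path = depth_limited_dfs(graph, start, goal, limit)
        if path is not None:
            return path
    return None
```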
Heuristic search
Algorithms that use a rule of thumb (a HEURISTIC) to select which state to explore first.
Heuristic
A rule of thumb that hints at how close a state is to the goal (an estimate of the cost from the node to the goal). A* combines it with the cost already incurred to reach the node.
Greedy best first search
Attempts to find the most promising path (best HEURISTIC) from a given starting point to a goal.
Special case of A* where the path cost $g(n)$ is ignored and only the heuristic is used.
Heuristic function
The function $h(n)$ used to estimate the cost to reach the goal.
A-star search
Uses the heuristic function $h(n)$ along with the cost to reach the node $n$ from the start state, $g(n)$.
Fitness number
The value $f(n) = g(n) + h(n)$ assigned to a node; A* expands the path with the lowest fitness number.
Admissible heuristic
A heuristic that never overestimates the true optimal cost from a node to the goal.
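An A* sketch in the same style as the uniform cost search example, ordering the frontier by $f(n) = g(n) + h(n)$; the `heuristic` callable is assumed admissible:

```python
import heapq

# A* search sketch: the frontier is ordered by the fitness number f(n) = g(n) + h(n).
# With heuristic = 0 this reduces to uniform cost search; dropping g(n) instead
# would give greedy best-first search.

def a_star(graph, start, goal, heuristic):
    frontier = [(heuristic(start), 0, start, [start])]   # (f, g, node, path)
    explored = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in explored:
            continue
        explored.add(node)
        for neighbor, step_cost in graph.get(node, []):
            if neighbor not in explored:
                new_g = g + step_cost
                heapq.heappush(frontier,
                               (new_g + heuristic(neighbor), new_g, neighbor, path + [neighbor]))
    return None
```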
Adversarial games
Real-world problems where the agent is not the only one making decisions, such as turn-based board games.
Game tree
Models the possible choices that can be made by either player throughout the game.
Depth
Refers to the number of turns since the start state.
Ply
Refers to a single player's turn (a half-move); one level of the game tree.
2-player zero-sum game
Played between 2 players where one player's gain is equivalent to the other player's loss.
Zero-sum game structure
Defines the structure of a zero-sum game with players, start state, actions, successor function, end state, utility, and player turn.
Players
The set Players = {agent, opp}
Start state
s_start: The initial state of the game
Actions
Actions(s): The possible actions from a state s.
Successor function
Succ(s, a): The resulting state when choosing an action a from a state s.
End state
IsEnd(s): True iff s is an end state (game is finished).
Utility
Utility(s): Agent's utility at an end state $s$ (i.e., the expected value of an end state $s$).
Player turn
Player(s): Whose turn it is at state $s$ (agent or opponent).
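A minimal, illustrative Python interface that mirrors the components listed above; the class and method names simply echo the cards (Actions, Succ, IsEnd, Utility, Player) and are not a fixed API:

```python
# Interface sketch for the zero-sum game structure: a concrete game would subclass
# this and implement each component.

class ZeroSumGame:
    players = ("agent", "opp")

    def start_state(self):        # s_start: the initial state of the game
        raise NotImplementedError

    def actions(self, s):         # Actions(s): possible actions from state s
        raise NotImplementedError

    def succ(self, s, a):         # Succ(s, a): state reached by taking action a in s
        raise NotImplementedError

    def is_end(self, s):          # IsEnd(s): True iff s is an end state
        raise NotImplementedError

    def utility(self, s):         # Utility(s): agent's utility at an end state s
        raise NotImplementedError

    def player(self, s):          # Player(s): whose turn it is ("agent" or "opp")
        raise NotImplementedError
```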
Policy
Describes how the opponent behaves at a given action point.
Probability of action
π_opp(s, a) represents the probability of the opponent choosing a specific action a in state s.
Dumb opponent
The opponent picks a random move with equal probability
Stochastic opponent
The opponent picks a move based on some probability distribution, e.g. π_opp(s, a) ∈ [0, 1]
Expectimax
The expectimax search function will attempt to pick the best move for the agent assuming a stochastic opponent policy.
Expected value for expectimax
The value of a state $s$ is defined case by case, depending on whether $s$ is an end state, the agent's turn, or the opponent's turn (see the cases below).
End state in expectimax
The value is given by the utility function.
Agent's turn in expectimax
The value is the max out of the values of all possible actions.
Opponent's turn in expectimax
The value is a weighted average (based on probability) of the values of all possible actions.
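The three cases above can be collected into a single recurrence; the following is a reconstruction using the notation from the game-structure cards (Utility, IsEnd, Actions, Succ, Player, π_opp):

$$
V_{\text{exptmax}}(s) =
\begin{cases}
\text{Utility}(s) & \text{if } \text{IsEnd}(s) \\
\max_{a \in \text{Actions}(s)} V_{\text{exptmax}}(\text{Succ}(s, a)) & \text{if } \text{Player}(s) = \text{agent} \\
\sum_{a \in \text{Actions}(s)} \pi_{\text{opp}}(s, a)\, V_{\text{exptmax}}(\text{Succ}(s, a)) & \text{if } \text{Player}(s) = \text{opp}
\end{cases}
$$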
Minimax
The minimax search function will attempt to pick the best move for the agent assuming an *optimally smart* opponent.
Expected value for minimax
For the cases where $s$ is an end state and for where $s$ is the agent's turn, identical to expectimax.
Opponent's turn in minimax
The value is the minimum of the values of all possible actions.
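As with expectimax, the cases can be written as one recurrence; only the opponent case changes from a weighted average to a minimum:

$$
V_{\text{minimax}}(s) =
\begin{cases}
\text{Utility}(s) & \text{if } \text{IsEnd}(s) \\
\max_{a \in \text{Actions}(s)} V_{\text{minimax}}(\text{Succ}(s, a)) & \text{if } \text{Player}(s) = \text{agent} \\
\min_{a \in \text{Actions}(s)} V_{\text{minimax}}(\text{Succ}(s, a)) & \text{if } \text{Player}(s) = \text{opp}
\end{cases}
$$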
Characteristics of minimax
Complete if the game tree is finite; *optimal* iff the opponent also behaves optimally; time complexity: $O(b^d)$; space complexity: $O(bd)$.
Minimax algorithm
Finds the optimal strategy (or just the best first move) for max (agent).
Brute-force approach in minimax
1. Generate the *entire* game tree; 2. Apply the utility function to the leaves to obtain their values; 3. Back up values from the leaves towards the root; 4. Upon reaching the root: choose the max value and the corresponding move.
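A short sketch of these steps, assuming the illustrative game interface sketched earlier (is_end, utility, succ, actions, player):

```python
# Minimax sketch: recurse to the leaves, apply the utility there, and back values up.

def minimax_value(game, s):
    if game.is_end(s):
        return game.utility(s)                 # utility at the leaves
    values = [minimax_value(game, game.succ(s, a)) for a in game.actions(s)]
    return max(values) if game.player(s) == "agent" else min(values)

def best_move(game, s):
    # at the root (agent's turn), pick the action with the max backed-up value
    return max(game.actions(s), key=lambda a: minimax_value(game, game.succ(s, a)))
```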
Optimizing the minimax algorithm
There are three main ways to further optimize the minimax algorithm.
Pruning redundant states
Prune redundant states based on the rules of the game.
Depth-limited search
For game trees that are too deep, set a limit to the depth to be explored.
Evaluation function
An *evaluation function* is an estimation of how good the state is for the agent.
Alpha-beta pruning
Don't explore a branch if we're sure a better path already exists, which may significantly shrink the search space.
Alpha-beta pruning process
1. Initialize $\alpha = -\infty$ (best value found so far for the agent/max player) and $\beta = +\infty$ (best value found so far for the opponent/min player); 2. Perform the minimax algorithm, updating $\alpha$ and $\beta$ as values are backed up and pruning a branch as soon as $\alpha \geq \beta$.
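A sketch of this process, again assuming the illustrative game interface used in the minimax example:

```python
import math

# Alpha-beta pruning sketch: computes the same backed-up values as minimax, but a
# branch is abandoned as soon as alpha >= beta, because a better option already
# exists elsewhere in the tree.

def alphabeta(game, s, alpha=-math.inf, beta=math.inf):
    if game.is_end(s):
        return game.utility(s)
    if game.player(s) == "agent":              # maximizing player updates alpha
        value = -math.inf
        for a in game.actions(s):
            value = max(value, alphabeta(game, game.succ(s, a), alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break                          # prune: the opponent will avoid this branch
        return value
    else:                                      # minimizing opponent updates beta
        value = math.inf
        for a in game.actions(s):
            value = min(value, alphabeta(game, game.succ(s, a), alpha, beta))
            beta = min(beta, value)
            if alpha >= beta:
                break                          # prune: the agent will avoid this branch
        return value
```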