Study Notes on Artificial Intelligence Search Techniques

Artificial Intelligence Overview

  • Presenter: Steve James

  • Topic: Search methodologies in Artificial Intelligence

Components of an Autonomous System

  • On-board Logic: The framework governing the decision-making process of the system.

  • Camera: A sensor that captures images, potentially used for navigation or environmental analysis.

  • Control Unit: Devices or systems that manage the operation of other components.

  • Bump Detector: A sensor designed to detect physical obstacles in the path of the system.

  • Caster Wheel: A type of wheel that allows for movement in multiple directions.

  • Antenna for Radio Link: Facilitates communication with external systems.

  • Range Finder: A sensor that measures distance to an object, enabling spatial awareness.

  • Television Camera: A device that captures video for processing and analysis.

  • Drive Motor & Wheel: Mechanisms that provide mobility to the system.

Search Fundamentals

  • Search: Defined as the process of finding a solution to achieve a specified goal.

    • Automated Theorem Proving: A search process starting from axioms to find steps required to reach a theorem.

    • Games like Checkers and Chess: Utilize search by beginning from a position and determining the moves needed to secure victory.

Search Definitions

  • Agent: A decision-making entity which can be a robot or software program.

  • State: A representation of the agent's world or environment at a specific instance in time.

  • State Space: The comprehensive set of all possible states that the agent may encounter.

  • Initial State: The starting point within the state space.

  • Successor Function: Information on possible actions available to the agent at a given state, including the costs associated with these actions and the resulting states.

  • Goal Test: A function that determines whether a goal condition is met based on the current state. Returns true if goals are satisfied, otherwise false.

Search Problem Structure

  • Components of a Search Problem:

    • A state space

    • A successor function

    • An initial state

    • A goal test, formalized as:

    • $\text{test}(state) = \begin{cases}\text{true} & \text{if all dots eaten}\\ \text{false} & \text{otherwise}\end{cases}$

  • Solution: A sequence of actions (plan) that transitions the initial state to a goal state.

  • Optimal Solution: The solution that minimizes total cost incurred while transitioning from start to goal.
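The components above can be collected into a small problem interface. The sketch below is illustrative only: the class names, the grid world, and the unit step costs are assumptions, not anything from the notes (the grid instance anticipates the navigation example that follows).

```python
class SearchProblem:
    """Bundle of state space, successor function, initial state, and goal test."""
    def __init__(self, initial, goal):
        self.initial = initial          # initial state
        self.goal = goal                # used by the default goal test

    def successors(self, state):
        """Yield (action, next_state, cost) triples available from `state`."""
        raise NotImplementedError

    def goal_test(self, state):
        """Return True iff `state` satisfies the goal condition."""
        return state == self.goal


class GridNavigation(SearchProblem):
    """Navigation on (x, y) coordinates with NSEW moves (assumed unbounded grid)."""
    MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}

    def successors(self, state):
        x, y = state
        for action, (dx, dy) in self.MOVES.items():
            yield action, (x + dx, y + dy), 1   # unit cost per move


problem = GridNavigation(initial=(0, 0), goal=(2, 1))
print(sorted(a for a, _, _ in problem.successors((0, 0))))  # ['E', 'N', 'S', 'W']
```

A solution is then any action sequence that drives `initial` to a state passing `goal_test`; an optimal solution additionally minimizes the summed step costs.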

Problem Definition

  • Key Challenge: Identifying the state space for the problem at hand is essential. States must capture necessary details for planning while allowing for abstraction.

  • Example 1 - Navigation Problem:

    • States: Represented as $(x,y)$ coordinates.

    • Actions: Moving in North, South, East, or West (NSEW).

    • Successor: Function that updates coordinates based on the chosen action.

    • Goal Test: Reaching the target coordinates $(x,y) = \text{TARGET}$.

  • Example 2 - Dots Eating Problem:

    • States: Represented as $(x,y)$ locations along with boolean values indicating dot presence.

    • Actions: Move NSEW.

    • Successor: Update location and boolean state of dots.

    • Goal Test: All dots must be false.

  • State Space Representation: Can be calculated as:

    • $|\text{State Space}| = |\text{locations}| \times 2^{\text{dots}}$
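A quick check of that count, assuming a hypothetical 3×3 grid with 4 dots (both numbers are illustrative choices):

```python
from itertools import product

WIDTH, HEIGHT, N_DOTS = 3, 3, 4
locations = [(x, y) for x in range(WIDTH) for y in range(HEIGHT)]

# One state = (agent location, one boolean per remaining dot).
states = [(loc, dots) for loc in locations
          for dots in product((False, True), repeat=N_DOTS)]

print(len(locations), 2 ** N_DOTS, len(states))  # 9 16 144
```

The enumeration confirms $|\text{locations}| \times 2^{\text{dots}} = 9 \times 16 = 144$ states for this toy instance.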

Formal Problem Definition

  • Elements Formally Defined:

    • Set of states: $S$

    • Start state: $s \in S$

    • Set of actions: $A$, along with transformation rules $a: s \rightarrow s'$

    • Goal test: $g(s) = \begin{cases} 0 & \text{if goal not reached} \\ 1 & \text{if goal reached} \end{cases}$

    • Cost function: $C(s, a, s') \rightarrow \mathbb{R}^+$

  • Structured Search Problem: Expressed as the tuple $\langle S, s, A, g, C \rangle$

Action Sequence Search

  • Objective: To find a sequence of actions $a_1, \dots, a_n$ and corresponding states $s_1, \dots, s_n$ such that:

    • $s_0 = s$

    • $s_i = a_i(s_{i-1})$, for $i = 1, \dots, n$

    • $g(s_n) = 1$, while minimizing $\sum_{i=1}^{n} C(s_{i-1}, a_i, s_i)$

Visualization of Search Tree

  • Definition: A tree that illustrates potential outcomes resulting from various actions.

  • Structure:

    • Root Node: Represents the starting state.

    • Children Nodes: Represent successor states.

    • Edges: Connect root to nodes in the tree; signify the plan being followed.

    • Cost of Plan: The sum of the costs associated with traversed edges.

  • Challenges:

    • Many problems prevent the feasible construction of the entire tree; its size grows as $O(b^d)$, where $b$ is the branching factor and $d$ is the maximum length of the planning sequence.

Planning Using Search Tree

  • Process:

    • Expand tree nodes to investigate potential plans.

    • Maintain a frontier of partial plans for consideration.

    • Aim to converge on a solution while minimizing node expansions.

General Tree Search Overview

  • Key Concepts:

    • Frontier: Current candidates for exploration.

    • Expansion: Process of moving from a node to its successors.

    • Exploration Strategy: Determines the order in which frontier nodes are explored.

  • Implementation Consideration:

    • Keep track of visited states to prevent redundant expansions of the same state within the tree.

Search Strategies Comparison

  • Tree search algorithms are designed to optimize the order of node expansion. Factors to consider include:

    • Completeness: Does the strategy guarantee a solution if it exists?

    • Optimality: Does it ensure a least-cost solution?

    • Time Complexity: Number of nodes generated during the search.

    • Space Complexity: Maximum nodes stored in memory.

  • Complexity is driven by the maximum branching factor, the depth of the optimal solution, and the maximum depth of the state space.

Uninformed Search Strategies

  • Definition: Approach using only information provided by the problem definition without external knowledge.

  • Common Strategies:

    • Breadth-First Search (BFS)

    • Uniform-Cost Search (UCS)

    • Depth-First Search (DFS)

Breadth-First Search (BFS)

  • Method: Expand the shallowest unexpanded node.

  • Data Structure: Implements a FIFO queue for managing the frontier.

    • New successors are queued at the back when adding children.

  • Example Execution of BFS:

    1. Expand node 2

    2. Then expand nodes 4, 5, and 7 in order.

  • Function Pseudocode:

  function BREADTH-FIRST-SEARCH(problem) returns a solution, or failure
    node ← a node with STATE = problem.INITIAL-STATE, PATH-COST = 0
    if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
    frontier ← a FIFO queue with node as the only element
    explored ← an empty set
    loop do
      if EMPTY?(frontier) then return failure
      node ← POP(frontier)  /* chooses the shallowest node in frontier */
      add node.STATE to explored
      for each action in problem.ACTIONS(node.STATE) do
        child ← CHILD-NODE(problem, node, action)
        if child.STATE is not in explored or frontier then
          if problem.GOAL-TEST(child.STATE) then return SOLUTION(child)
          frontier ← INSERT(child, frontier)

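The pseudocode above can be sketched as runnable Python. The small example graph is an illustrative assumption; a `parents` dictionary doubles as the explored set and the path record, and, as in the pseudocode, the goal test is applied when a child is generated.

```python
from collections import deque

def bfs(start, goal, successors):
    """Return a list of states from start to goal, or None on failure."""
    if start == goal:
        return [start]
    frontier = deque([start])            # FIFO queue
    parents = {start: None}              # explored set + back-pointers
    while frontier:
        state = frontier.popleft()       # shallowest node first
        for child in successors(state):
            if child not in parents:     # not explored, not on the frontier
                parents[child] = state
                if child == goal:        # goal test on generation
                    path = [child]
                    while parents[path[-1]] is not None:
                        path.append(parents[path[-1]])
                    return path[::-1]
                frontier.append(child)
    return None

graph = {1: [2, 3], 2: [4, 5], 3: [6], 5: [7], 6: [7]}
print(bfs(1, 7, lambda s: graph.get(s, [])))  # [1, 2, 5, 7]
```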
  • BFS Properties:

    • Completeness: Yes, if the branching factor $b$ is finite.

    • Space Complexity: $O(b^d)$, where $d$ is the depth of the shallowest solution.

    • Time Complexity: $O(b^d)$, from the summation $1 + b + b^2 + \dots + b^d$.

    • Optimality: Yes, if costs are constant.

Limitations of BFS

  • Time and Space Complexity: Both are $O(b^d)$.

  • Assumptions:

    • Generating rate is 1 million nodes per second.

    • Each node consumes 1 kilobyte.

    • $b = 10$.

  • Outcomes:

    • If $d = 8$, the time is about 2 minutes and memory usage is about 102 GB.

    • If $d = 12$, time extends to about 13 days and memory usage swells to about 1000 TB.
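A back-of-envelope check of these figures under the stated assumptions. The exact values differ slightly from the slide's, depending on whether one counts $b^d$ nodes or the full sum $1 + b + \dots + b^d$; the sum is used here.

```python
def bfs_cost(b, d, nodes_per_sec=1_000_000, bytes_per_node=1_000):
    """Estimate node count, runtime, and memory for BFS to depth d."""
    nodes = sum(b ** i for i in range(d + 1))   # 1 + b + ... + b^d
    seconds = nodes / nodes_per_sec
    gigabytes = nodes * bytes_per_node / 1e9
    return nodes, seconds, gigabytes

nodes, secs, gb = bfs_cost(10, 8)
print(f"d=8: about {secs/60:.0f} minutes, {gb:.0f} GB")      # about 2 minutes, 111 GB

nodes, secs, gb = bfs_cost(10, 12)
print(f"d=12: about {secs/86400:.0f} days, {gb/1000:.0f} TB")  # about 13 days, 1111 TB
```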

Uniform-Cost Search (UCS)

  • Intro: UCS improves on BFS by accounting for action costs, optimizing for plans that minimize total cost rather than the number of actions.

  • Execution: Similar to BFS but utilizes a priority queue that ranks nodes by their cost to reach them.

  • Key Characteristics:

    • Matches BFS when costs are constant everywhere.

    • Closely parallels Dijkstra's algorithm, which has no defined goal and computes cheapest paths from the start to all states.

UCS Implementation Details

  • General Method: Replace the standard queue with a priority queue allowing nodes to be prioritized based on their costs.

  • Additional Check: Update paths to children if a shorter path is discovered.

  • Practical Comparison with BFS:

    • UCS can produce solutions with longer paths if they incur lower overall costs, unlike BFS.

UCS Properties

  • Execution: Expands every node whose path cost is less than that of the cheapest solution.

  • Completeness: Assured if every step cost is at least some $\epsilon > 0$ and the best solution has finite cost.

  • Time Complexity: $O(b^{1 + \lfloor C^* / \epsilon \rfloor})$, where $C^*$ is the optimal path cost.

  • Space Complexity: Similar to the time complexity: $O(b^{1 + \lfloor C^* / \epsilon \rfloor})$.

  • Optimality: Guaranteed; nodes are expanded based on increasing cost.
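A UCS sketch using `heapq` as the priority queue, keyed on the cost to reach each node; the weighted example graph is an illustrative assumption. Note that, unlike BFS, the goal test is applied when a node is *expanded*, which is what preserves optimality.

```python
import heapq

def ucs(start, goal, successors):
    """Return (cost, path) of a cheapest path, or None on failure."""
    frontier = [(0, [start])]               # priority queue ordered by g(n)
    best = {start: 0}                       # cheapest known cost per state
    while frontier:
        cost, path = heapq.heappop(frontier)
        state = path[-1]
        if state == goal:                   # goal test on expansion
            return cost, path
        if cost > best.get(state, float("inf")):
            continue                        # stale queue entry; skip it
        for child, step in successors(state):
            new_cost = cost + step
            if new_cost < best.get(child, float("inf")):
                best[child] = new_cost      # shorter path found: update
                heapq.heappush(frontier, (new_cost, path + [child]))
    return None

graph = {"A": [("B", 1), ("C", 4)], "B": [("C", 1), ("D", 5)], "C": [("D", 1)]}
print(ucs("A", "D", lambda s: graph.get(s, [])))  # (3, ['A', 'B', 'C', 'D'])
```

The answer illustrates the comparison with BFS above: the cheapest plan A→B→C→D uses three actions and costs 3, beating the shorter two-action plan A→C→D, which costs 5.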

Depth-First Search (DFS)

  • Concept: Expand the deepest unexpanded node available.

  • Data Structure: Uses a stack for managing the frontier; can also be implemented recursively.

DFS Characteristics and Performance

  • Node Expression in Search Space:

    • Uses only the current path from the root to conserve memory.

  • Completeness: Yes, for finite-state spaces.

  • Time Complexity: $O(b^m)$, where $m$ is the maximum depth of the search tree.

  • Space Complexity: $O(bm)$, which accounts only for the current path and its unexplored siblings.

  • Optimality: No; DFS may find a “leftmost” solution without guaranteeing optimality.
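An iterative DFS sketch using an explicit stack of partial paths (the example graph is an illustrative assumption); checking membership in the current path avoids cycles without keeping a global explored set, matching the linear memory bound above.

```python
def dfs(start, goal, successors):
    """Return some path from start to goal, or None; not necessarily optimal."""
    stack = [[start]]                   # LIFO stack of partial paths
    while stack:
        path = stack.pop()              # deepest path first
        state = path[-1]
        if state == goal:
            return path
        for child in successors(state):
            if child not in path:       # avoid cycles along the current path
                stack.append(path + [child])
    return None

graph = {1: [2, 3], 2: [4], 3: [4, 5], 4: [], 5: []}
print(dfs(1, 5, lambda s: graph.get(s, [])))  # [1, 3, 5]
```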

Depth-Limited DFS

  • Challenge: In infinite state spaces, standard DFS may never yield a solution.

  • Solution: Implement a depth limit $l$, ensuring that the search never expands nodes beyond depth $l$.

  • Performance:

    • Adjusted time complexity: $O(b^l)$.

    • Adjusted space complexity: $O(bl)$.

  • Depth Selection:

    • Can rely on domain knowledge, such as the diameter of the state space (the longest shortest path between any two states).

Iterative Deepening Search (IDS)

  • Method: Combine advantages of both depth-limited DFS and BFS by iterating through increasing depths until a solution is identified.

  • Execution Style: Run DFS at depth levels from 1 upwards until a solution is found.

  • Analysis Example:

    • Run depth-limited DFS with limits up to $d$; for $b = 10$, $d = 5$:

    • A single DFS run to depth $d$ generates $N_{DFS} = 111111$ nodes.

    • IDS, which re-generates the shallower levels on every iteration, yields $N_{IDS} = 123456$ nodes.

    • The overhead of IDS is therefore only about 11%.
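These counts can be reproduced directly. The $(d - j + 1)$ factor below reflects that nodes at depth $j$ are generated once in every iteration whose limit is at least $j$:

```python
b, d = 10, 5
# Single search to depth d (same node count as BFS): 1 + b + ... + b^d.
n_dfs = sum(b ** j for j in range(d + 1))
# IDS: nodes at depth j are regenerated in iterations j, j+1, ..., d.
n_ids = sum((d - j + 1) * b ** j for j in range(d + 1))
print(n_dfs, n_ids)  # 111111 123456
print(round(100 * (n_ids - n_dfs) / n_dfs))  # 11 (% overhead)
```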

Properties of IDS

  • Completeness: Yes, similar in structure to BFS.

  • Time Complexity: Same as BFS: $O(b^d)$.

  • Space Complexity: $O(bd)$, linear like depth-limited DFS.

  • Optimality: Guaranteed, similar to BFS under constant costs.
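An IDS sketch: a recursive depth-limited DFS wrapped in a loop of increasing limits. The example graph and the `max_depth` cap are illustrative assumptions; cycle handling is omitted, so the graph must be acyclic.

```python
def depth_limited(state, goal, successors, limit):
    """DFS from `state` that never descends more than `limit` levels."""
    if state == goal:
        return [state]
    if limit == 0:
        return None
    for child in successors(state):
        result = depth_limited(child, goal, successors, limit - 1)
        if result is not None:
            return [state] + result
    return None

def ids(start, goal, successors, max_depth=50):
    """Deepen the limit one level at a time until a solution appears."""
    for limit in range(max_depth + 1):
        result = depth_limited(start, goal, successors, limit)
        if result is not None:
            return result
    return None

graph = {1: [2, 3], 2: [4, 5], 3: [6], 5: [7]}
print(ids(1, 7, lambda s: graph.get(s, [])))  # [1, 2, 5, 7]
```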

Bidirectional Search Strategy

  • Concept: Conduct simultaneous searches from both the start and goal states and aim for convergence.

  • Efficiency Expectation:

    • $O(b^{d/2}) + O(b^{d/2})$ is substantially less than $O(b^d)$.

  • Requirements:

    • Requires the ability to invert action rules (to search backwards from the goal), which can complicate implementation.

    • Unique or single solutions needed for assurance of convergence.

    • Stopping criterion based on the intersection of frontier nodes containing candidate solutions.
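A bidirectional BFS sketch, assuming an undirected example graph so that successor rules are trivially invertible. Each side keeps its own depth map; the search stops as soon as the two frontiers intersect, returning the length of the connecting path found.

```python
from collections import deque

def bidirectional_bfs(start, goal, neighbours):
    """Return the length of a path joining start and goal, or None."""
    if start == goal:
        return 0
    fwd, bwd = {start: 0}, {goal: 0}            # depth reached from each side
    q_fwd, q_bwd = deque([start]), deque([goal])
    while q_fwd and q_bwd:
        # Alternate one expansion from each side.
        for visited, other, queue in ((fwd, bwd, q_fwd), (bwd, fwd, q_bwd)):
            state = queue.popleft()
            for n in neighbours(state):
                if n in other:                  # frontiers intersect: done
                    return visited[state] + 1 + other[n]
                if n not in visited:
                    visited[n] = visited[state] + 1
                    queue.append(n)
    return None

graph = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(bidirectional_bfs(1, 5, lambda s: graph[s]))  # 4
```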

Methods Overview & Comparison Table

| Search Strategy | Completeness | Time Complexity | Space Complexity | Optimality |
| --- | --- | --- | --- | --- |
| Breadth-First Search (BFS) | Yes | $O(b^d)$ | $O(b^d)$ | Yes |
| Uniform-Cost Search (UCS) | Yes | $O(b^{1+\lfloor C^*/\epsilon \rfloor})$ | $O(b^{1+\lfloor C^*/\epsilon \rfloor})$ | Yes |
| Depth-First Search (DFS) | No | $O(b^m)$ | $O(bm)$ | No |
| Depth-Limited DFS | Yes (if $l \ge d$) | $O(b^l)$ | $O(bl)$ | No |
| Iterative Deepening Search (IDS) | Yes | $O(b^d)$ | $O(bd)$ | Yes (unit costs) |
| Bidirectional Search | Yes | $O(b^{d/2})$ | $O(b^{d/2})$ | Yes |

  • Key Notes on Performance:

    • $b$ signifies the branching factor; $d$ represents the depth of the shallowest solution; $m$ is the maximum depth of the search tree; and $l$ denotes the depth limit.

Knowledge Enhancement Strategies in Search

  • Question Posed: How can knowledge be integrated into the search algorithms?

  • Goal: Implement a task-specific expansion strategy.

From UCS to Heuristic Searches

  • Key Concept Transition: Transition from Uniform-Cost Search (UCS) to heuristic approaches involves managing total costs as a function of both reaching a node and estimating future costs.

  • Function Definitions:

    • Within UCS: $f(n) = g(n)$

    • In greedy best-first search (GBFS): $f(n) = h(n)$

  • Heuristic Relationships:

    • The heuristic function $h(n)$ estimates the cost from node $n$ to the goal.

Greedy Best-First Search (GBFS)

  • Concept: Focuses on prioritizing nodes by their estimated costs to the goal based on domain knowledge.

  • Evaluation Example:

    • For navigation tasks, distance estimates can represent heuristic values.
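A GBFS sketch for such a navigation task. The small open 3×3 grid and the Manhattan-distance heuristic are illustrative assumptions; note the frontier is ordered by $h(n)$ alone, ignoring the cost already incurred.

```python
import heapq

def gbfs(start, goal, successors, h):
    """Return some path from start to goal, guided purely by h(n)."""
    frontier = [(h(start), [start])]        # priority = heuristic estimate only
    visited = {start}
    while frontier:
        _, path = heapq.heappop(frontier)
        state = path[-1]
        if state == goal:
            return path
        for child in successors(state):
            if child not in visited:
                visited.add(child)
                heapq.heappush(frontier, (h(child), path + [child]))
    return None

goal = (2, 2)
def moves(s):
    """NSEW neighbours inside a 3x3 grid."""
    x, y = s
    return [(x + dx, y + dy) for dx, dy in ((0, 1), (0, -1), (1, 0), (-1, 0))
            if 0 <= x + dx <= 2 and 0 <= y + dy <= 2]

manhattan = lambda s: abs(s[0] - goal[0]) + abs(s[1] - goal[1])
path = gbfs((0, 0), goal, moves, manhattan)
print(len(path) - 1)  # 4 moves on an open 3x3 grid
```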

Properties of GBFS

  • Completeness:

    • Incomplete in tree search situations, but complete in finite graph search scenarios.

  • Time & Space Complexity: Identically $

    • O(bm)$, with maximized search depth determined by the problem's nature.

A* Search Algorithm

  • Framework Differences:

    • UCS utilizes costs only based on reaching a node.

    • GBFS utilizes heuristic estimates of future costs only.

    • A* Search ideally manages exploration in a manner that combines both costs:

    • $f(n) = g(n) + h(n)$

    • In this framework, $g(n)$ is the cost incurred to reach $n$ and $h(n)$ the estimated future cost.
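A minimal A* sketch combining both terms; the weighted example graph and the heuristic table are illustrative assumptions (the table is admissible: it never overestimates the true remaining cost).

```python
import heapq

def astar(start, goal, successors, h):
    """Return (cost, path) of a cheapest path under an admissible h."""
    frontier = [(h(start), 0, [start])]     # entries are (f, g, path)
    best_g = {start: 0}
    while frontier:
        f, g, path = heapq.heappop(frontier)
        state = path[-1]
        if state == goal:                   # goal test on expansion
            return g, path
        for child, step in successors(state):
            new_g = g + step
            if new_g < best_g.get(child, float("inf")):
                best_g[child] = new_g
                heapq.heappush(frontier, (new_g + h(child), new_g, path + [child]))
    return None

graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 12)], "B": [("G", 3)]}
h = {"S": 6, "A": 5, "B": 3, "G": 0}        # admissible heuristic estimates
print(astar("S", "G", lambda s: graph.get(s, []), h.get))  # (6, ['S', 'A', 'B', 'G'])
```

Setting `h` to all zeros recovers UCS, while dropping the `g` term from the priority recovers GBFS, matching the framework comparison above.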

A* Optimality Concerns

  • Dependency: The effectiveness and optimality of A* are reliant on the quality of heuristic utilized.

  • Challenges Associated: Poor heuristics may lead to failures in guaranteed optimal paths, particularly when lower-cost paths are misestimated.

  • Graphical Representation: a figure in the original slides illustrates cost-estimation pitfalls.

A* Heuristic Function Properties

  • Admissible Heuristic: A heuristic is admissible if it never overestimates the cost to reach the goal.

    • Requires that:

    • $h(n) \le \text{TrueFutureCost}(n)$

    • This ensures that the estimated cost through any node never exceeds the actual cost incurred.

  • Consistency Requirement:

    • A heuristic is consistent if:
      $h(n) \le c(n, a, n') + h(n')$

    • All consistent heuristics are also admissible.

    • Guarantees optimal solutions for both A* tree and graph searches.

Properties of A*

  • Overall Completeness: A* is complete, and with a consistent heuristic it is optimally efficient: no other optimal algorithm is guaranteed to expand fewer nodes.

  • Time Complexity Calculation: Expressed as:

    • $O(b^{\epsilon d})$, where $\epsilon$ indicates the relative error in the heuristic estimate and $d$ the solution depth.

  • Effective Branching Factor: A good heuristic prunes unnecessary nodes from consideration, reducing the branching factor observed in practice.

Implementing Heuristics

  • Central Challenge: The essential task involves crafting heuristics that are both admissible and consistent.

  • Navigation Example: Straight-line distance serves as a robust heuristic, since it can never overestimate the actual travel distance.

    • Can be further optimized through data-driven methods, like machine learning or precomputation methods.

Heuristic Relaxations

  • Generalization: The concept of relaxing constraints from original problems can lead to effective heuristic establishment:

    • Example Case:

    • Relaxation 1: Allow a tile to move to any square; the heuristic is the count of misplaced tiles (an underestimate of the true movement cost).

    • Relaxation 2: Allow tiles to slide through occupied squares; the heuristic is the sum of each tile's distance to its final placement (Manhattan distance).
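Both relaxations can be written as 8-puzzle heuristics; the sample board below is an illustrative assumption (boards are 9-tuples in row-major order, with 0 denoting the blank). Both values are lower bounds on the true number of moves, so both heuristics are admissible.

```python
def misplaced_tiles(state, goal):
    """Relaxation 1: count tiles not in their goal position (blank excluded)."""
    return sum(1 for s, g in zip(state, goal) if s != 0 and s != g)

def manhattan(state, goal):
    """Relaxation 2: sum of each tile's grid distance to its goal position."""
    total = 0
    for tile in range(1, 9):
        i, j = state.index(tile), goal.index(tile)
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total

goal  = (1, 2, 3, 4, 5, 6, 7, 8, 0)
state = (1, 2, 3, 4, 5, 6, 0, 7, 8)   # solvable in two slides
print(misplaced_tiles(state, goal), manhattan(state, goal))  # 2 2
```

For this board the true cost is exactly 2 moves, so both heuristics happen to be tight; on harder boards Manhattan distance dominates the misplaced-tile count, expanding fewer nodes.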

Conclusion on Heuristic Effects

  • Key Tradeoffs: The relationship between the heuristic quality and computational workload per node is crucial.

  • Performance Outcomes:

    • As heuristic accuracy improves, fewer nodes are expanded while keeping per-node computation manageable.

    • Empirical comparisons across different solution depths show large differences in the number of node expansions, emphasizing the heuristic's role in efficiency.