Lecture 8 Uninformed Search

Uninformed Search Algorithms

Introduction

In the realm of Artificial Intelligence, search algorithms are essential for problem-solving, especially when exploring various states to find a solution. These algorithms can be categorized into two main types: informed (heuristic) and uninformed (blind) search algorithms. This note focuses on uninformed search algorithms, which do not rely on any domain-specific knowledge or heuristics to guide the search process. Instead, they explore the search space based solely on the information available in the problem definition.

Search Strategies

Node Objects: Consider these as signposts on a map. They represent states in the problem space. Each node contains information about the state, its parent node (to trace the path back), and the action that led to this state.
Search Tree: Imagine you are exploring a maze. Each path you take can be visualized as a branch in a tree. The search tree is constructed by expanding node objects, exploring available actions, and branching out to successor nodes.
- The root node is where you start—the initial state.
- Successor nodes represent new possible states you can reach from the current state.
Expansion: This is like trying out all possible doors in a room. Expansion involves considering all possible actions from the current node, creating new node objects for each potential successor state, and adding these to the frontier.
The search continues, door by door, room by room, until a node corresponding to a goal state is created—you've found your way out of the maze.
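A node object and the expansion step described above might be sketched in Python as follows. The `Node` class, the `expand` function, and the toy `ACTIONS` table are illustrative assumptions, not code from the lecture or the textbook.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    state: str
    parent: Optional["Node"] = None   # lets us trace the path back
    action: Optional[str] = None      # action that led to this state

# Hypothetical toy problem: state -> {action: successor state}
ACTIONS = {
    "A": {"left": "B", "right": "C"},
    "B": {"left": "D", "right": "E"},
    "C": {}, "D": {}, "E": {},
}

def expand(node):
    """Create one child node per action available in node.state."""
    return [Node(state=s, parent=node, action=a)
            for a, s in ACTIONS[node.state].items()]

root = Node("A")
children = expand(root)
print([(c.action, c.state) for c in children])  # [('left', 'B'), ('right', 'C')]
```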

Costs

Think of costs like the distance between cities on a map. Costs are associated with actions (e.g., distance in kilometers, time, or fuel). In some search algorithms, these costs become important when trying to find the most efficient path.
Breadth-first search (BFS) initially ignores these costs, focusing instead on minimizing the number of steps (or cities visited) rather than the total cost (kilometers traveled). It's like saying, "I want to visit the fewest cities, even if it means a longer drive."

Breadth-First Search (BFS)

BFS is like exploring a maze by systematically checking every possibility at each level before moving deeper. It finds the solution path with the smallest number of actions.
Process:
- Create a node object for the initial state. This is your starting point.
- Mark the initial node in green to indicate you haven't explored its neighbors yet.
- Expand the node by identifying all available actions and generating successor nodes. These are the immediate neighbors or possible next steps.
- Maintain a frontier of nodes—nodes that have been generated but not yet explored. Add all node objects to a list.
  - Use a linked list for efficient implementation. Think of it like a queue where you add new nodes to the end and pick the first node for expansion (first-in, first-out).
- Use a hash set to store all visited states. This prevents revisiting the same states, speeding up the search process. It's like marking rooms you've already been in to avoid going in circles.

Implementation Details

Linked list: The frontier of nodes is stored in a queue-like structure.
- New nodes are added to the end of the queue.
- The first node in the queue is selected for expansion (FIFO).
Hash set: Visited states are stored in a hash set to quickly check if a state has already been explored, avoiding redundant expansions.
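The two data structures can be tried out in a few lines of Python. This is only a sketch: `collections.deque` stands in for the linked list (an assumption; it offers constant-time appends at the end and removals at the front), and the state names are made up.

```python
from collections import deque

frontier = deque(["A"])   # FIFO frontier holding the initial state
reached = {"A"}           # hash set of states seen so far

expanded = frontier.popleft()          # the oldest entry leaves first
for successor in ["B", "C", "A"]:      # "A" is a repeat and gets skipped
    if successor not in reached:       # average O(1) membership test
        reached.add(successor)
        frontier.append(successor)     # new nodes go to the end

print(expanded, list(frontier))  # A ['B', 'C']
```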

Retrieving the Path

Each node stores a reference to its parent node. You’re essentially creating a chain of breadcrumbs back to the starting point.
Once the goal state is reached, follow the parent references from the goal node back to the root to reconstruct the solution path. This gives you the exact steps you took to solve the problem.
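Following the parent references can be sketched as below. The minimal `Node` class and the function name `solution_path` are assumptions made for illustration.

```python
class Node:
    def __init__(self, state, parent=None, action=None):
        self.state = state
        self.parent = parent
        self.action = action

def solution_path(node):
    """Walk parent links from the goal node back to the root."""
    states = []
    while node is not None:
        states.append(node.state)
        node = node.parent
    states.reverse()                 # root first, goal last
    return states

root = Node("A")
b = Node("B", parent=root, action="left")
e = Node("E", parent=b, action="right")
path = solution_path(e)
print(path)  # ['A', 'B', 'E']
```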

Queue Data Structure

In BFS, the frontier is implemented as a first-in, first-out (FIFO) queue. It’s like a waiting line, where the first person in line is the first to be served.
Other search algorithms might use different types of queues, like priority queues, where nodes are prioritized based on some criteria.

Pseudocode

The pseudocode in the textbook is very similar to Python.
Input:
- Problem definition: Includes the initial state, goal test function, action function, etc. These define the search problem.
Steps:
- Create a node for the initial state. This is the starting point of your search.
- Check if the initial state is the goal state. If so, return the current node and you’re done!
- Initialize the frontier with the initial node. The frontier is the set of nodes to explore.
- Initialize a set to store reached states. This helps avoid revisiting the same states.
- Loop until the frontier is empty (in which case the search has failed) or a goal state is found:
  - Pop the first element from the queue (FIFO). This is the node to expand.
  - Expand the node to generate child nodes. These are the possible next steps.
  - For each child node:
    - If the state is the goal, return the child node.
    - If the state hasn't been reached before:
      - Add the state to the set of reached states.
      - Add the child to the frontier.
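The steps above translate almost line by line into Python. The following is a sketch under stated assumptions, not the textbook's code: the problem is passed as three plain arguments (`initial`, `is_goal`, `successors`), and the graph `GRAPH` is a made-up example.

```python
from collections import deque

class Node:
    def __init__(self, state, parent=None, action=None):
        self.state, self.parent, self.action = state, parent, action

def breadth_first_search(initial, is_goal, successors):
    node = Node(initial)
    if is_goal(initial):
        return node
    frontier = deque([node])          # FIFO queue of nodes to expand
    reached = {initial}               # states generated so far
    while frontier:
        node = frontier.popleft()     # oldest node first
        for action, state in successors(node.state):
            child = Node(state, node, action)
            if is_goal(state):
                return child
            if state not in reached:
                reached.add(state)
                frontier.append(child)
    return None                       # frontier exhausted: failure

# Hypothetical example graph
GRAPH = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

def successors(state):
    return [(f"go {s}", s) for s in GRAPH.get(state, [])]

goal = breadth_first_search("A", lambda s: s == "F", successors)
path = []
n = goal
while n is not None:                  # follow parent links back to the root
    path.append(n.state)
    n = n.parent
path.reverse()
print(path)  # ['A', 'C', 'F']
```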

Example

The algorithm expands nodes level by level. It starts with the root (A), then explores its children (B and C), then their children (D, E, F, and G), and so on. This ensures that the shallowest solutions are found first.
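The level-by-level order can be checked with a short script. The dictionary `TREE` encodes the example tree (root A, children B and C, grandchildren D to G); the function name `bfs_order` is an assumption.

```python
from collections import deque

TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

def bfs_order(root):
    """Return states in the order a FIFO frontier visits them."""
    order = []
    frontier = deque([root])
    while frontier:
        state = frontier.popleft()
        order.append(state)
        frontier.extend(TREE.get(state, []))   # children join the back
    return order

print(bfs_order("A"))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```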

Advantages of BFS

BFS guarantees finding the solution with the fewest number of actions. It always finds the shortest path in terms of the number of steps.

Disadvantages of BFS

Exponential growth in the size of the queue can lead to memory problems, especially for large search spaces. The amount of memory required grows exponentially with the depth of the search tree.
BFS doesn’t take costs into account. It optimizes for the number of steps, not the total cost, which might not be ideal for all problems.
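To get a feeling for the exponential growth, assume (purely for illustration) that every node has the same number of successors, a branching factor `b`. The number of nodes down to depth `d` is then 1 + b + b^2 + ... + b^d, which the sketch below adds up.

```python
def nodes_up_to_depth(b, d):
    """Total nodes in a uniform tree with branching factor b, depths 0..d."""
    return sum(b ** k for k in range(d + 1))

print(nodes_up_to_depth(2, 3))    # 15
print(nodes_up_to_depth(10, 5))   # 111111
```

With ten successors per state, going just five levels deep already means more than a hundred thousand nodes.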

General Search Algorithm

Heuristic Search vs. Uninformed Search

Heuristic Search (Informed Search): Uses a heuristic function—an educated guess—to estimate the distance to the goal state. This helps guide the search in a more informed way.
Uninformed Search: Doesn't use a heuristic function. It explores the search space without any additional information. BFS is a prime example of an uninformed search algorithm.

Best-First Search

Best-first search is a generic search framework that serves as a foundation for various search algorithms. It's like a template where you can plug in different strategies.
The type of queue used determines the order in which nodes are expanded. Using different queues allows you to implement various search algorithms.

Prioritization

Nodes are prioritized based on specific criteria. This determines which nodes are explored first.
In BFS, nodes are prioritized based on their age (FIFO queue). The oldest nodes get expanded first.

Depth-First Search (DFS)

DFS uses a last-in, first-out (LIFO) queue, also known as a stack. Imagine a stack of plates: the last plate you put on is the first one you take off.
It pops off the youngest element, exploring deeply along one path before backtracking.
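A Python list works as a stack, so the LIFO behaviour can be sketched as follows. The tree `TREE` and the function name `dfs_order` are assumptions for illustration; children are pushed in reverse so that the leftmost child is popped first.

```python
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

def dfs_order(root):
    """Return states in the order a LIFO frontier (stack) visits them."""
    order = []
    stack = [root]
    while stack:
        state = stack.pop()                      # youngest entry first
        order.append(state)
        stack.extend(reversed(TREE.get(state, [])))
    return order

print(dfs_order("A"))  # ['A', 'B', 'D', 'E', 'C', 'F', 'G']
```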

Priority Queue

In a priority queue, nodes are ordered based on an evaluation function $f$ . This function assigns a priority to each node.
The function $f$ returns a value indicating the priority of the node. Lower values typically indicate higher priority.
A heap data structure can be used to implement a priority queue efficiently. A heap is a tree-based data structure that satisfies the heap property: the value of each node is greater than or equal to the value of its parent (min-heap) or less than or equal to the value of its parent (max-heap).
Python has a heapq module for priority queues, making it easy to implement priority queue-based algorithms.
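A minimal `heapq` example is shown below. The module keeps a plain list in min-heap order, so storing `(priority, item)` tuples makes the entry with the lowest value of $f$ come out first. The priorities and labels are made up.

```python
import heapq

frontier = []                              # heapq works on a plain list
heapq.heappush(frontier, (7, "C"))         # (priority, item)
heapq.heappush(frontier, (2, "A"))
heapq.heappush(frontier, (5, "B"))

order = [heapq.heappop(frontier) for _ in range(3)]
print(order)  # [(2, 'A'), (5, 'B'), (7, 'C')]
```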

Generic Search Algorithm with Priority Queue

This algorithm uses a priority queue instead of a FIFO queue, allowing you to prioritize nodes based on a specific evaluation function.
The evaluation function $f$ determines which node to expand next. The node with the highest priority (lowest value returned by $f$ ) is expanded first.
The algorithm maintains a dictionary of reached states and their costs. This dictionary stores the cost of the best known path to each reached state.
If a lower-cost path to a state is found, it replaces the existing solution. This ensures that the algorithm always keeps track of the best known path to each state.
Rather than using a set to store the states that have been reached so far, a dictionary is used in this generalized algorithm. This is because this algorithm is designed to handle costs, ensuring that if a state is reached via a node with a lower path cost, the existing solution is replaced.
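One way the generic algorithm might look in Python is sketched below. Several details are assumptions rather than the lecture's code: the `GraphProblem` class and its method names, the insertion counter used to break ties, and storing whole nodes (which carry their path cost) in the `reached` dictionary. Path cost is used as $f$ here only to exercise the replacement rule.

```python
import heapq
from itertools import count

class Node:
    def __init__(self, state, parent=None, action=None, path_cost=0):
        self.state, self.parent = state, parent
        self.action, self.path_cost = action, path_cost

class GraphProblem:
    """Hypothetical problem definition over a weighted, directed graph."""
    def __init__(self, initial, goal, graph):
        self.initial, self.goal, self.graph = initial, goal, graph

    def is_goal(self, state):
        return state == self.goal

    def successors(self, state):
        # Returns (action, next_state, step_cost) triples.
        return [(f"go {s}", s, c) for s, c in self.graph.get(state, {}).items()]

def best_first_search(problem, f):
    node = Node(problem.initial)
    tie = count()                          # tie-breaker, so Nodes are never compared
    frontier = [(f(node), next(tie), node)]
    reached = {problem.initial: node}      # state -> best node found so far
    while frontier:
        _, _, node = heapq.heappop(frontier)
        if problem.is_goal(node.state):
            return node
        for action, state, cost in problem.successors(node.state):
            child = Node(state, node, action, node.path_cost + cost)
            if state not in reached or child.path_cost < reached[state].path_cost:
                reached[state] = child     # keep the cheaper path
                heapq.heappush(frontier, (f(child), next(tie), child))
    return None                            # frontier exhausted: failure

graph = {"S": {"A": 1, "B": 5}, "A": {"B": 1}, "B": {"G": 1}}
problem = GraphProblem("S", "G", graph)
goal_node = best_first_search(problem, f=lambda n: n.path_cost)
print(goal_node.path_cost)  # 3, via S -> A -> B -> G
```

In this run, state B is first reached directly from S at cost 5 and later via A at cost 2; the cheaper node replaces the earlier entry in `reached`.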

Breadth-First Search via Priority Queue

To emulate BFS with a priority queue, use the depth of a node as the evaluation function $f$ . The shallower the node, the higher its priority.

Depth-First Search via Priority Queue

To emulate DFS, use the negative of the node's depth as the evaluation function $f$ (i.e., $f(n) = -\text{depth}(n)$). The deeper the node, the higher its priority.
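Both emulations can be compared with one small priority-queue loop that only differs in the function passed in. This is a sketch: the tree `TREE`, the function name `expansion_order`, and the choice to break ties by insertion order are assumptions.

```python
import heapq
from itertools import count

TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

def expansion_order(root, f):
    """Expand nodes in order of f(depth); lower values are popped first."""
    tie = count()                                  # oldest entry wins ties
    frontier = [(f(0), next(tie), root, 0)]
    order = []
    while frontier:
        _, _, state, depth = heapq.heappop(frontier)
        order.append(state)
        for child in TREE.get(state, []):
            heapq.heappush(frontier, (f(depth + 1), next(tie), child, depth + 1))
    return order

bfs_like = expansion_order("A", lambda depth: depth)    # shallowest first
dfs_like = expansion_order("A", lambda depth: -depth)   # deepest first
print(bfs_like)  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(dfs_like)  # ['A', 'B', 'D', 'E', 'C', 'F', 'G']
```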

Advantages of DFS

DFS has lower storage requirements than BFS because it only needs to store the nodes on the current path (together with their not-yet-expanded siblings), rather than all nodes at a given level.

Disadvantages of DFS

DFS does not guarantee finding the solution with the smallest number of steps. It may find a longer path if it goes deep down one branch before finding the solution.

Iterative Deepening Search

Iterative deepening search combines the advantages of BFS and DFS. It performs a series of depth-limited DFS searches with increasing depth limits.
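A compact sketch of the idea follows. The tree `TREE`, the function names, and the `max_depth` cut-off (added so the loop always terminates) are assumptions for illustration.

```python
TREE = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}

def depth_limited_search(state, goal, limit, path=()):
    """DFS that refuses to go more than `limit` steps below `state`."""
    path = path + (state,)
    if state == goal:
        return list(path)
    if limit == 0:
        return None                      # depth limit reached on this branch
    for child in TREE.get(state, []):
        result = depth_limited_search(child, goal, limit - 1, path)
        if result is not None:
            return result
    return None

def iterative_deepening_search(start, goal, max_depth=10):
    """Run depth-limited DFS with limits 0, 1, 2, ... up to max_depth."""
    for limit in range(max_depth + 1):
        result = depth_limited_search(start, goal, limit)
        if result is not None:
            return result
    return None

print(iterative_deepening_search("A", "F"))  # ['A', 'C', 'F']
```

Each round only keeps the current path in memory, as in DFS, while the increasing limit means the shallowest goal is found first, as in BFS.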

Uniform Cost Search

In the map problem, each action (moving from one town to the next) has a cost: the number of kilometers that need to be traveled.

To find the shortest route, the search must take the path cost into account, and other search problems likewise have costs associated with their actions. We want to minimize the total cost of the solution. This is easy to achieve with the generic algorithm described above: only the evaluation function $f$ used by the priority queue needs to be replaced, and in Python this function can simply be passed to the corresponding priority queue object.

Uniform cost search takes path costs into account, finding the most efficient route based on cost.
It uses the generic search algorithm with a priority queue, prioritizing nodes based on their path cost.
The evaluation function $f$ returns the path cost of the node. The lower the path cost, the higher the priority.
Uniform cost search finds the minimum-cost solution, making it ideal for problems where costs matter.

Implementation

In Python, pass the path cost as the evaluation function $f$ to the priority queue.
best_first_search(problem, path_cost)
Uniform cost search provides the optimal solution. When applied to path finding problems with maps, it delivers the shortest path.
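The effect can be seen on a tiny map. The sketch below is standalone: instead of calling a shared `best_first_search`, it inlines the loop and uses the path cost directly as the priority. The towns and distances in `ROADS` are invented, and the function signature is an assumption.

```python
import heapq

ROADS = {  # hypothetical towns with one-way distances in km
    "Start": {"North": 4, "South": 1},
    "South": {"North": 2, "Goal": 6},
    "North": {"Goal": 1},
    "Goal": {},
}

def uniform_cost_search(roads, start, goal):
    frontier = [(0, [start])]            # (path cost, path); the cost acts as f
    best_cost = {start: 0}               # cheapest known cost per state
    while frontier:
        cost, path = heapq.heappop(frontier)
        state = path[-1]
        if state == goal:
            return cost, path
        for nxt, km in roads[state].items():
            new_cost = cost + km
            if nxt not in best_cost or new_cost < best_cost[nxt]:
                best_cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, path + [nxt]))
    return None

print(uniform_cost_search(ROADS, "Start", "Goal"))
# (4, ['Start', 'South', 'North', 'Goal'])
```

The cheapest route takes three actions (4 km), whereas the two-action routes that BFS would prefer cost 5 km and 7 km.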

A* Search (Future Topic)

A* search extends uniform cost search by using a heuristic function to estimate the cost to the goal state. It combines the path cost from the start node to the current node and the estimated cost from the current node to the goal node.
The evaluation function $f$ is the sum of the path cost and the heuristic estimate: $f(n) = g(n) + h(n)$ , where $g(n)$ is the path cost and $h(n)$ is the heuristic estimate.
If the heuristic function never overestimates the cost to the goal (such a heuristic is called admissible), A* search is guaranteed to find the optimal solution.

Optimality and Run Time

Uniform cost search gives you the optimal solution.

Why would we need some other algorithm if this gives us the optimal solution?

We also need to consider the run time of the algorithm, that is, how long it takes to find the solution. By using a heuristic evaluation function to order the nodes in the priority queue, we can often reach an optimal solution more quickly. As computer scientists, we are interested in algorithms that give us the desired solution as quickly as possible.

Uniform cost search is optimal in the sense defined by the search problem (path cost): it returns the shortest-path solution. It is not optimal in terms of runtime, and it may take longer to find the solution than algorithms that use heuristics.

A good heuristic function, one that gives a good estimate of the cost of getting from the current state to a goal state, speeds up the search because the nodes in the search tree are expanded in a better order.

Summary for using Generic Best First Search

The generic best-first search yields different algorithms depending on the value that the evaluation function $f$ returns for a node and that the priority queue orders by:
- Depth of the node: the shallowest node is popped off the priority queue next, which gives breadth-first search.
- Negative depth of the node (the additive inverse, $-\text{depth}$; one over the depth reverses the ordering in the same way for nodes below the root): the deepest node is popped next, which gives depth-first search.
- Path cost of the node: whenever a node is expanded, each new node stores its path cost, that is, the cost of performing all the actions that led to it. When the priority queue hands a node to $f$, the function simply looks up this stored path cost and returns it, so the priority queue is ordered by path cost. This gives uniform cost search.
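The three evaluation functions are short enough to write out. In this sketch the `Node` fields `depth` and `path_cost`, the function names, and the sample values are assumptions; `min` plays the role of the priority queue choosing the next node.

```python
from dataclasses import dataclass

@dataclass
class Node:
    state: str
    depth: int = 0
    path_cost: float = 0.0

def f_breadth_first(node):
    return node.depth          # shallowest node first

def f_depth_first(node):
    return -node.depth         # deepest node first

def f_uniform_cost(node):
    return node.path_cost      # cheapest path first

nodes = [Node("B", 1, 9.0), Node("D", 2, 3.0), Node("C", 1, 2.0)]
print(min(nodes, key=f_breadth_first).state)  # B
print(min(nodes, key=f_depth_first).state)    # D
print(min(nodes, key=f_uniform_cost).state)   # C
```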