Artificial Intelligence Foundations, Search Strategies & Machine Learning – Comprehensive Study Notes
Foundations of Artificial Intelligence
• Four descriptive perspectives of AI (what is being studied / engineered):
• Think like humans – e.g., neural networks inspired by how people think.
• Think rationally – formal reasoning that adds new, logically-consistent knowledge.
• Act like humans – behaviour judged by the Turing Test; if the interrogator cannot detect the machine, it is said to be intelligent.
• Act rationally – goal-directed behaviour that produces the best available outcome.
• Interdisciplinary roots feeding AI foundations:
• Philosophy • Mathematics • Economics / decision theory • Neuroscience • Psychology • Computer Engineering • Control Theory • Linguistics
• Enabled by trends in: Computing power, memory/storage, connectivity, automation.
History of AI (Milestones & Phases)
• 1943 – 1956 “Inception”
• Boolean circuits of the brain (McCulloch & Pitts) – first formalisation of neurons with logic.
• SNARC – early neural-network computer using vacuum tubes.
• Alan Turing’s “Computing Machinery & Intelligence.”
• John McCarthy coins the term “Artificial Intelligence.”
• 1952 – 1969 Early enthusiasm
• First AI programs solving mathematics and playing games.
• Lisp emerges as a primary AI language.
• Work focused on microworlds – limited, toy domains.
• 1966 – 1973 Dose of reality
• Systems could use only the information explicitly supplied to them; they manipulated symbols without genuine understanding of the subject matter.
• Computational-complexity barriers (time & memory blow-ups).
• Neural-network research almost disappears.
• 1969 – 1986 Expert-systems era
• Knowledge-based systems: store facts, derive new truths, evolve internal KB.
• Exemplars: DENDRAL (chemical analysis), MYCIN (medical diagnosis), R1 (computer configuration).
• 1980 – 1988: Commercial expert-system boom. 1988 – 1993: subsequent bust.
• Contemporary AI (1986 – 2021+)
• > 1986 Return of neural networks
• > 1987 Probabilistic reasoning & Machine Learning
• > 2001 Big-data explosion
• > 2011 Deep Learning
• > 2021 Generative AI (prototype systems since ≈ 2001)
What AI Can Do Today (examples)
• Robotic / autonomous vehicles • Automated planning / scheduling • Machine translation • Speech recognition • Recommender systems • Game-playing (Go, Chess, video games) • Medical diagnosis & treatment planning • Conversational agents / chatbots
Approaches to AI
Systems Acting Like Humans – Turing Test
• Measures indistinguishability of machine output from a human’s.
• Requires four sub-capabilities:
• Natural-language processing • Knowledge representation • Automated reasoning • Machine learning
• No system has yet passed an unrestricted test.
Systems Thinking Like Humans – Cognitive Modelling
• Build a computational theory of mind (psychology / cognitive science).
• Represent neurons → computational units and implement programs.
Systems Thinking Rationally – Laws of Thought
• Goal: ideal intelligence, independent of human quirks.
• Utilize logic, syllogisms, probability & uncertainty calculus.
Systems Acting Rationally – Rational-Agent View
• Design agents whose actions maximize expected goal achievement.
Toward Beneficial Machines & Value Alignment
• Standard models assume perfect specification of objectives – unrealistic.
• Real systems (e.g., self-driving cars) must operate under incomplete or partially incorrect objectives.
• Value-alignment research: align machine behaviour with true human preferences.
Intelligent Agents
• Agent: entity that perceives its environment via sensors and acts via actuators.
• Artificial Intelligence: study/building of rational agents.
• Rational agent: chooses an action that maximises expected performance measure given the history of percepts.
Formal Agent Function
• Notation: f(p) = a, where p is the current percept and a the chosen action.
• In general: f : P → A (maps percept → action).
• With percept history: f : P* → A, so the action may depend on the entire sequence of percepts observed so far.
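The percept-history mapping f : P* → A can be sketched as a table-driven agent. This is a minimal illustration, not a practical design (the lookup table grows with every possible percept sequence); the percepts and actions below are hypothetical.

```python
# Table-driven agent: the agent function f maps the full percept
# sequence (history) to an action. Percepts/actions are hypothetical.
def make_table_driven_agent(table):
    percepts = []  # accumulated percept history, an element of P*

    def agent(percept):
        percepts.append(percept)
        # Look up the whole history; fall back to a no-op action.
        return table.get(tuple(percepts), "noop")

    return agent

# Example lookup table keyed by percept sequences.
table = {("dirty",): "suck", ("dirty", "clean"): "move"}
agent = make_table_driven_agent(table)
print(agent("dirty"))   # action for history ("dirty",)
print(agent("clean"))   # action for history ("dirty", "clean")
```

The table makes the dependence on history explicit; real agent programs compute f compactly instead of enumerating it.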
Agent Structure
• Agent = Architecture (hardware / body) + Program (software implementing the agent function f).
PEAS Task-Environment Specification
• Performance measure, Environment, Actuators, Sensors – enumerate each when designing an agent.
Environment Properties
• Fully vs Partially observable • Single- vs Multi-agent • Deterministic vs Stochastic • Episodic vs Sequential • Static vs Dynamic • Discrete vs Continuous • Known vs Unknown (rules known?).
Taxonomy of Agents
• Simple Reflex • Reflex w/ State • Goal-Based • Utility-Based • Learning Agents (build on predecessors).
Problem Solving & Search
• Problem-solving agent: goal given, correct action sequence not obvious → plan by search.
• State = configuration of agent + environment.
• State space = set of all reachable states.
• Transition model defines result of each action.
Problem-Solving Process
1. Goal formulation 2. Problem formulation 3. Search 4. Execution.
• Formal problem = ⟨State space, Initial state, Actions, Transition model, Goal test, Action-cost function⟩.
Search Trees
• Root = initial state; edges = actions; nodes = resulting states.
• Frontier = collection (queue/stack/PQ) of nodes generated but not expanded.
Uninformed (Blind) Search Strategies
• No domain-specific information; know only actions + costs.
Breadth-First Search (BFS)
• Frontier as FIFO queue.
• Explores shallowest nodes first.
• Complete & optimal when all step costs equal.
• Time: O(b^d). Space: O(b^d), where b = branching factor, d = depth of shallowest goal.
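The FIFO behaviour above can be sketched in a few lines; the graph used here is a hypothetical toy state space, with states as strings and a neighbor function standing in for the transition model.

```python
from collections import deque

def bfs(start, goal, neighbors):
    """Breadth-first search: FIFO frontier, returns a shallowest path."""
    frontier = deque([[start]])   # queue of paths; shallowest at the front
    reached = {start}             # states already generated
    while frontier:
        path = frontier.popleft()             # expand shallowest node first
        if path[-1] == goal:
            return path
        for nxt in neighbors(path[-1]):
            if nxt not in reached:            # avoid revisiting states
                reached.add(nxt)
                frontier.append(path + [nxt])
    return None                               # goal unreachable

# Toy graph (hypothetical state space).
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(bfs("A", "D", lambda s: graph[s]))  # ['A', 'B', 'D']
```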
Depth-First Search (DFS)
• Frontier as LIFO stack.
• Explores deep branches; may not terminate in infinite trees.
• Not optimal.
• Time: O(b^m). Space: O(bm), with m = maximum depth.
Depth-Limited Search
• DFS with depth limit l ⇒ terminates.
• Incomplete if l < d.
• Time: O(b^l). Space: O(bl).
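DFS and its depth-limited variant can share one sketch: a LIFO stack of paths, with an optional cutoff. The graph is the same hypothetical toy state space used for illustration; which goal path DFS returns depends on expansion order, underscoring that it is not optimal.

```python
def dfs(start, goal, neighbors, limit=None):
    """Depth-first search with an optional depth limit (LIFO stack).
    limit counts edges: nodes at that depth are not expanded further."""
    stack = [[start]]
    while stack:
        path = stack.pop()                    # expand deepest node first
        if path[-1] == goal:
            return path
        if limit is not None and len(path) - 1 >= limit:
            continue                          # depth-limited cutoff
        for nxt in neighbors(path[-1]):
            if nxt not in path:               # avoid cycles along this path
                stack.append(path + [nxt])
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(dfs("A", "D", lambda s: graph[s]))           # ['A', 'C', 'D']
print(dfs("A", "D", lambda s: graph[s], limit=1))  # None: l < d
```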
Iterative Deepening Search (IDS)
• Repeated depth-limited search for l = 0, 1, 2, … until the goal is found.
• Complete & optimal (unit costs).
• Time: O(b^d). Space: O(bd).
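Iterative deepening is just a loop over increasing depth limits; a minimal sketch (self-contained, on a hypothetical toy graph, with `max_depth` as an arbitrary safety bound):

```python
def depth_limited(start, goal, neighbors, limit):
    """DFS that does not expand nodes deeper than `limit` edges."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == goal:
            return path
        if len(path) - 1 >= limit:            # reached the cutoff depth
            continue
        for nxt in neighbors(path[-1]):
            if nxt not in path:
                stack.append(path + [nxt])
    return None

def ids(start, goal, neighbors, max_depth=50):
    """Iterative deepening: depth-limited search for l = 0, 1, 2, ..."""
    for limit in range(max_depth + 1):
        path = depth_limited(start, goal, neighbors, limit)
        if path is not None:
            return path
    return None

graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(ids("A", "D", lambda s: graph[s]))  # finds a shallowest goal path
```

Re-generating shallow nodes at every iteration looks wasteful, but the deepest level dominates the total, which is why time stays O(b^d) while space stays linear.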
Bidirectional Search
• Simultaneous forward search from start and backward search from goal until frontiers meet.
• Potential O(b^(d/2)) time, but the backward transition model is sometimes difficult to define.
Uniform-Cost Search (UCS)
• Priority queue ordered by path cost g(n).
• Expands cheapest-cost node; complete & optimal for positive costs.
• Reduces to BFS if all costs equal.
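A minimal UCS sketch using a binary heap as the priority queue; the weighted graph below is hypothetical, chosen so the cheapest route is not the one with the fewest steps.

```python
import heapq

def ucs(start, goal, edges):
    """Uniform-cost search: priority queue ordered by path cost g(n)."""
    frontier = [(0, start, [start])]          # (g, state, path)
    best = {start: 0}                         # cheapest known g per state
    while frontier:
        g, state, path = heapq.heappop(frontier)  # cheapest node first
        if state == goal:
            return g, path
        for nxt, cost in edges.get(state, []):
            new_g = g + cost
            if nxt not in best or new_g < best[nxt]:
                best[nxt] = new_g
                heapq.heappush(frontier, (new_g, nxt, path + [nxt]))
    return None

# Hypothetical weighted graph: the cheap route A->C->D beats A->B->D.
edges = {"A": [("B", 1), ("C", 2)], "B": [("D", 5)], "C": [("D", 1)]}
print(ucs("A", "D", edges))  # (3, ['A', 'C', 'D'])
```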
Informed (Heuristic) Search
• Utilize problem-specific knowledge (estimated cost to goal).
Greedy Best-First Search
• Priority by the heuristic h(n) (estimated cost to goal) only.
• Fast but not optimal; risks dead-ends.
A★ Search
• Priority by f(n) = g(n) + h(n).
• Optimal if heuristic is admissible (never overestimates) and consistent.
• Never expand costlier paths than necessary; ignore already-explored states.
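A* is uniform-cost search with the queue keyed on f(n) = g(n) + h(n) instead of g(n) alone; a minimal sketch on a hypothetical weighted graph with an admissible heuristic table chosen for the example:

```python
import heapq

def astar(start, goal, edges, h):
    """A*: expand by f(n) = g(n) + h(n); optimal when h is admissible."""
    frontier = [(h(start), 0, start, [start])]    # (f, g, state, path)
    best = {start: 0}
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:
            return g, path
        for nxt, cost in edges.get(state, []):
            new_g = g + cost
            if nxt not in best or new_g < best[nxt]:
                best[nxt] = new_g
                heapq.heappush(frontier,
                               (new_g + h(nxt), new_g, nxt, path + [nxt]))
    return None

# Hypothetical graph; h never overestimates the true cost to D (admissible).
edges = {"A": [("B", 1), ("C", 2)], "B": [("D", 5)], "C": [("D", 1)]}
h = {"A": 3, "B": 4, "C": 1, "D": 0}.get
print(astar("A", "D", edges, h))  # (3, ['A', 'C', 'D'])
```

With h(n) = 0 everywhere this reduces to UCS; a better-informed admissible h simply prunes more of the frontier.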
Complexity & Strategy Evaluation
• Completeness? • Optimality? • Time complexity (nodes generated). • Space complexity (max frontier size).
Machine Learning Overview
• ML ⊂ AI: enables agents to build models from data and improve performance.
• Deep Learning ⊂ ML: multi-layer artificial neural networks.
• An agent is “learning” when post-observation performance measure improves.
Learning Paradigms
• Supervised learning – learn mapping from inputs to labelled outputs.
• Tasks: classification (discrete labels), regression (continuous value).
• Algorithms: Nearest Neighbors, Decision Trees, Neural Networks, SVMs, Linear Regression.
• Unsupervised learning – find structure / patterns in unlabeled data.
• Task: clustering, association rules.
• Algorithms: K-Means, Hierarchical (Agglomerative, Divisive), Gaussian Mixture Models, Spectral Clustering, Apriori.
• Reinforcement learning – agent learns policy via reward/punishment signals (exploration vs exploitation balance).
• Algorithms: Q-Learning, SARSA, Deep Q-Networks.
Clustering Details (Unsupervised)
K-Means
• Steps:
1. Choose k; randomly select k data points as initial centroids.
2. Assign each point to its nearest centroid.
3. Recompute each centroid as the mean of its assigned points.
4. Repeat 2–3 until assignments stabilize.
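The steps above can be sketched in plain Python (no external libraries); the four 2-D points are a made-up toy dataset with two obvious clusters.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain k-means on tuples: assign, re-average, repeat until stable."""
    random.seed(seed)
    centroids = random.sample(points, k)          # step 1: random init
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                          # step 2: nearest centroid
            i = min(range(k),
                    key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[j])))
            clusters[i].append(p)
        new = [tuple(sum(c) / len(c) for c in zip(*cl)) if cl else centroids[i]
               for i, cl in enumerate(clusters)]  # step 3: recompute means
        if new == centroids:                      # step 4: stabilized
            break
        centroids = new
    return centroids, clusters

pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
cents, _ = kmeans(pts, 2)
print(sorted(cents))  # [(0.0, 0.5), (10.0, 10.5)]
```

Note that k-means converges to a local optimum; different random initialisations can yield different clusterings, which is why libraries typically run it several times.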
Hierarchical / Agglomerative Clustering
• Start each instance as its own cluster; iteratively merge closest clusters.
• Produces a dendrogram; a horizontal cut at the desired height yields k clusters.
• Divisive (top-down) = inverse process.
Gaussian Mixture Models (GMM)
• Probabilistic model assuming data from a mixture of Gaussians; EM algorithm fits parameters.
Spectral Clustering
• Utilizes graph Laplacian eigenvectors for non-convex shapes (e.g., spirals).
Cluster-Quality Metrics
• Homogeneity – similarity within a cluster.
• Heterogeneity – separation between clusters.
Evaluation methods:
• Inertia (within-cluster sum of squares) – lower is better. Use the elbow method to pick k.
• Procedure: compute the distance of each point to its centroid, square, and sum.
• Silhouette score s = (b − a) / max(a, b) (ranges −1…1, higher is better), where a = mean intra-cluster distance and b = mean distance to the nearest other cluster.
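Both metrics are small formulas; a minimal sketch, with made-up values of a and b and a toy single cluster:

```python
def silhouette_point(a, b):
    """Silhouette for one point: a = mean intra-cluster distance,
    b = mean distance to the nearest other cluster. Range: -1 to 1."""
    return (b - a) / max(a, b)

def inertia(points, centroid):
    """Within-cluster sum of squared distances to the centroid."""
    return sum(sum((p - c) ** 2 for p, c in zip(pt, centroid))
               for pt in points)

print(silhouette_point(a=1.0, b=4.0))           # 0.75: fits its cluster well
print(silhouette_point(a=4.0, b=1.0))           # -0.75: likely mis-clustered
print(inertia([(0, 0), (0, 1)], (0, 0.5)))      # 0.5
```

The dataset-level silhouette score is the mean of the per-point values over all points.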
Classification (Supervised)
• Dataset: instances (rows) × features (columns) + label column (≥ 2 classes).
• Goal: learn pattern to predict label from unseen features.
Models
• k-Nearest Neighbors – locate the k nearest training points and vote; choose odd k to avoid ties.
• Decision Trees – sequence of feature tests partitions space; prone to over-fit.
• Random Forests – ensemble of many decision trees reduces over-fitting.
• Support Vector Machines – find maximum-margin hyperplane separating classes.
• Artificial Neural Networks – layers of weighted neurons; back-propagation adjusts weights.
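As one concrete example from the list above, k-nearest neighbors can be sketched in a few lines; the 2-D training set and class labels below are hypothetical.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """k-nearest neighbors: majority vote among the k closest points.
    `train` is a list of (point, label) pairs."""
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D training set with two classes.
train = [((0, 0), "red"), ((0, 1), "red"), ((1, 0), "red"),
         ((5, 5), "blue"), ((5, 6), "blue"), ((6, 5), "blue")]
print(knn_predict(train, (0.5, 0.5)))  # red
print(knn_predict(train, (5.5, 5.5)))  # blue
```

Sorting the whole training set is O(n log n) per query; production implementations use k-d trees or approximate indexes instead.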
Model Evaluation
• Train-test (or cross-validation) split.
• Metrics:
• Accuracy = (TP + TN) / (TP + TN + FP + FN).
• Confusion Matrix yields Precision = TP / (TP + FP) and Recall = TP / (TP + FN).
• F1 = 2 · Precision · Recall / (Precision + Recall).
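These metrics follow directly from the four confusion-matrix counts; a minimal sketch with made-up counts:

```python
def precision_recall_f1(tp, fp, fn, tn):
    """Classification metrics from confusion-matrix counts:
    tp/fp/fn/tn = true/false positives and negatives."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)       # of predicted positives, how many real
    recall = tp / (tp + fn)          # of real positives, how many found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

acc, p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=2, tn=8)
print(round(acc, 3), round(p, 3), round(r, 3), round(f1, 3))
```

On imbalanced classes, accuracy alone is misleading, which is why precision, recall, and F1 are reported alongside it.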
• Perfect accuracy on the training data often indicates over-fitting (poor generalisation to unseen data).
Ethical / Practical Implications (embedded throughout)
• Specification problems (incomplete objectives) risk unsafe behaviour.
• Over-fitted ML models misclassify novel data; fairness & bias considerations.
• Real-world deployment requires balancing performance, safety, interpretability.
All agent and learning paradigms build upon one another: Reflex → State → Goal → Utility → Learning, mirroring historical progress from symbolic AI to data-driven deep and generative systems.