Statistical Machine Learning Algorithm Design

The framework is a useful guideline, not a strict set of rules.
Recipe boxes simplify complex ideas, which makes the subtle and complicated issues that come up later easier to understand.

Objective: Define the mathematical model of the machine learning algorithm's operating environment.
Insights gained: This step helps determine the learning paradigm (supervised, unsupervised, or reinforcement learning) and the structure of the data.
- Learning Paradigms: The nature of the task points to the paradigm. For example, identifying patterns without explicit instruction suggests unsupervised learning.
- Pattern Structure: How data is represented, including feature vectors and feature maps (functions that transform real-world data into feature vectors).
- Stationary Statistical Environment: Whether the probability of observing a given feature vector remains constant across different learning trials.
  - Non-Stationary Environment Example (Stock Market): Statistical regularities in stock market data from 50 years ago are likely different from today's, making the environment non-stationary. Data from a shorter span of consecutive years (e.g., 5-10 years) might be more stable. The financial environment is often slowly shifting.
  - Non-Stationary Environment Example (Robot Learning): A robot learning to recognize images in its environment changes its statistical environment by actively moving and looking at different things. The environment is non-stationary because it changes as a function of the robot's current knowledge and actions.
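One rough way to probe the stationarity question above is to split the learning trials into consecutive windows and compare how often each feature vector is observed in each window. The sketch below is only an illustration under stated assumptions: the function names, the toy data, and the choice of total variation distance are not from the framework itself.

```python
from collections import Counter


def empirical_distribution(observations):
    """Relative frequency of each distinct feature vector (given as a tuple)."""
    counts = Counter(observations)
    total = len(observations)
    return {x: c / total for x, c in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions (dicts)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)


def drift_between_windows(observations, window_size):
    """Distance between each pair of consecutive, non-overlapping windows."""
    windows = [
        observations[i:i + window_size]
        for i in range(0, len(observations) - window_size + 1, window_size)
    ]
    dists = [empirical_distribution(w) for w in windows]
    return [total_variation(a, b) for a, b in zip(dists, dists[1:])]


# Hypothetical one-dimensional binary feature vectors (toy data, not real).
early = [(0,), (0,), (0,), (1,)]  # (1,) observed 25% of the time
late = [(1,), (1,), (1,), (0,)]   # (1,) observed 75% of the time

print(drift_between_windows(early + early, 4))  # [0.0]  same statistics
print(drift_between_windows(early + late, 4))   # [0.5]  statistics shifted
```

A distance near zero between windows is consistent with a stationary environment. With small windows, a nonzero distance can also come from sampling noise alone, so any threshold is a judgment call.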
This initial modeling provides rich information before even considering the specific machine learning algorithm.
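As a concrete illustration of the feature map idea from the Pattern Structure item, the sketch below turns a raw record into a fixed-length feature vector. The record fields (height_cm, color), the category list, and the scaling are hypothetical choices made only for this example.

```python
# Hypothetical list of categories the "color" field may take.
COLORS = ("red", "green", "blue")


def feature_map(record):
    """Map a raw record (a dict) to a fixed-length list of real numbers."""
    # Numeric field: rescale centimeters to meters.
    height = record["height_cm"] / 100.0
    # Categorical field: one-hot encode against the known category list.
    one_hot = [1.0 if record["color"] == c else 0.0 for c in COLORS]
    return [height] + one_hot


x = feature_map({"height_cm": 150, "color": "green"})
print(x)  # [1.5, 0.0, 1.0, 0.0]
```

Every record maps to a vector of the same length, so a learning machine can treat all observations uniformly regardless of how the raw data was originally recorded.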

Simplification: Keep the architecture as simple as possible initially.
Circles and Arrows Notation:
- Units (Circles): Represent computing functions with a real-valued