Markov Chains: Applications, Weather Example, and Steady State

Technical setup and course context

  • The lecturer starts with a camera issue (one camera offline, one blurry) and fixes it by rebooting hardware; a light anecdote about technical disruptions.
  • The session introduces the week’s topic: applications of previous material, focusing on Markov chains and their real-world implications (e.g., Google’s PageRank).
  • Administrative notes: MATLAB quiz this week; no workshops; midterms are being graded; two weeks of teaching break ahead; office hours available; students encouraged to seek help if behind.
  • The instructor offers an encouraging message: you are smart, and midterm outcomes can be redeemed by continued effort.
  • A few remarks on the general tone of the course and the instructor’s openness to discussion.
  • Preview: tomorrow’s topic includes the PageRank algorithm; the session frames Markov chains as probabilistic models for systems with a finite set of states.

Markov chains: basic setup and definitions

  • A Markov chain is a probabilistic model used for systems that evolve in steps with probabilistic transitions between a finite set of states.
  • Core components:
    • A finite set of states, e.g., {S, C, R} for weather: Sunny (S), Cloudy (C), Rainy (R).
    • A transition mechanism described by a transition matrix T, encoding probabilities of moving from one state to another in one time step.
  • Markov (memoryless) property: the next state depends only on the current state, not on the history of past states.
  • Reading the transition matrix T:
    • Columns correspond to the from-state (the current state).
    • Rows correspond to the to-state (the next state).
    • Entry T_{ij} is the probability of transitioning to state i given that we are in state j right now.
    • Example (weather): if today is Rainy, the probabilities for tomorrow could be Sunny, Cloudy, Rainy as 1/4, 1/2, 1/4 respectively; this would place those values in the Rainy-column of T.
  • Column sums: each column sums to 1 because probabilities from a given current state must total 100%.
  • Row sums are not generally 1; a row gives how likely it is to end up in a particular to-state from all possible from-states, which need not sum to 1.
  • Die example to illustrate: a system with six states (the faces 1–6) and a transition matrix where every entry is 1/6 (every from-state can go to any to-state with probability 1/6). In this case, each column sums to 1, and all rows are identical.
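The die example above is easy to check numerically. Below is a minimal NumPy sketch (the lecture itself used no code): build the 6×6 matrix with every entry 1/6 and verify the two properties claimed, that each column sums to 1 and all rows are identical.

```python
import numpy as np

# Six states (die faces 1-6); every from-state moves to every
# to-state with probability 1/6, as in the die example.
T = np.full((6, 6), 1 / 6)

# Each column sums to 1: probabilities out of a given current
# state must total 100%.
column_sums = T.sum(axis=0)

# All rows are identical, since every to-state is equally likely
# regardless of the from-state.
rows_identical = np.allclose(T, T[0])
```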

Transition matrix, state evolution, and notation

  • State distribution at time k is a probability vector x_k with nonnegative entries summing to 1:
    • x_k(i) = P(current state at time k = i).
  • Evolution across one step:
    • x_{k+1} = T x_k
    • This is a standard matrix multiplication: the current distribution x_k is transformed by T to yield the next distribution x_{k+1}.
  • Initial state vector x_0:
    • Example: starting with Sunny only would be x_0 = [1, 0, 0]^T if S, C, R are ordered as 1, 2, 3.
    • If starting from Rainy, x_0 would be the unit vector corresponding to R, i.e. x_0 = [0, 0, 1]^T.
  • Long-run evolution:
    • After k steps: x_k = T^k x_0.
    • The analysis often focuses on the limit as k → ∞, i.e., the steady-state behavior of the chain.
  • Key long-term question: what will the distribution look like after many steps (e.g., after a week, a decade)?
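The evolution rules x_{k+1} = T x_k and x_k = T^k x_0 can be sketched as follows. Only the Rainy column [1/4, 1/2, 1/4] comes from the lecture's example; the Sunny and Cloudy columns below are made-up values chosen so each column is a valid probability vector.

```python
import numpy as np

# Hypothetical weather transition matrix, states ordered (S, C, R).
# Rainy column [1/4, 1/2, 1/4] is from the lecture; the other two
# columns are illustrative assumptions.
T = np.array([[0.50, 0.00, 0.25],   # to Sunny
              [0.20, 0.40, 0.50],   # to Cloudy
              [0.30, 0.60, 0.25]])  # to Rainy

x0 = np.array([1.0, 0.0, 0.0])      # start: Sunny with certainty

# One step: x_{k+1} = T x_k. Starting from Sunny, this simply
# reads off the Sunny column of T.
x1 = T @ x0

# k steps at once: x_k = T^k x_0 (a week later, k = 7).
x7 = np.linalg.matrix_power(T, 7) @ x0
```

Note that every x_k remains a probability vector: T maps distributions to distributions because its columns sum to 1.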

The weather model in detail (example narrative)

  • The model uses three states: Sunny (S), Cloudy (C), Rainy (R).
  • Transition matrix T is a 3×3 matrix: columns are from-states (S, C, R), rows are to-states (S, C, R).
  • An explicit example given: when it is Rainy today, tomorrow is Sunny with probability 1/4, Cloudy with probability 1/2, and Rainy with probability 1/4.
  • Implications:
    • The column for Rainy is [1/4, 1/2, 1/4]^T.
    • The first row (to-Sunny) sums to 1/2 in that example; this illustrates that row sums need not be 1.
  • In general, the evolution uses the rule: from current state j, the next-state distribution is given by the j-th column of T.
  • If the system is iterated, the distribution tends to a steady state q, i.e., a fixed distribution that does not change under T:
    • T q = q.
    • In this weather example, long-run distributions stabilize: e.g., the steady-state vector may be q = [0.2, 0.4, 0.4]^T (entries sum to 1).
  • The phenomena observed over time (x1, x2, …) often show convergence toward a stable distribution, with differences between successive distributions shrinking as k grows.
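The shrinking differences between successive distributions can be observed directly. The sketch below uses a hypothetical T consistent with the lecture's Rainy column and with the steady state q = [0.2, 0.4, 0.4] (the Sunny and Cloudy columns are assumptions), iterating from a Rainy start.

```python
import numpy as np

# Hypothetical T, states ordered (S, C, R); only the Rainy column
# and the steady state q are from the lecture.
T = np.array([[0.50, 0.00, 0.25],
              [0.20, 0.40, 0.50],
              [0.30, 0.60, 0.25]])
q = np.array([0.2, 0.4, 0.4])

x = np.array([0.0, 0.0, 1.0])        # start from Rainy
diffs = []
for _ in range(30):
    x_next = T @ x
    diffs.append(np.linalg.norm(x_next - x))  # ||x_{k+1} - x_k||
    x = x_next

# diffs decreases toward 0 and x settles at q: the chain
# "forgets" that it started in the Rainy state.
```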

Steady state, regularity, and long-term behavior

  • Regular stochastic matrix: a square matrix T where some power T^m has all entries positive (i.e., every state can be reached from every other state in some number of steps).
  • Consequences of regularity:
    • There exists a unique steady-state vector q with T q = q and 1^T q = 1 (where 1^T = [1, 1, 1]).
    • For any initial distribution x_0, the distribution after many steps converges to q:
    • lim_{k→∞} T^k x_0 = q, independent of x_0.
  • Interpretation: long-term behavior is determined by the transition probabilities rather than the starting state.
  • In the weather example, as k grows large, the distribution of weather types stabilizes at the steady-state q; the chain forgets its initial condition due to regularity.
  • Practical note: while such a simple Markov model is not a reliable short-term weather predictor, it can yield meaningful long-run averages and is used to motivate ideas like stationary distributions and drift toward equilibrium.
  • Short-term weather forecasting is typically more complicated and involves stiff numerical models; long-term Markov models abstract away fine dynamics to capture average behavior.
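Regularity itself is mechanical to check: raise T to successive powers and stop when some power has all entries strictly positive. A minimal sketch, again using the hypothetical weather T (only the Rainy column is from the lecture); note T itself contains a zero, but T² is already all-positive:

```python
import numpy as np

T = np.array([[0.50, 0.00, 0.25],   # hypothetical weather T
              [0.20, 0.40, 0.50],   # (assumed S and C columns)
              [0.30, 0.60, 0.25]])

def is_regular(T, max_power=50):
    """True if some power T^m (m <= max_power) has all entries > 0."""
    P = T.copy()
    for _ in range(max_power):
        if np.all(P > 0):
            return True
        P = P @ T
    return False
```

The identity matrix, by contrast, is stochastic but never regular: every power keeps its off-diagonal zeros, so the chain never mixes.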

Finding and interpreting the steady-state vector

  • Stationary (steady-state) vector q satisfies:
    • T q = q and Σ_i q_i = 1, i.e. q ∈ ℝ^n with q_i ≥ 0 and Σ_i q_i = 1.
  • Linear-algebraic approach to find q:
    • Solve the homogeneous system (T − I) q = 0 with the normalization constraint 1^T q = 1.
    • Equivalently, find q in the null space of T − I and then scale it to sum to 1.
  • A common computational path:
    • Compute a basis for Null(T − I), then pick the (unique) probability vector in that subspace by enforcing Σ_i q_i = 1.
  • Worked illustration (from the transcript):
    • If the given T is regular, the steady-state vector is unique and can be found by solving the above system.
    • In the example discussed, the steady state was found to be q = [0.2, 0.4, 0.4]^T.
    • This vector is an eigenvector of T associated with eigenvalue 1:
    • T q = q.
  • Relationship to eigenvectors and eigenvalues:
    • Steady states correspond to eigenvectors of T with eigenvalue 1 (right eigenvectors when x_{k+1} = T x_k).
    • The discussion foreshadows eigenvectors and eigenvalues as a key topic for understanding Markov chains and many other linear-algebraic systems.
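The solve-(T − I)q = 0-plus-normalization recipe can be sketched in NumPy: since the rows of T − I are linearly dependent (they sum to zero when T is stochastic), one redundant equation can be replaced by the normalization row 1^T q = 1, giving a square nonsingular system. The T below is the hypothetical one consistent with the lecture's Rainy column and q = [0.2, 0.4, 0.4].

```python
import numpy as np

T = np.array([[0.50, 0.00, 0.25],   # hypothetical weather T
              [0.20, 0.40, 0.50],   # (assumed S and C columns)
              [0.30, 0.60, 0.25]])
n = T.shape[0]

# Build (T - I) q = 0, then overwrite the last (redundant) equation
# with the normalization constraint [1, 1, ..., 1] q = 1.
A = T - np.eye(n)
A[-1, :] = 1.0
b = np.zeros(n)
b[-1] = 1.0

q = np.linalg.solve(A, b)   # the unique steady-state vector
```

As a cross-check, q should satisfy T q = q, i.e. it is an eigenvector of T with eigenvalue 1.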

Practical takeaways and connections

  • Long-term independence from initial state:
    • For a regular stochastic matrix, the limit distribution is the same regardless of where you start, provided the chain can reach every state (the well-mixed property).
  • What the model can and cannot tell you:
    • It gives long-term averages and stationary distribution rather than precise short-term forecasts.
    • The actual weather on a specific future day is not predicted exactly; rather, the probabilities of weather types stabilize.
  • Real-world relevance and extensions:
    • Markov chains underpin PageRank (to be discussed in the next lecture).
    • There are extensions to handle more complex real-world phenomena, including non-regular matrices, irreducible vs reducible chains, and aperiodicity concerns.
  • Terminology recap:
    • Probability vector: A vector with nonnegative entries that sum to 1.
    • Stochastic matrix: A square matrix whose columns are probability vectors (i.e., columns sum to 1).
    • Regular matrix: A stochastic matrix for which some power has all entries positive (well-mixed; irreducible in the standard sense).
    • Steady-state (equilibrium) vector: A probability vector q with T q = q; the long-run distribution of states.

Quick definitions and formulas to memorize

  • Probability vector: p = [p_1, p_2, …, p_n]^T with p_i ≥ 0 and Σ_{i=1}^n p_i = 1.
  • Transition matrix: T = [T_{ij}] with columns summing to 1:
    • Σ_{i=1}^n T_{ij} = 1 for all j.
  • State evolution: x_{k+1} = T x_k and x_k = T^k x_0.
  • Steady state: T q = q and 1^T q = 1, i.e. Σ_i q_i = 1.
  • Null-space formulation: solve (T − I) q = 0 subject to the normalization constraint.
  • Long-run convergence: for a regular stochastic matrix T,
    • lim_{k→∞} T^k x_0 = q for any initial distribution x_0.

Preview of next topics and closing notes

  • Next lecture: deeper models and specific algorithms, including the PageRank algorithm used by Google.
  • The current discussion sets a foundation for understanding how local transition rules aggregate into global, long-run behavior.
  • The instructor invites questions and emphasizes that grasping these concepts builds intuition for more advanced topics in linear algebra and applied probability.

Connections to prior material and broader context

  • Link to previous lectures: change of coordinates and matrix invertibility themes tie into how we analyze and compute steady states using linear algebra.
  • Foundational principles: linear transformations, eigenvectors/eigenvalues, and the role of probability distributions in discrete state spaces.
  • Real-world relevance: Markov chains model a wide range of phenomena beyond weather (e.g., search ranking, genetics, queueing systems, economics).

Ethical, philosophical, and practical implications

  • Ethically: when modeling real-world phenomena, it is important to acknowledge modeling assumptions (memorylessness, fixed transition probabilities) and the limitations they impose.
  • Philosophically: Markov models embody a principle that long-run behavior can be dictated by local transition rules, sometimes independent of starting conditions, which raises questions about determinism vs randomness.
  • Practically: the choice of whether a Markov model is appropriate depends on the system’s memory, the availability of transition data, and the desired forecasting horizon.

Summary of key takeaways

  • Markov chain = finite states + transition matrix with memoryless transitions.
  • Transition matrix interpretation: columns = from-states; rows = to-states; entries give one-step transition probabilities.
  • State evolution: x_{k+1} = T x_k; long-run behavior explores the limit of T^k x_0.
  • Regular stochastic matrices guarantee a unique steady-state distribution q and convergence lim_{k→∞} T^k x_0 = q, independent of the starting distribution.
  • The steady-state vector q satisfies T q = q with normalization Σ_i q_i = 1, and can be found by solving the linear system (T − I) q = 0 plus the normalization constraint.
  • In the weather example, the steady state was found to be q = [0.2, 0.4, 0.4]^T, illustrating a non-uniform long-run distribution.
  • The topic sets the stage for algorithms like PageRank and more advanced stochastic models.