How to Predict the Future – Long Tails

Overview

This document serves as an exhaustive study guide based on the topics covered in the lecture titled "How to Predict the Future – Long Tails." The central focus is on understanding predictions in the context of statistical distributions, particularly long-tail distributions that arise in various fields, including finance and social sciences.

Today's Agenda

The agenda for the discussion includes the following key topics:

  1. Review the Central Limit Theorem (CLT)

  2. Describe the problem of tail risk

  3. Analyze the circumstances under which the assumptions of the CLT may be violated

  4. Introduce long tail probability distributions

Warmup: The Galton Board

  • Definition and Purpose: The Galton Board is a device that demonstrates the Central Limit Theorem. Balls dropped through rows of pegs pile up in an approximately normal distribution (a bell curve).

  • Explanation: By the Central Limit Theorem, the aggregate of a large number of independent random events is approximately normally distributed. Each ball's final position is the sum of many independent left-or-right bounces, so the balls settle into a bell curve shape.
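
The Galton Board argument above can be checked with a short simulation. The following is a minimal sketch, not part of the lecture: the function name galton_board, the number of balls, the number of peg rows, and the random seed are all illustrative assumptions.

```python
import random
from collections import Counter


def galton_board(n_balls=10_000, n_rows=12, seed=0):
    """Drop n_balls through n_rows of pegs and count balls per final bin.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    bins = Counter()
    for _ in range(n_balls):
        # Each peg sends the ball right (1) or left (0) with equal probability.
        # The final bin is the sum of these independent bounces.
        position = sum(rng.randint(0, 1) for _ in range(n_rows))
        bins[position] += 1
    return bins


bins = galton_board()
# Crude text histogram: one '#' per 50 balls in each bin.
for position in range(13):
    print(f"{position:2d} {'#' * (bins[position] // 50)}")
```

Because each final position is a sum of independent bounces, the printed histogram should be tallest near the middle bin and thin out symmetrically toward the edges.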

Tail Risk

  • Definition: Tail risk refers to the risk of extreme results that fall far outside of what is expected based on standard statistical assumptions.

  • Significance: Tail risk is particularly relevant in sectors such as financial analysis, military planning, and emergency management. Analyzing it means quantifying the likelihood of unlikely but high-impact events.

  • Example from Previous Discussion: Tail risk was linked to the margin of error in political polling, providing a concrete application of the concept.

Bell Curves and Tail Risk

  • Standard Deviation (σ): Standard deviation is a measure of the spread of a distribution and indicates how far, on average, individual observations deviate from the expected value.

  • Sigma Events:

    • A "2 sigma" event (an outcome more than two standard deviations from the expected value) occurs with a likelihood of only about 5%.

    • A "3 sigma" event has a probability of about 0.3%.
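
The sigma-event probabilities above can be computed directly from the normal distribution. This is a minimal sketch using Python's standard library; the helper name two_sided_tail is an assumption introduced for illustration. It treats a "k sigma" event as an outcome more than k standard deviations from the mean in either direction.

```python
import math


def two_sided_tail(k):
    """P(|Z| > k) for a standard normal variable Z."""
    # erfc is the complementary error function; for a standard normal,
    # P(|Z| > k) = erfc(k / sqrt(2)).
    return math.erfc(k / math.sqrt(2))


for k in (1, 2, 3):
    outside = two_sided_tail(k)
    print(f"{k} sigma: {1 - outside:.2%} inside, {outside:.2%} outside")
```

The exact values for 2 and 3 sigma are about 4.6% and 0.27%, which the figures above round to 5% and 0.3%.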

Financial Markets Example

  • Observed Behaviors: In financial markets, day-to-day price movements often appear to follow a bell curve. For instance, price changes larger than 1.68% (roughly a 2 sigma move) are expected on only about one in every 20 trading days. This is broadly consistent with the observation that on 94% of trading days, prices fluctuate within predictable ranges.

The Problem of Extreme Events

  • Case Study: On March 16, 2020, the S&P 500 index experienced a decline of 11.98%, an event described as a "14 sigma" event.

    • Probability Estimation: According to the traditional bell curve, the likelihood of a 14 sigma event is extraordinarily low, approximately 7.79 × 10^-43 percent (a decimal point followed by 42 zeros before the first nonzero digit). Such an event should essentially never be observed: if the New York Stock Exchange had existed since the universe began (13.8 billion years ago), it would still be extremely unlikely to have witnessed such a price drop.

  • Conclusion: Actual financial markets exhibit greater tail risk than a conventional bell curve predicts (Taleb 2007).

    • Critical Inquiry: Which assumptions of the Central Limit Theorem are violated in these situations?
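
The 14 sigma estimate can be reproduced with a few lines of arithmetic. This is a rough sketch under stated assumptions: the one-sided normal tail is used (it matches the figure quoted above), 252 trading days per year is an assumed convention that does not come from the lecture, and the names upper_tail and TRADING_DAYS_PER_YEAR are illustrative.

```python
import math


def upper_tail(k):
    """P(Z > k) for a standard normal variable Z (one-sided tail)."""
    return 0.5 * math.erfc(k / math.sqrt(2))


TRADING_DAYS_PER_YEAR = 252      # assumption, not from the lecture
AGE_OF_UNIVERSE_YEARS = 13.8e9   # figure quoted in the text

p = upper_tail(14)
# Expected number of years of daily trading before one such day occurs.
expected_wait_years = 1 / (p * TRADING_DAYS_PER_YEAR)

print(f"P(14 sigma move on a given day): {p:.3e}")
print(f"Expected wait: {expected_wait_years:.2e} years")
print(f"In ages of the universe: {expected_wait_years / AGE_OF_UNIVERSE_YEARS:.2e}")
```

Under the bell curve assumption, the expected wait exceeds the age of the universe by dozens of orders of magnitude, which is why an observed 14 sigma day points to a problem with the model rather than to extraordinary luck.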

Violations of the Central Limit Theorem

Violation 1: Multiplicative Processes
  • Scenario Setup: Consider gamblers starting with $100 who participate in wagers that change their wealth by ±10% with equal probability. Although the wagers are independent random events, their effects multiply rather than add.

  • Wealth Distribution Consequence: After many wagers, the distribution of wealth will not follow a bell curve, because the Central Limit Theorem applies to sums of random events, not products.

  • Distribution Example: This scenario leads to a lognormal distribution. The logarithm of a product is a sum of logarithms, so the logarithm of wealth is approximately normal, and the distribution resembles a bell curve only on a logarithmic scale.
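
The gambler scenario can be simulated to see the skew emerge. The following is a minimal sketch: the $100 starting wealth and ±10% wagers come from the scenario above, while the function name simulate_gamblers, the number of gamblers, the number of wagers, and the seed are illustrative assumptions.

```python
import math
import random
import statistics


def simulate_gamblers(n_gamblers=5_000, n_wagers=100, start=100.0,
                      stake=0.10, seed=0):
    """Return final wealth for each gambler after n_wagers wagers.

    n_gamblers, n_wagers, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_gamblers):
        wealth = start
        for _ in range(n_wagers):
            # Each wager multiplies wealth by 1.1 or 0.9 with equal probability.
            wealth *= (1 + stake) if rng.random() < 0.5 else (1 - stake)
        results.append(wealth)
    return results


wealth = simulate_gamblers()
log_wealth = [math.log(w) for w in wealth]

# On the raw scale a few large winners pull the mean above the median.
print(f"wealth:      mean={statistics.mean(wealth):.1f}  "
      f"median={statistics.median(wealth):.1f}  max={max(wealth):.1f}")
# On the log scale the mean and median should nearly coincide.
print(f"log(wealth): mean={statistics.mean(log_wealth):.2f}  "
      f"median={statistics.median(log_wealth):.2f}")
```

A mean well above the median on the raw scale, together with a near-symmetric log scale, is the signature of a lognormal distribution.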

Lognormal Distributions
  • Characteristics: Lognormal distributions arise from outcomes that grow multiplicatively over time.

    • Real-World Example: A common manifestation of lognormal distributions is seen in city population sizes across the United States.

Violation 2: Interdependence
  • Scenario of Student Clubs: Students can either create a new club or join an existing one, and they tend to join clubs that are already popular. Each student's choice therefore depends on the choices made by earlier students.

    • Outcome: Clubs grow in proportion to their existing popularity. This violates the independence assumption of the Central Limit Theorem, and the resulting distribution of club sizes reflects interdependent choices rather than independent events.
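
The club scenario can be sketched as a simple "rich get richer" simulation. This is one possible model of the scenario, not the lecture's own: the function name simulate_clubs, the 5% chance of founding a new club, the number of students, and the seed are all illustrative assumptions.

```python
import random
import statistics
from collections import Counter


def simulate_clubs(n_students=10_000, p_new=0.05, seed=0):
    """Return club sizes, largest first, under popularity-driven joining.

    p_new, n_students, and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    memberships = [0]  # club id chosen by each student; the first founds club 0
    n_clubs = 1
    for _ in range(n_students - 1):
        if rng.random() < p_new:
            # Found a brand-new club.
            memberships.append(n_clubs)
            n_clubs += 1
        else:
            # Copy the club of a randomly chosen earlier student. This picks
            # each club with probability proportional to its current size.
            memberships.append(rng.choice(memberships))
    return sorted(Counter(memberships).values(), reverse=True)


sizes = simulate_clubs()
print(f"clubs: {len(sizes)}")
print(f"largest five: {sizes[:5]}")
print(f"median size: {statistics.median(sizes)}")
```

In a run like this, most clubs stay tiny while a handful become very large, a far wider spread than independent choices would produce.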

Power Laws
  • Definition: Interdependent processes of this kind often produce power law distributions, in which the frequency of an event falls off as a power of its size. Large events are rare, but far more common than a bell curve would predict.

  • Examples of Power Law Distributions: Common examples include social media subscriber counts, word usage frequency, occurrences of earthquakes, and incidences of armed conflict. Each of these demonstrates a snowball effect whereby outcomes that are already large are more likely to grow larger still.

Key Takeaways

  1. For outcomes following a bell curve, tail risk is low, as summarized by the empirical 68-95-99.7 rule.

  2. When the additivity or independence assumptions of the Central Limit Theorem are violated, long tails appear in the resulting distributions.

  3. In lognormal and power law distributions, extreme events occur far more frequently than a standard bell curve predicts.

References

  • Taleb, Nassim Nicholas. 2007. The Black Swan: The Impact of the Highly Improbable. London: Allen Lane.