Time Series Basics: Concepts, Forecasting Approaches, and Key Components

Why time series analysis?

  • Time is a major factor in many decisions across organizations.
  • Examples:
    • Retail scheduling: retailers hire more employees in the fourth quarter due to holiday sales.
    • UPS increases hiring in Q4 to meet higher shipping demand.
    • Healthcare sees enrollments concentrate in the fall (around October–November).
  • Takeaway: time-based factors influence outcomes and decision-making; capturing this impact is valuable for planning.

Why forecasting is difficult

  • Forecasting uses past experience and knowledge (your own, your competitors', and other lessons learned) to predict the future.
  • Two main challenges:
    1) Developing the knowledge and models is hard; tools exist but the knowledge isn’t easy to build.
    2) Businesses change over time (e.g., upheaval in the retail industry), which reduces the predictive power of past behavior.
  • Nevertheless, time series forecasting is highly sought after because even partial insight (around 20–25%) is valuable when planning for the future.
  • Without insight, organizations face uncertainty during periods of change.

Common forecasting approaches

  • Judgmental forecasting:
    • Based on experience and personal judgment; often lacks a formal scientific basis.
  • Extrapolation (quantitative):
    • Uses time series data to extract the time-specific impact on the output.
    • Techniques include moving averages, cumulative moving averages, seasonality extraction, and seasonality indices.
  • Econometrics:
    • Uses regression to model how time (and other variables) affects the dependent variable.
    • Can include variables like advertising, market changes, product development, etc.

Extrapolation: how it works (quantitative model)

  • Data source: past time series data (e.g., quarterly sales data).
  • Steps in the example workflow:
    • Calculate a moving average: $MA_t = \frac{1}{n} \sum_{i=0}^{n-1} y_{t-i}$
    • Compute a cumulative moving average: $CMA_t = \frac{1}{t} \sum_{i=1}^{t} y_i$
    • Identify seasonality from the series: extract seasonal patterns within the time frame.
    • Develop a seasonality index: $I_j = \frac{\text{Average in season } j}{\text{Overall average}}$
    • Deseasonalize the data: $y_t^* = \frac{y_t}{I_j}$, where $j$ indicates the season/quarter (e.g., Q1, Q2, Q3, Q4).
    • Fit a regression/forecast model on the deseasonalized data to obtain a fitted value: $\hat{y}_t^{reg}$.
    • Forecast and re-seasonalize using the seasonality index: the forecast is based on the regression fit, then scaled back by the appropriate season index.
  • Example of seasonality indices (quarters):
    • Q1: 1.0 (equal to the average)
    • Q2: 1.1 (10% above average)
    • Q3: 1.05 (5% above average)
    • Q4: 0.96 (4% below average)
    • These indices show how each quarter's sales compare with the overall average.
  • Visual outcome you would see:
    • Historical data (in black) and forecast (in red) on a chart.
    • The example dataset is described as fifteen years of data, used to forecast eight quarters: four in 2017 and four in 2018.
  • Summary of the process: deseasonalize → regression/fitted value → forecast → reapply seasonality to obtain final forecast.
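
The workflow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used in the recording: the function names `seasonal_indices` and `seasonal_forecast` are hypothetical, the regression step is assumed to be a straight-line fit, the series is assumed to start in Q1, and the sales figures are made up.

```python
import numpy as np

def seasonal_indices(y, period=4):
    """I_j: average of season j divided by the overall average."""
    y = np.asarray(y, dtype=float)
    season = np.arange(len(y)) % period  # assumes the series starts in season 0 (Q1)
    season_means = np.array([y[season == j].mean() for j in range(period)])
    return season_means / y.mean()

def seasonal_forecast(y, horizon, period=4):
    """Deseasonalize, fit a straight line, forecast, then re-seasonalize."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    index = seasonal_indices(y, period)
    deseasonalized = y / index[t % period]               # y*_t = y_t / I_j
    slope, intercept = np.polyfit(t, deseasonalized, 1)  # linear fit (assumption)
    t_future = np.arange(len(y), len(y) + horizon)
    regression_forecast = intercept + slope * t_future   # forecast on deseasonalized scale
    return regression_forecast * index[t_future % period]  # reapply seasonality

# Made-up quarterly sales: three years, flat level of 100, fixed seasonal pattern
sales = 100 * np.tile([0.9, 1.1, 1.05, 0.95], 3)
print(seasonal_indices(sales))      # approximately [0.9, 1.1, 1.05, 0.95]
print(seasonal_forecast(sales, 4))  # approximately [90, 110, 105, 95]
```

When the series has a strong trend, the season averages also absorb part of that trend, so indices computed this way should be read as rough estimates.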

Problems with this extrapolation approach

  • Perceived as simplistic because it focuses mainly on time without other drivers.
  • Cannot easily incorporate other factors like new product development, market changes, or other external influences.
  • Limitations arise because the model is primarily time-based and may miss important drivers of change.

Econometric models (brief overview)

  • Econometric models use regression to quantify how time and other variables affect the dependent variable (e.g., sales).
  • They allow including multiple explanatory variables (e.g., monthly advertising, promotions) to explain changes.
  • They are more flexible than simple moving-average methods but come with challenges.
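
As a rough illustration of the regression idea, the sketch below fits sales against a time index and an advertising variable using ordinary least squares. The function name `fit_sales_model`, the choice of variables, and all numbers are assumptions made for this example; the data is generated without noise so the known coefficients are recovered.

```python
import numpy as np

def fit_sales_model(time_index, advertising, sales):
    """Least-squares fit of sales = b0 + b1 * time + b2 * advertising."""
    time_index = np.asarray(time_index, dtype=float)
    advertising = np.asarray(advertising, dtype=float)
    sales = np.asarray(sales, dtype=float)
    # Design matrix: a column of ones (intercept), time, and advertising
    X = np.column_stack([np.ones(len(sales)), time_index, advertising])
    coefficients, *_ = np.linalg.lstsq(X, sales, rcond=None)
    return coefficients  # [intercept, time effect, advertising effect]

# Made-up data: sales built from known coefficients (50, 2, 3), no noise
t = np.arange(8)
ads = np.array([10, 12, 9, 15, 11, 14, 13, 16])
sales = 50 + 2 * t + 3 * ads

intercept, time_effect, ad_effect = fit_sales_model(t, ads, sales)
print(intercept, time_effect, ad_effect)  # approximately 50, 2, 3
```

Further explanatory variables (e.g., promotions) would be added as extra columns of the design matrix.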

Shortcomings of econometric models

  • Mathematical and identification challenges: it can be hard to determine what is actually causing observed changes.
  • Lag effects: sometimes the effect of a variable is not immediate.
    • Example discussed: spending $50,000 on advertising in Quarter 1 might not produce sales increases until Quarter 2 due to lag.
  • Autocorrelation: current values can be correlated with past values, complicating inference and model validity.
    • Inventory example: if a product is out of stock today, it is likely to be out of stock tomorrow, creating a dependency between consecutive observations.
    • Autocorrelation can distort standard statistical assumptions and forecasting if not addressed.
  • The recording suggests that autocorrelation will be tackled in a future video; for now, focus on understanding the basic concepts.
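
Both the lag effect and autocorrelation can be made concrete with a small sketch. The helper names `lagged` and `lag1_autocorrelation` are hypothetical, the data is made up, and the autocorrelation uses the common sample formula (products of consecutive deviations from the mean, divided by the sum of squared deviations), which is an assumption rather than something specified in the recording.

```python
import numpy as np

def lagged(x, k=1):
    """Shift x forward by k periods so that x[t-k] lines up with period t."""
    x = np.asarray(x, dtype=float)
    if not 0 <= k < len(x):
        raise ValueError("k must be between 0 and len(x) - 1")
    out = np.full_like(x, np.nan)  # the first k periods have no lagged value
    out[k:] = x[:len(x) - k]
    return out

def lag1_autocorrelation(y):
    """Sample correlation between each observation and the one before it."""
    y = np.asarray(y, dtype=float)
    deviations = y - y.mean()
    return float(np.sum(deviations[1:] * deviations[:-1]) / np.sum(deviations ** 2))

# Made-up advertising spend per quarter; the lagged copy lines Q1 spend up with Q2
ad_spend = [50_000, 0, 0, 0]
print(lagged(ad_spend, k=1))

# Made-up inventory flags: 1 = in stock, 0 = out of stock, in multi-day runs
in_stock = [1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0, 0]
print(lag1_autocorrelation(in_stock))  # positive: each day tends to resemble the day before
```

A lagged column like this can be used as a regressor in place of the same-period value when the effect is expected to arrive one period late.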

Three key concepts in time series (as introduced)

  • Trend:
    • How the dependent variable changes over time, across an extended period (e.g., 15 years).
    • Question to answer: Is the variable increasing, decreasing, or stable over time?
  • Seasonality:
    • A repeating pattern within a defined time period, typically within a year (e.g., quarters).
    • Within-year patterns: changes that recur from Quarter 1 to Quarter 2, Quarter 2 to Quarter 3, and Quarter 3 to Quarter 4 indicate seasonality.
    • Seasonality can also be defined over other periods (e.g., one week). Example question: do we sell more on Mondays vs. Wednesdays?
  • Cycle (cyclical component):
    • Similar to seasonality but with a longer time frame.
    • Example: economic cycles spanning a decade, such as recessions roughly every ten years.
    • A cycle is a longer-term fluctuation that is not tied to a fixed calendar period.
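
The questions attached to trend and seasonality can be checked numerically. The sketch below is one simple way to do so, assuming a straight-line trend and a fixed period length; the function names `trend_slope` and `average_by_position` and the daily sales figures are made up for illustration. It does not attempt to detect cycles.

```python
import numpy as np

def trend_slope(y):
    """Slope of a least-squares line through the series (change per period)."""
    y = np.asarray(y, dtype=float)
    slope, _intercept = np.polyfit(np.arange(len(y)), y, 1)
    return float(slope)

def average_by_position(y, period):
    """Average value at each position within the repeating period."""
    y = np.asarray(y, dtype=float)
    position = np.arange(len(y)) % period
    return np.array([y[position == j].mean() for j in range(period)])

# Made-up daily unit sales for four weeks, Monday first
week = [10, 12, 15, 12, 11, 20, 18]
daily_sales = np.tile(week, 4)

by_weekday = average_by_position(daily_sales, period=7)
print(by_weekday[0] < by_weekday[2])  # Monday vs. Wednesday in this made-up data
print(trend_slope(daily_sales))       # positive = increasing, negative = decreasing
```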

Practical implications and connections

  • Forecasting provides actionable insight even if imperfect (e.g., a partial forecast can guide inventory, staffing, and marketing decisions).
  • Understanding time series components (trend, seasonality, cycle) helps in selecting appropriate forecasting methods and interpreting results.
  • Recognizing limitations (lag, autocorrelation, and missing drivers) points to when econometric models or more sophisticated approaches may be warranted.
  • The discussion foreshadows more advanced econometric techniques that place less emphasis on purely time-based factors and more on causal drivers.

Quick recap of formulas to remember

  • Moving average (n-period): $MA_t = \frac{1}{n} \sum_{i=0}^{n-1} y_{t-i}$
  • Cumulative moving average: $CMA_t = \frac{1}{t} \sum_{i=1}^{t} y_i$
  • Seasonality index (per season j): $I_j = \frac{\bar{y}_j}{\bar{y}}$
    • where $\bar{y}_j$ is the average in season j and $\bar{y}$ is the overall average
  • Deseasonalized value: $y_t^* = \frac{y_t}{I_j}$
  • Forecast after deseasonalizing and fitting: $\hat{y}_{t+h}^{reg}$ is the regression forecast on the deseasonalized data; then reapply seasonality: $\hat{y}_{t+h} = \hat{y}_{t+h}^{reg} \cdot I_{j+h}$
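
The two averaging formulas are easy to check by hand on a tiny series. The sketch below is a minimal NumPy version; the function names are hypothetical and the four data points are made up.

```python
import numpy as np

def moving_average(y, n):
    """n-period moving average: mean of y_t and the n - 1 values before it.

    The result starts at the first period that has n observations available.
    """
    y = np.asarray(y, dtype=float)
    return np.convolve(y, np.ones(n) / n, mode="valid")

def cumulative_moving_average(y):
    """Average of all observations from the first period up to period t."""
    y = np.asarray(y, dtype=float)
    return np.cumsum(y) / np.arange(1, len(y) + 1)

y = [2, 4, 6, 8]
print(moving_average(y, n=2))        # values 3, 5, 7
print(cumulative_moving_average(y))  # values 2, 3, 4, 5
```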

Note on sources and scope

  • This set of notes captures the key ideas, methods, examples, and terminology presented in the recording.
  • It emphasizes the rationale for time-based forecasting, the main forecasting approaches discussed, the specifics of an extrapolation-based workflow with seasonality, and the core concepts of trend, seasonality, and cycle.