Forecasting fundamentals and time series techniques (Module summary)
Forecasting fundamentals
- A forecast is a statement about the future value of a variable of interest.
- Forecasts are used to balance supply and demand by providing a planning tool or model; without forecasts, decisions would be guesses and less effective.
- Two important aspects of forecasts:
- Expected level of demand, based on structural variation such as trend or seasonal variation. Examples:
- Web traffic for a retail store tends to trend upward as more people shop online.
- There is seasonal variation, e.g., more hits in December than in other months.
- Forecast accuracy, related to the potential size of forecast error; it measures how close the forecast is to the actual value.
- Forecasting techniques share common features:
- They assume an underlying causal system that existed in the past and will persist into the future.
- Good managers should not rely solely on the model; they must monitor the environment for changes (e.g., tax cuts, new inventions, weather events) that can alter demand.
- Forecasts are imperfect; randomness can interfere with accuracy.
- Forecasts for groups of items are usually more accurate than for individual items due to the canceling effect.
- Canceling effect (group forecasting): forecasting for a group (e.g., all trail bikes) tends to be more accurate than forecasting for individual items because random errors can cancel out when aggregated.
- Forecast accuracy typically declines as the forecasting horizon increases: longer horizons are harder to forecast accurately (e.g., 20-year vs 10-year vs 1-year forecasts).
- Good forecast attributes:
- Timely: provides enough time to ramp up production if needed.
- Accurate: has a measurable error bound.
- Reliable: stable across runs; avoid garbage-in garbage-out.
- Expressed in meaningful units (e.g., dollars, units, etc.) and in a usable format for the audience.
- Simple to understand: not overly complicated; stakeholders should have confidence in the forecast.
- Simpler forecasting techniques (e.g., naive approach) are popular because they are easy to understand, even if they may be less accurate.
Forecasting process steps
Step 1: Determine the purpose of the forecast, which informs the required level of detail (e.g., big-picture planning vs. production scheduling for the next twelve months).
Step 2: Establish the time horizon (daily, monthly, yearly) to understand the required accuracy and suitable techniques.
Step 3: Gather historical data; obtain, clean, and analyze the data; check for accuracy and reasonableness (dirty data will degrade forecasts).
Step 4: Select a forecasting technique.
Step 5: Produce the forecast.
Step 6: Monitor forecast error and adjust as needed; forecasting is a cycle, not a single event, because errors reveal how the model can be improved.
Forecast error: the difference between the actual value and the forecast value.
- If the error falls outside established boundaries (e.g., you forecast 100 bikes and sell 80), a corrective action may be required (revise the model or use a different technique).
- Forecast accuracy is assessed using historical performance through accuracy metrics.
Forecast accuracy metrics (MAD, MSE, MAPE)
There are three common forecast accuracy metrics:
Mean Absolute Deviation (MAD):
MAD = \frac{\sum_{t=1}^{n} |A_t - F_t|}{n}
where A_t = actual value at time t, F_t = forecast at time t, and n = number of periods.
Mean Squared Error (MSE):
MSE = \frac{\sum_{t=1}^{n} (A_t - F_t)^2}{n - 1}
Squaring emphasizes larger errors.
Mean Absolute Percentage Error (MAPE):
MAPE = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right| \times 100
Expresses error as a percentage of the actual value.
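The three metrics above can be sketched directly in code. This is a minimal illustration, not a library implementation; note that it follows this module's convention of dividing MSE by n - 1 (many texts divide by n), and the sample data below are hypothetical.

```python
def mad(actual, forecast):
    """Mean Absolute Deviation: average |A_t - F_t| in original units."""
    n = len(actual)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / n

def mse(actual, forecast):
    """Mean Squared Error, with the n - 1 denominator used in this module."""
    n = len(actual)
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / (n - 1)

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent of the actual values."""
    n = len(actual)
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) * 100 / n

# Hypothetical actuals and forecasts for five periods.
actual = [110, 107, 112, 109, 111]
forecast = [107, 108, 110, 110, 112]
print(mad(actual, forecast))   # (3 + 1 + 2 + 1 + 1) / 5 = 1.6
print(mse(actual, forecast))   # (9 + 1 + 4 + 1 + 1) / 4 = 4.0
print(mape(actual, forecast))  # average percentage error
```

Because MAD and MSE are in (squared) original units while MAPE is unitless, MAPE is the one to use when comparing accuracy across different items.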
Worked example (illustrative data from transcript):
- Data (periods 1–5):
- Period 1: A_1 = 110, F_1 = 107, error e_1 = A_1 - F_1 = 3 (the transcript shows -3 under the opposite sign convention; the absolute error is 3 either way).
- Periods 2–5: the remaining values are not shown in the transcript.
- Provided summary values in the transcript:
- Sum of absolute errors: \sum_{t=1}^{5} |A_t - F_t| = 13
- Sum of squared errors: \sum_{t=1}^{5} (A_t - F_t)^2 = 9.75
- Sum of absolute percentage errors (per-period, in percent): \sum_{t=1}^{5} \left| \frac{A_t - F_t}{A_t} \right| \times 100 \approx 11.23
- From these sums:
- MAD = 13 / 5 = 2.6
- MSE = 9.75 / (5 - 1) = 9.75 / 4 = 2.4375 (this module uses n - 1 in the denominator)
- MAPE = 11.23 / 5 ≈ 2.246% ≈ 2.25%
Interpretation:
- MAD provides average magnitude of errors in the original units.
- MSE penalizes larger errors more strongly due to squaring.
- MAPE expresses error as a percentage of actual demand, enabling comparison across items or time periods.
- These metrics are used to compare different forecasting techniques using the same data; the goal is to minimize these error measures.
Forecasting approaches: qualitative vs quantitative
- Qualitative forecasting:
- Relies on soft information and human judgment (e.g., CEO intuition, expert opinion, market perception).
- Difficult to quantify precisely and often used to supplement quantitative methods.
- Quantitative forecasting:
- Relies on hard data and statistical methods; increasingly dominant in the era of big data.
- Provides objective, reproducible results.
- Interaction of approaches:
- Qualitative insights can inform the selection or adjustment of quantitative models (e.g., recognizing anomalies or likely future events).
- Example from transcript: airline passengers may spike in October due to a World Series-related event in a given year; this kind of one-off effect may not repeat next year, so qualitative judgment helps avoid overfitting to such an anomaly.
Time series forecasting concepts (patterns and components)
- Time series: a sequence of observations taken at regular time intervals (e.g., customers served per hour, sandwiches made per day).
- Time series forecast aims to estimate future values by identifying patterns in recent observations.
- Common time series behaviors/patterns (shown in transcript):
- Trend: a general upward or downward movement over time (e.g., an upward trend in demand).
- Irregular variation: a short-lived, unexplained spike (e.g., a weather event causing a sudden surge in demand).
- Random variation: unpredictable fluctuations without a discernible pattern.
- Seasonality: regular, predictable patterns that repeat within fixed periods (e.g., higher demand in December, or weekly patterns such as supermarket sales peaking on weekends).
- Cycles: repeated rises and falls with a length that is not fixed to a specific period (e.g., cycles in housing markets or inflationary periods).
- Examples discussed in transcript:
- Half-hourly electricity demand in England and Wales showing seasonality/recurrence over weeks.
- Monthly sales of new single-family houses in the US from 1973–1995, showing strong seasonality and cyclic behavior with periods roughly 6–10 years.
- Supermarket sales showing weekly seasonality (higher spending on Fridays/Saturdays) and holiday-related seasonal increases.
Forecasting techniques introduced in the module
Naive forecasting:
- Very simple: the forecast for the next period equals the last observed value.
- Works best for stable time series with little to no trend or seasonality; easy to understand and implement.
- Example (transcript): for a stable series with a last data point of 64 (period 5), the naive forecast for period 6 is 64.
Averaging forecasts (includes moving average variants, discussed next):
- Averages recent observations to smooth fluctuations and generate forecasts.
- Useful when the series tends to vary around a central level.
Moving average (simple moving average):
- Forecast is the average of the most recent m actual observations.
- Formula:
\hat{y}_{t+1} = \frac{1}{m} \sum_{i=0}^{m-1} y_{t-i}
- Example from transcript: a 5-period moving average over the most recent five observations yields a forecast of 61 (compared with the naive forecast of 64).
- Characteristics:
- As new data become available, older data are dropped; tends to lag the actual values.
- Smoothing effect increases with larger m; decreases responsiveness to changes in the series.
- Treats all data points equally (equal weighting).
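The naive and simple moving-average forecasts above can be sketched as follows. The series values are illustrative, taken from the data used in this module's examples (the period-5 value of 64 matches the naive example); the 3-period window is an arbitrary choice for demonstration.

```python
def naive_forecast(series):
    """Naive forecast: next period equals the last observed value."""
    return series[-1]

def moving_average_forecast(series, m):
    """Forecast the next period as the mean of the last m observations."""
    window = series[-m:]
    return sum(window) / len(window)

# Illustrative series of actual demand for periods 1-5.
demand = [60, 65, 55, 58, 64]
print(naive_forecast(demand))               # 64, as in the transcript example
print(moving_average_forecast(demand, 3))   # (55 + 58 + 64) / 3 = 59.0
print(moving_average_forecast(demand, 5))   # mean of all five periods
```

Increasing m smooths the forecast but makes it lag the series more, which is the trade-off noted above.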
Weighted moving average:
- More recent observations get higher weights; weights sum to 1.
- Rationale: recent data are typically better predictors than older data.
- General formula:
\hat{y}_{t+1} = \sum_{i=0}^{m-1} w_i \, y_{t-i}, \quad \sum_{i=0}^{m-1} w_i = 1
- Example from transcript (illustrative weights): use the last three data points with weights 0.5 (most recent), 0.3, and 0.2; the forecast is 0.5 y_t + 0.3 y_{t-1} + 0.2 y_{t-2} = 60.4 (rounded to 60 for practicality).
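A minimal sketch of the weighted moving average, using the transcript's weights (0.5, 0.3, 0.2) and a series whose last three values (55, 58, 64) reproduce the 60.4 forecast from the example:

```python
def weighted_ma_forecast(series, weights):
    """Weighted moving average; weights[0] applies to the most recent value
    and the weights must sum to 1."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    recent = series[::-1][:len(weights)]  # most recent observation first
    return sum(w * y for w, y in zip(weights, recent))

demand = [60, 65, 55, 58, 64]
# 0.5 * 64 + 0.3 * 58 + 0.2 * 55 = 60.4
print(weighted_ma_forecast(demand, [0.5, 0.3, 0.2]))
```

Setting all weights equal to 1/m recovers the simple moving average, so the weighted version is a strict generalization.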
Exponential smoothing (also called single exponential smoothing):
- A weighted moving average where weights decline exponentially for older observations; uses a smoothing constant α ∈ (0, 1).
- Concept: forecast for the next period equals previous forecast plus a portion of the previous forecast error.
- Formula (as described in transcript):
F_{t+1} = F_t + \alpha (A_t - F_t) = (1 - \alpha) F_t + \alpha A_t
- Example from transcript (α = 0.4):
- Period 2 forecast: initialized with the naive method, F_2 = A_1 = 60.
- Period 3 forecast: F_3 = F_2 + 0.4 (A_2 - F_2) = 60 + 0.4 (65 - 60) = 62.
- Period 4 forecast: F_4 = F_3 + 0.4 (A_3 - F_3) = 62 + 0.4 (55 - 62) = 59.2.
- Period 5 forecast: F_5 = F_4 + 0.4 (A_4 - F_4) = 59.2 + 0.4 (58 - 59.2) = 58.72.
- Period 6 forecast: F_6 = F_5 + 0.4 (A_5 - F_5) = 58.72 + 0.4 (64 - 58.72) ≈ 60.83, as stated in the transcript.
- Characteristics:
- Responsiveness is controlled by the smoothing constant: a larger α reacts quickly to recent changes, while a smaller α yields a smoother, less responsive forecast.
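The recursion above can be sketched in a few lines. This reproduces the transcript's α = 0.4 example, using the series [60, 65, 55, 58, 64] as the actuals for periods 1–5 and the naive value A_1 as the initial forecast:

```python
def exponential_smoothing(actuals, alpha, initial_forecast):
    """Single exponential smoothing: F_{t+1} = F_t + alpha * (A_t - F_t).

    Given actuals for periods 1..n, returns the forecasts [F_2, ..., F_{n+1}].
    """
    forecasts = [initial_forecast]          # F_2, initialized naively here
    for a in actuals[1:]:                   # fold in A_2 .. A_n
        f = forecasts[-1]
        forecasts.append(f + alpha * (a - f))
    return forecasts

demand = [60, 65, 55, 58, 64]               # actuals for periods 1-5
fcasts = exponential_smoothing(demand, 0.4, demand[0])
print(fcasts)  # ≈ [60, 62, 59.2, 58.72, 60.83] for periods 2-6
```

Writing the update as (1 - α) F_t + α A_t shows that each forecast is a convex combination of the previous forecast and the latest actual, which is why older observations fade out exponentially.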
Overall takeaway from the techniques:
- Four techniques were demonstrated: Naive, Moving Average, Weighted Moving Average, and Exponential Smoothing.
- No single method is best for all data; model performance depends on the time series structure (trend, seasonality, variability).
- The next module covers additional time series forecasting techniques and their applications.
Connections to practice and real-world relevance
- Forecasts are used to balance supply and demand, plan capacity, manage inventory, and schedule production.
- Accuracy metrics (MAD, MSE, MAPE) provide quantitative means to compare forecasting methods and choose an appropriate technique for a given dataset.
- Time series components (trend, seasonality, cycles, irregular and random variations) guide which forecasting method to apply. For example:
- Strong seasonality might favor methods that explicitly account for seasonal effects.
- High random variation may require smoother methods with more history, or alternative approaches that can handle noise.
- Ethical and practical implications:
- Relying on forecasts without considering external changes (policy shifts, technology changes) can lead to poor decisions.
- Overreliance on a single model can reduce resilience; combining qualitative insights with quantitative forecasts can improve robustness.
- Simpler models (like the naive forecast) can be useful benchmarks and are often easier to explain to stakeholders, even if they are not the most accurate.
Quick recap: key takeaways
- Forecasting aims to predict future values and to support decision-making under uncertainty.
- Forecast quality depends on: usefulness of the forecast, accuracy (error magnitude), timeliness, reliability, meaningful units, and simplicity.
- Forecasting involves a cycle: define purpose, determine horizon, collect and clean data, choose method, forecast, monitor and adjust.
- Error metrics MAD, MSE, and MAPE help quantify forecast accuracy and compare methods.
- Time series methods include naive forecasting, moving averages (simple and weighted), and exponential smoothing; each has trade-offs in simplicity, lag, and responsiveness.
- Real-world data exhibit trend, seasonality, cycles, and irregular/random variations; appropriate modeling requires recognizing these patterns and choosing suitable techniques.