What is a time series?
A sequence of data points recorded at regular time intervals.
What are some examples of time series frequency?
Daily, weekly, monthly, quarterly, yearly.
What is the primary purpose of forecasting in organizations?
Better planning and informed decision-making.
What are two objectives of time series analysis?
1. Explain patterns in historical data. 2. Predict future values of the time series.
What is a time series model?
A model that quantifies the relationship between current and historical values, and current exogenous variables.
What are the two criteria for a good time series model?
1. Parsimonious structure. 2. High forecasting performance.
What is R Markdown used for?
To generate reproducible analytics reports combining R code and text.
What is the importance of good visualization in time series analysis?
It helps in understanding series patterns and exceptions, preparing for modeling.
What is the function to convert a vector into a time series object in R?
ts(data, start, end, frequency)
How can data from an Excel file be imported into R?
Using read_excel(file, col_names = TRUE) from the readxl package, or read.csv(file, header = TRUE) after saving the sheet as a CSV file.
What is the purpose of the 'window' function in R?
To subsample a time series.
What is a key benefit of hierarchical forecasting in the retail industry?
It allows for better planning across different subcategories and regions.
What are some applications of time series forecasting in economics?
Predicting inflation and unemployment rates.
What is one way to achieve the objectives of time series analysis?
By learning a time series model based on observed historical data.
What is the significance of the 'forecast' R package?
It provides functions for time series forecasting.
What is the expected outcome of mastering time series forecasting techniques?
Proficiency in implementing statistical learning techniques for real-world business decision-making.
What is the role of exogenous data in time series forecasting?
It can provide additional context and improve prediction accuracy.
What are the main components of a time series decomposition?
Level, trend, seasonality, exogenous factors, and noise (𝜖).
What is the difference between additive and multiplicative decomposition?
Additive: Y = level + trend + seasonality + noise; Multiplicative: Y = level × trend × seasonality × noise.
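A small sketch of how the two forms relate: taking logarithms of a multiplicative series gives an additive decomposition of log(Y). The component values below are made-up numbers for a single time point, used only for illustration.

```python
import math

# Hypothetical component values for one time point (illustration only).
level, trend, seasonality, noise = 100.0, 1.05, 1.20, 0.98

# Multiplicative decomposition: the components are multiplied.
y_mult = level * trend * seasonality * noise

# Taking logs turns the product into a sum, which is an additive
# decomposition of log(Y).
log_sum = sum(math.log(c) for c in (level, trend, seasonality, noise))

print(y_mult, math.exp(log_sum))  # the two values agree up to rounding
```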
What is the purpose of using regression-based time series forecasting?
To capture each component of $y_t$ using covariates $x_t$, without including the history of $Y$ itself.
How can seasonality be captured in regression models?
Using dummy variables or Fourier series.
What is a linear trend model?
A model where trend = β0 + β1t, representing a deterministic trend.
What is the significance of stationarity in time series forecasting?
A stationary series has statistical properties (such as mean and variance) that do not change over time, so its future behavior is similar to its past behavior in a probabilistic sense.
What does the function tslm() do in R?
Estimates a regression-based time series model.
What are the two approaches to handle multiple seasonality in time series?
1) Remove one seasonality via aggregation; 2) Add two sets of dummy variables or Fourier series.
What is the role of the noise term (𝜖) in time series models?
It represents random, unpredictable variations and ideally should be a stationary process.
What is the purpose of Fourier series in time series analysis?
To capture seasonality using periodic sine and cosine functions.
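As a rough illustration, the sine and cosine regressors for a seasonal period m can be generated as below. This is a plain Python sketch of the idea, not a reproduction of R's fourier(); the function name and the argument K (number of harmonics) are assumptions made for the example.

```python
import math

def fourier_terms(t, m, K):
    """Return [sin_1, cos_1, ..., sin_K, cos_K] for time index t and period m."""
    terms = []
    for k in range(1, K + 1):
        angle = 2 * math.pi * k * t / m
        terms.extend([math.sin(angle), math.cos(angle)])
    return terms

# Monthly data (m = 12) with two harmonics gives four regressors per time point.
print(fourier_terms(3, 12, 2))
```

Because every term has period m, the regressors at t and t + m coincide, which is what lets them represent a repeating seasonal pattern.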
What is a dynamic regression model?
An extension of regression models that imposes autocorrelation on the noise term (𝜖).
What is the function of the frequency parameter in time series analysis?
It indicates the number of observations per full seasonal cycle.
What is the main advantage of using dummy variables for seasonality?
They are simple, flexible, and interpretable.
What is the disadvantage of using Fourier series for seasonality?
They are less interpretable compared to dummy variables.
What is the Box-Cox transformation?
A method to stabilize variance and make the data more normally distributed, often used in conjunction with regression models.
What does the residual process (et) represent in a fitted model?
It is an estimate of the noise process (𝜖), calculated as $e_t = y_t - \hat{y}_t$.
What is the significance of the trend component in time series?
It indicates long-term growth or decline in the data.
What is the purpose of the Amtrak Ridership time series analysis?
To quantify components of ridership patterns and create forecasts for the next 12 months.
What are the key R functions mentioned for time series forecasting?
ts(), tslm(), fourier(), and functions for data aggregation like window() and aggregate().
What is the expected outcome of time series forecasting?
To create reliable predictions about future data points based on historical patterns.
What is the difference between additive and multiplicative models in time series?
Additive models sum the components, while multiplicative models multiply them.
What R functions are commonly used for regression-based time series forecasting?
ts(), window(), aggregate(), tapply(), tslm(), and fourier().
What is the purpose of the Box-Cox transformation in time series analysis?
To stabilize the variance of a time series over time.
What is a point forecast in regression-based models?
The expectation of the predicted value based on the estimated regression model.
What is the significance of training and validation data in time series forecasting?
Training data is used for model estimation, while validation data is used for model evaluation.
What are common accuracy measures for forecasting models?
MAE (Mean Absolute Error), MAPE (Mean Absolute Percentage Error), RMSE (Root Mean Squared Error), ME (Mean Error), and MPE (Mean Percentage Error).
How can non-stationarity in a time series be handled?
By using exogenous covariates or estimating the model only with observations after an event.
What is the formula for a linear trend with seasonal dummies in regression-based time series?
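$y_t = \beta_0 + \beta_1 t + \gamma_2 D_{t,2} + \gamma_3 D_{t,3} + \cdots + \gamma_m D_{t,m} + \varepsilon_t$, where $D_{t,j} = 1$ if period $t$ falls in season $j$ and 0 otherwise.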
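A minimal sketch of how one row of regressors for this model could be built and used for a point forecast. It assumes time is indexed t = 1, 2, ... with t = 1 falling in season 1 (the baseline season without a dummy); the function names and coefficient values are hypothetical and not estimated from any data.

```python
def design_row(t, m):
    """Row [1, t, D_{t,2}, ..., D_{t,m}] for time index t and m seasons."""
    season = (t - 1) % m + 1  # season label in 1..m, assuming t = 1 is season 1
    dummies = [1 if season == j else 0 for j in range(2, m + 1)]
    return [1, t] + dummies

def predict(coefs, t, m):
    """Point forecast: inner product of coefficients and regressors."""
    return sum(c * x for c, x in zip(coefs, design_row(t, m)))

# Hypothetical coefficients [beta0, beta1, gamma2, gamma3, gamma4] for quarterly data.
coefs = [10, 0.5, 2, -1, 3]
print(design_row(6, 4))      # t = 6 is in season 2, so only D_{t,2} is 1
print(predict(coefs, 6, 4))  # 10 + 0.5 * 6 + 2
```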
What is the purpose of the forecast() function in R?
To generate h steps ahead forecast based on an estimated time series model.
What does the accuracy() function do in R?
Computes the error metrics for forecasting by comparing predicted values with actual values.
What is the role of the validation period in time series partitioning?
To assess the model's forecast performance on unseen data.
What is the expected outcome of a model that fits the data well?
It may not necessarily forecast well due to the risk of overfitting.
What is the significance of the λ parameter in the Box-Cox transformation?
It controls the type of transformation applied; λ=0 corresponds to log transformation.
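For reference, a short sketch of the standard Box-Cox formula, (y^λ − 1)/λ for λ ≠ 0 and log(y) for λ = 0, assuming strictly positive data. The function name is chosen for this example.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1) / lam

# As lam approaches 0 the transform approaches the log transformation.
print(box_cox(10, 0), box_cox(10, 1e-6))
```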
What is the formula for the MAPE metric?
$\mathrm{MAPE} = 100 \times \frac{1}{T} \sum_t |e_t / y_t|$, where $e_t$ is the forecast error.
What is the importance of the forecast horizon in time series forecasting?
It determines how far into the future the forecasts will be made.
What is the implication of having a training data set that has never seen the validation data?
It ensures that the model's performance is evaluated on truly unseen data.
What is the formula for the RMSE metric?
$\mathrm{RMSE} = \sqrt{\frac{1}{T} \sum_t e_t^2}$, where $e_t$ is the forecast error.
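The error measures on these cards can be sketched in a few lines. This is a plain Python illustration of the formulas, not the implementation behind R's accuracy(); the forecast error is taken as actual minus predicted, and the sample numbers are made up.

```python
import math

def accuracy_measures(actual, predicted):
    """ME, MAE, RMSE, MPE and MAPE with errors e_t = actual_t - predicted_t."""
    e = [y - f for y, f in zip(actual, predicted)]
    T = len(e)
    return {
        "ME": sum(e) / T,
        "MAE": sum(abs(x) for x in e) / T,
        "RMSE": math.sqrt(sum(x * x for x in e) / T),
        "MPE": 100 * sum(x / y for x, y in zip(e, actual)) / T,
        "MAPE": 100 * sum(abs(x / y) for x, y in zip(e, actual)) / T,
    }

# Hypothetical validation data and forecasts.
actual = [100, 200, 400]
predicted = [110, 190, 400]
measures = accuracy_measures(actual, predicted)
print(measures)
```

In this example the positive and negative errors cancel in ME, while MAE and RMSE still register them, which is why the signed and absolute measures are reported together.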
What does the term 'seasonality' refer to in time series analysis?
The periodic fluctuations that occur at regular intervals, such as daily, weekly, or monthly.
What is the expected result of using a model with too many parameters?
It can lead to overfitting, where the model fits the training data very closely (including its noise) but performs poorly on new data.
What does the term 'exogenous factors' mean in the context of time series forecasting?
External influences that can affect the time series, such as promotions or holidays.
How is the log transformation related to the Box-Cox transformation?
The log transformation is a specific case of the Box-Cox transformation when λ=0.
What are the two main types of time series models discussed?
Additive and multiplicative models.
What is the difference between trend and seasonality in time series forecasting?
Trend is a non-periodic long-term movement, while seasonality refers to periodic fluctuations such as daily, weekly, or monthly effects.
What is the purpose of Box-Cox transformation in time series forecasting?
To stabilize variance and make the data more normally distributed, allowing for better modeling.
What is a point forecast?
An estimate of the expected value of $y_{T+h}$, the observation $h$ steps beyond the last observed time $T$; producing it is a mean estimation problem.
What is an interval forecast?
A forecast that establishes a range (e.g., $a < y_{T+h} < b$) with a specified probability, indicating uncertainty in predictions.
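One common way to build such a range, sketched here under the assumption that forecast errors are normally distributed with a known standard deviation (an assumption added for this example, not stated on the card): take the point forecast plus or minus a normal quantile times the standard deviation. The function name and numbers are illustrative.

```python
from statistics import NormalDist

def normal_interval(point, sigma, coverage=0.95):
    """Symmetric interval around a point forecast, assuming normal errors."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return point - z * sigma, point + z * sigma

# Hypothetical point forecast of 100 with error standard deviation 10.
lower, upper = normal_interval(100, 10)
print(lower, upper)
```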
What are macroeconomic indicators?
Time series data that measure aggregate economic activity, such as GDP, unemployment rate, inflation, or retail sales.
What is the significance of the CPI index for the Federal Reserve?
It is a key inflation index used in setting the federal funds target rate (FFTR).
What R functions are commonly used for regression-based time series forecasting?
Functions include ts(), window(), aggregate(), tapply(), tslm(), fourier(), cbind(), time(), forecast(), and accuracy().
How is data partitioning structured for time series forecasting?
Chronologically: the earlier observations form the training data and the most recent ones the validation data, e.g., the first 25 years for training and the subsequent 3 years for validation.
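A minimal sketch of a chronological split, holding out the last observations rather than a random sample. The function name is an assumption, and the series is a placeholder for 28 years of monthly data (336 points) with the final 36 months held out.

```python
def split_train_valid(series, n_valid):
    """Split a series in time order: everything except the last n_valid points, then the rest."""
    if not 0 < n_valid < len(series):
        raise ValueError("n_valid must be between 1 and len(series) - 1")
    return series[:-n_valid], series[-n_valid:]

# Placeholder series: 28 years of monthly observations.
series = list(range(336))
train, valid = split_train_valid(series, 36)
print(len(train), len(valid))
```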
What is the purpose of using seasonal dummies in regression models?
To account for seasonal effects in the data by including indicator variables for different seasons.
What is the role of exogenous variables in time series forecasting?
Exogenous variables are external factors that can influence the dependent variable, such as promotions or economic events.
What does the term 'non-stationarity' refer to in time series analysis?
Non-stationarity refers to a time series whose statistical properties, like mean and variance, change over time.
What is the effect of the 2008 financial crisis in time series modeling?
It can be modeled as an indicator variable to assess its impact on economic indicators.
What are the key steps in constructing a data matrix for exogenous variables?
Create binary indicators for events (e.g., post-crisis) and combine them with other relevant variables.
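The steps can be sketched as follows. The years, the second covariate, and the choice to treat the event year itself as "post-event" are all assumptions made for illustration.

```python
def post_event_indicator(times, event_time):
    """1 for observations at or after the event, 0 before it."""
    return [1 if t >= event_time else 0 for t in times]

years = [2005, 2006, 2007, 2008, 2009, 2010]
post_crisis = post_event_indicator(years, 2008)

# Hypothetical second exogenous variable, e.g. a promotion flag.
promo = [0, 1, 0, 0, 1, 0]

# Combine the columns into a data matrix with one row per time point.
X = [list(row) for row in zip(post_crisis, promo)]
print(X)
```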
What is the purpose of the 'forecast' function in R?
To generate forecasts from a fitted model; their accuracy is then evaluated separately with accuracy().
What is the significance of the 'fourier' function in time series analysis?
It is used to model seasonal patterns in time series data through Fourier series expansion.
What are the common metrics for forecast evaluation?
Metrics include Mean Error (ME), Mean Percentage Error (MPE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Square Error (RMSE).
What does the term 'training data' refer to in the context of time series forecasting?
Data used to fit the model, typically consisting of historical observations.
What does 'validation data' refer to in time series forecasting?
Data used to test the model's predictive performance, typically consisting of more recent observations.
What is the purpose of data visualization in time series analysis?
To identify patterns, trends, and anomalies in the data that can inform forecasting models.
Differencing
Taking the difference between two observations of the time series.
Lag-1 difference
$w_t = y_t - y_{t-1}$, for (approximately) removing trend.
Lag-$m$ difference
$w_t = y_t - y_{t-m}$, for removing seasonality with $m$ seasons.
Double-differencing
Difference the differenced series.
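The three differencing operations can be sketched with one small helper. This is a plain Python illustration (R's diff() provides the same operation through its lag argument); the sample series are made up.

```python
def difference(y, lag=1):
    """Lagged differences w_t = y_t - y_{t-lag}; the result is lag points shorter."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

linear = [2, 4, 6, 8]            # linear trend
seasonal = [1, 5, 1, 5, 1, 5]    # pattern repeating every 2 periods
quadratic = [1, 4, 9, 16, 25]    # curved trend

print(difference(linear))                 # lag-1 difference removes the linear trend
print(difference(seasonal, lag=2))        # lag-m difference with m = 2 removes the seasonality
print(difference(difference(quadratic)))  # double differencing removes the quadratic trend
```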
Simple moving average (SMA)
Forecast future local mean by using an average of several past points: $\mathrm{SMA}_t = (y_t + y_{t-1} + \cdots + y_{t-w+1}) / w$.
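A short sketch of the trailing average described here, in plain Python rather than R's rollmean(). The function name and sample series are assumptions.

```python
def sma(y, w):
    """Trailing simple moving averages: each value averages the latest w points."""
    if not 1 <= w <= len(y):
        raise ValueError("w must be between 1 and len(y)")
    return [sum(y[i - w + 1 : i + 1]) / w for i in range(w - 1, len(y))]

y = [2, 4, 6, 8, 10]
smoothed = sma(y, 3)
print(smoothed)      # one average per complete window
print(smoothed[-1])  # latest average, used as the forecast of the next local level
```

With w equal to the series length the result collapses to a single overall mean, and with w = 1 it returns the series unchanged, which illustrates the trade-off controlled by the window width.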
Key parameter of SMA
Width of the window, $w$.
Effects of different window sizes
The more we smooth the data (i.e., larger $w$), the more clearly we see the 'signal' instead of the 'noise'.
Visualization with SMA
SMA can also be used as a visualization tool, which helps reveal the local level of a time series.
Forecast with SMA
SMA can only be used to forecast the short-term local level; it cannot capture long-term trend or seasonality.
Simple exponential smoothing (SES)
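Whereas the simple moving average (SMA) takes an equally-weighted average of the last $w$ data points, SES takes a weighted average of all past points with exponentially decaying weights, so recent observations count the most.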
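For contrast with the equally weighted SMA, here is a minimal sketch of the SES recursion, assuming the standard form level_t = α·y_t + (1 − α)·level_{t−1} with a smoothing weight α between 0 and 1. The function name, the use of the first observation as the starting level, and the sample numbers are illustrative assumptions; R's ets() estimates these quantities from data rather than taking them as inputs.

```python
def ses(y, alpha, initial_level=None):
    """Smoothed levels from simple exponential smoothing."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # Assumption: start from the first observation unless a level is supplied.
    level = y[0] if initial_level is None else initial_level
    levels = []
    for obs in y:
        level = alpha * obs + (1 - alpha) * level
        levels.append(level)
    return levels

y_obs = [10, 20, 30]
levels = ses(y_obs, alpha=0.5)
print(levels)      # smoothed level after each observation
print(levels[-1])  # forecast of the next value is the latest level
```

A larger α puts more weight on the newest observation, so α = 1 simply reproduces the series.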
Local level model
After differencing, we have $y_t = l_{t-1} + e_t$, where $l_{t-1}$ is the local mean of the time series.
Global mean model
$y_t = \mu + e_t$.
Smoothing methods in R
Functions include diff() (base R), rollmean() (zoo package), and ets() (forecast package).
Time series decomposition
Decompose Y into several components: Y = level + trend + seasonality + 𝜖.
Trend
Non-periodic pattern over time, can be upward or downward.
Seasonality
Periodic pattern that repeats every season, such as daily, weekly, or monthly effects.
Forecasting future local mean
Under the local level model, $\mathbb{E}[y_{t+1}] = l_t$, so the forecast of the next value is the current local mean.
SMA formula
$\mathrm{SMA}_t = (l_{t-1} + \cdots + l_{t-w}) / w + (e_t + \cdots + e_{t-w+1}) / w$.
Smoothing vs. Responsiveness
A clearer signal (more smoothing) comes at the expense of picking up the latest changes in the signal later.
Random noise/shock
Represented by $e_t \sim (0, \sigma^2)$ in the local level model, i.e., a shock with mean zero and variance $\sigma^2$.
Average in SMA
SMA takes an equally-weighted average of the last $w$ data points.