Data Transformation Flashcards
- Focus on transformations involving a single variable.
Natural Logs
- Natural logs are frequently used in business and economics for data transformation to simplify analysis and obtain quick answers.
- Natural Logarithm and Exponential Function: Quick Refresh
- If a^b = x, then log_a(x) = b.
Natural Log
- It is the logarithm to base e, where e is an irrational number approximately equal to 2.7183.
- Notation: ln(x) = log_e(x), where x must be positive.
Exponential Function
- The exponential of x is commonly written as e^x.
- If y = e^x, then x = ln(y).
Approximating Natural Log and Exponential Functions
- Approximations are used to simplify formulas and get quick answers, but they only work if x is small.
- For small x, ln(1+x)≈x.
- For small x, e^x ≈ 1 + x.
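A quick way to see how small x has to be is to compare each function with its approximation. The Python sketch below is illustrative only; the function names and the test values of x are assumptions, not part of the original notes.

```python
import math

def log_approx_error(x):
    """Absolute error of the approximation ln(1 + x) ≈ x."""
    return abs(math.log(1 + x) - x)

def exp_approx_error(x):
    """Absolute error of the approximation e^x ≈ 1 + x."""
    return abs(math.exp(x) - (1 + x))

# Illustrative values of x: the error grows quickly as x moves away from 0.
for x in (0.01, 0.05, 0.10, 0.50):
    print(f"x={x:.2f}  log error={log_approx_error(x):.5f}  "
          f"exp error={exp_approx_error(x):.5f}")
```

Both errors are roughly x^2 / 2 for small x, which is why the approximations are fine for rates of a few percent but poor at x = 0.5.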
Investment Growth Example
- Assume an investment grows at rate r per period, starting from an initial value x_0.
- Value of investment at time t is denoted as x(t).
- At time 0, the investment value is x_0.
- After one time period, the value is x_0(1+r).
- After two time periods, the value is x_0(1+r)^2.
- After t time periods, the value is x_0(1+r)^t.
- If r > 0, there is exponential growth.
- If r < 0, there is exponential decay.
Numerical Example
- Investing $100 for 10 years at 3% interest yields $100 * (1 + 0.03)^{10} = $134.39.
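The growth formula and the numerical example can be reproduced with a few lines of Python. This is a minimal sketch; the function name `future_value` is an assumption made for illustration.

```python
def future_value(x0, r, t):
    """Value after t periods of growth at rate r per period: x0 * (1 + r)^t."""
    return x0 * (1 + r) ** t

# The example from the notes: $100 for 10 years at 3%.
print(round(future_value(100, 0.03, 10), 2))   # 134.39

# A negative rate gives exponential decay: the value shrinks every period.
print(future_value(100, -0.03, 10) < 100)      # True
```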
Application of Natural Logs to Economics Data
- Economic and financial data often resemble exponential functions when graphed.
- Natural logs are helpful in linearizing exponential growth for regression analysis.
Linearizing Exponential Growth
- Start with the exponential growth formula: x(t) = x_0(1+r)^t.
- Apply the log transformation to both sides: ln(x(t)) = ln(x_0(1+r)^t).
- Using log rules, rewrite the right side: ln(x(t)) = ln(x_0) + ln((1+r)^t).
- Further simplification: ln(x(t)) = ln(x_0) + t * ln(1+r).
- Approximation for small r: ln(1+r) ≈ r.
- Final linearized equation: ln(x(t)) ≈ ln(x_0) + t * r.
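The linearization can be checked numerically: if x(t) grows exponentially, a least-squares line fitted to ln(x(t)) against t should have intercept ln(x_0) and slope ln(1+r), which is close to r. The Python sketch below uses hypothetical values (x_0 = 100, r = 0.03) and a hand-written least-squares helper; neither comes from the original notes.

```python
import math

def ols_line(ts, ys):
    """Return (intercept, slope) of the least-squares line through (ts, ys)."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    slope = (sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
             / sum((t - t_bar) ** 2 for t in ts))
    return y_bar - slope * t_bar, slope

x0, r = 100.0, 0.03                 # hypothetical starting value and growth rate
ts = list(range(11))                # t = 0, 1, ..., 10
log_x = [math.log(x0 * (1 + r) ** t) for t in ts]

intercept, slope = ols_line(ts, log_x)
print(intercept, math.log(x0))      # intercept matches ln(x_0)
print(slope, math.log(1 + r), r)    # slope equals ln(1 + r), which is close to r
```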
Proportionate Changes
- Proportionate change in x is defined as Δx / x_0 = (x_1 - x_0) / x_0.
- It represents the new value minus the old value, divided by the old value.
- Example: Price inflation calculation.
Stata Demonstration
- Opening the real GDP per capita dataset in Stata.
- Using the describe command to understand the file.
- Using square brackets to refer to specific observations in Stata (e.g., year[_n]).
Calculating Price Inflation in Stata
- Inflation in 1930 is calculated as (Price in 1930 - Price in 1929) / Price in 1929.
- Stata command: generate inflation1 = (price[_n] - price[_n-1]) / price[_n-1]
- Alternative approach using the difference and lag time-series operators (the data must first be declared as time series with tsset): generate inflation2 = D.price / L.price
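For readers without Stata at hand, the same calculation can be sketched in Python. The price values below are hypothetical and the list-based layout is an assumption; it mirrors the price[_n] and price[_n-1] logic rather than reproducing the course dataset.

```python
prices = [100.0, 97.0, 89.0, 80.0]   # hypothetical price index for consecutive years

# The first year has no previous observation, so its inflation is missing (None),
# just as Stata returns a missing value for the first observation.
inflation = [None] + [
    (prices[i] - prices[i - 1]) / prices[i - 1]
    for i in range(1, len(prices))
]
print(inflation)
```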
Poll Everywhere Questions
- Question about calculating the yearly inflation based on monthly data.
- Question about approximating e^x for small x.
- The correct formula for approximating e^x is 1 + x.
Approximating Proportionate Changes with Logs
- For small proportionate changes, the change in the log of x is approximately equal to the proportionate change in x: Δln(x) ≈ Δx / x.
- Example: If x_0 = 40 and x_1 = 40.4, then ln(40.4) - ln(40) ≈ 0.00995, which is close to the proportionate change in x, (40.4 - 40) / 40 = 0.01.
- Good approximation for series with relatively small changes.
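The comparison between the log change and the proportionate change can be sketched in Python as follows. The function names are assumptions, and the last pair of values (40 to 60) is a hypothetical large change added to show where the approximation breaks down.

```python
import math

def proportionate_change(x0, x1):
    """(new value - old value) / old value."""
    return (x1 - x0) / x0

def log_change(x0, x1):
    """Change in the natural log: ln(x1) - ln(x0)."""
    return math.log(x1) - math.log(x0)

for x0, x1 in [(40, 40.4), (500, 520), (40, 60)]:
    print(x0, x1,
          round(proportionate_change(x0, x1), 5),
          round(log_change(x0, x1), 5))
```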
Compounding and the Rule of 72
- The rule of 72 estimates how long it takes for an investment to double.
- If you invest $1 at interest rate R, after N time periods, you'll have (1+R)^N.
- To find when the investment doubles (becomes $2), solve 2 = (1+R)^N for N.
- Applying logs: ln(2) = N * ln(1+R).
- Solving for N: N = ln(2) / ln(1+R) ≈ ln(2) / R.
- With ln(2) ≈ 0.69 and R expressed in percent, this gives the rule of 69: N ≈ 69 / R.
- Since 72 is more convenient (divisible by 3, 4, 6, 8, 9), we use it instead, giving the rule of 72.
- Formula: N ≈ 72 / R, where R is in percent.
- Example: At an 8% interest rate, an investment doubles in approximately 9 years.
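To see how close the rule of 72 is to the exact doubling time, the two formulas can be compared directly. This Python sketch uses assumed function names; rates are passed in percent, as in the rule itself.

```python
import math

def doubling_time_exact(rate_percent):
    """Exact doubling time: N = ln(2) / ln(1 + R), with R as a decimal."""
    return math.log(2) / math.log(1 + rate_percent / 100)

def doubling_time_rule_of_72(rate_percent):
    """Rule-of-72 estimate: N ≈ 72 / R, with R in percent."""
    return 72 / rate_percent

for rate in (2, 4, 8, 12):
    print(rate,
          round(doubling_time_exact(rate), 2),
          round(doubling_time_rule_of_72(rate), 2))
```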
Log Transformation of Skewed Data
- Economic data are often skewed to the right because of high outliers.
- Example: Income data.
- Log transformation makes the data more symmetrical.
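The effect of a log transformation on right-skewed data can be illustrated with a small Python sketch. The income figures are hypothetical (chosen to grow by a constant factor so the effect is easy to see), and the moment-based skewness formula used here is one common definition, not necessarily the one used in the course.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment / (second central moment)^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical incomes in thousands of dollars; a few high values pull the mean up.
incomes = [10, 20, 40, 80, 160, 320, 640]
log_incomes = [math.log(v) for v in incomes]

print(skewness(incomes))       # clearly positive: right-skewed
print(skewness(log_incomes))   # essentially zero: symmetric after taking logs
```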
Compound Interest Rates
- If the nominal interest rate is R and there are N compounding periods per year, the effective interest rate is given by:
- R_effective = (1 + R/N)^N - 1
- R is the Annual Percentage Rate (APR).
- R_effective is the Annual Percentage Yield (APY).
- For continuous compounding (N approaching infinity), (1 + R/N)^N converges to e^R, so R_effective converges to e^R - 1.
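The APR-to-APY conversion and its continuous-compounding limit can be sketched in Python. The function names and the 6% APR are assumptions chosen for illustration.

```python
import math

def effective_rate(apr, n):
    """APY for a nominal rate `apr` (as a decimal) compounded n times per year."""
    return (1 + apr / n) ** n - 1

def effective_rate_continuous(apr):
    """Limit of the APY as the number of compounding periods goes to infinity."""
    return math.exp(apr) - 1

apr = 0.06  # hypothetical 6% APR
for n in (1, 4, 12, 365):
    print(n, round(effective_rate(apr, n), 6))
print("continuous", round(effective_rate_continuous(apr), 6))
```

The APY rises with the compounding frequency but is bounded above by the continuous-compounding value.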
Standardized Scores (Z-scores)
- Z_i = (x_i - x̄) / s, where x̄ is the sample mean and s is the sample standard deviation.
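A minimal Python sketch of the z-score formula, using the standard library's `statistics.stdev`, which computes the sample standard deviation (dividing by n - 1). The data values and the function name are hypothetical.

```python
import statistics

def z_scores(values):
    """Standardize each value: (x_i - sample mean) / sample standard deviation."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)   # sample standard deviation (n - 1 denominator)
    return [(x - mean) / s for x in values]

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # hypothetical observations
z = z_scores(data)
print([round(v, 3) for v in z])
```

By construction, the standardized values have sample mean 0 and sample standard deviation 1.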
Moving Averages
- Average of observations in several successive time periods.
- Simple Moving Average: Average of current and immediate past observations.
- Three-period moving average: (x_t + x_{t-1} + x_{t-2}) / 3.
- Centered Moving Average: Current observation is in the middle, with one observation before and one after.
- Three-period centered moving average: (x_{t-1} + x_t + x_{t+1}) / 3.
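The two three-period moving averages can be sketched in Python as follows. The series is hypothetical, the function names are assumptions, and positions are 0-based list indices rather than the time labels used in the notes.

```python
def simple_moving_average(xs, t):
    """Three-period simple moving average at 0-based index t (requires t >= 2)."""
    return (xs[t] + xs[t - 1] + xs[t - 2]) / 3

def centered_moving_average(xs, t):
    """Three-period centered moving average at index t (requires 1 <= t <= len(xs) - 2)."""
    return (xs[t - 1] + xs[t] + xs[t + 1]) / 3

series = [10, 12, 11, 15, 14, 16]   # hypothetical observations

print(simple_moving_average(series, 3))     # uses 12, 11, 15
print(centered_moving_average(series, 3))   # uses 11, 15, 14
```

The simple version only looks backward, so it can be computed as soon as the current observation arrives; the centered version needs the next observation as well.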
Reasons for Using Moving Averages
- Reduces random noise in the data.
- Smooths business cycle variations and seasonal variations.
Seasonal Adjustments
- Adjusting for seasonal variation.
Real vs. Nominal Data
Per Capita Data
- Divide by the size of the population.
Growth Rates and Percentage Changes
- One-period percentage change: (X_t - X_{t-1}) / X_{t-1} * 100.
- Often converted to annualized rates.
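A short Python sketch of the one-period percentage change, plus one way to annualize it. The notes do not say which annualization convention is used; the compounding formula below is an assumption (a common convention), and the function names and numbers are hypothetical.

```python
def pct_change(prev, curr):
    """One-period percentage change: (X_t - X_{t-1}) / X_{t-1} * 100."""
    return (curr - prev) / prev * 100

def annualized_pct(pct, periods_per_year):
    """Annualize by compounding (assumed convention): ((1 + g)^k - 1) * 100."""
    return ((1 + pct / 100) ** periods_per_year - 1) * 100

quarterly = pct_change(200, 202)     # hypothetical quarterly values
print(quarterly)                     # 1.0 (percent per quarter)
print(annualized_pct(quarterly, 4))  # a little over 4 percent per year
```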
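Percent vs. Percentage Point
- A percentage-point change is the arithmetic difference between two percentages; a percent change is that difference relative to the starting value.
- Basis Point: 1/100 of a percentage point.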
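The three ways of describing a change in a rate can be computed side by side. This Python sketch uses a hypothetical move from 4.00% to 4.25%; the function name and dictionary keys are assumptions.

```python
def rate_change(old_rate_pct, new_rate_pct):
    """Describe a change between two rates that are already quoted in percent."""
    points = new_rate_pct - old_rate_pct
    return {
        "percentage_points": points,             # arithmetic difference
        "basis_points": points * 100,            # 1 percentage point = 100 basis points
        "percent": points / old_rate_pct * 100,  # change relative to the old rate
    }

change = rate_change(4.00, 4.25)   # hypothetical rates
print(change)
```

Here the rate rises by 0.25 percentage points, which is 25 basis points, but by 6.25 percent relative to where it started.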
Practice Questions
- If the interest rate is 4%, how many years will it take for the investment to double?
- Answer: 72 / 4 = 18 years.
- If x increases from 500 to 520, what is the proportionate change in x?
- Answer: (520 - 500) / 500 = 0.04.
- If x increases from 500 to 520, what is the absolute change in the log of x?
- Answer: ln(520) - ln(500) = 0.0392.
- Three-period simple moving average at time t = 4
- Three-period centered moving average at time t = 4
- The interest rate increases from 5.05 to 5.06. The increase based on percent is
Stata Demonstration of Moving Averages
- Calculating simple and centered moving averages in Stata.
- Demonstration of importing and formatting date variables in Stata.
- Using the date function to convert string variables to date variables.
- Formatting dates using the format command.