EMF 3 - Statistical Inference

41 Terms

New cards

What does a probability distribution describe?

how uncertainty is spread across possible outcomes of a random variable

New cards

For a continuous variable, how do we compute the probability that it lies between two values a and b?

P(a < X < b) = ∫_a^b f(x) dx

the area under the probability density function between a and b
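A minimal sketch of the area interpretation, assuming for illustration that X is standard normal (the card does not fix a distribution). It compares a trapezoid-rule approximation of the integral with the exact CDF difference; `area_under_pdf` is a hypothetical helper name.

```python
from statistics import NormalDist

def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the integral of dist.pdf over [a, b] with the trapezoid rule."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width

X = NormalDist(mu=0.0, sigma=1.0)   # assumed distribution, for illustration only
a, b = -1.0, 1.0
approx = area_under_pdf(X, a, b)    # numerical area under the density
exact = X.cdf(b) - X.cdf(a)         # P(a < X < b) via the CDF
print(round(approx, 4), round(exact, 4))
```

Both numbers should agree to several decimals, since the CDF difference is exactly the area under the density between a and b.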

New cards

Why can a probability density be greater than 1?

Because it’s a density, not a probability — only the area under the curve equals 1; height alone isn’t constrained
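A quick check of this point, assuming a normal density with a small standard deviation (σ = 0.1 is a made-up value): the peak height exceeds 1 while the total area stays at 1.

```python
from statistics import NormalDist

narrow = NormalDist(mu=0.0, sigma=0.1)            # assumed: small standard deviation
peak_height = narrow.pdf(0.0)                     # density at the mean, about 3.99
total_area = narrow.cdf(1.0) - narrow.cdf(-1.0)   # +/- 10 sigma covers essentially all mass
print(peak_height > 1, round(total_area, 6))
```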

New cards

What is a percentile?

The value below which a given percentage of observations fall

e.g., 95th percentile means 95% of data are smaller

New cards

What is a z-score?

the number of standard deviations (σ) an observation lies from the mean:

z = (x−μ)/σ
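A small sketch of the formula with made-up numbers (x = 130, μ = 100, σ = 15 are assumptions, not from the card); `z_score` is a hypothetical helper.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

z = z_score(x=130.0, mu=100.0, sigma=15.0)   # illustrative values
print(z)  # 2.0
```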

New cards

What is P(|Z|>1.96)?

It’s the total probability in both tails beyond ±1.96, about 0.05 (0.025 in each tail), which defines the 5% rejection region for a two-tailed test.
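The two-tail probability can be checked with the standard normal CDF from Python's standard library; this is only a numerical illustration.

```python
from statistics import NormalDist

Z = NormalDist()                          # standard normal: mean 0, sd 1
two_tail = 2 * (1 - Z.cdf(1.96))          # P(|Z| > 1.96), by symmetry
critical = Z.inv_cdf(0.975)               # value leaving 2.5% in the upper tail
print(round(two_tail, 4), round(critical, 2))
```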

New cards

Why do financial returns often approximate a normal distribution (at least roughly)?

many small independent influences aggregate to produce approximately normal behavior (Central Limit Theorem)
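A toy simulation of the aggregation idea, assuming each "influence" is an independent Uniform(−1, 1) shock (an assumption for illustration, not a model of actual returns). If the sum is roughly normal, about 95% of simulated sums should fall within ±1.96 standard deviations.

```python
import math
import random

def share_within_196_sd(n_shocks=50, reps=10_000, seed=1):
    """Share of simulated sums of uniform shocks within +/- 1.96 sd of zero."""
    rng = random.Random(seed)
    sd_sum = math.sqrt(n_shocks / 3)      # Var of Uniform(-1, 1) is 1/3
    inside = 0
    for _ in range(reps):
        total = sum(rng.uniform(-1.0, 1.0) for _ in range(n_shocks))
        if abs(total) <= 1.96 * sd_sum:
            inside += 1
    return inside / reps

share = share_within_196_sd()
print(share)   # expected to be close to 0.95
```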

New cards

What is an unbiased estimator?

One whose expected value equals the true parameter, E[β̂] = β

New cards

Under MLR.1–5 (including homoskedasticity), what is Var(β̂₁)?

Var(β̂₁) = σ²/SST_x

New cards

What factors make estimates more precise (smaller variance)?

Lower error variance (σ²)
More variation in x (larger SST_x)
Larger sample size n (which also raises SST_x)
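A sketch of how each factor enters Var(β̂₁) = σ²/SST_x, using made-up values of σ² and x; `sst` and `var_beta1` are hypothetical helper names.

```python
def sst(xs):
    """Total sum of squares of x around its mean."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)

def var_beta1(sigma2, xs):
    """Slope variance in simple regression under homoskedasticity."""
    return sigma2 / sst(xs)

x_base = [1.0, 2.0, 3.0, 4.0, 5.0]                     # SST_x = 10
base = var_beta1(4.0, x_base)                          # 0.4
lower_noise = var_beta1(1.0, x_base)                   # smaller sigma^2 -> 0.1
wider_x = var_beta1(4.0, [2 * x for x in x_base])      # SST_x = 40 -> 0.1
more_data = var_beta1(4.0, x_base * 2)                 # SST_x = 20 -> 0.2
print(base, lower_noise, wider_x, more_data)
```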

New cards

What assumption is MLR.5?

Homoskedasticity: the error term u has constant variance

Var⁡(u∣x)=σ²

New cards

Homoskedasticity vs Heteroskedasticity

Homoskedastic: constant error variance across x

e.g.: Test scores (Y) vs. study hours (X)— variance stays constant

Heteroskedastic: variance of errors changes with x

e.g.: Age (x) vs. Income (Y) — young people earn similarly low amounts, but as people age, income differences grow widely


New cards

Why can’t we use true errors u_i to estimate σ²?

because the true errors are unobserved

Hence we estimate them with the residuals û_i = y_i − ŷ_i

New cards

What is the unbiased estimator of error variance?

σ̂² = (Σ û_i²) / (n − k − 1) = SSR / (n − k − 1)

n = number of observations

k = number of regressors

New cards

Why divide by n−k−1 instead of n?

Because we lose k+1 degrees of freedom when estimating the parameters (intercept + k slopes); this correction makes σ̂² unbiased
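A minimal sketch of the degrees-of-freedom correction for a simple regression (k = 1), using made-up data; `ols_simple` is a hypothetical helper, not a library function.

```python
def ols_simple(xs, ys):
    """Return (intercept, slope) from OLS of ys on a single regressor xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sst_x = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sst_x
    return y_bar - slope * x_bar, slope

xs = [1.0, 2.0, 3.0, 4.0, 5.0]          # illustrative data
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
b0, b1 = ols_simple(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]   # stand-ins for the unobserved errors
ssr = sum(u ** 2 for u in residuals)                      # sum of squared residuals
n, k = len(xs), 1
sigma2_hat = ssr / (n - k - 1)                            # divide by n - k - 1, not n
print(round(b1, 2), round(ssr, 2), round(sigma2_hat, 2))
```

With these numbers the slope is 0.6, SSR is 2.4, and σ̂² is 2.4/3 = 0.8; dividing by n = 5 instead would give 0.48, which understates the error variance on average.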

New cards

What does a standard error measure?

It shows how much an estimated value (like a sample mean or regression coefficient) would vary if you repeated the sample many times

How precise your estimate is

New cards

What happens to SE when you double N?

It shrinks by a factor of roughly 1/√2 ≈ 0.71 (about 29% smaller), because SE is proportional to 1/√N
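A quick numerical check using the standard error of a sample mean, σ/√n, as the simplest case (σ = 10 and n = 100 are made-up values).

```python
import math

def se_of_mean(sigma, n):
    """Standard error of a sample mean with population sd sigma."""
    return sigma / math.sqrt(n)

se_n = se_of_mean(sigma=10.0, n=100)     # 1.0
se_2n = se_of_mean(sigma=10.0, n=200)    # after doubling the sample
ratio = se_2n / se_n                     # about 0.707
print(round(ratio, 3))
```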

New cards

What are the null and alternative hypotheses in regression?

Null hypothesis (H₀): β_j = 0 → the variable has no effect on Y (holding the other regressors fixed)
Alternative hypothesis (H₁): β_j ≠ 0 → the variable has an effect on Y
Rejecting H₀ is what makes the estimated coefficient statistically significant; failing to reject leaves it statistically insignificant


New cards

What is the t-statistic?

It measures how many standard errors (SE) the estimated coefficient is away from zero.
In other words, it shows how strong the evidence is against the null hypothesis.

👉 Formula:
t = (β̂ − 0) / SE(β̂)

👉 Interpretation:

Large |t| → coefficient likely significant (real effect)
Small |t| → coefficient likely insignificant (random noise)


New cards

What does a p-value represent?

The probability, assuming H₀ is true, of observing a t-statistic as extreme as or more extreme than the one obtained

New cards

Rejection rule using p-values?

Reject H₀ if p-value < significance level (α)

1%, 5% (most common) or 10%
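A sketch of the rejection rule, assuming a large sample so that the standard normal can stand in for the t-distribution (an approximation; an exact test would use the t-distribution with n − k − 1 degrees of freedom). The t-statistic of 2.2 is a made-up value.

```python
from statistics import NormalDist

def two_sided_p_value(t_stat):
    """Two-sided p-value, using the standard normal as a large-sample stand-in for t."""
    return 2 * (1 - NormalDist().cdf(abs(t_stat)))

def reject_null(p_value, alpha=0.05):
    """Reject H0 when the p-value is below the significance level."""
    return p_value < alpha

p = two_sided_p_value(2.2)    # illustrative t-statistic
decisions = {alpha: reject_null(p, alpha) for alpha in (0.01, 0.05, 0.10)}
print(round(p, 4), decisions)
```

Here p is about 0.028, so H₀ is rejected at the 5% and 10% levels but not at 1%.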

New cards

Rejection rule using critical values?

Reject H₀ if |t| > c_{α/2, df}, where c_{α/2, df} is the two-sided critical value of the t-distribution with df = n − k − 1 degrees of freedom

New cards

Typical significance and confidence levels?

1% (99% confidence), 5% (95%), and 10% (90%)

New cards

Difference between statistical significance and economic significance?

Statistical significance measures how unlikely a result is under H₀
Economic significance measures whether the effect size matters in real terms

New cards

What extra assumption (MLR.6) is needed for exact t-tests?

Normality of errors: u_i∼iid N(0,σ²)

New cards

What happens when sample size increases?

t-distribution converges to Normal
Normality of Errors becomes less critical

New cards

What’s the connection between confidence intervals and hypothesis tests?

If a 100(1−α)% confidence interval for β_j excludes the hypothesized value β_{j,0}, we reject H₀: β_j = β_{j,0} at significance level α (two-sided test)

New cards

If β̂_SAT = 0.0057 and se = 0.0027, what is t?

t = 0.0057 / 0.0027 ≈ 2.11
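The same numbers can illustrate the link between the t-test and the confidence interval. This sketch assumes a large sample and uses the normal critical value (about 1.96) in place of the exact t critical value.

```python
from statistics import NormalDist

beta_hat, se = 0.0057, 0.0027            # values from the card
t_stat = (beta_hat - 0.0) / se           # H0: coefficient equals zero
crit = NormalDist().inv_cdf(0.975)       # about 1.96; large-sample approximation
ci_low = beta_hat - crit * se            # lower end of the 95% interval
ci_high = beta_hat + crit * se           # upper end of the 95% interval
rejects_by_t = abs(t_stat) > crit
rejects_by_ci = not (ci_low <= 0.0 <= ci_high)
print(round(t_stat, 2), rejects_by_t, rejects_by_ci)
```

Both rules agree: |t| exceeds the critical value and the 95% interval (roughly 0.0004 to 0.0110) excludes zero.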

New cards

Difference between unbiasedness and statistical significance

Unbiasedness: The estimate is correct on average across many samples
Statistical significance: The estimate in this sample is unlikely to be due to chance alone if H₀ were true (large |t|, small p-value)

👉 Unbiased = long-run accuracy
👉 Significant = strong sample evidence

New cards

What are Type I and Type II errors?

Type I: rejecting a true null (false positive)
Type II: failing to reject a false null (false negative)
Significance level = probability of Type I error.
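A simulation sketch of the last point: when H₀ is true, a 5% test should reject in about 5% of samples. It assumes normal data with known σ = 1 and uses a z-test on the sample mean, which is a simplification of the regression setting; `type_one_error_rate` is a hypothetical helper.

```python
import math
import random
from statistics import NormalDist

def type_one_error_rate(alpha=0.05, n=30, reps=10_000, seed=0):
    """Share of samples in which a true null (mean = 0) is rejected."""
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(reps):
        sample_mean = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
        z = sample_mean / (1.0 / math.sqrt(n))   # sigma = 1 is known by construction
        if abs(z) > crit:
            rejections += 1                      # a false positive: H0 is true here
    return rejections / reps

rate = type_one_error_rate()
print(rate)   # expected to be close to 0.05
```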

New cards

What is the relationship between an F-statistic and a t-statistic in a regression with one restriction?

F=t²

For a single-parameter test, the F-test and t-test are equivalent and yield identical p-values
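A numerical check of F = t² in a simple regression, testing the single restriction that the slope is zero. The data are made up, and the F-statistic is computed from restricted and unrestricted sums of squared residuals.

```python
import math

xs = [1.0, 2.0, 3.0, 4.0, 5.0]           # illustrative data
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
sst_x = sum((x - x_bar) ** 2 for x in xs)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sst_x
intercept = y_bar - slope * x_bar

ssr_unrestricted = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
ssr_restricted = sum((y - y_bar) ** 2 for y in ys)    # model with the slope forced to zero
df = n - 2                                            # n - k - 1 with k = 1

se_slope = math.sqrt(ssr_unrestricted / df / sst_x)
t_stat = slope / se_slope
f_stat = ((ssr_restricted - ssr_unrestricted) / 1) / (ssr_unrestricted / df)
print(round(t_stat ** 2, 4), round(f_stat, 4))
```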

New cards

What does the F-statistic test in regression output?

It tests whether a set of coefficients (often more than one) is jointly zero

i.e., whether the included regressors jointly explain a significant share of the variation in y

New cards

How do you interpret a large F-statistic?

A large F means the data would be unlikely if the null hypothesis (all tested coefficients equal zero) were true

This indicates that at least one regressor in the group is statistically significant.

New cards

What is the null and alternative in an F-test for overall significance?

H₀: β₁ = β₂ = ... = β_k = 0 vs H₁: at least one β_j ≠ 0

New cards

What does the homoskedasticity assumption (MLR.5) state?

The variance of the error term is constant across all levels of the explanatory variables: Var⁡(u∣x)=σ²

New cards

What does the Breusch–Pagan (BP) test check for?

It tests whether the squared residuals are systematically related to the explanatory variables — evidence of heteroskedasticity

New cards

How do you interpret the BP test?

If the p-value is small (e.g., <0.05), reject the null of homoskedasticity ⇒ evidence of heteroskedasticity

If the p-value is large, we fail to reject constant variance
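A sketch of the idea behind the test, using the LM form (n times the R² from regressing squared residuals on x) with a single regressor, so the statistic is compared with a chi-square distribution with 1 degree of freedom. The data are constructed by hand for illustration, and `fit_line` and `breusch_pagan_p` are hypothetical helpers, not a library API.

```python
import math

def fit_line(xs, ys):
    """Simple OLS of ys on xs; returns (intercept, slope, r_squared)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sst_x = sum((x - x_bar) ** 2 for x in xs)
    sst_y = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / sst_x
    return y_bar - slope * x_bar, slope, s_xy ** 2 / (sst_x * sst_y)

def breusch_pagan_p(xs, ys):
    """p-value of the LM statistic n * R^2 from regressing squared residuals on x."""
    b0, b1, _ = fit_line(xs, ys)
    sq_resid = [(y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys)]
    _, _, r2_aux = fit_line(xs, sq_resid)
    lm = len(xs) * r2_aux
    return math.erfc(math.sqrt(lm / 2))    # upper tail of chi-square with 1 df

xs = [float(i) for i in range(1, 21)]
signs = [(-1) ** i for i in range(1, 21)]                  # alternating error signs
y_hetero = [1 + 2 * x + s * x for x, s in zip(xs, signs)]  # error size grows with x
y_homo = [1 + 2 * x + s for x, s in zip(xs, signs)]        # error size is constant
p_hetero = breusch_pagan_p(xs, y_hetero)
p_homo = breusch_pagan_p(xs, y_homo)
print(p_hetero < 0.05, p_homo < 0.05)
```

The first series should give a small p-value (reject constant variance), while the second should not.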

New cards

What are the consequences of heteroskedasticity for OLS?

OLS estimates remain unbiased and consistent, but are no longer efficient (not minimum variance), and the usual standard errors are biased, so t-tests, F-tests, and confidence intervals based on them are invalid

New cards

When is an OLS coefficient unbiased?

When the zero conditional mean assumption (MLR.4) holds, E(u∣x₁,…,x_k) = 0, which implies that each regressor is uncorrelated with the error term: Cov(x_j,u)=0

New cards

A p-value = 0.23 means that the null hypothesis is rejected with probability 23%. True or false?

❌ False. The p-value is the probability of observing data as extreme or more extreme than your sample result if the null hypothesis is true. It is not the probability that the null is true or false

New cards

What does a p-value of 0.23 actually tell you?

If H₀ were true, you’d see a test statistic this large (or larger) about 23% of the time. This is not rare ⇒ fail to reject H₀