Hypothesis Testing Lecture

Hypothesis Testing and Statistical Inference

Overview of Hypothesis Testing

  • Definition: Hypothesis testing is a statistical method used to make inferences or draw conclusions about population parameters based on sample data.

  • Key Questions:

    • Is it likely that our observed results could have occurred by chance?

    • How confident can we be that our sample estimate reflects the true population parameter?

Fundamentals of Hypothesis Testing

  • Sample vs Population: In hypothesis testing, we work with a sample drawn from a population, so the true population parameters are unknown and must be inferred from the sample.

  • Statistical Hypothesis: A statement or assumption regarding a population parameter that we aim to test.

    • Example: It could be about the mean, proportion, or regression coefficients.

  • Decision-Making: Hypothesis testing involves deciding whether or not to reject a statistical hypothesis based on sample data.

Formulation of Hypotheses

  • Researcher Statements: The first step in hypothesis testing involves stating the null and alternative hypotheses clearly.

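  • Null Hypothesis (H0): The hypothesis of no effect or the status quo; it typically covers the values the researcher does not expect to see.

  • Alternative Hypothesis (HA): The hypothesis covering the values the researcher expects to see.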

    • Example 1: If expecting a positive coefficient:

      • H0: β ≤ 0 (null: no positive effect)

      • HA: β > 0 (alternative: positive effect exists)

    • Example 2: If expecting a negative coefficient:

      • H0: β ≥ 0 (null: no negative effect)

      • HA: β < 0 (alternative: negative effect exists)

Types of Hypotheses

One-Sided vs Two-Sided Tests

  • One-Sided Tests: Tests that evaluate the possibility of a relationship in one direction only.

    • Right-Sided:

      • H0: β ≤ 0

      • HA: β > 0

    • Left-Sided:

      • H0: β ≥ 0

      • HA: β < 0

  • Two-Sided Tests: Tests that evaluate the possibility of a relationship in both directions.

    • Formulation:

      • H0: β = 0

      • HA: β ≠ 0

Testing Techniques

  • Typical Testing Techniques in Econometrics:

    • Methods for Testing Hypotheses:

      1. t-test

      2. p-value

      3. Confidence interval

Decision Rules in Hypothesis Testing

  • Decision Rule: Method to conclude whether to reject the null hypothesis by comparing a sample statistic to a critical value.

  • Critical Value: A threshold that separates acceptance and rejection regions in hypothesis testing, derived from statistical tables based on the test statistic used.

  • Regions:

    • Acceptance Region: Where we fail to reject H0.

    • Rejection Region: Where we reject H0 based on the critical value.

One-Sided Test of β

  • Concept: The rejection region lies entirely in one tail of the distribution. We check whether the test statistic computed from the sample falls in that tail (reject H0) or in the acceptance region (fail to reject H0).

Two-Sided Test of β

  • Concept: Similar to the one-sided test, but the rejection region is split between both tails, so H0: β = 0 is rejected when the test statistic is sufficiently far from zero in either direction.
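
The difference between one-tailed and two-tailed rejection regions can be sketched numerically. The snippet below is only an illustration: it uses the standard normal distribution as a large-sample stand-in for the t distribution (an assumption made here so that nothing beyond the Python standard library is needed), and the function name `critical_value` is hypothetical.

```python
from statistics import NormalDist


def critical_value(alpha: float, two_sided: bool) -> float:
    """Large-sample (normal) approximation to the critical value.

    A two-sided test splits alpha across both tails; a one-sided test
    puts all of alpha in a single tail.
    """
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)


one_sided = critical_value(0.05, two_sided=False)
two_sided = critical_value(0.05, two_sided=True)
print(f"one-sided 5%: {one_sided:.3f}")
print(f"two-sided 5%: {two_sided:.3f}")
```

At the same α, the two-sided critical value is larger because each tail holds only α/2. In small samples, the t-table value for the relevant degrees of freedom would be somewhat larger than these normal approximations.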

t-Test

  • Purpose: The t-test is employed to test hypotheses regarding individual slope coefficients in regression analysis.

  • Conditions for Usage:

    1. The stochastic error term (ϵ) should be normally distributed.

    2. The variance of that distribution is unknown and must be estimated from the sample.

  • Formula: For a typical multiple regression equation:

    • Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_K X_{Ki} + \epsilon_i

    • To calculate the t-statistic for the kth coefficient:

    • t_k = \frac{\hat{\beta}_k - \beta_{H0}}{SE(\hat{\beta}_k)} where:

      • \hat{\beta}_k = estimated regression coefficient for the kth variable

      • \beta_{H0} = hypothesized value of the coefficient

      • SE(\hat{\beta}_k) = standard error of \hat{\beta}_k

Example of t-Test

  • For a two-sided test where H0: β = 0, the t-statistic can be simplified to:

    • t_k = \frac{\hat{\beta}_k - 0}{SE(\hat{\beta}_k)} = \frac{\hat{\beta}_k}{SE(\hat{\beta}_k)}
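
A minimal sketch of the t-statistic calculation follows. The function name `t_statistic` is hypothetical, and the inputs 1.288 and 0.543 are borrowed from the confidence interval example in these notes, on the assumption that they are the estimate and standard error of the same coefficient.

```python
def t_statistic(beta_hat: float, se: float, beta_h0: float = 0.0) -> float:
    """t-score: (estimate - hypothesized value) / standard error."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    return (beta_hat - beta_h0) / se


# H0: beta = 0, so the formula reduces to beta_hat / se.
t_zero = t_statistic(1.288, 0.543)
# H0: beta = 1 uses the general form with a nonzero hypothesized value.
t_one = t_statistic(1.288, 0.543, beta_h0=1.0)
print(round(t_zero, 3), round(t_one, 3))
```

The same estimate gives a much smaller t-score against H0: β = 1 than against H0: β = 0, because the t-score measures distance from the hypothesized value in standard-error units.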

Level of Significance (α)

  • Definition: The maximum probability of committing a Type I Error (rejecting a true null hypothesis).

  • Selection: Should be chosen prior to testing; common levels are:

    • 10% (0.10)

    • 5% (0.05)

    • 1% (0.01)

  • Implications: Lowering α increases the chance of a Type II error (failing to reject a false null hypothesis).

Types of Errors in Hypothesis Testing

  1. Type I Error: Rejecting H0 when it is true (finding an effect that does not exist).

  2. Type II Error: Failing to reject H0 when it is false (not finding an effect that exists).
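
The trade-off between the two error types can be illustrated with a short calculation. This is a sketch under stated assumptions, not part of the lecture's method: it uses a two-sided z-test (normal approximation) rather than the t-test, and `standardized_effect` (the true coefficient divided by its standard error) is a hypothetical input set to 2.0 purely for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def type_ii_probability(alpha: float, standardized_effect: float) -> float:
    """P(fail to reject H0 | H0 false) for a two-sided z-test.

    standardized_effect is the true coefficient divided by its standard
    error (a hypothetical input chosen for illustration).
    """
    z_c = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # The test statistic is distributed N(standardized_effect, 1);
    # H0 survives when the statistic lands inside (-z_c, z_c).
    upper = _STD_NORMAL.cdf(z_c - standardized_effect)
    lower = _STD_NORMAL.cdf(-z_c - standardized_effect)
    return upper - lower


for alpha in (0.10, 0.05, 0.01):
    risk = type_ii_probability(alpha, standardized_effect=2.0)
    print(f"alpha = {alpha:.2f} -> P(Type II) = {risk:.3f}")
```

As α falls, the acceptance region widens, so the probability of failing to reject a false H0 rises.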

Confidence Level (1 − α)

  • Definition: The proportion of confidence intervals, constructed the same way across repeated samples, that would contain the true parameter.

  • For example, at a significance level of 5% (α = 0.05), the confidence level equals 95% (1 - 0.05).

Decision Rule for t-Test

  • To make decisions using the t-test:

    1. Compare the calculated t-statistic (t_k) with the critical t-value (t_c).

    2. Criteria for decision:

      • One-sided test: reject H0 if |t_k| > t_c and the sign of t_k matches the sign implied by HA.

      • Two-sided test: reject H0 if |t_k| > t_c.

      • Fail to reject H0 otherwise.
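
The decision rule can be sketched as a small function. The name `reject_h0` and the labels "two-sided", "greater", and "less" are hypothetical choices for this illustration; the critical value must still be looked up for the chosen α, test type, and degrees of freedom.

```python
def reject_h0(t_k: float, t_c: float, alternative: str = "two-sided") -> bool:
    """Apply the t-test decision rule.

    alternative: "two-sided" (HA: beta != 0), "greater" (HA: beta > 0),
    or "less" (HA: beta < 0). t_c is the positive critical value for the
    chosen significance level and test type.
    """
    if t_c <= 0:
        raise ValueError("critical value must be positive")
    exceeds = abs(t_k) > t_c
    if alternative == "two-sided":
        return exceeds
    if alternative == "greater":
        return exceeds and t_k > 0
    if alternative == "less":
        return exceeds and t_k < 0
    raise ValueError(f"unknown alternative: {alternative!r}")


# Illustrative inputs: t_k = 2.372 is hypothetical; 1.699 is the critical
# value that appears in the confidence interval example in these notes.
print(reject_h0(2.372, 1.699, "two-sided"))
print(reject_h0(2.372, 1.699, "greater"))
print(reject_h0(2.372, 1.699, "less"))
```

The last call fails to reject even though |t_k| > t_c, because a positive t-score does not agree with HA: β < 0.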

Limitations of t-Test

  1. Does not test theoretical validity of the regression model.

  2. Does not indicate the importance of coefficients; a statistically significant coefficient is not necessarily practically important.

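  3. Not intended for data covering an entire population: the t-test is a tool for inferring from a sample. As N grows, the standard error shrinks and t-scores rise, so in very large samples even trivially small effects can appear statistically significant.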

p-value in Hypothesis Testing

  • Definition: The p-value represents the smallest significance level at which the null hypothesis can be rejected.

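  • Decision Rule: Reject H0 if the p-value of the estimated coefficient is less than α (and, for a one-sided test, the sign of the estimate agrees with HA).

  • Important Note: Statistical software typically reports p-values for two-sided hypotheses, regardless of the test the researcher has in mind. For a one-sided test, the reported p-value is halved when the estimate has the sign implied by HA.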
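
Because software usually reports two-sided p-values, a one-sided test needs an adjustment before the decision rule is applied. The sketch below uses the standard conversion for a symmetric test statistic; the function names and the reported p-value of 0.08 are hypothetical.

```python
def one_sided_p_value(two_sided_p: float, sign_matches_ha: bool) -> float:
    """Convert a reported two-sided p-value to a one-sided one.

    Assumes a symmetric test statistic (as with the t distribution).
    """
    if not 0.0 <= two_sided_p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    half = two_sided_p / 2
    return half if sign_matches_ha else 1 - half


def reject_by_p_value(p_value: float, alpha: float) -> bool:
    """Reject H0 when the p-value is below the significance level."""
    return p_value < alpha


reported_p = 0.08  # hypothetical two-sided p-value from software output
print(reject_by_p_value(reported_p, alpha=0.05))
print(reject_by_p_value(one_sided_p_value(reported_p, True), alpha=0.05))
```

With these inputs the two-sided test fails to reject at the 5% level, while the one-sided test (estimate in the direction of HA) rejects, since 0.08 / 2 = 0.04 < 0.05.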

p-value Example with Stata Regression Results

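  • Consider a regression Price_t = \beta_0 + \beta_1 GDP_t + \epsilon_t with a reported p-value of 0.000 for \beta_1, the coefficient on GDP:

    • Since this p-value is less than 0.05, we reject H0 at the 5% significance level; the coefficient on GDP is statistically significant. (A reported 0.000 is a rounded value, not exactly zero.)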

Confidence Intervals

  • Purpose: Another method of hypothesis testing that provides a range of plausible values for a parameter.

  • Formula: For the confidence interval of β_k:

    • CI(\beta_k) = \hat{\beta}_k \pm t_c \times SE(\hat{\beta}_k), where t_c is the two-sided critical value for the chosen significance level.

  • Decision Rule: If the hypothesized value lies within this interval, H0 cannot be rejected.

Confidence Level Example

  • For a 90% CI for β_1:

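    • Calculation: 1.288 \pm (1.699 \times 0.543)

    • Resulting bounds:

      • UCB = 1.288 + 0.923 = 2.211

      • LCB = 1.288 − 0.923 = 0.365

    • Interpretation: We are 90% confident that the true β_1 lies between 0.365 and 2.211; that is, 90% of intervals constructed this way across repeated samples would contain the true β_1.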
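
The interval arithmetic above can be reproduced with a few lines of code. The function name `confidence_interval` is hypothetical; the inputs are the estimate, critical value, and standard error from the example.

```python
from typing import Tuple


def confidence_interval(beta_hat: float, t_c: float, se: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds: beta_hat -/+ t_c * se."""
    if se <= 0 or t_c <= 0:
        raise ValueError("t_c and se must be positive")
    margin = t_c * se
    return beta_hat - margin, beta_hat + margin


lcb, ucb = confidence_interval(beta_hat=1.288, t_c=1.699, se=0.543)
print(f"90% CI: [{lcb:.3f}, {ucb:.3f}]")  # 90% CI: [0.365, 2.211]
```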

Hypothesis Testing Using Confidence Intervals

  1. For H0: β1 = 0 and HA: β1 ≠ 0: We reject H0 at the 10% significance level because 0 is outside the interval.

  2. For H0: β1 = 1 and HA: β1 ≠ 1: We cannot reject H0 at the 10% significance level since 1 falls within the interval.
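
Both conclusions can be checked mechanically. The sketch below hard-codes the bounds 0.365 and 2.211 from the 90% interval in the example; the function name `reject_with_interval` is hypothetical.

```python
def reject_with_interval(h0_value: float, lower: float, upper: float) -> bool:
    """Two-sided test via a confidence interval.

    Reject H0 when the hypothesized value falls outside [lower, upper].
    """
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    return not (lower <= h0_value <= upper)


# Bounds of the 90% interval from the example: [0.365, 2.211].
for h0_value in (0.0, 1.0):
    rejected = reject_with_interval(h0_value, 0.365, 2.211)
    verdict = "reject" if rejected else "fail to reject"
    print(f"H0: beta_1 = {h0_value:g} -> {verdict} at the 10% level")
```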

Summary of Hypothesis Testing Steps using t-Test

  1. Set up the null (H0) and alternative hypotheses (HA).

  2. Choose a level of significance (α) and determine the critical t-value (t_c).

  3. Run the regression to obtain the t-statistic (t_k).

  4. Apply the decision rule to decide whether to reject or not reject H0 by comparing t_k with t_c.

Regression Example

Model Description
  • Dependent Variable: Y (gross sales of Woody’s restaurant)

  • Independent Variables:

    • N: Competition (number of direct competitors within a two-mile radius)

    • P: Population (number of people within a three-mile radius)

    • I: Income (average household income in the area)

  • Sample Size: N = 33

Degrees of Freedom in Hypothesis Testing

  • Definition: Degrees of freedom (DF) is the number of observations (N) minus the number of estimated coefficients (K slope coefficients plus the intercept).

    • Formula: DF = N - K - 1

    • Example: For a model with 3 variables and 120 observations, DF = 120 - 3 - 1 = 116

  • Implications: DF measures how much independent information remains after the coefficients are estimated; fewer DF mean larger critical t-values and less precise parameter estimates.
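
A minimal sketch of the degrees-of-freedom formula follows. The function name `degrees_of_freedom` is hypothetical; the second call applies the formula to the restaurant model in these notes (33 observations, three independent variables).

```python
def degrees_of_freedom(n_obs: int, k_slopes: int) -> int:
    """DF = N - K - 1, where the extra 1 accounts for the intercept."""
    df = n_obs - k_slopes - 1
    if df <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return df


print(degrees_of_freedom(120, 3))  # 116, matching the example
print(degrees_of_freedom(33, 3))   # 29 for the restaurant model
```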