Class 20

Goodness-of-Fit Test Overview

  • Goodness-of-fit tests are used to determine how well observed data fit a specified distribution.

  • Here, the test evaluates a claim about car crash fatalities across the days of the week.

Week 12: Session 20 Introduction

  • Topic: Goodness-of-fit Statistics.

  • Relevant reference: Baldwin Wallace University (BW), MTH 108: Biostatistics.

Research Scenario: Car Crash Deaths

  • Data Source: Insurance Institute for Highway Safety.

  • Key Question: Do car crash fatalities occur with equal frequency across the days of the week?

  • Before testing, identify:

    • The random variable: the day of the week on which a fatal crash occurred.

    • The variable type and grouping: categorical, grouped into seven categories (one per day).

Expected Frequencies Calculation

  • Observed frequencies (O) are the counts found in the sample; expected frequencies (E) are the counts predicted if the claimed distribution holds.

  • To calculate E when all expected frequencies are equal:

    • E = n/k (where n = total observations, k = number of categories).

  • When E is not equal across categories:

    • E = np (p = probability that a sample value falls within a particular category).
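
The two formulas above can be sketched in a few lines of Python. This is a minimal illustration, not part of the course material: the function name `expected_frequencies` is hypothetical, and the second call uses made-up probabilities purely to show the unequal case.

```python
def expected_frequencies(n, k=None, probabilities=None):
    """Return the expected frequency for each category.

    n: total number of observations.
    k: number of categories (used when all categories are equally likely).
    probabilities: per-category probabilities (used when they differ).
    """
    if probabilities is not None:
        # Unequal case: E = n * p for each category.
        return [n * p for p in probabilities]
    if k is None:
        raise ValueError("provide either k or probabilities")
    # Equal case: E = n / k for every category.
    return [n / k] * k


# Equal case with the values from these notes: n = 819, k = 7.
print(expected_frequencies(819, k=7))

# Unequal case with hypothetical probabilities (illustration only).
print(expected_frequencies(200, probabilities=[0.5, 0.3, 0.2]))
```

The probabilities passed in the unequal case should sum to 1, so that the expected frequencies sum to n.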

Goodness-of-Fit Test Assumptions

  • Important conditions for valid results:

    1. Data must be randomly selected.

    2. Sample data consist of frequency counts for each category.

    3. Each expected frequency must be at least 5.

Application to the Research Scenario

  • For this case:

    • n = 819 fatal crashes.

    • k = 7 days of the week.

    • Expected frequency: E = 819/7 = 117 for each day.

  • Every expected frequency (117) meets the requirement of being at least 5.
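
As a rough sketch, the expected-frequency condition for this scenario could be checked as follows. The helper name `meets_expected_count_condition` is hypothetical; only n = 819, k = 7, and the threshold of 5 come from the notes.

```python
def meets_expected_count_condition(expected, minimum=5):
    """Return True if every expected frequency is at least `minimum`."""
    return all(e >= minimum for e in expected)


n, k = 819, 7              # totals from the research scenario
expected = [n / k] * k     # uniform claim: the same E for every day

print(expected[0])                                # 117.0
print(meets_expected_count_condition(expected))   # True
```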

Measuring Differences: The Chi-Squared Statistic

  • Chi-squared test statistic (χ²) is calculated as:

    • χ² = Σ[(O - E)² / E].

    • It measures how far the observed frequencies fall from the expected ones; larger values indicate a worse fit.

  • Degrees of freedom: df = k - 1.
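
The statistic and its degrees of freedom can be computed directly from the formula. The sketch below uses made-up counts for four categories; they are not the crash data, which these notes do not list, and the function name `chi_squared_statistic` is an assumption for illustration.

```python
def chi_squared_statistic(observed, expected):
    """Compute the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for illustration only (NOT the crash data).
observed = [30, 20, 25, 25]
n, k = sum(observed), len(observed)
expected = [n / k] * k          # uniform claim: E = 100 / 4 = 25

chi2 = chi_squared_statistic(observed, expected)
df = k - 1                      # degrees of freedom

print(chi2, df)                 # 2.0 3
```

For the crash scenario, the same calculation would use the seven observed daily counts with E = 117 and df = 7 - 1 = 6.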

Step-by-Step Hypothesis Testing

  • Null Hypothesis (H0): Fatal crashes occur with equal frequency on all seven days (each day has probability 1/7).

  • Alternative Hypothesis (H1): At least one day's probability differs from 1/7.

  • Common significance level chosen: α = 0.05.

  • If p-value < α, reject H0 (evidence suggests the frequencies are not equal); otherwise, fail to reject H0.
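
The decision rule can be sketched with the standard library alone. The example below assumes an even number of degrees of freedom, for which the right-tail chi-squared probability has a simple closed form; the function names and the three-category counts are hypothetical and are not the crash data.

```python
import math


def chi2_p_value_even_df(chi2, df):
    """Right-tail p-value P(X >= chi2) for a chi-squared variable.

    Uses the closed form exp(-x/2) * sum((x/2)^i / i!) for i < df/2,
    which is valid only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = chi2 / 2
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)


def decide(p_value, alpha=0.05):
    """Apply the rule: reject H0 when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Hypothetical three-category counts (NOT the crash data), uniform H0.
observed = [50, 30, 20]
n, k = sum(observed), len(observed)
expected = [n / k] * k
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

p_value = chi2_p_value_even_df(chi2, df=k - 1)
print(round(chi2, 3), p_value, decide(p_value))
```

The crash scenario has df = 6, which is even, so the same closed form would apply there. For arbitrary df, a statistics library is the more practical choice; for example, `scipy.stats.chisquare` returns both the statistic and the p-value.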

Results Interpretation

  • Based on the findings:

    • The result is statistically significant (P < .0001): fatalities do not occur with equal frequency on all days.

    • Comparing observed with expected counts indicates that weekend fatalities are higher than expected.

Conclusion

  • Summarize findings in terms of hypotheses:

    • Reject H0: Evidence suggests unequal distribution of car crash fatalities across the week.

    • Address implications: the higher number of weekend fatalities is likely due to increased weekend activity, which raises public safety considerations.
