Inferential Statistics: Single Factor ANOVA

INFERENTIAL STATISTICS: DIFFERENCES BETWEEN MEANS: SINGLE FACTOR ANOVA TEST

Purpose of Analysis of Variance (ANOVA)

  • Purpose: Like the t-Test, ANOVA determines whether the differences between group means are statistically significant.

  • Comparison with t-Test:

    • t-Test: Examines whether there is a statistical difference between two means (Dependent Variable, DV) across two categories of an Independent Variable (IV) which is nominal and dichotomous.

    • ANOVA: Used when comparing three or more group means (DV) across three or more categories of an IV that is nominal (non-dichotomous) or ordinal. The test works by comparing the variance between groups with the variance within groups.

Single Factor ANOVA

  • Important Components of the ANOVA Test:

    • Grand Mean vs. Group Means: The overall mean of all data points compared to individual group means.

    • Within Group Variation vs. Between Group Variation:

    • Within Group Variation: Variation within each individual group.

    • Between Group Variation: Variation between the means of different groups.

    • The F and the F Distribution: The ratio obtained in ANOVA to determine statistical significance.

    • Degrees of Freedom: The number of independent pieces of information in the analysis.

    • Completing the Analysis: Involves using post hoc tests after ANOVA to further assess which means are significantly different.

Example of Single Factor ANOVA

  • Objective: To determine if there is a statistical difference in the mean number of MPH over the speed limit for three groups categorized by age: 24 and under, 25-34, and 35 and over.

  • Hypotheses:

    • H1 (Alternative Hypothesis): There is a statistical difference in the mean MPH over the speed limit across the three age groups (at least one group mean differs from the others).

    • H0 (Null Hypothesis): There is no statistical difference in the mean MPH over the speed limit across the three age groups.

  • Data Summary:

    • Groups:

    • 24 and Under: 78 individuals

    • 25 to 34: 75 individuals

    • 35 and Over: 78 individuals

    • Grand Mean: 11.61472

    • Standard Error: 0.212923

    • Average (Mean) and Variance per Group:

    • 24 and Under: 12.48718 (Variance: 15.11022)

    • 25 to 34: 11.44 (Variance: 7.114595)

    • 35 and Over: 10.91026 (Variance: 8.030803)

    • Median: 11

    • Mode: 10

    • Standard Deviation: 3.236148

    • Sample Variance: 10.47265

    • Kurtosis: 0.33254

    • Skewness: 0.730164

    • Range: 15 (Minimum: 6, Maximum: 21)

    • Sum: 2683

    • Count: 231
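
The overall figures in the data summary are related to one another, and a short check makes those relationships concrete. The sketch below is illustrative only: the variable names are assumptions, and the numbers are copied from the summary above.

```python
import math

# Values copied from the data summary (all 231 drivers combined).
total_sum = 2683
count = 231
std_dev = 3.236148

# Grand mean: total of all observations divided by the number of observations.
grand_mean = total_sum / count

# Standard error of the mean: standard deviation divided by sqrt(sample size).
standard_error = std_dev / math.sqrt(count)

# Sample variance is the square of the sample standard deviation.
sample_variance = std_dev ** 2

# Range is maximum minus minimum.
value_range = 21 - 6

print(round(grand_mean, 5), round(standard_error, 6), round(sample_variance, 5), value_range)
```

The printed values should agree with the Grand Mean, Standard Error, Sample Variance, and Range reported in the summary, up to rounding.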

Grand Mean and Group Means Formula

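  • The Grand Mean is calculated via
    $\text{Grand Mean} = \frac{X_1 + X_2 + X_3}{3}$

    • Where $X_1$, $X_2$, $X_3$ are the group means for the respective age groups involved.

    • This simple average equals the overall mean of all observations only when the groups are the same size. With unequal group sizes, each group mean is weighted by its group size.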

  • Between Group Variation & Within Group Variation are key metrics calculated for F-ratio determination.
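
As a rough illustration of the grand mean calculation, the sketch below compares the simple average of the three group means with the average weighted by group size. The pairing of group sizes with group means is an assumption: it is the pairing for which each size times mean is a whole-number sum and the three sums total the reported 2683. Variable names are hypothetical.

```python
# (group size, group mean) pairs from the data summary.
# ASSUMPTION: sizes are paired with means so that size * mean is a whole
# number for every group and the three group sums add up to 2683.
groups = [(78, 12.48718), (75, 11.44), (78, 10.91026)]

# Simple average of the group means: (X1 + X2 + X3) / 3.
means = [mean for _, mean in groups]
simple_mean = sum(means) / len(means)

# Weighted average: each group mean counts in proportion to its group size.
total_n = sum(n for n, _ in groups)
weighted_mean = sum(n * mean for n, mean in groups) / total_n

print(round(simple_mean, 5))
print(round(weighted_mean, 5))
```

Under this assumption the weighted version reproduces the reported Grand Mean of 11.61472, while the simple average is slightly lower because the groups are not all the same size.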

The F and the F Distribution

  • Meaning of F:

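    • The F statistic is the ratio of the between-group mean square (Between MS) to the within-group mean square (Within MS). It compares how much the group means vary from one another with how much individual scores vary inside each group.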

  • Interpreting F:

    • If the null hypothesis (H0) is true, the ratio should hover close to 1. The higher the calculated F ratio, the stronger the evidence against H0 and the greater likelihood that the differences in group means are significant.
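
The claim that F hovers near 1 when H0 is true can be checked with a small simulation. This is a sketch under stated assumptions, not part of the original example: it draws three groups of 10 scores from one and the same normal population (so H0 holds by construction), and the population mean, spread, group size, and seed are all arbitrary illustrative choices. The cutoff 3.35 is the approximate 0.05 critical value for 2 and 27 degrees of freedom.

```python
import random
import statistics


def f_ratio(groups):
    """Return Between MS / Within MS for a list of groups of raw scores."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.fmean(all_values)
    group_means = [statistics.fmean(group) for group in groups]
    # Between-group variation: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group variation: spread of the scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


rng = random.Random(0)  # fixed seed so the run is repeatable
f_values = []
for _ in range(2000):
    # All three groups come from the SAME population, so H0 is true.
    groups = [[rng.gauss(11.6, 3.2) for _ in range(10)] for _ in range(3)]
    f_values.append(f_ratio(groups))

mean_f = statistics.fmean(f_values)
share_above_crit = sum(f > 3.35 for f in f_values) / len(f_values)
print(round(mean_f, 2), round(share_above_crit, 3))
```

The average F should come out close to 1, and only about 5% of the simulated F ratios should exceed the critical value, which is what a 0.05 alpha means.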

Formulae and Critical Values
  • Degrees of Freedom:

    • Between DF = Number of groups - 1

    • Within DF = Sample Size - Number of groups

    • The F-ratio is computed as
      $F = \frac{\text{Between MS}}{\text{Within MS}}$

  • P-Value:

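    • The p-value is the probability of obtaining an F ratio at least as large as the calculated one if H0 is true. It is compared against alpha, which is typically set at 0.05 (a 95% confidence level).

    • If p-value ≤ 0.05, reject H0 and conclude that a significant difference exists among the group means; otherwise, do not reject H0.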

  • F Crit: The critical value retrieved from the F distribution table for the chosen alpha and the between and within degrees of freedom. If the calculated F exceeds F Crit, the result is statistically significant.
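
The degrees of freedom, F ratio, p-value, and F Crit can be worked through from the group summaries alone. The following is a minimal sketch, not the method used to produce the original output. It assumes the group sizes pair with the means so that each size times mean is a whole number and the sums total 2683. It also relies on a closed form of the F distribution that holds only when the between degrees of freedom equal 2 (three groups); for other designs a statistics library or an F table would be needed.

```python
# (n, mean, variance) per group, from the data summary.
# ASSUMPTION: sizes are paired with means so that n * mean is a whole number
# for every group and the three group sums add up to the reported 2683.
groups = [
    (78, 12.48718, 15.11022),
    (75, 11.44, 7.114595),
    (78, 10.91026, 8.030803),
]

k = len(groups)
n_total = sum(n for n, _, _ in groups)
grand_mean = sum(n * m for n, m, _ in groups) / n_total

# Sums of squares rebuilt from the summary statistics.
ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
ss_within = sum((n - 1) * v for n, _, v in groups)

# Degrees of freedom as defined above.
df_between = k - 1            # number of groups - 1
df_within = n_total - k       # sample size - number of groups

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

if df_between != 2:
    raise ValueError("the closed forms below are valid only for 2 between df")

alpha = 0.05
# For 2 numerator df: P(F >= f) = (1 + 2 f / df_within) ** (-df_within / 2).
p_value = (1 + 2 * f_ratio / df_within) ** (-df_within / 2)
# Inverting the same expression gives the critical value at the chosen alpha.
f_crit = (df_within / 2) * (alpha ** (-2 / df_within) - 1)

reject_h0 = p_value <= alpha  # same decision as f_ratio >= f_crit
print(df_between, df_within, round(f_ratio, 3), round(p_value, 4), round(f_crit, 3), reject_h0)
```

Under these assumptions the sketch gives 2 and 228 degrees of freedom, an F of roughly 4.96, a p-value below 0.01, and an F Crit of about 3.04, so H0 is rejected.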

Post Hoc Analysis

  • While the ANOVA test can establish if an overall difference exists, it does not pinpoint where these differences lie.

  • Post Hoc Analysis (Tukey-Kramer): Used to identify specific group differences after finding a statistically significant result in ANOVA.

  • Questions Addressed by Post Hoc Analysis:

    • Is there a difference between the groups 24 and Under and 25-34?

    • Is there a difference between 25-34 and 35 and Older?

    • Is there a difference between 24 and Under and 35 and Older?

Results of Post Hoc Analysis
  • The post hoc analysis indicates that, although there is an overall significant difference, the only pair of group means that differs significantly is 24 and Under versus 35 and Over.
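
The pairwise comparisons behind a Tukey-Kramer analysis can be sketched from the group summaries. This is an illustration under assumptions, not the original output. Group sizes are paired with means so that each size times mean is a whole number. The critical value `Q_CRIT` of about 3.34 is an approximate studentized range table value for 3 groups, 228 within degrees of freedom, and alpha 0.05, and should be replaced by an exact value in real use. Groups are identified by their means rather than by age label.

```python
import math
from itertools import combinations

# (n, mean, variance) per group, from the data summary.
# ASSUMPTION: sizes are paired with means so that n * mean is a whole number
# for every group and the three group sums add up to the reported 2683.
groups = [
    (78, 12.48718, 15.11022),
    (75, 11.44, 7.114595),
    (78, 10.91026, 8.030803),
]

n_total = sum(n for n, _, _ in groups)
df_within = n_total - len(groups)
ms_within = sum((n - 1) * v for n, _, v in groups) / df_within

# ASSUMPTION: approximate table value of the studentized range statistic
# for 3 groups, 228 within df, alpha = 0.05.
Q_CRIT = 3.34

results = []
for (n_i, mean_i, _), (n_j, mean_j, _) in combinations(groups, 2):
    # The Tukey-Kramer standard error allows for unequal group sizes.
    se = math.sqrt((ms_within / 2) * (1 / n_i + 1 / n_j))
    q = abs(mean_i - mean_j) / se
    results.append((mean_i, mean_j, q, q > Q_CRIT))

for mean_i, mean_j, q, significant in results:
    print(f"{mean_i:.2f} vs {mean_j:.2f}: q = {q:.2f}, significant = {significant}")
```

Under these assumptions only one of the three pairs, the one with the largest gap between means (12.49 versus 10.91), exceeds the critical value, which is consistent with a single significant pairwise difference.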