M2. Lecture 5: ANOVA Notes

ANOVA: Analysis of Variance

  • ANOVA stands for Analysis of Variance.
  • Purpose: compare the means of three or more groups.
  • DV (dependent variable) level of measurement: interval or ratio.
  • Key assumptions:
    • Independence of observations.
    • Normality of the DV within groups.
    • Homogeneity of variances across groups.
  • Research vs null hypotheses:
    • Research hypothesis: at least one group mean is different.
    • Null hypothesis: all group means are equal.
  • Post hoc testing: conducted if the ANOVA shows a significant result to identify which groups differ.

Assumptions and prerequisites

  • Independence
  • Normality within groups
  • Homogeneity of variances (equal variances across groups)
  • DV level: interval/ratio

Hypotheses and significance

  • Null hypothesis (H0): there is no difference among the group means.
  • Alternative hypothesis (Ha): at least one group mean differs from the others.
  • Level of significance: \alpha = 0.05
  • Test: ANOVA (F-test)
  • If significant, proceed to post hoc analyses to locate specific group differences.

Degrees of freedom (DF)

  • Let: k = number of groups, n = number of subjects per group (assuming equal n), N = kn = total sample size.
  • dfbetween (between groups): df_{between} = k - 1
  • dfwithin (within groups): df_{within} = nk - k = N - k
  • dftotal (total): df_{total} = nk - 1 = N - 1
  • Example (from transcript):
    • Three groups, 18 participants per group: k = 3,
      n = 18,
      N = 54
    • Therefore: df_{between} = 3 - 1 = 2
    • df_{within} = 54 - 3 = 51
    • df_{total} = 54 - 1 = 53
    • Reported as: F(2, 51) in the example.
  • The general F-test uses these DF values:
    • F = \frac{MS_{between}}{MS_{within}}, where MS \equiv \frac{SS}{df} (mean square).
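
As a quick check on the degrees-of-freedom arithmetic above, here is a minimal Python sketch. The function name `anova_degrees_of_freedom` is a hypothetical helper, not something from the lecture, and it assumes equal group sizes, as the notes do.

```python
def anova_degrees_of_freedom(k: int, n: int) -> dict:
    """Degrees of freedom for a one-way ANOVA with k groups of equal size n."""
    if k < 2 or n < 2:
        raise ValueError("need at least 2 groups and 2 subjects per group")
    total_n = k * n  # N = kn
    return {
        "between": k - 1,        # df_between = k - 1
        "within": total_n - k,   # df_within = N - k
        "total": total_n - 1,    # df_total = N - 1
    }


# Example from the notes: three groups, 18 participants per group.
df = anova_degrees_of_freedom(k=3, n=18)
print(df)  # {'between': 2, 'within': 51, 'total': 53}
```

Note that df_between + df_within = df_total (2 + 51 = 53), which is a handy consistency check when reading an ANOVA table.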

ANOVA: Hypothesis testing steps (as per transcript)

  1. Develop null and research hypotheses.
  2. Choose a level of significance (\alpha).
  3. Determine which statistical test is appropriate (ANOVA for comparing 3+ means).
  4. Run analysis to obtain test statistic and p-value.
  5. Make a decision about rejecting or failing to reject the null hypothesis.
  6. Make a conclusion.

Example study (discharge instructions)

  • Research question: Difference in knowledge recall among three groups:
    • Printed discharge instructions only
    • Verbal discharge instructions only
    • Combination of both printed and verbal discharge instructions
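  • Reported ANOVA result: F(2,\,51) = 13.630,\; p < 0.001 (the slide prints p < 0.000, most likely the software's rounded display of a p-value below 0.001; a p-value cannot be less than zero)
  • Interpretation: Because p < \alpha = 0.05, reject the null hypothesis; at least one group mean differs from the others.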
  • Specific conclusion provided in the slide analysis: the combination (printed + verbal) yields higher recall on average than either printed alone or verbal alone.

ANOVA results and interpretation (output components)

  • F-statistic with its degrees of freedom and p-value: F(2, 51) = 13.630,\; p < 0.001
  • Decision rule: if p < \alpha, reject H0; otherwise fail to reject H0.
  • Conclusion: There is a significant difference in knowledge recall among the three groups.
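
To make the decision rule concrete, the sketch below recovers an approximate p-value from the reported F statistic and applies the rule. It relies on a closed form for the upper tail of the F distribution that holds only when the numerator degrees of freedom equal 2 (true here, since k = 3); this shortcut is an assumption of the sketch, not the method used in the lecture, and the function names are hypothetical.

```python
def f_upper_tail_df1_is_2(f_stat: float, df_within: int) -> float:
    """Upper-tail p-value for an F(2, df_within) statistic.

    Valid ONLY when the numerator df is 2:
        P(F > f) = (1 + 2 f / df_within) ** (-df_within / 2)
    """
    if f_stat < 0 or df_within <= 0:
        raise ValueError("f_stat must be >= 0 and df_within must be > 0")
    return (1.0 + 2.0 * f_stat / df_within) ** (-df_within / 2.0)


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Decision rule from the notes: reject H0 when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


# Reported result from the notes: F(2, 51) = 13.630
p = f_upper_tail_df1_is_2(13.630, 51)
print(f"p = {p:.6f}")   # a value well below 0.001
print(decide(p))        # reject H0
```

For designs with more than three groups the closed form no longer applies, and a statistics library would be needed for the F distribution.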

Descriptive statistics and group comparisons (from transcript)

  • Descriptives table provides: N, Mean (\bar{X}), Std. Deviation (SD), Std. Error, 95% CI for the Mean (Lower/Upper).
  • Example group means and dispersion (from the descriptive notes):
    • Combination (printed + verbal): \bar{X}_{\text{Both}} = 17.5,\; SD = 5.26
    • Printed only: \bar{X}_{\text{Printed}} = 12.78,\; SD = 4.57
    • Verbal only: \bar{X}_{\text{Spoken}} = 9.78,\; SD = 3.39
  • These descriptive statistics support the ANOVA finding that the combination group has higher recall on average.
  • The transcript shows additional descriptive outputs such as 95% CIs and the broader ANOVA table values (e.g., Sum of Squares, Mean Squares) in the output blocks labeled with Descriptives and ANOVA.
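
The descriptive statistics are enough to rebuild the F statistic approximately, which is a useful way to see how the means and SDs feed the ANOVA table. The following is a sketch that assumes n = 18 per group (as stated in the notes) and uses the rounded means and SDs listed above, so the result matches the reported 13.630 only up to rounding.

```python
n = 18  # participants per group (equal group sizes)

# (mean, SD) per group, as listed in the descriptive statistics
groups = {
    "Both": (17.5, 5.26),
    "Printed": (12.78, 4.57),
    "Spoken": (9.78, 3.39),
}
k = len(groups)

# With equal group sizes, the grand mean is the plain average of group means.
grand_mean = sum(mean for mean, _ in groups.values()) / k

# Between-groups sum of squares: spread of group means around the grand mean.
ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _ in groups.values())

# Within-groups sum of squares: (n - 1) * variance, summed over groups.
ss_within = sum((n - 1) * sd ** 2 for _, sd in groups.values())

df_between = k - 1
df_within = k * n - k

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(f"SS_between = {ss_between:.2f}, SS_within = {ss_within:.2f}")
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")  # close to the reported 13.630
```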

Test of Homogeneity of Variances

  • A test of equal variances across groups is reported (commonly Levene's test in ANOVA output).
  • The slide indicates sections labeled: "Test of Homogeneity of Variances" with associated significance values (Sig).
  • Interpretation (general): if the test is non-significant (p > 0.05), the assumption of equal variances is considered satisfied; if significant (p < 0.05), consider using a Welch ANOVA or a different approach.
  • In the transcript, the exact p-value for Levene’s test is not provided, but the presence of this test is noted in the output under ANOVA.
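
Because the transcript gives neither the raw scores nor the Levene statistic, the sketch below only illustrates how such a statistic can be computed. It uses the mean-centered form of Levene's test (some software centers on the median instead), the function name `levene_w` is hypothetical, and both data sets are made-up toy values, not data from the study.

```python
def levene_w(groups):
    """Levene's W statistic using absolute deviations from each group mean."""
    k = len(groups)
    total_n = sum(len(g) for g in groups)

    # Z_ij = |Y_ij - mean of group i|
    z = []
    for g in groups:
        group_mean = sum(g) / len(g)
        z.append([abs(y - group_mean) for y in g])

    z_group_means = [sum(zi) / len(zi) for zi in z]
    z_grand_mean = sum(sum(zi) for zi in z) / total_n

    between = sum(
        len(zi) * (zm - z_grand_mean) ** 2 for zi, zm in zip(z, z_group_means)
    )
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_group_means) for v in zi)
    if within == 0:
        raise ValueError("W is undefined when all deviations within groups are equal")

    # W is compared against an F distribution with (k - 1, N - k) df.
    return (total_n - k) / (k - 1) * between / within


# Hypothetical toy data (NOT from the study)
equal_spread = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
unequal_spread = [[1, 2, 3], [10, 12, 14], [15, 20, 25]]

print(levene_w(equal_spread))    # essentially 0: identical spread in every group
print(levene_w(unequal_spread))  # larger W: spreads differ across groups
```

A larger W corresponds to a smaller p-value, i.e. stronger evidence against equal variances.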

Post hoc testing (when ANOVA is significant)

  • Since the example ANOVA is significant (p < 0.05), post hoc tests would be conducted to identify which specific group means differ.
  • The transcript does not specify which post hoc test was used (e.g., Tukey's HSD, Bonferroni, Scheffé), but it states that post hoc testing should be completed if a significant result is found.
  • Example interpretation (from provided data): the combination group differs from both individual instruction groups, with higher recall in the combination group.
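
Since the transcript does not name the post hoc test, the following is only an illustrative sketch, not the procedure used in the lecture. It computes the three pairwise mean differences and their t statistics using the pooled within-group variance, assuming n = 18 per group and the rounded descriptives from the notes. Turning these statistics into p-values would require a t distribution with 51 degrees of freedom, which is left out here.

```python
from itertools import combinations
from math import sqrt

n = 18  # participants per group
means = {"Both": 17.5, "Printed": 12.78, "Spoken": 9.78}
sds = {"Both": 5.26, "Printed": 4.57, "Spoken": 3.39}

# With equal n, the pooled variance (MS_within) is the average group variance.
ms_within = sum(sd ** 2 for sd in sds.values()) / len(sds)

# Standard error of a difference between two group means.
se_diff = sqrt(ms_within * 2 / n)

pairwise = {}
for a, b in combinations(means, 2):
    diff = means[a] - means[b]
    pairwise[(a, b)] = (diff, diff / se_diff)  # (mean difference, t statistic)

for (a, b), (diff, t_stat) in pairwise.items():
    print(f"{a} vs {b}: difference = {diff:.2f}, t = {t_stat:.2f}")

# One common correction (Bonferroni): test each pair at alpha / number of pairs.
bonferroni_alpha = 0.05 / len(pairwise)
print(f"Bonferroni-adjusted alpha = {bonferroni_alpha:.4f}")
```

The two comparisons involving the combination group produce the largest t statistics, which is in line with the interpretation given above.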

Formulas and key equations (summary)

  • ANOVA model: partitioning of variability (sums of squares), SS_{Total} = SS_{Between} + SS_{Within}:
    • Total variability: SS_{Total}
    • Between-group variability: SS_{Between}
    • Within-group variability: SS_{Within}
  • Mean Squares:
    • MS_{Between} = \frac{SS_{Between}}{df_{Between}}
    • MS_{Within} = \frac{SS_{Within}}{df_{Within}}
  • F-statistic:
    • F = \frac{MS_{Between}}{MS_{Within}}
  • Degrees of freedom (static forms):
    • df_{Between} = k - 1
    • df_{Within} = nk - k = N - k
    • df_{Total} = nk - 1 = N - 1
  • Example numeric values (equal groups):
    • k = 3,
      n = 18,
      N = 54
    • df_{Between} = 3 - 1 = 2
    • df_{Within} = 54 - 3 = 51
    • df_{Total} = 54 - 1 = 53
    • Reported as: F(2, 51) = 13.630,
      p < 0.001
  • Significance level and decision:
    • \alpha = 0.05
    • If p < \alpha, reject H_0; otherwise fail to reject.
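
The sums of squares in the summary above are named but not defined in the notes. For reference, the standard definitions can be written out as follows; the symbols X_{ij} (score of subject i in group j), \bar{X}_j (mean of group j), \bar{X} (grand mean), and n_j (size of group j) are notation introduced here, not taken from the slides.

```latex
\begin{align*}
SS_{Between} &= \sum_{j=1}^{k} n_j \,(\bar{X}_j - \bar{X})^2 \\
SS_{Within}  &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 \\
SS_{Total}   &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2
              = SS_{Between} + SS_{Within} \\
df_{Total}   &= df_{Between} + df_{Within}
              \quad \text{(in the example: } 53 = 2 + 51\text{)}
\end{align*}
```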

Connections and implications

  • Connections to foundational statistics:
    • ANOVA extends t-tests to more than two groups by using partitioning of variance.
    • Requires homogeneity of variances; violation may affect Type I error rate and require alternatives (e.g., Welch's ANOVA).
  • Real-world relevance:
    • Helps determine whether different instructional methods lead to different knowledge recall levels.
    • In health education and evidence-based practice, ANOVA informs decisions about which delivery method(s) to implement.
  • Practical implications:
    • If the combination of modalities yields higher recall, practitioners might adopt mixed-method discharge instructions.
    • This conclusion needs confirmation via post hoc tests to understand pairwise differences and inform targeted interventions.
  • Ethical/philosophical considerations:
    • Ensure fair comparison groups and adequate power to avoid false negatives.
    • Post hoc analyses increase the risk of Type I errors if not properly controlled; use appropriate correction methods.
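
To see why uncorrected post hoc comparisons inflate the Type I error risk, here is a small sketch of the arithmetic. It assumes the comparisons are independent, which is a simplification (pairwise comparisons that share groups are not strictly independent), and the function names are hypothetical.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / m


m = 3  # three groups give three pairwise comparisons
fwer = familywise_error_rate(0.05, m)
print(f"Uncorrected familywise error rate: {fwer:.4f}")            # 0.1426
print(f"Bonferroni per-test alpha: {bonferroni_alpha(0.05, m):.4f}")  # 0.0167
```

With three comparisons each tested at 0.05, the chance of at least one false positive rises to roughly 14%, which is why a correction is applied.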

Quick reference checklist (from the slides)

  • Define the research and null hypotheses.
  • Choose a level of significance (\alpha = 0.05).
  • Confirm ANOVA is the appropriate test for comparing three or more group means.
  • Run ANOVA to obtain the F statistic and p-value.
  • Decide whether to reject H0 based on p-value.
  • Draw a conclusion about the effect of group on the DV.
  • If significant, perform post hoc tests to identify specific group differences.
  • Review Descriptives and Test of Homogeneity of Variances to validate assumptions.
  • Report means, standard deviations, F-statistic with df, and p-value, and interpret in context.
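
The reporting step in the checklist can be sketched as a small formatting helper. The function name `format_anova_report` is hypothetical, and the convention of writing very small p-values as "p < 0.001" rather than "p = 0.000" is an assumption of this sketch. The p-value passed in the first example is an illustrative small number, since the notes only give a bound.

```python
def format_anova_report(f_stat: float, df_between: int, df_within: int,
                        p_value: float) -> str:
    """Format an ANOVA result as 'F(df_between, df_within) = value, p ...'."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Never report p = 0.000; a p-value is always greater than zero.
    p_text = "p < 0.001" if p_value < 0.001 else f"p = {p_value:.3f}"
    return f"F({df_between}, {df_within}) = {f_stat:.3f}, {p_text}"


report_small = format_anova_report(13.630, 2, 51, 0.00002)  # illustrative p-value
report_other = format_anova_report(3.700, 2, 51, 0.0312)    # hypothetical result

print(report_small)  # F(2, 51) = 13.630, p < 0.001
print(report_other)  # F(2, 51) = 3.700, p = 0.031
```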

Display notes (LaTeX formatting reference)

  • Degrees of freedom:
    • df_{between} = k - 1
    • df_{within} = nk - k = N - k
    • df_{total} = nk - 1 = N - 1
  • Example:
    • k = 3, \ n = 18, \ N = 54
    • df_{between} = 2, df_{within} = 51,
      df_{total} = 53
    • F(2, 51) = 13.630, \quad p < 0.001
  • Group means example:
    • \bar{X}_{\text{Both}} = 17.5,\; SD = 5.26
    • \bar{X}_{\text{Printed}} = 12.78,\; SD = 4.57
    • \bar{X}_{\text{Spoken}} = 9.78,\; SD = 3.39
  • Significance level:
    • \alpha = 0.05
  • p-value reporting:
    • p < 0.001 (the slide shows p < 0.000; a value displayed as .000 should be reported as p < 0.001)