Lecture Notes on ANOVA and Multiple Groups Designs

PYB210 Research Design & Data Analysis - Lecture 5: Multiple Groups Designs & Analysis of Variance (ANOVA)

Learning Goals

  • Utility of Multiple-Group Designs: Understand why experiments may require more than two groups.

  • Between-Group Variability: Comprehend this concept as it relates to ANOVA.

  • ANOVA Structure: Know how to structure an ANOVA and test the null hypothesis effectively.

Why More Than 2 Groups?

  • Traditional experiments often examine only two conditions (e.g., A vs. B).

  • Real-world independent variables (IVs) can have multiple levels.

  • Understanding relationships between IV and dependent variables (DVs) may require more than two groups.

Refining Our Understanding of IV and DV Relationships

  • Differences may exist, such as:

    • Football players vs. non-contact athletes in cognitive performance.

  • Multiple groups can enhance understanding by:

    • Evaluating differences based on nuances (e.g., number of head injuries).

    • Investigating dose-response relationships for IVs affecting DVs.

Common IV and DV Relationships

  • Linear: e.g., Alcohol and brain cell death.

  • Curved: e.g., Strength training's effects on endurance.

  • Quadratic (U-shaped or inverted-U): e.g., Anxiety levels and exam performance.

Setting Up IV with DV

  1. Number of Levels:

    • Determined by expected relationship type (linear, curvilinear); two levels can only show a straight-line difference, so detecting a curve needs at least three.

  2. Spacing of Levels:

    • Varies; e.g., evenly spaced drug doses (1 mg, 4 mg, 7 mg, 10 mg) make trends across levels easier to interpret.

    • Important for IVs based on measurements, not mere categories.
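
To see why the number of levels matters, here is a minimal Python sketch. The function exam_score and its numbers are made up for illustration (an assumed inverted-U between anxiety and exam performance); they are not data from the lecture.

```python
def exam_score(anxiety):
    """Hypothetical inverted-U: performance peaks at moderate anxiety (made-up numbers)."""
    return 80 - 2 * (anxiety - 5) ** 2

two_levels = [2, 8]        # low vs. high anxiety only
three_levels = [2, 5, 8]   # adds a moderate level

print([exam_score(a) for a in two_levels])    # [62, 62]
print([exam_score(a) for a in three_levels])  # [62, 80, 62]
```

With only two levels the group means are identical and the IV appears to have no effect; the third level reveals the curve.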

Analysis of Variance (ANOVA)

  • Why Not Multiple t-tests?

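    • Comparing every pair of groups takes k(k - 1)/2 t-tests (three groups = 3 tests, four groups = 6 tests). Each test run at α = .05 carries its own chance of a Type 1 error, so the familywise error rate across the set rises well above .05.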

  • ANOVA addresses this by allowing a single test to assess differences among group means.
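
The inflation can be sketched numerically. The Python below is an illustration only: it assumes every test uses α = .05 and treats the tests as independent, which pairwise t-tests on the same data are not, so the figures are approximate. The function names are hypothetical.

```python
from math import comb

def pairwise_tests(groups):
    """Number of distinct pairwise comparisons among `groups` group means."""
    return comb(groups, 2)

def familywise_error(tests, alpha=0.05):
    """Chance of at least one Type 1 error across `tests` tests (assumes independence)."""
    return 1 - (1 - alpha) ** tests

for k in (3, 4, 5):
    m = pairwise_tests(k)
    print(f"{k} groups: {m} tests, familywise error about {familywise_error(m):.3f}")
# 3 groups: 3 tests, familywise error about 0.143
# 4 groups: 6 tests, familywise error about 0.265
# 5 groups: 10 tests, familywise error about 0.401
```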

Transitioning from T-tests to ANOVA

  • A t-test assesses the difference between two means; ANOVA assesses differences among several means by comparing sources of variance.

  • Key Variances:

    • Variance between groups (BG): Impact of IV on means.

    • Variance within groups (WG): Individual score variation not due to the IV.

  • Foundational ratio in ANOVA:

    • F Ratio: F = Between-groups variance / Within-groups variance.

Calculation of Variance

  1. Total Variance: Calculate variance for all scores.

  2. Within-Groups Variance: Approximated by pooling variances of individual groups.

  3. Between-Groups Variance: Calculated by assessing group means against the grand mean.

Example Calculation

  • Sleep deprivation study example:

    • Group Means (Sleep hours): 28, 20, 12, 8 with respective scores for target identification.

    • The F ratio computed from these scores indicates how much of the variability is attributable to group membership.

Partitioning the Variance in ANOVA

  • Total Variation: Breaks down into between-groups and within-groups contributions.

  • Additive property of sums of squares:

    • SSTotal = SSBetween + SSWithin.
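
The additive property can be checked on a small data set. The scores below are hypothetical (three made-up groups of three scores each), and sum_of_squares is an illustrative helper, not part of any library.

```python
from statistics import mean

# Hypothetical scores for three groups (made up for illustration only).
groups = {
    "low": [2, 4, 6],
    "medium": [5, 7, 9],
    "high": [8, 10, 12],
}

def sum_of_squares(groups):
    """Return (SS total, SS between, SS within) for a dict of score lists."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    # Total: every score against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Between: each group mean against the grand mean, weighted by group size.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    # Within: every score against its own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
    return ss_total, ss_between, ss_within

ss_total, ss_between, ss_within = sum_of_squares(groups)
print(f"SS total = {ss_total:.1f}, between = {ss_between:.1f}, within = {ss_within:.1f}")
# SS total = 78.0, between = 54.0, within = 24.0
```

Here 54 + 24 = 78, so the between-groups and within-groups sums of squares account for all of the total variation.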

Degrees of Freedom

  • Each sum of squares is divided by its degrees of freedom (df) to give a mean square:

    • Total df = N - 1 (N = total number of scores)

    • Between-groups df = a - 1 (a = number of groups)

    • Within-groups df = N - a.

The F-statistic

  • F calculated as the ratio of Mean Squares:

    • F = MSBetween / MSWithin.
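
Putting the degrees of freedom and mean squares together, a minimal sketch in Python follows. The sums of squares (54 and 24), N = 9, and a = 3 are hypothetical values chosen for easy arithmetic, and anova_summary is an illustrative name.

```python
def anova_summary(ss_between, ss_within, n_total, n_groups):
    """Return df, mean squares, and F for a one-way ANOVA."""
    df_between = n_groups - 1           # a - 1
    df_within = n_total - n_groups      # N - a
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

# Hypothetical values: SSBetween = 54, SSWithin = 24, N = 9 scores, a = 3 groups.
summary = anova_summary(ss_between=54, ss_within=24, n_total=9, n_groups=3)
print(f"F({summary['df_between']}, {summary['df_within']}) = {summary['f']:.2f}")
# F(2, 6) = 6.75
```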

Hypothesis Testing & F Ratio

  • Null Hypothesis (H0): All group means are equal.

  • Alternate Hypothesis (H1): At least one group mean differs.

  • The F ratio compares between-groups variability with within-groups variability; an F well above 1 suggests that the independent variable (IV) influences the dependent variable (DV).

F Distribution

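  • Under H0, both mean squares estimate the same error variance, so F is expected to be close to 1.

  • Sampling error makes F vary from sample to sample even when H0 is true; the shape of the F distribution depends on the between-groups and within-groups degrees of freedom.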
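
A small simulation can make this concrete. The sketch below is an illustration under stated assumptions: four groups of four scores are all drawn from one normal population (mean 50, SD 10, both arbitrary), so H0 is true by construction and df = (3, 12). The value 3.49 is the tabled .05 critical value for F(3, 12). Function names are hypothetical.

```python
import random

def f_ratio(groups):
    """One-way ANOVA F ratio for a list of score lists."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (len(scores) - len(groups))
    return ms_between / ms_within

def simulate_null_f(n_groups=4, n_per_group=4, reps=2000, seed=1):
    """F ratios from samples in which H0 is true (same population for every group)."""
    rng = random.Random(seed)
    return [
        f_ratio([[rng.gauss(50, 10) for _ in range(n_per_group)] for _ in range(n_groups)])
        for _ in range(reps)
    ]

f_values = simulate_null_f()
mean_f = sum(f_values) / len(f_values)
# Share of simulated F values beyond the .05 critical value for F(3, 12).
share_above_critical = sum(f > 3.49 for f in f_values) / len(f_values)
print(f"mean F = {mean_f:.2f}, share above 3.49 = {share_above_critical:.3f}")
```

The mean F should land near 1 and roughly 5% of the simulated values should exceed the critical value, which is the Type 1 error rate a single ANOVA is designed to hold.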

ANOVA Table Structure

  • Displays sums of squares, degrees of freedom, mean squares, F values, significance:

    • Rows summarize BG variability, WG variability, and the overall total.

Reporting ANOVA

  • Format for reporting statistical results:

    • E.g., F(3, 12) = 7.34, p = .005.

  • Describe the analysis succinctly, stating:

    1. Type of ANOVA conducted (one-way).

    2. Dependent variable details (DV).

    3. Independent variable (IV) and levels.
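
The reporting format can be produced programmatically. The helper below is a hypothetical sketch; dropping the leading zero from p and writing "p < .001" for very small values are assumed here as APA-style conventions rather than taken from the lecture.

```python
def format_f_result(df_between, df_within, f_value, p_value):
    """Build a results string such as 'F(3, 12) = 7.34, p = .005'."""
    if p_value < 0.001:
        p_text = "p < .001"
    else:
        # Three decimals, no leading zero (p cannot exceed 1).
        p_text = "p = " + f"{p_value:.3f}".lstrip("0")
    return f"F({df_between}, {df_within}) = {f_value:.2f}, {p_text}"

print(format_f_result(3, 12, 7.34, 0.005))  # F(3, 12) = 7.34, p = .005
```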

Next Steps

  • Future lectures cover the assumptions of ANOVA and refine our understanding of variance comparisons.

Learning Goals

This lecture has three goals: to understand why experiments may require more than two groups, to grasp between-group variability as it relates to ANOVA, and to know how to structure an ANOVA and test the null hypothesis.

Why More Than 2 Groups?

Traditional experiments often examine only two conditions, such as A versus B. However, real-world independent variables (IVs) can have multiple levels. Understanding the relationships between IVs and dependent variables (DVs) may necessitate employing more than two groups.

Refining Our Understanding of IV and DV Relationships

Differences may exist between groups, for instance, when comparing football players to non-contact athletes in terms of cognitive performance. Using multiple groups gives a richer understanding by evaluating differences based on nuances, such as the number of head injuries sustained. This design can also aid in investigating dose-response relationships for IVs affecting DVs.

Common IV and DV Relationships

These relationships can take different forms: linear (such as the correlation between alcohol intake and brain cell death), curved (like the effects of strength training on endurance), and quadratic (for instance, anxiety levels in relation to exam performance).

Setting Up IV with DV

Setting up an independent variable with a dependent variable involves determining the number of levels based on the expected relationship type, whether linear or curvilinear. The spacing of levels is also important and can vary; for example, evenly spaced drug doses of 1 mg, 4 mg, 7 mg, and 10 mg make trends across levels easier to interpret. This spacing is particularly significant for IVs based on measurements rather than mere categories.

Analysis of Variance (ANOVA)

Multiple t-tests are not ideal for analyzing more than two groups because they inflate the Type 1 error rate. Three groups require three pairwise tests and four groups require six; each test carries its own .05 risk of a false positive, so the chance of at least one Type 1 error across the set climbs well above .05. ANOVA addresses this issue by allowing a single test to assess differences among group means.

Where a t-test assesses the difference between two means, ANOVA assesses differences among several means by comparing sources of variance. The key variances are variance between groups (BG), which reflects the impact of the IV on the group means, and variance within groups (WG), which represents individual score variation not attributable to the IV. The foundational ratio in ANOVA, known as the F ratio, is defined as F = Between-groups variance / Within-groups variance.

Calculation of Variance

The total variance is calculated for all scores. Within-groups variance is approximated by pooling variances of individual groups, while between-groups variance is calculated by assessing group means against the grand mean. For instance, in a sleep deprivation study with group means of 28, 20, 12, and 8 sleep hours, respective scores for target identification can guide variance assessment.

Partitioning the Variance in ANOVA

Total variation can be broken down into contributions from between-groups and within-groups variability, thus following the additive property of sums of squares: SSTotal = SSBetween + SSWithin.

Degrees of Freedom

Each sum of squares is divided by its degrees of freedom (df) to give a mean square. The total df = N - 1 (with N representing the total number of scores), the between-groups df = a - 1 (with a representing the number of groups), and the within-groups df = N - a.

The F-statistic

The F-statistic is calculated as the ratio of Mean Squares: F = MSBetween / MSWithin.

Hypothesis Testing & F Ratio

The null hypothesis (H0) posits that all group means are equal, while the alternate hypothesis (H1) posits that at least one group mean differs. The F ratio compares between-groups variability with within-groups variability; an F well above 1 suggests that the independent variable (IV) influences the dependent variable (DV).

F Distribution

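Under H0, both mean squares estimate the same error variance, so F is expected to be close to 1, although sampling error makes it vary from sample to sample. The shape of the F distribution depends on the between-groups and within-groups degrees of freedom.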

ANOVA Table Structure

The ANOVA table displays sums of squares, degrees of freedom, mean squares, F values, and significance, summarizing both between-groups and within-groups variability.

Reporting ANOVA

When reporting statistical results, the format typically looks like: F(3, 12) = 7.34, p = .005. It is essential to describe the analysis succinctly by stating the type of ANOVA conducted (e.g., one-way), the dependent variable (DV), and the independent variable (IV) and its levels.

Next Steps

As a next step, it is important to familiarize oneself with the assumptions of ANOVA and refine the understanding of variance comparisons in future lectures.