Introduction to Analysis of Variance (ANOVA)

ANOVA

  • A hypothesis-testing procedure used to evaluate mean differences between two or more treatments or populations.

  • Involves one independent variable or factor with two or more levels.

  • Typically used to determine whether differences exist among three or more population means based on sample data (with only two groups, ANOVA is equivalent to a t test).

  • Can be used with independent measures or repeated measures.

ANOVA Usage

  • Can also be used to evaluate results from research studies with more than one factor or independent variable (IV).

  • Example: Two-factor Design or Factorial Design.

Factorial Design

  • Involves multiple factors (e.g., Anxiety, Audience) and their levels (Low, High, With, Without).

  • Treatment conditions are indicated by combining different levels of each factor.

    • Low anxiety with audience.

    • High anxiety with audience.

    • Low anxiety without audience.

    • High anxiety without audience.

Applications of ANOVA

  • Can be applied in various research situations.

  • Includes:

    • Single-factor design.

    • Independent measures.

    • Repeated measures.

    • Factorial design.

Statistical Hypotheses for ANOVA

  • Example: Examining learning performance under three temperature conditions: 50, 70, and 90 degrees.

  • Three samples are selected, one for each treatment condition.

  • Purpose: Determine whether room temperature affects learning performance.

  • H_0: Room temperature has no effect on learning performance.

  • H_1: Room temperature has an effect on learning performance.

Hypotheses

  • Null Hypothesis (H_0): \mu_1 = \mu_2 = \mu_3

  • Alternative Hypothesis (H_1): At least one population mean is different from another.

  • Examples of alternative hypotheses:

    • H_1: \mu_1 \neq \mu_2 \neq \mu_3

    • H_1: \mu_1 = \mu_3, but \mu_2 is different.

Test Statistic for ANOVA

  • Similar to t statistics.

  • Test statistic is called an F-ratio.

  • Based on variance instead of sample mean difference.

  • Variance is used because, unlike a simple mean difference, it can describe the differences among any number of sample means at once.

  • F = \frac{\text{Variance (difference) between sample means}}{\text{Variance (difference) expected by chance (error)}}

Logic of ANOVA

  • Example: Three treatments with different temperatures (15, 24, 34 degrees).

  • Samples from each treatment are measured.

    • Treatment 1 (15 degrees): 0, 1, 3, 1, 0; M = 1

    • Treatment 2 (24 degrees): 4, 3, 6, 3, 4; M = 4

    • Treatment 3 (34 degrees): 1, 2, 2, 0, 0; M = 1
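
The three sample means above can be verified with a short calculation (pure Python; the data are taken directly from the example):

```python
# Scores from the three temperature conditions in the example.
treatment1 = [0, 1, 3, 1, 0]   # 15 degrees
treatment2 = [4, 3, 6, 3, 4]   # 24 degrees
treatment3 = [1, 2, 2, 0, 0]   # 34 degrees

def mean(scores):
    """Sample mean: sum of the scores divided by n."""
    return sum(scores) / len(scores)

for label, scores in [("M1", treatment1), ("M2", treatment2), ("M3", treatment3)]:
    print(label, "=", mean(scores))
# M1 = 1.0, M2 = 4.0, M3 = 1.0
```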

Steps

  • Determine the total variability for the entire data set by combining all scores from separate samples.

  • Break down the total variability into separate components and analyze their variability.

  • Between-Treatments Variance: Difference between sample means.

  • Within-Treatments Variance: Variability inside each treatment condition.

Purpose

  • Analyze between-treatments variance to distinguish between two alternative explanations:

    • Differences between treatments are significantly greater than can be explained by chance alone (treatment effects).

    • Differences between treatments are simply due to chance.

  • Measure chance differences by computing the variance within treatments.

Within-Treatments Variance

  • Inside each treatment condition, individuals are treated the same, but scores may differ due to chance.

  • Within-treatments variance measures how much difference is reasonable to expect by chance.

  • Variability when there is no treatment that could cause differences.

Components of Variability

  • Total Variability:

    • Between-treatments variance measures differences due to:

      1. Treatment effects.

      2. Chance.

    • Within-treatments variance measures differences due to:

      1. Chance.

The F-Ratio

  • Compares variability between and within treatments.

  • When the treatment effect is zero, the F-ratio is expected to be nearly 1.00.

  • Denominator is also known as the error term.

  • F = \frac{\text{Variance between treatments}}{\text{Variance within treatments}}

  • F = \frac{\text{Treatment effect + Differences due to chance}}{\text{Differences due to chance}}

ANOVA Notation

  • k = the number of treatment conditions.

  • n = the number of scores in each treatment.

  • N = total number of scores.

  • T = the total for each treatment condition (\Sigma X).

  • G = sum of all scores in the research study (\Sigma T).

  • SS = sum of squares.

  • M = mean.

  • MS = mean square (variance).

Analysis of Sum of Squares

  • Total Sum of Squares (SS_{Total}).

  • Within-Treatments Sum of Squares (SS_{w/i}): SS_{w/i} = \Sigma SS_{\text{inside each treatment}}

  • Between-Treatments Sum of Squares (SS_{Between}): Differences between sample means.

  • SS_{Total} = \Sigma X^2 - \frac{G^2}{N}

  • SS_{Between} = \Sigma \frac{T^2}{n} - \frac{G^2}{N}
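
Applying these formulas to the example data from the Logic of ANOVA section (a minimal sketch in plain Python):

```python
# The three treatment conditions from the example.
treatments = [
    [0, 1, 3, 1, 0],  # T = 5
    [4, 3, 6, 3, 4],  # T = 20
    [1, 2, 2, 0, 0],  # T = 5
]

all_scores = [x for t in treatments for x in t]
N = len(all_scores)                      # total number of scores (15)
G = sum(all_scores)                      # grand total, Sigma T (30)
sum_x2 = sum(x * x for x in all_scores)  # Sigma X^2

ss_total = sum_x2 - G**2 / N                                    # SS_total
ss_between = sum(sum(t)**2 / len(t) for t in treatments) - G**2 / N
ss_within = ss_total - ss_between  # equals Sigma SS inside each treatment

print(ss_total, ss_between, ss_within)  # 46.0 30.0 16.0
```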

Analysis of Degrees of Freedom

  • Total Degrees of Freedom (df_{total}): df_{total} = N - 1

  • Within-Treatment Degrees of Freedom (df_{w/i}): df_{w/i} = N - k

  • Between-Treatment Degrees of Freedom (df_{between}): df_{between} = k - 1

Calculation of Variances

  • Variance between treatments (MS_{between}): MS_{between} = \frac{SS_{between}}{df_{between}}

  • Variance within treatments (MS_{w/i}): MS_{w/i} = \frac{SS_{w/i}}{df_{w/i}}

F-ratio Calculation

  • F-ratio: F = \frac{MS_{between}}{MS_{w/i}}
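
Putting the sums of squares, degrees of freedom, and mean squares together for the example data (k = 3 treatments, n = 5 scores each):

```python
treatments = [[0, 1, 3, 1, 0], [4, 3, 6, 3, 4], [1, 2, 2, 0, 0]]

k = len(treatments)      # number of treatment conditions
n = len(treatments[0])   # scores per treatment (equal n here)
N = k * n
G = sum(sum(t) for t in treatments)
sum_x2 = sum(x * x for t in treatments for x in t)

ss_between = sum(sum(t)**2 / n for t in treatments) - G**2 / N  # 30.0
ss_within = (sum_x2 - G**2 / N) - ss_between                    # 16.0

df_between = k - 1                     # 2
df_within = N - k                      # 12
ms_between = ss_between / df_between   # 30 / 2 = 15.0
ms_within = ss_within / df_within      # 16 / 12 ~ 1.333
F = ms_between / ms_within
print(round(F, 2))  # 11.25
```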

The F Distribution

  • F values are always positive.

  • When H_0 is true, the numerator and denominator of the F-ratio measure the same variance.

  • In this case, the two sample variances should be about the same size, so the ratio should be near 1.00.

  • The distribution of F-ratios piles up around one.

  • Based on the df of the numerator and denominator, and the alpha level, reject H_0 if the obtained value is greater than the critical F value.
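
The critical F value for a given alpha and df pair can be looked up with scipy's F distribution (assuming scipy is available; the df values below are the ones from the worked example, 2 and 12):

```python
from scipy.stats import f  # assumes scipy is installed

alpha = 0.05
df_between, df_within = 2, 12
f_crit = f.ppf(1 - alpha, df_between, df_within)  # upper-tail critical value
print(round(f_crit, 2))  # ~3.89

# The obtained F = 11.25 exceeds this critical value, so H_0 is rejected.
```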

F Distribution Example

  • Critical value F = 2.67 (alpha = 0.05).

Post Hoc Tests

  • The F-ratio indicates that somewhere among the entire set of mean differences, there is at least one that is greater than would be expected by chance.

  • It does not tell exactly which means are significantly different and which are not.

  • Identifying these specific differences is the purpose of post hoc tests.

  • Done after the analysis of variance.

Tukey’s Honestly Significant Difference (HSD)

  • Computes a single value that determines the minimum difference between treatment means necessary for significance.

  • Used to compare any two treatment conditions.

  • If the mean difference exceeds the HSD, the treatments are significantly different.

Tukey’s HSD Formula

  • q is a tabled value based on the selected alpha level, the number of treatment conditions (k), and the df for MS_{w/i}.

  • Tukey’s test requires that the sample size (n) for all the treatment conditions be the same.

  • HSD = q \sqrt{\frac{MS_{w/i}}{n}}
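
A sketch of the HSD computation for the example data (this assumes scipy >= 1.7, whose studentized-range distribution replaces a printed q table; with a table, only the last two lines are needed):

```python
import math
from scipy.stats import studentized_range  # assumes scipy >= 1.7

k, n = 3, 5          # treatment conditions, scores per treatment
df_within = 12       # df for MS_w/i
ms_within = 16 / 12  # MS_w/i from the example ANOVA

q = studentized_range.ppf(0.95, k, df_within)  # q(.05, k=3, df=12) ~ 3.77
hsd = q * math.sqrt(ms_within / n)
print(round(hsd, 2))

# M2 - M1 = 4 - 1 = 3 exceeds the HSD, so treatments 1 and 2 differ significantly.
```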

The Scheffé Test

  • Computes an F-ratio using data from only two treatment conditions at a time.

  • Can be used when sample sizes are unequal.

  • Uses the SS_{between} computed from just the two treatments being compared.

  • Maintains the original df_{between} (k - 1) and uses the overall MS_{w/i} as the error term, which makes it one of the most conservative post hoc tests.