ANOVA and Correlation Notes

Chapter 12: ANOVA (Analysis of Variance)

  • Used to evaluate mean differences between two or more treatment groups.
  • Uses sample data to draw conclusions about populations.
  • Advantage over T-test: Can compare more than two groups simultaneously.
  • Provides more flexibility in study design and interpretation.
  • Goal: Determine if observed mean differences among samples are significant enough to conclude there are mean differences among populations.

Hypotheses

  1. Null Hypothesis (H_0): No differences among the population means; observed differences among sample means are due to chance (sampling error). Represented as H_0 : \mu_1 = \mu_2 = \mu_3.
  2. Alternative Hypothesis (H_1): At least one population mean is different, which causes the differences among sample means. This can take several forms, e.g. H_1 : \mu_1 \neq \mu_2 \neq \mu_3, or H_1 : \mu_1 = \mu_2 but \mu_3 is different.

Factors and Levels

  • Factor: The independent variable designating the groups being compared.
  • Levels of the Factor: The conditions that make up the factor.
  • Factorial Design: A study that combines two or more factors.

Type I Error and ANOVA

  • ANOVA limits the risk of a Type I error by evaluating all of the mean differences in a single test rather than a series of separate tests.
  • Testwise Alpha Level: The risk of Type I error for a single hypothesis test.
  • Experimentwise Alpha Level: The total probability of a Type I error accumulated from all individual tests. ANOVA has a lower experimentwise alpha error.
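The accumulation of Type I error risk can be made concrete. A minimal sketch (my own illustration, not from the notes), assuming the individual tests are independent:

```python
# Experimentwise alpha: P(at least one Type I error) across several
# independent tests, each run at the same testwise alpha level.
def experimentwise_alpha(alpha, num_tests):
    return 1 - (1 - alpha) ** num_tests

# Comparing 3 groups pairwise requires 3 separate t tests; ANOVA uses one test.
print(round(experimentwise_alpha(0.05, 1), 4))  # 0.05   (a single test, e.g. ANOVA)
print(round(experimentwise_alpha(0.05, 3), 4))  # 0.1426 (three pairwise t tests)
```

This is why the experimentwise alpha level for a series of t tests exceeds the testwise alpha level of any one of them.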

ANOVA Test Statistic

  • F-ratio: F = \frac{Variance \ between \ sample \ means}{Variance \ expected \ with \ no \ treatment \ effect}

Conducting ANOVA Test

  1. Determine total variability.
    • Between-Treatment Variance: Measures differences between sample means to get an overall variance measure.
    • Within-Treatment Variance: Measures variability/differences between values within each treatment.

Between-Treatment Variance

  • Can occur for two reasons:
    • Naturally occurring differences (sampling error).
    • Differences caused by treatment effects.
  • Measurement: Compare variances between treatments against sampling error without treatment effect.

Within-Treatment Variance

  • Provides a measure of how big the differences are expected to be by chance alone, i.e., when (H_0) is true.

F-Ratio for Independent Measures

  • Compares between- and within-treatment components.
  • F = \frac{Variance \ between \ treatments \ (differences \ including \ any \ treatment \ effects)}{Variance \ within \ treatments \ (differences \ with \ no \ treatment \ effects)}
  • If the F-ratio is near 1.00, the differences between treatments are about the size expected from chance alone, suggesting no treatment effect.
  • If the F-ratio is much larger than 1.00, the numerator (systematic differences plus chance) is substantially larger than the denominator (chance alone), indicating a treatment effect.
  • Error Term: The denominator of the F-ratio, measuring only random variability.

ANOVA Notation

  • (k): Number of treatment conditions/separate samples.
  • (n): Number of scores in each treatment.
  • (N): Total number of scores in the entire study.
  • (T): (\sum X) sum of scores for each treatment condition.
  • (G): Sum of all scores in the study.

Formulas

  • Sample Variance: s^2 = \frac{SS}{df}
  • Sum of Squares Total: SS_{total} = \sum X^2 - \frac{G^2}{N}
    • Degrees of Freedom Total: df_{total} = N - 1
  • Sum of Squares Within-Treatment: SS_{within\ treatment} = \sum SS \ of \ each \ treatment \ group
    • Degrees of Freedom Within: df_{within} = \sum (n - 1) = \sum df \ in \ each \ treatment
  • Sum of Squares Between-Treatment: SS_{between} = SS_{total} - SS_{within}
    • Degrees of Freedom Between: df_{between} = k - 1
  • F-Ratio Calculation: F = \frac{MS_{between}}{MS_{within}}, where each Mean Square is MS = \frac{SS}{df}.
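The formulas above can be put together into a single computation. A minimal sketch with made-up data (k = 3 treatments, n = 3 scores each; the function and variable names are my own):

```python
# One-way ANOVA F-ratio from raw scores, following the SS/df/MS formulas.
def one_way_anova(groups):
    k = len(groups)
    n_total = sum(len(g) for g in groups)            # N
    grand = sum(sum(g) for g in groups)              # G
    sum_x2 = sum(x * x for g in groups for x in g)   # sum of X^2

    ss_total = sum_x2 - grand ** 2 / n_total         # SS_total = sum X^2 - G^2/N
    # SS within one treatment: sum of (X - M)^2 inside that group
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    ss_between = ss_total - ss_within

    df_between = k - 1
    df_within = sum(len(g) - 1 for g in groups)
    ms_between = ss_between / df_between             # MS = SS / df
    ms_within = ss_within / df_within
    return ms_between / ms_within                    # F-ratio

f = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(round(f, 2))  # 13.0
```

Here SS_total = 32, SS_within = 6, so SS_between = 26, giving MS_between = 13 and MS_within = 1.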

ANOVA and Hypothesis Testing

  • Distribution of F: Expected to be around 1.00.
    • F-ratios are always positive.
    • When (H_0) is true, the ratio should be near 1.00.
    • With smaller df, F values are more spread out.

Steps for Hypothesis Testing

  1. State Hypotheses:
    • (H_0 : \mu_1 = \mu_2 = \mu_3)
    • (H_1): At least one of the treatment means is different.
  2. Locate the Critical Region for F:
    • Find df total, within, and between.
    • Use df within and df between to find the critical F-value.
  3. Compute the Observed F Test Statistic:
    • Find all 3 SS (total, within, between).
    • Calculate Mean Squares.
    • Calculate F.
  4. Make a Statistical Decision About the Null Hypothesis:
    • If F value is within the critical region, reject (H_0) and accept the alternative.
    • If F value is not within the critical region, fail to reject (H_0).
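The four steps can be traced end to end. A sketch with made-up summary values (k = 3, N = 9, SS_total = 32, SS_within = 6; the critical value F(2, 6) = 5.14 for alpha = .05 is a standard F-table entry):

```python
# Steps 2-4 of the ANOVA hypothesis test, starting from summary SS values.
k, N = 3, 9
ss_total, ss_within = 32.0, 6.0

# Step 2: degrees of freedom and the critical region
df_total = N - 1                      # 8
df_between = k - 1                    # 2
df_within = df_total - df_between     # 6
f_critical = 5.14                     # F(2, 6) at alpha = .05, from an F table

# Step 3: mean squares and the observed F
ss_between = ss_total - ss_within     # 26
ms_between = ss_between / df_between  # 13.0
ms_within = ss_within / df_within     # 1.0
f = ms_between / ms_within            # 13.0

# Step 4: statistical decision
decision = "reject H0" if f >= f_critical else "fail to reject H0"
print(f, decision)  # 13.0 reject H0
```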

Chapter 14: Two-Factor Analysis

Two-Factor ANOVA

  • Allows examination of three types of mean differences within one analysis.
  • Goal: Evaluate mean differences produced by factors acting independently or together.

Main Effects

  • Main Effect: Mean differences among the levels of one factor. Example: if the overall mean for one gender is 8 and the other is 4, the main effect for the gender factor is 8 - 4 = 4.
  • Evaluating the main effects accounts for two of the three hypothesis tests in a two-factor ANOVA.
  • Hypotheses:
    • For Factor A: (H_0 : \mu_{A1} = \mu_{A2}) and (H_1 : \mu_{A1} \neq \mu_{A2})
    • For Factor B: (H_0 : \mu_{B1} = \mu_{B2}) and (H_1 : \mu_{B1} \neq \mu_{B2})

Interactions

  • Interaction: Extra mean differences that are not explained by the main effects.
  • If the difference between the levels of one factor changes across the levels of the other factor (i.e., is not constant), there is an interaction.
  • Calculated by: F = \frac{Variance \ (mean \ differences \ not \ explained \ by \ main \ effect)}{Variance \ (differences \ expected \ if \ no \ treatment)}
  • Hypotheses:
    • (H_0): There is no interaction between A and B.
    • (H_1): There is an interaction between A and B.
  • If levels of one factor depend on the levels of the other, there is an interaction.
  • Non-parallel lines on a graph indicate an interaction.

Two-Factor ANOVA Hypothesis Test

  1. A effect.
  2. B effect.
  3. (A \times B) Interaction.

Examples of Main Effects and Interactions

  • Main effect of 10, no interaction: +10, +10, +20, +20
  • Main effect, interaction: +10, +10, -10, -10
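These patterns can be checked numerically from the cell means of a 2x2 design. A sketch using my own numbers (rows = factor A, columns = factor B):

```python
# Main effects are row/column mean differences; an interaction exists when the
# A difference is not the same in both columns (non-parallel lines on a graph).
def describe(cells):
    (a1b1, a1b2), (a2b1, a2b2) = cells
    a_effect = (a2b1 + a2b2) / 2 - (a1b1 + a1b2) / 2   # row-mean difference
    b_effect = (a1b2 + a2b2) / 2 - (a1b1 + a2b1) / 2   # column-mean difference
    interaction = (a2b1 - a1b1) != (a2b2 - a1b2)       # parallel lines -> False
    return a_effect, b_effect, interaction

print(describe([[10, 10], [20, 20]]))  # (10.0, 0.0, False): A main effect only
print(describe([[10, 30], [30, 10]]))  # (0.0, 0.0, True): interaction only
```

In the second table both main effects are zero, yet the cell means clearly differ: all of the differences come from the interaction.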

Analysis of Two-Factor ANOVA

  • Total variance is split into between-treatment and within-treatment variance.
  • Between-treatment variance is further divided into factor A variance, factor B variance, and interaction variance.

Stages of Analysis

  1. Total Variability:
    • SS_{total} = \sum X^2 - \frac{G^2}{N}
    • df_{total} = N - 1
  2. Within-Treatment:
    • SS_{within} = \sum SS \ each \ treatment
    • df_{within} = \sum df \ each \ treatment
  3. Between-Treatment:
    • SS_{between} = SS_{total} - SS_{within}
    • df_{between} = # \ of \ cells - 1
  4. Factor A:
    • SS_A = …
    • df_A = # \ of \ Rows - 1
  5. Factor B:
    • SS_B = …
    • df_B = # \ of \ Columns - 1
  6. (A \times B) Interaction:
    • SS_{A \times B} = SS_{between} - SS_A - SS_B
    • df_{A \times B} = df_{between} - df_A - df_B
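The six stages can be sketched for a small 2x2 design with n = 2 scores per cell (data and names are my own; `cells[r][c]` holds the scores for row r of factor A, column c of factor B). For factors, SS uses row/column totals: SS = sum of T^2 / n_T minus G^2 / N.

```python
# Partition of SS for a two-factor ANOVA, following stages 1-6.
def two_factor_ss(cells):
    scores = [x for row in cells for cell in row for x in cell]
    N, G = len(scores), sum(scores)
    sum_x2 = sum(x * x for x in scores)

    ss_total = sum_x2 - G ** 2 / N                        # stage 1
    ss_within = sum(sum((x - sum(c) / len(c)) ** 2 for x in c)
                    for row in cells for c in row)        # stage 2
    ss_between = ss_total - ss_within                     # stage 3

    row_tot = [sum(sum(c) for c in row) for row in cells]
    col_tot = [sum(sum(row[j]) for row in cells) for j in range(len(cells[0]))]
    n_row = N // len(cells)        # scores per row
    n_col = N // len(cells[0])     # scores per column
    ss_a = sum(t * t / n_row for t in row_tot) - G ** 2 / N   # stage 4
    ss_b = sum(t * t / n_col for t in col_tot) - G ** 2 / N   # stage 5
    ss_axb = ss_between - ss_a - ss_b                         # stage 6
    return ss_total, ss_within, ss_between, ss_a, ss_b, ss_axb

cells = [[[1, 3], [2, 4]],
         [[5, 7], [7, 9]]]
print(two_factor_ss(cells))  # (53.5, 8.0, 45.5, 40.5, 4.5, 0.5)
```

Note that SS_A + SS_B + SS_{A x B} adds back up to SS_{between}, mirroring how the between-treatment variance is divided.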

F-Ratios and Critical Values

  • 3 F-ratios for the 3 variances.
  • Finding Critical F Value:
    • Numerator: df for the effect being tested (between-groups df).
    • Denominator: within-groups df.
    • df_{within} = df_{total} - df_A - df_B - df_{A \times B}
    • (\alpha = 0.05)
    • Critical F = (F(df_{between}, df_{within}))

Chapter 15: Correlation

Correlation

  • Measures and describes the relationship between two variables by measuring two different variables (X and Y) for the same individuals.

Characteristics of a Relationship

  • Direction (negative or positive).
  • Form (linear - Pearson correlation).

Pearson Correlation

  • Measures the degree and direction of a linear relationship.
  • r = \frac{Covariability \ of \ X \ and \ Y}{Variability \ of \ X \ and \ Y \ separately}

Sum of Products of Deviations

  • Similar to SS, measures the amount of covariability between two variables.
  • Definitional Formula: SP = \sum (X - M_X)(Y - M_Y)

Pearson Correlation Formula

  • r = \frac{SP}{\sqrt{SS_X \, SS_Y}}
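The SP and r formulas translate directly into code. A minimal sketch with made-up data (function name is my own):

```python
import math

# Pearson r: covariability (SP) over the separate variabilities (SS_X, SS_Y).
def pearson_r(xs, ys):
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))   # SP
    ss_x = sum((x - mx) ** 2 for x in xs)                   # SS_X
    ss_y = sum((y - my) ** 2 for y in ys)                   # SS_Y
    return sp / math.sqrt(ss_x * ss_y)

print(pearson_r([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.8
```

For these data SP = 4 and SS_X = SS_Y = 5, so r = 4 / sqrt(25) = 0.8, a strong positive linear relationship.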

Hypothesis Testing

  • Asks whether a correlation exists in the population.
  • Hypotheses:
    • (H_0 : \rho = 0) ((\rho) = population correlation)
    • (H_1 : \rho \neq 0)
  • Test Statistic: t = \frac{r}{\sqrt{(1 - r^2)/(n - 2)}}
  • df = n - 2
  • Effect Size: r^2, the coefficient of determination (the proportion of variability in one variable explained by the other).
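The test of H_0 : \rho = 0 can be sketched with the standard t statistic, t = r \sqrt{(n - 2)/(1 - r^2)} with df = n - 2 (my own numbers below; the critical value t(10) = 2.228 for a two-tailed test at alpha = .05 comes from a t table):

```python
import math

# t statistic for testing whether a sample correlation r (based on n pairs)
# indicates a nonzero correlation in the population.
def correlation_t(r, n):
    return r * math.sqrt((n - 2) / (1 - r ** 2))

r, n = 0.8, 12
t = correlation_t(r, n)        # df = n - 2 = 10
print(round(t, 3))             # 4.216
print("reject H0" if abs(t) >= 2.228 else "fail to reject H0")  # reject H0
print(round(r ** 2, 2))        # 0.64  (effect size: r squared)
```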