ANOVA and Correlation Notes
Chapter 12: ANOVA (Analysis of Variance)
- Used to evaluate mean differences between two or more treatment groups.
- Uses sample data to draw conclusions about populations.
- Advantage over the t test: can compare more than two groups simultaneously.
- Provides more flexibility in study design and interpretation.
- Goal: Determine if observed mean differences among samples are significant enough to conclude there are mean differences among populations.
Hypotheses
- Null Hypothesis ((H_0)): No differences among the populations; observed sample mean differences are due to chance or sampling error. Represented as H_0 : \mu_1 = \mu_2 = \mu_3.
- Alternative Hypothesis ((H_1)): The populations have different means, causing differences in the sample means. Represented as H_1 : \mu_1 \neq \mu_2 \neq \mu_3, or H_1 : \mu_1 = \mu_2 but (\mu_3) is different.
Factors and Levels
- Factor: The independent variable designating the groups being compared.
- Levels of the Factor: The conditions that make up the factor.
- Factorial Design: A study that combines two or more factors.
Type I Error and ANOVA
- ANOVA limits Type I error by performing all comparisons at once.
- Testwise Alpha Level: The risk of Type I error for a single hypothesis test.
- Experimentwise Alpha Level: The total probability of a Type I error accumulated from all individual tests. ANOVA has a lower experimentwise alpha error.
ANOVA Test Statistic
- F-ratio: F = \frac{Variance \ between \ sample \ means}{Variance \ expected \ with \ no \ treatment \ effect}
Conducting ANOVA Test
- Determine total variability.
- Between-Treatment Variance: Measures differences between sample means to get an overall variance measure.
- Within-Treatment Variance: Measures variability/differences between values within each treatment.
Between-Treatment Variance
- Can occur for two reasons:
- Naturally occurring differences (sampling error).
- Differences caused by treatment effects.
- Measurement: Compare variances between treatments against sampling error without treatment effect.
Within-Treatment Variance
- Provides a measure of how big the differences are when (H_0) is true.
F-Ratio for Independent Measures
- Compares between- and within-treatment components.
- F = \frac{Variance \ between \ treatments \ (differences \ including \ any \ treatment \ effects)}{Variance \ within \ treatments \ (differences \ with \ no \ treatment \ effects)}
- If the F-ratio is near 1.00, the differences between treatments are random and unsystematic, suggesting no treatment effect.
- If the F-ratio is much larger than 1.00, the numerator is substantially larger than the denominator, indicating systematic differences between treatments.
- Error Term: The denominator of the F-ratio, measuring only random variability.
ANOVA Notation
- (k): Number of treatment conditions/separate samples.
- (n): Number of scores in each treatment.
- (N): Total number of scores in the entire study.
- (T): Sum of scores ((\sum X)) for each treatment condition.
- (G): Sum of all scores in the study.
- Sample Variance: (s^2)
- Sum of Squares Total: SS_{total} = \sum X^2 - \frac{G^2}{N}
- Degrees of Freedom Total: df_{total} = N - 1
- Sum of Squares Within-Treatment: SS_{within\ treatment} = \sum SS \ of \ each \ treatment \ group
- Degrees of Freedom Within: df_{within} = \sum (n - 1) = \sum df \ in \ each \ treatment
- Sum of Squares Between-Treatment: SS_{between} = SS_{total} - SS_{within}
- Degrees of Freedom Between: df_{between} = k - 1
- F-Ratio Calculation: F = \frac{MS_{between}}{MS_{within}}, where each Mean Square is MS = \frac{SS}{df} (e.g., MS_{between} = \frac{SS_{between}}{df_{between}}).
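The notation above can be assembled into a worked one-way ANOVA. The data below are hypothetical (k = 3 treatments, n = 4 scores each); every step follows the formulas in this list:

```python
# One-way ANOVA computed from raw scores, following the notation above.
# Hypothetical data: k = 3 treatment conditions, n = 4 scores each.
treatments = [
    [1, 2, 3, 2],   # treatment 1: T = 8
    [4, 5, 6, 5],   # treatment 2: T = 20
    [7, 8, 9, 8],   # treatment 3: T = 32
]

k = len(treatments)                      # number of treatment conditions
N = sum(len(t) for t in treatments)      # total number of scores
G = sum(sum(t) for t in treatments)      # grand total of all scores

# SS_total = sum(X^2) - G^2 / N
sum_x2 = sum(x * x for t in treatments for x in t)
ss_total = sum_x2 - G**2 / N

# SS_within = sum of the SS inside each treatment group
def ss(scores):
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

ss_within = sum(ss(t) for t in treatments)
ss_between = ss_total - ss_within

df_total = N - 1
df_within = sum(len(t) - 1 for t in treatments)
df_between = k - 1

ms_between = ss_between / df_between     # MS = SS / df
ms_within = ss_within / df_within
F = ms_between / ms_within
print(ss_between, ss_within, F)
```

For these data SS_between = 72, SS_within = 6, and F = 36 / (6/9) = 54, a ratio far above 1.00.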
ANOVA and Hypothesis Testing
- Distribution of F: Expected to be around 1.00.
- F-ratios are always positive.
- When (H_0) is true, the ratio should be near 1.00.
- With smaller df, F values are more spread out.
Steps for Hypothesis Testing
- State Hypotheses:
- (H_0 : \mu_1 = \mu_2 = \mu_3)
- (H_1): At least one of the treatment means is different.
- Locate the Critical Region for F:
- Find df total, within, and between.
- Use df within and df between to find the critical F-value.
- Compute the Observed F Test Statistic:
- Find all 3 SS (total, within, between).
- Calculate Mean Squares.
- Calculate F.
- Make a Statistical Decision About the Null Hypothesis:
- If F value is within the critical region, reject (H_0) and accept the alternative.
- If F value is not within the critical region, fail to reject (H_0).
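The decision step can be sketched as follows. The observed F is hypothetical, and the critical value is taken from a standard F distribution table for df = (2, 12) at (\alpha = .05):

```python
# Decision step of the ANOVA hypothesis test (a minimal sketch).
# The observed F-ratio is hypothetical; the critical value 3.88 is read
# from a standard F table for df = (2, 12) at alpha = .05.
f_observed = 5.50   # hypothetical observed F from the previous steps
f_critical = 3.88   # F(.05; 2, 12), from an F distribution table

if f_observed > f_critical:
    decision = "reject H0: at least one treatment mean is different"
else:
    decision = "fail to reject H0"
print(decision)
```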
Chapter 14: Two-Factor Analysis
Two-Factor ANOVA
- Allows examination of three types of mean differences within one analysis.
- Goal: Evaluate mean differences produced by factors acting independently or together.
Main Effects
- Main Effect: The mean differences among the levels of one factor. Example: for a gender factor, the difference between the overall male mean and the overall female mean is the main effect of gender.
- Evaluating the main effects accounts for two of the three hypothesis tests in a two-factor ANOVA.
- Hypotheses:
- For Factor A: (H_0 : \mu_{A1} = \mu_{A2}) and (H_1 : \mu_{A1} \neq \mu_{A2})
- For Factor B: (H_0 : \mu_{B1} = \mu_{B2}) and (H_1 : \mu_{B1} \neq \mu_{B2})
Interactions
- Interaction: Any extra differences not caused by the main effects.
- If the difference between factors is not constant, there is an interaction.
- Calculated by: F = \frac{Variance \ (mean \ differences \ not \ explained \ by \ main \ effects)}{Variance \ (differences \ expected \ if \ there \ is \ no \ treatment \ effect)}
- Hypotheses:
- (H_0): There is no interaction between A and B.
- (H_1): There is an interaction between A and B.
- If levels of one factor depend on the levels of the other, there is an interaction.
- Non-parallel lines on a graph indicate an interaction.
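The parallel-lines check can be sketched with hypothetical 2×2 cell means: compute the A1-to-A2 difference at each level of B, and unequal differences (non-parallel lines) suggest an interaction. (With real sample means, the F-test rather than exact equality makes the decision.)

```python
# Checking for an interaction in a 2x2 design from the cell means.
# If the A1-to-A2 difference is the same at every level of B, the lines
# are parallel and there is no interaction. Hypothetical cell means:
means = {
    ("A1", "B1"): 10, ("A2", "B1"): 20,   # difference at B1: +10
    ("A1", "B2"): 30, ("A2", "B2"): 40,   # difference at B2: +10
}

diff_b1 = means[("A2", "B1")] - means[("A1", "B1")]
diff_b2 = means[("A2", "B2")] - means[("A1", "B2")]
interaction = diff_b1 != diff_b2   # unequal differences -> non-parallel lines
print(interaction)
```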
Two-Factor ANOVA Hypothesis Test
- A effect.
- B effect.
- (A \times B) Interaction.
Examples of Main Effects and Interactions
- Main effect with no interaction: the effect of one factor is the same at every level of the other (e.g., a +10 difference at both levels).
- Main effect with an interaction: the effect of one factor changes across the levels of the other (e.g., +10 at one level, -10 at the other).
Analysis of Two-Factor ANOVA
- Total variance is split into between-treatment and within-treatment variance.
- Between-treatment variance is further divided into factor A variance, factor B variance, and interaction variance.
Stages of Analysis
- Total Variability:
- SS_{total} = \sum X^2 - \frac{G^2}{N}
- df_{total} = N - 1
- Within-Treatment:
- SS_{within} = \sum SS \ each \ treatment
- df_{within} = \sum df \ each \ treatment
- Between-Treatment:
- SS_{between} = SS_{total} - SS_{within}
- df_{between} = \text{number of cells} - 1
- Factor A:
- SS_A = …
- df_A = \text{number of rows} - 1
- Factor B:
- SS_B = …
- df_B = \text{number of columns} - 1
- (A \times B) Interaction:
- SS_{A \times B} = SS_{between} - SS_A - SS_B
- df_{A \times B} = df_{between} - df_A - df_B
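The stages above can be sketched with hypothetical cell means and equal cell sizes. Since the notes elide the SS_A and SS_B formulas, this sketch assumes the standard row/column versions (sample size times the squared deviation of each row or column mean from the grand mean):

```python
# Partitioning SS_between for a 2x2 two-factor design (hypothetical data,
# n scores per cell, equal cell sizes). Assumed standard formulas:
# SS_A = sum over rows of (n * cols) * (row mean - grand mean)^2, etc.
n = 5                                   # scores per cell
cell_means = [[2.0, 4.0],               # rows = levels of factor A
              [6.0, 12.0]]              # columns = levels of factor B

rows = len(cell_means)
cols = len(cell_means[0])
grand = sum(sum(r) for r in cell_means) / (rows * cols)

row_means = [sum(r) / cols for r in cell_means]
col_means = [sum(cell_means[i][j] for i in range(rows)) / rows
             for j in range(cols)]

ss_between = sum(n * (m - grand) ** 2 for r in cell_means for m in r)
ss_a = sum(n * cols * (m - grand) ** 2 for m in row_means)
ss_b = sum(n * rows * (m - grand) ** 2 for m in col_means)
ss_axb = ss_between - ss_a - ss_b        # SS_AxB = SS_between - SS_A - SS_B

df_between = rows * cols - 1
df_a, df_b = rows - 1, cols - 1
df_axb = df_between - df_a - df_b        # df_AxB = df_between - df_A - df_B
print(ss_a, ss_b, ss_axb, df_axb)
```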
F-Ratios and Critical Values
- 3 F-ratios for the 3 variances.
- Finding Critical F Value:
- Numerator: between groups df.
- Denominator: within groups df.
- df_{within} = df_{total} - df_A - df_B - df_{A \times B}
- (\alpha = 0.05)
- Critical F = (F(df_{between}, df_{within}))
Chapter 15: Correlation
Correlation
- Measures and describes the relationship between two variables; two different variables (X and Y) are measured for the same individuals.
Characteristics of a Relationship
- Direction (negative or positive).
- Form (linear - Pearson correlation).
Pearson Correlation
- Measures the degree and direction of a linear relationship.
- r = \frac{Covariability \ of \ X \ and \ Y}{Variability \ of \ X \ and \ Y \ separately}
Sum of Products of Deviations
- Similar to SS, measures the amount of covariability between two variables.
- Definitional Formula: SP = \sum (X - M_X)(Y - M_Y)
- r = \frac{SP}{\sqrt{SS_X \, SS_Y}}
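A minimal sketch of the SP and r formulas on hypothetical data; here Y is an exact linear function of X, so r should come out to 1.0:

```python
# Pearson r from the definitional formulas: r = SP / sqrt(SS_X * SS_Y).
import math

X = [1, 2, 3, 4, 5]
Y = [2, 4, 6, 8, 10]          # perfectly linear in X, so r should be 1.0

mx = sum(X) / len(X)          # M_X
my = sum(Y) / len(Y)          # M_Y

sp = sum((x - mx) * (y - my) for x, y in zip(X, Y))   # sum of products
ss_x = sum((x - mx) ** 2 for x in X)
ss_y = sum((y - my) ** 2 for y in Y)

r = sp / math.sqrt(ss_x * ss_y)
print(r)
```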
Hypothesis Testing
- Asks whether a correlation exists in the population.
- Hypotheses:
- (H_0 : \rho = 0) ((\rho) = population correlation)
- (H_1 : \rho \neq 0)
- Test Statistic: t = r \sqrt{\frac{n - 2}{1 - r^2}}
- df = n - 2
- Effect Size: r^2, the coefficient of determination (the proportion of variability in one variable explained by the other).
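A sketch of the significance test, assuming the standard t statistic for a Pearson correlation, t = r·√((n − 2)/(1 − r²)) with df = n − 2; the r and n values here are hypothetical:

```python
# t statistic for testing H0: rho = 0, using the standard formula
# t = r * sqrt((n - 2) / (1 - r^2)) with df = n - 2.
import math

r = 0.60      # hypothetical sample correlation
n = 27        # hypothetical sample size

df = n - 2
t = r * math.sqrt(df / (1 - r ** 2))
r_squared = r ** 2            # effect size: proportion of variance explained
print(round(t, 3), df, r_squared)
```

For these values t = 0.60·√(25/0.64) = 3.75, which would be compared against the critical t for df = 25.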