ANOVA and Correlation Notes
Chapter 12: ANOVA (Analysis of Variance)
- Used to evaluate mean differences between two or more treatment groups.
- Uses sample data to draw conclusions about populations.
- Advantage over the t test: can compare more than two groups simultaneously.
- Provides more flexibility in study design and interpretation.
- Goal: Determine if observed mean differences among samples are significant enough to conclude there are mean differences among populations.
Hypotheses
- Null Hypothesis ((H_0)): No differences among the populations; observed sample mean differences are due to chance or sampling error. Represented as H_0 : \mu_1 = \mu_2 = \mu_3.
- Alternative Hypothesis ((H_1)): The populations have different means, causing differences in the sample means. Represented as H_1 : \mu_1 \neq \mu_2 \neq \mu_3, or H_1 : \mu_1 = \mu_2 but (\mu_3) is different.
Factors and Levels
- Factor: The independent variable designating the groups being compared.
- Levels of the Factor: The conditions that make up the factor.
- Factorial Design: A study that combines two or more factors.
Type I Error and ANOVA
- ANOVA limits Type I error by performing all comparisons at once.
- Testwise Alpha Level: The risk of Type I error for a single hypothesis test.
- Experimentwise Alpha Level: The total probability of a Type I error accumulated from all individual tests. ANOVA has a lower experimentwise alpha error.
ANOVA Test Statistic
- F-ratio: F = \frac{Variance \ between \ sample \ means}{Variance \ expected \ with \ no \ treatment \ effect}
Conducting ANOVA Test
- Determine total variability.
- Between-Treatment Variance: Measures differences between sample means to get an overall variance measure.
- Within-Treatment Variance: Measures variability/differences between values within each treatment.
Between-Treatment Variance
- Can occur for two reasons:
- Naturally occurring differences (sampling error).
- Differences caused by treatment effects.
- Measurement: Compare variances between treatments against sampling error without treatment effect.
Within-Treatment Variance
- Provides a measure of how big the differences are when (H_0) is true.
F-Ratio for Independent Measures
- Compares between- and within-treatment components.
- F = \frac{Variance \ between \ treatments \ (differences \ including \ any \ treatment \ effects)}{Variance \ within \ treatments \ (differences \ with \ no \ treatment \ effects)}
- If the F-ratio is near 1.00, the differences between treatments are random and unsystematic, suggesting no treatment effect.
- If the F-ratio is much larger than 1.00, the numerator is substantially larger than the denominator, indicating systematic differences between treatments.
- Error Term: The denominator of the F-ratio, measuring only random variability.
ANOVA Notation
- (k): Number of treatment conditions/separate samples.
- (n): Number of scores in each treatment.
- (N): Total number of scores in the entire study.
- (T): Sum of scores ((\sum X)) for each treatment condition.
- (G): Sum of all scores in the study.
- Sample Variance: (s^2)
- Sum of Squares Total: SS_{total} = \sum X^2 - \frac{G^2}{N}
- Degrees of Freedom Total: df_{total} = N - 1
- Sum of Squares Within-Treatment: SS_{within\ treatment} = \sum SS \ of \ each \ treatment \ group
- Degrees of Freedom Within: df_{within} = \sum (n - 1) = \sum df \ in \ each \ treatment
- Sum of Squares Between-Treatment: SS_{between} = SS_{total} - SS_{within}
- Degrees of Freedom Between: df_{between} = k - 1
- F-Ratio Calculation: F = \frac{MS_{between}}{MS_{within}}, where each Mean Square is MS = \frac{SS}{df} (e.g., MS_{between} = \frac{SS_{between}}{df_{between}}).
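The notation above can be assembled into a worked one-way ANOVA. The data below are hypothetical (k = 3 treatments, n = 4 scores each); every step follows the formulas in this list:

```python
# One-way ANOVA computed from raw scores, following the notation above.
# Hypothetical data: k = 3 treatment conditions, n = 4 scores each.
treatments = [
    [1, 2, 3, 2],   # treatment 1: T = 8
    [4, 5, 6, 5],   # treatment 2: T = 20
    [7, 8, 9, 8],   # treatment 3: T = 32
]

k = len(treatments)                      # number of treatment conditions
N = sum(len(t) for t in treatments)      # total number of scores
G = sum(sum(t) for t in treatments)      # grand total of all scores

# SS_total = sum(X^2) - G^2 / N
sum_x2 = sum(x * x for t in treatments for x in t)
ss_total = sum_x2 - G**2 / N

# SS_within = sum of the SS inside each treatment group
def ss(scores):
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

ss_within = sum(ss(t) for t in treatments)
ss_between = ss_total - ss_within

df_total = N - 1
df_within = sum(len(t) - 1 for t in treatments)
df_between = k - 1

ms_between = ss_between / df_between     # MS = SS / df
ms_within = ss_within / df_within
F = ms_between / ms_within
print(ss_between, ss_within, F)
```

For these data SS_between = 72, SS_within = 6, and F = 36 / (6/9) = 54, a ratio far above 1.00.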
ANOVA and Hypothesis Testing
- Distribution of F: Expected to be around 1.00.
- F-ratios are always positive.
- When (H_0) is true, the ratio should be near 1.00.
- With smaller df, F values are more spread out.
Steps for Hypothesis Testing
- State Hypotheses:
- (H_0 : \mu_1 = \mu_2 = \mu_3)
- (H_1): At least one of the treatment means is different.
- Locate the Critical Region for F:
- Find df total, within, and between.
- Use df within and df between to find the critical F-value.
- Compute the Observed F Test Statistic:
- Find all 3 SS (total, within, between).
- Calculate Mean Squares.
- Calculate F.
- Make a Statistical Decision About the Null Hypothesis:
- If F value is within the critical region, reject (H_0) and accept the alternative.
- If F value is not within the critical region, fail to reject (H_0).
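The decision step can be sketched as follows. The observed F is hypothetical, and the critical value is taken from a standard F distribution table for df = (2, 12) at (\alpha = .05):

```python
# Decision step of the ANOVA hypothesis test (a minimal sketch).
# The observed F-ratio is hypothetical; the critical value 3.88 is read
# from a standard F table for df = (2, 12) at alpha = .05.
f_observed = 5.50   # hypothetical observed F from the previous steps
f_critical = 3.88   # F(.05; 2, 12), from an F distribution table

if f_observed > f_critical:
    decision = "reject H0: at least one treatment mean is different"
else:
    decision = "fail to reject H0"
print(decision)
```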
Chapter 14: Two-Factor Analysis
Two-Factor ANOVA
- Allows examination of three types of mean differences within one analysis.
- Goal: Evaluate mean differences produced by factors acting independently or together.
Main Effects
- Main Effect: The mean differences among the levels of one factor. Example: for a gender factor, the difference between the overall male mean and the overall female mean is the main effect of gender.
- Evaluating the main effects accounts for two of the three hypothesis tests in a two-factor ANOVA.
- Hypotheses:
- For Factor A: (H_0 : \mu_{A1} = \mu_{A2}) and (H_1 : \mu_{A1} \neq \mu_{A2})
- For Factor B: (H_0 : \mu_{B1} = \mu_{B2}) and (H_1 : \mu_{B1} \neq \mu_{B2})
Interactions
- Interaction: Any extra differences not caused by the main effects.
- If the difference between factors is not constant, there is an interaction.
- Calculated by: F = \frac{Variance \ (mean \ differences \ not \ explained \ by \ main \ effects)}{Variance \ (differences \ expected \ if \ there \ is \ no \ treatment \ effect)}
- Hypotheses:
- (H_0): There is no interaction between A and B.
- (H_1): There is an interaction between A and B.
- If levels of one factor depend on the levels of the other, there is an interaction.
- Non-parallel lines on a graph indicate an interaction.
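The parallel-lines check can be sketched with hypothetical 2×2 cell means: compute the A1-to-A2 difference at each level of B, and unequal differences (non-parallel lines) suggest an interaction. (With real sample means, the F-test rather than exact equality makes the decision.)

```python
# Checking for an interaction in a 2x2 design from the cell means.
# If the A1-to-A2 difference is the same at every level of B, the lines
# are parallel and there is no interaction. Hypothetical cell means:
means = {
    ("A1", "B1"): 10, ("A2", "B1"): 20,   # difference at B1: +10
    ("A1", "B2"): 30, ("A2", "B2"): 40,   # difference at B2: +10
}

diff_b1 = means[("A2", "B1")] - means[("A1", "B1")]
diff_b2 = means[("A2", "B2")] - means[("A1", "B2")]
interaction = diff_b1 != diff_b2   # unequal differences -> non-parallel lines
print(interaction)
```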
Two-Factor ANOVA Hypothesis Test
- A effect.
- B effect.
- (A \times B) Interaction.
Examples of Main Effects and Interactions
- Main effect with no interaction: the effect of one factor is the same at every level of the other (e.g., a +10 difference at both levels).
- Main effect with an interaction: the effect of one factor changes across the levels of the other (e.g., +10 at one level, -10 at the other).
Analysis of Two-Factor ANOVA
- Total variance is split into between-treatment and within-treatment variance.
- Between-treatment variance is further divided into factor A variance, factor B variance, and interaction variance.
Stages of Analysis
- Total Variability:
- SS_{total} = \sum X^2 - \frac{G^2}{N}
- df_{total} = N - 1
- Within-Treatment:
- SS_{within} = \sum SS \ each \ treatment
- df_{within} = \sum df \ each \ treatment
- Between-Treatment:
- SS_{between} = SS_{total} - SS_{within}
- df_{between} = \text{number of cells} - 1
- Factor A:
- SS_A = …
- df_A = \text{number of rows} - 1
- Factor B:
- SS_B = …
- df_B = \text{number of columns} - 1
- (A \times B) Interaction:
- SS_{A \times B} = SS_{between} - SS_A - SS_B
- df_{A \times B} = df_{between} - df_A - df_B
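The stages above can be sketched with hypothetical cell means and equal cell sizes. Since the notes elide the SS_A and SS_B formulas, this sketch assumes the standard row/column versions (sample size times the squared deviation of each row or column mean from the grand mean):

```python
# Partitioning SS_between for a 2x2 two-factor design (hypothetical data,
# n scores per cell, equal cell sizes). Assumed standard formulas:
# SS_A = sum over rows of (n * cols) * (row mean - grand mean)^2, etc.
n = 5                                   # scores per cell
cell_means = [[2.0, 4.0],               # rows = levels of factor A
              [6.0, 12.0]]              # columns = levels of factor B

rows = len(cell_means)
cols = len(cell_means[0])
grand = sum(sum(r) for r in cell_means) / (rows * cols)

row_means = [sum(r) / cols for r in cell_means]
col_means = [sum(cell_means[i][j] for i in range(rows)) / rows
             for j in range(cols)]

ss_between = sum(n * (m - grand) ** 2 for r in cell_means for m in r)
ss_a = sum(n * cols * (m - grand) ** 2 for m in row_means)
ss_b = sum(n * rows * (m - grand) ** 2 for m in col_means)
ss_axb = ss_between - ss_a - ss_b        # SS_AxB = SS_between - SS_A - SS_B

df_between = rows * cols - 1
df_a, df_b = rows - 1, cols - 1
df_axb = df_between - df_a - df_b        # df_AxB = df_between - df_A - df_B
print(ss_a, ss_b, ss_axb, df_axb)
```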
F-Ratios and Critical Values
- 3 F-ratios for the 3 variances.
- Finding Critical F Value:
- Numerator: between groups df.
- Denominator: within groups df.
- df_{within} = df_{total} - df_A - df_B - df_{A \times B}
- (\alpha = 0.05)
- Critical F = (F(df_{between}, df_{within}))
Chapter 15: Correlation
Correlation
- Measures and describes the relationship between two variables; two different variables (X and Y) are measured for the same individuals.
Characteristics of a Relationship
- Direction (negative or positive).
- Form (linear - Pearson correlation).
Pearson Correlation
- Measures the degree and direction of a linear relationship.
- r = \frac{Covariability \ of \ X \ and \ Y}{Variability \ of \ X \ and \ Y \ separately}
Sum of Products of Deviations
- Similar to SS, measures the amount of covariability between two variables.
- Definitional Formula: SP = \sum (X - M_X)(Y - M_Y)
- r = \frac{SP}{\sqrt{SS_X \, SS_Y}}
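A minimal sketch of the SP and r formulas on hypothetical data; here Y is an exact linear function of X, so r should come out to 1.0:

```python
# Pearson r from the definitional formulas: r = SP / sqrt(SS_X * SS_Y).
import math

X = [1, 2, 3, 4, 5]
Y = [2, 4, 6, 8, 10]          # perfectly linear in X, so r should be 1.0

mx = sum(X) / len(X)          # M_X
my = sum(Y) / len(Y)          # M_Y

sp = sum((x - mx) * (y - my) for x, y in zip(X, Y))   # sum of products
ss_x = sum((x - mx) ** 2 for x in X)
ss_y = sum((y - my) ** 2 for y in Y)

r = sp / math.sqrt(ss_x * ss_y)
print(r)
```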
Hypothesis Testing
- Asks whether a correlation exists in the population.
- Hypotheses:
- (H_0 : \rho = 0) ((\rho) = population correlation)
- (H_1 : \rho \neq 0)
- Test Statistic: t = r \sqrt{\frac{n - 2}{1 - r^2}}
- df = n - 2
- Effect Size: r^2, the coefficient of determination (the proportion of variability in one variable explained by the other).
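A sketch of the significance test, assuming the standard t statistic for a Pearson correlation, t = r·√((n − 2)/(1 − r²)) with df = n − 2; the r and n values here are hypothetical:

```python
# t statistic for testing H0: rho = 0, using the standard formula
# t = r * sqrt((n - 2) / (1 - r^2)) with df = n - 2.
import math

r = 0.60      # hypothetical sample correlation
n = 27        # hypothetical sample size

df = n - 2
t = r * math.sqrt(df / (1 - r ** 2))
r_squared = r ** 2            # effect size: proportion of variance explained
print(round(t, 3), df, r_squared)
```

For these values t = 0.60·√(25/0.64) = 3.75, which would be compared against the critical t for df = 25.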