Wk 6: Multi-Factor Between-Participant Designs - Regression

Correlation, Z-Scores, and Variance Explained

  • Correlation and variance explained are fundamentally linked: the stronger the correlation between a vector and the criterion, the more criterion variance that vector explains.

  • This connection allows $F$ statistics in regression to be estimated without solving for regression coefficients, predicting data, or manually summing squared residuals. Instead, sets of correlation coefficients can be used to calculate $R_{Reg}^2$.

  • Correlation is a standardized measure of the degree of association between two variables, which can be calculated in two ways:

    • Calculating the covariance first and then standardizing.

    • Standardizing the variables into Z-scores first and then calculating the covariance.

  • Standardization before calculating covariance is often easier. If systematic variance exceeds unsystematic variance, the vector will correlate with the criterion.

  • This means positive Z-scores on the vector will tend to pair with positive Z-scores on the criterion, and negative Z-scores with negative Z-scores (for a positive correlation; the pairing reverses for a negative one).

  • The average product of the paired Z-scores is the correlation coefficient ($r$).
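
The two routes to a correlation described above can be checked against each other with a minimal sketch. The data below are hypothetical, and the sketch assumes population standard deviations (dividing by N), so that "average product" is a plain mean; with sample standard deviations the sum of products would be divided by N - 1 instead.

```python
from statistics import fmean, pstdev

def pearson_via_covariance(x, y):
    """Route 1: compute the covariance, then divide by both standard deviations."""
    mx, my = fmean(x), fmean(y)
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))

def pearson_via_z_scores(x, y):
    """Route 2: standardize both variables, then average the Z-score products."""
    mx, my = fmean(x), fmean(y)
    sx, sy = pstdev(x), pstdev(y)
    zx = [(a - mx) / sx for a in x]
    zy = [(b - my) / sy for b in y]
    return fmean([a * b for a, b in zip(zx, zy)])

# Hypothetical data: a contrast-coded vector and a criterion (not from these notes).
vector = [1, 1, 1, -1, -1, -1]
criterion = [7, 8, 6, 4, 5, 3]

r_cov = pearson_via_covariance(vector, criterion)
r_z = pearson_via_z_scores(vector, criterion)
print(round(r_cov, 3), round(r_z, 3))  # both routes give the same r
```

Both functions return the same value because standardizing and averaging are just a reordering of the same arithmetic.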


Multi-Factor Between-Participant Designs

  • A multi-factor design involves the manipulation of more than one independent variable (IV) or factor.

    • e.g., medication (2 levels) and therapy type (2 levels).

  • It is termed a factorial design if all possible conditions created by crossing the factors are included in the study.

  • In a factorial design, researchers examine two types of effects:

    • Main Effect: The effect of one factor considered separately from the effect of the other factor.

    • Interaction: The effect of one factor in combination with another factor. This occurs when the effect of Factor A differs across different levels of Factor B (and vice versa).

  • Example of a 2-factor design:

    • Main effect of Factor A (averaged across levels of Factor B).

    • Main effect of Factor B (averaged across levels of Factor A).

    • An interaction effect ($A \times B$).
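
As an illustration of the three effects listed above, the sketch below computes them from cell means. The cell means and level names are hypothetical, and equal cell sizes are assumed so that marginal means are simple averages of cell means.

```python
# Hypothetical cell means for a balanced 2 x 2 design (not data from these notes).
# Keys are (level of Factor A, level of Factor B).
cell_means = {
    ("A1", "B1"): 4.0,
    ("A1", "B2"): 6.0,
    ("A2", "B1"): 5.0,
    ("A2", "B2"): 11.0,
}

def marginal_mean(factor_index, level):
    """Average the cell means that share one level of a factor (equal n assumed)."""
    values = [m for key, m in cell_means.items() if key[factor_index] == level]
    return sum(values) / len(values)

# Main effects: compare marginal means, averaging over the other factor.
main_effect_a = marginal_mean(0, "A2") - marginal_mean(0, "A1")
main_effect_b = marginal_mean(1, "B2") - marginal_mean(1, "B1")

# Interaction: does the effect of B differ across the levels of A?
effect_of_b_at_a1 = cell_means[("A1", "B2")] - cell_means[("A1", "B1")]
effect_of_b_at_a2 = cell_means[("A2", "B2")] - cell_means[("A2", "B1")]
interaction = effect_of_b_at_a2 - effect_of_b_at_a1

print(main_effect_a, main_effect_b, interaction)
```

A non-zero `interaction` value means the effect of B is not the same at A1 and A2; whether it is statistically reliable is a separate question answered by the F-test.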


Factorial Designs in Regression and Vector Coding

  • Estimating main effects and interactions in regression follows the same process as one-way designs: coding vectors to capture between-groups variability.

  • Vector Counts:

    • In one-way designs, there is one set of vectors containing one less vector than the number of conditions.

    • In factorial designs, sets of vectors are created for each main effect and each interaction.

    • In a $2 \times 2$ design, there is 1 vector for the first factor, 1 vector for the second factor, and 1 vector for the interaction.

  • Coding Schemes:

    • Dummy-coding: Participants in one condition are coded as $1$ and others as $0$.

    • Contrast-coding: Participants in one condition are coded as $1$ and others as $-1$.

  • 3-Level Factors: These require two vectors ($V1$ and $V2$) to capture the variability among three groups.

  • Coding the Interaction:

    • Interaction vectors are created by multiplying the vectors coding for the main effects.

    • For a $2 \times 2$ design with $V1$ (Factor A) and $V2$ (Factor D), the interaction vector ($V3$) is calculated as $V1 \times V2$.

    • For a $3 \times 2$ design with Factor A ($V1$, $V2$) and Factor D ($V3$), there are two interaction vectors: $V4 = V1 \times V3$ and $V5 = V2 \times V3$.
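
The coding rules above can be sketched for a 3 × 2 design as follows. The level names (`a1`, `d1`, and so on) and the choice of `a1` as the level coded $-1$ on both Factor A vectors are assumptions made for illustration.

```python
from itertools import product

# Contrast codes per level; the level names are hypothetical.
# A 3-level factor needs two vectors; a 2-level factor needs one.
FACTOR_A_CODES = {"a1": (-1, -1), "a2": (1, 0), "a3": (0, 1)}  # (V1, V2)
FACTOR_D_CODES = {"d1": (1,), "d2": (-1,)}                      # (V3,)

def code_row(a_level, d_level):
    """Return [V1, V2, V3, V4, V5] for a participant in the given cell."""
    a_codes = FACTOR_A_CODES[a_level]
    d_codes = FACTOR_D_CODES[d_level]
    # Interaction vectors: each Factor A vector times each Factor D vector,
    # which gives V4 = V1 * V3 and V5 = V2 * V3.
    interaction = [a * d for a, d in product(a_codes, d_codes)]
    return [*a_codes, *d_codes, *interaction]

# One coded row per cell of the design.
rows = {(a, d): code_row(a, d) for a in FACTOR_A_CODES for d in FACTOR_D_CODES}
for cell, row in rows.items():
    print(cell, row)
```

Every participant in the same cell receives the same row, so the full design matrix is these six rows repeated once per participant.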

Regression Equations for Factorial Designs

  • The linear regression equation adds terms for every coded vector.

  • $2 \times 2$ Design Equation:

    • $\hat{Y} = a + b_1X_1 + b_2X_2 + b_3X_1X_2$

    • $b_1$ is the slope for Factor A; $b_2$ is the slope for Factor D; $b_3$ is the slope for the interaction.

  • $3 \times 2$ Design Equation:

    • $\hat{Y} = a + b_1X_1 + b_2X_2 + b_3X_3 + b_4X_1X_3 + b_5X_2X_3$

    • $b_1$ and $b_2$ represent the slopes for the two vectors of Factor A; $b_3$ represents Factor D; $b_4$ and $b_5$ represent the interaction vectors.

  • Interpretation of Intercept ($a$) and Slopes ($b$):

    • Dummy-coding: $a$ is the mean of the category coded as $0$ across all vectors (a single cell). $b$ for a main-effect vector is the difference between the mean of the relevant cell in the same row/column as $a$ and the value of $a$.

    • Contrast-coding:

      • $a$ is the grand mean.

      • $b$ is the difference between the mean of the category coded $1$ on that vector and the grand mean.

    • [Figure: panels for Factor A and for Factor D. Green highlights the A2 marginal mean vs. the grand mean; blue highlights the A3 marginal mean vs. the grand mean.]
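
The contrast-coding interpretation can be illustrated with a small sketch for a balanced 2 × 2 design. The cell means are hypothetical, and equal cell sizes are assumed so that the grand mean is the mean of the cell means.

```python
# Hypothetical cell means for a balanced 2 x 2 design; keys are the
# contrast codes (X1 for Factor A, X2 for Factor D) of each cell.
cell_means = {(1, 1): 4.0, (1, -1): 6.0, (-1, 1): 5.0, (-1, -1): 9.0}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Intercept: the grand mean (mean of the cell means when n is equal).
a = mean(cell_means.values())

# Main-effect slopes: marginal mean of the category coded 1 minus the grand mean.
b1 = mean(m for (x1, _), m in cell_means.items() if x1 == 1) - a
b2 = mean(m for (_, x2), m in cell_means.items() if x2 == 1) - a

# Interaction slope: what remains of cell (1, 1) after the additive terms.
b3 = cell_means[(1, 1)] - (a + b1 + b2)

def predict(x1, x2):
    """Y-hat = a + b1*X1 + b2*X2 + b3*X1*X2."""
    return a + b1 * x1 + b2 * x2 + b3 * x1 * x2

for (x1, x2), observed in cell_means.items():
    print((x1, x2), observed, predict(x1, x2))
```

Because the model has as many coefficients as cells, the predicted values reproduce all four cell means.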


Vector Sets and Calculating $R_{Reg}^2$

  • To derive $F$ values for each effect (Main Effect A, Main Effect D, Interaction A×D), a separate $R_{Reg}^2$ value must be calculated for each vector set: $R_A^2$, $R_D^2$, and $R_{AxD}^2$. The quantities involved are:

    • $R_{Reg}^2$ for all vectors together: variance explained by the whole model

    • $R_A^2$: variance explained by factor A

    • $R_D^2$: variance explained by factor D

    • $R_{AxD}^2$: variance explained by the interaction

  • If a set contains only a single vector, $R_{Reg}^2$ is simply the square of the correlation between that vector and the criterion ($r^2$).

  • If a set contains more than one vector (e.g., Factor A in a $3 \times 2$ design), the inter-correlation between those vectors must be controlled using the formula:

    • $R_{Reg}^2 = \frac{r_1^2 + r_2^2 - (2 \times r_1 \times r_2 \times r_{12})}{1 - r_{12}^2}$

    • Where $r_1$ and $r_2$ are the correlations of each vector with the criterion and $r_{12}$ is the correlation between the two vectors in the set.
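
The two-vector formula can be written as a short function. As a check, the sketch applies it to the correlations reported for the 3 × 2 example in these notes; the function name is an assumption, and results computed from the rounded correlations can differ from the reported values in the third decimal place.

```python
def r_squared_two_vectors(r1, r2, r12):
    """R^2 for a set of two vectors, adjusting for their inter-correlation r12."""
    if abs(r12) >= 1:
        raise ValueError("r12 must lie strictly between -1 and 1")
    return (r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2)

# Correlations reported for the Animal Type set (V1, V2) in the 3 x 2 example.
r2_animal_type = r_squared_two_vectors(0.420, -0.016, 0.50)

# Correlations reported for the interaction set (V4, V5) in the same example.
r2_interaction = r_squared_two_vectors(-0.233, -0.295, 0.50)

print(r2_animal_type, r2_interaction)
```

When the two vectors are uncorrelated ($r_{12} = 0$), the formula reduces to $r_1^2 + r_2^2$.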

Inference and Significance Testing

  • Residual Variance: $R_{Res}^2 = 1 - \sum R_{Reg}^2 = 1 - (R_A^2 + R_D^2 + R_{AxD}^2)$.

  • Degrees of Freedom ($df$):

    • $df_A = k_A$ (number of vectors for Factor A).

    • $df_D = k_D$ (number of vectors for Factor D).

    • $df_{AxD} = k_A \times k_D$ (product of the degrees of freedom of the two main effects).

    • $df_{Res} = N - \sum k - 1$, where $N$ is the total number of participants and $\sum k$ is the total number of vectors across all sets.

  • Mean Squares (MSMS):

    • $MS_A = R_A^2 / df_A$

    • $MS_D = R_D^2 / df_D$

    • $MS_{AxD} = R_{AxD}^2 / df_{AxD}$

    • $MS_{Res} = R_{Res}^2 / df_{Res}$

  • F-Ratios:

    • $F(df_{Effect}, df_{Res}) = MS_{Effect} / MS_{Res}$.

    • If $F_{obs} > F_{crit}$, reject the null hypothesis ($H_0$).
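
The residual, degrees-of-freedom, mean-square, and F-ratio steps above can be chained in one small function. This is a sketch, not a full ANOVA routine: it assumes the per-effect $R^2$ values are already known, and it is illustrated with the correlations from the 2 × 2 example in these notes. N = 24 is inferred from $df_{Res} = 20$ with three vectors and is not stated in the notes.

```python
def f_table(r2_by_effect, df_by_effect, n_participants):
    """Return {effect: (df_effect, df_res, F)} from per-effect R^2 values."""
    r2_res = 1 - sum(r2_by_effect.values())
    df_res = n_participants - sum(df_by_effect.values()) - 1
    ms_res = r2_res / df_res
    table = {}
    for effect, r2 in r2_by_effect.items():
        df_effect = df_by_effect[effect]
        ms_effect = r2 / df_effect
        table[effect] = (df_effect, df_res, ms_effect / ms_res)
    return table

# Each set has a single vector here, so R^2 is the squared correlation.
r2 = {
    "Animal Type": (-0.566) ** 2,
    "Mood": (-0.524) ** 2,
    "Interaction": 0.314 ** 2,
}
df = {"Animal Type": 1, "Mood": 1, "Interaction": 1}

# N = 24 is an inferred assumption (see the lead-in).
results = f_table(r2, df, n_participants=24)
for effect, (df_effect, df_res, f_value) in results.items():
    print(f"{effect}: F({df_effect}, {df_res}) = {f_value:.2f}")
```

The F values come out close to those reported in the example; small differences reflect rounding of the correlations and intermediate values. Each observed F is then compared with the critical value for its degrees of freedom.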


Example 1: $2 \times 2$ Factorial Design

  • Design: 2 (Animal Type: Approach/Avoid) $\times$ 2 (Mood: Positive/Negative) Between-Participants.

  • Animal Groups:

    • Approach: dog, cat, rabbit, beaver, quokka.

    • Avoid: lion, tiger, bear, rhino, bison.

  • Dependent Variable (DV): Number of animals remembered from a list.

  • Data Recap:

    • Grand mean ($a$) = $6.04$.

    • $b_1$ (Approach mean - Grand mean) = $4.92 - 6.04 = -1.12$.

    • $b_2$ (Positive mean - Grand mean) = $5 - 6.04 = -1.04$.

  • Step-by-Step Results:

    • Correlation ($r$) with criterion: $V1 = -0.566$, $V2 = -0.524$, $V3 = 0.314$.

    • $R_{AnimalType}^2 = 0.32$, $R_{Mood}^2 = 0.27$, $R_{Int}^2 = 0.10$.

    • $R_{Res}^2 = 1 - (0.32 + 0.27 + 0.10) = 0.307$.

    • Degrees of freedom: $df_{AnimalType} = 1$, $df_{Mood} = 1$, $df_{Int} = 1$, $df_{Res} = 20$.

    • Mean Squares: $MS_{AnimalType} = 0.32$, $MS_{Mood} = 0.27$, $MS_{Int} = 0.10$, $MS_{Res} = 0.015$.

    • F-ratios:

      • Animal Type: $F(1, 20) = 0.32 / 0.015 = 20.83$ ($p < 0.001$).

      • Mood: $F(1, 20) = 0.27 / 0.015 = 17.86$ ($p < 0.001$).

      • Interaction: $F(1, 20) = 0.10 / 0.015 = 6.43$ ($p < 0.05$; an $F$ of 6.43 on 1 and 20 degrees of freedom does not reach the 0.01 level).

  • Interpretation: Significant main effects (greater memory for avoid animals and negative moods) and a significant interaction.


Example 2: $3 \times 2$ Factorial Design

  • Design: 3 (Animal Type: Approach/Avoid/Neutral) $\times$ 2 (Mood: Positive/Negative).

  • Contrast Coding Scheme:

    • V1 (Avoid vs. Others): Avoid = $1$, Approach = $-1$, Neutral = $0$.

    • V2 (Neutral vs. Others): Neutral = $1$, Approach = $-1$, Avoid = $0$.

    • V3 (Mood): Positive = $1$, Negative = $-1$.

  • Calculations:

    • Grand Mean ($a$) = $5.64$.

    • $b_1$ (Avoid Mean - Grand Mean) = $7.17 - 5.64 = 1.53$.

    • $b_2$ (Neutral Mean - Grand Mean) = $4.83 - 5.64 = -0.81$.

    • $b_3$ (Positive Mean - Grand Mean) = $4.28 - 5.64 = -1.36$.

  • Correlations and $R^2$:

    • Animal Type Set ($V1, V2$): $r_1 = 0.420$, $r_2 = -0.016$, $r_{12} = 0.50$. $R_{AnimalType}^2 = 0.244$.

    • Mood Set ($V3$): $r = -0.622$. $R_{Mood}^2 = 0.39$.

    • Interaction Set ($V4, V5$): $r_4 = -0.233$, $r_5 = -0.295$, $r_{45} = 0.50$. $R_{Int}^2 = 0.097$.

  • F-Statistics:

    • $R_{Res}^2 = 0.27$, $df_{Res} = 30$, $MS_{Res} = 0.009$.

    • Animal Type: $F(2, 30) = 0.12 / 0.009 = 13.47$ ($p < 0.001$).

    • Mood: $F(1, 30) = 0.39 / 0.009 = 42.72$ ($p < 0.001$).

    • Interaction: $F(2, 30) = 0.049 / 0.009 = 5.36$ ($p < 0.05$; an $F$ of 5.36 on 2 and 30 degrees of freedom corresponds to $p \approx 0.01$, not $p < 0.001$).

Effect Sizes

  • While significance testing ($p$-values) helps decide whether to reject $H_0$, effect sizes describe the magnitude of an effect.

  • $R_{Reg}^2$: The proportion of total criterion variance explained by an effect. However, it does not account for the overlap or presence of other effects.

  • Partial $R^2$: The proportion of the criterion variance left unexplained by the other effects that is explained by this effect.

  • Formula: $\text{Partial } R^2 = \frac{R_{Reg}^2}{R_{Res}^2 + R_{Reg}^2}$.

  • Cut-offs for $R^2$ and Partial $R^2$:

    • Small: $> 0.01$ to $0.06$.

    • Medium: $> 0.06$ to $0.14$.

    • Large: $> 0.14$.

  • Comparison Example ($2 \times 2$):

    • Animal Type: $R^2 = 0.32$; Partial $R^2 = 0.32 / (0.307 + 0.32) = 0.51$.

    • Mood: $R^2 = 0.27$; Partial $R^2 = 0.27 / (0.307 + 0.27) = 0.47$.

    • Interaction: $R^2 = 0.10$; Partial $R^2 = 0.10 / (0.307 + 0.10) = 0.24$.
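
The partial $R^2$ formula and the cut-offs can be combined in a short sketch using the rounded $R^2$ values from the 2 × 2 example. The function names and the "below small" label for values at or under 0.01 are assumptions added for illustration. With rounded inputs the interaction comes out at about 0.25 rather than the 0.24 listed above, which appears to be a rounding difference.

```python
def partial_r_squared(r2_effect, r2_res):
    """Partial R^2 = R^2_effect / (R^2_res + R^2_effect)."""
    return r2_effect / (r2_res + r2_effect)

def effect_size_label(value):
    """Label a value using the cut-offs listed in these notes."""
    if value > 0.14:
        return "large"
    if value > 0.06:
        return "medium"
    if value > 0.01:
        return "small"
    return "below small"  # label not in the notes; added for completeness

# Rounded R^2 values from the 2 x 2 example in these notes.
r2_res = 0.307
effects = {"Animal Type": 0.32, "Mood": 0.27, "Interaction": 0.10}

summary = {}
for name, r2 in effects.items():
    partial = partial_r_squared(r2, r2_res)
    summary[name] = (round(partial, 2), effect_size_label(partial))
    print(name, effect_size_label(r2), *summary[name])
```

Note how the interaction is a medium effect by $R^2$ (0.10) but a large effect by partial $R^2$, because the partial measure removes the variance already claimed by the two main effects from the denominator.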