Stats 3 Comparing Quantitative Variables

BSc Medical Sciences - Experimental Design and Statistics: Workshop 3 - Comparing Quantitative Variables

Introduction

The focus of Workshop 3 is comparing quantitative (continuous) variables between groups, particularly in health research. The workshop covers three main themes: choosing between parametric and non-parametric tests, the structure of studies comparing two or more groups, and the assumptions required for valid statistical testing.

Methods for Comparing Quantitative Variables

Overview of Methods
  1. Parametric Methods: Used when data satisfy certain assumptions, primarily related to the distribution and variance of the data.

  2. Non-parametric Methods: Employed when the assumptions of parametric methods do not hold; these analyze the rank ordering of the data rather than the raw scores.

  3. Grouping Scenarios:

    • Two Groups vs. Three or More Groups

    • Independent Groups vs. Paired Groups

Comparing Two Independent Groups

Common Scenarios
  • Often encountered in randomized trials (e.g., intervention vs. control) and cohort studies (e.g., exposed vs. non-exposed).

Example: Infant Sleep Study
  • Population: All children aged 7 months experiencing sleep problems.

  • Sample Size: 328 children, with random assignment to intervention or control groups.

  • Outcome: Maternal depression measured at child age 10 months, reported on a scale from 0 to 30.

  • Results: For the intervention group, the mean (SD) depression score was 6.8 (5.1) and for the control group it was 7.8 (5.5).

Hypothesis Testing
  • A confidence interval (CI) for the mean difference between intervention and control groups quantifies the plausible range for the true difference and helps determine whether that difference is statistically significant.

  • Structure of data includes independent identifiers for each participant, along with group identifiers and depression scores collected at 10 months.

Assumptions for Two Sample (Unpaired) T-Test
  1. Normality: Data within each group must be approximately normally distributed, or the sample size must be sufficiently large (typically n > 30) for the test to tolerate skewness.

  2. Homogeneity of Variance: Standard deviations across the two groups should be comparable.

  3. Independence: Participants must be independent, meaning that the scores of one group do not affect the scores of the other.
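When these assumptions hold, the comparison is a standard unpaired t-test. A minimal sketch using SciPy on simulated data (the scores below are generated to loosely match the reported means and SDs; they are not the trial's actual data, and the equal group sizes of 164 are an assumption):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Simulated depression scores (hypothetical, loosely matching the reported
# summaries: intervention ~6.8 (SD 5.1), control ~7.8 (SD 5.5))
intervention = rng.normal(6.8, 5.1, 164)
control = rng.normal(7.8, 5.5, 164)

# Unpaired two-sample t-test. equal_var=True is the classic test that assumes
# homogeneity of variance; equal_var=False gives Welch's test when it is in doubt.
t, p = stats.ttest_ind(intervention, control, equal_var=True)
print(f"t = {t:.2f}, p = {p:.3f}")
```

With real data, the two arrays would simply be the depression scores split by the group identifier in the dataset.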

Statistical Outcomes from the Infant Sleep Study

  • Mean Difference: -1.0 units, indicating less maternal depression in the intervention group; however, the 95% CI runs from -2.2 to 0.1. Because this interval includes zero, there remains uncertainty about whether the intervention is superior to usual care.

  • Null Hypothesis: Assumes no difference in mean depression scores between groups; the resulting p-value from the t-test was 0.09, indicating weak evidence against the null hypothesis.

  • Interpretation of P-Values:

    • p < 0.001: Strong evidence against the null

    • p ~ 0.05: Moderate evidence

    • p ~ 0.1: Weak evidence

    • p > 0.1: Little evidence against the null
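The reported interval and p-value can be reproduced approximately from the summary statistics alone. The sketch below, using only the Python standard library, assumes an equal split of the 328 children into groups of 164 (the exact split is an assumption here, which is why the lower limit comes out near -2.1 rather than the published -2.2):

```python
import math

n1 = n2 = 164          # assumed equal split of the 328 children
m1, sd1 = 6.8, 5.1     # intervention: mean (SD)
m2, sd2 = 7.8, 5.5     # control: mean (SD)

diff = m1 - m2
# Standard error of the difference between two independent means
se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
lo, hi = diff - 1.96 * se, diff + 1.96 * se

# Two-sided p-value from a normal approximation to the t distribution
z = abs(diff) / se
p = 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

print(f"difference = {diff:.1f}, 95% CI ({lo:.1f} to {hi:.1f}), p = {p:.2f}")
```

The p-value of roughly 0.09 matches the weak evidence reported from the t-test.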

Comparing Two Paired Groups

Design and Analysis
  • Pairing can occur through randomized matching or measuring the same participants before and after an intervention.

  • Analyzing within-pair differences removes between-participant variability, which increases statistical power and makes the test's assumptions easier to satisfy.

Example: Infant Sleep Study Control Group
  • The control group's mean depression score was 8.28 (0.45) at 7 months and 7.84 (0.42) at 10 months, raising the question of whether a meaningful change occurred over time.

Hypothesis Testing with Paired T-Test
  • The null hypothesis posits no change in mean depression scores; the test gave a p-value of 0.25, implying little evidence of a change from 7 months to 10 months.
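Because the paired t-test is just a one-sample t-test on the within-pair differences, both forms give identical results. A sketch on simulated before/after data (hypothetical values, not the study's):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical paired scores: the same mothers measured twice
before = rng.normal(8.3, 4.0, 100)            # 7-month scores
after = before + rng.normal(-0.4, 3.0, 100)   # 10-month scores, slight drift

# Paired t-test, and the equivalent one-sample t-test on the differences
t_paired, p_paired = stats.ttest_rel(before, after)
t_diff, p_diff = stats.ttest_1samp(before - after, 0.0)
print(f"paired p = {p_paired:.3f} (one-sample on differences: {p_diff:.3f})")
```

This equivalence is why the paired design only needs the *differences* to be roughly normal, not the raw scores.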

Comparing Three or More Independent Groups

Design Considerations
  • When exploring multiple groups (e.g., education levels among mothers), running separate t-tests for every pairwise comparison inflates the type I error rate. A single global p-value from analysis of variance (ANOVA) is therefore preferred.

Example: Depression by Education Level
  • ANOVA procedures provide a global assessment. A hypothesis that mothers' depression scores would be equal across groups led to a p-value of 0.04, suggesting evidence of differences.

  • Education categories showed different depression score distributions (mean scores sorted by educational qualifications).

Post Hoc Testing
  • If ANOVA indicates significant differences, further post hoc analyses like Tukey's test can help identify which specific groups differ significantly.
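The global-then-post-hoc workflow can be sketched with SciPy. The three education groups and their scores below are simulated for illustration; `tukey_hsd` is available in recent SciPy releases:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical depression scores for three illustrative education groups
low = rng.normal(9.0, 5.0, 60)
mid = rng.normal(7.5, 5.0, 60)
high = rng.normal(6.5, 5.0, 60)

# Step 1: one global test across all groups (one-way ANOVA)
f_stat, p_global = stats.f_oneway(low, mid, high)
print(f"ANOVA: F = {f_stat:.2f}, global p = {p_global:.4f}")

# Step 2: only if the global test suggests differences, identify which
# specific pairs differ using Tukey's HSD (adjusts for multiple comparisons)
res = stats.tukey_hsd(low, mid, high)
print(res)  # table of pairwise differences with adjusted p-values
```

Reporting only the pairwise results that survive Tukey's adjustment keeps the overall type I error rate controlled.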

Comparing Three or More Paired Groups

Methodology
  • In cases where multiple measurements are taken (e.g., depression scores at multiple time points), repeated measures ANOVA serves to analyze the means from the matched data.

  • Example: Compared depression scores of mothers at ages 7, 10, and 12 months postnatal.

  • A p-value of 0.02 indicated statistically significant changes in depression scores over time, with post hoc comparisons identifying which time points differed.
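SciPy has no built-in repeated-measures ANOVA, but the variance partitioning it rests on is short enough to sketch directly: total variability splits into between-subject, between-time, and residual components, and the F statistic compares the time effect to the residual. The data below are simulated, not the study's:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Rows = mothers, columns = scores at 7, 10 and 12 months (simulated)
n, k = 50, 3
subject_effect = rng.normal(0, 3, size=(n, 1))   # stable individual level
time_effect = np.array([0.8, 0.0, -0.5])         # average drift over time
scores = 8.0 + subject_effect + time_effect + rng.normal(0, 2, size=(n, k))

grand = scores.mean()
ss_total = ((scores - grand) ** 2).sum()
ss_subjects = k * ((scores.mean(axis=1) - grand) ** 2).sum()
ss_time = n * ((scores.mean(axis=0) - grand) ** 2).sum()
ss_error = ss_total - ss_subjects - ss_time      # residual within-subject

df_time, df_error = k - 1, (n - 1) * (k - 1)
f_stat = (ss_time / df_time) / (ss_error / df_error)
p = stats.f.sf(f_stat, df_time, df_error)
print(f"repeated-measures ANOVA: F = {f_stat:.2f}, p = {p:.4f}")
```

Removing the subject sum of squares from the error term is exactly what makes the repeated-measures design more powerful than treating the time points as independent groups.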

Non-Parametric Methods

When to Use
  • These methods are applied when parametric assumptions cannot be satisfied. They analyze rank ordering of data and provide only p-values, typically summarizing groups using medians and interquartile ranges.

Notable Non-Parametric Tests
  1. Mann-Whitney Test: For comparing two independent groups.

    • Example from the Infant Sleep Study indicates weak evidence of differences in depression scores with a p-value of 0.10.

  2. Kruskal-Wallis Test: For comparing three or more independent groups by comparing their distributions.

    • Applied to assess depression scores across various educational levels with p-values indicating weak to moderate evidence.

  3. Wilcoxon Signed Ranks Test: For paired data, examining differences before and after interventions.

    • Results from the Infant Sleep Study suggested minimal changes with a p-value of 0.24.

  4. Friedman Test: For three or more paired measurements (repeated measures), comparing distributions across time points or conditions to detect changes over time.
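All four tests follow the same calling pattern in SciPy. The sketch below uses simulated right-skewed scores (hypothetical; bounded scales like 0-30 are often skewed in practice), mirroring the parametric tests above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
# Simulated skewed scores for illustration (not the study's data)
g1 = rng.exponential(6.8, 150)
g2 = rng.exponential(7.8, 150)
g3 = rng.exponential(9.0, 150)
before = rng.exponential(8.0, 100)
after = before * rng.uniform(0.6, 1.3, 100)     # paired follow-up scores

p_mw = stats.mannwhitneyu(g1, g2).pvalue            # 2 independent groups
p_kw = stats.kruskal(g1, g2, g3).pvalue             # 3+ independent groups
p_wil = stats.wilcoxon(before, after).pvalue        # 2 paired measurements
p_fr = stats.friedmanchisquare(before, after, after * 0.9).pvalue  # 3+ paired
print(f"Mann-Whitney {p_mw:.3f}, Kruskal-Wallis {p_kw:.4f}, "
      f"Wilcoxon {p_wil:.3f}, Friedman {p_fr:.4f}")
```

Note that each call returns only a test statistic and a p-value; groups analysed this way are usually summarised with medians and interquartile ranges rather than means.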

Advantages and Disadvantages of Non-Parametric Methods

  • Advantages: Valid for skewed distributions and ordinal data; give similar results to parametric methods when the parametric assumptions do in fact hold.

  • Disadvantages: Limited interpretability, as they don’t directly estimate mean differences and provide fewer statistical parameters compared to parametric tests.

Checking Assumptions for Parametric Testing

  1. Normal Distribution: Use histograms or box plots to visually inspect data fit.

  2. Similarity of Variance: As a rule of thumb, the variance of one group should not exceed four times the variance of the other (equivalently, the SDs should differ by less than a factor of two).

  3. Choice of Test Based on Data Shape: Employ parametric tests only if the data are approximately normal with comparable variances; otherwise opt for non-parametric approaches.
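The variance rule of thumb is easy to encode as a quick check before running the equal-variance t-test. A minimal sketch (the helper name and threshold are illustrative, not from the workshop):

```python
import numpy as np

def variances_comparable(a, b, max_ratio=4.0):
    """Rule of thumb: the larger sample variance should not exceed ~4x the
    smaller one (i.e., SD ratio under ~2) before trusting an equal-variance
    two-sample t-test."""
    v1, v2 = np.var(a, ddof=1), np.var(b, ddof=1)
    return max(v1, v2) / min(v1, v2) < max_ratio

rng = np.random.default_rng(2)
a = rng.normal(0, 5.1, 100)
b = rng.normal(0, 5.5, 100)
print(variances_comparable(a, b))  # SDs of 5.1 vs 5.5 -> ratio well under 4
```

Normality, by contrast, is usually judged visually from histograms or box plots rather than with a pass/fail threshold.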

Summary of Workshop 3

  • Comparing quantitative variables requires understanding both parametric and non-parametric methods across varying group structures (two groups vs. three or more; independent vs. paired). Attention to the statistical assumptions is essential for the validity of the chosen analysis. This workshop provided a comprehensive overview for effectively interpreting health research data.