Module 3 Statistics, Part 3
Analysis of Continuous Data: Comparing Two Means
Overview of Comparing Two Means
Continuous data analysis often involves comparing the means of two groups.
The independent sample t-test (or two-sample t-test) is commonly used for this purpose.
In this analysis:
The independent variable (categorical) indicates group membership (e.g., treatment vs. placebo).
The outcome (response variable) is continuous (e.g., systolic blood pressure).
Example of Comparing Means
Systolic Blood Pressure Analysis:
Example variable: Systolic blood pressure (in mmHg).
Compare means across two groups:
Group 1: Patients receiving a placebo.
Group 2: Patients receiving a new drug (e.g., Drug A).
Dependent variable (outcome): Systolic blood pressure; its mean is what is compared between the two groups.
Mental Health Score Analysis:
Example variable: Mental health score (range 0 to 100).
Groups may include:
Physically active vs. inactive participants (categorical independent variable).
Hypotheses Formulation
Null Hypothesis (H0): Mean systolic blood pressure for patients on placebo equals that for patients on Drug A.
Alternative Hypothesis (H1): There is a difference in mean systolic blood pressure between the two groups.
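The hypotheses above are tested with a t-statistic: the difference in sample means divided by its standard error. A minimal sketch using only the Python standard library follows. The function name pooled_t_statistic and the blood pressure values are hypothetical, made up for illustration, and the equal-variance (pooled) form is assumed. In practice a library routine such as SciPy's scipy.stats.ttest_ind would typically be used, since it also returns the p-value.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(group1, group2):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    # Standard error of the difference in means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(group1) - mean(group2)) / se
    return t, df

# Hypothetical systolic blood pressure readings (mmHg), not real data.
placebo = [142, 138, 150, 145, 139, 147]
drug_a = [131, 128, 136, 133, 127, 135]

t_stat, df = pooled_t_statistic(placebo, drug_a)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

A large absolute t relative to the t distribution with the stated degrees of freedom counts as evidence against H0; the standard library has no t distribution, so the p-value is left to a statistics package.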
Assumptions of the Two-Sample T-Test
Normality of Distribution:
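The outcome in each group should be approximately normally distributed; this matters most for small samples.
With larger samples (rule of thumb: n > 30), the test is robust to moderate departures from normality, because the sampling distribution of the mean is then approximately normal.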
Independence of Samples:
Observations must be independent, both within and between the two groups.
No repeated measures on the same participant and no related participants (e.g., matched pairs) are involved.
Homogeneity of Variances:
Variances between the two groups should be approximately equal.
Assumption can be visually inspected using box plots.
Levene's test can formally assess variance equality.
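As a rough illustration of how Levene's test works, the sketch below computes only the test statistic W from absolute deviations around each group's center. The function name levene_statistic and the data are hypothetical. The median is assumed as the center (the Brown-Forsythe variant); the original Levene test uses the mean. The p-value comes from an F distribution with (k - 1, N - k) degrees of freedom, which the standard library does not provide; SciPy's scipy.stats.levene returns both the statistic and the p-value.

```python
from statistics import mean, median

def levene_statistic(*groups, center=median):
    """Return Levene's W statistic for two or more groups."""
    # Absolute deviation of each observation from its own group's center.
    deviations = [[abs(x - center(g)) for x in g] for g in groups]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [mean(d) for d in deviations]
    grand_mean = mean([x for d in deviations for x in d])
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(d) * (m - grand_mean) ** 2
                  for d, m in zip(deviations, group_means))
    within = sum((x - m) ** 2
                 for d, m in zip(deviations, group_means) for x in d)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical systolic blood pressure readings (mmHg), not real data.
placebo = [142, 138, 150, 145, 139, 147]
drug_a = [131, 128, 136, 133, 127, 135]
print(f"Levene W = {levene_statistic(placebo, drug_a):.3f}")
```

A W near zero means the groups have similar spread; larger values point toward unequal variances.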
Box Plots and Variance Comparison
Box Plot Analysis:
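A visual check of distribution shape (symmetry) and of whether the two groups have similar spread; boxes of similar height suggest similar variances.
The median is drawn as a line within each box; for a symmetric distribution it sits near the center of the box.
A median that sits well off-center suggests a skewed distribution and a potential violation of normality.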
Example:
Box plot comparison for systolic blood pressure in placebo vs. Drug A:
Assess variability and check if assumptions are satisfied.
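The quantities a box plot displays can also be computed directly, which makes the visual check more concrete. The sketch below is a minimal example using the standard library; box_summary and the data are hypothetical. Quartile conventions differ between packages, so the values may not match a plotting library exactly.

```python
from statistics import quantiles

def box_summary(values):
    """Return quartiles, IQR, and where the median sits inside the box."""
    q1, med, q3 = quantiles(values, n=4)  # default 'exclusive' method
    iqr = q3 - q1
    # 0.5 means the median is centered in the box; values near 0 or 1 hint at skew.
    position = (med - q1) / iqr if iqr else float("nan")
    return {"q1": q1, "median": med, "q3": q3, "iqr": iqr,
            "median_position": position}

# Hypothetical systolic blood pressure readings (mmHg), not real data.
placebo = [142, 138, 150, 145, 139, 147, 141, 144]
drug_a = [131, 128, 136, 133, 127, 135, 130, 132]

for name, data in (("placebo", placebo), ("Drug A", drug_a)):
    print(name, box_summary(data))
```

Comparable IQRs across the two groups are consistent with similar variances, and a median position near 0.5 is consistent with symmetry. Neither is a formal test.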
Levene's Test
Used to formally assess the equality of variances.
If variances are unequal, a modified t-test that does not assume equal variances (Welch's t-test) should be used.
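One common modification for unequal variances is Welch's t-test, which uses separate variance estimates and adjusts the degrees of freedom (the Welch-Satterthwaite approximation). The sketch below assumes that choice; welch_t and the data are hypothetical. SciPy exposes the same test through scipy.stats.ttest_ind with equal_var=False.

```python
import math
from statistics import mean, variance

def welch_t(group1, group2):
    """Return (t, df) for Welch's unequal-variance t-test."""
    n1, n2 = len(group1), len(group2)
    # Squared standard error contributed by each group.
    v1 = variance(group1) / n1
    v2 = variance(group2) / n2
    t = (mean(group1) - mean(group2)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom (usually not an integer).
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical data with visibly different spread, not real measurements.
narrow = [140, 141, 142, 143, 144, 145]
wide = [120, 128, 135, 142, 150, 158]

t_stat, df = welch_t(narrow, wide)
print(f"Welch t = {t_stat:.3f}, df = {df:.1f}")
```

When the two groups have equal sizes and equal sample variances, the Welch degrees of freedom reduce to n1 + n2 - 2, matching the standard test.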
Reporting Results of T-Test
Results should summarize the findings succinctly:
E.g., "There was significant evidence of mean differences in mental health scores between active and inactive groups."
Statistical significance should be reported (t-statistic and p-value).
A confidence interval for the difference in means gives the range of plausible values for the true difference between the population means.
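To illustrate the confidence interval, the sketch below computes a large-sample interval for the difference in means using a normal critical value. This is an approximation chosen because the standard library has no t distribution; for small samples a t critical value should be used instead. The function name mean_difference_ci is hypothetical and the mental health scores are simulated, not real data.

```python
import math
import random
from statistics import NormalDist, mean, variance

def mean_difference_ci(group1, group2, level=0.95):
    """Approximate CI for mean(group1) - mean(group2), normal critical value."""
    diff = mean(group1) - mean(group2)
    # Standard error of the difference, without assuming equal variances.
    se = math.sqrt(variance(group1) / len(group1) + variance(group2) / len(group2))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return diff - z * se, diff + z * se

# Simulated mental health scores for 40 participants per group (assumed values).
rng = random.Random(0)
active = [rng.gauss(70, 10) for _ in range(40)]
inactive = [rng.gauss(64, 10) for _ in range(40)]

low, high = mean_difference_ci(active, inactive)
print(f"Approximate 95% CI for the mean difference: {low:.1f} to {high:.1f}")
```

An interval that excludes zero is consistent with a statistically significant difference at the corresponding level.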
Non-Parametric Alternatives: Mann-Whitney U Test
If assumptions for the t-test are violated, a non-parametric test can be used:
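Mann-Whitney U Test:
Used when the data are not normally distributed, particularly when samples are too small to rely on the t-test's robustness.
Ranks the pooled data and compares the ranks of the two groups rather than their means; it can be read as a comparison of medians only when the two distributions have similar shapes.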
Non-parametric tests make fewer distributional assumptions but typically have lower power than parametric tests when the parametric assumptions hold.
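The ranking idea behind the Mann-Whitney U test can be sketched in a few lines. The example below computes only the U statistics, using average ranks for ties; mann_whitney_u and the scores are hypothetical. The p-value is omitted here; SciPy's scipy.stats.mannwhitneyu provides it.

```python
def mann_whitney_u(sample1, sample2):
    """Return (U1, U2), the Mann-Whitney U statistics for the two samples."""
    pooled = sorted(list(sample1) + list(sample2))
    # Assign each distinct value the average of the ranks it occupies (midrank).
    ranks = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    n1, n2 = len(sample1), len(sample2)
    rank_sum1 = sum(ranks[x] for x in sample1)
    u1 = rank_sum1 - n1 * (n1 + 1) / 2
    return u1, n1 * n2 - u1

# Hypothetical mental health scores (0 to 100), not real data.
active = [72, 85, 64, 90, 78]
inactive = [55, 61, 70, 48, 66]
print(mann_whitney_u(active, inactive))
```

U1 counts the pairs in which the first-sample value exceeds the second-sample value, with ties counted as one half. Values near 0 or near n1 * n2 indicate that one group tends to rank consistently above the other.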
Historical Context
The t-test was developed by William Sealy Gosset, who worked at the Guinness Brewery and published under the pseudonym "Student".
Summary of Assumptions for Two-Sample T-Test
Normal Distribution: Check for normality, crucial for small samples.
Independent Samples: Ensure that groups are independent.
Equal Variances: Check that the group variances are similar, e.g., with box plots or Levene's test.
Looking Ahead
The next topic covers the analysis of paired data, which extends the ideas from comparing two means to related (non-independent) samples.