Continuous data analysis often involves comparing the means of two groups.
The independent-samples t-test (also called the two-sample t-test) is commonly used for this purpose.
In this analysis:
The independent variable (categorical) indicates group membership (e.g., treatment vs. placebo).
The outcome (response variable) is continuous (e.g., systolic blood pressure).
Systolic Blood Pressure Analysis:
Example variable: Systolic blood pressure (in mmHg).
Compare means across two groups:
Group 1: Patients receiving a placebo.
Group 2: Patients receiving a new drug (e.g., Drug A).
Dependent variable (outcome): Systolic blood pressure; the test compares its mean between the two groups.
Mental Health Score Analysis:
Example variable: Mental health score (range 0 to 100).
Groups may include:
Physically active vs. inactive participants (categorical independent variable).
Null Hypothesis (H0): Mean systolic blood pressure for patients on placebo equals that for patients on Drug A.
Alternative Hypothesis (H1): There is a difference in mean systolic blood pressure between the two groups.
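To make the hypotheses concrete, here is a minimal sketch of the test statistic used to evaluate H0, computed by hand with the Python standard library. The blood pressure readings and the helper name `pooled_t_statistic` are hypothetical, chosen only for illustration, and the formula shown is the equal-variance (pooled) version of the test.

```python
import math
import statistics

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
placebo = [142, 138, 150, 145, 147, 139, 151, 144]
drug_a = [135, 130, 141, 128, 137, 133, 139, 132]


def pooled_t_statistic(x, y):
    """Return (t, df) for the equal-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    mean_diff = statistics.mean(x) - statistics.mean(y)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = (
        (n1 - 1) * statistics.variance(x) + (n2 - 1) * statistics.variance(y)
    ) / (n1 + n2 - 2)
    # Standard error of the difference in means.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return mean_diff / std_err, n1 + n2 - 2


t_stat, df = pooled_t_statistic(placebo, drug_a)
print(f"t = {t_stat:.3f}, df = {df}")
```

A t-statistic far from zero, relative to a t distribution with the stated degrees of freedom, counts as evidence against H0.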
Normality of Distribution:
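Responses within each group should be approximately normally distributed; this matters most for small sample sizes.
With larger samples (a common rule of thumb is n > 30), the t-test is fairly robust to non-normality, because the sample means are then approximately normally distributed.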
Independence of Samples:
Participants in each group must be independent.
No repeated measures or related participants involved.
Homogeneity of Variances:
Variances between the two groups should be approximately equal.
Assumption can be visually inspected using box plots.
Levene's test can formally assess variance equality.
Box Plot Analysis:
Visual check of distribution shape (symmetry) and of how similar the spread is between groups.
Medians are drawn as lines within the boxes and should ideally sit near the center of each box.
A median that sits well off-center suggests a skewed distribution and a possible violation of normality.
Example:
Box plot comparison for systolic blood pressure in placebo vs. Drug A:
Assess variability and check if assumptions are satisfied.
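The quantities a box plot draws can also be computed directly. The sketch below uses hypothetical readings and a hypothetical helper, `box_summary`, to report the quartiles, the interquartile range (a measure of spread), and where the median sits inside the box. It is a numeric companion to the visual check, not a replacement for plotting.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
placebo = [142, 138, 150, 145, 147, 139, 151, 144]
drug_a = [135, 130, 141, 128, 137, 133, 139, 132]


def box_summary(values):
    """Return the quartiles, IQR, and relative median position for one group."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # 0.5 means the median is exactly mid-box; values near 0 or 1 suggest skew.
    median_position = (median - q1) / iqr
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "median_position": median_position,
    }


for name, values in (("placebo", placebo), ("drug_a", drug_a)):
    print(name, box_summary(values))
```

Similar IQRs across groups are consistent with similar variances, and a median position close to 0.5 is consistent with a symmetric distribution.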
Levene's test is used to formally assess the equality of variances.
If variances are unequal, a modified t-test that does not assume equal variances (Welch's t-test) should be used.
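A minimal sketch of this two-step workflow, assuming SciPy is available: `scipy.stats.levene` tests variance equality, and `scipy.stats.ttest_ind` with `equal_var=False` runs Welch's t-test, a common choice for the modified test. The data and the 0.05 cutoff are assumptions made for illustration.

```python
from scipy import stats

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
placebo = [142, 138, 150, 145, 147, 139, 151, 144]
drug_a = [135, 130, 141, 128, 137, 133, 139, 132]

# Levene's test; center="median" is SciPy's default (Brown-Forsythe variant).
levene_stat, levene_p = stats.levene(placebo, drug_a, center="median")

# Assumed decision rule: treat variances as equal unless Levene's p < 0.05.
equal_var = levene_p >= 0.05
result = stats.ttest_ind(placebo, drug_a, equal_var=equal_var)

test_name = "pooled t-test" if equal_var else "Welch's t-test"
print(f"Levene: W = {levene_stat:.3f}, p = {levene_p:.3f}")
print(f"{test_name}: t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```

Some analysts skip the preliminary test and use Welch's t-test by default, since it gives similar results when the variances happen to be equal.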
Results should summarize the findings succinctly:
E.g., "There was significant evidence of mean differences in mental health scores between active and inactive groups."
Statistical significance should be reported (t-statistic and p-value).
A confidence interval for the difference in means gives a range of plausible values for the true difference between the population means.
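The reported quantities can be assembled as in the sketch below, which assumes SciPy is available and equal variances hold. The data are hypothetical, and the 95% confidence interval is built by hand from the pooled standard error and the critical value from `scipy.stats.t.ppf`.

```python
import math
import statistics

from scipy import stats

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
placebo = [142, 138, 150, 145, 147, 139, 151, 144]
drug_a = [135, 130, 141, 128, 137, 133, 139, 132]

n1, n2 = len(placebo), len(drug_a)
df = n1 + n2 - 2
diff = statistics.mean(placebo) - statistics.mean(drug_a)

# Equal-variance (pooled) two-sample t-test.
result = stats.ttest_ind(placebo, drug_a, equal_var=True)

# Pooled standard error of the difference in means.
pooled_var = (
    (n1 - 1) * statistics.variance(placebo)
    + (n2 - 1) * statistics.variance(drug_a)
) / df
std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

# 95% confidence interval: difference +/- critical t value * standard error.
t_crit = stats.t.ppf(0.975, df)
ci_low, ci_high = diff - t_crit * std_err, diff + t_crit * std_err

print(
    f"Mean difference = {diff:.2f} mmHg, t({df}) = {result.statistic:.2f}, "
    f"p = {result.pvalue:.4f}, 95% CI [{ci_low:.2f}, {ci_high:.2f}]"
)
```

An interval that excludes zero agrees with a significant two-sided test at the 5% level, and its width shows how precisely the difference is estimated.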
If assumptions for the t-test are violated, a non-parametric test can be used:
Mann-Whitney U Test:
Used when sample sizes are small or data are not normally distributed.
Ranks the data rather than comparing means directly; it is often described as comparing medians, which holds when the two distributions have a similar shape.
Non-parametric tests make fewer assumptions but typically have lower power than parametric tests when the parametric assumptions hold.
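As a rough illustration of the non-parametric alternative, the sketch below runs the test with `scipy.stats.mannwhitneyu`, assuming SciPy is available. The data are hypothetical.

```python
from scipy import stats

# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
placebo = [142, 138, 150, 145, 147, 139, 151, 144]
drug_a = [135, 130, 141, 128, 137, 133, 139, 132]

# Two-sided test: the groups differ in either direction.
u_stat, p_value = stats.mannwhitneyu(placebo, drug_a, alternative="two-sided")

print(f"U = {u_stat:.1f}, p = {p_value:.4f}")
```

Because the test works on ranks, a single extreme reading has far less influence on the result than it would on a t-test.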
The t-test was developed by William Sealy Gosset, who worked at the Guinness Brewery and published under the pseudonym "Student".
Normal Distribution: Check for normality, crucial for small samples.
Independent Samples: Ensure that groups are independent.
Equal Variances: Test and confirm variances are similar using tests like Levene's test.
The next topic covers the analysis of paired data, building on the concepts used here to compare two means.