Statistical Analysis and Interpretation
Study Design and Analysis Overview
Overview of Study Process
Identify study question
Select study approach
Design study and collect data
Analyze data
Report findings
Objectives
Define hypotheses used in statistical testing
Interpret p-values
Interpret confidence intervals
Select correct statistical tests
Statistical Testing
Reasons for Statistical Testing
Used for comparing multiple groups
Applicable in bivariate and multivariate analysis
Key Consideration: "Are the groups being compared different?"
Types of Analysis by Number of Variables
Univariate (1 variable)
Examples: counts, proportions, percentages
Bivariate (2 variables)
Examples: rate ratios, odds ratios
Multivariate (3+ variables)
Examples: regression models
Comparative Analysis Definitions
Parameter: A measurable numeric characteristic of a population.
Statistic: A measured numeric characteristic of a sample drawn from a population (e.g., sample mean).
Inferential Statistics: Uses statistics from a random sample to draw conclusions about the values of parameters in the population as a whole (e.g., using the sample mean to estimate the population mean).
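The distinction between a parameter and a statistic can be illustrated with a small simulation. This is a minimal sketch: the population (mean 100, SD 15), its size, and the sample size are hypothetical values chosen only for illustration.

```python
import random
import statistics


def sample_vs_population(pop_size=10_000, sample_size=200, seed=1):
    """Return (population mean, sample mean) for a simulated population."""
    rng = random.Random(seed)
    # Hypothetical population: values drawn around a mean of 100 with SD 15.
    population = [rng.gauss(100, 15) for _ in range(pop_size)]
    # Simple random sample, drawn without replacement.
    sample = rng.sample(population, sample_size)
    parameter = statistics.mean(population)  # usually unknown in a real study
    statistic = statistics.mean(sample)      # what the study actually measures
    return parameter, statistic


parameter, statistic = sample_vs_population()
print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The sample mean will usually be close to, but not equal to, the population mean; inferential statistics quantifies that uncertainty.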
Study Approaches
Key Analysis Steps
Case-control study
Objective: Show that cases and controls are similar except for disease status
Measure: Odds ratio to compare exposure histories
Adjusted OR: Through regression or other methods
Cohort study
Objective: Show that exposed and unexposed are similar except for exposure status
Measure: Rate ratio or risk ratio to compare disease by exposure history
Adjusted RR or IRR: Through regression or other methods
Experiment
Objective: Show similarities in individuals assigned to intervention and control groups
Measure: Compare outcomes by exposure group using intent-to-treat analysis
Metric: Usually RR or IRR, often converted to efficacy
Note: If the experimental design or its execution was poor, a per-protocol analysis may be considered.
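The measures named for each study approach can be computed directly from counts. The sketch below is illustrative only: the function names and all counts are hypothetical, and it shows crude (unadjusted) estimates, not the regression-adjusted versions.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Crude odds ratio for a case-control 2x2 table."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


def risk_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Crude risk ratio for a cohort study or experiment."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


def efficacy_percent(rr):
    """Convert a risk ratio from an experiment to efficacy: (1 - RR) x 100."""
    return (1 - rr) * 100


# Hypothetical case-control data: exposure history among cases and controls.
or_estimate = odds_ratio(40, 20, 60, 80)

# Hypothetical cohort data: disease among exposed and unexposed participants.
rr_cohort = risk_ratio(30, 200, 10, 200)

# Hypothetical trial data: disease in the intervention group vs. the control group.
rr_trial = risk_ratio(5, 500, 20, 500)

print(f"OR = {or_estimate:.2f}")
print(f"Cohort RR = {rr_cohort:.2f}")
print(f"Trial RR = {rr_trial:.2f}, efficacy = {efficacy_percent(rr_trial):.0f}%")
```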
Hypotheses in Statistical Tests
Comparative Statistical Tests Definitions
Designed to test for differences.
Null Hypothesis (H0)
Describes the expected result if there is no difference between the groups being compared.
Null Result: Indicates no statistically significant difference.
Alternative Hypothesis (Ha)
Describes the expected result if there is a difference between the groups being compared.
Example Analysis
Mean Midterm Scores Analysis between male and female students in HK 201:
Hypotheses:
H0: Mean midterm scores do not differ between male and female students.
Ha: Mean midterm scores differ between male and female students.
Overall Mean = 114.25 (SD = 22.60)
Range: 64.50 – 150
Group Statistics Analysis
Independent Samples Test Results
Group Statistics: Female versus Male
| Sex | N | Mean | Std. Deviation | Std. Error Mean |
|------|----|---------|----------------|------------------|
| 0 | 44 | 113.9205| 21.09694 | 3.18048 |
| 1 | 20 | 115.0000| 26.12470 | 5.84166 |
Levene's Test for Equality of Variances
Tests if two groups have equal variances.
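The result determines which version of the two-sample t-test to interpret: the equal-variances version if the variances do not differ significantly, or the unequal-variances (Welch) version if they do.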
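Levene's test itself requires the raw scores, which are not shown here, but both versions of the two-sample t statistic can be computed from the group statistics table above. The following is a minimal sketch with hypothetical function names; it stops at the t statistic and degrees of freedom because the p-value requires the t distribution, which is not in the Python standard library.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t statistic and df, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate df, not assuming equal variances."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (mean1 - mean2) / se, df


# (mean, standard deviation, N) for Sex = 0 and Sex = 1 from the table.
group0 = (113.9205, 21.09694, 44)
group1 = (115.0000, 26.12470, 20)

t_equal, df_equal = pooled_t(*group0, *group1)
t_welch, df_welch = welch_t(*group0, *group1)
print(f"Equal variances assumed:     t = {t_equal:.3f}, df = {df_equal}")
print(f"Equal variances not assumed: t = {t_welch:.3f}, df = {df_welch:.1f}")
```

Under either version the absolute value of t is below 0.2, far from the value of roughly 2 needed for significance at α = 0.05 with these degrees of freedom, so the null hypothesis would not be rejected.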
Interpretation of p-values
Definition of p-value
A p-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the one observed, under the assumption that the null hypothesis is true.
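One way to make this definition concrete is a permutation test, which is a different technique from the t-test used in the example above. If the null hypothesis is true, group labels are interchangeable, so the p-value can be approximated as the share of random relabelings that give a difference in means at least as extreme as the observed one. The function name and the scores below are hypothetical.

```python
import random
import statistics


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Approximate two-sided p-value for a difference in means."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    as_extreme = 0
    for _ in range(n_permutations):
        # Shuffle labels: under H0 any split of the pooled data is equally likely.
        rng.shuffle(pooled)
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            as_extreme += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0.
    return (as_extreme + 1) / (n_permutations + 1)


# Hypothetical exam scores for two small groups.
scores_a = [112, 98, 120, 105, 131, 117]
scores_b = [101, 95, 110, 99, 108, 104]
print(f"p = {permutation_p_value(scores_a, scores_b):.3f}")
```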
Threshold for p-values
Low p-value (typically ≤ 0.05):
Suggests the observed effect would be unlikely if only random chance were at work (that is, if the null hypothesis were true).
Provides evidence against the null hypothesis; the result is described as statistically significant.
High p-value (typically > 0.05):
Indicates insufficient evidence to reject the null hypothesis.
Significance Level
Standard significance level used is α = 0.05 or 5%.
Measures of Association
Common Comparative Statistics Types
Odds Ratio (OR): Used in case-control studies.
Rate Ratio (IRR) or Risk Ratio (RR): Used in cohort studies.
Prevalence Ratio (PR): Used in cross-sectional studies.
Interpretation of Confidence Intervals
95% Confidence Interval (CI):
Corresponds to a significance level of α = 0.05.
For OR and RR, a CI that does not include 1 (the null value) indicates statistical significance.
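A common way to obtain a 95% CI for an odds ratio is the log (Woolf) method, which builds the interval on the log scale and exponentiates the limits. The sketch below assumes that method; the function name and the 2x2 counts are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls, confidence=0.95):
    """Odds ratio with a Woolf (log-based) confidence interval."""
    a, b, c, d = exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
    or_estimate = (a * d) / (b * c)
    # Standard error of ln(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(or_estimate) - z * se_log)
    upper = math.exp(math.log(or_estimate) + z * se_log)
    return or_estimate, lower, upper


# Hypothetical case-control counts.
or_estimate, lower, upper = odds_ratio_ci(30, 15, 70, 85)
significant = not (lower <= 1 <= upper)
print(f"OR = {or_estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("Statistically significant at alpha = 0.05:", significant)
```

With these counts the whole interval lies above 1, so the association would be called statistically significant at α = 0.05.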
Effect Size
Cohen's d: A measure of the magnitude of an effect (the difference between two means in standard deviation units), conventionally interpreted as follows:
Small effect: 0.2
Moderate effect: 0.5
Large effect: 0.8
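As a worked illustration, Cohen's d for the midterm score example can be computed from the group statistics table by dividing the difference in means by the pooled standard deviation. This is a sketch using the pooled-SD form of d; the function name is hypothetical.

```python
import math


def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)


# Means, standard deviations, and N for Sex = 0 and Sex = 1 from the table.
d = cohens_d(113.9205, 21.09694, 44, 115.0000, 26.12470, 20)
print(f"Cohen's d = {d:.3f}")
```

The magnitude of d here is about 0.05, below the conventional cutoff for even a small effect, which agrees with the near-identical group means.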
Appropriateness of Statistical Tests
Assumption Checks
Analysts must ensure that the selected statistical tests are appropriate based on variable characteristics and their distributions.
Types of Tests
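Parametric Tests: Assume the variable follows a specific distribution (usually normal); typically used for ratio/interval variables.
Nonparametric Tests: Do not assume a specific distribution; used for ranked (ordinal) variables and for ratio/interval variables that are not normally distributed.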
Summary of Common Tests
Common tests for comparing two or more groups based on the type of variable being examined.
Correlation Analysis
Types of Correlation Coefficients
Pearson (r): For linear relationships between two ratio/interval variables.
Spearman's rho (ρs): For ordinal/rank variables.
Kendall's tau (τ): For ordinal/rank variables.
Phi Coefficient (φ): For two binary (dichotomous) variables.
Cramér’s V: For nominal categorical variables, including those with more than two categories.
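The difference between Pearson's r and Spearman's rho can be seen by computing both on the same data. The sketch below implements the two coefficients from their definitions (Spearman's rho as Pearson's r applied to ranks, with tied values given their average rank); the function names and the data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data with a perfectly monotonic but nonlinear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 8, 27, 64, 125]
print(f"Pearson r    = {pearson_r(x, y):.3f}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because y always increases with x, the rank-based coefficient equals 1, while Pearson's r is lower because the relationship is not a straight line.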
Regression Analysis
Regression Models Defined
Linear Regression: Predicts a value for continuous outcome variables.
Logistic Regression: Estimates associations between predictors and a binary outcome, usually reported as odds ratios.
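As an illustration of the two model types, the sketch below fits a one-predictor linear regression by ordinary least squares and shows how a logistic regression coefficient is converted to an odds ratio. The function names, the data, and the example coefficient are hypothetical; fitting a logistic model itself requires iterative estimation and is not shown.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def coefficient_to_odds_ratio(beta):
    """A logistic regression coefficient is a log odds ratio; exponentiate it."""
    return math.exp(beta)


# Hypothetical data: hours studied (x) and exam score (y).
hours = [1, 2, 3, 4, 5]
score = [52, 61, 68, 80, 89]
slope, intercept = fit_line(hours, score)
predicted = intercept + slope * 6  # predicted score for 6 hours of study
print(f"score = {intercept:.1f} + {slope:.1f} * hours; prediction at 6 h: {predicted:.1f}")

# Hypothetical logistic regression coefficient for a binary exposure.
print(f"OR = {coefficient_to_odds_ratio(0.693):.2f}")
```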
Comparing Paired Data
Paired Data Definition
Observations that are linked for analysis because they come from individuals matched on specific characteristics or from the same individual at multiple points in time.
Common Tests for Paired Populations
McNemar's Test: For nominal/binary data.
Wilcoxon Signed-Rank Test: For ordinal data.
Repeated-measures ANOVA: For assessing differences in means across repeated measurements.
Example of Tests for Pretest and Posttest Results
Tests for differences based on samples from the same group before and after an intervention.
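For a binary pretest/posttest outcome, McNemar's test uses only the discordant pairs, that is, the individuals whose result changed between the two measurements. The sketch below uses the uncorrected chi-square form with 1 degree of freedom (a continuity-corrected version also exists); the function name and the counts are hypothetical.

```python
import math


def mcnemar_test(changed_to_pass, changed_to_fail):
    """McNemar chi-square (1 df, no continuity correction) and its p-value."""
    b, c = changed_to_pass, changed_to_fail
    chi2 = (b - c) ** 2 / (b + c)
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical results: 15 people failed the pretest but passed the posttest,
# and 5 passed the pretest but failed the posttest.
chi2, p_value = mcnemar_test(15, 5)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
```

Pairs whose result did not change carry no information about the direction of change, which is why they do not enter the statistic.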
Learning Outcomes Check
Understand and distinguish between null (H0) and alternative (Ha) hypotheses.
Interpret p-values and confidence intervals appropriately.
Select appropriate statistical analysis based on variable types and research goals.