Effect Sizes and Confidence Intervals
Statistical Significance Testing
Previous lecture focused on statistical significance testing.
Current lecture shifts focus to effect sizes and confidence intervals.
Effect sizes and confidence intervals are becoming preferred metrics for understanding data.
Transitioning from Statistical Significance Testing
Some experts argue against continuing use of statistical significance testing.
Confidence intervals convey everything a significance test does (whether the "no effect" value is plausible) while also describing the size of the effect.
Effect Sizes
Effect sizes quantify the strength or magnitude of an experimental effect.
They show how large or small an effect or relationship between two variables is.
Different from statistical significance testing, which answers the binary question of whether an effect exists.
Effect sizes clarify how impactful that effect actually is.
Types of Effect Sizes
Three main types of effect sizes discussed:
Standardized difference
Correlation
Proportions
Note that many different tests can be used to measure each of these effect sizes.
Standardized Differences
Standardized mean difference is commonly associated with experimental designs.
Difference between standardized and raw mean difference:
The standardized mean difference makes results interpretable even when there is no context for what a given change signifies.
Example:
Suppose a drug is meant to reduce hiccups from 1,000 to 925 daily (a reduction of 75).
Without context, the magnitude of a 75-hiccup reduction is unclear.
Standardization supplies that context via the standard deviation, showing how large the change really is.
Raw mean difference may suffice when the context offers meaningful understanding.
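As a minimal sketch of this distinction (all numbers below are made up to echo the hiccup example), a standardized mean difference such as Cohen's d divides the raw difference by the pooled standard deviation:

```python
import numpy as np

rng = np.random.default_rng(42)
control = rng.normal(loc=1000, scale=120, size=50)  # ~1000 hiccups/day
treated = rng.normal(loc=925, scale=120, size=50)   # ~925 hiccups/day

raw_diff = control.mean() - treated.mean()          # raw mean difference

# Cohen's d: raw difference divided by the pooled standard deviation
n1, n2 = len(control), len(treated)
pooled_sd = np.sqrt(((n1 - 1) * control.var(ddof=1) +
                     (n2 - 1) * treated.var(ddof=1)) / (n1 + n2 - 2))
cohens_d = raw_diff / pooled_sd

print(f"raw difference: {raw_diff:.1f} hiccups/day")
print(f"Cohen's d:      {cohens_d:.2f}")
```

Dividing by the pooled standard deviation turns "a reduction of roughly 75 hiccups" into standard-deviation units, which can then be judged against the conventional benchmarks discussed later.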
Comparison of Standardized and Raw Mean Differences
Standardized mean difference = interpretable without domain context (expressed in standard-deviation units).
Raw mean difference = appropriate when the context itself is meaningful (e.g., a reduction of 10 cigarettes daily in a smoking intervention).
Common Tests for Effect Sizes
Standardized Differences:
Cohen's d, Hedges' g
Correlations:
Pearson's r, Point-Biserial
Proportions:
Odds ratio (the one proportion-based measure covered here)
Understanding Correlation Measures
Two types of correlation:
Pearson's r: Strength between two continuous variables.
Point-biserial: Strength between one continuous variable and one dichotomous variable (two options).
Dichotomous variables define two clear categories (e.g., yes/no).
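A brief sketch of both measures using scipy (the variables and data here are synthetic, purely for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
hours_studied = rng.uniform(0, 10, size=100)
exam_score = 50 + 4 * hours_studied + rng.normal(0, 8, size=100)

# Pearson's r: strength between two continuous variables
r, _ = stats.pearsonr(hours_studied, exam_score)

# Point-biserial: one continuous and one dichotomous (0/1) variable
passed = (exam_score >= 70).astype(int)
r_pb, _ = stats.pointbiserialr(passed, hours_studied)

print(f"Pearson's r:    {r:.2f}")
print(f"point-biserial: {r_pb:.2f}")
```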
Odds Ratios
Commonly used in observational studies.
Note the unique scoring range:
Score range: 0 to ∞ (no negative odds ratios).
Odds ratio of 1 indicates no effect.
Scores < 1 signify a protective factor; scores > 1 indicate greater risk of outcome.
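As a small illustration with a hypothetical 2x2 table, the odds ratio is the odds of the outcome among the exposed divided by the odds among the unexposed:

```python
# Rows: exposed / unexposed; columns: outcome / no outcome (made-up counts).
exposed   = [20, 80]
unexposed = [10, 90]

odds_exposed   = exposed[0] / exposed[1]       # 20/80 = 0.25
odds_unexposed = unexposed[0] / unexposed[1]   # 10/90 ~ 0.11

odds_ratio = odds_exposed / odds_unexposed     # ~2.25
print(f"odds ratio: {odds_ratio:.2f}")         # > 1 -> greater risk of outcome
```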
Effect Size Ranges
Effect size measures have predefined ranges for interpretation:
Standardized Mean Difference:
Small effect = 0.2; Medium effect = 0.5; Large effect = 0.8 (see the helper sketch after this list).
Pearson's r (Correlations):
Small = 0.1 to 0.3;
Medium = 0.3 to 0.5;
Large > 0.5.
Odds Ratios:
< 1 = protective factor; > 1 = risk factor.
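These cutoffs are conventions rather than hard rules. A hypothetical helper applying the Cohen's d benchmarks above might look like:

```python
# Hypothetical helper mapping a standardized mean difference to the
# conventional labels listed above (cutoffs are conventions, not hard rules).
def label_cohens_d(d: float) -> str:
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"

print(label_cohens_d(0.64))  # -> "medium"
```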
Interpretation of Effect Sizes
Effect sizes communicate the magnitude of relationships or experimental effects.
Confidence intervals further clarify expected ranges of effect sizes.
Confidence Intervals
Confidence interval (CI): the range of effect-size values we would expect to see if the study were repeated with different samples.
Gives a qualitative sense of how much confidence to place in a finding, given that different samples yield different estimates.
Common calculation levels: 95% and 99%.
CIs built around effect sizes are based on sample characteristics.
Relationship between Confidence Intervals and Statistical Testing
CIs connect to significance testing via alpha levels:
Alpha = 0.05 relates to a 95% CI.
Alpha = 0.01 relates to a 99% CI.
The calculated CI is always associated with the effect size calculated in the study.
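A quick sketch of that correspondence: the critical z value used to build a Wald-style CI comes from the 1 - alpha/2 quantile of the normal distribution.

```python
from scipy import stats

for alpha in (0.05, 0.01):
    z = stats.norm.ppf(1 - alpha / 2)
    print(f"alpha = {alpha} -> {100 * (1 - alpha):.0f}% CI, z = {z:.3f}")
# alpha = 0.05 -> 95% CI, z = 1.960
# alpha = 0.01 -> 99% CI, z = 2.576
```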
Interpretation Essentials for CIs
Report CIs using lower and upper limits.
For example, an odds ratio of 7.5 might be reported with a 95% CI of [5.32, 10.45].
Check whether the interval contains the "no effect" value (0 for mean differences, 1 for odds ratios), as in the sketch below.
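A minimal sketch of how such an interval can be computed, assuming a Wald-style 95% CI on the log-odds scale with hypothetical 2x2 counts:

```python
import math

# Hypothetical counts: exposed+outcome, exposed+none, unexposed+outcome, unexposed+none
a, b, c, d = 40, 10, 20, 30
or_hat = (a * d) / (b * c)

# Standard error of the log odds ratio
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)
z = 1.96  # ~97.5th normal percentile for a 95% CI

lower = math.exp(math.log(or_hat) - z * se_log_or)
upper = math.exp(math.log(or_hat) + z * se_log_or)

print(f"OR = {or_hat:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
# If the interval excludes 1, the "no effect" value is not plausible.
```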
Meaning of Confidence Intervals
If confidence intervals contain 0 or 1:
Confidence is weak in asserting an effect occurred.
Provides more information than simple statistical significance:
Indicates if effect size is likely positive, negative, protective, or risky.
Precision of Estimates
Narrower confidence intervals indicate greater precision in the estimate.
Wider intervals indicate less precision, i.e. more variability in the effect size across samples (see the sketch below).
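The following synthetic illustration shows the same idea numerically: as the sample size grows, the 95% CI around a mean estimate narrows.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
for n in (20, 200, 2000):
    x = rng.normal(0.5, 1.0, size=n)  # synthetic data, true mean = 0.5
    sem = stats.sem(x)                # standard error of the mean
    lo, hi = stats.t.interval(0.95, df=n - 1, loc=x.mean(), scale=sem)
    print(f"n={n:5d}  95% CI width = {hi - lo:.3f}")
```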
Examples of Confidence Intervals
Example (Standardized Mean Difference):
Cohen's d with a 95% CI of [0.1, 0.9]: an effect occurred, since 0 is not included; however, the interval is wide, so precision is low. The true effect could be anywhere from small (0.1) to large (0.9), though it is positive throughout.
Example (Odds Ratios):
An odds ratio of 2 with a 99% CI of [0.7, 3.5] is inconclusive: the interval contains 1, so statistical significance is not assured, and we cannot even say whether the exposure is protective (< 1) or risky (> 1).