Effect Sizes and Confidence Intervals

Statistical Significance Testing

  • Previous lecture focused on statistical significance testing.

  • Current lecture shifts focus to effect sizes and confidence intervals.

  • Effect sizes and confidence intervals are becoming preferred metrics for understanding data.

Transitioning from Statistical Significance Testing

  • Some experts argue against continuing use of statistical significance testing.

  • Confidence intervals convey the same information as statistical significance tests while also communicating the magnitude of the effect.

Effect Sizes

  • Effect sizes quantify the strength or magnitude of an experimental effect.

  • This allows an understanding of how large or small an effect, or the relationship between two variables, actually is.

  • Different from statistical significance testing, which answers the binary question of whether an effect exists.

  • Effect sizes clarify how impactful that effect actually is.

Types of Effect Sizes

  • Three main types of effect sizes discussed:

    • Standardized difference

    • Correlation

    • Proportions

  • It's important to recognize that many tests can be utilized to measure these effect sizes.

Standardized Differences

  • Standardized mean difference is commonly associated with experimental designs.

  • Difference between standardized mean difference and raw mean difference:

    • Standardized mean difference helps interpret results when there is no intuitive context for what a given change signifies.

    • Example: testing a drug meant to reduce daily hiccups from 1000 to 925 (a reduction of 75).

    • Without context, the magnitude of a 75-hiccup reduction is unclear.

    • Standardizing by the standard deviation supplies that context, indicating how large the change is relative to the variability in the data (see the sketch after this list).

  • Raw mean difference may suffice when the units themselves carry meaningful context.
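A minimal sketch of the distinction, using invented hiccup counts (the 1000 and 925 figures from the example above are treated as group means, and the standard deviation of 150 is an assumption made purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical daily hiccup counts: control mean ~1000, treated mean ~925,
# with an assumed standard deviation of 150.
control = rng.normal(loc=1000, scale=150, size=50)
treated = rng.normal(loc=925, scale=150, size=50)

raw_diff = control.mean() - treated.mean()  # raw mean difference, in hiccups/day

# Pooled standard deviation, as used in Cohen's d.
n1, n2 = len(control), len(treated)
pooled_sd = np.sqrt(((n1 - 1) * control.var(ddof=1) +
                     (n2 - 1) * treated.var(ddof=1)) / (n1 + n2 - 2))

d = raw_diff / pooled_sd  # standardized mean difference

print(f"raw difference: {raw_diff:.1f} hiccups/day")
print(f"standardized difference (Cohen's d): {d:.2f}")
```

The raw difference only means something if you know how variable daily hiccup counts are; dividing by the pooled standard deviation expresses the change in standard-deviation units, which is interpretable without that domain knowledge.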

Comparison of Standardized and Raw Mean Differences

  • Standardized mean difference = useful when the raw units lack intuitive meaning; the standard deviation supplies the context.

  • Raw mean difference = useful when the specific context already offers understanding (e.g., a reduction of 10 cigarettes daily in a smoking intervention).

Common Tests for Effect Sizes

  • Standardized Differences:

    • Cohen's d, Hedges' g

  • Correlations:

    • Pearson's r, Point-Biserial

  • Proportions:

    • Odds ratio (the only metric covered for proportions).
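Cohen's d is slightly biased upward in small samples; Hedges' g applies a correction factor to account for this. A sketch of the commonly used approximation (the inputs are placeholders):

```python
def hedges_g(d: float, n1: int, n2: int) -> float:
    """Apply the small-sample bias correction to Cohen's d.

    Uses the common approximation J = 1 - 3 / (4 * df - 1),
    where df = n1 + n2 - 2.
    """
    df = n1 + n2 - 2
    return d * (1 - 3 / (4 * df - 1))

# The correction matters for small groups and vanishes for large ones.
print(hedges_g(0.8, n1=10, n2=10))    # ~0.77
print(hedges_g(0.8, n1=500, n2=500))  # ~0.80
```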

Understanding Correlation Measures

  • Two types of correlation:

    • Pearson's r: Strength between two continuous variables.

    • Point-biserial: Strength between one continuous variable and one dichotomous variable (two options).

  • Dichotomous variables define two clear categories (e.g., yes/no).
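A quick sketch of both correlation measures using SciPy; the variables (study hours, exam scores, a yes/no group indicator) and the simulated data are invented for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Two continuous variables (e.g., hours studied vs. exam score): Pearson's r.
hours = rng.uniform(0, 10, size=100)
score = 50 + 4 * hours + rng.normal(0, 8, size=100)
r, p = stats.pearsonr(hours, score)
print(f"Pearson's r = {r:.2f} (p = {p:.3g})")

# One dichotomous and one continuous variable: point-biserial correlation.
group = rng.integers(0, 2, size=100)  # dichotomous, e.g., yes/no
outcome = 10 + 5 * group + rng.normal(0, 4, size=100)
r_pb, p = stats.pointbiserialr(group, outcome)
print(f"point-biserial r = {r_pb:.2f} (p = {p:.3g})")
```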

Odds Ratios

  • Commonly used in observational studies.

  • Note the unique scoring range:

    • Score range: 0 to ∞ (no negative odds ratios).

    • Odds ratio of 1 indicates no effect.

    • Scores < 1 signify a protective factor; scores > 1 indicate greater risk of outcome.
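A minimal sketch of computing an odds ratio by hand from a 2x2 table; the counts are invented for illustration:

```python
# Hypothetical 2x2 table (exposure vs. outcome):
#
#                outcome   no outcome
#   exposed         a=40        b=60
#   unexposed       c=20        d=80
a, b, c, d = 40, 60, 20, 80

odds_exposed = a / b      # odds of the outcome among the exposed
odds_unexposed = c / d    # odds of the outcome among the unexposed
odds_ratio = odds_exposed / odds_unexposed

print(f"odds ratio = {odds_ratio:.2f}")  # ~2.67 here: > 1, suggesting greater risk
```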

Effect Size Ranges

  • Effect size measures have conventional ranges for interpretation:

    • Standardized Mean Difference: small = 0.2; medium = 0.5; large = 0.8.

    • Pearson's r (Correlations): small = 0.1 to 0.3; medium = 0.3 to 0.5; large > 0.5.

    • Odds Ratios: < 1 = protective factor; > 1 = greater risk of the outcome.
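These benchmarks lend themselves to a small lookup helper. The sketch below labels a standardized mean difference using the cutoffs listed above; treating the in-between ranges (e.g., 0.2 up to 0.5 as "small") is one common reading of those benchmarks, not a universal rule:

```python
def interpret_d(d: float) -> str:
    """Label a standardized mean difference using the conventional cutoffs."""
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"

for d in (0.1, 0.35, 0.6, 1.2):
    print(d, "->", interpret_d(d))
```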

Interpretation of Effect Sizes

  • Effect sizes communicate the magnitude of relationships or experimental effects.

  • Confidence intervals further clarify expected ranges of effect sizes.

Confidence Intervals

  • Confidence interval (CI): a range of plausible values for the effect size, reflecting the estimates expected if the study were repeated with different samples.

  • A CI allows a qualitative assessment of confidence in the findings across different samples.

  • Common calculation levels: 95% and 99%.

  • CIs are built around the study's effect size using sample characteristics such as sample size and variability (see the sketch below).
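A sketch of building a 95% CI around a raw mean difference, using the t distribution and a pooled standard error; the two groups are simulated placeholders:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group_a = rng.normal(10, 3, size=40)
group_b = rng.normal(12, 3, size=40)

diff = group_b.mean() - group_a.mean()  # the effect size (raw mean difference)

# Pooled standard error of the difference (equal-variance form).
n1, n2 = len(group_a), len(group_b)
sp2 = ((n1 - 1) * group_a.var(ddof=1) +
       (n2 - 1) * group_b.var(ddof=1)) / (n1 + n2 - 2)
se = np.sqrt(sp2 * (1 / n1 + 1 / n2))

# 95% CI: estimate +/- critical t value * standard error.
t_crit = stats.t.ppf(0.975, df=n1 + n2 - 2)
print(f"difference = {diff:.2f}, "
      f"95% CI [{diff - t_crit * se:.2f}, {diff + t_crit * se:.2f}]")
```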

Relationship between Confidence Intervals and Statistical Testing

  • CIs connect to significance testing via alpha levels:

    • Alpha = 0.05 relates to a 95% CI.

    • Alpha = 0.01 relates to a 99% CI.

  • The CI is always constructed around the effect size calculated in the study.
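The correspondence can be checked directly: a two-sample t-test at alpha = 0.05 is significant exactly when the 95% CI for the mean difference excludes 0. A sketch with simulated data (note that calling confidence_interval() on the t-test result requires a recent SciPy; on older versions, compute the CI manually as in the previous sketch):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(0.0, 1.0, size=30)
b = rng.normal(0.5, 1.0, size=30)

result = stats.ttest_ind(b, a)
ci = result.confidence_interval(confidence_level=0.95)  # recent SciPy only

# These two checks always agree: p < 0.05 exactly when the 95% CI excludes 0.
print(f"p = {result.pvalue:.4f}; significant at alpha = 0.05: {result.pvalue < 0.05}")
print(f"95% CI [{ci.low:.2f}, {ci.high:.2f}]; "
      f"excludes 0: {not (ci.low <= 0 <= ci.high)}")
```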

Interpretation Essentials for CIs

  • Report CIs using lower and upper limits.

  • Example: for a calculated odds ratio of 7.5, the 95% CI might be reported as [5.32, 10.45].

  • Interpret CIs by checking whether they contain the no-effect (null) value: 0 for mean differences, 1 for odds ratios (see the sketch below).
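A small helper combining the reporting convention with the null-value check; the function name and formatting are one possible style, and the numbers reuse the odds-ratio example above:

```python
def report_ci(name: str, estimate: float, lower: float, upper: float,
              null_value: float, level: int = 95) -> str:
    """Format an effect size with its CI and flag whether the null value is inside."""
    verdict = "contains" if lower <= null_value <= upper else "excludes"
    return (f"{name} = {estimate:.2f}, {level}% CI [{lower:.2f}, {upper:.2f}] "
            f"({verdict} the null value {null_value})")

# For an odds ratio, the no-effect value is 1.
print(report_ci("OR", 7.5, 5.32, 10.45, null_value=1))
# -> OR = 7.50, 95% CI [5.32, 10.45] (excludes the null value 1)
```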

Meaning of Confidence Intervals

  • If a confidence interval contains its null value (0 for differences, 1 for odds ratios):

    • Confidence is weak in asserting an effect occurred.

  • Provides more information than simple statistical significance:

    • Indicates whether the effect size is likely positive, negative, protective, or risky.

Precision of Estimates

  • Narrower confidence intervals indicate greater precision in the estimate.

  • Wider intervals suggest less precision, indicating potential variability in the effect size across samples.

Examples of Confidence Intervals

  • Example (Standardized Mean Difference):

    • Cohen's d with a 95% CI of [0.1, 0.9]: an effect occurred, as 0 is not included; however, precision is relatively low, since the interval spans from a very small to a large positive effect.

  • Example (Odds Ratios):

    • Odds ratio calculated at 2 with a 99% CI spanning [0.7, 3.5]: indicates an inconclusive effect, since the interval contains 1. Statistical significance (at alpha = 0.01) is therefore not achieved, and whether the factor is protective or risky remains uncertain (checked in the snippet below).
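A minimal check that formalizes both interpretations, applying the contains-the-null-value rule to the two example intervals:

```python
# Cohen's d example: 95% CI [0.1, 0.9]; the null value for a difference is 0.
print(0.1 <= 0 <= 0.9)  # False -> 0 excluded, so an effect occurred
# Odds ratio example: 99% CI [0.7, 3.5]; the null value for a ratio is 1.
print(0.7 <= 1 <= 3.5)  # True  -> 1 included, so the result is inconclusive
```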