02 CARDS Significance Effect Size Power

Statistical Significance, Effect Size & Power

Statistical Power

  • Statistical Power Definition: The probability that a study will yield a significant result if the research hypothesis is true.

    • Correct rejection of a false null hypothesis (H₀).

    • Formula: Power = 1 − β (see the sketch after this list)

    • β represents the probability of a Type II error (failing to reject a false null hypothesis).
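
A minimal sketch of this power calculation for a one-tailed z-test (all numbers hypothetical; assumes a known population σ and that scipy is installed):

    from scipy.stats import norm

    def z_test_power(mu0, mu1, sigma, n, alpha):
        """Power = 1 - beta: P(reject H0 | H1 is true), one-tailed z-test."""
        se = sigma / n ** 0.5               # standard error of the mean
        crit = norm.ppf(1 - alpha)          # critical z value under H0
        shift = (mu1 - mu0) / se            # H1 mean in standard-error units
        return 1 - norm.cdf(crit - shift)   # area of H1 beyond the cutoff

    power = z_test_power(mu0=100, mu1=105, sigma=15, n=25, alpha=0.05)
    print(f"power = {power:.2f}, beta = {1 - power:.2f}")  # ~0.51, ~0.49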

Factors Affecting Statistical Power (1 − β)

  • Power increases with:

    • A larger α (alpha), the probability of a Type I error

    • A larger effect size under the true alternative hypothesis (H₁)

    • A larger sample size (n)

  • Power also depends on the statistical test being used.

Power as a Function of α

  • Larger α Levels: A higher α yields greater statistical power, at the cost of a greater Type I error risk (see the sketch below).
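
A quick check of this claim, reusing the hypothetical setup from the sketch above (μ₀ = 100, μ₁ = 105, σ = 15, n = 25) and varying only α:

    from scipy.stats import norm

    mu0, mu1, sigma, n = 100, 105, 15, 25
    se = sigma / n ** 0.5
    for alpha in (0.01, 0.05, 0.10):
        power = 1 - norm.cdf(norm.ppf(1 - alpha) - (mu1 - mu0) / se)
        print(f"alpha = {alpha:.2f} -> power = {power:.2f}")  # 0.25, 0.51, 0.65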

Power as a Function of H₁

  • Effect Size: The difference between the means under H₀ and H₁ (μ₀ − μ₁).

    • Larger effect sizes lead to greater power (illustrated in the sketch below).
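
The same hypothetical setup, now varying only the mean under H₁ (α = 0.05, σ = 15, n = 25): the wider the gap between μ₀ and μ₁, the higher the power.

    from scipy.stats import norm

    mu0, sigma, n, alpha = 100, 15, 25, 0.05
    se = sigma / n ** 0.5
    for mu1 in (102, 105, 110):
        power = 1 - norm.cdf(norm.ppf(1 - alpha) - (mu1 - mu0) / se)
        print(f"mu1 = {mu1} -> power = {power:.2f}")  # 0.16, 0.51, 0.95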

Power as a Function of Sample Size

  • Impact of n:

    • The standard deviation of the sampling distribution (the standard error, σ/√n) decreases as n increases, so the H₀ and H₁ sampling distributions overlap less and power rises (see the sketch below).
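
A small illustration with the same hypothetical numbers (μ₀ = 100, μ₁ = 105, σ = 15, α = 0.05): quadrupling n halves the standard error, which raises power.

    from scipy.stats import norm

    mu0, mu1, sigma, alpha = 100, 105, 15, 0.05
    for n in (25, 100, 400):
        se = sigma / n ** 0.5                # standard error shrinks as n grows
        power = 1 - norm.cdf(norm.ppf(1 - alpha) - (mu1 - mu0) / se)
        print(f"n = {n:>3}: se = {se:.2f}, power = {power:.3f}")
    # n =  25: se = 3.00, power = 0.509
    # n = 100: se = 1.50, power = 0.954
    # n = 400: se = 0.75, power = 1.000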


Example: Finding Differences (t-test)

  • Sample Size (n): 100

    • α = 0.05, β = 0.49 (power = 0.51)

  • Sample Size (n): 199

    • α = 0.05, β = 0.20 (power = 0.80); both cases are reproduced in the sketch below
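
The card does not say which effect size these figures assume or whether n is per group; a two-sample t-test with an assumed per-group effect size of d ≈ 0.28 reproduces the quoted β values. A sketch using statsmodels (not necessarily the original calculation):

    from statsmodels.stats.power import TTestIndPower

    analysis = TTestIndPower()
    for n in (100, 199):
        # d = 0.28 is an assumed effect size chosen to match the betas above
        power = analysis.power(effect_size=0.28, nobs1=n, alpha=0.05,
                               ratio=1.0, alternative='two-sided')
        print(f"n per group = {n}: power = {power:.2f}, beta = {1 - power:.2f}")
    # n per group = 100: beta ~ 0.49;  n per group = 199: beta ~ 0.20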


Statistical Significance vs Practical Significance

  • Statistical Significance: Achieved when the outcome is highly unlikely to have occurred due to chance.

  • Hypothesis Test: Evaluates the statistical significance of research study results.

Importance of Sample Size

  • Statistical significance depends on both the treatment effect size and the sample size.

    • A small effect can become significant with a large sample.


Effect Size (Cohen's d)

  • Cohen’s d Explained:

    • A standardized measure of effect size that quantifies the mean difference in terms of standard deviation, similar to z-scores.

    • Indicates the extent to which the two distributions of scores do not overlap as a result of the experimental procedure.

Formula for Effect Size

  • Standardized effect size: d = (Population Mean 1 − Population Mean 2) / Population Standard Deviation (see the sketch below).
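
A minimal sketch of computing Cohen's d from two samples, substituting a pooled sample standard deviation for the usually unknown population σ (scores are hypothetical):

    import numpy as np

    def cohens_d(group1, group2):
        """Cohen's d: mean difference divided by the pooled standard deviation."""
        n1, n2 = len(group1), len(group2)
        v1, v2 = np.var(group1, ddof=1), np.var(group2, ddof=1)
        pooled_sd = np.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
        return (np.mean(group1) - np.mean(group2)) / pooled_sd

    treatment = [78, 82, 85, 90, 75, 88]   # hypothetical test scores
    control   = [70, 68, 75, 72, 74, 71]
    print(f"d = {cohens_d(treatment, control):.2f}")  # a large d for these made-up scores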

Challenges in Defining Effect Size

  • Cohen’s (1988) Caution: Conventional definitions of effect size carry risks when applied across the diverse inquiries of behavioral science.

    • Difficulty in categorizing effect sizes as small, medium, or large.


Effect Size Examples

  • Table of Effect Sizes (selected entries):

    • Cues: 1.25

    • Reinforcement: 1.17

    • Corrective feedback: .94

    • Engagement: .88

Factors Influencing Power

  • Effect Size (d):

    • Large d increases power; small d decreases it.

  • Sample Size (N):

    • Large N increases power; small N decreases it.

  • Significance Level (α):

    • A lenient (larger) α increases power; an extremely low α decreases it.


How to Increase Power

  • Predicted Difference Between Population Means: Increase the intensity of the experimental procedure to enlarge the predicted difference.

    • Caution: May distort study meaning.

  • Standard Deviation:

    • Use a less diverse population; however, this may constrain generalizability.

  • Sample Size (N): Use larger samples, but this can be costly.

  • Significance Level (α): Allow a more lenient significance level such as .10, though this raises the Type I error risk.

  • One-Tailed vs Two-Tailed Tests: Use a one-tailed test where applicable.

  • Hypothesis Testing Procedure: Opt for a more sensitive procedure when available.


Examples

Example of Power as a Function of α

Example: If a clinical trial examines whether a new drug reduces blood pressure, increasing α from 0.01 to 0.05 raises the study's power to detect the drug's effect, at the cost of a higher false-positive risk.

Example of Power as a Function of H₁

Effect Size: The difference between the means under H₀ and H₁ (μ₀ − μ₁). Example: In a research study on exercise, if the average weight loss for a placebo group (H₀) is 5 lbs and for the treatment group (H₁) is 15 lbs, this large difference indicates a large effect size, leading to greater power.

Example of Power as a Function of Sample Size

Impact of n: Larger sample sizes reduce the standard error. Example: A marketing survey with 1000 participants can yield more reliable insights into consumer preferences than one with only 50 participants.

Statistical Significance vs Practical Significance

Statistical Significance: Achieved when the outcome is highly unlikely to have occurred due to chance. Example: A study finds that a new teaching method improves student test scores with a p-value of 0.001, indicating strong statistical significance.

Hypothesis Test: Evaluates the statistical significance of research study results. Example: Research testing the effectiveness of a new drug may show p < 0.05, indicating that the results aren’t likely due to chance.

Importance of Sample Size: Statistical significance depends on both the treatment effect size and the sample size. Example: A drug may have a small effect size but could still achieve statistical significance with a large sample, such as reducing blood cholesterol levels by 1% in a study of 1000 participants.

Effect Size (Cohen's d)

Cohen’s d Explained: A standardized measure of effect size that quantifies the mean difference in terms of standard deviation. Example: If two groups have average test scores of 70 (Group A) and 80 (Group B) with a standard deviation of 10, Cohen's d = (80 − 70) / 10 = 1.0, a large effect by Cohen's benchmarks, showing practical relevance in educational testing.

Formula for Effect Size: Standardized effect size = (Population Mean 1 - Population Mean 2) / Population Standard Deviation.

Challenges in Defining Effect Size

Cohen’s (1988) Caution: Conventional definitions of effect size carry risks when applied across the diverse inquiries of behavioral science. Example: In psychological studies, small effect sizes might be statistically significant but could lead researchers to overstate their importance, as seen in studies linking minor interventions to major changes in behavior.

Difficulty in categorizing effect sizes as small, medium, or large.

Effect Size Examples

Table of Effect Sizes:

  • Cues (1.25): In a classroom, the introduction of visual cues improved student recall of information by a significant margin.

  • Reinforcement (1.17): A reward system in a workplace led to increased productivity levels significantly.

  • Corrective Feedback (.94): Providing detailed feedback on student assignments led to improved learning outcomes.

  • Engagement (.88): Increased student engagement through hands-on activities significantly boosted attendance rates.

Factors Influencing Power

  • Effect Size (d): Large d increases power; small d decreases it.

  • Sample Size (N): Large N increases power; small N decreases it.

  • Significance Level (α): A lenient (larger) α increases power; an extremely low α decreases it.

How to Increase Power

  • Predicted Difference Between Population Means: Increase the intensity of the experimental procedure to enlarge the predicted difference. Example: In an educational setting, introducing an intensive study program could reveal more substantial differences in test scores.

  • Standard Deviation: Use a less diverse population; however, this may constrain generalizability. Example: Focusing on a specific demographic in a study can reduce variability and enhance power, but at the cost of broad applicability.

  • Sample Size (N): Use larger samples, but this can be costly. Example: Conducting a national health survey may require substantial funding for a larger sample size to gain reliable insights.

  • Significance Level (α): Allow a more lenient significance level such as .10, though this raises the Type I error risk. Example: A clinical trial might opt for α = 0.10 when preliminary data suggests a promising drug effect.

  • One-Tailed vs Two-Tailed Tests: Use a one-tailed test where applicable. Example: Researchers might use a one-tailed test to determine if a new medication lowers cholesterol compared to a placebo.

  • Hypothesis Testing Procedure: Opt for a more sensitive procedure when available.