Distribution of Sample Means - Chapter 7 Study Guide

Chapter 7 | Probability and Samples: The Distribution of Sample Means

Overview of Probability

  • Problem presented by Amos Tversky and Daniel Kahneman:

    • Context: An urn containing balls where two-thirds are red, one-third white.

    • Samples:

    • Individual 1: Selects 5 balls, finding 4 red and 1 white (80% red).

    • Individual 2: Selects 20 balls, finding 12 red and 8 white (60% red).

    • Question: Who should feel more confident that the urn contains two-thirds red balls?

    • Findings: Most people judged that Individual 1 had the stronger evidence. The correct answer is Individual 2: the larger sample outweighs the less extreme proportion, so it justifies greater confidence.
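
One way to see why the larger sample wins is to compare how strongly each sample favors a mostly-red urn over a mostly-white one. The sketch below assumes the competing hypothesis is an urn with two-thirds white balls, that draws are independent, and that both urns are equally likely beforehand. The function name `likelihood_ratio` is illustrative and not taken from the chapter.

```python
from fractions import Fraction

def likelihood_ratio(red, white):
    """Odds favoring 'two-thirds red' over 'two-thirds white' for one sample.

    Assumes independent draws and equal prior odds for the two urns.
    """
    p_if_mostly_red = Fraction(2, 3) ** red * Fraction(1, 3) ** white
    p_if_mostly_white = Fraction(1, 3) ** red * Fraction(2, 3) ** white
    return p_if_mostly_red / p_if_mostly_white

odds_individual_1 = likelihood_ratio(red=4, white=1)    # 5 balls, 80% red
odds_individual_2 = likelihood_ratio(red=12, white=8)   # 20 balls, 60% red
print(odds_individual_1, odds_individual_2)  # 8 16
```

Under these assumptions the 20-ball sample gives odds of 16 to 1, twice as strong as the 8 to 1 odds from the 5-ball sample.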

Importance of Sample Size

  • Law of Large Numbers: The larger the sample size (n), the more probable it is that the sample mean will be close to the population mean.

  • Example of Coin Tossing: Three heads in a row is unremarkable for a fair coin, but 20 heads in a row would raise suspicion of a trick coin. Larger samples give a more reliable picture of the underlying process.
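
The coin-tossing intuition can be checked with a short calculation. This is a minimal sketch assuming a fair coin and independent tosses; `prob_all_heads` is a hypothetical helper name.

```python
def prob_all_heads(tosses, p_heads=0.5):
    """Probability that every one of `tosses` independent tosses lands heads."""
    return p_heads ** tosses

p3 = prob_all_heads(3)     # 1 chance in 8
p20 = prob_all_heads(20)   # 1 chance in 1,048,576
print(p3, p20)
```

A run of 3 heads happens about one time in eight with a fair coin, while a run of 20 is roughly a one-in-a-million event, which is why the longer run is convincing evidence that the coin is not fair.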

Learning Objectives

  1. Define the distribution of sample means, predict its characteristics, and determine these for specific populations and sample sizes.

  2. Apply concepts of z-scores and probability to samples larger than one.

Distribution of Sample Means

  • Definition: Collection of sample means from all possible random samples of a specific size (n) from a population.

  • Difference from previous distributions: Values are statistics (sample means) rather than individual scores.

  • Sampling Distribution: A distribution of statistics obtained by selecting all possible samples of a specific size from a population. The distribution of sample means is one example.

Characteristics of Distribution of Sample Means
  1. Sample means cluster around the population mean (central tendency).

  2. The distribution approaches a normal shape as sample size increases; it is close to normal by about n = 30, or at any sample size if the population itself is normal.

  3. Larger samples yield sample means closer to the population mean, making them more reliable estimates; means from small samples are more widely scattered.

Example Calculation for Distribution of Sample Means
  • Given Population: 4 scores (2, 4, 6, 8), construct the distribution of sample means for n = 2.

    • Observation: The sample means pile up around the population mean (5), and their distribution is symmetric and roughly normal in shape.

    • Probability Question: The probability of obtaining a sample mean greater than 7 is p(M > 7) = (number of sample means greater than 7) / (total number of sample means).
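
The construction above can be sketched in a few lines of code. The sketch assumes sampling with replacement, so that all 4 x 4 = 16 ordered samples of size n = 2 are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [2, 4, 6, 8]
n = 2

# Every possible sample of size n, assuming sampling with replacement
samples = list(product(population, repeat=n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Frequency distribution of the sample means, drawn as a text histogram
distribution = Counter(sample_means)
for mean in sorted(distribution):
    print(f"M = {mean}: {'X' * distribution[mean]}")

mean_of_means = sum(sample_means) / len(sample_means)
p_greater_than_7 = Fraction(sum(m > 7 for m in sample_means), len(sample_means))
print(mean_of_means, p_greater_than_7)  # 5 1/16
```

Only the sample (8, 8) has a mean above 7, so under this assumption p(M > 7) = 1/16, and the mean of all 16 sample means equals the population mean of 5.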

Sampling Error

  • Definition: Difference between sample statistics and population parameters; natural discrepancies expected due to random sampling.

  • Example: If a researcher selects n = 25 students at random, their mean IQ will typically differ somewhat from the population mean.

Z-Scores and Sample Means

  • Z-score Definition: A standardized score indicating how many standard deviations an observation is from the mean.

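  • Formula for Sample Mean: The z-score formula can be adapted to locate a sample mean within the distribution of sample means:
    z = \frac{M - \mu}{\sigma_M}, where \sigma_M = \frac{\sigma}{\sqrt{n}} is the standard error of M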

  • Critical Values: Boundaries in the distribution of sample means that determine whether an obtained sample mean is extreme relative to the population mean.
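
A z-score for a sample mean can be computed as follows, with mu the population mean and sigma / sqrt(n) the standard error. The numbers (an IQ-like scale with mu = 100 and sigma = 15, a sample of n = 25 with M = 106) are hypothetical and chosen only for illustration, as is the 1.96 boundary, which is the common two-tailed cutoff for the most extreme 5% of a normal distribution.

```python
import math

def z_for_sample_mean(M, mu, sigma, n):
    """Locate a sample mean M within the distribution of sample means."""
    standard_error = sigma / math.sqrt(n)   # sigma_M
    return (M - mu) / standard_error

# Hypothetical values, not taken from the chapter
z = z_for_sample_mean(M=106, mu=100, sigma=15, n=25)
print(z)  # 2.0

# Assumed critical boundary: +/-1.96 marks the most extreme 5% (two tails)
is_extreme = abs(z) > 1.96
print(is_extreme)  # True
```

The same 6-point difference would give z = 0.4 for a single score (n = 1), which shows how the standard error shrinks the expected spread of sample means.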

Understanding Sampling Distributions

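  • Common Practices: Choosing a sample size n large enough for the sample to represent the population well and to support probability statements about sample means.

  • Distribution of Sample Means: The distribution has its own parameters, an expected value equal to the population mean and a standard error that measures how far sample means typically fall from it. A large standard error means that an individual sample mean is a less reliable estimate of the population mean.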

The Central Limit Theorem (CLT)

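  • The CLT states that for any population with mean \mu and standard deviation \sigma, the distribution of sample means for sample size n has mean \mu and standard deviation \sigma / \sqrt{n}, and approaches a normal distribution as n increases, regardless of the population's shape.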
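
A small simulation can illustrate the theorem. The sketch below assumes a strongly skewed population (exponential with rate 1, so its mean and standard deviation are both 1) and draws 5,000 random samples of n = 30; these choices, the seed, and the variable names are all arbitrary.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 30                   # size of each sample
num_samples = 5000       # number of samples drawn

# Assumed population: exponential with rate 1 (skewed; mean = 1, sd = 1)
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
predicted_se = 1.0 / n ** 0.5   # sigma / sqrt(n)

print(round(mean_of_means, 3), round(sd_of_means, 3), round(predicted_se, 3))
```

The mean of the simulated sample means should land close to the population mean of 1, and their standard deviation close to the predicted standard error of about 0.183, even though the population itself is far from normal.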

Statistical Power

  • Power is the probability of correctly rejecting a false null hypothesis:
    Power = 1 - \beta, where \beta is the probability of a Type II error

  • Power increases with a larger sample size, a larger treatment effect, and a larger alpha level.

  • Table: Displays sample size needed for achieving various power levels for both medium and small effect sizes.
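
The link between power, sample size, and effect size can be sketched for one simple case. The code below assumes a two-tailed one-sample z-test with known population standard deviation and alpha = .05; it does not reproduce the chapter's table, and `z_test_power` is a hypothetical helper.

```python
from statistics import NormalDist

def z_test_power(d, n, alpha=0.05):
    """Approximate power of a two-tailed one-sample z-test.

    d is the true effect size in standard deviation units; n is the sample size.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = d * n ** 0.5   # center of the z statistic when the effect is real
    # Probability that z falls beyond either critical boundary
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

for n in (10, 30, 100):
    medium = z_test_power(d=0.5, n=n)
    small = z_test_power(d=0.2, n=n)
    print(n, round(medium, 3), round(small, 3))
```

For a fixed alpha, power rises with n and is always lower for the small effect than for the medium effect at the same n. With no effect at all (d = 0), the function returns alpha, the Type I error rate.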

Cohen’s d Effect Size

  • Cohen's d measures effect size as the mean difference expressed in standard deviation units.

    • d = \frac{M - \mu}{s}

  • Indicates the magnitude of the treatment effect, independent of sample size.

  • Cohen's conventional benchmarks:

    • Small: d = 0.2

    • Medium: d = 0.5

    • Large: d = 0.8
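
A minimal sketch of the calculation, using the benchmarks above as cutoffs. The sample values are hypothetical, and the "negligible" label for values below 0.2 is an added convenience, not one of Cohen's categories.

```python
def cohens_d(M, mu, s):
    """Mean difference in standard deviation units."""
    return (M - mu) / s

def describe_effect(d):
    """Label an effect size using Cohen's benchmarks as lower bounds."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"   # label assumed for illustration

# Hypothetical values: sample mean 104, population mean 100, s = 8
d = cohens_d(M=104, mu=100, s=8)
print(d, describe_effect(d))  # 0.5 medium
```

Treating the benchmarks as hard cutoffs is a simplification; they are rough guides rather than strict boundaries.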

Confidence Intervals

  • Confidence intervals provide a range for the population mean based on sample data:

    • \mu = M \pm t \cdot s_M

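  • Used alongside t-tests to estimate how large a population mean is, not only whether it differs from a hypothesized value.

  • A higher confidence level produces a wider interval, so added confidence costs precision; a larger sample produces a narrower interval.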
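
The interval can be computed directly once a critical t value has been looked up. The sketch below takes that value as an input rather than computing it; the sample statistics are hypothetical, and 2.064 and 2.797 are the standard two-tailed t-table values for df = 24 at 95% and 99% confidence.

```python
import math

def confidence_interval(M, s, n, t_crit):
    """Interval M +/- t * s_M, where s_M = s / sqrt(n) is the estimated standard error."""
    s_M = s / math.sqrt(n)
    margin = t_crit * s_M
    return M - margin, M + margin

# Hypothetical sample: M = 50, s = 10, n = 25 (df = 24)
ci_95 = confidence_interval(M=50, s=10, n=25, t_crit=2.064)
ci_99 = confidence_interval(M=50, s=10, n=25, t_crit=2.797)
print(ci_95)
print(ci_99)
```

With these numbers the 95% interval runs from about 45.87 to 54.13 and the 99% interval from about 44.41 to 55.59, so raising the confidence level widens the interval.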

Conclusion and Practical Applications

  • Hypothesis testing can identify treatment effects, which informs further research decisions and applications.

  • Errors and confidence levels must be represented carefully when presenting results. Effect size should be reported alongside statistical significance for a thorough understanding of treatment outcomes.