Distribution of Means, Z Tests, and Confidence Intervals

Review of Populations, Samples, and Research Objectives

Population vs. Sample:
- A population is the entire group that a researcher is interested in (e.g., all depressed people).
- A sample is a smaller subset of that population (e.g., a sample of $75$ depressed people).
- Because it is usually not feasible to study an entire population, researchers use samples to make inferences about the larger group.
Statistics vs. Parameters:
- Parameters are numerical values that describe a population.
- Statistics are numerical values that describe a sample.
- Researchers use statistics to estimate population parameters.
Objective of Inferences:
- The primary goal is for samples to represent the population accurately so that results can be claimed about the population as a whole.

Sampling Error and Estimation

Defining Sampling Error:
- This refers to the difference between the values actually existing in the population and the values observed in a sample.
- In practice, sample values (such as the Mean, $M$ , and Standard Deviation, $SD$ ) will almost always differ somewhat from the actual population parameters.
- It does not imply that mistakes were made in data collection or analysis; it is a natural byproduct of sampling.
Managing Sampling Error:
- In real-world scenarios, the exact amount of sampling error is unknown because the population parameters are rarely known.
- Therefore, researchers estimate the average amount of sampling error present in the data.
Illustration of Sampling Error:
- Population (NY College Students):
  - Total population size ( $N$ ) = $145,000$ .
  - Population Parameters: Mean Age = $21.3$ , Mean IQ = $112.5$ , Gender = $65\%$ Female, $35\%$ Male.
- Sample #1 (U. Albany Students, $n = 100$ ):
  - Sample Statistics: Mean Age = $20.4$ , Mean IQ = $114.2$ , Gender = $40\%$ Female, $60\%$ Male.
- Sample #2 (Ithaca College Students, $n = 100$ ):
  - Sample Statistics: Mean Age = $23.3$ , Mean IQ = $118.0$ , Gender = $80\%$ Female, $20\%$ Male.
- Sample #3 (NYU Students, $n = 100$ ):
  - Sample Statistics: Mean Age = $19.8$ , Mean IQ = $104.6$ , Gender = $60\%$ Female, $40\%$ Male.
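
The illustration above can be turned into a small calculation. The following is a minimal sketch, not part of the original notes: it treats sampling error as "sample statistic minus population parameter" for each measure, using only the numbers listed above. The helper name `sampling_errors` and the dictionary keys are illustrative assumptions.

```python
# Population parameters and sample statistics copied from the illustration above.
population = {"mean_age": 21.3, "mean_iq": 112.5, "pct_female": 65.0}

samples = {
    "U. Albany": {"mean_age": 20.4, "mean_iq": 114.2, "pct_female": 40.0},
    "Ithaca College": {"mean_age": 23.3, "mean_iq": 118.0, "pct_female": 80.0},
    "NYU": {"mean_age": 19.8, "mean_iq": 104.6, "pct_female": 60.0},
}


def sampling_errors(sample_stats, population_params):
    """Return (statistic - parameter) for each measure; hypothetical helper."""
    return {
        key: round(sample_stats[key] - population_params[key], 1)
        for key in population_params
    }


errors = {name: sampling_errors(stats, population) for name, stats in samples.items()}

for name, err in errors.items():
    print(name, err)
```

Each sample misses the population values by a different amount and in a different direction, which is the point of the illustration: none of the samples involved a mistake, yet none matches the population exactly.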

The Distribution of Means

Conceptual Overview:
- When samples of more than one individual are taken, the comparison distribution for the null hypothesis shifts from a distribution of individual scores to a distribution of means.
- Also referred to as the "Sampling Distribution of Means" or "Distribution of Sample Means."
- These means tend to "pile up" around the actual population mean ( $\mu$ ).
Characteristics of the Distribution of Means:
- Mean of the Distribution of Means: The mean of the sampling distribution is identical to the mean of the population of individuals ( $\mu$ ).
- Impact of Sample Size ( $n$ ):
  - Larger sample sizes lead to sample means that are closer to the population mean on average.
  - Large samples are generally better representatives of the population than small samples.
- Variance of the Distribution of Means ( $\sigma_M^2$ or $\text{Var}_M$ ):
  - Calculated as the variance of the population divided by the number of individuals in each sample: $\sigma_M^2 = \frac{\sigma^2}{n}$ .
- Standard Deviation of the Distribution of Means:
  - This is the square root of the variance of the distribution of means: $\sigma_M = \sqrt{\frac{\sigma^2}{n}} = \frac{\sigma}{\sqrt{n}}$ .
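
These characteristics can be checked with a quick simulation. The sketch below is illustrative only: it assumes a normally distributed population with mean 100 and standard deviation 15 (values chosen for the example, not taken from this section), draws many samples of $n = 25$ , and compares the mean and variance of the resulting sample means with $\mu$ and $\frac{\sigma^2}{n}$ .

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 100.0, 15.0   # assumed population mean and SD (illustrative)
N = 25                    # individuals per sample
NUM_SAMPLES = 20_000      # number of samples drawn


def sample_mean(n):
    """Mean of one random sample of n scores from the assumed normal population."""
    return statistics.fmean(random.gauss(MU, SIGMA) for _ in range(n))


means = [sample_mean(N) for _ in range(NUM_SAMPLES)]

mean_of_means = statistics.fmean(means)
variance_of_means = statistics.pvariance(means)
expected_variance = SIGMA ** 2 / N  # sigma^2 / n = 9.0

print(f"mean of sample means:     {mean_of_means:.2f} (population mean {MU})")
print(f"variance of sample means: {variance_of_means:.2f} (sigma^2 / n = {expected_variance})")
```

With this many samples, the simulated values should land close to 100 and 9 respectively, though they will not match exactly because the simulation itself is subject to sampling error.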

The Standard Error of the Mean (SEM)

Definition: The standard deviation of the sampling distribution of means is formally called the Standard Error of the Mean (SEM).
Function:
- It reflects the accuracy with which sample means estimate the population mean.
- It represents the average deviation (sampling error) of sample means from the population mean ( $\mu$ ).
Mathematical Relationship and Sample Size:
- As sample size ( $n$ ) increases, the SEM decreases, meaning the sample mean becomes a better estimate of the population mean.
- If the SEM is large, sample means will tend to differ significantly from one another, and many will not be accurate representations of $\mu$ .
- It answers the practical question: "If I had taken another sample, how different would my result be?"
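
The relationship between sample size and SEM can be seen numerically. This is a minimal sketch using $SEM = \frac{\sigma}{\sqrt{n}}$ , the square root of the variance of the distribution of means; the population standard deviation of 15 and the list of sample sizes are assumptions chosen for illustration.

```python
import math


def sem(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


SIGMA = 15.0  # assumed population SD (illustrative)

sems = {n: sem(SIGMA, n) for n in (10, 25, 50, 100, 400)}

for n, value in sems.items():
    print(f"n = {n:>3}: SEM = {value:.2f}")
```

Because $n$ sits under a square root, the SEM shrinks slowly: quadrupling the sample size (for example, from 25 to 100) only halves the SEM.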

Hypothesis Testing with Z Tests

Applicability: Z tests are used when population parameters (population mean and standard deviation) are known.
The Procedure:
- A sample mean is compared to a known null population mean using the SEM (adjusted for sample size) as the standard of measure.
- Z-Test Formulas:
  - Standardizing the mean: $Z = \frac{M - \mu}{SEM}$
  - Where $SEM = \frac{\sigma}{\sqrt{n}}$
Z Test Example: College XYZ IQ Scores:
- Known Population Parameters: Mean ( $\mu$ ) = $100$ , Standard Deviation ( $\sigma$ ) = $15$ .
- Hypotheses:
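  - Null Hypothesis ( $H_0$ ): $\mu_1 = \mu_2$ (No difference between the mean of the population the sample represents, $\mu_1$ , and the known population mean, $\mu_2$ ).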
  - Alternative Hypothesis ( $H_1$ ): $\mu_1 \neq \mu_2$ (Difference exists).
Scenario 1: Small Sample ( $n = 10$ ):
- Observed Mean ( $M$ ) = $107$ .
- Alpha Level ( $\alpha$ ) = $0.05$ (two-tailed).
- Critical Value ( $Z$ ) = $\pm 1.96$ .
- Calculation:
  - $SEM = \frac{15}{\sqrt{10}} = 4.7434... \approx 4.74$
  - $Z = \frac{107 - 100}{4.74} = 1.48$
- Conclusion: 1.48 < 1.96. Since the observed $Z$ is smaller than the critical value, the null hypothesis is not rejected. The result is inconclusive.
Scenario 2: Large Sample ( $n = 50$ ):
- Observed Mean ( $M$ ) = $107$ .
- Calculation:
  - $SEM = \frac{15}{\sqrt{50}} = 2.1213... \approx 2.12$
  - $Z = \frac{107 - 100}{2.12} = 3.30$
- Conclusion: 3.30 > 1.96. Since the observed $Z$ is larger than the critical value, the null hypothesis is rejected in favor of the alternative. The class mean appears to come from a population with a mean higher than $100$ .
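
Both scenarios can be reproduced with a short function. The sketch below is one possible way to code the procedure, with `z_test` as a hypothetical helper name; it uses the unrounded SEM, so results may differ in the second decimal place from hand calculations that round intermediate steps.

```python
import math


def z_test(sample_mean, mu, sigma, n, critical=1.96):
    """Two-tailed Z test of a sample mean against a known population.

    Returns the Z statistic and whether |Z| exceeds the critical value.
    """
    sem = sigma / math.sqrt(n)       # standard error of the mean
    z = (sample_mean - mu) / sem     # standardized sample mean
    return z, abs(z) > critical


# College XYZ example: mu = 100, sigma = 15, observed M = 107.
small = z_test(107, 100, 15, n=10)
large = z_test(107, 100, 15, n=50)

print(f"n = 10: Z = {small[0]:.2f}, reject H0: {small[1]}")
print(f"n = 50: Z = {large[0]:.2f}, reject H0: {large[1]}")
```

The same observed mean leads to different decisions only because the larger sample has a smaller SEM, which makes a 7-point difference from $\mu$ more extreme in standardized terms.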

Confidence Intervals (CI)

Purpose: An alternative to the "all-or-none" decision of hypothesis testing. It establishes a range of values around a sample mean where the population mean is likely to reside.
Calculation Components:
- Uses the sample mean ( $M$ ).
- Uses the standard error (SEM).
- Uses a specific level of confidence (typically $95\%$ or $99\%$ ).
Confidence Levels and Z Scores:
- For a $95\%$ CI, the Z score is $1.96$ .
- For a $99\%$ CI, the Z score is $2.57$ (more precisely $2.576$ , which many tables round to $2.58$ ).
Formula:
- $CI = M \pm (Z \times SEM)$
- Lower Limit: $M - (Z \times SEM)$
- Upper Limit: $M + (Z \times SEM)$
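
The Z scores attached to each confidence level come from the standard normal distribution. As a rough sketch, they can be recovered with Python's standard-library `statistics.NormalDist`; the helper names `critical_z` and `confidence_interval` and the example values ( $M = 50$ , $SEM = 2$ ) are hypothetical and chosen only for illustration. Exact values differ slightly from table values rounded to two decimals.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Two-tailed critical Z for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)


def confidence_interval(m, sem, confidence=0.95):
    """Return (lower, upper) limits: M -/+ Z * SEM."""
    z = critical_z(confidence)
    return m - z * sem, m + z * sem


z95 = critical_z(0.95)   # about 1.96
z99 = critical_z(0.99)   # about 2.576

# Hypothetical values for illustration only.
lower, upper = confidence_interval(50.0, 2.0, confidence=0.95)

print(f"95% critical Z: {z95:.3f}")
print(f"99% critical Z: {z99:.3f}")
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```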

Confidence Interval Examples

Example: IQ of 25 Children:
- Given: $M = 107$ , Variance of population = $324$ , $n = 25$ .
- Step 1: Calculate SEM:
  - Variance of D of M = $\frac{324}{25} = 12.96$
  - $SEM = \sqrt{12.96} = 3.6$
- $95\%$ Confidence Interval:
  - $107 \pm (1.96 \times 3.6)$
  - $107 \pm 7.056$
  - Interval: $99.944$ to $114.056$
- $99\%$ Confidence Interval:
  - $107 \pm (2.57 \times 3.6)$
  - $107 \pm 9.252$
  - Interval: $97.748$ to $116.252$
Observations on Intervals:
- The $99\%$ CI is wider than the $95\%$ CI. A larger range is required to be more confident that the interval contains the population mean.
- Effect of Sample Size:
  - As $n$ increases, SEM decreases, making the confidence interval smaller (more precise).
  - Example at $95\%$ confidence (same $M = 107$ and population variance of $324$ ): If $n = 25$ , the range is $99.94$ to $114.06$ . If $n = 100$ , the SEM drops to $1.8$ and the range narrows to $103.47$ to $110.53$ .
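
The effect of sample size on the interval can be recomputed directly from the example's givens ( $M = 107$ , population variance $324$ ). This is a minimal sketch with a hypothetical helper, `ci_from_variance`, using the table value $1.96$ for the $95\%$ level.

```python
import math


def ci_from_variance(m, variance, n, z):
    """Confidence limits from a population variance: M -/+ z * sqrt(variance / n)."""
    sem = math.sqrt(variance / n)   # SD of the distribution of means
    margin = z * sem
    return m - margin, m + margin


ci_25 = ci_from_variance(107, 324, n=25, z=1.96)
ci_100 = ci_from_variance(107, 324, n=100, z=1.96)

for n, (low, high) in ((25, ci_25), (100, ci_100)):
    print(f"n = {n:>3}: {low:.3f} to {high:.3f} (width {high - low:.3f})")
```

Quadrupling $n$ halves the SEM, so the interval for $n = 100$ is half as wide as the interval for $n = 25$ .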
Interpretation:
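- Strictly speaking, the $95\%$ describes the procedure rather than any single interval: if many samples were drawn and an interval were computed around each sample mean, about $95\%$ of those intervals would contain the true population mean. This is more accurate than saying we are " $95\%$ confident this specific interval contains it."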
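
One common way to make this interpretation concrete is to treat the confidence level as a long-run property of the procedure and simulate it. The sketch below assumes a normal population with $\mu = 100$ and $\sigma = 15$ and samples of $n = 25$ (illustrative values), builds a $95\%$ interval around each sample mean, and counts how often the interval contains $\mu$ .

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA, N = 100.0, 15.0, 25   # assumed population and sample size
Z_95 = 1.96                      # table value for a 95% interval
TRIALS = 10_000                  # number of simulated samples

sem = SIGMA / math.sqrt(N)
hits = 0

for _ in range(TRIALS):
    # Draw one sample and compute its mean.
    m = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    # Check whether this sample's interval captures the true mean.
    if m - Z_95 * sem <= MU <= m + Z_95 * sem:
        hits += 1

coverage = hits / TRIALS
print(f"Share of intervals containing mu: {coverage:.3f}")
```

The share should come out near 0.95. Any single interval either contains $\mu$ or it does not; the $95\%$ refers to how often intervals built this way succeed.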

SPSS Application and Data Visualization

Error Bar Graphs:
- Used to visualize confidence intervals for different categories.
- Path: Graph -> Error Bar -> Simple -> Define.
Interpreting Overlap:
- Nonsignificant Results: Confidence intervals around the means will typically fall within each other's limits (they overlap substantially).
- Significant Results: The confidence interval around one mean does not overlap with the confidence interval of the other mean.
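
The overlap check described above can be expressed as a simple comparison of interval limits. This is a sketch of the visual rule of thumb only, not a substitute for a formal test; the function name and the GPA-style intervals are hypothetical values made up for illustration.

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (low, high) and b = (low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical 95% CIs for two groups (illustrative numbers, not class data).
group_a = (3.05, 3.35)
group_b = (3.20, 3.50)
group_c = (2.80, 3.00)

print("A vs B overlap:", intervals_overlap(group_a, group_b))  # True
print("B vs C overlap:", intervals_overlap(group_b, group_c))  # False
```

Intervals that overlap only slightly can still correspond to a significant difference, so the graph is best read as a first look rather than a final decision.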
Variables Examined in Class:
- Independent Variable (IV) vs. Dependent Variable (DV).
- Examples: Family_Visit by Sex, Current_GPA by Sex.

Scholarly Research Examples

Trypophobia and Comfort Levels (Pipitone & DiMattina, 2020):
- Study involved $31$ trypophobic images with three manipulated versions (original, scrambled, phase).
- Sample size: $146$ participants.
- Results (Error bars representing $95\%$ CIs):
  - Phase explained the most variance in comfort ( $24.9\%$ ).
  - Amplitude explained $9\%$ of the variance.
  - Interaction of phase and amplitude explained $6.1\%$ .
Kinship and Fertility in Iceland (Helgason et al., 2008):
- Analyzed all known Icelandic couples born between $1800$ and $1965$ .
- Found a significant positive association between kinship and fertility.
- Greatest reproductive success observed for couples related at the level of third and fourth cousins.
- Used $95\%$ confidence intervals across seven intervals of kinship level to measure:
  - Total number of children.
  - Number of children who reproduced.
  - Number of grandchildren.
  - Mean life expectancy of children.