Samples, Populations, & Sampling

Purpose of Samples: We use samples to learn about populations.
Sampling Error: Every sample contains sampling error.
Variability of Results: Different samples yield different results; this is a normal occurrence.
Reliability Affected by Variability: The more variable the data, the less reliable any single sample's result is as an estimate of the population.
Distribution of Sample Means: We use the distribution of sample means to make inferences about populations.

Population: The entire group of interest (e.g., all college students).
Sample: A subset of the population that is measured.
Data: The scores or measurements obtained from the sample (a single score is a datum).
Sampling Error: The difference between a sample result and the true population value due to chance.
Variability: The extent to which the data points are spread out in a dataset.

Imperfect Representation: No sample perfectly represents the population.
Effect of Variability: More variability within a population leads to more sampling error in the sample.
Sample Size: Larger, more representative samples generally incur less sampling error.
Universal Occurrence: Every study that relies on a sample, regardless of design, has some level of sampling error.
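
The two effects above (population variability and sample size) can be checked with a small simulation. This is only a sketch under stated assumptions: the population is modeled as a normal distribution with a made-up mean of 100, and the function name `mean_abs_sampling_error` is hypothetical, not part of any library.

```python
import random
import statistics

def mean_abs_sampling_error(pop_sd, n, trials=1000, pop_mean=100.0, seed=0):
    """Average |sample mean - population mean| over many random samples."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        # Assumption: scores are drawn from a normal population.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        errors.append(abs(statistics.fmean(sample) - pop_mean))
    return statistics.fmean(errors)

# Compare a low- and a high-variability population at two sample sizes.
for pop_sd in (5, 15):
    for n in (10, 100):
        err = mean_abs_sampling_error(pop_sd, n)
        print(f"population SD={pop_sd:>2}, n={n:>3}: typical error = {err:.2f}")
```

The typical error should come out larger for the more variable population and smaller for the larger sample, but it never reaches zero.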

Definition: In probability sampling, each individual in the population has a known chance of being selected for the sample.
- Simple Random Sample: Every individual has an equal chance of being chosen.
- Stratified Sample: The population is divided into groups (strata), and individuals are randomly sampled from each stratum, typically in proportion to that group's share of the population.
- Cluster Sample: Groups (clusters) are randomly selected, and individuals are then sampled within those groups.
- Best Use: Probability sampling reduces selection bias; cluster sampling in particular is practical for large or dispersed populations.

Definition: In convenience sampling, individuals are selected because they are easily accessible; the probability of selection is unknown.
Common Examples: Often involves volunteers or college undergraduates.
Drawbacks: Increases the risk that the sample differs systematically from the population (selection bias) and limits the generalizability of results.

Purpose: Stratified sampling ensures that key characteristics of the population are adequately represented in the sample.
Control: Lets the researcher set the percentage of each subgroup in the sample to match the population, which reduces bias.
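
As an illustration of this percentage control, here is a minimal sketch of proportional stratified sampling. The population, the class-year strata, and the helper name `stratified_sample` are all hypothetical. Because each stratum's allocation is rounded, the total can differ slightly from `n` when the proportions do not divide evenly.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(population, stratum_of, n, seed=0):
    """Draw about n individuals, giving each stratum a share proportional to its size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(population))  # proportional allocation
        sample.extend(rng.sample(members, k))          # random draw within the stratum
    return sample

# Hypothetical population: 600 first-years, 300 sophomores, 100 seniors.
population = ([("first-year", i) for i in range(600)]
              + [("sophomore", i) for i in range(300)]
              + [("senior", i) for i in range(100)])

sample = stratified_sample(population, stratum_of=lambda p: p[0], n=50)
counts = Counter(stratum for stratum, _ in sample)
print(counts)  # 30 first-years, 15 sophomores, 5 seniors (60% / 30% / 10%)
```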

Purpose: Cluster sampling randomly selects entire groups (e.g., schools, cities) rather than individuals.
Use Case: Best suited to very large or geographically spread-out populations.
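
A rough sketch of cluster sampling, assuming a made-up population of 20 schools with 50 students each. The function `cluster_sample` and its parameters are hypothetical names chosen for illustration. Whole schools are drawn first; students are then either all kept or subsampled within the chosen schools.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=0):
    """Randomly pick whole clusters, then take all or some members of each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1: pick clusters
    sample = []
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None:
            sample.extend(members)                              # keep everyone in the cluster
        else:
            sample.extend(rng.sample(members, n_per_cluster))   # stage 2: subsample
    return chosen, sample

# Hypothetical population: 20 schools, 50 students each.
schools = {
    f"school_{s:02d}": [f"s{s:02d}_student_{i}" for i in range(50)]
    for s in range(20)
}

chosen, sample = cluster_sample(schools, n_clusters=4, n_per_cluster=10)
print(chosen)
print(len(sample), "students sampled from", len(chosen), "schools")
```

Only the four selected schools need to be visited, which is what makes this approach practical when the population is spread over many locations.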

Definition: The distribution of sample means is the distribution of all possible sample means that can be obtained from repeated samples of the same size drawn from the same population.
Key Facts:
- Most sample means cluster around the population mean.
- Extreme sample means are infrequent.
- This distribution is crucial for determining whether observed results are likely due to chance or suggest a real population effect.
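
The key facts above can be approximated by simulation. The sketch below is not the full distribution of all possible samples; it draws 5,000 random samples of size 25 from a hypothetical, deliberately skewed population of 10,000 scores. The comparison with sigma / sqrt(n) uses the standard result for the spread of sample means, which these notes do not state explicitly.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical skewed population of 10,000 scores (mean near 50).
population = [rng.expovariate(1 / 50) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Repeatedly draw samples of the same size and record each sample mean.
n = 25
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(5_000)]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
# Share of sample means within 2 standard deviations of the population mean.
within_2 = sum(abs(m - mu) <= 2 * sd_of_means for m in sample_means) / len(sample_means)

print(f"population mean            = {mu:.2f}")
print(f"mean of sample means       = {mean_of_means:.2f}")
print(f"SD of sample means         = {sd_of_means:.2f}")
print(f"sigma / sqrt(n)            = {sigma / n ** 0.5:.2f}")
print(f"share within 2 SDs of mean = {within_2:.1%}")
```

The sample means should center on the population mean, with only a small share falling far from it, even though the individual scores are strongly skewed.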

Core Question: Is this result likely due to sampling error?
- If the result is unlikely to be due to sampling error, it suggests a real effect in the population.
Cautionary Note: A single study does not definitively prove a conclusion; further research and validation are necessary.

Key Questions When Comparing Samples:
1. Are the sample means different?
2. How much variability is present among the samples?
3. Could the observed differences be explained by sampling error?
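
One possible way to approach the third question is a permutation (shuffling) check: if group labels did not matter, how often would random relabeling produce a mean difference at least as large as the one observed? This is a sketch of one common technique, not necessarily the method these notes have in mind. The scores and the function name `permutation_p_value` are hypothetical.

```python
import random
import statistics

def permutation_p_value(group_a, group_b, n_shuffles=5000, seed=0):
    """Share of random relabelings whose mean difference is >= the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign scores to groups at random
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return observed, hits / n_shuffles

# Hypothetical scores from two samples.
group_a = [72, 75, 78, 80, 83, 85, 88, 90]
group_b = [60, 64, 66, 70, 71, 74, 77, 79]

observed, p = permutation_p_value(group_a, group_b)
print(f"observed difference in means = {observed:.2f}")
print(f"share of shuffles at least as extreme = {p:.4f}")
```

A small share means chance alone rarely produces a difference this large, which is consistent with a real effect. It is still a single result and does not prove the conclusion.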