Population vs Sampling: Key Concepts and Random Sampling Methods

Population

Population: the total group of interest in a study (e.g., all college students in the US).
Why not study every member? It’s usually impractical due to time, money, and resources.

Sampling frame: the list or accessible set of individuals from the population used to draw the sample.
Ideal: the frame lists every member of the population; in practice, a complete list is often unavailable.

Random sampling: every member of the population has a known, nonzero chance of being selected (an equal chance under simple random sampling). Considered the gold standard for quantitative studies.
Nonrandom sampling: some individuals have higher chances of selection; can introduce biases; common in qualitative research.
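Random sampling is necessary but not sufficient for a representative sample: by chance alone, a random sample (especially a small one) can still over- or under-represent some groups.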

Representative sample: the sample mirrors key characteristics of the population (e.g., gender, race, region).
Representativeness facilitates moving from descriptive statistics to inferential statistics.
Inferential statistics: use sample results to make conclusions about the population with a stated level of confidence.
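As a rough illustration of what "a stated level of confidence" can look like, the sketch below computes a normal-approximation 95% confidence interval for a population proportion. The sample figures (53 YES answers out of 100) and the function name `proportion_ci` are hypothetical and chosen only for the example; other interval methods exist and may suit small samples better.

```python
import math

def proportion_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) interval for a population proportion."""
    p_hat = successes / n                             # sample proportion
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)   # z = 1.96 for about 95% confidence
    return p_hat - margin, p_hat + margin

# Hypothetical random sample: 53 of 100 respondents answered YES.
low, high = proportion_ci(53, 100)
print(f"95% CI for the YES proportion: {low:.3f} to {high:.3f}")
```

The interval is a statement about the population made from sample data alone, which is the step from descriptive to inferential statistics.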

Define the population clearly (e.g., all college students in a region).
Establish a sampling frame where possible (complete list of population members).
Choose a random selection method (e.g., simple random, systematic, stratified, cluster).
Collect data until the desired sample size is reached.
Acknowledge practical caveats (cost, access, frame imperfections).

Simple Random Sampling: each member has an equal, nonzero chance of being selected.
Systematic Random Sampling: choose a random starting point among the first k members of the list, then select every k-th member after it.
- Interval k is given by k = \frac{N}{n}, rounded to a whole number, where N is population size and n is desired sample size.
- Example: If N = 19 and n = 10, then k = \frac{19}{10} = 1.9 \approx 2; select every 2nd person.
Stratified Sampling: divide the population into subgroups (strata) and draw a random sample from each stratum in proportion to its size in the population.
Cluster Sampling: first select whole subgroups (clusters) at random, then sample within those clusters.
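Of the four methods above, simple random sampling is the most direct to sketch in code. The example below is a minimal illustration using Python's standard `random` module; the frame of 19 numbered people and the sample size of 10 are assumptions made for the example.

```python
import random

# Hypothetical sampling frame: ID numbers 1..19 standing in for 19 people.
frame = list(range(1, 20))

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# random.sample draws without replacement, so nobody is picked twice
# and every member of the frame is equally likely to be chosen.
simple_random_sample = rng.sample(frame, k=10)
print(sorted(simple_random_sample))
```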

Systematic sampling example: population size N = 19, desired sample size n = 10.
Systematic interval: k = \frac{N}{n} = \frac{19}{10} = 1.9, rounded up to 2; select every 2nd person.
Starting with person 1 (1, 3, 5, ..., 19) yields a systematic sample of 10 from the 19; starting with person 2 (2, 4, ..., 18) yields only 9.
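The worked example can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `systematic_sample` is hypothetical, and the random start within the first interval is the usual convention for systematic sampling rather than something fixed by the numbers above.

```python
import math
import random

def systematic_sample(frame, n, rng=random):
    """Pick a random start in the first interval, then take every k-th member."""
    N = len(frame)
    k = math.ceil(N / n)       # 19 / 10 = 1.9, rounded up to 2
    start = rng.randrange(k)   # random start index: 0 .. k-1
    return frame[start::k]

frame = list(range(1, 20))     # people numbered 1..19
sample = systematic_sample(frame, n=10, rng=random.Random(0))
print(sample)
```

Because k was rounded, the realized sample size depends on the start: beginning with person 1 gives 10 people, beginning with person 2 gives 9.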

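Stratified sampling example: suppose that of the 19 people, 10 said YES (going to Mars) and 9 said NO (not going).
Proportions: YES ≈ 52.6%, NO ≈ 47.4%.
Stratified plan: for a sample of 10, proportional allocation gives about 5.26 YES and 4.74 NO, so draw 5 at random from YES and 5 from NO to approximate the population proportions.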
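A short sketch of the allocation arithmetic and the within-stratum draws is given below. The respondent labels, the function name `proportional_allocation`, and the largest-remainder rounding rule are assumptions introduced for illustration; other rounding rules are possible.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split n across strata in proportion to size (largest-remainder rounding)."""
    N = sum(stratum_sizes.values())
    exact = {name: n * size / N for name, size in stratum_sizes.items()}
    alloc = {name: int(share) for name, share in exact.items()}  # round down first
    leftover = n - sum(alloc.values())
    # Hand the remaining slots to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - alloc[name], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc

strata = {
    "YES": [f"yes_{i}" for i in range(1, 11)],  # 10 hypothetical respondents
    "NO": [f"no_{i}" for i in range(1, 10)],    # 9 hypothetical respondents
}
sizes = {name: len(members) for name, members in strata.items()}
alloc = proportional_allocation(sizes, n=10)

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
stratified_sample = {name: rng.sample(members, alloc[name]) for name, members in strata.items()}
print(alloc)  # {'YES': 5, 'NO': 5}
```

The exact shares are 5.26 and 4.74, so a sample of 10 can only approximate the 52.6% / 47.4% split.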

Cluster sampling example (in class): treat seating rows or columns as clusters, randomly select some of them, and sample from the selected clusters.
Key idea: first decide which groups (clusters) will be in the sample, then sample within those groups.
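The two stages can be sketched as below. The classroom layout (4 rows of 5 seats), the number of rows chosen, and the number of people taken per row are all hypothetical values picked for the illustration.

```python
import random

# Hypothetical classroom: 4 rows (clusters) of 5 seats each.
rows = {f"row_{r}": [f"row_{r}_seat_{s}" for s in range(1, 6)] for r in range(1, 5)}

rng = random.Random(7)  # fixed seed only so the sketch is repeatable

# Stage 1: decide which clusters (rows) are in the sample.
chosen_rows = rng.sample(sorted(rows), k=2)

# Stage 2: sample individuals within each chosen cluster.
cluster_sample = [
    person
    for row in chosen_rows
    for person in rng.sample(rows[row], k=3)
]
print(chosen_rows)
print(cluster_sample)
```

Only people seated in the chosen rows can appear in the sample, which is what distinguishes this design from stratified sampling, where every subgroup contributes.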

A representative sample supports generalizing findings to the population.
Random sampling is the first step; representativeness strengthens the bridge from descriptive to inferential statistics.