Population vs Sampling: Key Concepts and Random Sampling Methods
Population
- Population: the total group of interest in a study (e.g., all college students in the US).
- Why not study every member? It’s usually impractical due to time, money, and resources.
Sample
- Sample: a subset of the population from which data are collected.
- People in the sample are called respondents or participants (preferred over 'subjects').
Sampling Frame
- Sampling frame: the list or accessible set of individuals from the population used to draw the sample.
- Ideally the frame lists every member of the population; in practice a complete list is often hard to obtain.
Random vs Nonrandom Sampling
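- Random sampling: selection is left to chance, so every member of the population has a known, nonzero chance of being selected (an equal chance under simple random sampling). Considered the gold standard for quantitative studies.
- Nonrandom sampling: selection does not rely on chance, so some individuals are more likely to be chosen than others; this can introduce bias; common in qualitative research.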
- Random sampling is necessary but not sufficient for a representative sample.
Representativeness and Inference
- Representative sample: the sample mirrors key characteristics of the population (e.g., gender, race, region).
- Representativeness facilitates moving from descriptive statistics to inferential statistics.
- Inferential statistics: use sample results to make conclusions about the population with a stated level of confidence.
Steps for Random Sampling
- Define the population clearly (e.g., all college students in a region).
- Establish a sampling frame where possible (complete list of population members).
- Choose a random selection method (e.g., simple random, systematic, stratified, cluster).
- Collect data until the desired sample size is reached.
- Acknowledge practical caveats (cost, access, frame imperfections).
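The steps above can be sketched in a few lines of Python for the simplest case, a simple random sample drawn from a complete sampling frame. This is a minimal illustration, not a prescribed procedure: the function name `simple_random_sample`, the ID-number frame, and the seed are all assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n members from the sampling frame without replacement.

    Every member of the frame has the same chance of being selected.
    """
    if n > len(frame):
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sampling frame: ID numbers 1..19 standing in for a roster.
frame = list(range(1, 20))
sample = simple_random_sample(frame, 10, seed=42)
print(sorted(sample))
```

Sampling without replacement means no one can be selected twice, which is the usual choice when the frame is a list of people.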
Types of Random Sampling Procedures
- Simple Random Sampling: each member has an equal, nonzero chance of being selected.
- Systematic Random Sampling: pick a random starting point among the first k elements of the population list, then select every k-th element from there.
- Interval k is given by k = \frac{N}{n}, where N is population size and n is desired sample size.
- Example: If N = 19 and n = 10, then k = \frac{19}{10} = 1.9\approx 2; select every 2nd person.
- Stratified Sampling: divide the population into subgroups (strata) and draw a random sample from each, typically in proportion to each stratum's size in the population.
- Cluster Sampling: first select whole subgroups (clusters) at random, then either include everyone in the selected clusters or sample within them.
Worked Example: Class of 19, Target n = 10
- Population size: N = 19, Desired sample size: n = 10.
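- Systematic interval: k = \frac{N}{n} = \frac{19}{10} = 1.9, rounded up to 2; select every 2nd person.
- Starting with person 1, this yields a systematic sample of 10 from the 19 (positions 1, 3, ..., 19); starting with person 2 would yield only 9 (positions 2, 4, ..., 18).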
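The class example can be reproduced with a short Python sketch. The function name `systematic_sample` and the optional `start` argument are assumptions for illustration; the roster is represented by the position numbers 1 to 19.

```python
import math
import random

def systematic_sample(frame, n, start=None, seed=None):
    """Select every k-th element of the frame, with k = N / n rounded up.

    start is a 1-based position between 1 and k; if omitted, it is
    chosen at random.
    """
    N = len(frame)
    k = math.ceil(N / n)  # 19 / 10 = 1.9, rounded up to 2
    if start is None:
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("start must be between 1 and k")
    return frame[start - 1::k], k

frame = list(range(1, 20))  # positions 1..19 in the class list

from_first, k = systematic_sample(frame, 10, start=1)
from_second, _ = systematic_sample(frame, 10, start=2)

print(k)            # 2
print(from_first)   # [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
print(from_second)  # [2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Because k was rounded up from 1.9, the achieved sample size depends on the starting point: 10 when starting at position 1, 9 when starting at position 2.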
Stratified Sampling Example (Mars Question)
- Suppose 10 said YES (go to Mars), 9 said NO (not going).
- Proportions: YES ≈ 52.6%, NO ≈ 47.4%.
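- Stratified plan: with n = 10, proportional allocation gives 10 × 0.526 ≈ 5.3 from YES and 10 × 0.474 ≈ 4.7 from NO, which rounds to 5 from each; the 50/50 split only approximately reflects the population proportions.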
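The allocation arithmetic for the Mars example can be checked with a small Python sketch. The function names, the ID numbers assigned to each stratum, and the seed are hypothetical; only the counts (10 YES, 9 NO) come from the example.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Give each stratum a share of n proportional to its size.

    Rounding each share separately may not always add up to exactly n;
    it does for the class example.
    """
    N = sum(stratum_sizes.values())
    return {name: round(n * size / N) for name, size in stratum_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample within each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}

# Hypothetical IDs: students 1..10 answered YES, students 11..19 answered NO.
strata = {"YES": list(range(1, 11)), "NO": list(range(11, 20))}

allocation = proportional_allocation({"YES": 10, "NO": 9}, 10)
print(allocation)  # {'YES': 5, 'NO': 5}

sample = stratified_sample(strata, 10, seed=1)
```

The key point is that randomness still operates inside each stratum: the strata fix how many people come from each group, not which people.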
Cluster Sampling Concept
- Example in class: treat seating rows or columns as clusters, randomly select some of them, and sample from the selected clusters.
- Key idea: first decide which groups (clusters) will be in the sample, then sample within those groups.
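A two-stage version of this idea might look like the following Python sketch. The seating layout (four rows holding 19 students), the function name `two_stage_cluster_sample`, and the numbers of rows and students drawn are assumptions chosen for illustration.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick whole clusters at random.
    Stage 2: draw a simple random sample within each chosen cluster.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # stage 1
    sample = {}
    for name in chosen:
        members = clusters[name]
        take = min(n_per_cluster, len(members))  # a cluster may be small
        sample[name] = rng.sample(members, take)  # stage 2
    return sample

# Hypothetical classroom: 19 students seated in four rows.
rows = {
    "row 1": [1, 2, 3, 4, 5],
    "row 2": [6, 7, 8, 9, 10],
    "row 3": [11, 12, 13, 14, 15],
    "row 4": [16, 17, 18, 19],
}

sample = two_stage_cluster_sample(rows, n_clusters=2, n_per_cluster=3, seed=7)
print(sample)
```

Students in rows that were not chosen in stage 1 cannot appear in the sample at all, which is what distinguishes cluster sampling from stratified sampling, where every subgroup contributes.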
Why Representativeness Matters
- A representative sample supports generalizing findings to the population.
- Random sampling is the first step; representativeness strengthens the bridge from descriptive to inferential statistics.
Quick Recap
- Population vs Sample vs Sampling Frame
- Respondents/Participants vs Subjects
- Random vs Nonrandom; Gold standard vs practicality
- Types: Simple Random, Systematic, Stratified, Cluster
- Use formulas like k = \frac{N}{n} for systematic sampling decisions
- Representativeness enables valid inferences about the population