Notes on Sampling and Study Design
Population and population identification
The population to study depends on the study's aim and the question the researchers want to answer.
Examples from Seton Hall context:
The Seton Hall community likely includes students, faculty, administration, staff, and the people who live around Seton Hall.
Alumni (not on campus) who fund projects or tuition.
Families whose money supports current and future students.
All of these groups are potential populations depending on the research question.
When publishing a study, researchers should identify the population upfront so readers can know to whom the results apply (e.g., population on campus vs. economic impact on the town).
Population vs. sample: the population is the entire set of units of interest; the sample is a subset used to make inferences about the population.
Study types and data collection concepts
Study type helps determine the sample size and feasibility:
Experiments: expensive, time-, staff-, and labor-intensive; often involve random assignment and control/treatment groups; high potential for causal inference but not always feasible.
Observational studies: less invasive and cheaper; rely on existing records or natural observations; harder to rule out confounding variables; can be large-scale (e.g., 94,000 people) but with more lurking variables.
Surveys and censuses: direct data collection from individuals; can be active (you ask questions) or passive (recorded data); often used to measure opinions, behaviors, and statuses.
Measurement collection challenges:
Posing the right question to the right person is harder than it sounds; a poorly worded question or the wrong respondent can bias results.
Interviewers’ identity and relationship to respondents can influence responses (e.g., adult children interviewing parents).
Confidentiality concerns can lead to underreporting of embarrassing information (e.g., reading level, income).
Descriptive vs. inferential statistics:
Descriptive: summarize measurements in the sample (e.g., mean, median, variance).
Inferential: extend findings from the sample to the population, including estimates of variability and margin of error.
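A minimal sketch (Python, with made-up numbers) of the contrast: the descriptive lines summarize only the sample in hand, while the inferential lines attach a rough 95% margin of error to generalize toward the population.

```python
import math
import statistics

# Hypothetical sample of 10 measurements (illustrative values only)
sample = [4.2, 5.1, 3.8, 4.9, 5.0, 4.4, 4.7, 5.3, 4.1, 4.6]

# Descriptive: summarize the measurements in the sample
mean = statistics.mean(sample)
median = statistics.median(sample)
variance = statistics.variance(sample)      # sample variance (n - 1 denominator)
print(f"mean={mean:.2f}, median={median:.2f}, variance={variance:.3f}")

# Inferential: attach variability and a margin of error to generalize to the population
se = math.sqrt(variance / len(sample))      # standard error of the mean
moe = 1.96 * se                             # 1.96 is the z critical value for 95% confidence
print(f"approximate 95% CI for the population mean: {mean - moe:.2f} to {mean + moe:.2f}")
```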
Inference process steps:
Design, data collection, and modeling influence how much we can generalize beyond the sample.
Interpretation phase translates results into actionable conclusions and persuades stakeholders that methods were sound.
Sampling frame, frame maintenance, and representativeness
Sampling frame: an organized list that describes the population (e.g., roster of undergraduates, emails, etc.). The sample is drawn from this frame.
The frame must exist and be maintained; problems include:
Frame maintenance: people move, graduate, transfer, or join; updating is necessary but can be costly and time-consuming.
Under-coverage: some population members are not included in the frame (e.g., left-handed individuals if handedness is never recorded, or hard-to-reach groups).
Nonprobability vs. probability sampling:
Probability sampling uses a frame to draw a sample where every unit has a known nonzero chance of selection.
Nonprobability sampling (e.g., voluntary, convenience) does not guarantee representativeness and can lead to bias.
Random sampling aims for representativeness, but random does not guarantee a perfect cross-section of the population; representativeness is evaluated after sampling.
Random sampling caveats:
Even with random methods, under-coverage or nonresponse can bias results.
The frame is critical; if important subgroups are missing, the sample may not reflect the population well.
Common random sampling techniques
Simple Random Sampling (SRS)
Take a complete list (sampling frame), number each unit, and draw n random numbers to select units.
Process steps:
Number the population 1..N (e.g., 6,000 undergraduates).
Generate n random numbers between 1 and N (often via electronic tools).
Select the units corresponding to those numbers.
Practical notes:
Duplicates must be avoided; if a number repeats, draw another number.
Requires an up-to-date, complete frame; otherwise, many units may be missed or counted incorrectly.
Advantages: straightforward and easy to explain; fairly unbiased if the frame is complete.
Disadvantages: still subject to frame errors and nonresponse; may underrepresent subgroups if they are rare in the frame.
Example analogy: hat method – 6,000 pieces of paper in a hat; draw 100 times to select 100 individuals (don’t actually say this in proposals; describe as SRS).
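A minimal SRS sketch in Python, assuming the sampling frame is simply a numbered list of the 6,000 undergraduates mentioned above (the frame here is made up). random.sample draws without replacement, so the "no duplicates" rule is handled automatically.

```python
import random

# Hypothetical frame: 6,000 undergraduates numbered 1..N
# (in practice this would be an up-to-date roster with IDs or emails)
N = 6000
frame = list(range(1, N + 1))

n = 100                              # desired sample size
srs = random.sample(frame, n)        # draw n distinct units; no duplicates possible
print(sorted(srs)[:10])              # first few selected unit numbers
```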
Systematic Random Sampling
Approach: divide the frame into equal-sized segments and select one unit at a fixed interval k (the skip).
Steps:
Determine n = sample size and N = population size; compute k = N/n.
Randomly select a starting point between 1 and k, then select every k-th unit thereafter.
For example, with N = 6,000 and n = 100, k = 60; start at position x in 1..60, then select x, x+60, x+120, …
Advantages: easier to implement; spreads sample across the frame; requires only one random start.
Disadvantages: frame ordering can bias results if the frame is ordered (e.g., by day of week, or by some attribute); if k is not an integer or if rounding occurs, coverage gaps can appear.
Practical notes:
If k is not an integer, round and handle edge cases; may lead to under-coverage if last portion is cut off.
Application examples: systematic sampling used in telephone number generation (e.g., RDD lists) and other large-scale polls.
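A sketch of systematic sampling under the same assumed numbers (N = 6,000, n = 100): compute the skip k, pick a random start in 1..k, then take every k-th unit. The rounding caveat above applies whenever N/n is not an integer.

```python
import random

N, n = 6000, 100
k = N // n                                  # skip interval; here 60 (round when N/n is not an integer)

start = random.randint(1, k)                # random starting position between 1 and k
systematic = list(range(start, N + 1, k))   # start, start + k, start + 2k, ...
print(len(systematic), systematic[:5])
```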
Stratified Random Sampling
Concept: divide the population into strata (subgroups) that are internally homogeneous and externally heterogeneous.
Steps:
Identify strata (e.g., left-handed vs. right-handed vs. ambidextrous; age groups; education levels; political affiliations).
Within each stratum, perform random sampling (SRS or systematic) to obtain the required number from each group.
Proportional vs. disproportional (oversampling) approaches:
Proportional: sample sizes from each stratum are proportional to the stratum’s share in the population (e.g., 90% right-handed, 10% left-handed).
Disproportional (oversampling): over-sample certain strata to ensure enough data for analysis (e.g., oversampling seniors as likely voters).
Rationale and use:
Helps guarantee representation of important subgroups and enables cross-strata comparisons.
Potential issues:
Requires clear definitions of strata and accurate classification of individuals.
If some strata are absent or very small in the frame, interpretation can be tricky.
Example: election polls oversample elderly respondents because of their higher turnout; this helps model likely voters but must be corrected (weighted) in analysis so overall estimates remain representative.
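A sketch of proportional vs. disproportional (oversampled) allocation, using made-up handedness shares from the example above; within each stratum the actual draw would then be an SRS or systematic sample from that stratum's own frame.

```python
# Hypothetical population shares per stratum
strata_shares = {"right-handed": 0.89, "left-handed": 0.10, "ambidextrous": 0.01}
n_total = 200   # total sample size

# Proportional allocation: each stratum's sample size matches its population share
proportional = {s: round(n_total * share) for s, share in strata_shares.items()}

# Disproportional allocation (oversampling): guarantee a minimum per stratum
# so small groups still yield enough data to analyze
minimum_per_stratum = 30
oversampled = {s: max(size, minimum_per_stratum) for s, size in proportional.items()}

print("proportional:", proportional)    # e.g., {'right-handed': 178, 'left-handed': 20, 'ambidextrous': 2}
print("oversampled: ", oversampled)     # small strata boosted to 30
```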
Cluster Sampling
Concept: sample natural groups or clusters when a complete frame is unavailable or too costly to obtain.
Steps:
Identify clusters (e.g., dorms, schools, geographic areas, troops/teams).
Randomly select clusters, then sample everyone within selected clusters or sample within clusters (two-stage sampling).
In large surveys, a two-stage cluster design may be used: select clusters, then sample within clusters.
Advantages: reduces administrative burden; less costly than listing all individuals; scalable for nationwide polls.
Disadvantages: clusters may be more similar to each other than the overall population (increased homogeneity); potential undercoverage if some clusters are not selected or are problematic (e.g., dorms with restricted access).
Two-stage or multi-stage sampling is common in large surveys.
Note on clusters in practice: clusters should be diverse overall, not biased toward one demographic; if clustering limits diversity, results may be biased.
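A sketch of a two-stage cluster design, assuming dorms are the clusters and a resident roster is built only for the dorms that get selected (all dorm names and sizes here are hypothetical).

```python
import random

# Hypothetical clusters: dorm -> number of residents (no campus-wide student list needed up front)
dorms = {"Aquinas": 400, "Boland": 350, "Cabrini": 300, "Neumann": 250, "Serra": 200, "Xavier": 180}

# Stage 1: randomly select a subset of clusters
selected = random.sample(list(dorms), 3)

# Stage 2: within each selected cluster, list its residents and draw an SRS
sample = {}
for dorm in selected:
    residents = list(range(1, dorms[dorm] + 1))   # roster built only for selected dorms
    sample[dorm] = random.sample(residents, 25)

print({dorm: len(units) for dorm, units in sample.items()})
```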
Multistage Sampling (combining methods)
Real-world surveys often combine stratification and clustering, then apply simple random or systematic sampling within final stages.
Example pattern: stratify by region or state, then cluster by city, then sample within cities using SRS or systematic sampling.
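A compact sketch of that stratify-then-cluster-then-SRS pattern, with entirely made-up regions, cities, and household counts.

```python
import random

# Hypothetical frame: region (stratum) -> cities (clusters) -> household counts
regions = {
    "Northeast": {"CityA": 500, "CityB": 300, "CityC": 200},
    "South":     {"CityD": 400, "CityE": 350, "CityF": 250},
}

final_sample = {}
for region, cities in regions.items():               # stratify: every region is represented
    city = random.choice(list(cities))                # cluster: pick one city per region
    households = list(range(1, cities[city] + 1))
    final_sample[(region, city)] = random.sample(households, 20)   # SRS within the cluster

print({key: len(units) for key, units in final_sample.items()})
```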
Random Digit Dialing (RDD) and Telephone Polling frames
RDD uses telephone numbers generated through a systematic process to reach potential respondents.
Phone polling often uses two frames: cell phones (personal) and landlines (households), each with different interviewing protocols.
Interview procedures differ by frame (e.g., request the youngest adult in a household for landlines).
Companies like Dynata compile lists for dialing based on various criteria; numbers are selected to represent targeted subgroups.
RDD can be proportional or oversample certain groups (e.g., civil rights questions may oversample a demographic of interest).
Post-survey adjustments (weighting) are common to align sample demographics with the population.
Stratification vs. clustering: how to tell them apart in practice
Stratified sampling uses all strata and ensures representation of each subgroup; clusters are groups selected for practicality and may not cover all subgroups directly.
Stratified sampling aims for homogeneity within strata and heterogeneity across strata; clusters aim for manageability and cost efficiency.
In practice, surveys often mix both approaches: stratify by some variables (like age, race, region) and cluster within strata to reduce costs, then sample within clusters.
Oversampling and “likely voters” concepts
Oversampling: intentionally sampling more from a particular subgroup to ensure sufficient data for analysis (e.g., seniors in a political poll).
Likely voters: a concept used to weight or sample respondents who are more likely to vote; involves identifying groups that are more prone to voting (often elderly or higher-income groups).
Important caveat: oversampling must be corrected with appropriate weighting to avoid bias in overall population estimates.
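A sketch of the weighting correction this caveat refers to, with hypothetical population and sample shares: each group's weight is its population share divided by its sample share, so an oversampled group is down-weighted in overall estimates.

```python
# Hypothetical shares: seniors oversampled relative to their population share
population_share = {"seniors": 0.20, "non-seniors": 0.80}
sample_share     = {"seniors": 0.40, "non-seniors": 0.60}

# Post-stratification weight = population share / sample share
weights = {g: population_share[g] / sample_share[g] for g in population_share}
print(weights)   # seniors down-weighted to 0.5, non-seniors up-weighted to ~1.33

# Hypothetical support rates on some question, by group
support = {"seniors": 0.70, "non-seniors": 0.50}

# The unweighted estimate over-counts seniors; the weighted estimate restores population balance
unweighted = sum(sample_share[g] * support[g] for g in support)
weighted   = sum(sample_share[g] * weights[g] * support[g] for g in support)
print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")   # 0.58 vs. 0.54
```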
Practical considerations and polling logistics
Framing and sequencing of questions matter; poorly designed questions can distort results (e.g., Obamacare polling in which a single "no" response could reflect two opposing reasons).
Random sampling does not guarantee representativeness; nonresponse and self-selection can bias results.
Frame maintenance and up-to-date lists are essential; otherwise, the frame can become outdated quickly.
Random sampling is powerful but requires careful implementation and ethical safeguards.
Nonprobability sampling and common biases
Nonprobability sampling methods include:
Volunteer sampling: participants opt in (e.g., call-in polls, website polls).
Convenience sampling: sample those who are easiest to reach (e.g., boardwalk surveys, online panels).
Biases in nonprobability sampling:
Self-selection bias: those who volunteer may differ systematically from the population in ways relevant to the study (e.g., more motivated respondents).
Convenience bias: sampling from easily accessible locations may exclude other groups (e.g., people not at the boardwalk when recreation questions are asked).
Undercoverage and coverage issues persist if the sample frame misses key population segments.
Random sampling caveats: even with random methods, if the frame omits subgroups or if response rates are uneven, results may be biased.
Variability, bias, and the margin of error
Variability (random error) vs. bias (systematic error):
Variability is the natural fluctuation in responses across different samples; larger samples help reduce variability.
Bias is a directional shift away from the true population value; increasing sample size does not fix bias and can even magnify it if the sampling process is flawed.
Margin of error (MOE): a measure of variability across repeated samples; often associated with a confidence level (e.g., 95%).
Common formula for a proportion p with sample size n (assuming simple random sampling):
ME = z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}, where z_{\alpha/2} is the critical value from the standard normal distribution for the desired confidence level.
Confidence interval for a proportion: CI = p \pm ME
Sample size for a desired ME (for a proportion): n \approx \frac{z_{\alpha/2}^2 \, p(1-p)}{ME^2}
Finite population correction (when sampling a sizable fraction of the population): ME \approx z_{\alpha/2} \sqrt{\frac{p(1-p)}{n} \cdot \frac{N-n}{N-1}}
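A numeric check of these formulas (Python), assuming p = 0.5, n = 1,000, a 95% confidence level, and a hypothetical population of N = 6,000 for the finite population correction.

```python
import math

z = 1.96             # critical value for 95% confidence
p, n = 0.5, 1000     # hypothetical sample proportion and sample size

# Margin of error for a proportion under simple random sampling
me = z * math.sqrt(p * (1 - p) / n)
print(f"ME  ≈ {me:.3f}")                          # about ±0.031, i.e., ±3.1 points

# Confidence interval for the proportion
print(f"CI  ≈ ({p - me:.3f}, {p + me:.3f})")

# Sample size needed for a desired margin of error (e.g., ±3 points)
target_me = 0.03
n_needed = (z ** 2) * p * (1 - p) / target_me ** 2
print(f"n for ±3%: about {math.ceil(n_needed)}")  # about 1,068

# Finite population correction when the sample is a sizable fraction of the population
N = 6000
me_fpc = me * math.sqrt((N - n) / (N - 1))
print(f"ME with FPC ≈ {me_fpc:.3f}")
```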
Representativeness and random sampling
A representative sample mirrors the overall population patterns (not necessarily exact values but similar distributions across key characteristics).
Probability samples from a frame are designed to be representative; nonprobability samples require post-hoc adjustments and caution in interpretation.
Real-world case studies and illustrative examples
Seton Hall context (course example): population choices affect study design and the interpretation of results related to the university community and its broader impact.
Observational study example: a claim like “nut consumption reduces cholesterol” based on observational data rather than a controlled experiment.
Nun Study reference: participants' similar lifestyles are used to justify that confounding variables are limited in that observational setting.
Obamacare polling example: misinterpretation arises when the question’s framing yields multiple, opposing reasons for a single binary response; underscoring the importance of precise question construction.
The Lancet / Johns Hopkins Iraq death studies (2004 and 2006) as a prominent sampling case:
2004 study design: map of Iraq divided into 33 regions (strata); within each region, randomly locate a dot and identify the 30 closest households for interviews (cluster sampling).
Result: nearly 8,000 people interviewed; ~0.5% nonresponse rate; data collected by Iraqi interviewers who were medical professionals (to gain trust and safety in a war zone).
Limitations and criticisms: some clusters were too close to active war zones or outside region boundaries; data dropped if clusters didn’t fit region definitions; some criticisms about representativeness and data quality.
2006 follow-up: expanded to ~50 regions, ~13,000 people; adjustment for clusters that were excluded and refined sampling strategy.
Findings: estimated around 650,000 deaths associated with the invasion (much higher than official counts); highlights the challenges and importance of robust field sampling in conflict zones.
Takeaway: sampling in difficult environments can still produce valuable estimates, but requires careful design, local interviewing capability, and transparent handling of limitations.
Ethical, practical, and interpretive considerations
Informed consent and confidentiality: essential for ethical data collection; respondents must understand participation and data use.
Embedding data collection in real-world settings (e.g., war zones) demands robust safeguards for interviewers and participants; nonresponse and safety concerns can shape design.
Practical constraints shape study design: cost, time, labor, and available staff influence whether an experiment, observation, or survey is chosen.
The importance of transparent methods: clearly describe population, frame, sampling method, response rates, and weighting adjustments to enable evaluation of representativeness.
Interpretation and communication: results must be presented with caveats about limitations, possible biases, and the degree to which conclusions generalize beyond the sample.
Key terms and recap concepts
Population: the entire group of interest in a study.
Sampling frame: the list or source from which a sample is drawn.
Sample: a subset of the population chosen for study.
Stratified sampling: dividing population into homogeneous subgroups (strata) and sampling within each stratum.
Clustering: selecting natural groups (clusters) and sampling within clusters.
Systematic sampling: selecting every k-th unit after a random start.
Simple Random Sampling (SRS): each unit has an equal chance of selection.
Oversampling: sampling more from a subgroup to ensure adequate analysis.
Likely voters: a subgroup assumed to be more representative of the actual voters; used to improve polling accuracy but requires proper adjustment.
Undercoverage: portions of the population are missing from the sampling frame.
Nonresponse bias: when respondents differ from nonrespondents in ways that affect the study.
Bias vs. variability: bias is systematic error; variability is random error; larger samples reduce variability but not bias.
Margin of error (MOE): a measure of the precision of an estimate, tied to confidence level.
Random Digit Dialing (RDD): a sampling method used in telephone surveys to reach potential respondents.
Two-stage/multi-stage sampling: combining sampling methods across stages to manage costs and logistics.
Representativeness: a sample pattern that resembles the population along key dimensions; achieved through probability sampling and weighting when needed.
Proportional vs. disproportional sampling: sampling in proportion to population or oversampling particular strata for analytical purposes.
Frame maintenance: updating the sampling frame to reflect population changes over time, essential for accurate sampling.
Case study takeaways: well-designed sampling in challenging contexts can yield valuable inferences, but transparency about limitations and potential biases is crucial.