Survey Sampling: Comprehensive Notes (Video Transcript)
Course context and goals
- Instructor: Thomas Lovely (not Andrew Spall); will teach the next three weeks on sampling and sampling aspects of surveys.
- Connection to Andrew Spall: issues of how to get surveys that measure what you want and how you ask questions; next focus is how you choose people for surveys and how that affects analysis.
- Data analysis plans: later weeks will cover data analysis with Insight (survey analyses and simple experimental design analyses); alternative tools include R and Stata.
- Thomas background: medical statistics researcher and professor; specializes in sampling, fitting models to data from complex samples, and using sampling to measure expensive metrics on smaller groups; has written the survey software underneath Insight.
- Logistics: lecture uses a discussion system (Ed) instead of Piazza; questions encouraged; instructor will pause to allow questions.
Why surveys and what a survey looks like
- Structured data vs homogeneous data: surveys (and experiments) are structured; not all observations are the same.
- Purpose of surveys: learn about populations by sampling a smaller group cost-effectively.
- Key questions addressed by surveys: proportion in a population, trends over time, and associations between variables (e.g., personality vs views on human rights).
- Population concepts:
- Target population: the population we want to make estimates about.
- Survey population: the people we could actually survey.
- Sampling frame: a list of members of the survey population from which we sample.
- Sample: the people we actually select.
- Respondents: the people in the sample who actually respond and are in the target population.
- Sample design: how we go about choosing the sample (and may include sampling people not in the target population).
- Practical points:
- Frame errors and nonresponse can bias results even if the sampling method is unbiased in principle.
- Some data are collected by organizations and made available for analysis (government surveys in the US, New Zealand data, etc.).
- Core tasks when analyzing survey data:
- Define the survey structure to the computer so analyses reflect the design.
- Use software that understands survey design (Insight, R with survey package, Stata).
- Design details to report: stratification (if any), clustering (if any), and sampling weights.
- Note on causality:
- Surveys are usually observational; correlations do not imply causation.
- Special cases exist (e.g., AB testing, or experiments embedded in surveys) where causal inferences are more reliable.
The NZ statistical geography and its relevance to sampling
- NZ geography is hierarchical and designed to support sampling and data reporting:
- Regions: 16 regions.
- Territorial authorities: 67 in total, most of them districts (53).
- Statistical geography units (smallest to largest):
- Mesh blocks: the smallest basic geographic units; designed to be stable over time (unlike dwellings, which change).
- SA1: about 100–200 people.
- SA2: a few thousand people.
- Regions are subdivided into urban/rural strata for sampling; mesh blocks are nested within SA1/SA2 and within territorial authorities, regions, etc.
- Nested structure and reporting:
- Every point in NZ is in one mesh block, one SA1, one SA2, one TA, and one region.
- Data reporting uses SA1, SA2, and mesh blocks to balance confidentiality and geographic specificity.
- Additional geographical constructs:
- Electorates exist but are not perfectly nested with SA1/SA2; electorate boundaries can cut across these units.
- Practical note: when planning surveys, one typically aims to cover all regions to ensure geographic representativeness; omitting a region (e.g., Waikato) could bias national estimates.
Defining populations and sampling frames in practice
- Target population vs survey population: need precise definitions to avoid misalignment.
- Examples of well-defined targets: New Zealand adults 18+; or a more refined group like a sub-population of interest.
- Edge cases to consider: adults physically in NZ vs citizens; residents with certain visas; international students; tourists.
- Sampling frame quality matters:
- A good frame (e.g., electoral roll) makes sampling practical but may include ineligible individuals (non-citizens in NZ roll).
- Frame errors (missing people or including ineligible people) lead to biases if not properly adjusted.
- Sample, respondents and target population alignment:
- The sample may include people outside the intended target population; analysts must screen them out and/or weight the data to reflect the true population of interest.
- Edge cases and practical issues:
- In NZ, enrollment in the electoral roll is compulsory but enforcement varies; this affects coverage.
- The concept of who counts as a New Zealand adult can affect the generalizability of results.
- Nonresponse (people not answering calls, surveys, etc.) biases must be addressed via weighting or follow-ups.
Sampling frames and a concrete NZ example: Mission On (nutrition/physical activity intervention)
- Mission On structure (2008–2009) overview:
- Aim: evaluate a government initiative to improve nutrition and physical activity among youth.
- Regions and stratification: 16 regions × urban vs rural (32 strata).
- Sampling frame: mesh blocks within each stratum were sampled; mesh blocks with more dwellings had higher selection probability (PPS by dwellings).
- Within each selected mesh block: households sampled with a systematic design (e.g., start at the southwest corner of the block and select every nth dwelling, such as the third house on the left and the seventh on the right).
- Within each selected household: one person selected (often the one with the most recent birthday).
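The first two stages of this design can be sketched as follows. This is a toy illustration, not the Mission On field protocol: the block identifiers, dwelling counts, and sampling interval are all invented.

```python
import random

# Hypothetical mesh blocks: id -> number of dwellings (the PPS size measure).
mesh_blocks = {"MB001": 40, "MB002": 120, "MB003": 80, "MB004": 60}

random.seed(1)

# Stage 1: select one mesh block with probability proportional to dwellings (PPS).
ids = list(mesh_blocks)
sizes = [mesh_blocks[b] for b in ids]
chosen_block = random.choices(ids, weights=sizes, k=1)[0]

# Stage 2: systematic sample of dwellings within the chosen block.
n_dwellings = mesh_blocks[chosen_block]
step = 10                        # take every 10th dwelling
start = random.randrange(step)   # random start within the first interval
sampled_dwellings = list(range(start, n_dwellings, step))
```

Systematic sampling here is what makes the fieldwork instruction simple ("every nth house from a fixed corner") while still being random, because the starting point is randomized.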
- Design and terminology in this example:
- Population is subdivided into strata (regions × urban/rural).
- Within strata, mesh blocks are the clusters (PSUs, the first-stage sampling units).
- Not all mesh blocks are sampled; only a subset is selected; then within those blocks, a subset of dwellings is surveyed.
- Within households, only one person is surveyed, creating another stage of sampling (multi-stage design).
- Rationale and implications:
- The approach reduces travel costs and makes fieldwork feasible for in-person interviews.
- You must generalize from sampled mesh blocks to the rest of the country, which introduces design effects due to clustering.
- Stratification increases precision and ensures geographic representation; clusters reduce cost but introduce intracluster correlation.
- Sampling weights arise from unequal probabilities of selection at each stage and must be used in analysis for unbiased population estimates.
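A minimal sketch of how such weights enter estimation, with invented values and inclusion probabilities: each respondent's weight is the inverse of their inclusion probability, and the weighted mean can differ noticeably from the unweighted one.

```python
# Hypothetical respondents: (measured value, overall inclusion probability pi_i).
data = [(5.0, 0.01), (7.0, 0.02), (4.0, 0.01), (6.0, 0.05)]

weights = [1.0 / p for _, p in data]   # design weights w_i = 1 / pi_i
values = [v for v, _ in data]

# Weighted (Hajek-style) estimate of the population mean: each respondent
# "stands in for" w_i people in the population.
weighted_mean = sum(w * v for w, v in zip(weights, values)) / sum(weights)
unweighted_mean = sum(values) / len(values)
```

Respondents who were hard to select (small π, large weight) pull the weighted estimate toward their values, which is exactly the correction needed when selection probabilities were unequal.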
Key sampling design concepts: stratification, clustering, and multi-stage designs
- Stratified sampling (strata):
- The population is cut into non-overlapping strata (e.g., regions; urban/rural within regions).
- Sampling can be performed within each stratum; common motivations:
- Ensure representation of important subgroups.
- Improve precision by reducing within-stratum variance.
- Enable region- or subgroup-specific analyses.
- Effects of ignoring stratification: estimated standard errors are usually too large (stratification typically improves precision), and point estimates can be biased if sampling fractions differ across strata (e.g., rural vs urban).
- Clustering (clusters):
- Units are grouped into clusters (e.g., mesh blocks, schools, classes).
- Sampling occurs by selecting some clusters and then sampling within them.
- Rationale:
- Cost savings, reduced travel, feasible data collection when frames are incomplete or hard to enumerate.
- Trade-offs:
- Intraclass correlation means units within a cluster resemble each other; effective sample size is smaller than the raw sample size.
- Potential unequal selection probabilities across clusters; bias can occur if not accounted for.
- Combined designs (stratified and clustered):
- Common in practice (e.g., Mission On used stratification at the region level and multi-stage cluster sampling within regions).
- The first-stage sampling unit is the PSU (primary sampling unit); subsequent stages sample smaller units (secondary sampling units, then individuals) within PSUs.
- Primary sampling units (PSUs):
- The largest units actually sampled in a multi-stage design.
- In Mission On, mesh blocks were the PSUs.
- If you have no clustering, the PSU concept reduces to the individual unit.
- Multi-stage designs and terminology:
- Stage 1: choose PSUs (e.g., mesh blocks).
- Stage 2: sample clusters within PSUs (e.g., households within blocks).
- Stage 3+: sample individuals within clusters (e.g., one person per household).
- In survey literature, stage numbers can be described differently depending on context (multilevel modeling uses a different orientation for levels).
Probabilities, weights, and analysis with complex samples
- Inclusion probabilities and weights:
- For a unit i, the overall inclusion probability is the product of stage-level probabilities: π_i = ∏_{t=1}^{T} π_{i,t}.
- The survey weight is the inverse of this inclusion probability: w_i = 1/π_i.
- Example structure (Mission On-like):
- Probability of selecting a mesh block j within a region is proportional to its number of dwellings: π_j = D_j / Σ_{k∈region} D_k, where D_j is the number of dwellings in block j.
- Probability of selecting a household within a chosen mesh block j: π_{hh|j} = n_{hh|j} / H_j, where n_{hh|j} is the number of households sampled in block j and H_j is the total number of households in block j.
- Probability of selecting a person within a selected household: π_{p|hh} = 1 / N_{p|hh}, where N_{p|hh} is the number of people in the household (one person selected per household).
- Overall inclusion probability for a person i in the survey: π_i = π_j × π_{hh|j} × π_{p|hh}.
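Putting hypothetical numbers through these stage-level probabilities (all counts invented for illustration):

```python
# Hypothetical counts for one respondent in a Mission On-style design.
D_j = 120          # dwellings in the selected mesh block j
D_region = 24000   # total dwellings across all mesh blocks in the region

H_j = 110          # households found in block j
n_hh = 11          # households sampled in block j
N_p = 3            # people in the selected household

pi_block = D_j / D_region   # PPS: proportional to dwellings
pi_hh = n_hh / H_j          # household within the selected block
pi_person = 1 / N_p         # one person per household

pi_i = pi_block * pi_hh * pi_person   # overall inclusion probability
weight = 1 / pi_i                     # design weight
```

With these numbers the respondent's overall inclusion probability is 1/6000, so their design weight is 6000: this one person stands in for about six thousand people in the region.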
- Practical impacts for analysis:
- If clustering (PSUs) is ignored, standard errors are usually too small; if weights are ignored or mis-specified, point estimates can be biased.
- Modern software (Insight, R survey package, Stata svy) can handle these designs if you correctly specify strata, clusters, and weights.
- The design effect (DEFF) quantifies the efficiency loss due to clustering; the effective sample size is reduced: n_eff = n / DEFF.
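A quick illustration using the common equal-cluster-size approximation DEFF = 1 + (m - 1) * rho, with hypothetical values for the sample size, cluster size, and intraclass correlation:

```python
# Design effect of a clustered sample, standard approximation:
# DEFF = 1 + (m - 1) * rho, with m the average cluster size and
# rho the intraclass correlation (all values hypothetical).
n = 1200     # nominal sample size
m = 20       # average respondents per cluster
rho = 0.05   # intraclass correlation

deff = 1 + (m - 1) * rho   # 1.95: variance nearly double that of SRS
n_eff = n / deff           # effective sample size, well under the nominal 1200
```

Even a modest intraclass correlation of 0.05 roughly halves the information in the sample here, which is why cluster size matters as much as the total number of interviews.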
Simple random sampling vs complex sampling
- Simple random sampling (SRS) without replacement:
- Each unit has equal probability of being selected; all possible samples are equally likely.
- Requires a complete list (sampling frame) of units; random digit dialing can serve as an implicit frame for telephone surveys.
- Practical limits: becomes costly if sampling a large fraction; generally not used for large-scale surveys due to cost and travel when in-person.
- Limitations of SRS in practice:
- If you have extra information (e.g., want better representation of ethnic groups), SRS may be inefficient; oversampling may be used to ensure adequate representation within subgroups.
- Frames may be incomplete or outdated; nonresponse can still bias results.
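SRS without replacement is straightforward when a complete frame exists. A minimal sketch over a hypothetical frame of unit identifiers:

```python
import random

# Hypothetical complete frame of 10,000 unit identifiers.
frame = list(range(1, 10001))
n = 100

random.seed(42)
srs = random.sample(frame, n)   # SRS without replacement: no duplicates

# Under SRS, every unit has the same inclusion probability n / N,
# so every unit gets the same design weight N / n.
pi = n / len(frame)
weight = 1 / pi
```

The equal weights are what make SRS analysis simple, and also what complex designs give up in exchange for lower cost and guaranteed subgroup coverage.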
Why stratification and clustering decisions matter in practice
- Stratified sampling advantages:
- Protects against bad samples (e.g., missing a region like Auckland); ensures representation across key subpopulations.
- Facilitates reporting summaries for each stratum (e.g., region-level estimates).
- Increases precision by reducing variance within strata; allows different sampling methods per stratum.
- Example: Scottish Household Survey uses dense-area strata for cities and whole towns in Highlands for sparsely populated areas.
- Consequences of ignoring stratification:
- Standard errors may be inflated or deflated inappropriately; estimates may be biased if strata differ meaningfully.
- Clustering advantages and trade-offs:
- Cost and practicality: easier to collect data in a few locations (PSUs) rather than many scattered sites.
- Does not require a complete frame at the individual level; can work with a frame at a higher level (e.g., villages, schools).
- Intra-cluster correlation reduces the information gained from each additional unit within the same cluster; effective sample size is smaller than the nominal sample size.
- Risk of unequal sampling probabilities across clusters; can bias estimates if clusters differ systematically in key variables.
- When to use both stratification and clustering:
- Very common in practice and often the most cost-effective design; strata help ensure representativeness, clusters reduce fieldwork costs.
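A toy sketch of such a combined design, stratified at the top level with cluster sampling inside each stratum; the strata, clusters, and all sizes are invented:

```python
import random

# Hypothetical population: stratum -> cluster -> unit ids.
population = {
    "urban": {"c1": list(range(0, 100)),
              "c2": list(range(100, 250)),
              "c3": list(range(250, 300))},
    "rural": {"c4": list(range(300, 340)),
              "c5": list(range(340, 420))},
}

random.seed(7)
sample = []
for stratum, clusters in population.items():
    # Stratification: every stratum contributes to the sample;
    # clustering: only some clusters within each stratum are visited.
    chosen_clusters = random.sample(sorted(clusters), 2)
    for c in chosen_clusters:
        # Second stage: SRS of 10 units within each selected cluster.
        sample.extend(random.sample(clusters[c], 10))
```

Note the asymmetry: all strata are used but only a subset of clusters, which is precisely why strata buy representativeness while clusters buy cost savings.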
A few additional examples of survey designs
- Cannabis referendum (hypothetical example): stratified by Maori descent (Maori vs non-Maori) and age (>50 vs ≤50); sample size of 1000 individuals per cell in the 2×2 design (Maori/non-Maori × age group).
- School vaccination attitudes: two-stage cluster sampling; select 100 schools; survey all students in three randomly chosen classes per school.
- School Health Survey: stratified by school type (co-educational, boys, girls); clusters are schools; within each school, sample three classes; survey all students in those classes.
- Malawi Food Production and Security survey: strata are districts; clusters are villages; PSUs are villages; within each village sample 30 households; within households survey all residents or selected members.
- Practical point: when no complete individual-level frame exists, clustering may be necessary even when stratification is desirable; when a good frame is available, stratification alone may suffice.
Why we need to use survey-aware software and proper design reporting
- Modern software supports complex survey designs, but only if you provide accurate design information:
- Strata variables (if any);
- Clusters / PSUs (if any);
- Sampling weights (from inclusion probabilities).
- If you mis-specify the design (wrong strata, wrong clustering, wrong weights), you will get biased results.
- Without proper design, analysis may resemble simple random sampling and ignore design effects, leading to misleading confidence intervals and p-values.
Practicalities and ethical considerations in survey sampling
- Nonresponse and coverage bias are persistent concerns across survey methods; weighting and follow-ups are standard tools to mitigate these biases.
- Privacy and defensible definitions of population are important: precise definitions determine what you can generalize to; mismatches can mislead policy decisions.
- Data accessibility varies by country; large datasets (e.g., US government surveys) exist, but access often requires permissions and data use agreements.
- Ethical implications include ensuring representation of subpopulations, avoiding stigmatization by over- or under-sampling certain groups, and preserving respondent confidentiality in small-area reporting (confidentiality of SA1/SA2 data).
Recap: core concepts to remember for exams and practice
- Core definitions: target population, survey population, sampling frame, sample, respondent, and sample design.
- The NZ geographic hierarchy: regions → territorial authorities → SA2 → SA1 → mesh blocks; nested and hierarchical, with PSUs typically being the coarsest sampled units.
- Stratified sampling vs cluster sampling:
- Stratification: sample from every stratum; increases representativeness and precision.
- Clustering: sample a subset of clusters; reduces travel and logistical burden; introduces design effects and intra-cluster correlation.
- Primary sampling units (PSUs) and stages: the largest units sampled first; subsequent stages sample within PSUs; multi-stage designs are common in surveys.
- Inclusion probabilities and weights: overall probability is the product of stage probabilities, and weights are the inverse of that product; proper weighting is essential for unbiased population estimates.
- Analyzing survey data requires software aware of the design; mis-specification leads to biased results.
- Causality in surveys is limited; AB testing and embedded experiments can provide stronger causal evidence in some cases.
- Real-world trade-offs: cost vs precision; larger clusters reduce fieldwork but can reduce precision; oversampling, frame quality, and nonresponse must be managed.