NZ Adverse Events Study - Sampling Design, Weights, and InsightLite workflow
NZ quality of healthcare study: adverse events in public hospitals (1998)
Objective: examine the occurrence, impact, and preventability of adverse events (AEs) in New Zealand public hospitals.
Population and sampling frame:
Admissions to NZ public hospitals with more than 100 beds in 1998, across tertiary and secondary hospitals.
The population included 20 such hospitals; the six largest are tertiary hospitals, which handle more serious cases and have larger capacities.
The study sampled from hospital admissions in tertiary and secondary public hospitals.
Sample details:
Sample size: approximately 6,500 admissions from about 700,000 admissions in the year.
Hospitals sampled:
All six tertiary hospitals.
Four secondary hospitals from those with >300 beds.
Three secondary hospitals from those with 100–300 beds.
For each sampled hospital, about 500 admissions were collected; with 13 hospitals in total (6 tertiary + 7 secondary), this yields the roughly 6,500 admissions.
Stratification and clustering:
Stratified sampling by hospital type (three strata):
Stratum 1: tertiary hospitals (sample all hospitals in this stratum).
Stratum 2: secondary hospitals >300 beds (sample a fraction).
Stratum 3: secondary hospitals 100–300 beds (sample a fraction).
Clustering: hospitals act as clusters; admissions are sampled within hospitals.
Primary sampling units (PSUs): hospitals themselves.
Design described: a stratified one-level cluster sampling design with 13 clusters (PSUs).
Why stratify and oversample certain strata:
Oversampling tertiary hospitals ensures enough data on serious cases.
Including hospitals of various sizes ensures representation across hospital types and sizes.
If only half of secondary hospitals were sampled at random, smaller hospitals might be underrepresented.
Data structure and variables:
Key variables collected:
Sex, AE score, age, ethnicity group, admission type, MDC (Major Diagnostic Category) code, stay (length of stay), deprivation decile, etc.
Stay: number of days in hospital.
AE score: categorical (no AE, evidence of AE, definite AE).
AE score handling:
The AE score was not a simple yes/no in the original data; for analysis it was simplified to a binary variable (evidence of an adverse event: yes/no).
Hospitals collected AE data as part of patient care, not specifically for this survey, so the raw variable was not recorded as a simple yes/no.
Weights and identifiers:
Weight variable: sampling weight (not patient weight) used to adjust for unequal probabilities of selection.
Stratum ID: denotes hospital type strata.
Cluster ID: hospital identifier within strata.
Additional variables: deprivation decile, MDC code, stay, etc.
Data handling in InsightLite (survey statistics software):
Accessing survey design:
InsightLite has a Survey Design dialog (dataset menu) where you specify:
Strata variables, cluster variables, and weighting variables.
InsightLite estimates population sizes (e.g., ~699,095) as a check against the specified design.
Design diagnostics:
It reports the design (e.g., stratified one-level cluster sampling with 13 clusters) and the number of PSUs per stratum (e.g., Stratum 1: 6 PSUs, Stratum 2: 4 PSUs, Stratum 3: 3 PSUs).
It uses the specified design to compute population estimates and standard errors that reflect the survey design.
Two-stage vs one-stage description:
The data are actually two-stage (sampling hospitals then admissions within hospitals), but often only the first stage is described for analysis (i.e., one-stage sampling design) to simplify inference. The term "with replacement" is used in jargon for this approximation, even though sampling is not with replacement.
Design interpretation:
After specifying the design, summaries (means, medians, etc.) for variables like stay take the survey design into account.
Population estimates are the point estimates; standard errors (and confidence intervals) are computed using the complex-sample design.
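The design-adjusted standard errors described above can be illustrated outside InsightLite. The following is a minimal numpy sketch (not InsightLite's actual code) of the standard "with replacement" first-stage approximation: the variance of a weighted mean is estimated from the spread of weighted PSU totals within each stratum. Variable names (`stratum`, `psu`, `w`, `y`) are assumptions for illustration.

```python
import numpy as np

def weighted_mean_se(y, w, stratum, psu):
    """Design-based SE of the weighted (Hajek) mean under the usual
    'with replacement' first-stage approximation: variability is
    estimated from weighted score totals per PSU within each stratum."""
    y, w = np.asarray(y, float), np.asarray(w, float)
    stratum, psu = np.asarray(stratum), np.asarray(psu)
    ybar = np.sum(w * y) / np.sum(w)      # weighted mean
    resid = w * (y - ybar)                # linearized scores
    var = 0.0
    for h in np.unique(stratum):
        in_h = (stratum == h)
        # weighted score total for each PSU in stratum h
        totals = np.array([resid[in_h & (psu == c)].sum()
                           for c in np.unique(psu[in_h])])
        n_h = len(totals)
        if n_h > 1:
            var += n_h / (n_h - 1) * np.sum((totals - totals.mean()) ** 2)
    return ybar, np.sqrt(var) / np.sum(w)
```

Because the between-PSU variation drives the variance, designs with few PSUs per stratum (like the 6/4/3 split here) have limited degrees of freedom for variance estimation.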
Replicate weights option:
InsightLite offers an option to specify design with replicate weights (related to bootstrap-like methods) for variance estimation.
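InsightLite's internal replicate-weight method is not documented here; as a sketch under the assumption of bootstrap-style replicate weights, the variance of a weighted mean can be estimated from the spread of the replicate estimates around the full-sample estimate:

```python
import numpy as np

def replicate_variance(y, w, rep_weights):
    """Bootstrap-style replicate-weight variance for a weighted mean:
    recompute the estimate under each replicate weight set, then take
    the mean squared deviation from the full-sample estimate.
    (Scaling conventions differ across replicate schemes.)"""
    y, w = np.asarray(y, float), np.asarray(w, float)
    theta_hat = np.sum(w * y) / np.sum(w)
    theta_reps = np.array([np.sum(rw * y) / np.sum(rw)
                           for rw in np.asarray(rep_weights, float)])
    return theta_hat, np.mean((theta_reps - theta_hat) ** 2)
```

The appeal of replicate weights is that the same recipe works for almost any statistic, not just means, without deriving a linearization formula.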
Using design information in visuals:
Graphs (e.g., scatter plots) are weighted; a weighted scatter plot shows population-level distribution rather than a simple sample distribution.
Example of a descriptive analysis in this study:
Descriptive plots: histogram and box plot for stay, using survey design adjustments; box plot reveals skew with many short stays and a tail for long stays.
Population histograms and box plots are produced with weights to reflect population distribution, not the unweighted sample.
Design effects are printed to indicate efficiency loss due to complex sampling.
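The weighted population histogram can be mimicked with numpy's `weights` argument; the stay values and weights below are invented for illustration:

```python
import numpy as np

# Hypothetical length-of-stay sample with its sampling weights.
stay    = np.array([1, 1, 2, 3, 5, 8, 13, 30])           # days in hospital
weights = np.array([100, 100, 100, 100, 1900, 1900, 1900, 1900])

# Weighted counts estimate how many *population* admissions fall in
# each bin, not how many sampled admissions do.
counts, edges = np.histogram(stay, bins=[0, 2, 7, 35], weights=weights)
print(counts)        # population-scale counts per bin
print(counts.sum())  # total equals the sum of weights (estimated pop. size)
```

An unweighted histogram of the same data would overstate the share of admissions from oversampled hospitals.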
Key concept: sampling weights and unequal probabilities of selection
Why unequal probabilities arise:
Stratified sampling: different strata may have different sampling fractions (oversampling for some strata to improve estimates).
Cluster sampling: sampling only a subset of clusters (hospitals) introduces unequal probabilities across units within and across strata.
Oversampling smaller or minority subgroups (e.g., Maori or Pacific peoples) to obtain adequate subgroup information.
Consequences of ignoring unequal probabilities:
Biased population estimates and incorrect standard errors; the sample would be representative of the oversampled subgroups, not the whole population.
Remedies: use sampling weights that are the reciprocals of selection probabilities to reweight observations to population totals.
Formal definitions and formulas:
Simple random sampling probability and weight:
If you randomly select n objects from a population of N, the selection probability is p = n/N.
The corresponding sampling weight is the reciprocal: w = N/n = 1/p.
For multi-stage sampling (stratum, PSU, unit), the overall selection probability for unit i is the product of the stage-wise probabilities: p_i = p_{\text{PSU} \mid \text{stratum}} \times p_{\text{unit} \mid \text{PSU}}.
The weight is the reciprocal of this overall probability: w_i = 1/p_i.
Weighted mean (population mean) estimator using weights:
\bar{y}_w = \frac{\sum_{i=1}^n w_i y_i}{\sum_{i=1}^n w_i}
Population totals and other statistics can be similarly estimated with weights, e.g., the total as \tilde{T}_w = \sum_{i=1}^n w_i y_i.
Design effect (DEFF):
Indicates loss of information due to complex sampling relative to simple random sampling: \text{DEFF} = \mathrm{Var}_{\text{design}}(\bar{y}_w) / \mathrm{Var}_{\text{SRS}}(\bar{y}).
A DEFF > 1 means the design is less efficient than SRS; the effective sample size is roughly n_{\text{eff}} = n / \text{DEFF}.
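The formulas above (SRS probability and weight, weighted mean, effective sample size) can be sketched directly:

```python
def srs_weight(N, n):
    """Under SRS, each unit's selection probability is p = n/N,
    so its sampling weight is the reciprocal, N/n."""
    p = n / N
    return 1 / p                                  # = N / n

def weighted_mean(y, w):
    """Weighted (Hajek) mean: sum(w*y) / sum(w)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def effective_sample_size(n, deff):
    """DEFF = Var_design / Var_SRS; n_eff = n / DEFF."""
    return n / deff
```

With the NZ study's rough numbers, `srs_weight(700_000, 6_500)` gives an average weight near 108 admissions represented per sampled admission, and a DEFF of 2 would halve the effective sample size.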
Worked example (illustrative): oversampling Aucklanders by housing value strata
Suppose two strata:
Stratum A (expensive houses): population size N_A = 50,000; sample size n_A = 500; p_A = 0.01; w_i = N_A/n_A = 100 for observations in Stratum A.
Stratum B (normal houses): population size N_B = 950,000; sample size n_B = 500; p_B ≈ 0.000526; w_i = N_B/n_B = 1,900 for observations in Stratum B.
Let the mean income in Stratum A be \bar{y}_A = 180,000 and in Stratum B be \bar{y}_B = 40,000.
Weighted overall mean using per-observation weights:
\bar{y}_w = \frac{(\sum w_i y_i)_{\text{A}} + (\sum w_i y_i)_{\text{B}}}{(\sum w_i)_{\text{A}} + (\sum w_i)_{\text{B}}}
With the given numbers, this reweights to population proportions: (50,000 × 180,000 + 950,000 × 40,000) / 1,000,000 = 47,000, versus a naive unweighted sample mean of (180,000 + 40,000)/2 = 110,000.
The key takeaway: if you naively average raw sample values without weights, you’ll bias toward oversampled strata; weights correct this bias.
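The bias correction in the example above can be verified numerically. The sketch below assumes, for illustration, that every sampled household sits exactly at its stratum's mean income:

```python
import numpy as np

# Stratum A: expensive houses, N=50,000,  n=500 -> weight 100 each.
# Stratum B: normal houses,    N=950,000, n=500 -> weight 1,900 each.
y = np.r_[np.full(500, 180_000.0), np.full(500, 40_000.0)]
w = np.r_[np.full(500, 100.0),     np.full(500, 1_900.0)]

naive    = y.mean()                   # ignores the design
weighted = np.sum(w * y) / np.sum(w)  # design-corrected

print(naive)     # 110000.0 -- biased toward the oversampled stratum
print(weighted)  # 47000.0  -- matches (N_A*180k + N_B*40k) / 1,000,000
```

Half the unweighted sample comes from Stratum A even though it is only 5% of the population, which is exactly why the naive mean lands so far above the weighted one.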
Practical interpretation for this NZ study:
The NZ study illustrates how oversampling larger tertiary hospitals and certain secondary hospitals affects estimates if unaccounted for.
Weights allow estimation that generalizes to the population of hospital admissions in 1998, not just the sampled admissions.
Design effects reveal how much information is lost due to clustering and stratification, guiding interpretation of precision and required sample sizes for future work.
Why practitioners care about these concepts:
Real-world data sets often come from complex sampling designs; ignoring design leads to biased estimates and misleading inference.
Software like InsightLite (and rSurvey) provides built-in support for complex survey design to produce valid population estimates and standard errors.
Understanding and documenting the sampling design (strata, PSUs, weights) is crucial for reproducibility and correct interpretation of results.
Practical workflow recap (as described):
Import dataset and preview to verify structure.
In the software, specify the survey design by indicating:
Strata variable (e.g., stratum_id).
PSU/cluster variable (e.g., cluster_id).
Weight variable (e.g., weight or W).
Create the design; the software uses it to adjust all summaries, graphs, and models.
Use descriptive analyses (e.g., histograms, box plots) and summaries (means, medians, standard errors) with design-adjusted calculations.
If nested sampling is a concern (the same PSU labels appearing in more than one stratum), decide whether to relax the check that PSU labels are unique across strata; in NHANES-like datasets such reuse occurs, and the check can be turned off when repeated labels are known to denote distinct PSUs in different strata.
Connections to broader principles:
Sampling theory: unequal probabilities require weights to produce unbiased population estimates.
Foundational stats: many estimators (means, totals, regression coefficients) can be written as weighted sums; weights extend to almost all statistical methods under complex survey design.
Real-world relevance: oversampling minority or hard-to-reach groups improves information for those groups while preserving population representativeness through weighting; ethically and practically important in health surveys.
Final takeaways:
Stratified and cluster sampling are common in health-data surveys and require weights for valid inferences.
Weights reflect selection probabilities; use them to obtain population-representative estimates and correct standard errors.
Tools like InsightLite help apply these concepts by integrating design information into all analyses, including descriptive stats, visuals, and models.
Always report the sampling design, weights, and design effects when presenting survey-based results to ensure transparent interpretation.
Summary of key concepts and practical formulas
Stratified vs. cluster sampling
Stratified sampling partitions population into strata and samples within strata; allows oversampling of important subgroups and ensures representation across strata.
Cluster sampling samples groups (clusters) (e.g., hospitals) and then samples units within clusters; reduces fieldwork cost but introduces design effects due to within-cluster similarity.
Weights and unequal probabilities of selection
Probability of selection for a unit i in a simple two-stage example: p_i = p_{\text{PSU}} \times p_{\text{unit} \mid \text{PSU}}.
Weight for unit i: w_i = 1/p_i.
For a simple random subset: p = n/N, so w = N/n.
Weighted estimators
Weighted mean:
\bar{y}_w = \frac{\sum_{i=1}^n w_i y_i}{\sum_{i=1}^n w_i}
Design effect and effective sample size
Design effect: \text{DEFF} = \mathrm{Var}_{\text{design}}(\bar{y}_w) / \mathrm{Var}_{\text{SRS}}(\bar{y}).
Effective sample size: n_{\text{eff}} = n / \text{DEFF}.
Practical example recap (Auckland housing value strata)
Two strata: expensive (N_A = 50,000; n_A = 500) and normal (N_B = 950,000; n_B = 500).
Weights: w_A = N_A/n_A = 100; w_B = N_B/n_B = 1,900.
Weighted mean uses \sum w_i y_i divided by \sum w_i, yielding a population-representative income estimate.
Notes on the NHANES/InsightLite workflow reference
In InsightLite, the design is specified via Dataset → Survey Design:
Stratum variable (e.g., stratumid) and cluster/PSU variable (e.g., clusterid).
Weight variable (e.g., weight or W).
Option to specify design with replicate weights or to read the design from the file or to fix up weights (Post-stratify).
When loading a dataset, InsightLite remembers the survey design for subsequent analyses until a new dataset is loaded, ensuring that all summaries, graphs, and models account for the complex design.
Nested sampling (PSUs repeated across strata) can occur in some surveys; InsightLite allows turning off the check that PSUs are unique across strata if the data are correctly representing nested structures; otherwise, keep the check on to catch data-entry issues.
Practical takeaway: always verify that the survey design is correctly specified and that the weights, strata, and clusters align with the data documentation to obtain valid population inferences.