Research & Sampling Design – Comprehensive Study Notes
Meaning and Scope of Research Design
- "Research Design" = overall blueprint for a study.
- Arranges the “what, where, when, how much & by what means” of data collection/analysis.
- Formal definition: “arrangement of conditions for collection & analysis of data in a manner that combines relevance to research purpose with economy in procedure.”
- Covers researcher actions from framing the hypothesis → operationalising variables → collecting & analysing data → writing final report.
Typical Design‐Decision Questions
- What is the study about & why is it being done?
- Where & when will it be carried out (time horizon, geographic setting)?
- What data are required & where can they be found?
- What period(s) will be covered?
- What sample design will be followed?
- Which data-collection techniques will be used?
- How will data be analysed? In what reporting style?
Four Sub-Designs Contained in the Master Design
- Sampling design – procedure for selecting study elements.
- Observational (data) design – conditions under which observations are made (who, when, where, with what instrument).
- Statistical design – how many observations, which analyses.
- Operational design – field and administrative procedures that implement the other three.
Need / Importance of Research Design
- Ensures smooth, efficient conduct → maximises information while minimising time, effort & cost.
- Provides firm foundation for reliability & validity of results.
- Serves as architectural “blue-print,” analogous to a house plan before construction.
- Specific benefits:
- Reduces inaccuracy & bias; improves efficiency & reliability.
- Minimises waste of resources; guides resource allocation.
- Clarifies requirements for hypothesis testing & data needs.
- Communicates overview to other experts; keeps project on course.
Features of a Good Design
- Flexible, appropriate, efficient, economical.
- Minimises bias & experimental error; maximises reliability.
- Generates maximal information; allows examination of multiple aspects of problem.
- Suitability is context-specific: one design rarely fits all problems.
Criteria normally considered:
- Means of obtaining information.
- Skill/availability of researcher & staff.
- Objectives & nature of problem.
- Time & money constraints.
Key Concepts in Research Design
- Variable – measurable concept; can be continuous (age, income) or discrete (number of children).
- Independent Variable (IV) – antecedent; manipulated/predictor.
- Dependent Variable (DV) – outcome/consequence affected by IV.
- Example: Age → Height (Age = IV, Height = DV).
- Extraneous Variable – IV not related to study purpose but influences DV → introduces experimental error.
- Control – design strategies to minimise extraneous influence.
- Confounded Relationship – DV is influenced by extraneous variables, so the IV–DV relation is clouded.
- Research Hypothesis – predictive statement linking at least one IV to one DV; tested via scientific method (e.g., “e-Learning enhances teaching–learning experience”).
- Hypothesis-Testing Research
- Experimental: IVs actively manipulated.
- Non-experimental: IVs not manipulated (ex-post-facto, survey, etc.).
- Experimental & Control Groups – experimental receives novel treatment; control receives usual conditions.
- Treatments – specific conditions applied to groups.
- Experiment – process of testing statistical hypothesis.
- Absolute vs comparative experiments.
- Experimental Unit – basic plot/subject on which treatment is applied.
Traditional Categories of Research Design
| Objective | Appropriate Design | Typical Key Words |
|---|---|---|
| Gain background, define terms, clarify problems, set priorities | Exploratory | Qualitative, flexible, informal |
| Describe who, what, where, when, how | Descriptive / Diagnostic | Cross-sectional, longitudinal, survey |
| Determine causality, test “if-then”, evaluate effects | Causal | Experiment, manipulation, control |
Exploratory Research
- Unstructured, informal; undertaken when little is known.
- Uses: secondary data analysis, experience surveys, case analysis, focus groups, projective techniques.
Descriptive Research
- Answers who/what/where/when/how (not why).
- Cross-sectional studies → one-time measurement; sample surveys often online.
- Longitudinal studies → repeated measures; panels or repeated independent samples.
Causal Research (Experiments)
- Explains phenomena via conditional statements “If X then Y.”
- Relies on manipulation of IVs & control of extraneous variables.
- Two settings: Laboratory (high control, artificial) vs Field (natural, realistic).
- Symbols (conventional experimental-design notation):
- O = observation/measurement of DV.
- X = exposure to / manipulation of IV (treatment).
- R = random assignment to groups.
- E = experimental (treatment) effect.
- Simple experimental patterns:
- After-Only: X O₁.
- One-Group Before–After: O₁ X O₂; E = O₂ − O₁.
- Before–After with Control:
- Experimental: O₁ X O₂.
- Control: O₃, O₄; E = (O₂ − O₁) − (O₄ − O₃).
Internal vs External Validity
- Internal – observed DV change really due to IV.
- External – results generalise to real-world.
R. A. Fisher’s Three Principles of Experimental Design
- Replication – repeat treatments to estimate error & increase accuracy.
- Randomisation – assign treatments randomly to protect against extraneous influences.
- Local Control (Blocking) – deliberately vary known nuisance factors so their variation can be measured & removed.
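Fisher's three principles can be sketched in a few lines of Python. This is an illustrative sketch, not from the notes: `randomized_block_assignment` and the block/treatment labels are made up, but the logic shows how blocking and randomisation combine.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=0):
    """Assign every treatment once within each block (local control),
    with the order randomised independently per block (randomisation).
    Running the whole layout repeatedly gives replication."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        order = treatments[:]
        rng.shuffle(order)  # random order of treatments within this block
        layout[block] = order
    return layout

# Hypothetical field trial: 3 soil strips (blocks), 4 fertiliser treatments.
plan = randomized_block_assignment(["strip1", "strip2", "strip3"],
                                   ["A", "B", "C", "D"])
for block, order in plan.items():
    print(block, order)
```

Because each block contains every treatment exactly once, block-to-block (nuisance) variation can later be separated from treatment variation in the analysis.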
Informal Experimental Designs (no specific statistical layout)
- Before-After without Control.
- After-Only with Control.
- Before-After with Control.
Formal Experimental Designs
- Completely Randomised Design (CRD) – uses replication & randomisation; analysed by one-way ANOVA.
- Randomised Block Design (RBD) – adds blocking (local control); analysed by two-way ANOVA.
- Latin Square (LS) Design – controls two blocking factors (rows & columns); each treatment appears once per row & column.
- Factorial Designs – study two or more factors simultaneously.
- Simple factorial.
- Multifactor (e.g., 2 × 2 × 2) – eight cells, labelled Cell 1 … Cell 8.
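The Latin Square constraint (each treatment once per row and once per column) can be met by a simple cyclic construction. A minimal sketch; `latin_square` is an illustrative helper, not part of the notes:

```python
def latin_square(treatments):
    """Build an n x n Latin square by cyclic shifts: each treatment
    appears exactly once in every row and every column."""
    n = len(treatments)
    return [[treatments[(row + col) % n] for col in range(n)]
            for row in range(n)]

square = latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
```

In practice the rows, columns, and the square itself are then randomised before use, so the cyclic pattern does not coincide with any systematic field trend.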
Sampling Design: Definitions & Options
- Sample Design – definite plan for obtaining sample from a population; specifies selection technique.
- Census – complete enumeration of every element (e.g., Indian Census every 10 yrs).
- Advantages: intensive detail; high accuracy.
- Disadvantages: cost, time, impractical for large populations.
- Sample Survey – study subset; conclusions generalised to population.
- Advantages: economical, quicker, indispensable, checks on census.
The Sampling Design Process
- Define the population (geography, demographics, usage, awareness, etc.).
- Choose / construct the sampling frame (list of elements).
- Select sampling technique(s) – probability or non-probability.
- Determine sample size.
- Execute sampling & ensure adherence to plan.
Criteria for Selecting a Sampling Procedure
- Balance two costs:
- Data-collection cost.
- Cost of incorrect inference.
- Consider potential systematic bias & sampling error.
- Major sources of systematic bias:
- Inappropriate frame, defective measuring device, non-response, observation effects, natural reporting bias.
- Sampling error is random; falls in proportion to 1/√n as sample size n grows.
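The 1/√n behaviour can be checked empirically. The simulation below is a sketch with assumed parameters (uniform population, 2,000 replications): quadrupling n should roughly halve the standard error of the sample mean.

```python
import random
import statistics

def se_of_mean(n, reps=2000, seed=1):
    """Empirical standard error of the sample mean for samples of size n
    drawn from a uniform(0, 1) population."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.random() for _ in range(n))
             for _ in range(reps)]
    return statistics.stdev(means)

# n = 25 vs n = 100: the second error should be about half the first.
print(se_of_mean(25), se_of_mean(100))
```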
Characteristics of a Good Sample Design
- Produces truly representative sample.
- Small sampling error.
- Viable within budget & logistics.
- Controls systematic bias effectively.
- Allows results to generalise to population with reasonable confidence.
Classification of Sampling Techniques
Probability Sampling (every element has a known, non-zero chance of selection)
- Simple Random Sampling (SRS)
- Each element has equal selection probability n/N, where N = population size.
- Procedure: number the elements 1…N, then draw n random numbers without replacement.
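The SRS procedure is a one-liner with Python's standard library; N = 500 and n = 25 here are illustrative:

```python
import random

population = list(range(1, 501))        # elements numbered 1..N, here N = 500
sample = random.sample(population, 25)  # each element equally likely, no repeats
print(sorted(sample))
```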
- Systematic Sampling
- Skip interval k = N/n; pick a random start r (1 ≤ r ≤ k), then take elements r, r + k, r + 2k, …
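The same procedure as a sketch; `systematic_sample` is an illustrative helper, and for simplicity the skip interval is taken as ⌊N/n⌋:

```python
import random

def systematic_sample(N, n, seed=None):
    """Every k-th element after a random start, with skip interval k = N // n."""
    k = N // n
    start = random.Random(seed).randint(1, k)   # random start in 1..k
    return [start + i * k for i in range(n)]

print(systematic_sample(1000, 50, seed=7))
```

Note the caveat that motivates the random start: if the frame itself has a periodic pattern with period k, a systematic sample can be badly biased.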
- Stratified Sampling
- Divide population into homogeneous strata; draw SRS within each.
- Sample per stratum may be proportionate or disproportionate.
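Proportionate allocation can be sketched in a few lines; the stratum names and sizes below are hypothetical, and the largest-remainder rounding is one common way to make the shares sum exactly to n:

```python
def proportionate_allocation(strata_sizes, n):
    """Split total sample n across strata in proportion to stratum size,
    using largest-remainder rounding so allocations sum exactly to n."""
    N = sum(strata_sizes.values())
    raw = {s: n * size / N for s, size in strata_sizes.items()}
    alloc = {s: int(r) for s, r in raw.items()}
    # hand out the units lost to truncation, biggest fractional part first
    leftovers = sorted(raw, key=lambda s: raw[s] - alloc[s], reverse=True)
    for s in leftovers[: n - sum(alloc.values())]:
        alloc[s] += 1
    return alloc

# Hypothetical strata: draw 200 respondents from a 10,000-person population.
print(proportionate_allocation({"urban": 6000, "rural": 3000, "tribal": 1000}, 200))
```

An SRS is then drawn independently inside each stratum with its allocated size.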
- Cluster Sampling
- Divide into heterogeneous clusters; randomly select clusters, then either take all units (one-stage) or sample within (two-stage/multistage).
- Area Sampling – clusters are geographic areas (blocks, districts, etc.).
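A one-stage cluster draw, sketched with hypothetical city blocks as the clusters (the helper name and data are illustrative):

```python
import random

def one_stage_cluster_sample(clusters, m, seed=0):
    """Randomly select m clusters, then take every unit inside them (one-stage)."""
    rng = random.Random(seed)
    chosen = rng.sample(list(clusters), m)          # sample clusters, not units
    return [unit for c in chosen for unit in clusters[c]]

# Hypothetical area clusters: city blocks with their households.
blocks = {"block1": ["h1", "h2"], "block2": ["h3", "h4", "h5"], "block3": ["h6"]}
print(one_stage_cluster_sample(blocks, 2))
```

A two-stage version would replace "take every unit" with an SRS within each chosen cluster.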
Stratified vs Cluster (Quick Contrast)
| Feature | Stratified | Cluster |
|---|---|---|
| Subdivision | Few strata, many elements each | Many clusters, few elements each |
| Within group | Homogeneous | Heterogeneous |
| Between groups | Heterogeneous | Homogeneous |
| Selection | Sample element | Sample cluster |
| Goal | Increase precision (↓error) | Increase efficiency (↓cost) |
Non-Probability Sampling (non-random selection)
- Convenience – easy access.
- Judgment (Purposive) – researcher selects what seems representative.
- Quota – sample mirrors population on selected control characteristics (gender, age, etc.). Example calculation for 3,000 respondents with sex/age/education proportions.
- Snowball – initial respondents refer others; useful for rare traits.
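A quota-cell calculation in the spirit of the 3,000-respondent example above. The sex and age proportions below are hypothetical stand-ins, since the notes' actual figures are not reproduced here:

```python
# Hypothetical control-characteristic proportions for a 3,000-respondent survey.
sex = {"male": 0.52, "female": 0.48}
age = {"18-34": 0.40, "35-54": 0.35, "55+": 0.25}
total = 3000

# One quota cell per sex x age combination, mirroring the population shares.
quotas = {(s, a): round(total * ps * pa)
          for s, ps in sex.items()
          for a, pa in age.items()}
print(quotas)
```

Interviewers fill these cell counts, but the selection of individuals within each cell stays non-random, which is why quota sampling remains non-probability.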
Choosing Probability vs Non-Probability
| Factor | Favors Non-Prob | Favors Prob |
|---|---|---|
| Research nature | Exploratory | Conclusive |
| Sampling vs nonsampling error | Non-sampling bigger | Sampling bigger |
| Population variability | Homogeneous | Heterogeneous |
| Statistical analysis | Unfavorable | Favorable |
| Operational ease/cost | Favorable | Unfavorable |
Strengths & Weaknesses Summary (Selected Techniques)
- Probability → results projectable, sampling error computable; but higher cost & time.
- Non-probability → cheaper, quicker; but unknown sampling error, limited generalisability.
Sampling & Non-Sampling Errors
- Sampling Error – due to studying a sample rather than the whole population; shrinks in proportion to 1/√n.
- Non-Sampling Error – any other error (coverage, non-response, measurement, processing).
- Group A: Preparation errors (e.g., inadequate frame).
- Group B: Data-collection errors (interviewer bias, respondent misreporting).
- Group C: Processing errors (editing, coding, analysis mistakes).
Graphical relation: as sample size increases, sampling error ↓ but non-sampling error may ↑ after a point.
Sample Size Determination
Why Determine Sample Size?
- Ensure sufficient power/precision without wasting resources.
- Too small → study cannot detect true effects; too large → unnecessary cost, may lose accuracy via non-sampling errors.
Statistical Concepts
- Random Error ↔ precision (reliability); reduced by larger n.
- Systematic Error (Bias) ↔ accuracy (validity); reduced by better design.
- Null (H₀) vs Alternative (H₁) Hypothesis.
- Type I Error (α) – rejecting a true H₀; commonly α = 0.05.
- Type II Error (β) – failing to reject a false H₀.
- Power (1 − β) – probability of detecting a true effect.
- Effect Size – magnitude of difference/association to be detected.
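How power depends on n can be made concrete with a normal-approximation calculation. A sketch only: `power_two_proportions` and the 0.40 vs 0.50 proportions are illustrative, not from the notes.

```python
from statistics import NormalDist

def power_two_proportions(p1, p2, n, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test for proportions,
    n subjects per group (normal approximation, unpooled variance)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)                 # critical value
    se = ((p1 * (1 - p1) + p2 * (1 - p2)) / n) ** 0.5  # SE of the difference
    return z.cdf(abs(p1 - p2) / se - z_alpha)

# Fixed effect size (0.40 vs 0.50): power climbs as n per group grows.
print(power_two_proportions(0.40, 0.50, 100))
print(power_two_proportions(0.40, 0.50, 400))
```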
Basic Formula: Large Populations (>10,000)
- n = z² p q / e², where:
- z = z-value for desired confidence (1.96 for 95%).
- p = expected proportion possessing attribute; if unknown use 0.5.
- q = 1 − p.
- e = acceptable margin of error (precision), e.g., 0.05.
Example: n = (1.96² × 0.5 × 0.5) / 0.05² = 384.16 ≈ 385.
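The formula as a one-line helper (rounding up, since respondents come in whole numbers); `cochran_n` is an illustrative name:

```python
import math

def cochran_n(z=1.96, p=0.5, e=0.05):
    """n = z^2 * p * q / e^2, rounded up to whole respondents."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / e ** 2)

print(cochran_n())               # the worked example: 385
print(cochran_n(p=0.3, e=0.04))  # a rarer attribute, tighter precision
```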
Finite Population Correction (<10,000)
- n′ = n / (1 + n/N), where N = population size and n comes from the large-population formula.
Example (illustrative values): n = 385, N = 2,000 → n′ = 385 / (1 + 385/2000) ≈ 323.
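The correction as code; the population size N = 2,000 is an illustrative choice:

```python
def fpc_adjust(n, N):
    """Finite population correction: n' = n / (1 + n/N)."""
    return n / (1 + n / N)

# The 385 from the large-population formula, in a population of 2,000.
print(round(fpc_adjust(385, 2000)))   # about 323
```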
Comparing Two Equal Groups (Proportions)
- n per group = (z_α/2 + z_β)² × [p₁(1 − p₁) + p₂(1 − p₂)] / (p₁ − p₂)².
- Illustrative: detecting p₁ = 0.40 vs p₂ = 0.50 at 95% confidence and 80% power needs ≈ 385 per group.
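A sketch of the per-group calculation using standard normal quantiles; the 0.40/0.50 proportions and the 80% power default are assumptions for illustration, not figures from the notes:

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group n to detect p1 vs p2 with a two-sided z-test for proportions."""
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)   # ~1.96 at 95% confidence
    z_b = z.inv_cdf(power)           # ~0.84 at 80% power
    var = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_a + z_b) ** 2 * var / (p1 - p2) ** 2)

print(n_per_group(0.40, 0.50))
```

Raising the required power or shrinking the detectable difference drives n up sharply, since the difference enters the denominator squared.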
Relation of Sample Size, Error & Power
- n ∝ z² / e²: halving the allowable error e quadruples the required n (generic principle).
- Larger → higher power, smaller confidence interval width.
Practical Tools for Sample Size & Power
- Ready-made tables (e.g., incidence rate table with relative precision).
- Nomograms (graphical; need control % & desired % change).
- Software: Epi-Info, nQuery, Power & Precision, Sample, STATA, SPSS.
Summary & Practical Implications
- A research or sampling design is the strategic plan that binds objectives, data needs, collection methods, analysis, cost & time.
- Good design and sampling choices minimise both bias & variance, ensuring results are valid, reliable, precise and economical.
- Understanding experimental layouts, probability vs non-probability sampling, and error structures is essential before fieldwork begins.
- Sample-size calculations anchor the study’s statistical validity; they hinge on effect size, desired confidence, allowable error, power, and population size.
- Ethical & practical stakes: over- or under-sizing wastes resources, compromises findings, or burdens participants unnecessarily.