Study Notes on Sampling, Observation, Surveys, Ethics, and Data Interpretation
Population, sampling, and generalizability
- Population vs. sample definitions:
- Population: the entire group of interest (e.g., all gabbling college students).
- Sample: a subset drawn from the population for study.
- Gold standard in social science research: random sample
- Definition: every member of the population has an equal chance of being selected.
- Intuition: lottery-like selection where everyone is in the hat and randomly drawn.
- Implication: data from a random sample are typically more generalizable to the population.
- Real-world challenge: truly random samples are often not achieved
- People screen calls, block unknown callers, etc.
- Telephone survey limitations: many people use only cell phones, so landline-based samples underrepresent younger generations and non-owners.
- If sampling via random digit dialing (house phones) or phone books, you may oversample older populations and miss younger, cell-phone-only respondents.
- Example population and sampling risk
- Population: study habits of gabbling college students.
- If sample collection is biased (e.g., surveying only at the student union at noon, or only online students), the sample may not represent the entire student body.
- Consequence: biased results and limited generalizability.
- Haphazard vs. systematic sampling
- Haphazard (convenience) sample: survey people who happen to be nearby with no strategy.
- Limitation: often not representative of the population; useful for learning survey administration, not for making broader inferences.
- Systematic and stratified sampling: approaches students should know for more robust inference (referenced later).
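The contrast between random, stratified, and convenience sampling can be sketched in a few lines of Python. Everything here is hypothetical: the population of 10,000 students, the campus/online split, and the study-hour averages are invented for illustration, and the function names are not from any library.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 7,000 campus and 3,000 online students.
# Assumption for illustration only: online students study more on average.
population = (
    [{"mode": "campus", "hours": rng.gauss(10, 3)} for _ in range(7000)]
    + [{"mode": "online", "hours": rng.gauss(14, 3)} for _ in range(3000)]
)

def simple_random_sample(pop, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(pop, n)

def stratified_sample(pop, n, key, rng):
    """Sample each stratum in proportion to its share of the population.

    Rounding may make the total differ slightly from n for other splits.
    """
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        k = round(n * len(members) / len(pop))
        sample.extend(rng.sample(members, k))
    return sample

def convenience_sample(pop, n):
    """Take whoever is 'nearby': here, the first n campus students."""
    return [p for p in pop if p["mode"] == "campus"][:n]

def avg_hours(sample):
    return mean(p["hours"] for p in sample)

true_mean = avg_hours(population)
srs_mean = avg_hours(simple_random_sample(population, 500, rng))
strat_sample = stratified_sample(population, 500, "mode", rng)
strat_mean = avg_hours(strat_sample)
conv_mean = avg_hours(convenience_sample(population, 500))

print(f"population mean:    {true_mean:.2f}")
print(f"simple random:      {srs_mean:.2f}")
print(f"stratified:         {strat_mean:.2f}")
print(f"convenience sample: {conv_mean:.2f}")
```

Under these assumptions the random and stratified estimates land close to the population mean, while the convenience sample misses online students entirely and underestimates study hours.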
Classic sampling failure: Literary Digest, 1936
- Survey scale: about 2,000,000 surveys conducted to predict the presidential election.
- Prediction: Alfred Landon would win. Actual result: Landon carried only Maine and Vermont, and FDR won every other state.
- Why the failure?
- Sampling frame bias: respondents were drawn from lists of magazine subscribers, telephone users, and vehicle registrations.
- Context: during the Great Depression, car ownership and telephone access were not uniformly distributed; rural and poorer populations were underrepresented.
- Nonresponse bias: a large portion of those contacted did not respond, and respondents differed in meaningful ways from nonrespondents.
- Takeaway:
- Even huge samples can be biased if the sampling frame and response patterns miss important subgroups.
- Nonresponse bias remains a major challenge in modern surveys as well.
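A small simulation can illustrate why sample size does not rescue a biased frame. This is a sketch with invented numbers, not a reconstruction of the 1936 data: the electorate size, the share of wealthy voters, the support rates, and the frame and response probabilities are all assumptions chosen for illustration.

```python
import random

rng = random.Random(1936)  # fixed seed for reproducibility

def make_voter(rng):
    """Return (wealthy, supports_a) for one hypothetical voter."""
    wealthy = rng.random() < 0.30
    # Assumption: wealthy voters are less likely to support candidate A.
    supports_a = rng.random() < (0.35 if wealthy else 0.70)
    return wealthy, supports_a

electorate = [make_voter(rng) for _ in range(500_000)]

def share_for_a(voters):
    """Fraction of the given voters who support candidate A."""
    return sum(supports for _, supports in voters) / len(voters)

# Sampling frame bias: wealthy voters are far more likely to be on the lists.
frame = [v for v in electorate if rng.random() < (0.90 if v[0] else 0.20)]

# Nonresponse bias: opponents of A return the survey more often.
respondents = [v for v in frame if rng.random() < (0.20 if v[1] else 0.30)]

# A much smaller sample in which every voter has an equal chance.
small_random = rng.sample(electorate, 1_000)

true_share = share_for_a(electorate)
big_biased = share_for_a(respondents)
small_unbiased = share_for_a(small_random)

print(f"true support for A:            {true_share:.3f}")
print(f"biased poll (n={len(respondents)}):  {big_biased:.3f}")
print(f"random sample (n=1000):        {small_unbiased:.3f}")
```

With these assumed rates, the biased poll has tens of thousands of respondents yet points to the wrong winner, while the random sample of 1,000 lands near the true share.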
Observational methods and sampling ethics in fieldwork
- Observational research approaches
- Unobtrusive observation: participants do not know they are being observed; minimizes reactivity but raises ethical/privacy concerns.
- Obtrusive observation: researchers are involved and interacting; higher potential for changing behavior or influencing the scene.
- Participant observation and ethnography
- Participating observer: researchers engage with the group to gain deeper insight into culture and practices.
- Complete participant: researcher is fully embedded and identified as part of the group from the start.
- Reactivity: people act differently when they know they are being watched; ethnography often requires long immersion to restore natural behavior.
- Covert vs. overt research
- Covert: participants are unaware they are being studied; raises substantial ethical concerns around privacy and informed consent.
- Overt: participants are informed and consent to be studied; more ethically straightforward.
- Privacy and public vs private spaces
- Public settings (parks, malls) generally have fewer privacy expectations; observation is more permissible but still ethically guided.
- Private settings (homes, private offices) require explicit consent to observe or interview.
- Informed consent, consent withdrawal, and coercion
- Participants should consent to participate and know what the study entails.
- Consent can be withdrawn at any time; participants are not obliged to continue.
- For surveys, forced answering (no skip options) is unethical; participants should be able to skip questions.
- Special consent considerations
- Protected populations (children, incarcerated individuals, pregnant women, etc.) require extra safeguards because of a history of mistreatment in research.
- Incarcerated populations often have constrained autonomy; research must be tightly justified and tightly regulated.
Surveys and measurement issues
- Surveys basics
- A survey is a structured set of questions designed to elicit information.
- Operationalization: how a concept is defined and measured in the survey (wording, response options).
- Social desirability bias
- Respondents may tailor answers to be viewed favorably, especially in face-to-face interviews about sensitive topics (e.g., sexual behavior).
- Anonymity or computer-based surveys can reduce social desirability bias.
- Mode effects and question design
- Mode (in-person, phone, online, paper) influences how people respond.
- Sensitive questions may require anonymous or non-face-to-face formats to improve honesty.
- Data quality and validation checks
- Include attention checks and implausible-response questions to detect non-serious participation.
- If respondents report impossible values (e.g., a physically impossible amount of drug use), flag or discard their data; otherwise the reliability of the results is compromised.
- Incentives and data quality
- Paid surveys can incentivize hurried or dishonest responses; data quality may suffer.
- Deception and reporting in survey design
- Researchers may, in some cases, alter the framing or labels of surveys to improve participation (e.g., renaming a study to increase willingness to participate).
- Ethical considerations require transparency and minimizing harm.
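The data quality checks described above (attention checks, implausible-response items, impossible values) could be applied with a small screening function. This is a minimal sketch: the `Response` fields, the fictitious-drug item, and the 30-day reporting window are hypothetical choices, not part of any particular survey. Skipped questions are stored as `None` and are not penalized, since participants should be able to skip.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Response:
    respondent_id: int
    attention_check: str                 # item instructs: select "agree"
    days_used_last_30: Optional[int]     # None means the question was skipped
    used_fake_drug: Optional[bool]       # hypothetical fictitious-drug item

def quality_flags(r: Response) -> List[str]:
    """Return the list of data-quality problems found in one response."""
    flags = []
    if r.attention_check != "agree":
        flags.append("failed_attention_check")
    # A 30-day window cannot contain fewer than 0 or more than 30 days of use.
    if r.days_used_last_30 is not None and not 0 <= r.days_used_last_30 <= 30:
        flags.append("impossible_value")
    # Claiming to use a drug that does not exist suggests non-serious answers.
    if r.used_fake_drug:
        flags.append("implausible_response")
    return flags

responses = [
    Response(1, "agree", 3, False),
    Response(2, "agree", None, None),     # skipped items: allowed, no flag
    Response(3, "disagree", 45, True),
]

flagged = {r.respondent_id: quality_flags(r) for r in responses}
clean = [r for r in responses if not flagged[r.respondent_id]]
print(flagged)
```

Flagging rather than silently deleting keeps a record of how many responses were excluded and why, which can then be reported as a limitation.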
The ethics of information presentation
- Statistics and graphs can shape perception without lying about numbers
- Best and Shirley argue for critical consumption of data: beware of how data are framed to advance a narrative.
- Examples of misleading representation (conceptual, not exhaustive)
- Unemployment graphs: same data, different scales produce different visual stories
- A narrow y-axis range makes changes look dramatic; a wide range can flatten or hide them.
- Planned Parenthood graph (Congressional hearing): juxtaposing abortions with cancer screenings without proper scaling or context can mislead about trends.
- Infographics and political messaging: a very gradual real increase in graduation rates can be drawn to look steep; a careful, honest depiction would show gradual growth.
- Tax-cut visuals (Fox News example): framing of policy consequences can provoke a specific political stance without presenting a complete picture.
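The effect of axis scaling can be put in numbers: the same change occupies a very different share of the plot height depending on the y-axis range. The sketch below uses a hypothetical unemployment change from 5.0% to 5.5% and two made-up axis ranges; `visual_span` is an illustrative helper, not a plotting-library function.

```python
def visual_span(start, end, y_min, y_max):
    """Fraction of the plot's height covered by the change from start to end."""
    if y_max <= y_min:
        raise ValueError("y_max must exceed y_min")
    return abs(end - start) / (y_max - y_min)

# Hypothetical data: unemployment rises from 5.0% to 5.5%.
start, end = 5.0, 5.5

narrow = visual_span(start, end, y_min=4.75, y_max=5.75)  # truncated axis
wide = visual_span(start, end, y_min=0.0, y_max=10.0)     # zero-based axis

print(f"narrow axis: change fills {narrow:.0%} of the plot height")  # 50%
print(f"wide axis:   change fills {wide:.0%} of the plot height")    # 5%
```

The underlying numbers are identical in both cases; only the framing changes, which is why axis ranges should always be checked before reading a trend off a graph.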
- Correlation vs. causation
- Correlation does not imply causation; two things can move together due to a third variable or coincidence.
- Humorous examples of spurious correlations that are often cited:
- worldwide noncommercial space launches and sociology degrees awarded
- number of people electrocuted by power lines and marriage rate in Alabama
- Important reminder: apparent correlations should prompt questions about potential confounders and underlying mechanisms.
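A third variable can create a correlation between two things that have no effect on each other. The simulation below is a sketch under invented assumptions: `z` is a hypothetical confounder, and `x` and `y` each depend on `z` plus independent noise. Because the data are generated here, the effect of `z` is known and can simply be subtracted; with real data it would have to be estimated, for example by regression.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

rng = random.Random(7)  # fixed seed for reproducibility

# Hypothetical confounder z drives both x and y; x never affects y.
z = [rng.gauss(0, 1) for _ in range(5000)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [zi + rng.gauss(0, 1) for zi in z]

raw_r = pearson(x, y)

# Remove the (known) contribution of z from each variable.
x_resid = [xi - zi for xi, zi in zip(x, z)]
y_resid = [yi - zi for yi, zi in zip(y, z)]
adjusted_r = pearson(x_resid, y_resid)

print(f"correlation of x and y:           {raw_r:.2f}")
print(f"after accounting for confounder:  {adjusted_r:.2f}")
```

In this setup the raw correlation is substantial, and it nearly vanishes once the confounder is accounted for.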
- “Mutant statistics” and the risk of misinterpretation
- Misleading statistics can be created intentionally or unintentionally by misreading or misreporting data.
- Classic example: a statistic like "the number of gun deaths doubled every year since 1950" would imply an impossibly large total; what was meant is that the figure had doubled once over the whole period since 1950, a very different claim.
- Always verify the exact phrasing and time frame of a statistic before accepting it.
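A quick calculation shows why the "doubled every year" reading cannot be true. The starting count of one death in 1950, the end year of 1995, and the rough world-population figure are assumptions picked for illustration.

```python
def doubled_every_year(start_count, start_year, end_year):
    """Count in end_year if the figure doubled each year from start_year."""
    return start_count * 2 ** (end_year - start_year)

def doubled_since(start_count):
    """Count if the figure doubled once over the whole period."""
    return start_count * 2

START_YEAR = 1950
START_COUNT = 1                     # hypothetical: a single death in 1950
WORLD_POPULATION = 8_000_000_000    # rough order of magnitude

by_1995 = doubled_every_year(START_COUNT, START_YEAR, 1995)
print(f"{by_1995:,}")               # 35,184,372,088,832

# First year in which the implied count exceeds the world's population.
year = START_YEAR
while doubled_every_year(START_COUNT, START_YEAR, year) <= WORLD_POPULATION:
    year += 1
print(year)                         # 1983

print(doubled_since(START_COUNT))   # 2
```

Even starting from a single case, annual doubling passes the population of the planet within a few decades, whereas "doubled since 1950" is an ordinary claim.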
Ethical and practical dilemmas in reporting and data collection
- Deceptive study naming and sample framing
- The National Survey of Fertility Barriers was renamed to a more generic term (e.g., a family survey) to encourage participation, masking the study’s focus on fertility barriers.
- Deceptive labeling can affect who participates and how they respond.
- Dilemmas with reporting sensitive information
- When researchers encounter sensitive contexts (e.g., illegal activities, exploitation), they must balance participant welfare with the scientific value of disclosure.
- Examples include interviews with formerly incarcerated individuals, surrogacy arrangements, or illicit drug use.
- Mandated reporting and protection considerations
- Some researchers become mandated reporters; these legal obligations influence how information is handled.
- In sensitive cases (e.g., potential child welfare concerns), researchers must consider whether to report, given confidentiality and safety implications.
- Balancing emancipatory goals with potential harm
- In some cases, sharing a story or technique can help social change but may also expose participants to risk or stigma.
- Researchers may choose to publish with pseudonyms or anonymized accounts, or to tell the story in a way that preserves participants’ safety while highlighting systemic issues.
- Real-world ethical decision-making example
- A researcher studying pathways to parenting for same-sex couples faced a choice: whether to disclose information about adoption loopholes and cross-state arrangements, which could promote social change but also put participants' safety or privacy at risk.
- After consultation with participants, the researcher chose to tell the full story to support social change, while recognizing ongoing legal debates and potential risks.
- Reporting changes in policy and law
- Laws and policies shift over time; researchers must stay aware of current contexts (e.g., surrogacy law, joint adoption rights) to interpret data accurately and responsibly.
Special topics: measurement, interpretation, and responsibility
- Emphasis on critical thinking when consuming statistics
- Always question: What exactly is being measured? What population? What time frame?
- Are there potential biases in sampling, response, or framing?
- Are there confounding variables that could explain the observed relationships?
- Practical takeaways for exam and research practice
- Distinguish clearly between survey research, observational studies, and experiments
- Prefer random sampling where possible to improve generalizability
- Be transparent about limitations: sampling bias, nonresponse, measurement error, and ethical constraints
- Use appropriate statistical reasoning: avoid assuming causation from correlation; recognize the role of confounders
- When presenting data, ensure scales and labels accurately reflect the magnitude of changes (avoid deliberate or accidental misrepresentation)
- Final ethical reminder
- Always honor participants’ autonomy: informed consent, voluntary participation, and the right to withdraw
- Protect vulnerable populations; ensure privacy and minimize potential harm
- Be honest about methods, limitations, and potential conflicts of interest
- Example figure: n = 2,000,000 (Literary Digest sample size)
- Example of a misleading time-frame statement: "The number of gun deaths doubled since 1950" vs. "The number of gun deaths has doubled every year since 1950."