Study Notes on Sampling, Observation, Surveys, Ethics, and Data Interpretation

Population, sampling, and generalizability

  • Population vs. sample definitions:
    • Population: the entire group of interest (e.g., all gabbling college students).
    • Sample: a subset drawn from the population for study.
  • Gold standard in social science research: random sample
    • Definition: every member of the population has an equal chance of being selected.
    • Intuition: lottery-like selection where everyone is in the hat and randomly drawn.
    • Implication: data from a random sample are typically more generalizable to the population.
  • Real-world challenge: truly random samples are often not achieved
    • People screen calls, block unknown callers, etc.
    • Telephone survey limitations: many people use only cell phones, so landline-based samples underrepresent younger people and anyone without a landline.
    • If sampling via random digit dialing (house phones) or phone books, you may oversample older populations and miss younger, cell-phone-only respondents.
  • Example population and sampling risk
    • Population: gabbling college students; topic of interest: their study habits.
    • If sample collection is biased (e.g., surveying only at the student union at noon, or only online students), the sample may not represent the entire student body.
    • Consequence: biased results and limited generalizability.
  • Haphazard vs. systematic sampling
    • Haphazard (convenience) sample: survey people who happen to be nearby with no strategy.
    • Limitation: often not representative of the population; useful for learning survey administration, not for making broader inferences.
    • Systematic and stratified sampling: approaches students should know for more robust inference (referenced later).
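The contrast between a random sample and a biased one can be sketched with a small simulation. This is only an illustration under stated assumptions: the population, the 30% share of online students, and the study-hour averages are all made up, and the "biased" sample stands in for surveying only students who are physically on campus.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population of 10,000 students (all numbers are made up).
# Assumption: online students (30%) study fewer hours per week on average.
population = []
for i in range(10_000):
    online = i < 3_000
    group_mean = 8 if online else 14
    hours = max(0.0, random.gauss(group_mean, 3))
    population.append({"online": online, "hours": hours})

def mean_hours(rows):
    """Average weekly study hours for a list of student records."""
    return sum(r["hours"] for r in rows) / len(rows)

# Simple random sample: every student has the same chance of selection.
random_sample = random.sample(population, 500)

# Biased sample: only students physically on campus can be approached.
on_campus = [r for r in population if not r["online"]]
biased_sample = random.sample(on_campus, 500)

true_mean = mean_hours(population)
random_error = abs(mean_hours(random_sample) - true_mean)
biased_error = abs(mean_hours(biased_sample) - true_mean)

print(f"population mean:     {true_mean:.2f}")
print(f"random sample error: {random_error:.2f}")
print(f"biased sample error: {biased_error:.2f}")
```

Both samples have the same size; only the way members are selected differs, which is why the biased sample misses the population mean by a wider margin.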

Classic sampling failure: Literary Digest, 1936

  • Survey scale: about 2,000,000 responses collected to predict the presidential election.
  • Prediction: Alfred Landon would win. Actual result: Landon carried only Maine and Vermont; FDR won every other state.
  • Why the failure?
    • Sampling frame bias: respondents were drawn from lists of magazine subscribers, telephone users, and registered vehicle owners.
    • Context: during the Great Depression, car ownership and telephone access were not uniformly distributed; rural and poorer populations were underrepresented.
    • Nonresponse bias: a large portion of those contacted did not respond, and respondents differed in meaningful ways from nonrespondents.
  • Takeaway:
    • Even huge samples can be biased if the sampling frame and response patterns miss important subgroups.
    • Nonresponse bias remains a major challenge in modern surveys as well.
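The takeaway that sample size cannot rescue a biased frame can be illustrated with a rough simulation. Everything here is a made-up assumption (the electorate size, the 25% ownership rate, and the candidate preferences), and nonresponse is not modeled; the sketch isolates frame bias only.

```python
import random

random.seed(1936)  # fixed seed so the sketch is reproducible

# Hypothetical electorate of 200,000 voters (all numbers are made up).
# Assumption: 25% own a car or phone and lean toward candidate A;
# everyone else leans toward candidate B.
voters = []
for _ in range(200_000):
    in_frame = random.random() < 0.25
    p_votes_a = 0.60 if in_frame else 0.35
    voters.append((in_frame, random.random() < p_votes_a))

def share_a(rows):
    """Fraction of the given voters who support candidate A."""
    return sum(1 for _, votes_a in rows if votes_a) / len(rows)

true_share = share_a(voters)

# Huge sample drawn only from the biased frame (owners).
frame = [v for v in voters if v[0]]
big_biased = random.sample(frame, 40_000)

# Small sample drawn at random from the whole electorate.
small_random = random.sample(voters, 1_000)

print(f"true share for A:         {true_share:.3f}")
print(f"40,000 from biased frame: {share_a(big_biased):.3f}")
print(f"1,000 random voters:      {share_a(small_random):.3f}")
```

Under these assumptions the large sample confidently predicts the wrong winner, while the much smaller random sample lands close to the true share.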

Observational methods and sampling ethics in fieldwork

  • Observational research approaches
    • Unobtrusive observation: participants do not know they are being observed; minimizes reactivity but raises ethical/privacy concerns.
    • Obtrusive observation: researchers are involved and interacting; higher potential for changing behavior or influencing the scene.
  • Participant observation and ethnography
    • Participating observer: researchers engage with the group to gain deeper insight into culture and practices.
    • Complete participant: researcher is fully embedded and identified as part of the group from the start.
    • Reactivity: people act differently when they know they are being watched; ethnography often requires long immersion to restore natural behavior.
  • Covert vs. overt research
    • Covert: participants are unaware they are being studied; raises substantial ethical concerns around privacy and informed consent.
    • Overt: participants are informed and consent to be studied; more ethically straightforward.
  • Privacy and public vs private spaces
    • Public settings (parks, malls) generally have fewer privacy expectations; observation is more permissible but still ethically guided.
    • Private settings (homes, private offices) require explicit consent to observe or interview.
  • Informed consent, consent withdrawal, and coercion
    • Participants should consent to participate and know what the study entails.
    • Consent can be withdrawn at any time; participants are not obliged to continue.
    • For surveys, forced answering (no skip options) is unethical; participants should be able to skip questions.
  • Special consent considerations
    • Protected populations (children, incarcerated individuals, pregnant women, etc.) require extra safeguards due to history of mistreatment.
    • Incarcerated populations often have constrained autonomy; research must be tightly justified and tightly regulated.

Surveys and measurement issues

  • Surveys basics
    • A survey is a structured set of questions designed to elicit information.
    • Operationalization: how a concept is defined and measured in the survey (wording, response options).
  • Social desirability bias
    • Respondents may tailor answers to be viewed favorably, especially in face-to-face interviews about sensitive topics (e.g., sexual behavior).
    • Anonymity or computer-based surveys can reduce social desirability bias.
  • Mode effects and question design
    • Mode (in-person, phone, online, paper) influences how people respond.
    • Sensitive questions may require anonymous or non-face-to-face formats to improve honesty.
  • Data quality and validation checks
    • Include attention checks and implausible-response questions to detect non-serious participation.
    • If respondents report impossible values (e.g., impossible drug use), discard or flag their data; otherwise, reliability is compromised.
  • Incentives and data quality
    • Paid surveys can incentivize hurried or dishonest responses; data quality may suffer.
  • Deception and reporting in survey design
    • Researchers may, in some cases, alter the framing or labels of surveys to improve participation (e.g., renaming a study to increase willingness to participate).
    • Ethical considerations require transparency and minimizing harm.
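The validation checks described above (attention checks and implausible-response questions) might look something like the following. The field names, the expected attention-check answer, and the fictitious-drug item are hypothetical choices for this sketch, not part of any particular survey.

```python
# Hypothetical survey rows; all field names are invented for this sketch.
responses = [
    {"id": 1, "attention_check": "agree", "days_used_past_30": 3, "used_fake_drug": False},
    {"id": 2, "attention_check": "disagree", "days_used_past_30": 0, "used_fake_drug": False},
    {"id": 3, "attention_check": "agree", "days_used_past_30": 45, "used_fake_drug": False},
    {"id": 4, "attention_check": "agree", "days_used_past_30": 10, "used_fake_drug": True},
]

# Assumption: the attention-check item instructs respondents to pick "agree".
EXPECTED_ATTENTION_ANSWER = "agree"

def quality_flags(row):
    """Return a list of reasons this response looks unreliable (empty if none)."""
    flags = []
    if row["attention_check"] != EXPECTED_ATTENTION_ANSWER:
        flags.append("failed_attention_check")
    if not 0 <= row["days_used_past_30"] <= 30:
        flags.append("impossible_value")  # more than 30 days in a 30-day window
    if row["used_fake_drug"]:
        flags.append("claimed_fictitious_drug")  # the drug does not exist
    return flags

flagged = {row["id"]: quality_flags(row) for row in responses if quality_flags(row)}
clean = [row for row in responses if not quality_flags(row)]

print("flagged:", flagged)
print("clean ids:", [row["id"] for row in clean])
```

Flagging rather than silently deleting keeps a record of how many responses were set aside and why, which can then be reported as a limitation.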

Biases, misinformation, and the ethics of information presentation

  • The ethics of information presentation
    • Statistics and graphs can shape perception without lying about numbers
    • Best and Shirley argue for critical consumption of data: beware of how data are framed to advance a narrative.
  • Examples of misleading representation (conceptual, not exhaustive)
    • Unemployment graphs: the same data plotted on different scales produce different visual stories.
    • A narrow (truncated) y-axis range makes changes look more dramatic; a very wide range can hide them.
    • Planned Parenthood graph (Congressional hearing): juxtaposing abortions with cancer screenings without proper scaling or context can mislead about trends.
    • Infographics and political messaging: a very gradual real increase in graduation rates can be drawn to look like a dramatic rise; an honest depiction would show gradual growth.
    • Tax-cut visuals (Fox News example): framing of policy consequences can provoke a specific political stance without presenting a complete picture.
  • Correlation vs. causation
    • Correlation does not imply causation; two things can move together due to a third variable or coincidence.
    • Humorous examples of spurious correlations: worldwide noncommercial space launches and sociology degrees awarded; number of people electrocuted by power lines and the marriage rate in Alabama.
    • Important reminder: apparent correlations should prompt questions about potential confounders and underlying mechanisms.
  • “Mutant statistics” and the risk of misinterpretation
    • Misleading statistics can be created intentionally or unintentionally by misreading or misreporting data.
    • Classic example: a claim like "the number of gun deaths has doubled every year since 1950" implies an impossibly large total within a few decades; the intended claim was that the figure had doubled since 1950, which is a very different statement.
    • Always verify the exact phrasing and time frame of a statistic before accepting it.
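The third-variable explanation for a correlation can be sketched with simulated data. The variables here are generic and entirely made up: a hypothetical confounder drives both x and y, and x has no direct effect on y. The coefficients and noise levels are arbitrary assumptions.

```python
import random
from math import sqrt

random.seed(42)  # fixed seed so the sketch is reproducible

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical confounder that influences both x and y.
confounder = [random.uniform(0, 35) for _ in range(20_000)]
x = [2.0 * c + random.gauss(0, 10) for c in confounder]
y = [0.5 * c + random.gauss(0, 3) for c in confounder]

# Raw correlation: x and y move together even though neither causes the other.
r_raw = pearson(x, y)

# Hold the confounder roughly fixed by keeping only a narrow band of its values.
band = [(xi, yi) for c, xi, yi in zip(confounder, x, y) if 20 <= c < 22]
r_within = pearson([xi for xi, _ in band], [yi for _, yi in band])

print(f"raw correlation:         {r_raw:.2f}")
print(f"within narrow band of c: {r_within:.2f}")
```

Once the confounder is held approximately constant, the apparent relationship between x and y largely disappears, which is the pattern to look for when a correlation seems too strange to be causal.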

Ethical and practical dilemmas in reporting and data collection

  • Deceptive study naming and sample framing
    • The National Survey of Fertility Barriers was renamed to a more generic term (e.g., a family survey) to encourage participation, masking the study’s focus on fertility barriers.
    • Deceptive labeling can affect who participates and how they respond.
  • Dilemmas with reporting sensitive information
    • When researchers encounter sensitive contexts (e.g., illegal activities, exploitation), they must balance participant welfare with the scientific value of disclosure.
    • Examples include interviews with formerly incarcerated individuals, surrogacy arrangements, or illicit drug use.
  • Mandated reporting and protection considerations
    • Some researchers become mandated reporters; these legal obligations influence how information is handled.
    • In sensitive cases (e.g., potential child welfare concerns), researchers must consider whether to report, given confidentiality and safety implications.
  • Balancing emancipatory goals with potential harm
    • In some cases, sharing a story or technique can help social change but may also expose participants to risk or stigma.
    • Researchers may choose to publish with pseudonyms or anonymized accounts, or to tell the story in a way that preserves participants’ safety while highlighting systemic issues.
  • Real-world ethical decision-making example
    • A researcher studying pathways to parenting for same-sex couples faced a choice: whether to disclose information about adoption loopholes and cross-state arrangements, which could promote social change but also put participants’ safety or privacy at risk.
    • After consultation with participants, the researcher chose to tell the full story to support social change, while recognizing ongoing legal debates and potential risks.
  • Reporting changes in policy and law
    • Laws and policies shift over time; researchers must stay aware of current contexts (e.g., surrogacy law, joint adoption rights) to interpret data accurately and responsibly.

Special topics: measurement, interpretation, and responsibility

  • Emphasis on critical thinking when consuming statistics
    • Always question: What exactly is being measured? What population? What time frame?
    • Are there potential biases in sampling, response, or framing?
    • Are there confounding variables that could explain the observed relationships?
  • Practical takeaways for exam and research practice
    • Distinguish clearly between survey research, observational studies, and experiments
    • Prefer random sampling where possible to improve generalizability
    • Be transparent about limitations: sampling bias, nonresponse, measurement error, and ethical constraints
    • Use appropriate statistical reasoning: avoid assuming causation from correlation; recognize the role of confounders
    • When presenting data, ensure scales and labels accurately reflect the magnitude of changes (avoid deliberate or accidental misrepresentation)
  • Final ethical reminder
    • Always honor participants’ autonomy: informed consent, voluntary participation, and the right to withdraw
    • Protect vulnerable populations; ensure privacy and minimize potential harm
    • Be honest about methods, limitations, and potential conflicts of interest
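The point above about scales and labels can be checked with simple arithmetic before publishing a graph. The sketch below is only an illustration: the rates and axis limits are made-up values, and `visual_fraction` is a hypothetical helper, not a standard function.

```python
def visual_fraction(start, end, y_min, y_max):
    """Fraction of the plot's vertical extent covered by the change from start to end."""
    return abs(end - start) / (y_max - y_min)

# Hypothetical rates (made up for illustration): 75% rising to 82%.
start, end = 75.0, 82.0

relative_change = (end - start) / start              # actual growth, about 9%
truncated = visual_fraction(start, end, 74, 83)      # y-axis from 74 to 83
full = visual_fraction(start, end, 0, 100)           # y-axis from 0 to 100

print(f"actual relative change:      {relative_change:.1%}")
print(f"height used, truncated axis: {truncated:.1%}")
print(f"height used, full axis:      {full:.1%}")
```

The same increase fills most of the chart on the truncated axis and only a thin sliver on the full axis, so the choice of axis range should be stated or made visually obvious to the reader.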

Example formula: $n = 2{,}000{,}000$ (Literary Digest sample size)

Example of a misleading time-frame statement: "The number of gun deaths doubled since 1950" vs. "The number of gun deaths has doubled every year since 1950."
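A quick calculation shows why the "every year" reading cannot be right. This is a minimal sketch under assumptions: the starting count of one and the end year of 1995 are hypothetical values chosen only to make the arithmetic concrete.

```python
# Hypothetical inputs: one death in 1950, claim evaluated in 1995.
start_year, end_year = 1950, 1995
base_count = 1

# Reading 1: the count doubled every single year.
doublings = end_year - start_year            # 45 yearly doublings
every_year = base_count * 2 ** doublings

# Reading 2: the count doubled once over the whole period.
doubled_once = base_count * 2

print(f"doubled every year: {every_year:,}")
print(f"doubled since 1950: {doubled_once:,}")
```

Even starting from a single case, 45 yearly doublings give 35,184,372,088,832, far more than the number of people who have ever been alive at one time, so the statistic must have been garbled somewhere between its source and its restatement.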