Unit 3 Notes: Samples and Surveys

4.1 Samples and Surveys

Observational Study vs. Experiment

  • Observational Study: Observes individuals and measures variables of interest without attempting to influence the responses.
  • Experiment: Deliberately imposes some treatment on individuals to measure their responses.

Population and Sample

  • Population: The entire group of individuals that we want information about.
  • Sample: A subset of individuals in the population from which we collect data.
  • From a sample, we obtain a statistic (either $\bar{x}$ or $\hat{p}$).
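To make the idea of a statistic concrete, here is a minimal Python sketch. The population is made up for illustration (the field names `sleep_hours` and `supports_change`, the population size, and the 60% figure are all assumptions, not data from these notes). It draws a sample of 100 and computes a sample mean and a sample proportion.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 2,000 students, each with hours of sleep
# and a yes/no opinion about a schedule change (assumed values).
population = [
    {"sleep_hours": random.gauss(7, 1), "supports_change": random.random() < 0.6}
    for _ in range(2000)
]

# A sample is a subset of the population; here, 100 students chosen at random.
sample = random.sample(population, 100)

# Statistics are computed from the sample only.
x_bar = mean(s["sleep_hours"] for s in sample)                    # sample mean
p_hat = sum(s["supports_change"] for s in sample) / len(sample)   # sample proportion

print(f"x_bar = {x_bar:.2f}, p_hat = {p_hat:.2f}")
```

Both numbers describe the sample, not the population, so a different sample of 100 would usually give slightly different values.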
Problem 1: Identifying Population and Sample
  • Scenario 1:
    • The student government surveys 100 students to get opinions about a change to the bell schedule.
      • Population: All students at the high school.
      • Sample: 100 students surveyed.
  • Scenario 2:
    • A quality control manager selects 10 cans from the production line every hour to check soda volume.
      • Population: All cans produced by the bottling company.
      • Sample: 10 cans from the production line every hour.

Census

  • A census collects data from every individual in the population.
  • A census is often difficult, costly, and time-consuming, which is why we usually collect data from a sample instead.

How to Sample Badly

Convenience Sampling
  • Selects individuals from the population who are the easiest to reach.
  • Often produces data that is unrepresentative of the population.
Voluntary Response Sampling
  • Allows people to self-select to be in a sample by responding to a general invitation.

  • Convenience and voluntary response sampling are "bad" sampling techniques because they can lead to bias.
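The sketch below illustrates how a convenience sample can be unrepresentative. It rests on made-up assumptions: a school of 5,000 students, 20% of whom are regular library users, and library users studying more hours per week on average. The surveyor asks only students found in the library because they are the easiest to reach.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical population: library users (20%) are assumed to study more.
population = []
for _ in range(5000):
    in_library = random.random() < 0.2
    hours = random.gauss(12 if in_library else 6, 2)
    population.append({"in_library": in_library, "hours": hours})

true_mean = mean(p["hours"] for p in population)

# Convenience sample: the first 100 students found in the library.
library_users = [p for p in population if p["in_library"]]
convenience = library_users[:100]

# For contrast: 100 students chosen at random from the whole school.
random_sample = random.sample(population, 100)

convenience_mean = mean(p["hours"] for p in convenience)
random_mean = mean(p["hours"] for p in random_sample)

print(f"population mean:  {true_mean:.1f} hours")
print(f"convenience mean: {convenience_mean:.1f} hours")
print(f"random mean:      {random_mean:.1f} hours")
```

Under these assumptions the convenience sample overstates typical study time, because the students who are easiest to reach differ from the rest of the population.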

Sample Surveys: What Can Go Wrong?

Bias
  • Bias is a systematic error in data.
  • The design of a statistical study shows bias if it is very likely to overestimate or underestimate the value you want to know (mean or proportion).
Types of Bias
  • Undercoverage: Occurs when some members of the population are less likely to be chosen or are left out of the sample.
  • Nonresponse: Occurs when an individual is selected for a sample but can't be contacted or refuses to participate.
  • Response Bias: Occurs when there is a systematic pattern of false or inaccurate responses.
  • Question-Wording Bias: Occurs when the wording of a question systematically influences the responses.
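Bias is about the study design, not about one unlucky sample. One way to see this is to repeat the same design many times and check whether the estimates center on the true value. The sketch below does that for undercoverage. All numbers are assumptions: 70% of a hypothetical population is reachable by the sampling method, and support for a policy is 40% among reachable people and 70% among everyone else.

```python
import random
from statistics import mean

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical population of (reachable, supports) pairs.
population = []
for _ in range(10000):
    reachable = random.random() < 0.7
    supports = random.random() < (0.40 if reachable else 0.70)
    population.append((reachable, supports))

def p_hat(people):
    """Proportion of people in the list who support the policy."""
    return sum(supports for _, supports in people) / len(people)

true_p = p_hat(population)

# Undercoverage: unreachable people can never be selected.
covered = [person for person in population if person[0]]

# Repeat each design 500 times with samples of size 100.
random_estimates = [p_hat(random.sample(population, 100)) for _ in range(500)]
undercov_estimates = [p_hat(random.sample(covered, 100)) for _ in range(500)]

print(f"true proportion:             {true_p:.3f}")
print(f"average, random samples:     {mean(random_estimates):.3f}")
print(f"average, undercovered frame: {mean(undercov_estimates):.3f}")
```

In this setup the random-sample estimates vary but average close to the true proportion, while the undercovered design underestimates it on average. That consistent miss in one direction is what "systematic" means.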

Check Your Understanding

  1. Orange Crate Scenario:
    • A farmer brings crates of oranges weekly, and an inspector looks at 10 oranges from the top of each crate.
    • Sampling Method: Convenience sample.
    • Bias: The oranges at the top of a crate may not be representative of the whole crate, so oranges lower down are undercovered. This could lead to an underestimate of the proportion of "bad" oranges.
  2. Nightline Poll:
    • Viewers were invited to call one number for "Yes" and another for "No" regarding the UN headquarters in the US.
    • Sampling Method: Voluntary response.
    • Bias: Viewers choose whether to call in, and those who do typically hold strong opinions, so the poll likely overestimates the proportion who would answer "No."
  3. Los Angeles Times Online Poll:
    • The poll asked if the City Council was right to pass a boycott of Arizona, with 96% of respondents saying "No."
    • Representativeness: Not representative.
    • Explanation: This is a voluntary response sample, so people with strong opinions are the most likely to respond, which likely overestimates the proportion of Los Angeles residents who would say "No." The wording of the question (asking whether the City Council was "right") may also influence the responses.
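The effect of voluntary response in polls like these can be sketched with a small simulation. The numbers are invented for illustration and are not taken from the polls above: 45% of a hypothetical city would answer "No" if everyone were asked, people who say "No" are assumed to respond 10% of the time, and everyone else responds 1% of the time.

```python
import random

random.seed(4)  # fixed seed so the sketch is repeatable

# Hypothetical city of 50,000 residents; True means the resident would say "No".
residents_say_no = [random.random() < 0.45 for _ in range(50000)]

# Assumed behavior: "No" residents feel strongly and are far more likely to respond.
respondents = [
    says_no
    for says_no in residents_say_no
    if random.random() < (0.10 if says_no else 0.01)
]

population_no = sum(residents_say_no) / len(residents_say_no)
poll_no = sum(respondents) / len(respondents)

print(f'"No" among all residents: {population_no:.0%}')
print(f'"No" among respondents:   {poll_no:.0%}')
```

Under these assumptions the poll reports a much higher share of "No" answers than the city as a whole, even though thousands of people responded. A large voluntary response sample is still a biased one.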