Sampling and Surveys

Population and Samples

Population: the entire group of individuals about which information is sought

Census: collection of data from every individual in the population

Sample: the part of the population from which data is actually collected

Whether certain data is seen as population or sample depends on who is viewing it.

The definition of population depends on what you’re trying to study.

Parameter

  • A number that describes certain characteristics of the population

  • This is a fixed number, but in practice we usually don’t know its value

Statistic

  • Known after sample is taken

  • This will vary from sample to sample

  • Statistic is used to estimate the parameter
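
The difference between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population of test scores is made up, and the helper name sample_means is hypothetical.

```python
import random
import statistics

def sample_means(population, n, draws, seed=0):
    """Draw `draws` random samples of size n and return the mean of each one."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(population, n)) for _ in range(draws)]

# Hypothetical population: 1,000 made-up test scores between 50 and 100.
rng = random.Random(42)
population = [rng.randint(50, 100) for _ in range(1000)]

# Parameter: one fixed number for the whole population (usually unknown in practice).
parameter = statistics.mean(population)

# Statistics: one number per sample, so they change from sample to sample.
stats = sample_means(population, n=30, draws=5)

print("parameter:", round(parameter, 2))
print("statistics:", [round(s, 2) for s in stats])
```

Each sample mean is an estimate of the same population mean, but the five estimates will generally not agree with each other exactly.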

Inference

The process of drawing conclusions based on a sample.

Bias

  • When one answer is systematically favored over another

  • When describing bias, state whether the sample result is likely to be too high or too low compared with the population value

Voluntary Response Bias

  • When people respond to a general invitation to answer a question

  • Subjects self-select

  • People with strong opinions are more likely to respond than people who don’t care, so the responses over-represent the passionate
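
A rough simulation can show how self-selection pushes a poll result away from the truth. Everything here is an assumption made for illustration: the 30% / 70% split of opinions, the response probabilities, and the function name voluntary_response_poll.

```python
import random

def voluntary_response_poll(opinions, respond_prob, seed=0):
    """Keep only the people who choose to respond, based on their opinion."""
    rng = random.Random(seed)
    return [o for o in opinions if rng.random() < respond_prob[o]]

# Hypothetical population: 30% oppose a proposal, 70% favor it.
population = ["oppose"] * 3000 + ["favor"] * 7000

# Assumed behavior: opponents feel strongly and respond far more often.
respond_prob = {"oppose": 0.50, "favor": 0.05}

responses = voluntary_response_poll(population, respond_prob)

true_share = population.count("oppose") / len(population)
poll_share = responses.count("oppose") / len(responses)

print("true share opposed:", true_share)
print("poll share opposed:", round(poll_share, 2))
```

Under these assumptions the poll overestimates the share of people who oppose the proposal, because the people who feel strongly are the ones who answer.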

Convenience Sampling

  • When subjects are chosen based on convenience

  • Doesn’t fairly represent the whole population

Systematic Sample

Step 1: Randomly select starting point

Step 2: Select every kth item after

  • Quick and easy

  • Not every sample has an equal chance of being chosen

  • Can produce bias if the list has a repeating pattern that lines up with k
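
The two steps above can be sketched in a few lines. The roster of 100 numbered individuals and the function name systematic_sample are hypothetical.

```python
import random

def systematic_sample(items, k, seed=None):
    """Pick a random starting point among the first k items, then every kth item after."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # Step 1: random start at index 0..k-1
    return items[start::k]     # Step 2: every kth item from the start

# Hypothetical roster of 100 individuals, numbered 1 to 100.
roster = list(range(1, 101))
sample = systematic_sample(roster, k=10, seed=1)
print(sample)
```

Only k different samples are possible here (one per starting point), which is why not every group of individuals has a chance of being the sample.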

Simple Random Sample

When every possible sample of a given size has an equal chance of being chosen (so every individual also has an equal chance of being chosen).
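
One way to draw a simple random sample is with Python's random.sample, which draws without replacement so that every possible group of n items is equally likely. The list of names and the function name simple_random_sample are made up for this sketch.

```python
import random

def simple_random_sample(items, n, seed=None):
    """Draw n distinct items so that every group of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(items, n)

# Hypothetical population of 200 labeled individuals.
people = [f"person{i}" for i in range(1, 201)]
srs = simple_random_sample(people, n=20, seed=3)
print(srs)
```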

Variability

  • How spread out statistics are among samples

  • Bigger samples tend to be less spread out than smaller ones

Studies should have low bias and low variability

Random sampling (low bias) + larger samples (low variability) = trustworthy results
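
The effect of sample size on variability can be checked with a simulation. This is a sketch under stated assumptions: the population is 5,000 made-up values, and spread_of_sample_means is a hypothetical helper that measures spread as the standard deviation of many sample means.

```python
import random
import statistics

def spread_of_sample_means(population, n, draws=500, seed=0):
    """Take many random samples of size n and return the spread of their means."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(draws)]
    return statistics.stdev(means)

# Hypothetical population: 5,000 values centered near 70.
rng = random.Random(7)
population = [rng.gauss(70, 10) for _ in range(5000)]

small = spread_of_sample_means(population, n=10)
large = spread_of_sample_means(population, n=200)

print("spread of means, n=10: ", round(small, 2))
print("spread of means, n=200:", round(large, 2))
```

The means from samples of 200 cluster much more tightly than the means from samples of 10. A larger sample does not fix bias, though: a biased method stays biased at any size.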

Explaining Bias

Step 1: Explain how the sampled individuals are likely to differ from the rest of the population

Step 2: Explain how this difference leads to an overestimate or an underestimate

Observational Studies

Retrospective: examines existing data or asks about past behaviors

Prospective: follows individuals over time and collects data as events occur

Sampling Frame

List of individuals from which the sample is drawn

  • Undercoverage bias occurs when the frame leaves out part of the population; sampling error from sample-to-sample variation can also occur

Non Sampling Errors

  • Voluntary response bias

  • Undercoverage bias

  • Non response bias

  • Non random sampling methods

These can also occur in a census. Increasing the sample size won’t reduce them.

Non Response Bias

When selected individuals can’t be contacted, don’t respond, or only partially respond.

  • Not undercoverage bias

  • Not voluntary response bias (the individuals were selected by the researchers rather than self-selected)

Response Bias

When there are problems with the data gathering instrument

  • People can lie

  • People can answer questions they don’t actually know the answer to

  • The question is confusing

Question Wording Bias

The wording of questions can influence answers.

Stratified Random Sampling

Step 1: Divide the population into distinct groups (strata) whose members are similar to one another (homogeneous grouping)

Step 2: Use a random selection method to pick individuals from each stratum, then combine them to form the complete sample

  • Because a sample is drawn from every stratum, every group is guaranteed to be represented

  • Works best with small differences within strata and large differences between strata

  • Helps to reduce the variability of the estimates
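
The two steps can be sketched as follows. The students, the choice of grade level as the stratum, and the function name stratified_sample are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, per_stratum, seed=None):
    """Group individuals into strata, then draw a random sample from each stratum."""
    rng = random.Random(seed)

    # Step 1: divide the population into strata.
    strata = defaultdict(list)
    for person in individuals:
        strata[stratum_of(person)].append(person)

    # Step 2: randomly pick from every stratum and combine the picks.
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], per_stratum))
    return sample

# Hypothetical school: 50 students in each of grades 9 to 12, stored as (name, grade).
students = [(f"s{g}-{i}", g) for g in (9, 10, 11, 12) for i in range(50)]

sample = stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=5, seed=3)
print(sample)
```

Because five students are drawn from every grade, no grade can be left out of the sample by chance.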

Cluster Sampling

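  • Divide the population into groups (clusters) of individuals that are located near each other

  • Use a random method to select one or more clusters and include everyone in the chosen clusters

  • This helps to save time and money

  • Each cluster should be varied within (like a small version of the population), and the clusters should be similar to one another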
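
A minimal sketch of cluster sampling, assuming the clusters are homerooms and that everyone in a chosen homeroom is surveyed. The homeroom data and the function name cluster_sample are hypothetical.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and include every individual in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    sample = [person for name in chosen for person in clusters[name]]
    return chosen, sample

# Hypothetical school: 12 homerooms with 25 students each.
homerooms = {
    f"room{r}": [f"room{r}-student{i}" for i in range(25)]
    for r in range(1, 13)
}

chosen, sample = cluster_sample(homerooms, num_clusters=2, seed=5)
print(chosen, len(sample))
```

The randomness is applied to the clusters, not to the individuals, which is what saves time: only two homerooms need to be visited.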
