Sampling and Surveys
Population and Samples
Population: the entire group of individuals about which information is sought
Census: collection of data from every individual in the population
Sample: the part of the population from which data is actually collected
Whether certain data is seen as population or sample depends on who is viewing it.
The definition of population depends on what you’re trying to study.
Parameter
- A number that describes a characteristic of the population
- This is a fixed number, but in practice we usually don’t know its value
Statistic
- A number that describes a characteristic of the sample
- Known once the sample is taken
- Varies from sample to sample
- Used to estimate the parameter
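To make the parameter/statistic distinction concrete, here is a minimal sketch in Python. The population of heights is invented for illustration (in a real study the parameter is unknown), and the names `population`, `parameter`, and `sample_mean` are hypothetical.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up heights in cm.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Parameter: describes the whole population (fixed, usually unknown in practice).
parameter = statistics.mean(population)

def sample_mean(n):
    """Statistic: the mean of one random sample of size n."""
    return statistics.mean(rng.sample(population, n))

# Three samples give three different statistics, each estimating the same parameter.
estimates = [sample_mean(50) for _ in range(3)]
print(f"parameter  = {parameter:.2f}")
print("statistics =", [round(e, 2) for e in estimates])
```

Each run of `sample_mean` gives a slightly different value, while `parameter` never changes.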
Inference
The process of drawing conclusions about a population based on a sample.
Bias
- When one answer is systematically favored over another
- When describing bias, state the direction: whether the sample result is likely too high or too low compared with the population value
Voluntary Response Bias
- When people respond to a general invitation to answer a question
- Subjects self-select
- Overrepresents people with strong opinions and underrepresents people who don’t care enough to respond
Convenience Sampling
- When subjects are chosen because they are easy to reach
- Doesn’t fairly represent the whole population
Systematic Sample
Step 1: Randomly select starting point
Step 2: Select every kth item after
- Quick and easy
- Not every sample has an equal chance of being chosen
- Can produce bias if the list has a repeating pattern that lines up with k
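The two steps above can be sketched in Python as follows. This is only an illustration: `systematic_sample` and `roster` are hypothetical names, and the roster of 100 ID numbers is made up.

```python
import random

def systematic_sample(items, k, rng=random):
    """Pick a random start among the first k items, then take every kth item."""
    start = rng.randrange(k)      # Step 1: random starting point (index 0..k-1)
    return items[start::k]        # Step 2: every kth item after it

roster = list(range(1, 101))      # hypothetical list of 100 ID numbers
sample = systematic_sample(roster, k=10, rng=random.Random(0))
print(sample)
```

With k = 10 only 10 different samples are possible (one per starting point), which is why not every group of 10 has a chance of being chosen.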
Simple Random Sample
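When every possible sample of a given size has an equal chance of being chosen. As a result, every individual also has an equal chance of being selected.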
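One way to draw a simple random sample is Python's standard `random.sample`, which selects without replacement so that every possible group of the requested size is equally likely. The roster below is a made-up example.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 30 students.
roster = [f"student_{i}" for i in range(1, 31)]

# Draw 5 students without replacement; every group of 5 is equally likely.
srs = rng.sample(roster, 5)
print(srs)
```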
Variability
- How spread out statistics are among samples
- Bigger samples tend to be less spread out than smaller ones
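A small simulation can illustrate the effect of sample size on variability. This is a sketch with an invented population; `spread_of_sample_means` is a hypothetical helper that measures how much the sample mean changes from sample to sample.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Made-up population of 20,000 values between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(20_000)]

def spread_of_sample_means(n, repeats=500):
    """Standard deviation of the sample mean across many samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

small = spread_of_sample_means(10)
large = spread_of_sample_means(400)
print(f"n=10: spread {small:.2f}   n=400: spread {large:.2f}")
```

The spread for n = 400 comes out much smaller than for n = 10, matching the idea that bigger samples are less spread out.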
Studies should have low bias and low variability
Random sampling keeps bias low; larger samples keep variability low
Explaining Bias
Step 1: Describe how the sampled individuals are likely to differ from the rest of the population
Step 2: Explain how this difference leads to an overestimate or an underestimate
Observational Studies
Retrospective: examines existing data or asks about past behaviors
Prospective: follows individuals forward in time and collects data as it arises
Sampling Frame
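The list of individuals from which the sample is drawn
- Undercoverage bias occurs when the frame leaves out part of the population
- Sampling error (sample-to-sample variability) still occurs even with a complete frame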
Non Sampling Errors
- Voluntary response bias
- Undercoverage bias
- Non response bias
- Non random sampling methods
These can also occur in a census. Increasing the sample size won’t reduce these errors.
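The claim that a larger sample does not fix this kind of error can be checked with a simulation. The sketch below uses undercoverage as the example; all numbers (the 70% and 30% support rates and the group sizes) are invented assumptions, not real data.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1 = supports a proposal, 0 = does not.
# Made-up assumption: 70% of phone owners support it, 30% of non-owners do.
owners = [1 if rng.random() < 0.7 else 0 for _ in range(30_000)]
non_owners = [1 if rng.random() < 0.3 else 0 for _ in range(20_000)]
true_p = statistics.mean(owners + non_owners)   # the parameter, about 0.54

# Undercoverage: the sampling frame (a phone list) leaves out non-owners.
frame = owners

# Sample from the flawed frame with a small and a large sample size.
estimates = {n: statistics.mean(rng.sample(frame, n)) for n in (100, 10_000)}
for n, p_hat in estimates.items():
    print(f"n={n:>6}: estimate {p_hat:.3f} vs true {true_p:.3f}")
```

Both estimates land near 0.70 rather than near the true proportion, so the overestimate stays even when the sample is 100 times larger.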
Non Response Bias
When individuals selected for the sample can’t be contacted, don’t respond, or respond only partially.
- Not undercoverage bias (these individuals were in the sampling frame and were selected)
- Not voluntary response bias (subjects were selected by researchers, not self-selected)
Response Bias
When there are problems with the data gathering instrument
- People can lie
- People can answer a question they know nothing about
- The question is confusing
Question Wording Bias
The wording of questions can influence answers.
Stratified Random Sampling
Step 1: Divide the population into distinct groups (strata) so that the individuals within each group are similar (homogeneous grouping)
Step 2: Randomly select individuals from each stratum (for example, with a random number generator) and combine them to form the complete sample
- Since individuals are drawn from every stratum, every group is guaranteed to be represented
- Works best with small differences within strata and large differences between strata
- Helps to reduce the variability of the estimates
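The stratified procedure can be sketched in Python as follows. This is an illustration under assumptions: `stratified_sample` is a hypothetical helper, the school roster is made up, and the same number of students is drawn from each stratum for simplicity.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, rng=random):
    """Group individuals by stratum, then draw a random sample from each group."""
    strata = {}
    for person in population:                  # Step 1: divide into strata
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():            # Step 2: random draw within each stratum
        sample.extend(rng.sample(members, per_stratum))
    return sample

# Hypothetical school roster: (name, grade level) pairs, 50 students per grade.
students = [(f"s{i}", grade) for grade in (9, 10, 11, 12) for i in range(50)]
chosen = stratified_sample(students, lambda s: s[1], per_stratum=5,
                           rng=random.Random(0))
print(len(chosen), sorted({grade for _, grade in chosen}))  # 20 [9, 10, 11, 12]
```

Every grade appears in the sample no matter how the random draws turn out, which a simple random sample of 20 would not guarantee.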
Cluster Sampling
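- Divide the population into groups (clusters) of individuals that are located near each other
- Randomly select one or more clusters and include every individual in the chosen clusters
- This helps to save time and money
- Each cluster should be varied within (like a small version of the population), and clusters should be similar to one another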
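Cluster sampling can be sketched in Python as follows. The homerooms and student names are invented, and `cluster_sample` is a hypothetical helper; the key point is that whole clusters are chosen at random and everyone in a chosen cluster is included.

```python
import random

def cluster_sample(clusters, n_clusters, rng=random):
    """Randomly pick whole clusters and include every member of each one."""
    picked = rng.sample(sorted(clusters), n_clusters)   # choose cluster names
    members = [person for name in picked for person in clusters[name]]
    return picked, members

# Hypothetical clusters: 8 homerooms with 25 students each.
homerooms = {f"room_{r}": [f"room_{r}_student_{i}" for i in range(25)]
             for r in range(1, 9)}
picked, sample = cluster_sample(homerooms, n_clusters=2, rng=random.Random(0))
print(picked, len(sample))
```

Only two rooms need to be visited to collect 50 responses, which is where the savings in time and money come from.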