Chapter 1, Part 2: Data Collection
Key Concepts in Statistics
Mean of Sampling Distribution
The average of the sample means drawn from a population. It equals the population mean regardless of sample size.
Central Limit Theorem: As sample size increases, the distribution of sample means approaches normality, regardless of the population's distribution.
Shape of Sampling Distribution
The shape becomes approximately normal when the sample size is large enough (a common rule of thumb is n ≥ 30).
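The three ideas above can be checked with a small simulation. This is a minimal sketch, assuming a hypothetical right-skewed population (exponential values with mean near 1); the population, the seed, and the helper names `sample_means` and `skewness` are illustrative choices, not part of the course material.

```python
import random
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical right-skewed population (exponential, mean near 1).
population = [rng.expovariate(1.0) for _ in range(50_000)]
pop_mean = mean(population)

def sample_means(n, reps=2_000):
    """Means of `reps` simple random samples, each of size n."""
    return [mean(rng.sample(population, n)) for _ in range(reps)]

def skewness(xs):
    """Rough skewness estimate: near 0 for a symmetric, bell-like shape."""
    m, s = mean(xs), stdev(xs)
    return mean([((x - m) / s) ** 3 for x in xs])

# Compare a very small sample size with the n = 30 rule of thumb.
results = {n: sample_means(n) for n in (2, 30)}
for n, means in results.items():
    print(f"n={n:>2}  mean of sample means={mean(means):.3f}  "
          f"sd={stdev(means):.3f}  skewness={skewness(means):.2f}")
print(f"population mean={pop_mean:.3f}")
```

For both sample sizes the mean of the sample means should sit close to the population mean, while the spread and the skewness should both shrink as n grows from 2 to 30.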
Page 2: Agenda
Data Collection Methods
Observational vs. Experimental Studies
Random Sampling
Page 3: Observational Studies vs. Designed Experiments
Learning Objectives
Distinguish between observational studies and experiments.
Explain various types of observational studies.
Page 4: Observational Studies and Experiments
Observational Study
Researchers observe and measure behavior without trying to influence the explanatory or response variables.
Designed Experiment
Researchers assign individuals to groups, deliberately change the value of the explanatory variable, and record the response.
Page 5: Example - Cellular Phones and Brain Tumors
Context: Study of mobile phone use and brain tumors with 791,710 women over 7 years.
Key Finding: No significant difference in tumor incidence between phone users and non-users (Source: Benson et al., 2013).
Page 6: National Toxicology Program Study
Investigated radio-frequency radiation (RFR) and brain tumors using rats in controlled environments:
Three groups: control (no RFR), GSM-modulated RFR, CDMA-modulated RFR.
Findings: Low tumor incidence in exposed rats; results not statistically significant.
Page 8: Research Variables
Response Variable: Brain cancer occurrence.
Explanatory Variable: Level of cell phone usage.
The aim is to determine how the explanatory variable affects the response variable.
Page 9: Observational Study Definition
Researchers do not influence the response or explanatory variables; they simply observe behavior.
Page 11: Flu Shots Example
Longitudinal study of 36,000 seniors regarding flu shot effectiveness.
Findings: Flu shots associated with reduced hospitalization and mortality from pneumonia/influenza (Source: Nichol et al., 2007).
Page 13: Confounding in Studies
Definition: Confounding occurs when the effects of two or more explanatory variables are not separated, so an observed relation may be due to a variable other than the ones being studied.
Lurking Variables: Explanatory variables that were not considered in the study but that affect the response variable.
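A lurking variable can create an association on its own. The sketch below is a simulation with invented probabilities (it does not use data from any study in these notes): a hypothetical `healthy` flag influences both who gets treated and who is hospitalized, while treatment itself has no effect on the response.

```python
import random

rng = random.Random(0)  # fixed seed for a repeatable run

# All probabilities below are invented for illustration only.
rows = []
for _ in range(20_000):
    healthy = rng.random() < 0.5                        # lurking variable
    treated = rng.random() < (0.8 if healthy else 0.2)  # explanatory variable
    # The response depends ONLY on health, never on treatment.
    hospitalized = rng.random() < (0.05 if healthy else 0.30)
    rows.append((healthy, treated, hospitalized))

def rate(treated, healthy=None):
    """Share hospitalized among rows with this treatment (and health, if given)."""
    sel = [r for h, t, r in rows
           if t == treated and (healthy is None or h == healthy)]
    return sum(sel) / len(sel)

# Untreated minus treated hospitalization rate, overall and within each health group.
overall_gap = rate(False) - rate(True)
gap_healthy = rate(False, True) - rate(True, True)
gap_unhealthy = rate(False, False) - rate(True, False)

print(f"overall gap:        {overall_gap:.3f}")
print(f"gap among healthy:  {gap_healthy:.3f}")
print(f"gap among the rest: {gap_unhealthy:.3f}")
```

Overall, the treated group appears to do much better, yet within each health group the gap should be close to zero. The apparent benefit comes entirely from the lurking variable.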
Page 15: Causation vs. Association
Observational studies reveal association, not causation.
Page 18: Types of Observational Studies
Cross-sectional Studies: Information collected at one point in time.
Case-control Studies: Retrospective studies that compare individuals who have a certain characteristic with individuals who do not.
Cohort Studies: Prospective studies that follow a group (cohort) over time and record data on its members' characteristics.
Page 20: Census
A census is a list of all individuals in a population along with their characteristics.
Page 21: Web Scraping
The process of extracting data from websites. It draws on publicly available data and raises ethical considerations.
Page 24: Simple Random Sampling Definition
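Definition: A sample of size n is obtained through simple random sampling if every possible sample of size n from the population is equally likely to be selected. As a consequence, every individual has an equal chance of being included in the sample.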
Page 25: Sample Size Consideration
Size of sample (n) must be less than that of the population (N).
Page 27: Simple Random Sampling Example
Scenario: Selecting three friends from six for a concert.
The number of possible samples is C(6, 3) = 6! / (3! 3!) = 20. Under simple random sampling, each group of three has probability 1/20 of being chosen.
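The concert scenario can be enumerated directly. In this minimal sketch the six friends are given placeholder labels A through F (the original names are not listed here), and the seed is arbitrary.

```python
import random
from itertools import combinations
from math import comb

friends = ["A", "B", "C", "D", "E", "F"]  # placeholder labels

# Every possible sample of size 3 from the 6 friends.
all_samples = list(combinations(friends, 3))
print(len(all_samples), comb(6, 3))  # both counts are 20

# One simple random sample: each of the 20 groups is equally likely.
rng = random.Random(1)
chosen = rng.sample(friends, 3)
print(chosen)
```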
Page 31: Sampling Techniques
Without Replacement: Once an individual is selected, that individual cannot be selected again for the same sample.
With Replacement: A selected individual is returned to the population and can be selected again.
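The difference between the two techniques shows up in Python's standard library: `random.sample` draws without replacement and `random.choices` draws with replacement. The population of ten numbered individuals below is a hypothetical stand-in.

```python
import random

rng = random.Random(7)            # fixed seed for a repeatable run
population = list(range(1, 11))   # hypothetical individuals numbered 1..10

# Without replacement: no individual can appear twice.
without_replacement = rng.sample(population, 5)

# With replacement: the same individual may appear more than once.
with_replacement = rng.choices(population, k=5)

# With replacement, the number of draws may even exceed the population size.
oversized = rng.choices(population, k=15)

print(without_replacement)
print(with_replacement)
print(len(oversized))
```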
Page 41: Cluster Sampling
Randomly select entire clusters (naturally occurring groups of individuals), then survey every member of each selected cluster.
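A minimal sketch of cluster sampling, assuming a hypothetical population of 8 city blocks with 5 households each; the block and household labels are invented for illustration.

```python
import random

rng = random.Random(3)  # fixed seed for a repeatable run

# Hypothetical clusters: 8 blocks, 5 households per block.
clusters = {
    f"block_{b}": [f"block_{b}_house_{h}" for h in range(1, 6)]
    for b in range(1, 9)
}

# Step 1: randomly select whole clusters.
chosen_blocks = rng.sample(sorted(clusters), 2)

# Step 2: include every member of each selected cluster.
cluster_sample = [house for block in chosen_blocks for house in clusters[block]]

print(chosen_blocks)
print(cluster_sample)
```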
Page 47: Types of Sampling
Comparison of Stratified, Systematic, and Cluster Sampling techniques and their methodologies.
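For comparison with cluster sampling, the sketch below illustrates the standard definitions of the other two techniques: stratified sampling takes a simple random sample within each stratum, and systematic sampling takes every k-th individual after a random start. The frame of 60 first-year students and 40 seniors is hypothetical, as are the sample sizes.

```python
import random

rng = random.Random(11)  # fixed seed for a repeatable run

# Hypothetical sampling frame: (stratum, id) pairs.
frame = [("first_year", i) for i in range(60)] + [("senior", i) for i in range(40)]

# Stratified: group individuals by stratum, then draw an SRS from each.
strata = {}
for person in frame:
    strata.setdefault(person[0], []).append(person)
stratified = [p for members in strata.values() for p in rng.sample(members, 5)]

# Systematic: pick a random start among the first k, then take every k-th.
k = 10
start = rng.randrange(k)
systematic = frame[start::k]

print(stratified)
print(systematic)
```

Stratified sampling samples some individuals from every group, whereas cluster sampling takes all individuals from a few randomly chosen groups.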
Page 48: Bias in Sampling
Sources of Bias
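Sampling Bias: The technique used to select the sample tends to favor one part of the population over another.
Nonresponse Bias: Individuals selected for the sample who do not respond hold different opinions from those who do.
Response Bias: Answers do not reflect the respondent's true feelings, for example because of the interviewer or the way a question is worded.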
Page 60: Addressing Response Bias
Suggested considerations include:
Interviewer Error: Skilled interviewers are more likely to obtain accurate responses.
Misrepresented Answers: Respondents may not always answer truthfully.
Wording and Order of Questions: Biased phrasing, leading questions, or the order in which questions are asked can skew results.