Chapter 1 Part 2 Data collection

Key Concepts in Statistics

  • Mean of Sampling Distribution

    • The average of the sample means taken over all possible samples of a given size. It equals the population mean regardless of the sample size.

    • Central Limit Theorem: As sample size increases, the distribution of sample means approaches normality, regardless of the population's distribution.

  • Shape of Sampling Distribution

    • The shape becomes nearly normal when sample size is adequate (typically n ≥ 30).
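
The two points above can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the exponential population, the seed, and the helper name `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

def sample_means(population, n, num_samples, rng):
    """Return the means of num_samples random samples of size n."""
    return [
        statistics.fmean(rng.choices(population, k=n))  # draw with replacement
        for _ in range(num_samples)
    ]

rng = random.Random(0)  # fixed seed (an arbitrary choice) for repeatable output

# Hypothetical population: right-skewed (exponential), so clearly not normal.
population = [rng.expovariate(1.0) for _ in range(20_000)]
pop_mean = statistics.fmean(population)

means_n5 = sample_means(population, n=5, num_samples=2_000, rng=rng)
means_n30 = sample_means(population, n=30, num_samples=2_000, rng=rng)

print(f"population mean        : {pop_mean:.3f}")
print(f"mean of means, n = 5   : {statistics.fmean(means_n5):.3f}")
print(f"mean of means, n = 30  : {statistics.fmean(means_n30):.3f}")
print(f"spread of means, n = 5 : {statistics.stdev(means_n5):.3f}")
print(f"spread of means, n = 30: {statistics.stdev(means_n30):.3f}")
```

Both averages of sample means should land close to the population mean, while the spread shrinks as n grows. Plotting a histogram of `means_n30` would show a shape much closer to a bell curve than the population itself.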

Page 2: Agenda

  • Data Collection Methods

  • Observational vs. Experimental Studies

  • Random Sampling

Page 3: Observational Studies vs. Designed Experiments

  • Learning Objectives

    • Distinguish between observational studies and experiments.

    • Explain various types of observational studies.

Page 4: Observational Studies and Experiments

  • Observational Study

    • Researchers observe and measure individuals without attempting to influence the response or explanatory variables.

  • Designed Experiment

    • Researchers assign individuals to groups, deliberately change the explanatory variable, and record the response.

Page 5: Example - Cellular Phones and Brain Tumors

  • Context: Study of mobile phone use and brain tumors with 791,710 women over 7 years.

  • Key Finding: No significant difference in tumor incidence between phone users and non-users (Source: Benson et al., 2013).

Page 6: National Toxicology Program Study

  • Investigated radio-frequency radiation (RFR) and brain tumors using rats in controlled environments:

    • Three groups: control (no RFR), GSM-modulated RFR, CDMA-modulated RFR.

    • Findings: Low tumor incidence in exposed rats; results not statistically significant.

Page 8: Research Variables

  • Response Variable: Brain cancer occurrence.

  • Explanatory Variable: Level of cell phone usage.

  • Aim is to see how the explanatory variable impacts the response variable.

Page 9: Observational Study Definition

  • No influence on response or explanatory variables; behavior is simply observed.

Page 11: Flu Shots Example

  • Longitudinal study of 36,000 seniors regarding flu shot effectiveness.

    • Findings: Flu shots associated with reduced hospitalization and mortality from pneumonia/influenza (Source: Nichol et al., 2007).

Page 13: Confounding in Studies

  • Definition: Confounding occurs when the effects of two or more explanatory variables are not separated, so an observed relation may not be due to the variable being studied.

  • Lurking Variables: Explanatory variables that were not considered in the study but still affect the response variable.
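
A small simulation can show how a lurking variable creates an association with no causal link behind it. This is a hedged sketch with made-up numbers: the variable names (`healthy`, `exposed`, `good_outcome`) and all probabilities are assumptions, not data from any study mentioned in these notes.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed (arbitrary) for repeatable output

records = []
for _ in range(40_000):
    healthy = rng.random() < 0.5                        # lurking variable
    exposed = rng.random() < (0.8 if healthy else 0.2)  # explanatory variable
    # The response depends only on the lurking variable, never on exposure.
    good_outcome = rng.random() < (0.9 if healthy else 0.5)
    records.append((healthy, exposed, good_outcome))

def outcome_gap(rows):
    """Good-outcome rate among the exposed minus the rate among the unexposed."""
    exposed = [1.0 if good else 0.0 for _, exp, good in rows if exp]
    unexposed = [1.0 if good else 0.0 for _, exp, good in rows if not exp]
    return statistics.fmean(exposed) - statistics.fmean(unexposed)

overall_gap = outcome_gap(records)
gap_healthy = outcome_gap([r for r in records if r[0]])
gap_unhealthy = outcome_gap([r for r in records if not r[0]])

print(f"gap ignoring the lurking variable: {overall_gap:.3f}")
print(f"gap among healthy individuals    : {gap_healthy:.3f}")
print(f"gap among unhealthy individuals  : {gap_unhealthy:.3f}")
```

The overall gap is large even though exposure has no effect by construction. Once the groups are compared within each level of the lurking variable, the gap is close to zero.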

Page 15: Causation vs. Association

  • Observational studies reveal association, not causation.

Page 18: Types of Observational Studies

  • Cross-sectional Studies: Information collected at one point in time.

  • Case-control Studies: Retrospective studies that compare individuals who have a certain characteristic (cases) with similar individuals who do not (controls).

  • Cohort Studies: Prospective, following a group over time to collect data on characteristics.

Page 20: Census

  • Defined as a list of all individuals in a population and their characteristics.

Page 21: Web Scraping

  • Automated extraction of data from websites. It relies on publicly available data and raises ethical considerations.
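
As a rough illustration of the extraction step, the sketch below pulls table cells out of an HTML snippet using only the Python standard library. The HTML string, the city names, and the class name `CellCollector` are hypothetical; no real website is contacted.

```python
from html.parser import HTMLParser

class CellCollector(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Hypothetical page content standing in for a downloaded web page.
html = """<table>
<tr><th>City</th><th>Population</th></tr>
<tr><td>Springfield</td><td>30000</td></tr>
<tr><td>Shelbyville</td><td>25000</td></tr>
</table>"""

parser = CellCollector()
parser.feed(html)
print(parser.rows)
```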

Page 24: Simple Random Sampling Definition

  • Definition: A sample of size n from a population of size N is a simple random sample if every possible sample of size n has an equal chance of being selected.

Page 25: Sample Size Consideration

  • The sample size (n) must be smaller than the population size (N), because a sample is a subset of the population.

Page 27: Simple Random Sampling Example

  • Scenario: Selecting three friends from six for a concert.

    • There are C(6, 3) = 20 possible groups of three, and under simple random sampling each group is equally likely (probability 1/20).
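
The count for this scenario can be verified by listing every possible group. In this sketch the friend labels A through F are placeholders, since the notes do not name the friends.

```python
from itertools import combinations
from math import comb

friends = ["A", "B", "C", "D", "E", "F"]  # placeholder labels for the six friends

# Every possible sample of three friends, ignoring order.
samples = list(combinations(friends, 3))
num_samples = len(samples)

# Samples that include one particular friend, here "A".
samples_with_a = [s for s in samples if "A" in s]

print(num_samples)                         # 20, the same as comb(6, 3)
print(1 / num_samples)                     # chance of any one specific group: 0.05
print(len(samples_with_a) / num_samples)   # chance that "A" is selected: 0.5
```

Each of the 20 groups has the same probability, which is what makes the selection a simple random sample.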

Page 31: Sampling Techniques

  • Without Replacement: Once an individual is selected, that individual cannot be selected again for the same sample.

  • With Replacement: A selected individual is returned to the population and can be selected again for the same sample.
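
In Python's standard library, the two schemes correspond to `random.sample` (without replacement) and `random.choices` (with replacement). The population of ten numbered individuals and the seed below are assumptions for illustration.

```python
import random

rng = random.Random(42)          # fixed seed (arbitrary) for repeatable output
population = list(range(1, 11))  # hypothetical individuals numbered 1 to 10

# Without replacement: no individual can appear twice in the sample.
without_replacement = rng.sample(population, k=5)

# With replacement: the same individual may appear more than once.
with_replacement = rng.choices(population, k=5)

# With replacement, the sample can even be larger than the population,
# in which case repeats are unavoidable.
oversized = rng.choices(population, k=15)

print(without_replacement)
print(with_replacement)
print(len(set(oversized)), "distinct individuals in a sample of", len(oversized))
```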

Page 41: Cluster Sampling

  • Randomly select entire clusters (naturally occurring groups of individuals) and survey every member of each selected cluster.
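
A minimal sketch of cluster sampling follows. The city blocks, the resident labels, and the helper name `cluster_sample` are hypothetical and serve only to show the mechanics.

```python
import random

def cluster_sample(clusters, num_clusters, rng):
    """Randomly pick whole clusters and return every member of each one."""
    chosen = rng.sample(sorted(clusters), k=num_clusters)
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical population grouped into four city blocks.
clusters = {
    "block_1": ["p01", "p02", "p03"],
    "block_2": ["p04", "p05"],
    "block_3": ["p06", "p07", "p08", "p09"],
    "block_4": ["p10", "p11", "p12"],
}

rng = random.Random(7)  # fixed seed (arbitrary) for repeatable output
chosen, members = cluster_sample(clusters, num_clusters=2, rng=rng)

print("selected clusters:", chosen)
print("surveyed individuals:", members)
```

Randomness enters only at the cluster level. Within a selected cluster nobody is skipped, so the sample size depends on which clusters are drawn.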

Page 47: Types of Sampling

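  • Comparison of stratified, systematic, and cluster sampling:

    • Stratified: Divide the population into non-overlapping, homogeneous groups (strata) and take a simple random sample from each.

    • Systematic: Select every k-th individual from the population, starting from a randomly chosen individual.

    • Cluster: Randomly select entire groups (clusters) and include every individual in the selected clusters.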
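
To make the contrast concrete, here is a hedged sketch of stratified and systematic sampling. Stratified sampling draws a random sample from every group, and systematic sampling takes every k-th individual after a random start. The strata, the frame of 100 numbered individuals, and both helper names are assumptions for illustration.

```python
import random

def stratified_sample(strata, per_stratum, rng):
    """Take a simple random sample of per_stratum individuals from every stratum."""
    return {
        name: rng.sample(members, k=per_stratum)
        for name, members in strata.items()
    }

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k individuals, then take every k-th."""
    start = rng.randrange(k)
    return frame[start::k]

rng = random.Random(3)  # fixed seed (arbitrary) for repeatable output

# Hypothetical strata: two class years of different sizes.
strata = {
    "first_year": [f"F{i}" for i in range(1, 41)],
    "senior": [f"S{i}" for i in range(1, 21)],
}
# Hypothetical frame: 100 individuals numbered in list order.
frame = list(range(1, 101))

stratified = stratified_sample(strata, per_stratum=5, rng=rng)
systematic = systematic_sample(frame, k=10, rng=rng)

print(stratified)
print(systematic)
```

Stratified sampling guarantees that every stratum is represented, while systematic sampling needs only an ordered list and a step size.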

Page 48: Bias in Sampling

  • Sources of Bias

    • Sampling Bias: The technique used to select the sample favors one part of the population over another.

    • Nonresponse Bias: Individuals selected for the sample who do not respond hold different opinions from those who do.

    • Response Bias: Answers do not reflect the respondent's true feelings.

Page 60: Addressing Response Bias

  • Suggested considerations include:

    • Interviewer Error: A well-trained interviewer is more likely to elicit accurate responses.

    • Misrepresented Answers: Respondents may not answer truthfully.

    • Wording and Order of Questions: Biased phrasing, leading questions, or the order of questions can skew results.
