GPH 380E Foundations of Biostatistics and Epidemiology - Lecture 2
Lecture Goals
Understand the epidemiologic transition and shifting disease patterns over time.
Identify factors contributing to the re-emergence of infectious diseases (e.g., antibiotic resistance, globalization, climate change, vaccine hesitancy).
Compare population-based vs. high-risk approaches to disease prevention and when each is most effective.
Recognize the role of epidemiology in public health and clinical decision-making.
Learn from historical examples of epidemiology-driven prevention (e.g., John Snow, Semmelweis, Jenner).
Epidemiology
Definition: Epidemiology is the study of how diseases or health characteristics are distributed in populations and the factors that influence or determine this distribution.
Premise: Disease, illness, ill health, and health status are not randomly distributed in human populations.
Each person has characteristics that predispose to or protect against various diseases.
Domains involved:
Biological
Environmental
Behavioral
Objectives of Epidemiology
Identify Etiology – the cause of a disease and its relevant risk factors; how a disease is transmitted from person to person.
Determine the extent of the disease – its burden in the community.
Study the natural history and prognosis – how virulent (lethal) is the disease? The goal is to quantify its impact.
Evaluate existing and newly developing preventative and therapeutic measures and modes of health care delivery.
Provide the foundation for developing public policy relating to environmental problems, genetic issues, and other social and behavioral considerations regarding disease prevention and health promotion.
Changing Patterns
A major role of epidemiology is to provide clues to how health problems in the community change over time.
1900 death patterns (leading causes): Pneumonia and influenza, Tuberculosis, Diarrhea and enteritis, Heart disease, Stroke, Kidney disease, Accidents, Cancer, Senility, Diphtheria.
2020 death patterns (leading causes): Heart disease, Cancer, COVID-19, Unintentional injuries, Stroke, Chronic lower respiratory disease, Alzheimer’s disease, Diabetes, Influenza and pneumonia, Kidney disease.
Death rates are presented as rates per 100,000 population.
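To make the "per 100,000" convention concrete, here is a minimal Python sketch. The function name `rate_per_100k` and the counts are hypothetical, chosen only for illustration; they are not taken from the lecture figures.

```python
def rate_per_100k(deaths: int, population: int) -> float:
    """Crude death rate expressed per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population


# Hypothetical counts, for illustration only (not from the lecture).
example_rate = rate_per_100k(deaths=450, population=250_000)
print(f"{example_rate:.1f} deaths per 100,000")
```

Expressing deaths per 100,000 lets populations of different sizes (or the same population in different years) be compared on a common scale.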
Changing Patterns (Continued)
In many developing regions, the pattern of disease today resembles that of the U.S. in 1900: infectious diseases are the leading causes of death.
As countries industrialize, mortality patterns shift toward those seen in developed countries.
Even in industrialized nations, infectious diseases are re-emerging as major public health problems (examples: HIV, COVID-19, Measles, Tuberculosis).
Re-Emergence of Infectious Disease
Antibiotic Resistance
Overuse and misuse of antibiotics in humans and animals select for resistant strains (e.g., multidrug-resistant tuberculosis, MRSA).
Limited development of new antibiotics complicates treatment.
Globalization and Increased Travel
Rapid international travel spreads infections quickly (e.g., COVID-19, Ebola).
Urbanization and high population density facilitate transmission.
Climate Change and Environmental Factors
Rising temperatures and changing ecosystems expand habitats of vectors (mosquitoes) for malaria, dengue, Zika.
Natural disasters and environmental degradation increase exposure to pathogens.
Zoonotic Spillover and Wildlife Contact
Deforestation and habitat destruction increase contact with wildlife (SARS, MERS, COVID-19).
Global wildlife/food animal trade contributes to pathogen transmission.
Vaccine Hesitancy and Declining Immunization Rates
Misinformation, distrust in public health, and fears about vaccines contribute to outbreaks (e.g., measles).
Breakdown of Public Health Infrastructure
War, political instability, and underfunded health systems weaken surveillance and response.
Inadequate sanitation and clean water in developing regions contribute to diarrheal diseases and cholera.
Emerging and Evolving Pathogens
New or mutated pathogens arise due to natural genetic changes (e.g., new influenza strains, COVID-19 variants).
TB Outbreak in Kansas; Measles Outbreak in Texas; Legionnaires’ Disease Cluster in Central Harlem
These pages highlight real-world outbreaks used to illustrate epidemiologic investigation and response.
Details vary by outbreak. The slides report confirmed cases and hospitalizations and, for Harlem, counts of cases, deaths, and hospitalizations along with the time from exposure to symptom onset.
Legionnaires’ cluster (Central Harlem, ZIP codes 10027, 10030, 10035, 10037, 10039):
Risk to most people in ZIP codes is low.
If symptomatic, seek care; typical symptoms include cough, fever, chills, muscle aches, shortness of breath.
As of the reporting date shown on the slide (not recorded in these notes), counts included: 113 confirmed cases, 6 deaths, 7 hospitalized; symptoms usually develop 2 to 10 days after exposure (up to 14 days in some cases).
Public health contact numbers and resources are provided.
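The 2 to 10 day window between exposure and symptom onset can be used to work backward from a case's onset date to a plausible exposure period. The sketch below assumes a hypothetical function name (`exposure_window`) and a made-up onset date; only the 2, 10, and 14 day figures come from the notes above.

```python
from datetime import date, timedelta


def exposure_window(onset: date, min_days: int = 2, max_days: int = 10) -> tuple[date, date]:
    """Return (earliest, latest) plausible exposure dates for a symptom-onset date."""
    if min_days < 0 or min_days > max_days:
        raise ValueError("require 0 <= min_days <= max_days")
    earliest = onset - timedelta(days=max_days)
    latest = onset - timedelta(days=min_days)
    return earliest, latest


# Hypothetical onset date, for illustration only.
onset_date = date(2030, 3, 15)
print(exposure_window(onset_date))               # usual 2 to 10 day window
print(exposure_window(onset_date, max_days=14))  # allowing up to 14 days
```

Investigators can then ask each case where they were during that window and look for locations shared across cases.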
Epidemiology and Prevention
Major use: Identify subgroups at high risk for disease to direct preventive efforts (e.g., screening programs for early disease detection).
Identifying risk factors to modify: Some factors are modifiable, others non-modifiable.
Types of characteristics:
Modifiable
Non-modifiable
Three Types of Prevention
| Type | Definition | Examples |
| Primary | Preventing the initial development of a disease | Immunization, reducing exposure to a risk factor |
| Secondary | Early detection of existing disease to reduce severity and complications | Screening for cancer |
| Tertiary | Reducing the impact of the disease | Rehabilitation for stroke |
Two Approaches to Prevention
Population-based approach: Preventive measures applied broadly to the entire population.
Example: Dietary advice for coronary disease prevention or anti-smoking campaigns via mass media and health education.
High-risk approach: Preventive measures targeted to high-risk groups.
Example: Screening cholesterol in children for high-risk families.
Question for reflection: In which health issues might one approach be more effective than the other?
Epidemiology and Clinical Practice
Epidemiology informs clinical decision-making; medicine relies on population data.
Example: Interpreting a heart murmur requires correlating clinical findings with pathology/autopsy data from large patient groups to determine diagnosis (e.g., mitral regurgitation).
Epidemiologic Approach and Disease Causation
Multistep reasoning:
Determine if a statistical association exists between exposure or a characteristic and the disease.
If an association exists, assess whether it is causal; not all associations imply causation.
The goal is to derive appropriate inferences about possible causal relationships from observed associations.
Analytic Epidemiology (Example Data)
Gonorrhea rates by state, United States and Territories, 2011 and 2020 (illustrative): differences across states and over time may reflect exposure patterns, reporting, and demographic factors.
Dental Caries and Fluoride (Illustrative Relationship)
Observational relationship between fluoride content in public water supply and dental caries experience in children.
Data from communities with varying natural fluoride levels show lower caries experience where fluoride content is higher. In the Newburgh vs. Kingston comparison (Newburgh's water supply was fluoridated; Kingston's was not), caries was measured as decayed, missing, and filled teeth (DMF) per 100 children, stratified by age group (6-9, 10-12, 13-14, 15-16).
From Observation to Preventive Actions
Semmelweis – Handwashing to prevent childbed fever (mid 19th century).
Jenner – Vaccination against smallpox (late 18th century).
Snow – Epidemiological investigation of cholera (mid 19th century).
These preventive actions emerged from observations even when complete mechanistic details were not known.
Triad of Disease
Human susceptibility is determined by a mix of factors including genetic, behavioral, nutritional, and immunologic characteristics.
Components:
AGENT
HOST
VECTOR
ENVIRONMENT
Modes of Transmission
Direct transmission: Person-to-person contact (physical touch, respiratory droplets, sexual contact).
Indirect transmission:
Common vehicle
Single exposure (one-time contamination event, e.g., Salmonella in a contaminated batch of food)
Multiple exposures (several contamination events across time)
Continuous exposure (ongoing contamination, e.g., contaminated water supply)
Vector transmission: Insect vectors such as mosquitoes or ticks.
The Iceberg Concept
Diseases exist on a spectrum from subclinical (hidden) to clinical (visible).
Iceberg: Only a small portion of cases are clinically apparent; many infections remain subclinical.
Subclinical cases matter because individuals can transmit pathogens without symptoms (e.g., polio in pre-vaccine era; SARS-CoV-2 asymptomatic spread).
Chronic noncommunicable diseases can also be subclinical for long periods before diagnosis.
Implications: Disease Surveillance and Control
Relying only on clinically diagnosed cases underestimates disease burden.
Public health strategies must account for subclinical cases via testing, surveillance, and preventive interventions (e.g., asymptomatic testing, screening for chronic diseases).
Distribution of Clinical Severity
Severity depends on virulence and site of infection.
Localized infections (e.g., influenza in respiratory tract) may be mild to moderate but highly transmissible.
Systemic infections (e.g., rabies) often lead to severe or fatal disease due to widespread organ damage.
Conceptual classes (illustrative):
CLASS A: INAPPARENT INFECTION FREQUENT (e.g., tubercle bacillus)
CLASS B: CLINICAL DISEASE FREQUENT; FEW DEATHS (e.g., measles virus)
CLASS C: INFECTIONS USUALLY FATAL (e.g., rabies virus)
Clinical vs. Nonclinical
Clinical disease: defined by signs and symptoms.
Nonclinical (inapparent) disease: no signs or symptoms are evident. Subcategories:
Preclinical disease: not yet clinically apparent.
Subclinical disease: not clinically apparent and not destined to become clinical.
Persistent (chronic) disease: infection persists for years.
Latent disease: infection with no active multiplication of the agent (e.g., varicella-zoster virus lying dormant before it reactivates as shingles).
Carrier Status
An individual can harbor a pathogen without clinical illness and without serologic evidence (no antibodies).
Can still transmit infection, often at lower rates than symptomatic individuals.
Carrier status can be temporary or chronic (months or years).
Typhoid Mary
Mary Mallon (1869–1938) carried Salmonella typhi and caused multiple typhoid outbreaks while working as a cook.
She was quarantined on North Brother Island for years; authorities documented she shed typhoid bacilli and potentially spread infection without being ill herself.
This case illustrates asymptomatic carriage and the public health implications of carrier states.
Endemic, Epidemic, Pandemic
Endemic: A disease habitually present in a geographic area at a stable level (e.g., malaria in sub-Saharan Africa).
Epidemic: A sudden increase in cases in a community/region beyond normal expectations (e.g., mpox outbreak in the U.S., 2022).
Pandemic: A worldwide epidemic spanning multiple countries and populations (e.g., COVID-19 in 2020).
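"Beyond normal expectations" implies comparing observed counts with a baseline. The sketch below uses one simple heuristic (baseline mean plus two standard deviations), which is an assumption for illustration and not a rule stated in the lecture; real surveillance systems use a variety of methods. The function name and the weekly counts are hypothetical.

```python
from statistics import mean, stdev


def exceeds_expected(baseline_counts: list[int], observed: int, n_sd: float = 2.0) -> tuple[bool, float]:
    """Flag an observed count that exceeds baseline mean + n_sd standard deviations.

    The mean + 2 SD cutoff is an illustrative assumption, not a formal definition.
    """
    threshold = mean(baseline_counts) + n_sd * stdev(baseline_counts)
    return observed > threshold, threshold


# Hypothetical weekly case counts from comparable past periods.
baseline = [10, 12, 9, 11, 13]
flagged, cutoff = exceeds_expected(baseline, observed=30)
print(f"threshold = {cutoff:.1f}, above expected = {flagged}")
```

A disease with a stable baseline that stays under the cutoff fits the "endemic" description; a count well above it is a signal worth investigating as a possible epidemic.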
Predicting Future Pandemics: The Role of AI
An open-access article discusses the potential role of AI (e.g., ChatGPT) in predicting pandemics and informing prevention strategies.
Key takeaways:
AI is not a substitute for human expertise but an adjunct to support early prediction, prevention, and management of future pandemics.
Disease Outbreak
Outbreak definitions by exposure:
Single Exposure: Contaminated food or water consumed once (e.g., catered lunch).
Multiple Exposures: Repeated consumption of contaminated food (e.g., leftovers served multiple times).
Continuous Exposure: Persistent contamination (e.g., sewage leakage into a water supply).
Common outbreak characteristics:
Explosive nature: Sudden, rapid increase in cases.
Cases limited to those exposed to the source.
Rare secondary transmission: Foodborne outbreaks typically do not spread person-to-person.
Norovirus Outbreak (Foodborne Illness)
Norovirus: Leading cause of foodborne illness in the U.S.
Global impact: Direct healthcare costs around $4.2 billion annually; societal costs around $60.3 billion annually.
Cruise ship outbreaks: Rates have declined due to CDC Vessel Sanitation Program (VSP).
Prevention strategies: Hygiene protocols, surveillance, and regulatory enforcement.
Immunity and Susceptibility
Disease risk in a population depends on the balance between susceptible and immune individuals.
Immunity sources:
Previous infection (antibodies provide protection)
Vaccination (stimulates immunity without illness)
Genetic factors can confer natural resistance.
If everyone is immune, no epidemic can occur.
Herd Immunity
Definition: Resistance of a population to disease due to high levels of immunity among individuals.
Mechanism: Fewer susceptible individuals reduce transmission and protect even unvaccinated individuals (e.g., immunocompromised, small children).
Critical immunity threshold varies by disease.
Example thresholds and durations (illustrative):
Measles: ~94% immunity; duration lifelong
Polio: 80-86%; duration 5-7 years
Smallpox: 80-85%; duration 5-7 years
Rubella: 83-94%; duration ~15-20 years
COVID-19: 83-94%; duration Unknown
Others: ranges shown; some durations are unknown or vary
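As a rough illustration of how these thresholds might be used, the sketch below compares a population's immune fraction with the illustrative ranges listed above. The dictionary and function names are hypothetical, and the ranges are copied from these notes rather than from an authoritative source.

```python
# Illustrative threshold ranges copied from the list above, as fractions.
HERD_THRESHOLDS = {
    "measles": (0.94, 0.94),
    "polio": (0.80, 0.86),
    "smallpox": (0.80, 0.85),
    "rubella": (0.83, 0.94),
    "covid-19": (0.83, 0.94),
}


def compare_to_threshold(disease: str, immune_fraction: float) -> str:
    """Describe where a population's immune fraction falls relative to the listed range."""
    if not 0.0 <= immune_fraction <= 1.0:
        raise ValueError("immune_fraction must be between 0 and 1")
    low, high = HERD_THRESHOLDS[disease.lower()]
    if immune_fraction >= high:
        return "at or above the listed range"
    if immune_fraction >= low:
        return "within the listed range"
    return "below the listed range"


print(compare_to_threshold("measles", 0.90))
print(compare_to_threshold("polio", 0.83))
```

Because the measles threshold is so high, even a modest drop in immunization coverage can push a community below it.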
Incubation Period
Definition: Time between exposure to an infectious agent and onset of symptoms.
Influencing factors:
Time to replicate to a critical mass
Infection site (superficial vs. deep in the body)
Infectious dose (higher doses may shorten incubation)
Contagious Before Symptoms?
Common Cold: Incubation 1-3 days; contagious ~1 day before; total contagious period up to 10 days
Flu: Incubation 1-4 days; contagious before symptoms in some cases; total contagious period ~5-7 days
COVID-19 (Original): Incubation ~6.5 days; contagious ~1 day before symptoms; total contagious period ~5-7 days to 10+ days
COVID-19 (Omicron): Incubation 3-4 days; contagious ~1-2 days before; total contagious period ~5-7 days
Attack Rate
Definition: Measures the proportion of people at risk who develop a disease after exposure.
Formula:
\text{Attack Rate} = \frac{\text{Number of people at risk who develop illness}}{\text{Total number of people at risk}}
Use: Compare risk across groups with different exposures; can be specific to an exposure (food-specific attack rate):
\text{Food-Specific Attack Rate} = \frac{\text{Number of people who ate the food and became ill}}{\text{Total number of people who ate the food}}
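The two formulas above share the same structure, so a single helper covers both. This is a minimal sketch; the function name `attack_rate` and all counts are hypothetical and used only for illustration.

```python
def attack_rate(ill: int, at_risk: int) -> float:
    """Proportion of people at risk who became ill."""
    if at_risk <= 0:
        raise ValueError("at_risk must be positive")
    if not 0 <= ill <= at_risk:
        raise ValueError("ill must be between 0 and at_risk")
    return ill / at_risk


# Hypothetical counts for one food item, for illustration only.
rate_ate = attack_rate(ill=30, at_risk=40)      # people who ate the food
rate_not_ate = attack_rate(ill=5, at_risk=50)   # people who did not eat it
print(f"ate: {rate_ate:.1%}, did not eat: {rate_not_ate:.1%}")
```

A food-specific attack rate is simply the same calculation restricted to people who ate that food; comparing it with the rate among those who did not eat it points to likely sources.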
Secondary Attack Rate (SAR)
Definitions:
Primary Case: First person to become ill after exposure (e.g., contaminated food).
Secondary Case: Individual infected from a primary case (person-to-person spread).
Formula:
\text{SAR} = \frac{\text{New cases among contacts of primary case}}{\text{Total number of susceptible contacts}}
Applications:
Infectious diseases (e.g., household spread of flu or COVID-19).
Noninfectious diseases (e.g., secondhand tobacco smoke exposure leading to lung cancer in non-smokers).
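The secondary attack rate formula can be sketched in Python as follows. The household records and field names are hypothetical; the sketch assumes that primary cases and already-immune contacts are excluded from the denominator, consistent with "susceptible contacts" in the formula.

```python
def secondary_attack_rate(households: list[dict]) -> float:
    """SAR = new cases among contacts / susceptible contacts, pooled over households."""
    secondary_cases = 0
    susceptible_contacts = 0
    for h in households:
        # Contacts exclude the primary case(s) and anyone already immune.
        susceptible_contacts += h["members"] - h["primary_cases"] - h["immune_contacts"]
        secondary_cases += h["secondary_cases"]
    if susceptible_contacts <= 0:
        raise ValueError("no susceptible contacts")
    return secondary_cases / susceptible_contacts


# Hypothetical households, for illustration only.
households = [
    {"members": 4, "primary_cases": 1, "immune_contacts": 0, "secondary_cases": 2},
    {"members": 5, "primary_cases": 1, "immune_contacts": 1, "secondary_cases": 1},
    {"members": 3, "primary_cases": 1, "immune_contacts": 0, "secondary_cases": 0},
]
print(f"SAR = {secondary_attack_rate(households):.1%}")
```

Leaving primary cases out of both numerator and denominator is what distinguishes the SAR from an overall attack rate.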
Exploring Occurrence of Disease
Key questions in disease investigation:
Who? Which populations are affected? (age, sex, behavior, genetics)
When? Does the disease follow a seasonal or annual pattern?
Where? Are cases clustered geographically?
These questions help identify risk patterns, transmission routes, and prevention strategies.
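In practice, the who, when, and where questions are often answered by tallying a case line list along each dimension. The sketch below uses a small, entirely hypothetical line list and field names (`age_group`, `county`, `onset_month`) to show the idea.

```python
from collections import Counter

# Hypothetical line list: one record per case, for illustration only.
cases = [
    {"age_group": "0-4", "county": "A", "onset_month": "2030-01"},
    {"age_group": "0-4", "county": "A", "onset_month": "2030-01"},
    {"age_group": "65+", "county": "B", "onset_month": "2030-02"},
    {"age_group": "20-39", "county": "A", "onset_month": "2030-01"},
    {"age_group": "0-4", "county": "C", "onset_month": "2030-03"},
]


def tally(case_list: list[dict], field: str) -> Counter:
    """Count cases by one descriptive characteristic (person, place, or time)."""
    return Counter(case[field] for case in case_list)


print("Who:  ", tally(cases, "age_group").most_common())
print("Where:", tally(cases, "county").most_common())
print("When: ", tally(cases, "onset_month").most_common())
```

Raw counts like these are only a first look; comparing groups fairly requires dividing by the population at risk in each group.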
Who is Affected?
Host characteristics influence disease risk:
Age: e.g., COVID-19 deaths highest in 75+; Pertussis more severe in infants; peaks in teens for some infections.
Sex: e.g., Gonorrhea rates higher in males; underreporting in females due to asymptomatic infections.
Behavioral Risk Factors: Smoking, diet, sexual behaviors, hygiene.
Where Does Disease Occur? (Place)
Disease is not randomly distributed in space.
Geographic clustering examples:
Lyme disease is highly concentrated in the Northeast, North-Central, and Pacific Coast regions of the U.S.
West Nile Virus was first detected in the U.S. in NYC (1999) and then spread across the country.
Vector-borne diseases depend on environment:
Lyme follows deer tick vectors; West Nile spreads via a mosquito–bird cycle.
Mapping cases aids outbreak tracking and control.
Cross-Tabulation in Epidemiology
Definition: A method to determine which of multiple possible exposures is linked to an outbreak.
Often used in foodborne illness investigations.
Example: Foodborne Streptococcal outbreak (Dade County Jail, 1974) – 325 of 690 inmates developed group A beta-hemolytic streptococcal pharyngitis; questionnaire linked beverages and egg salad sandwiches to illness; food-specific attack rates calculated.
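As a quick check on the figures above, the overall attack rate implied by 325 ill inmates out of 690 works out as follows (rounded to three decimal places):
\text{Attack Rate} = \frac{325}{690} \approx 0.471 \quad (47.1\%)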
Cross-Tabulation: Findings from Dade County Jail Outbreak
Items consumed vs. illness:
Beverage: 67.8% sick for those who drank vs. 44.0% who did not; P-value < .010
Egg Salad Sandwich: 77.9% sick for those who ate vs. 37.0% who did not; P-value < .001
Higher attack rates among those who consumed beverages and egg salad sandwiches; however, the table does not prove one item caused the outbreak.
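One way to compare the items is to look at the difference and the ratio of attack rates between those who consumed each item and those who did not. The sketch below uses only the percentages quoted above; the difference and ratio are illustrative summaries that are not presented in the lecture itself, and the function name is hypothetical.

```python
def compare_attack_rates(exposed_pct: float, unexposed_pct: float) -> tuple[float, float]:
    """Return (difference in percentage points, ratio) of two attack rates given in percent."""
    if unexposed_pct <= 0:
        raise ValueError("unexposed_pct must be positive to form a ratio")
    return exposed_pct - unexposed_pct, exposed_pct / unexposed_pct


# Attack rates (%) from the cross-tabulation above: (consumed, did not consume).
items = {
    "beverage": (67.8, 44.0),
    "egg salad sandwich": (77.9, 37.0),
}

for name, (consumed, not_consumed) in items.items():
    diff, ratio = compare_attack_rates(consumed, not_consumed)
    print(f"{name}: difference = {diff:.1f} points, ratio = {ratio:.2f}")
```

Egg salad shows the larger gap on both measures, but because many inmates consumed both items, this comparison alone cannot separate their effects.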
Cross-Tabulation: Analyzing Items Together
Example category: drank beverage vs. ate egg salad
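Findings: Among those who did not drink the beverage, eating egg salad was still strongly associated with illness (80.0% vs. 25.0% attack rate). Among those who drank the beverage, the pattern was nearly the same (75.6% vs. 26.4%), so drinking the beverage did not meaningfully change risk once egg salad consumption was accounted for.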
Conclusion: Egg salad was the likely source of the outbreak.
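The reasoning behind this conclusion can be sketched by holding one exposure fixed and varying the other. The code below assumes that 75.6% and 26.4% are the attack rates among beverage drinkers who did and did not eat egg salad, and that 80% and 25% are the corresponding rates among non-drinkers; that cell assignment is an interpretation of the figures quoted in these notes, and the names used are hypothetical.

```python
# Attack rates (%) keyed by (drank_beverage, ate_egg_salad).
# Cell assignment is an assumption based on the figures quoted in the notes.
rates = {
    (True, True): 75.6,
    (True, False): 26.4,
    (False, True): 80.0,
    (False, False): 25.0,
}


def egg_salad_effect(drank_beverage: bool) -> float:
    """Difference in attack rate (points) for egg salad, holding beverage fixed."""
    return rates[(drank_beverage, True)] - rates[(drank_beverage, False)]


def beverage_effect(ate_egg_salad: bool) -> float:
    """Difference in attack rate (points) for the beverage, holding egg salad fixed."""
    return rates[(True, ate_egg_salad)] - rates[(False, ate_egg_salad)]


for drank in (True, False):
    print(f"egg salad effect when drank_beverage={drank}: {egg_salad_effect(drank):+.1f}")
for ate in (True, False):
    print(f"beverage effect when ate_egg_salad={ate}: {beverage_effect(ate):+.1f}")
```

Under these assumptions, egg salad raises the attack rate by roughly 50 points whether or not the beverage was consumed, while the beverage changes it by only a few points in either direction.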
Outbreak Investigation: Why Cross-Tabulation Is Useful
Helps identify primary source in multi-exposure outbreaks.
Allows comparison of attack rates among exposure groups.
Used in both infectious and noninfectious disease research.
Broader Applications of Epidemiology
Infectious vs. Noninfectious Diseases
Infectious causes of chronic diseases: Hepatitis B → Primary liver cancer; HPV → Cervical cancer; Helicobacter pylori → Gastric cancer
Formulas and Key Definitions Summary
Attack Rate:
\text{Attack Rate} = \frac{\text{Number of people at risk who develop illness}}{\text{Total number of people at risk}}
Food-Specific Attack Rate:
\text{Food-Specific Attack Rate} = \frac{\text{Number of people who ate the food and became ill}}{\text{Total number of people who ate the food}}
Secondary Attack Rate:
\text{SAR} = \frac{\text{New cases among contacts of primary case}}{\text{Total number of susceptible contacts}}
Note on Tables and Figures
Several slides referenced figures/images (death rates by year, outbreaks by location, incubation period charts, herd immunity tables). Descriptions above reproduce the core concepts and numerical relationships indicated in the transcript.