Principles of Epidemiology in Public Health Practice - Complete Course Notes
Definition and Principles of Epidemiology
Etymology: The term is derived from the Greek words epi (on or upon), demos (people), and logos (the study of), translating to "the study of what befalls a population."
Formal Definition: Epidemiology is the study of the distribution and determinants of health-related states or events in specified populations, and the application of this study to the control of health problems.
Key Components of the Definition:
Study: Epidemiology is a scientific, data-driven discipline relying on systematic and unbiased approaches to collection, analysis, and interpretation of data. It employs causal reasoning based on hypotheses from biology, behavioral science, physics, and ergonomics.
Distribution: Refers to frequency (number of events and their relationship to population size via rates) and pattern (occurrence by time, place, and person).
Determinants: Factors (causes, risk factors) that bring about a change in a health condition. Epidemiology assumes that illness does not occur at random but arises when the right accumulation of factors exists in an individual.
Health-related states or events: Originally focused on communicable disease epidemics; now includes chronic diseases, injuries, birth defects, occupational health, environmental health, and behaviors (e.g., exercise, seat belt use).
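Specified populations: Clinicians are concerned with the health of the individual patient; epidemiologists are concerned with the collective health of the people in a community or population. In effect, the epidemiologist's "patient" is the community.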
Application: Epidemiology is both science and art, using scientific methods to "diagnose" the community's health and propose practical interventions.
Historical Evolution of Epidemiology
Hippocrates (Circa 400 B.C.): In the essay "On Airs, Waters, and Places," he suggested environmental and host behavioral factors influence disease, moving away from supernatural explanations.
John Graunt (1662): A London haberdasher who was the first to quantify patterns of birth, death, and disease, noting differences between males and females, high infant mortality, and urban/rural differences.
William Farr (1800s): The father of modern vital statistics and surveillance. He systematically collected and evaluated Britain's mortality statistics, reporting findings to authorities and the public.
John Snow (Mid-1800s): The father of field epidemiology. His 1854 investigation of the Golden Square cholera outbreak in London used a "spot map" to link deaths to the Broad Street pump. His second investigation compared mortality between the Lambeth Company (intake upstream from sewage) and the Southwark and Vauxhall Company (intake downstream).
Southwark & Vauxhall Mortality: 5.0 per 1,000 population.
Lambeth Mortality: 0.9 per 1,000 population.
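Taking the two rates above at face value, the size of the difference can be expressed as a ratio. This is a worked illustration added to these notes, not a figure quoted from Snow's report:

```latex
\[
\frac{5.0 \text{ per } 1{,}000}{0.9 \text{ per } 1{,}000} \approx 5.6
\]
```

In other words, mortality among Southwark and Vauxhall customers was roughly five to six times that among Lambeth customers.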
Post-World War II Developments:
Application to chronic disease (e.g., Doll and Hill smoking/lung cancer studies; Framingham Heart Study).
Smallpox eradication (1960s-1970s).
Inclusion of injuries, violence, and molecular/genetic epidemiology (1980s-1990s).
Focus on bioterrorism and biologic warfare (Post-September 11, 2001).
Core Epidemiologic Functions
Public Health Surveillance: Ongoing systematic collection, analysis, interpretation, and dissemination of health data ("information for action").
Field Investigation: Responding to reports of cases or clusters to identify causes and prevent further spread ("shoe leather epidemiology").
Analytic Studies: Using comparison groups to evaluate hypotheses. Components include design (calculating sample sizes), conduct (ethical protocols), analysis (testing for significance), and interpretation.
Evaluation: Assessing the relevance, effectiveness, efficiency, and impact of health services.
Effectiveness: Ability of a program to produce results in the field.
Efficacy: Ability to produce results under ideal conditions.
Efficiency: Producing results with minimum expenditure of time and resources.
Linkages: Working in multidisciplinary teams; field epidemiology is a "team sport."
Policy Development: Providing recommendations based on findings to direct appropriate public health interventions.
The Epidemiologic Approach and Case Definitions
Primary Tasks: Count (cases), Divide (by denominators to find rates), and Compare (rates over time or between groups).
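The count, divide, and compare steps can be sketched in a few lines of Python. This is a minimal illustration only: the neighborhood names, case counts, and populations are hypothetical, and `rate_per` is a helper invented for this example.

```python
def rate_per(cases: int, population: int, multiplier: int = 1_000) -> float:
    """Divide: cases over the population at risk, scaled by a multiplier."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * multiplier


# Count: hypothetical case counts and populations for two neighborhoods.
groups = {
    "Neighborhood A": {"cases": 50, "population": 10_000},
    "Neighborhood B": {"cases": 9, "population": 10_000},
}

# Divide: turn each count into a rate per 1,000 population.
rates = {name: rate_per(g["cases"], g["population"]) for name, g in groups.items()}

# Compare: express one rate relative to the other.
rate_ratio = rates["Neighborhood A"] / rates["Neighborhood B"]

for name, rate in rates.items():
    print(f"{name}: {rate:.1f} per 1,000")
print(f"Rate ratio (A vs. B): {rate_ratio:.1f}")
```

Dividing by the population is what makes the comparison fair: raw counts alone would mislead if the two groups differed in size.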
Case Definition: A set of standard criteria for classifying whether a person has a specific disease or condition.
Clinical Criteria: Confirmatory lab tests, combinations of symptoms (subjective), and signs (objective).
Time/Place/Person: Standardized limits reflecting the scope of an outbreak (e.g., "Resident of Winston-Salem with onset between October and January").
Sensitivity vs. Specificity:
Sensitive (Loose): Used for rare/severe diseases to capture every possible case (e.g., rubella defined as "any generalized rash illness").
Specific (Strict): Used in analytic studies to ensure participants truly have the disease (e.g., requiring positive lab culture for Salmonella).
Types of Variables and Scales of Measurement
Nominal-scale: Categories without numerical ranking (e.g., county of residence, male/female). Categories are qualitative.
Ordinal-scale: Categories that can be ranked, although the intervals between ranks are not necessarily equal (e.g., Stage I-IV cancer).
Interval-scale: Measured in equally spaced units without a true zero (e.g., date of birth).
Ratio-scale: Interval variable with a true zero point (e.g., height, duration of illness, induration in millimeters).
Frequency Distribution: Displays the values a variable can take and the number of persons with each value.
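A frequency distribution is straightforward to build by tallying each value. The sketch below uses Python's `collections.Counter` on a hypothetical list of cancer stages (an ordinal variable); the data are invented for illustration.

```python
from collections import Counter

# Hypothetical ordinal data: cancer stage at diagnosis for 12 patients.
stages = ["I", "II", "II", "III", "I", "IV", "II", "III", "II", "I", "II", "IV"]

counts = Counter(stages)
total = len(stages)

# List categories in their natural rank order, not by frequency.
print(f"{'Stage':<6}{'Count':>6}{'Percent':>9}")
for stage in ["I", "II", "III", "IV"]:
    n = counts[stage]
    print(f"{stage:<6}{n:>6}{n / total * 100:>8.1f}%")
```

For an ordinal variable the rows are kept in rank order so the shape of the distribution stays readable; for a nominal variable any consistent order would do.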
Measures of Central Location
Arithmetic Mean: The average of all values, often called the "center of gravity."
Formula: $\text{Mean} = \bar{x} = \frac{\sum x_i}{n}$
Centering Property: $\sum (x_i - \bar{x}) = 0$
Median: The middle value of a set of data in rank order (50th percentile).
Position Formula: $\frac{n + 1}{2}$
Mode: The value that occurs most often. Distributions can be bimodal (two peaks) or have no mode.
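Midrange: The point halfway between the smallest and largest observations.
Standard Formula: $\frac{\text{Minimum} + \text{Maximum}}{2}$
Age Formula: $\frac{\text{Minimum} + \text{Maximum} + 1}{2}$ (1 is added because age is usually recorded as age at last birthday, so a person recorded at the maximum age may be almost a year older).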
Geometric Mean: The mean of data measured on a logarithmic scale. Used for serial dilutions or assays.
Method A: $\text{Antilog}\left[\frac{\sum \log(x_i)}{n}\right]$
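The measures of central location above can be checked against a small data set. The following sketch assumes hypothetical incubation periods and hypothetical reciprocal titers; the variable names are illustrative, and only the Python standard library is used.

```python
import math
import statistics

# Hypothetical data: incubation periods (days) for 9 cases.
days = [2, 3, 3, 4, 5, 5, 5, 7, 11]
n = len(days)

# Arithmetic mean: sum of the values divided by the number of values.
mean = sum(days) / n

# Centering property: deviations from the mean sum to zero.
centering = sum(x - mean for x in days)

# Median: the value at rank (n + 1) / 2 of the sorted data.
median_position = (n + 1) / 2
median = statistics.median(days)

# Mode: the most frequent value.
mode = statistics.mode(days)

# Midrange (standard formula): halfway between minimum and maximum.
midrange = (min(days) + max(days)) / 2

# Geometric mean (Method A): antilog of the mean of the logs (base 10 here).
titers = [10, 20, 40, 80, 160]  # hypothetical reciprocal titers
geometric_mean = 10 ** (sum(math.log10(x) for x in titers) / len(titers))

print(f"mean={mean}, median={median} (position {median_position}), mode={mode}")
print(f"midrange={midrange}, geometric mean={geometric_mean:.1f}")
```

Any logarithm base gives the same geometric mean as long as the antilog uses the same base; base 10 is chosen here only for readability.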
Measures of Spread
Range: The difference between the maximum and minimum values. Epidemiologists often report it as "from [min] to [max]."
Interquartile Range (IQR): The central portion of the distribution (25th to 75th percentile).
Q1 position $= \frac{n + 1}{4}$
Q3 position $= \frac{3(n + 1)}{4}$
Standard Deviation (SD): Measures how widely observations are distributed around the arithmetic mean.
Variance ($s^2$): $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
SD: $s = \sqrt{s^2}$
Standard Error of the Mean (SEM): Refers to variability in means of repeated samples.
Formula: $\text{SEM} = \frac{\text{SD}}{\sqrt{n}}$
Confidence Interval (CI): A range of values consistent with the data, indicating the precision of an estimate.
95% CI for Mean: $\text{Mean} \pm (1.96 \times \text{SEM})$
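The measures of spread above can be traced step by step on a small data set. This is a sketch under stated assumptions: the data are hypothetical, `value_at_position` is a helper invented here, and linear interpolation for fractional quartile positions is one common convention rather than something these notes specify.

```python
import math

# Hypothetical data: incubation periods (days) for 9 cases, already sorted.
days = [2, 3, 3, 4, 5, 5, 5, 7, 11]
n = len(days)
mean = sum(days) / n


def value_at_position(sorted_values, position):
    """Return the value at a 1-based rank. Fractional ranks are linearly
    interpolated between the two neighboring observations (an assumption)."""
    if position < 1:
        return float(sorted_values[0])
    lower = math.floor(position)
    fraction = position - lower
    if fraction == 0 or lower >= len(sorted_values):
        return float(sorted_values[min(lower, len(sorted_values)) - 1])
    below, above = sorted_values[lower - 1], sorted_values[lower]
    return below + fraction * (above - below)


# Range, reported as "from [min] to [max]".
value_range = (min(days), max(days))

# Interquartile range from the quartile position formulas.
q1 = value_at_position(days, (n + 1) / 4)
q3 = value_at_position(days, 3 * (n + 1) / 4)
iqr = q3 - q1

# Sample variance (n - 1 denominator) and standard deviation.
variance = sum((x - mean) ** 2 for x in days) / (n - 1)
sd = math.sqrt(variance)

# Standard error of the mean and 95% confidence interval for the mean.
sem = sd / math.sqrt(n)
ci_low, ci_high = mean - 1.96 * sem, mean + 1.96 * sem

print(f"range: from {value_range[0]} to {value_range[1]}")
print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"variance={variance:.2f}, SD={sd:.2f}, SEM={sem:.2f}")
print(f"95% CI for mean: {ci_low:.2f} to {ci_high:.2f}")
```

Statistical packages use several different quartile conventions, so results from other software may differ slightly from the position-formula approach shown here.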
Morbidity Frequency Measures
Incidence Proportion (Attack Rate/Risk): Proportion of an initially disease-free population that develops disease over a specific period.
Formula: $\frac{\text{New cases identified during period}}{\text{Population at start of period}} \times 10^n$
Secondary Attack Rate: Measure of transmission within a closed group (e.g., household).
Formula: $\frac{\text{Cases among contacts of primary cases}}{\text{Total number of contacts}} \times 100\%$
Incidence Rate (Person-time Rate): Incorporates time directly into the denominator.
Formula: $\frac{\text{New cases during specified period}}{\sum \text{Time each person was observed}} \times 10^n$
Prevalence: Proportion of persons in a population who have a disease at a specific time (Point Prevalence) or over a period (Period Prevalence). Includes both new and pre-existing cases.
Formula: $\frac{\text{All new and pre-existing cases}}{\text{Population during same period}} \times 10^n$
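The four morbidity measures differ mainly in what goes into the denominator. The sketch below makes that explicit; the function names and all input numbers are hypothetical, chosen only to illustrate the formulas above.

```python
def incidence_proportion(new_cases, population_at_start, multiplier=100):
    """New cases over the disease-free population at the start of the period."""
    return new_cases / population_at_start * multiplier


def secondary_attack_rate(cases_among_contacts, total_contacts):
    """Cases among contacts of primary cases over all contacts, as a percent."""
    return cases_among_contacts / total_contacts * 100


def incidence_rate(new_cases, observed_times, multiplier=1_000):
    """New cases over the summed person-time of observation."""
    return new_cases / sum(observed_times) * multiplier


def prevalence(all_cases, population, multiplier=100):
    """All new and pre-existing cases over the population in the same period."""
    return all_cases / population * multiplier


# Hypothetical outbreak: 30 of 100 picnic attendees became ill.
attack_rate = incidence_proportion(30, 100)

# Hypothetical households: 5 of 20 contacts of primary cases became ill.
sar = secondary_attack_rate(5, 20)

# Hypothetical cohort: 2 new cases over 8 person-years of follow-up.
person_years = [2, 2, 1, 0.5, 2.5]
rate = incidence_rate(2, person_years)

# Hypothetical survey: 40 existing cases among 2,000 residents.
point_prevalence = prevalence(40, 2_000)

print(f"Attack rate: {attack_rate:.1f}%")
print(f"Secondary attack rate: {sar:.1f}%")
print(f"Incidence rate: {rate:.0f} per 1,000 person-years")
print(f"Point prevalence: {point_prevalence:.1f}%")
```

Note that the incidence rate carries units of person-time, whereas the other three measures are proportions.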
Mortality Frequency Measures
Crude Death Rate: $\frac{\text{Total deaths during period}}{\text{Mid-interval population}} \times 10^n$
Cause-Specific Mortality: $\frac{\text{Deaths from specific cause}}{\text{Mid-interval population}} \times 100{,}000$
Age-Specific Mortality: $\frac{\text{Deaths in specific age group}}{\text{Population in same age group}} \times 10^n$
Infant Mortality Rate: $\frac{\text{Deaths among children < 1 year}}{\text{Number of live births reported}} \times 1{,}000$
Neonatal Mortality: Deaths from birth to day 27.
Postneonatal Mortality: Deaths from day 28 up to 1 year.
Maternal Mortality Rate: $\frac{\text{Pregnancy-related deaths}}{\text{Number of live births}} \times 100{,}000$
Case-Fatality Rate: $\frac{\text{Cause-specific deaths among incident cases}}{\text{Number of incident cases}} \times 100\%$
Proportionate Mortality: $\frac{\text{Deaths from a particular cause}}{\text{Total deaths from all causes}} \times 100\%$
Years of Potential Life Lost (YPLL): A measure of premature mortality.
Individual YPLL: $\text{Endpoint (e.g., 65)} - \text{Age at death}$
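Most of the mortality measures share one pattern (numerator over denominator, times a multiplier), while YPLL is a sum over individual deaths. The sketch below illustrates both; all figures are hypothetical, the helper names are invented for this example, and deaths at or after the endpoint are assumed to contribute no years lost.

```python
def frequency_measure(numerator, denominator, multiplier):
    """Generic mortality measure: numerator / denominator x multiplier."""
    return numerator / denominator * multiplier


def ypll(ages_at_death, endpoint=65):
    """Total years of potential life lost before the endpoint.
    Deaths at or after the endpoint are excluded (assumed convention)."""
    return sum(endpoint - age for age in ages_at_death if age < endpoint)


# Hypothetical: 12 infant deaths among 2,000 live births.
infant_mortality = frequency_measure(12, 2_000, 1_000)

# Hypothetical: 4 deaths among 50 incident cases.
case_fatality = frequency_measure(4, 50, 100)

# Hypothetical: 150 deaths from one cause among 1,000 deaths from all causes.
proportionate_mortality = frequency_measure(150, 1_000, 100)

# Hypothetical ages at death; only those under 65 contribute to YPLL.
ages = [22, 47, 64, 70, 81, 3]
total_ypll = ypll(ages)

print(f"Infant mortality rate: {infant_mortality:.1f} per 1,000 live births")
print(f"Case-fatality rate: {case_fatality:.1f}%")
print(f"Proportionate mortality: {proportionate_mortality:.1f}%")
print(f"YPLL before age 65: {total_ypll} years")
```

Because YPLL weights each death by how early it occurred, deaths at young ages dominate the total even when they are few in number.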
Measures of Association and Impact
Risk Ratio (Relative Risk - RR): Compares risk in the exposed group to risk in the unexposed group.
Formula: $\frac{\text{Risk (Exposed)}}{\text{Risk (Unexposed)}}$
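Odds Ratio (OR): Used in case-control studies, where the size of the source population is generally unknown and risks therefore cannot be calculated directly.
Formula (Cross-product): $\frac{ad}{bc}$, where, in the standard two-by-two table, a = exposed cases, b = exposed non-cases (controls), c = unexposed cases, and d = unexposed non-cases (controls).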
Attributable Proportion: Percent of disease among the exposed that is due to the exposure.
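The three measures in this section can all be computed from one two-by-two table. The sketch below uses hypothetical cell counts and invented function names. The notes give no formula for attributable proportion, so the commonly used form, (risk in exposed - risk in unexposed) / risk in exposed x 100%, is assumed here.

```python
def risk_ratio(a, b, c, d):
    """Risk in the exposed (a / (a + b)) over risk in the unexposed (c / (c + d))."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio: ad / bc."""
    return (a * d) / (b * c)


def attributable_proportion(a, b, c, d):
    """Percent of disease among the exposed attributed to the exposure
    (assumed formula; not stated in the notes)."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return (risk_exposed - risk_unexposed) / risk_exposed * 100


# Hypothetical two-by-two table:
#   a = exposed, ill      b = exposed, well
#   c = unexposed, ill    d = unexposed, well
a, b, c, d = 30, 70, 10, 90

rr = risk_ratio(a, b, c, d)
odds = odds_ratio(a, b, c, d)
ap = attributable_proportion(a, b, c, d)

print(f"Risk ratio: {rr:.2f}")
print(f"Odds ratio: {odds:.2f}")
print(f"Attributable proportion: {ap:.1f}%")
```

With these counts the odds ratio is noticeably larger than the risk ratio because the disease is common in the exposed group; the two measures converge as the disease becomes rare.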