TIME, POPULATIONS & DISEASE OCCURRENCE
This set of notes covers key concepts for measuring disease occurrence in populations, with emphasis on definitions, study design terminology (population types), time-at-risk concepts, and common epidemiologic measures (prevalence, incidence, rates, and related calculations).
WHAT IS A POPULATION?
A population is a specific group defined by three dimensions:
Person (e.g., age, sex, race/ethnicity, job)
Place (e.g., country, state, county, city, workplace)
Time (e.g., calendar year, life-course stage such as adolescence or emerging adulthood)
These three dimensions together define the group to study or make inferences about.
WHAT IS A TARGET POPULATION?
Target population: Individuals about whom inferences are to be made.
The goal of epidemiologic research is to identify disease causes in the target population.
Example: The causes of AIDS in men who have sex with men (MSM) in San Francisco and New York in the early 1980s.
Target population is the group for whom you want to draw conclusions.
WHAT IF WE CAN’T STUDY THE ENTIRE TARGET POPULATION?
When enumeration of the entire target population isn’t feasible, identify a group of individuals expected to have the same exposure-disease association as the target population and that can be enumerated.
Source population: the group from which the study population is drawn.
Example: MSM enrolled in health clinics, frequenting other venues, or participating in outreach programs in San Francisco in 1984.
Diagrammatic idea: Target Population ← Source Population ← Study Population (when full enumeration isn’t possible)
TARGET & SAMPLE POPULATIONS
If every individual in the target population can be enumerated and the study is feasible: you may use the full Target Population or select a sample from the Source Population.
Conceptual relation: Target Population = Source Population (if you study everyone) OR Target Population is inferred from a Sample drawn from the Source Population.
STUDY POPULATION
If it is NOT feasible to enumerate and study every individual in the target or source population: you study a Study Population drawn from the Source Population.
Key terms:
Source Population: group from which the study population is drawn
Target Population: group to whom inferences will be made
Study Population: group actually studied in your research
A central concern is representativeness: how accurately the study population represents the target population.
EXAMPLE: STUDY OF TECHNOLOGY-FACILITATED VIOLENCE
Dr. Lambert’s CDC-funded study on digital dating violence among LGBTQIA+ youth aged 13–17 in eight Deep South states.
Recruited 400 youth via social media who completed an online survey.
A purposefully sampled subset of 35 youth participated in an in-depth qualitative interview.
Topics: experiences seeking healthcare, sexual health, mental health, experiences of violence (online and offline; IPV and non-IPV), identity development, impact of COVID-19.
Analysis finding: 320 of the 400 youth reported prior experiences of digital dating abuse.
Questions to classify populations:
What is the study population?
What is the sample population?
What is the target population?
Answers:
Target Population: LGBTQIA+ youth aged 13–17 in eight Deep South states.
Source Population: Youth reachable via social media recruitment (the population from which the sample is drawn).
Study Population: The 400 youth who completed the online survey.
Sample Population: The 35 youth who participated in the qualitative interviews.
Outcome of interest (digital dating abuse): reported by 320 of the 400 participants (320/400 = 80%).
COHORTS
Cohorts are populations of individuals moving through time together.
Components:
Target Population
Source Population
Study Population
Time dimension (t) is central to cohort definitions.
COHORT FOLLOW-UP
Visual: a sequence showing the study population over time with follow-up points.
Key idea: follow-up captures whether individuals experience outcomes over time.
MEASURING THE AMOUNT OF TIME A PERSON IS AT RISK
At risk: an individual who can experience the endpoint of interest.
Individuals may be at risk for different lengths of time.
You need a method to measure, for each individual, the time during which they are at risk.
MEASURING PERSON-TIME AT RISK
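Example setup: Individual 1 enters follow-up at Entry1, exits at Exit1, and contributes time at risk T1 = Exit1 - Entry1.
Total person-time at risk is the sum across individuals:
$$ \text{Total person-time at risk} = \sum_{i=1}^{N} T_i = \sum_{i=1}^{N} (\text{Exit}_i - \text{Entry}_i) $$
This total, summed over all N individuals in the cohort, is the denominator for rate calculations.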
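The summation of person-time can be sketched in a few lines of Python. This is a minimal illustration, assuming follow-up is recorded as entry and exit times in years since the start of the study and that each person is at risk for the whole interval between them. The three individuals and the helper name `person_time` are hypothetical.

```python
def person_time(entry, exit_):
    """Time at risk for one individual: exit time minus entry time."""
    if exit_ < entry:
        raise ValueError("exit time must not precede entry time")
    return exit_ - entry

# Hypothetical (entry, exit) times in years since the start of follow-up.
follow_up = [(0.0, 5.0), (1.0, 4.0), (2.0, 2.5)]

individual_times = [person_time(entry, exit_) for entry, exit_ in follow_up]
total_person_years = sum(individual_times)

print(individual_times)    # [5.0, 3.0, 0.5]
print(total_person_years)  # 8.5
```

Each person contributes only the time they were actually followed, so the third individual adds half a person-year rather than a full count of one.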
DIAGRAMS OF INDIVIDUALS IN POPULATIONS MOVING THROUGH TIME
Key ideas: entry into follow-up, development of the endpoint, censoring events, start and end of follow-up.
Censoring occurs when follow-up ends before the endpoint is observed for some individuals (e.g., administrative end, loss to follow-up, death from other causes).
COHORT TYPE: CLOSED VS OPEN
Closed cohort:
A group of at-risk individuals followed over time with no additions
No exits except for the endpoint of interest
Open (dynamic) cohort:
Individuals enter at different times
Individuals may exit for reasons other than the endpoint of interest
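The two definitions above can be expressed as a simple check. This is a sketch under the strict definitions only (it does not handle the variant with administrative censoring); the function name `classify_cohort`, the record layout, and the example data are hypothetical.

```python
def classify_cohort(records, endpoint="endpoint"):
    """Classify a cohort from (entry_time, exit_reason) pairs.

    Closed: everyone enters at the same time and exits only for the endpoint.
    Open: staggered entry, or at least one exit for another reason.
    """
    entry_times = {entry for entry, _ in records}
    same_entry = len(entry_times) == 1
    only_endpoint_exits = all(reason == endpoint for _, reason in records)
    return "closed" if same_entry and only_endpoint_exits else "open"

# Hypothetical cohorts for illustration.
entered_together = [(0, "endpoint"), (0, "endpoint"), (0, "endpoint")]
staggered_entry = [(0, "endpoint"), (3, "lost to follow-up")]

print(classify_cohort(entered_together))  # closed
print(classify_cohort(staggered_entry))   # open
```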
EXAMPLES OF COHORT TYPES
Closed cohort example: Enter together; exit only for the endpoint of interest (e.g., a birth cohort followed until all have died).
Closed cohort with administrative censoring: Enter together; exit for endpoint or end of follow-up (administrative censoring).
Open cohort example: Participants entering at different times and exiting for various reasons (e.g., CLUE II cohort followed for colorectal cancer diagnosis with censored observations).
MORE ABOUT COHORTS
Closed cohorts (entering together, exiting only for the endpoint) are uncommon in practice but are useful for teaching the concept of disease frequency measures.
COHORT EXAMPLES
Classic cohorts that entered at the same time and exited for endpoint(s):
Framingham Heart Study (1948)
British Doctors Study (1951)
CLUE I, CLUE II (1974, 1989)
Multicenter AIDS Cohort Study (1984, 1987, 2001)
Cohorts entering at different times with varying exit reasons: US cancer incidence, Medicare (age 65+)
WHAT KIND OF COHORT?
Examples to illustrate different entry patterns:
All patients admitted to the NICU during a single week and followed until death or NICU discharge
All UGA students admitted in Fall 2023 and followed until graduation
All 2022-2023 UGA students
PERSON-TIME AT RISK: EQUIVALENCE
Example shown: Individual-by-individual entries with T1, T2, …, T7 summing to total time at risk (e.g., 26 person-years in the example).
Purpose: demonstrate how person-time equivalence captures varying follow-up durations.
PERSON-TIME AT RISK: THE DATA
Data structure typically includes: entry time, exit time, status at end of follow-up, and computed person-time at risk for each individual.
This forms the basis for calculating incidence rates using person-time as the denominator.
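One possible way to hold this data structure in Python is shown below. It is a sketch, not a prescribed format: the class name `FollowUp`, its field names, the two records, and the 365.25 days-per-year conversion are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class FollowUp:
    entry: date       # date the person entered follow-up
    exit: date        # date follow-up ended (endpoint or censoring)
    had_event: bool   # True if the endpoint occurred; False if censored

    @property
    def person_years(self) -> float:
        # Assumed conversion: 365.25 days per year.
        return (self.exit - self.entry).days / 365.25

# Hypothetical records for illustration.
cohort = [
    FollowUp(date(2020, 1, 1), date(2022, 1, 1), had_event=True),
    FollowUp(date(2020, 7, 1), date(2021, 7, 1), had_event=False),
]

events = sum(r.had_event for r in cohort)
total_person_years = sum(r.person_years for r in cohort)

print(events)                       # 1
print(f"{total_person_years:.2f}")  # 3.00
```

Censored individuals still contribute person-time to the denominator; they simply add no event to the numerator.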
WHY CALCULATE PERSON-TIME AT RISK?
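Person-time at risk is the denominator for disease occurrence rates.
Rates are a measure of disease occurrence and are estimated as:
$$ \text{Rate} = \frac{\text{Number of new events}}{\text{Total person-time at risk}} $$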
Units example: per person-year, per 100,000 person-years, etc.
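A rate and its units can be illustrated with a short calculation. This sketch assumes the rate is the number of new events divided by total person-time at risk; the function name `incidence_rate`, the `per` scaling argument, and the figures (12 cases, 4,800 person-years) are hypothetical.

```python
def incidence_rate(new_events, person_time, per=1):
    """New events per unit of person-time at risk, scaled by `per`."""
    if person_time <= 0:
        raise ValueError("person-time at risk must be positive")
    return new_events * per / person_time

# Hypothetical cohort: 12 new cases over 4,800 person-years at risk.
rate = incidence_rate(12, 4800)
rate_per_100k = incidence_rate(12, 4800, per=100_000)

print(rate)           # 0.0025 cases per person-year
print(rate_per_100k)  # 250.0 cases per 100,000 person-years
```

Scaling changes only how the rate is reported; both values describe the same underlying rate.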
MEASURING DISEASE OCCURRENCE
Types of measures: Count, Ratio, Proportion, Rate
Warning: Many epidemiology terms are used incorrectly; clarity about what numerator and denominator represent is essential.
TYPES OF MEASURES
Count: Number of events or cases (e.g., number of food-poisoning cases in two cities).
Ratio: One quantity divided by another quantity (e.g., men to women, oranges to apples).
Proportion: A type of ratio where the numerator is part of the denominator (e.g., infected individuals among a group).
Rate: A ratio in which time is part of the denominator (e.g., events per person-time).
Takeaway: Proportions and counts are not the same as rates; rates incorporate time.
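The four measure types can be compared side by side on one small data set. All numbers below are hypothetical and chosen only to make the distinction between numerators and denominators visible.

```python
# Hypothetical data, for illustration only.
cases_city_a = 30     # cases observed in city A
cases_city_b = 20     # cases observed in city B
group_size = 200      # people in the group that contains city A's 30 cases
person_years = 600.0  # total time at risk contributed by that group

count = cases_city_a                    # count: number of cases
ratio = cases_city_a / cases_city_b     # ratio: numerator is not part of the denominator
proportion = cases_city_a / group_size  # proportion: numerator is part of the denominator
rate = cases_city_a / person_years      # rate: time is part of the denominator

print(count)       # 30
print(ratio)       # 1.5
print(proportion)  # 0.15
print(rate)        # 0.05 cases per person-year
```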
PREVALENCE & INCIDENCE
Prevalence measures the existence of current disease at a point in time or over a period.
Incidence measures the occurrence of new disease events over a period.
Intuition: prevalence reflects burden at a moment; incidence reflects the risk of developing disease over time.
Common relationships: new (incident) cases add to prevalence, while cures and deaths remove existing cases from it.
PREVALENCE
Definition: Prevalence = Number of EXISTING cases / Size of the population
Notes:
Prevalence is a proportion, not a rate (though people often misuse the term “prevalence rate”).
Population of interest (POI) is typically the total population or the population at risk.
POINT VS PERIOD PREVALENCE
Point prevalence: the number of existing cases at a given date divided by the population at that date.
Period prevalence: the number of people who had the disease at any time during a given period (cases existing at the start plus new cases arising during the period) divided by the population size during that period.
Denominator considerations:
If the population changes, defining the denominator for period prevalence is challenging.
Solutions: use mid-point population or average population over the period.
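The denominator choice can be sketched as follows. This example assumes the average population is approximated by the mean of the start-of-period and end-of-period populations, which is only one possible approach; the function names and all figures are hypothetical.

```python
def average_population(start_pop, end_pop):
    """Mean of the start and end populations (assumed approximation)."""
    return (start_pop + end_pop) / 2

def period_prevalence(cases_during_period, denominator):
    """People with disease at any time in the period, divided by the chosen denominator."""
    return cases_during_period / denominator

# Hypothetical population that grows during the period.
start_pop, end_pop = 500_000, 540_000
cases_during_period = 2_600

denominator = average_population(start_pop, end_pop)
prevalence = period_prevalence(cases_during_period, denominator)

print(denominator)           # 520000.0
print(f"{prevalence:.2%}")   # 0.50%
```

Using the start population instead (500,000) would give 0.52%, which shows why the denominator should be stated explicitly when the population changes.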
CALCULATING POINT & PERIOD PREVALENCE (EXAMPLES)
Example setup (simplified):
Jan 1, 2024: 2,000 prevalent cases; population = 1,000,000
By Dec 31, 2024: 1,000 of the prevalent cases cured; 1,000 incident cases cured; 1,000 incident cases still with disease; population = 1,200,000 (year-end) / 1,100,000 (mid-year)
Point prevalence on Jan 1, 2024:
$$ \text{Point Prev}_{\text{Jan 1, 2024}} = \frac{2000}{1000000} = 0.002 = 0.2\% $$
Point prevalence on Dec 31, 2024 (using the cases existing on Dec 31):
If there are 2,000 existing cases and the population is 1,200,000:
$$ \text{Point Prev}_{\text{Dec 31, 2024}} = \frac{2000}{1200000} \approx 0.0017 = 0.17\% $$
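The two point-prevalence figures can be checked with a short calculation. This sketch takes the Jan 1 case count from the numerator of the Jan 1 formula (2,000) and assumes that the cases existing on Dec 31 are the Jan 1 cases not yet cured plus the incident cases still with disease; the variable names are illustrative.

```python
# Figures from the example; the Jan 1 count is the numerator of the Jan 1 formula.
prevalent_jan1 = 2_000
prevalent_cured = 1_000
incident_still_diseased = 1_000
pop_jan1, pop_dec31 = 1_000_000, 1_200_000

# Incident cases that were cured during the year are not existing cases on Dec 31.
existing_dec31 = prevalent_jan1 - prevalent_cured + incident_still_diseased

point_prev_jan1 = prevalent_jan1 / pop_jan1
point_prev_dec31 = existing_dec31 / pop_dec31

print(existing_dec31)              # 2000
print(f"{point_prev_jan1:.2%}")    # 0.20%
print(f"{point_prev_dec31:.2%}")   # 0.17%
```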
Period prevalence for the year ending Dec 31, 2024 (4,000 people with disease at some point during the period; mid-year population 1,100,000):
$$ \text{Period Prev}_{\text{Dec 31, 2024}} = \frac{4000}{1100000} \