Statistical Reasoning Lecture 8

Descriptive Study Designs: Case Series, Cross-Sectional Studies, and Ecological Studies

Pablo Martinez Amezcua, MD, PhD, MHS, Johns Hopkins University


Descriptive Epidemiology
  • Focus on the distribution of disease and its determinants.

  • Analysis of disease patterns based on characteristics:

    • Who is affected?

    • Where does it occur?

    • How does it change over time?

  • Key questions that descriptive epidemiology answers:

    • What is being measured?

    • How is the outcome measured?

  • Uses of descriptive epidemiology:

    • Understanding public health trends

    • Identifying health priorities

    • Informing policy and interventions

  • Source: Lesko, C. R., Fox, M. P., & Edwards, J. K. (2022). A framework for descriptive epidemiology. American Journal of Epidemiology, 191(12), 2063–2070. https://doi.org/10.1093/aje/kwac115

Study Designs Taxonomy
  • Study Type Characteristics:

    • Experimental (RCT): randomized controlled trial.

    • Observational: Includes cohort, case–control, cross-sectional, and ecological studies.

Study Designs: Observational Designs
  • Observational:

    • Investigates associations, prevention, and treatment of diseases.

    • The investigator observes as nature follows its course.

  • Observational Studies Include:

    • Cohort: Follows groups based on exposure.

    • Case–control: Compares individuals with a disease to those without.

    • Cross-sectional: Assesses exposure and outcome simultaneously.

    • Ecological: Analyzes data at the population level.

Descriptive Study Designs
  • Types of Descriptive Study Designs:

    • Case reports and case series

    • Cross-sectional surveys

    • Ecological designs

Case Reports and Case Series
  • Case Reports:

    • Detailed account featuring a single individual (Example: 51-year-old with Starr–Edwards caged-ball valve).

    • Source: Raymundo-Martínez et al., 2020

  • Case Series:

    • A compilation of cases with similar diagnoses (Example: Study on acute kidney injury due to non-steroidal anti-inflammatory drugs).

    • Source: Dixit et al., 2010

Ecological Studies
  • Characteristics:

    • Utilizes population-level data for analysis rather than individual-level data.

    • Investigates associations between exposure and disease at a group level (e.g., incidence of disease based on an environmental factor).

What Are Ecological Studies?
  • Examines disease rates relative to a factor described for a population.

  • Units of analysis include:

    • Aggregated data on populations (e.g., individuals over 65 years)

    • Environmental measures (e.g., levels of air pollution)

    • Geographic locations (e.g., counties)

    • Global measures without direct individual analogs (e.g., population density)

Key Features That Differentiate Ecological Studies
  • Unit of analysis: The population instead of individuals.

  • Exposure status is a population-level property.

  • Often a first step in investigating an association, but subject to the ecological fallacy: the risk that associations observed in group data do not apply to individuals.

Ecological Fallacy
  • An association observed at the group level may not reflect the same association at the individual level (e.g., an association seen with county-level egg consumption does not guarantee the same effect in individuals).
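
To make the fallacy concrete, here is a minimal sketch using entirely made-up numbers (the three counties, the exposure values, and the outcome values are all hypothetical, and the `pearson` helper is written here for illustration). County averages rise together, while within every county the individuals with higher exposure have lower outcome values.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical (exposure, outcome) pairs for individuals in three made-up counties.
counties = {
    "A": [(1, 5), (2, 4), (3, 3)],
    "B": [(4, 8), (5, 7), (6, 6)],
    "C": [(7, 11), (8, 10), (9, 9)],
}

# Ecological view: one (mean exposure, mean outcome) point per county.
mean_exposure = [sum(x for x, _ in people) / len(people) for people in counties.values()]
mean_outcome = [sum(y for _, y in people) / len(people) for people in counties.values()]
group_r = pearson(mean_exposure, mean_outcome)

# Individual view: the correlation among people within each county.
within_r = {
    name: pearson([x for x, _ in people], [y for _, y in people])
    for name, people in counties.items()
}

print("county-level r:", round(group_r, 2))
for name, r in within_r.items():
    print(f"within county {name} r:", round(r, 2))
```

In this constructed data the county-level correlation is strongly positive and every within-county correlation is strongly negative, so a conclusion drawn from the county averages would point in the wrong direction for individuals.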

Why Conduct an Ecological Study?
  • When hypotheses are novel or when individual-level data is inaccessible.

  • Ethical concerns may preclude individual studies.

  • Interest in ecological variables lacking individual-level equivalents.

  • Constraints in time and financial resources limit feasibility of individual-level studies.

Cross-Sectional Studies
  • Characteristics:

    • Assesses both exposure and outcomes at one point in time.

    • Enables measurement of exposure prevalence in relation to disease prevalence.

Design of a Cross-Sectional Study
  • Identifies existing (prevalent) cases at a single point in time, so prevalence is the measure of morbidity that can be estimated.

National Health and Nutrition Examination Survey (NHANES)
  • A program of studies that assesses the health and nutritional status of adults and children in the United States.

  • Combines interviews with physical examinations.

Behavioral Risk Factor Surveillance System (BRFSS)
  • A telephone survey gathering health-related data across the US.

  • Provides insights into health-related risk behaviors and chronic health conditions among American adults, informing public health decisions.

  • Established in 1984, it now covers all 50 states and U.S. territories and completes over 400,000 interviews each year, making it the world's largest continuously conducted health survey system.

Comparisons in a Cross-Sectional Study
  • Differential prevalence between exposed versus nonexposed populations can be assessed using the formulas:

    • For Disease Prevalence: $\frac{a}{a+b}$ vs. $\frac{c}{c+d}$

    • For Exposure Prevalence: $\frac{a}{a+c}$ vs. $\frac{b}{b+d}$

Summary: Key Features of Cross-Sectional Studies
  • Examines exposure and disease prevalence simultaneously.

  • May overrepresent cases of longer duration, because only existing cases are captured and people with long-lasting disease are more likely to be cases at the time of the survey.

  • Cannot infer temporal relationships between exposure and outcomes.

  • Valuable for generating hypotheses for future studies.
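
The overrepresentation of long-duration cases can be illustrated with a small sketch. Everything here is hypothetical: the onset schedule, the two durations (10 and 100 days), and the survey day are invented so that short and long cases begin equally often.

```python
SURVEY_DAY = 500  # hypothetical day on which the cross-sectional survey is taken

# Hypothetical cases: every 5 days, one short-duration and one long-duration case begins,
# so both kinds are equally common among all cases that ever occur.
cases = []
for onset in range(0, 1000, 5):
    cases.append({"onset": onset, "duration": 10, "kind": "short"})
    cases.append({"onset": onset, "duration": 100, "kind": "long"})

def is_prevalent(case, day):
    """True if the case has begun and not yet ended on the given day."""
    return case["onset"] <= day < case["onset"] + case["duration"]

# Cases the survey would actually find on the survey day.
prevalent = [c for c in cases if is_prevalent(c, SURVEY_DAY)]

long_share_all = sum(c["kind"] == "long" for c in cases) / len(cases)
long_share_prevalent = sum(c["kind"] == "long" for c in prevalent) / len(prevalent)

print(f"long-duration share among all cases:       {long_share_all:.2f}")
print(f"long-duration share among surveyed cases:  {long_share_prevalent:.2f}")
```

Although half of all cases in this toy setup are long-duration, they make up most of the cases present on the survey day, which is the pattern the summary bullet describes.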

Study Designs: Summary
  • Experimental (RCT): Randomized controlled trial.

  • Observational: Studies associations; includes cohort, case–control, cross-sectional, and ecological studies.

Lessons Learned
  • Identified features, strengths, and limitations of descriptive study designs.

  • Explained ecological study design, including strengths and limitations and how it intersects with other observational study types.

  • Described cross-sectional study design, emphasizing its strengths, limitations, and how it informs other observational studies based on its data.

Equations in Cross-Sectional Studies

In cross-sectional studies, data is often organized into a 2×2 table to compare exposure and disease status. Let's define the variables:

|               | Disease (Yes) | Disease (No) | Total   |
|---------------|---------------|--------------|---------|
| Exposed (Yes) | a             | b            | a+b     |
| Exposed (No)  | c             | d            | c+d     |
| Total         | a+c           | b+d          | a+b+c+d |

Where:

  • a: Number of individuals who are exposed and have the disease.

  • b: Number of individuals who are exposed and do not have the disease.

  • c: Number of individuals who are not exposed and have the disease.

  • d: Number of individuals who are not exposed and do not have the disease.
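
As a rough sketch of how the four cells could be tallied from individual survey responses, the example below counts hypothetical `(exposed, diseased)` records. The records themselves and the variable names are assumptions made for illustration, not data from the lecture.

```python
from collections import Counter

# Hypothetical survey records: one (exposed, diseased) pair per participant.
records = [
    (True, True), (True, True), (True, False),
    (False, True), (False, False), (False, False), (False, False),
]

counts = Counter(records)
a = counts[(True, True)]    # exposed, has the disease
b = counts[(True, False)]   # exposed, does not have the disease
c = counts[(False, True)]   # not exposed, has the disease
d = counts[(False, False)]  # not exposed, does not have the disease

print("Exposed (Yes):", a, b, "row total", a + b)
print("Exposed (No): ", c, d, "row total", c + d)
print("Column totals:", a + c, b + d, "grand total", a + b + c + d)
```

Because every participant falls into exactly one cell, the grand total equals the number of records.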

1. Disease Prevalence

These formulas calculate the proportion of individuals with the disease within specific exposure groups (exposed vs. non-exposed).

  • Prevalence of disease among the exposed: $\frac{a}{a+b}$

    • Explanation: This represents the proportion of exposed individuals who have the disease.

    • When to use: Use this to determine how common the disease is among those who have been exposed to a particular factor at a specific point in time.

  • Prevalence of disease among the non-exposed: $\frac{c}{c+d}$

    • Explanation: This represents the proportion of non-exposed individuals who have the disease.

    • When to use: Use this to determine how common the disease is among those who have not been exposed to the particular factor at the same point in time. Comparing this to the prevalence among the exposed helps assess a potential association between exposure and disease.
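
A minimal sketch of the two disease prevalence calculations, using hypothetical cell counts (a = 30, b = 70, c = 10, d = 90). The function name is illustrative. The prevalence difference and prevalence ratio at the end are common ways to compare the two groups; they are included here as an assumption about how the comparison might be made, not as something the lecture specifies.

```python
def disease_prevalence(cases, non_cases):
    """Proportion with the disease within one exposure group."""
    total = cases + non_cases
    if total == 0:
        raise ValueError("group has no participants")
    return cases / total

# Hypothetical 2x2 cell counts.
a, b, c, d = 30, 70, 10, 90

prev_exposed = disease_prevalence(a, b)    # a / (a + b)
prev_unexposed = disease_prevalence(c, d)  # c / (c + d)

# Two common comparison measures (assumed here, not named in the lecture).
prevalence_difference = prev_exposed - prev_unexposed
prevalence_ratio = prev_exposed / prev_unexposed

print(f"prevalence among exposed:     {prev_exposed:.2f}")
print(f"prevalence among non-exposed: {prev_unexposed:.2f}")
print(f"prevalence difference:        {prevalence_difference:.2f}")
print(f"prevalence ratio:             {prevalence_ratio:.2f}")
```

A difference away from 0, or a ratio away from 1, suggests an association at the time of the survey, but it says nothing about which came first.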

2. Exposure Prevalence

These formulas calculate the proportion of individuals who are exposed within specific disease status groups (diseased vs. non-diseased).

  • Prevalence of exposure among the diseased: $\frac{a}{a+c}$

    • Explanation: This represents the proportion of individuals with the disease who are also exposed.

    • When to use: Use this to understand how common the exposure is among those who already have the disease. This can be useful for hypothesis generation or further investigation into the characteristics of diseased individuals.

  • Prevalence of exposure among the non-diseased: $\frac{b}{b+d}$

    • Explanation: This represents the proportion of individuals without the disease who are exposed.

    • When to use: Use this to understand how common the exposure is among the healthy population. Comparing this to the prevalence of exposure among the diseased population can provide insights into potential risk factors, though causality cannot be inferred from a cross-sectional study alone.
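
The exposure prevalence calculations can be sketched the same way. The cell counts below are hypothetical (a = 30, b = 70, c = 10, d = 90) and the function name is illustrative; note that the denominators are now the column totals (disease status) rather than the row totals (exposure status).

```python
def exposure_prevalence(exposed, unexposed):
    """Proportion exposed within one disease-status group."""
    total = exposed + unexposed
    if total == 0:
        raise ValueError("group has no participants")
    return exposed / total

# Hypothetical 2x2 cell counts.
a, b, c, d = 30, 70, 10, 90

exposure_among_diseased = exposure_prevalence(a, c)      # a / (a + c)
exposure_among_non_diseased = exposure_prevalence(b, d)  # b / (b + d)

print(f"exposure prevalence among diseased:     {exposure_among_diseased:.4f}")
print(f"exposure prevalence among non-diseased: {exposure_among_non_diseased:.4f}")
```

With these made-up counts, exposure is more common among people with the disease than among those without it, which could motivate a follow-up study but does not by itself establish causality.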