Lecture on Case Series and Case-Control Studies

Objectives

Understand basic features of case series studies
- Design and Purpose: Case series studies involve observing and describing a group of individuals with a common outcome, such as a specific disease or procedure. Their primary purpose is to generate hypotheses about potential causes or characteristics of a disease, describe the clinical course, and report on new conditions or unusual presentations, often serving as an early alert system in epidemiology.
- How Data Are Typically Analyzed: Data are primarily analyzed using descriptive statistics, including frequencies (counts), percentages, means, medians, and ranges. The focus is on characterizing the prevalence of symptoms, signs, demographic features, and outcomes within the described group.
- Key Sources of Bias: The main limitation is the absence of a comparison group, which makes it impossible to establish causality or quantify associations. Generalizability is a separate concern: selection bias occurs when the cases are not representative of all individuals with the condition, so the observed characteristics may not apply to the broader population with the disease.
Understand basic features of case-control studies
- Design and Purpose: Case-control studies are retrospective observational studies that compare individuals with a disease (cases) to individuals without the disease (controls) to identify risk factors or exposures that differ between the two groups. They are particularly useful for studying rare diseases or when a long latency period exists between exposure and outcome.
- How Data Are Typically Analyzed: The primary measure of association is the Odds Ratio (OR), calculated from a 2x2 contingency table. The OR quantifies the odds of exposure among cases compared to controls, estimating the relative risk when the disease is rare. Conditional logistic regression is used for matched studies.
- Key Sources of Bias: These studies are highly susceptible to recall bias (cases may remember exposures differently than controls), selection bias (improper selection of cases or controls that distorts the exposure-disease relationship), and observer bias (knowledge of disease status influencing data collection).

Case Series

Overview

Case Report: A detailed report that describes the diagnosis, treatment, and follow-up of a single patient, often highlighting unusual or novel aspects of a disease or treatment. These are valuable for early identification of new diseases or adverse drug effects.
Case Series: A report that describes a group of individuals who share a common characteristic, such as having the same disease or disorder, undergoing the same procedure, or experiencing a similar adverse event. It aggregates information from individual case reports to provide a broader picture.

Characteristics of Case Series

Objective: To describe the clinical, demographic, and exposure characteristics of a group of individuals sharing a specific outcome. This helps in understanding the natural history of a disease, identifying potential risk factors, and generating hypotheses for future analytical studies.
Primary Study Question: What are the key characteristics (e.g., demographics, clinical presentation, exposures, outcomes) of the cases included in this study, and what patterns emerge? This question focuses on comprehensive description without attempting to establish causal links.
Population: All individuals included must rigorously meet the same outcome definition (e.g., a specific disease, a particular surgical procedure, or a defined adverse event). Consistency in case definition is crucial for internal validity.
When to Use This Approach: This method is particularly useful in the early stages of investigating a new disease outbreak, identifying unusual clusters of symptoms, or when a strong source of cases (e.g., a hospital registry) is readily available, and obtaining a comparison group is either impractical or not necessary for the descriptive aim.
First Steps:
1. Specify what new and important information the analysis will provide: Clearly articulate the unique contribution or hypothesis that the case series aims to explore.
2. Identify a source of cases: This could be a clinical database, hospital records, public health surveillance systems, or direct patient recruitment.
3. Assign a case definition: Establish clear, objective, and consistent inclusion and exclusion criteria to ensure homogeneity among reported cases.
4. Select the characteristics of the study population that will be described: This includes demographic data, clinical signs and symptoms, diagnostic test results, treatments received, and outcomes.
What to Watch Out For: The primary limitation is the lack of generalizability. Findings from a case series may not be applicable to the wider population, especially if the cases were selected from a specialized clinic or represent extreme presentations of a disease. It cannot establish cause-and-effect relationships.
Key Statistical Measure: Primarily descriptive statistics such as counts, percentages, means, medians, standard deviations, and ranges are employed to summarize the characteristics of the study group.
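
As a minimal sketch of these descriptive summaries, the Python below uses only the standard library on a small invented set of cases. The records, the field names (`age`, `sex`, `fever`), and the helper `summarize` are hypothetical and serve only to illustrate the calculations.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical case records; every value is invented for illustration.
cases = [
    {"age": 34, "sex": "F", "fever": True},
    {"age": 51, "sex": "M", "fever": True},
    {"age": 47, "sex": "F", "fever": False},
    {"age": 62, "sex": "M", "fever": True},
    {"age": 29, "sex": "F", "fever": True},
    {"age": 47, "sex": "F", "fever": False},
]


def summarize(records):
    """Return counts, percentages, and age summaries for a list of cases."""
    n = len(records)
    ages = [r["age"] for r in records]
    sex_counts = Counter(r["sex"] for r in records)
    return {
        "n": n,
        "sex_counts": dict(sex_counts),
        "sex_percent": {k: 100 * v / n for k, v in sex_counts.items()},
        "fever_percent": 100 * sum(r["fever"] for r in records) / n,
        "age_mean": mean(ages),
        "age_median": median(ages),
        "age_sd": stdev(ages),  # sample standard deviation
        "age_range": (min(ages), max(ages)),
    }


summary = summarize(cases)
for name, value in summary.items():
    print(name, value)
```

Nothing here compares the cases with anyone else, which is the point: a case series describes the group it has and stops there.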

Case Definition

Case Definition: A pre-established, clearly defined list of inclusion and exclusion criteria used to systematically classify individuals as having the disease or condition of interest. This ensures uniformity and replicability.
- Sign: An objective, measurable indication of disease that can be clinically observed by an examiner (e.g., an elevated body temperature, a characteristic rash, abnormal blood pressure readings).
- Symptom: A subjective indication of illness experienced and reported by an individual but not directly observable or measurable by others (e.g., pain, fatigue, nausea).
- Lab Test Result: Objective laboratory findings or imaging results that confirm or suggest a diagnosis (e.g., a positive culture for a pathogen, specific genetic markers, abnormal values from a mammogram or a rapid diagnostic test for malaria).
- Diagnosis/Procedure Codes:
  - ICD Codes: International Classification of Diseases (ICD), formally known as the International Statistical Classification of Diseases and Related Health Problems. These are globally recognized codes for diseases and health problems, maintained by the World Health Organization (WHO), crucial for mortality and morbidity statistics.
  - CPT Codes: Current Procedural Terminology codes published by the American Medical Association. These are medical codes used to describe medical, surgical, and diagnostic services and procedures performed by physicians and other healthcare providers.

Sampling Characteristics (PPTs)

PPTs: Characteristics of Person (e.g., age, sex, occupation, socioeconomic status), Place (e.g., geographic location, urban/rural residence, healthcare facility), and Time (e.g., year, season, duration of illness) that guide the selection and recruitment of cases for inclusion in the study (defining the source population). These characteristics help to define the context and generalizability of the findings.
Poor Sampling: If the sampling strategy is poorly defined or executed, resulting cases will not be representative of all individuals with the condition, leading to selection bias and limiting the external validity or generalizability of the study's conclusions.

Sample Case Definitions (Figure 8-2)

Example 1: Whooping Cough (ICD-10 code A37)

Disease/Procedure: Any person with a confirmed case of whooping cough, defined as meeting either of the following criteria:
1. An acute cough of any duration with isolation of Bordetella pertussis from a clinical specimen.
2. A cough lasting 2 or more weeks with paroxysms of coughing, inspiratory "whoop," and/or posttussive vomiting in an individual known to have had contact with a laboratory-confirmed case of pertussis.
Person: Residents of River City whose diagnoses were reported to the River City Health Department.
Time: Patients who first sought clinical care for a cough between January 1 and March 31, 2020.
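
One way to make a case definition like this reproducible is to encode it as a function that returns true or false for each record. The sketch below is illustrative only: the `Patient` fields are hypothetical names, "2 or more weeks" is taken as 14 or more days, and criterion 2 is read as requiring at least one of paroxysms, whoop, or posttussive vomiting.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass
class Patient:
    # All field names are hypothetical; a real abstraction form defines its own.
    resident_of_river_city: bool
    reported_to_health_dept: bool
    first_care_date: date
    cough_days: int
    b_pertussis_isolated: bool
    paroxysms: bool
    whoop: bool
    posttussive_vomiting: bool
    contact_with_confirmed_case: bool


WINDOW_START = date(2020, 1, 1)
WINDOW_END = date(2020, 3, 31)


def meets_disease_criteria(p):
    # Criterion 1: cough of any duration plus isolation of B. pertussis.
    lab_confirmed = p.cough_days >= 1 and p.b_pertussis_isolated
    # Criterion 2 (assumed reading): cough of 14+ days, at least one classic
    # sign, and contact with a laboratory-confirmed case.
    classic_sign = p.paroxysms or p.whoop or p.posttussive_vomiting
    epi_linked = (
        p.cough_days >= 14 and classic_sign and p.contact_with_confirmed_case
    )
    return lab_confirmed or epi_linked


def is_case(p):
    in_population = p.resident_of_river_city and p.reported_to_health_dept
    in_window = WINDOW_START <= p.first_care_date <= WINDOW_END
    return meets_disease_criteria(p) and in_population and in_window


lab_case = Patient(
    resident_of_river_city=True,
    reported_to_health_dept=True,
    first_care_date=date(2020, 2, 10),
    cough_days=5,
    b_pertussis_isolated=True,
    paroxysms=False,
    whoop=False,
    posttussive_vomiting=False,
    contact_with_confirmed_case=False,
)
# Same patient, but first seen one day after the time window closes.
late_case = replace(lab_case, first_care_date=date(2020, 4, 1))

print(is_case(lab_case), is_case(late_case))
```

Keeping the disease criteria separate from the person and time restrictions mirrors the layout of the definition itself and makes each part easy to audit.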

Example 2: Liver Transplantation

Population: Adult patients (ages 18 years and older) receiving their first liver transplant, excluding those receiving multiple organ transplants.
Place: Patients who had transplant surgery at the Oakville Regional University Medical Center.
Time: Recipients of liver transplants between January 1, 2010, and December 31, 2019, who were followed for a minimum of 2 years post-transplant.

Data Collection

A case series may be constructed from primary data acquired by directly interviewing cases about their experiences using structured questionnaires and/or in-depth qualitative techniques (e.g., focus groups, semi-structured interviews). This allows for rich, detailed insights.
When using existing medical records or other secondary data sources, a well-structured questionnaire or data abstraction form guiding the systematic extraction of specific, predefined data points from these files is highly recommended. This standardizes data collection, minimizes error, and ensures consistency across cases.

Analysis of Case Series

Case Fatality Rate (CFR): The proportion of people diagnosed with a particular disease who die as a direct result of that specific condition within a specified time period. It is a measure of the severity of a disease.
- Formula: $CFR = \frac{\text{Number of deaths from disease X}}{\text{Number of confirmed cases of disease X}} \times 100\%$
Mortality Rate: The proportion of population members dying from any condition (all-cause mortality rate) or a particular condition (cause-specific mortality rate) during a specified time period, typically expressed per 1,000 or 100,000 individuals. It reflects the risk of death in a population.
- Formula: $\text{Mortality Rate} = \frac{\text{Number of deaths from all causes (or specific cause)}}{\text{Total population at mid-period}} \times \text{k}$ (where k is 1,000 or 100,000)
Proportionate Mortality Rate (PMR): The proportion of all deaths in a particular population during a specified period that are attributable to a specific cause. It indicates the relative importance of a specific cause of death, but not the risk of dying from that cause.
- Formula: $PMR = \frac{\text{Number of deaths from cause X}}{\text{Total number of deaths from all causes}} \times 100\%$
Analysis Methods: Primarily involves simple counts and percentages to describe the frequency of various characteristics among the cases. For example, "X% of cases reported symptom Y" or "the average age of cases was Z years."
- Exception: More advanced statistical comparisons might be employed when comparing characteristics between subpopulations of cases (e.g., males vs. females within the series) or when assessing before-and-after measures within the same group of cases (e.g., symptom severity before and after a new treatment intervention).
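
The three mortality measures defined above differ only in their denominators, which a short calculation makes concrete. The sketch below applies the formulas to an invented population; all counts are hypothetical.

```python
def case_fatality_rate(deaths_from_disease, confirmed_cases):
    """Percentage of confirmed cases who die from the disease."""
    return 100 * deaths_from_disease / confirmed_cases


def mortality_rate(deaths, midperiod_population, k=100_000):
    """Deaths per k members of the population (all-cause or cause-specific)."""
    return k * deaths / midperiod_population


def proportionate_mortality_rate(deaths_from_cause, total_deaths):
    """Percentage of all deaths that are attributable to one cause."""
    return 100 * deaths_from_cause / total_deaths


# Hypothetical figures for one year in one population.
population = 200_000
total_deaths = 1_600
cases_of_x = 500
deaths_from_x = 40

cfr = case_fatality_rate(deaths_from_x, cases_of_x)
all_cause = mortality_rate(total_deaths, population)
cause_specific = mortality_rate(deaths_from_x, population)
pmr = proportionate_mortality_rate(deaths_from_x, total_deaths)

print(f"CFR: {cfr}%")
print(f"All-cause mortality: {all_cause} per 100,000")
print(f"Cause-specific mortality: {cause_specific} per 100,000")
print(f"PMR: {pmr}%")
```

With these numbers the same 40 deaths yield a CFR of 8%, a cause-specific mortality rate of 20 per 100,000, and a PMR of 2.5%, so the measures cannot be read interchangeably.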

Figure 8-3: Mortality Rates, CFR, and PMR

A 2D diagram clarifying the distinctions among the three measures: the mortality rate (deaths relative to the total population), the case fatality rate (deaths relative to the cases of a disease), and the PMR (deaths from one cause relative to all deaths).

Overview: Case-Control Studies

Definition of Case-Control Study

A type of observational epidemiological study that retrospectively compares the exposure histories of a group of individuals who already have a specific disease or outcome (cases) with a group of individuals who do not have the disease or outcome (controls). The goal is to identify factors (exposures) that are more prevalent in the cases and therefore potentially associated with the disease.
- Case: A study participant who meets the pre-defined criteria for having the specified disease or health condition under investigation. Cases should be identified unambiguously, usually from a defined source population.
- Control: A study participant who does not have the disease being examined. Controls are selected to be representative of the source population from which the cases arose, ideally sharing similar demographic and other characteristics with the cases and differing from them only in not having the disease.
Selection Key Point: Unlike cohort studies where participants are selected based on exposure status, in case-control studies, selection is fundamentally based on the disease status (presence or absence of the outcome). This design is efficient for rare diseases.

Characteristics of Case-Control Studies

Objective: To identify and quantify the strength of association between potential risk factors (exposures) and a specific disease by comparing their prevalence in 'cases' and 'controls'. This helps in understanding etiology and informing public health interventions.
Primary Study Question: Do individuals with the disease (cases) have a significantly different history of exposure to various factors compared to individuals without the disease (controls)? This question guides the retrospective assessment of exposures.
Population: It is crucial that cases and controls originate from the same underlying population at risk and are as similar as possible except for their disease status. This reduces confounding, so that any observed differences in exposure can be more reliably linked to the disease.
Usage: This study design is highly efficient and preferred when the disease of interest is relatively uncommon (rare) or has a long latency period, either of which can make it impractical or unethical to follow a large cohort prospectively, or when a strong source of cases (e.g., a disease registry, hospital records) is available. It is also suitable for exploring multiple potential risk factors for a single disease.

First Steps in Case-Control Study

Identify a source of cases: This involves pinpointing where individuals with the disease can be accurately identified (e.g., hospitals, clinics, disease registries).
Assign a case definition: Develop clear, specific, and objective diagnostic criteria for identifying cases, ensuring consistency across all selected individuals.
Decide on an appropriate control population: Select controls who are representative of the same source population as the cases and who would have been identified as cases had they developed the disease. This is a critical step to avoid selection bias.
Determine whether cases and controls will be matched: Matching involves selecting controls to share specific characteristics (e.g., age, sex, geographic area) with cases to control for potential confounding variables, thereby making the groups more comparable.
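
Individual matching can be sketched as a simple pairing routine. The example below is only one possible approach, assuming 1:1 matching on sex and on age within a tolerance, with a greedy first-fit rule; the function name `match_controls`, the record layout, and the 2-year tolerance are all hypothetical choices.

```python
def match_controls(cases, controls, age_tolerance=2):
    """Pair each case with one unused control of the same sex and similar age.

    Greedy first-fit: the first eligible control is taken, and cases with no
    eligible control are left unmatched.
    """
    available = list(controls)  # copy so the caller's list is not modified
    pairs = []
    for case in cases:
        for control in available:
            same_sex = control["sex"] == case["sex"]
            close_age = abs(control["age"] - case["age"]) <= age_tolerance
            if same_sex and close_age:
                pairs.append((case["id"], control["id"]))
                available.remove(control)  # each control is used only once
                break
    return pairs


# Hypothetical participants.
cases = [
    {"id": "C1", "age": 45, "sex": "F"},
    {"id": "C2", "age": 60, "sex": "M"},
    {"id": "C3", "age": 30, "sex": "F"},
]
controls = [
    {"id": "K1", "age": 61, "sex": "M"},
    {"id": "K2", "age": 44, "sex": "F"},
    {"id": "K3", "age": 46, "sex": "F"},
    {"id": "K4", "age": 52, "sex": "M"},
]

pairs = match_controls(cases, controls)
print(pairs)
```

In this invented data set, case C3 has no control of the same sex within 2 years of age and therefore stays unmatched, which shows how strict matching criteria can shrink the usable sample.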

Bias in Case-Control Studies

Key Types

Selection Bias: Occurs if the selection of cases or controls, or their participation in the study, is somehow related to the exposure status, thereby distorting the true relationship between exposure and disease. For instance, if hospital-based controls are chosen, and their hospitalization is related to the exposure of interest, this can introduce bias.
Recall Bias: A systematic error that can occur when individuals with a disease (cases) remember their past exposures (e.g., dietary habits, chemical exposures, medication use) more accurately or vividly than individuals without the disease (controls), simply because they are ill. This differential recall can lead to an artificially inflated or deflated association.
Misclassification Bias: Arises when there are errors in measuring or classifying either disease status (e.g., incorrectly classifying a healthy person as a case) or exposure status (e.g., misclassifying an exposed person as unexposed). This misclassification is "differential" if the error rate differs between cases and controls (e.g., exposure is more likely to be misclassified in cases than controls), which can bias the Odds Ratio either towards or away from the null. Non-differential misclassification (random errors) typically biases the OR towards the null.
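
The pull of non-differential misclassification towards the null can be illustrated with expected counts. The sketch below is a simplified illustration, assuming exposure is measured with the same sensitivity and specificity in cases and controls; the true counts and the 0.8 and 0.9 values are invented.

```python
def odds_ratio(a, b, c, d):
    # a = exposed cases, b = exposed controls,
    # c = unexposed cases, d = unexposed controls
    return (a * d) / (b * c)


def misclassify(exposed, unexposed, sensitivity, specificity):
    """Expected observed counts after imperfect exposure measurement."""
    observed_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    observed_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return observed_exposed, observed_unexposed


# Hypothetical true exposure counts.
true_a, true_c = 60, 40  # cases: exposed, unexposed
true_b, true_d = 30, 70  # controls: exposed, unexposed

# Same error rates in both groups, so the misclassification is non-differential.
sens, spec = 0.8, 0.9
obs_a, obs_c = misclassify(true_a, true_c, sens, spec)
obs_b, obs_d = misclassify(true_b, true_d, sens, spec)

true_or = odds_ratio(true_a, true_b, true_c, true_d)
observed_or = odds_ratio(obs_a, obs_b, obs_c, obs_d)

print(f"True OR: {true_or:.2f}")
print(f"Observed OR: {observed_or:.2f}")
```

Here the true OR of 3.5 is observed as roughly 2.4: still above 1, but closer to the null value than the truth.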

Odds Ratios

Definition

Odds Ratio (OR): A fundamental measure of association used in case-control studies, quantifying the strength of the relationship between an exposure and a disease outcome. For rare diseases, the OR provides a good approximation of the relative risk.
- In a case-control study, the OR is the ratio of the odds of exposure among cases to the odds of exposure among controls. It tells us how many times greater the odds of exposure are for cases than for controls.
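
The rare-disease approximation can be checked numerically. The sketch below is illustrative only: the risks are invented, and it computes the odds ratio from disease odds in a full population, which gives the same cross-product ratio as the exposure odds ratio.

```python
def risk_ratio_and_odds_ratio(risk_exposed, risk_unexposed):
    """Return (relative risk, odds ratio) for two disease risks."""
    rr = risk_exposed / risk_unexposed
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return rr, odds_exposed / odds_unexposed


# Rare disease: risks of 0.2% and 0.1% (hypothetical).
rare_rr, rare_or = risk_ratio_and_odds_ratio(0.002, 0.001)

# Common disease: risks of 40% and 20% (hypothetical).
common_rr, common_or = risk_ratio_and_odds_ratio(0.4, 0.2)

print(f"Rare disease:   RR = {rare_rr:.3f}, OR = {rare_or:.3f}")
print(f"Common disease: RR = {common_rr:.3f}, OR = {common_or:.3f}")
```

Both scenarios have a relative risk of 2, but the OR is about 2.002 for the rare disease and about 2.667 for the common one, so the OR overstates the relative risk as the disease becomes more common.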

Calculation Using a 2x2 Contingency Table

	Disease (Cases)	No Disease (Controls)	Total
Exposed	$a$	$b$	$a+b$
Unexposed	$c$	$d$	$c+d$
Total	$a+c$	$b+d$	$N$

Formula: $OR = \frac{a/c}{b/d} = \frac{a \times d}{b \times c}$
- Where $a$ = number of exposed cases, $b$ = number of exposed controls, $c$ = number of unexposed cases, and $d$ = number of unexposed controls.
Odds of exposure among cases: $\frac{a}{c}$ (the ratio of exposed cases to unexposed cases)
Odds of exposure among controls: $\frac{b}{d}$ (the ratio of exposed controls to unexposed controls)
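
A short worked example may help connect the two odds to the cross-product formula. The sketch below uses invented counts for the four cells; the function names are hypothetical.

```python
def exposure_odds(exposed, unexposed):
    """Odds of exposure within one group (cases or controls)."""
    return exposed / unexposed


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table."""
    if b == 0 or c == 0:
        raise ValueError("OR is undefined when b or c is zero")
    return (a * d) / (b * c)


# Hypothetical 2x2 table.
a, b = 40, 20  # exposed cases, exposed controls
c, d = 60, 80  # unexposed cases, unexposed controls

odds_cases = exposure_odds(a, c)     # 40 / 60
odds_controls = exposure_odds(b, d)  # 20 / 80
or_value = odds_ratio(a, b, c, d)    # (40 * 80) / (20 * 60)

print(f"Odds of exposure among cases: {odds_cases:.3f}")
print(f"Odds of exposure among controls: {odds_controls:.3f}")
print(f"OR: {or_value:.3f}")
```

Dividing the two odds gives the same value as the cross-product, about 2.67 with these counts.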

Interpretation

$OR = 1$ : Indicates no association between the exposure and the disease. The odds of being exposed are the same for cases and controls.
OR > 1: Suggests an increased odds of exposure among cases compared to controls. This implies that the exposure may be a risk factor for the disease. For example, an OR of 2 means cases have twice the odds of being exposed compared to controls.
OR < 1: Indicates a decreased odds of exposure among cases compared to controls. This suggests that the exposure might have a protective effect against the disease. For instance, an OR of 0.5 means cases have half the odds of being exposed compared to controls.

Conclusion

Case Report vs. Case-Control Study

Case Report: A descriptive study focusing on a single individual or a very small group, primarily used to identify unusual or novel features of a disease or treatment, generate hypotheses, and serve as an early warning system. It involves deep, detailed description.
Case-Control Study: An analytical observational study that compares exposure histories between individuals with a disease (cases) and those without (controls) to quantify the association between risk factors and the disease. It aims to establish statistical relationships.
Key Differences:
- Case reports require no comparison group, merely describing observed cases; case-control studies critically rely on a carefully selected comparison group (controls) to identify differences in exposure.
- Case reports only require descriptive statistics (counts, percentages) to summarize findings; case-control studies use inferential statistics, primarily odds ratios, to quantify the strength of association between exposure and disease, and often employ regression models for adjustment.