Statistical Reasoning Lecture 9

Hypothesis Testing for Population Comparisons: An Overview

Upon completion of this lecture section, you will begin to understand:
- A conceptual framework for the process of statistical hypothesis tests.
- How confidence intervals and hypothesis testing are related.

In fields such as public health, medicine, and science, researchers are often interested in comparing outcomes between two or more populations using data collected from samples.
Importance of:
- Estimating the magnitude of differences in outcomes between groups.
- Recognizing uncertainty in estimates.
Approaches to recognize uncertainty:
- Confidence Intervals (CIs)
- Hypothesis Testing

Types of two-group comparisons using continuous outcomes:
- Paired: Measurements taken from the same subjects under different conditions.
- Unpaired: Measurements taken from different subjects.
Types of two-group comparisons using binary outcomes, incidence rates, and time-to-event:
- Unpaired: Two independent groups.
These methods can be extended to comparisons of more than two populations in unpaired studies.

The difference of two normally distributed quantities is itself normally distributed.
The principles of the Central Limit Theorem (CLT) therefore apply to understanding and quantifying the sampling variability of:
- Mean differences between two independent populations.
- Differences in proportions between two independent populations.
- Natural logarithm of ratios (as differences).

Confidence intervals for differences (means, risks, ln(ratios)) are created using CLT results and properties of the normal curve.
The concept of confidence intervals emphasizes, "Letting data take you to the truth."

Hypothesis testing begins with defining competing possibilities for the unknown truth about the population comparison measure.
- Possible truths:
  - Truth 1: No difference between populations (H0)
  - Truth 2: Difference between populations (HA)
Competing hypotheses:
- Null Hypothesis (H0): No difference between populations.
- Alternative Hypothesis (HA): A difference exists.
These can be expressed differently for types of data outcomes (continuous, binary, time-to-event).
For comparing means:
- If the means are the same: if $\mu_1 = \mu_2$, then $\mu_1 - \mu_2 = 0$
- If the means differ: if $\mu_1 \neq \mu_2$, then $\mu_1 - \mu_2 \neq 0$

The theoretical sampling distribution is used to assess how the study data can help choose between the hypotheses while accounting for uncertainty.
Confidence intervals bridge sample results to population truths.
- The sample estimated difference is expected to be close to the unknown truth.
Hypothesis testing starts by assuming H0 is true. Under that assumption, the sample estimated difference should fall close to the null value, apart from sampling variability.

Definition: The p-value is the probability of obtaining a study result as extreme or more extreme than the sample result under H0.
- It indicates how unusual the observed data is if H0 is true.
- $\text{p-value} = P(\text{result as or more extreme than observed} \mid H_0 \text{ is true})$
Decision Basis:
- Determine whether the p-value indicates that the study results are likely or unlikely, assuming H0 is true.
- Common significance level: 0.05 (α level), corresponding to a 95% confidence interval.
Decision making based on p-value:
- If p < 0.05, reject H0 in favor of HA (statistically significant).
- If p ≥ 0.05, fail to reject H0 (we do not conclude a difference).
Relationship to Confidence Intervals:
- If p < 0.05, the 95% CI will not include the null value.
- If p ≥ 0.05, the 95% CI will include the null value.
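The link between the p-value and the 95% CI can be checked numerically. The following is a minimal Python sketch, not part of the lecture: it assumes a normal (CLT-based) sampling distribution, a null value of 0, and made-up estimates with a standard error of 1.0. The function names `two_sided_p`, `ci95`, and `ci_excludes` are hypothetical helpers chosen for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def two_sided_p(estimate, se, null_value=0.0):
    """Two-sided p-value for an estimate, using a normal sampling distribution."""
    z = (estimate - null_value) / se
    return 2 * Z.cdf(-abs(z))


def ci95(estimate, se):
    """95% CI built with the matching normal multiplier (about 1.96)."""
    m = Z.inv_cdf(0.975)
    return estimate - m * se, estimate + m * se


def ci_excludes(estimate, se, null_value=0.0):
    """True when the 95% CI does not contain the null value."""
    lo, hi = ci95(estimate, se)
    return not (lo <= null_value <= hi)


# Hypothetical estimates (illustrative only), each with standard error 1.0
for est in [0.5, 1.5, 1.9, 2.0, 3.0, -2.5]:
    p = two_sided_p(est, 1.0)
    # p < 0.05 exactly when the 95% CI excludes the null value of 0
    assert (p < 0.05) == ci_excludes(est, 1.0)
```

The agreement is exact here because the test and the interval use the same standard error and the same normal multiplier; if they are built by different methods, borderline cases can disagree.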

Confidence intervals and hypothesis testing complement each other in addressing uncertainty in sample comparisons regarding unknown population comparisons.
In most studies based on random samples, the sample estimate falls close to the true population value: within about two standard errors in roughly 95% of samples.

Understand how to estimate and interpret a p-value for hypothesis testing of mean differences using the paired t-test methodology.

Data were collected on 65 male sexual contacts with AIDS, each assessed for palpable lymph nodes by two physicians.
- Physician Differences:
  - Doctor 1: Mean = 7.91, Standard deviation = 4.35
  - Doctor 2: Mean = 5.16, Standard deviation = 3.93
  - Mean difference (Doctor 2 minus Doctor 1) = $-2.75$.
- Source: Rosner (2005).
95% Confidence Interval for the difference:
- The results indicate that the true mean difference in lymph node counts between Doctor 2 and Doctor 1 plausibly lies between $-3.45$ and $-2.05$, that is, in $[-3.45, -2.05]$.
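As a worked check, this interval can be reproduced from the summary numbers in this example. The calculation below takes 2.83 as the standard deviation of the 65 paired differences (the value that appears in the t statistic for this example) and assumes a multiplier of 2 standard errors; that multiplier is an assumption, and using 1.96 instead shifts the endpoints only slightly, to about $[-3.44, -2.06]$.

$-2.75 \pm 2 \times \frac{2.83}{\sqrt{65}} \approx -2.75 \pm 0.70 = [-3.45, -2.05]$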
Hypothesis Testing Approach: Setting Competing Hypotheses
- H0: $\mu_{\text{Doctor 1}} = \mu_{\text{Doctor 2}}$
- H0: $\mu_{\text{Doctor 1}} - \mu_{\text{Doctor 2}} = 0$
- H0: $\mu_{\text{diff}} = 0$
- HA: $\mu_{\text{Doctor 1}} \neq \mu_{\text{Doctor 2}}$
- HA: $\mu_{\text{Doctor 1}} - \mu_{\text{Doctor 2}} \neq 0$
- HA: $\mu_{\text{diff}} \neq 0$
Assuming the null hypothesis is true, calculate how far the observed mean difference is from 0, in standard errors: $t = \frac{\bar{x}_{\text{diff}} - 0}{SE(\bar{x}_{\text{diff}})}$
Result: Use statistical software to determine how extreme the observed mean difference is, given H0.

Distance of observed mean difference from H0 (assumed mean difference of 0):
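- $t = \frac{-2.75}{2.83 / \sqrt{65}} \approx -7.83$, where 2.83 is the standard deviation of the paired differences.
The probability of observing a difference this extreme or more extreme if H0 is true (the p-value) can be obtained in R.
- Example command: 2*pnorm(-abs(t)). The absolute value ensures the lower tail is doubled whatever the sign of t. With 65 pairs the normal curve is a close approximation; 2*pt(-abs(t), df = 64) gives the t-distribution version.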
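For readers without R, the same calculation can be sketched in Python using only the standard library. This is an illustrative equivalent of the `pnorm` approach, not the lecture's own code; it uses the normal approximation, takes 2.83 as the standard deviation of the paired differences, and takes the absolute value of t so that the lower tail is the one being doubled.

```python
from math import sqrt
from statistics import NormalDist

n = 65             # number of paired observations
mean_diff = -2.75  # observed mean difference (Doctor 2 minus Doctor 1)
sd_diff = 2.83     # standard deviation of the paired differences

# Standard error of the mean difference
se = sd_diff / sqrt(n)

# Distance of the observed mean difference from the null value 0, in standard errors
t_stat = (mean_diff - 0) / se

# Two-sided p-value from the standard normal curve (Python analogue of R's pnorm)
p_value = 2 * NormalDist().cdf(-abs(t_stat))

print(f"SE = {se:.3f}, t = {t_stat:.2f}, p < 0.0001: {p_value < 0.0001}")
```

The observed difference sits nearly eight standard errors below 0, so the p-value is far smaller than 0.0001.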
Interpreting p-Value:
- A small p-value (p < 0.0001) leads to rejecting H0, indicating a statistically significant difference in lymph node findings.
This decision agrees with the 95% confidence interval, which does not include 0 (the null value) and so rules out no difference in the mean lymph node counts recorded by the two doctors.

The same approach, converting observed data into a p-value and linking it to a statistical conclusion, is applied in later sections on paired and unpaired comparisons.
In each case the logic is the same: assume H0, measure how far the sample result falls from it, and use that distance to judge whether the sample data are consistent with no population difference.

State null (H0) and alternative (HA) hypotheses.
Measure the difference between study estimates and H0 in standard errors.
Convert distance into p-values.
Determine whether to “reject” or “fail to reject” the null based on the p-value and pre-set significance criteria.
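The four steps above can be sketched as a single Python function. This is a minimal illustration under stated assumptions: a normal sampling distribution, a two-sided alternative, and a default null value of 0. The function name `hypothesis_test` and its parameters are hypothetical, chosen only for this example.

```python
from statistics import NormalDist


def hypothesis_test(estimate, se, null_value=0.0, alpha=0.05):
    """Two-sided test of H0: true value == null_value (normal approximation)."""
    # Step 1: H0 says the true value equals null_value; HA says it does not.
    # Step 2: distance between the study estimate and H0, in standard errors.
    distance = (estimate - null_value) / se
    # Step 3: convert the distance into a two-sided p-value.
    p_value = 2 * NormalDist().cdf(-abs(distance))
    # Step 4: compare the p-value with the pre-set significance level.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return distance, p_value, decision


# Hypothetical inputs: an estimated difference of 1.2 with standard error 1.0
print(hypothesis_test(1.2, 1.0))
```

Setting `alpha` before looking at the data mirrors the pre-set significance criterion in the last step.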

Understanding the interconnections between hypothesis testing decisions and applications in real-world assessments provides a holistic view of how statistical reasoning applies to hypothesis testing and estimation of population parameters in health-related research and beyond.

It is imperative to remember the societal and ethical implications of statistical conclusions, ensuring responsible interpretations that correlate with real cases in applied fields such as medicine and public health.