Statistical Inference

What is Statistical Inference?

Statistical inference is the process of drawing conclusions or making predictions about a population based on data from a sample.
- Involves:
    - Analyzing sample data.
    - Making generalizations, estimates, or decisions about a larger group (the population) that the sample represents.

Inferring Population Parameters

Population vs. Sample

Population:
- The entire group or set of individuals, items, or data points that you are interested in studying.
Sample:
- A subset of the population that is actually observed or measured.
Practicality:
- It is often impractical to gather data from an entire population.
Statistical Inference:
- Allows researchers to make reliable conclusions about the population based on the sample.

Example: Inferring Population Parameters

Goal:
- Estimate the average refractive error (in diopters) of all adults in a city.
Population:
- All adults in the city.
Sample:
- A randomly selected sample of fifty adults, each measured for refractive error.
Sample Data:
  - Sample size (n) = 50.
  - Sample mean ($\bar{x}$) = -0.75 diopters.
  - Standard deviation (s) = 1.2 diopters.

Point and Interval Estimates

Point Estimate

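Definition:
- A single value, computed from the sample, that serves as the best estimate of a population parameter.
Example:
- The sample mean of -0.75 diopters is the point estimate for the average refractive error in the population.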

Interval Estimate

Confidence Level:
- We are 95% confident that the true average refractive error of the population lies within $\bar{x} \pm 0.341$ diopters, i.e. between -1.091 and -0.409 diopters.
Interpretation:
- The point estimate provides a specific value, while the interval estimate offers a range of values for the population parameter (mean refractive error).

Standard Error of the Mean (SEM)

Purpose

Question Addressed:
- How well does a sample mean represent the population mean?
SEM:
- Provides insight into this question.

Characteristics of Distribution of Sample Means

When repeatedly drawing samples from a population and calculating their means, these means form a distribution.
Mean of this distribution:
- Equal to the population mean ($\mu$).
Standard deviation of sample means:
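- Denoted as $\sigma_{\bar{x}}$ and calculated as $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the standard deviation of the population and $n$ is the sample size. This quantity is the SEM. When $\sigma$ is unknown, it is estimated by the sample standard deviation $s$.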
Implications of SEM:
- Large SEM indicates sample means are widely dispersed around the population mean; small SEM indicates they are tightly clustered around it.
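
The behaviour described above can be checked with a small simulation. The sketch below is only an illustration: it assumes a normally distributed population, and the values of `MU`, `SIGMA`, `N`, and `REPS` are hypothetical choices, not data from a study. It draws many samples, records each sample mean, and compares the spread of those means with $\frac{\sigma}{\sqrt{n}}$.

```python
import math
import random
import statistics

# Hypothetical population values, chosen only for illustration.
MU = -0.75     # population mean
SIGMA = 1.2    # population standard deviation
N = 50         # size of each sample
REPS = 5000    # number of repeated samples

rng = random.Random(0)  # fixed seed so the run is repeatable

# Draw REPS samples of size N and record the mean of each one.
sample_means = [
    statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(REPS)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sem = SIGMA / math.sqrt(N)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {MU})")
print(f"SD of sample means:   {sd_of_means:.3f}")
print(f"sigma / sqrt(n):      {theoretical_sem:.3f}")
```

The mean of the sample means should sit close to the population mean, and their standard deviation close to $\frac{\sigma}{\sqrt{n}}$, with small differences due to simulation noise.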

Factors Affecting SEM

Population Standard Deviation ($\sigma$):
- The larger the $\sigma$, the larger the SEM.
Sample Size ($n$):
- The larger the $n$, the smaller the SEM.
Implications of SEM:
- A smaller SEM indicates a more precise estimate of the population mean.
- A larger SEM suggests greater uncertainty about the population mean.
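
A short numeric sketch of these two factors follows. The helper `sem` and the specific values of $\sigma$ and $n$ are illustrative assumptions; the point is only that the SEM scales with $\sigma$ and with $1/\sqrt{n}$.

```python
import math

def sem(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Effect of sample size (sigma fixed at 1.2): quadrupling n halves the SEM.
sem_n50 = sem(1.2, 50)
sem_n200 = sem(1.2, 200)

# Effect of population spread (n fixed at 50): doubling sigma doubles the SEM.
sem_sigma24 = sem(2.4, 50)

print(f"sigma = 1.2, n = 50:  SEM = {sem_n50:.4f}")
print(f"sigma = 1.2, n = 200: SEM = {sem_n200:.4f}")
print(f"sigma = 2.4, n = 50:  SEM = {sem_sigma24:.4f}")
```

Because of the square root, halving the SEM requires four times as many observations, not twice as many.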

Standard Deviation vs. Standard Error of the Mean

Definitions

Standard Deviation (SD):
- Measures the spread of individual data points in a sample or population.
Standard Error of the Mean (SEM):
- Measures how accurately a sample mean estimates the population mean.

Key Differences

SEM is smaller than SD whenever $n > 1$, because SEM = SD / $\sqrt{n}$. It describes the variability of sample means, not of individual data points; the mean of several data points varies less from sample to sample than a single data point does.

Confidence Intervals

Definitions

Confidence Level:
- The long-run proportion of confidence intervals, built by the same method from repeated samples, that contain the true population parameter. For instance, at a 95% confidence level, if you take 100 samples and compute an interval from each, approximately 95 of those intervals will contain the true population parameter.
Confidence Interval:
- A range of values within which the true population parameter is expected to fall. For example, a CI of (10, 20) means that the population parameter is believed to be between 10 and 20 with a certain level of confidence.

Critical Value

Definition:
- A value from a statistical distribution (like the z-distribution for large samples or the t-distribution for small samples) corresponding to the desired confidence level.
Example:
- For a 95% confidence interval, the critical value for the z-distribution is approximately 1.96.
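
As a minimal sketch, the z critical value can be obtained from the standard normal distribution with Python's standard library. The helper name `z_critical` is a hypothetical choice for illustration; t critical values depend on the degrees of freedom and are not covered here.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided z critical value for the given confidence level."""
    alpha = 1 - confidence
    # Put alpha / 2 in each tail of the standard normal distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```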

Interpretation

A 95% confidence interval does not mean that the true parameter lies within one specific computed interval with 95% probability. The true parameter is fixed; it is the interval that varies from sample to sample, and 95% refers to the proportion of such intervals that capture the parameter.
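
This interpretation can be illustrated by simulation. The sketch below makes several simplifying assumptions: a normal population, a known population standard deviation (so a z-interval is used instead of a t-interval), and hypothetical values for `TRUE_MEAN`, `SIGMA`, `N`, and `REPS`. It counts how many of the repeated intervals contain the fixed true mean.

```python
import math
import random
from statistics import NormalDist, fmean

TRUE_MEAN = -0.75   # hypothetical fixed population mean
SIGMA = 1.2         # population SD, assumed known in this sketch
N = 50              # sample size
REPS = 2000         # number of repeated samples

rng = random.Random(1)  # fixed seed so the run is repeatable
z = NormalDist().inv_cdf(0.975)
half_width = z * SIGMA / math.sqrt(N)

covered = 0
for _ in range(REPS):
    sample_mean = fmean(rng.gauss(TRUE_MEAN, SIGMA) for _ in range(N))
    # The interval moves with each sample; the true mean never changes.
    if sample_mean - half_width <= TRUE_MEAN <= sample_mean + half_width:
        covered += 1

coverage = covered / REPS
print(f"{covered} of {REPS} intervals contain the true mean ({coverage:.1%})")
```

The proportion should come out near 95%, which is what the confidence level describes: a property of the procedure over many samples, not of any single interval.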

Example of Calculation

Sample Data:
- Sample size (n) = 50, Sample mean ($\bar{x}$) = -0.75 diopters, Standard deviation (s) = 1.2 diopters.
Interval Estimate Calculation:
- Critical value ($t_{c}$) for 95% confidence from the t-distribution with df = 49: $t_{c} = 2.009$.
- SEM = $\frac{s}{\sqrt{n}} = \frac{1.2}{\sqrt{50}} = 0.1697$.
- CI: $\bar{x} \pm t_{c} \cdot SEM = -0.75 \pm 2.009 \cdot 0.1697 = (-1.091, -0.409)$.
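
The same calculation can be sketched in Python. The critical value 2.009 is copied from the text and not computed, since the standard library has no t-distribution; the variable names are illustrative.

```python
import math

n = 50
sample_mean = -0.75   # diopters
sample_sd = 1.2       # diopters
t_critical = 2.009    # df = 49, 95% confidence; value taken from the text

sem = sample_sd / math.sqrt(n)   # standard error of the mean
margin = t_critical * sem        # half-width of the interval
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(f"SEM    = {sem:.4f}")
print(f"margin = {margin:.3f}")
print(f"95% CI = ({ci_lower:.3f}, {ci_upper:.3f})")
```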

Hypothesis Testing

Basic Concept

In hypothesis testing, you start with an assumption known as the null hypothesis.
The goal is to test this assumption using sample data to determine whether the statistical tests provide sufficient evidence to reject the null hypothesis.

Definitions

Null Hypothesis (H₀):
- An assumption that there is no effect or difference (e.g., the mean of the population equals a specific value).
Alternative Hypothesis (H₁):
- An assumption that there is an effect or a difference.

Testing Procedure

Involves calculating a test statistic (such as a t statistic or a chi-square statistic) and either comparing it to a critical value or calculating a p-value, in order to decide whether to reject or fail to reject the null hypothesis.

Hypothesis Testing: Example

Situation Setup

A study compares two cataract surgery techniques (Technique A and Technique B).
Data Collected:
- Technique A: Mean improvement = 0.25 logMAR, SD = 0.10, Sample size = 50.
- Technique B: Mean improvement = 0.30 logMAR, SD = 0.08, Sample size = 50.
Research Question:
- Is there a difference in best-corrected visual acuity (BCVA) improvement between the two techniques?

Hypotheses Defined

Null Hypothesis (H0):
- There is no difference in the mean improvement seen with the two techniques.
Alternative Hypothesis (H1):
- There is a difference in mean improvements seen with the two techniques.

Test Selection

The test selected for the analysis is the Independent two-sample t-test.

Hypothesis Testing: Independent (Two Sample) t-test

Necessary Equations

T-statistic formula:
- $t = \frac{\bar{x}_1 - \bar{x}_2}{SE}$
SE definition:
- $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ .

Explanation of t-statistic

Numerator:
- The difference of the two sample means; the larger this difference relative to the SE, the stronger the evidence that the two groups differ.
Denominator (SE):
- Represents the sampling variability of that difference; for a given difference in means, a larger SE gives a smaller $|t|$ and therefore weaker evidence that the groups differ.
Variance Addition Note:
- Variances can be added but standard deviations cannot; combine by squaring, adding, and taking the square root.

Calculating the t-statistic

$t = \frac{0.25 - 0.30}{\sqrt{(0.10^2/50) + (0.08^2/50)}} = \frac{-0.05}{\sqrt{0.000328}} = \frac{-0.05}{0.0181} = -2.76$
Degrees of Freedom:
- For the two-sample t-test: df = $n_1 + n_2 - 2 = 50 + 50 - 2 = 98$.
Significance Level:
- Select significance level (α) = 0.05.
Look up critical t-value:
- For df = 98, α = 0.05 yields critical t-value ≈ ±1.984; since $t_{stat}$ (-2.76) is in the critical region, we reject $H_0$.
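
A minimal sketch of this calculation from the summary statistics is given below. The function name `two_sample_t` is a hypothetical helper, and the critical value 1.984 is copied from the text, not computed.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, SE) for the difference of two independent sample means."""
    # Variances add; standard deviations do not.
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se, se

# Technique A vs. Technique B (improvement in logMAR).
t_stat, se = two_sample_t(0.25, 0.10, 50, 0.30, 0.08, 50)
df = 50 + 50 - 2
t_critical = 1.984   # two-sided, alpha = 0.05, df = 98; value taken from the text
reject_h0 = abs(t_stat) > t_critical

print(f"SE = {se:.4f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```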

Probability Density for Hypothesis Testing

Critical Values:
- Left Tail (2.5%): -1.98, Right Tail (2.5%): 1.98, based on t-Distribution for df = 98.

Conclusion of t-test Example

Result Interpretation:
- There is a statistically significant difference in BCVA improvement between the two techniques.

Significance Level (α)

Definition:
- The probability of rejecting the null hypothesis when it is actually true (Type I error).
Common Levels:
- 0.05, 0.01, and 0.10; the example above used α = 0.05, indicating a 5% risk of a Type I error.

Critical Region (Rejection Region)

Definition:
- The set of values for the test statistic that leads to rejection of the null hypothesis, determined by α.
Example Context:
- For α = 0.05, critical regions are in the extreme 2.5% of both tails of the distribution.
Determination:
- The critical regions lie beyond ±1.984; the t-statistic (-2.76) falls in the critical region, leading to rejection of $H_0$.

p-value Concept

Definition:
- The p-value is the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true.
Interpretation for Decision Making:
- If p-value ≤ α, reject $H_0$; otherwise, do not reject.
Contextual Example:
- For a t-statistic of -2.76 with df = 98, the two-sided p-value is 0.007; as this is less than α (0.05), $H_0$ is rejected.
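
The p-value above can be approximated without a statistics package by numerically integrating the density of the t-distribution. The sketch below is one way to do this using Simpson's rule; the helper names `t_pdf` and `two_sided_p` are hypothetical, and a dedicated statistics library would normally be used instead.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t-distribution."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-sided p-value: 1 minus the area between -|t| and +|t|.

    The central area is found with Simpson's rule (steps must be even).
    Adequate for moderate |t|; very small p-values lose precision.
    """
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3   # area under the density from 0 to |t|
    return 1.0 - 2.0 * half_area

alpha = 0.05
p_value = two_sided_p(-2.76, df=98)
print(f"p = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```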

Reporting p-value

Importance of Reporting:
- Always report the p-value along with the t-statistic, degrees of freedom, and other details; e.g., $t_{98} = -2.76$, $p = 0.007$.
Strength of Evidence Highlight:
- A small p-value (< 0.05) suggests the observed data is unlikely under $H_0$, leading to its rejection.

Types of Error

Type I Error (False Positive)

Definition:
- Occurs when $H_0$ is rejected even though it is true.
Control:
- Controlled by the significance level (α).
- Example: α = 0.05 implies a 5% chance of making a Type I error when $H_0$ is true.

Type II Error (False Negative)

Definition:
- Occurs when $H_0$ is not rejected even though it is false.
Notation:
- Probability of Type II error is denoted as β.

Understanding Trade-offs

Trade-off Context:
- For a fixed sample size, reducing α lowers the risk of Type I error but increases the risk of Type II error.
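
The trade-off can be made concrete with a simplified calculation. The sketch below assumes a two-sided z-test (known standard error) and a hypothetical true effect of 2.5 standard errors; both are illustrative assumptions, not values from the examples in these notes. It computes β for two choices of α.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def type_ii_error(alpha: float, shift: float) -> float:
    """Beta for a two-sided z-test when the true mean lies
    `shift` standard errors away from the value stated in H0."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Under the alternative the test statistic is Normal(shift, 1);
    # beta is the chance it still lands between -z_crit and +z_crit.
    return STD_NORMAL.cdf(z_crit - shift) - STD_NORMAL.cdf(-z_crit - shift)

shift = 2.5   # hypothetical true effect, in standard-error units
beta_05 = type_ii_error(0.05, shift)
beta_01 = type_ii_error(0.01, shift)

print(f"alpha = 0.05: beta = {beta_05:.3f}")
print(f"alpha = 0.01: beta = {beta_01:.3f}")
```

With the effect and sample size held fixed, the stricter α gives the larger β.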

Hypothesis Testing: Paired t-test Example

Situation Description

A study investigates a new glaucoma medication.
Measured Data:
- Before Medication: Mean IOP ($\bar{x}_{before}$) = 25 mmHg.
- After Medication: Mean IOP ($\bar{x}_{after}$) = 20 mmHg.
- Standard deviation of differences ($s_d$) = 3 mmHg.
- Sample size = 20.
Research Question:
- Does the medication reduce IOP?

Hypotheses for the Test

Null Hypothesis (H0):
- No difference in the mean IOPs before and after medication.
Alternative Hypothesis (H1):
- A difference exists in the mean IOPs.

Test Chosen

Analysis Approach:
- A Paired t-test is utilized.

Paired t-test Equations and Calculations

Necessary Formulas

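T-statistic Formula:
- $t = \frac{\bar{d}}{SE_d}$, where $SE_d = \frac{s_d}{\sqrt{n}}$ and $s_d$ is the standard deviation of the paired differences.
Measured $\bar{d}$:
- Mean difference of IOP values before and after medication (here $25 - 20 = 5$ mmHg).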

Calculating the T-statistic

Steps:
- $t = \frac{5}{3/\sqrt{20}} = \frac{5}{0.671} \approx 7.45$ (7.46 if the standard error is first rounded to 0.67).
Degrees of Freedom:
- For the paired t-test: df = n - 1 = 20 - 1 = 19.
Significance Level:
- We select α = 0.05.

Critical t-value and p-value Lookup

Critical Value:
- From tables, for df = 19, α = 0.05, t-critical (2-sided) is ±2.093.
Critical Region:
- The values of t beyond ±2.093; the observed t-statistic lies well inside this region.
p-value Extraction:
- Based on df = 19 and t = 7.46, $p = 4.66 \times 10^{-7}$ (2-sided).
Decision:
- Since p < α (0.05), reject $H_0$.
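
A minimal sketch of the paired calculation from the summary statistics follows. The critical value 2.093 is copied from the text and not computed, and the variable names are illustrative. Note that carrying the unrounded standard error through gives a t-statistic of about 7.45.

```python
import math

mean_before = 25.0   # mmHg
mean_after = 20.0    # mmHg
sd_diff = 3.0        # SD of the paired differences, mmHg
n = 20

mean_diff = mean_before - mean_after
se_diff = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se_diff
df = n - 1
t_critical = 2.093   # two-sided, alpha = 0.05, df = 19; value taken from the text
reject_h0 = abs(t_stat) > t_critical

print(f"SE_d = {se_diff:.3f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```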

Conclusion of the t-test Example

Final Interpretation:
- The medication significantly reduces IOP.

Common Misinterpretations of p-value

Clarifications Needed

P-value vs. Probability of Hypothesis:
- A p-value is the probability of data at least as extreme as those observed, computed assuming $H_0$ is true; it is not the probability that $H_0$ is true.
P-value vs. Evidence Strength:
- A p-value quantifies evidence against $H_0$; it is not a direct measure of evidence for $H_1$.
P-value vs. Certainty:
- A small p-value does not prove that $H_0$ is false; it indicates only that data this extreme would be unlikely if $H_0$ were true.

Statistical vs. Clinical Importance

Example Overview

Study Title:
- “A Randomized Clinical Trial of Progressive Addition Lenses versus Single Vision Lenses on the Progression of Myopia in Children”.

Data Overview

Measurement of myopia progression by change in equivalent mean sphere over 3 years:
- PAL group: mean of -1.28 D, SE = 0.06 D.
- SVL group: mean of -1.48 D, SE = 0.06 D.
- Difference PAL-SVL: mean = 0.2 D, SE = 0.08 D (significant at 5%).

Importance of Distinguishing

Statistical Significance:
- Compares the mean difference (effect size) to the size of random variation (standard error).
Clinical Importance:
- A significant difference must also have an effect size large enough to be meaningful.
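
As a rough sketch, the ratio of the PAL-SVL difference to its standard error can be computed directly. A large-sample normal approximation is assumed here, which is a simplification and not necessarily the analysis used in the study.

```python
from statistics import NormalDist

diff = 0.20      # D, PAL minus SVL
se_diff = 0.08   # D

z = diff / se_diff   # effect size relative to random variation
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))

# Approximate 95% confidence interval for the difference.
z_crit = NormalDist().inv_cdf(0.975)
ci_low = diff - z_crit * se_diff
ci_high = diff + z_crit * se_diff

print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
print(f"95% CI for the difference: ({ci_low:.3f}, {ci_high:.3f}) D")
```

The result is statistically significant at the 5% level, but whether a benefit of about 0.2 D over 3 years is large enough to matter is a clinical judgement that the test itself cannot answer.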

Achieving Agreement Between Statistical and Clinical Importance

Suggested Approach

Outcome Measure Selection:
- Choose clinically significant outcome measures. For instance, in myopia progression, define success as progression of less than 1 D over 3 years (a change less negative than -1D).
Statistical Analysis Framework:
- Compare how many successes occurred in the treatment and control groups, so that a statistically significant result also implies clinical relevance.

Example of Hypothesis Testing with chi-squared test

Invented Data Representation

Group Analysis:
- Use a chi-squared test to determine whether the difference between the groups is significant.
Data Structure:
- Less than -1D progression: PAL lens: 10, SVL lens: 2.
- More than -1D progression: PAL lens: 220, SVL lens: 228.
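
A minimal sketch of the chi-squared test on this invented 2x2 table is shown below. It uses the plain Pearson statistic without a continuity correction (applying one would give a somewhat smaller statistic), and it uses the fact that for one degree of freedom the tail probability can be written with the complementary error function. Variable names are illustrative.

```python
import math

# Rows: less than / more than -1D progression; columns: PAL, SVL.
observed = [[10, 2],
            [220, 228]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count if outcome were independent of lens type.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
# For df = 1 the chi-squared tail probability equals erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-squared = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

The statistic is compared against the chi-squared critical value for df = 1 at α = 0.05, which is 3.841.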