Statistical Inference

What is Statistical Inference?

  • Statistical inference is the process of drawing conclusions or making predictions about a population based on data from a sample.
      - Involves:
        - Analyzing sample data.
        - Making generalizations, estimates, or decisions about a larger group (the population) that the sample represents.

Inferring Population Parameters

Population vs. Sample
  • Population:
      - The entire group or set of individuals, items, or data points that you are interested in studying.

  • Sample:
      - A subset of the population that is actually observed or measured.
      - Practicality:
        - It is often impractical to gather data from an entire population.

  • Statistical Inference:
      - Allows researchers to make reliable conclusions about the population based on the sample.

Example: Inferring Population Parameters

  • Goal:
      - Estimate the average refractive error (in diopters) of all adults in a city.

  • Population:
      - All adults in the city.

  • Sample:
      - A randomly selected sample of fifty adults, each measured for refractive error.

  • Sample Data:
      - Sample size (n) = 50.
      - Sample mean ($\bar{x}$) = -0.75 diopters.
      - Standard deviation (s) = 1.2 diopters.

Point and Interval Estimates

Point Estimate
  • Definition:
      - A single value used to estimate a population parameter; here, the sample mean of -0.75 diopters is the point estimate for the average refractive error in the population.

Interval Estimate
  • Confidence Level:
      - We are 95% confident that the true average refractive error of the population lies within $\bar{x} \pm 0.341$ diopters, that is, between -1.091 and -0.409 diopters.
      - Interpretation:
        - The point estimate provides a specific value, while the interval estimate offers a range of values for the population parameter (mean refractive error).

Standard Error of the Mean (SEM)

Purpose
  • Question Addressed:
      - How well does a sample mean represent the population mean?

  • SEM:
      - Provides insight into this question.

Characteristics of Distribution of Sample Means
  • When repeatedly drawing samples from a population and calculating their means, these means form a distribution.

  • Mean of this distribution:
      - Equal to the population mean ($\mu$).

  • Standard deviation of sample means:
      - Denoted as $\sigma_{\bar{x}}$ and calculated as $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$, where $\sigma$ is the standard deviation of the population and $n$ is the sample size.

  • Implications of SEM:
      - Large SEM indicates sample means are widely dispersed around the population mean; small SEM indicates they are tightly clustered around it.
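
The behaviour of the distribution of sample means can be checked with a small simulation. This is only a sketch: the normal population with $\mu$ = -0.75 and $\sigma$ = 1.2, the number of repetitions, and the seed are illustrative assumptions, not values from a real study.

```python
import math
import random
import statistics

def simulate_sample_means(mu, sigma, n, repeats, seed=0):
    """Draw `repeats` samples of size n and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(repeats)
    ]

mu, sigma, n = -0.75, 1.2, 50          # assumed population values
means = simulate_sample_means(mu, sigma, n, repeats=5000)

observed_center = statistics.fmean(means)   # should be close to mu
observed_spread = statistics.stdev(means)   # should be close to sigma / sqrt(n)
theoretical_sem = sigma / math.sqrt(n)

print(f"mean of sample means: {observed_center:.3f} (mu = {mu})")
print(f"SD of sample means:   {observed_spread:.4f} (sigma/sqrt(n) = {theoretical_sem:.4f})")
```

With enough repetitions, the two printed pairs should agree closely, which is the content of the two bullets on the mean and standard deviation of the sample means.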

Factors Affecting SEM

  • Population Standard Deviation ($\sigma$):
      - The larger the $\sigma$, the larger the SEM.

  • Sample Size ($n$):
      - The larger the $n$, the smaller the SEM.

  • Implications of SEM:
      - A smaller SEM indicates a more precise estimate of the population mean.
      - A larger SEM suggests greater uncertainty about the population mean.
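
A short sketch of how the two factors act on the SEM. The helper name `sem` and the alternative values of $\sigma$ and $n$ are hypothetical, chosen only to make the scaling visible.

```python
import math

def sem(sigma, n):
    """Standard error of the mean for population SD sigma and sample size n."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sigma / math.sqrt(n)

# Doubling sigma doubles the SEM; quadrupling n halves it.
base = sem(1.2, 50)
print(f"sigma=1.2, n=50:  SEM = {base:.4f}")
print(f"sigma=2.4, n=50:  SEM = {sem(2.4, 50):.4f}")
print(f"sigma=1.2, n=200: SEM = {sem(1.2, 200):.4f}")
```

Because $n$ enters through a square root, halving the SEM requires four times as many observations.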

Standard Deviation vs. Standard Error of the Mean

Definitions
  • Standard Deviation (SD):
      - Measures the spread of individual data points in a sample or population.

  • Standard Error of the Mean (SEM):
      - Measures how accurately a sample mean estimates the population mean.

Key Differences
  • SEM is typically smaller than SD since it pertains to sample means, not individual data points; predicting the mean of several data points is easier than predicting individual points.

Confidence Intervals

Definitions
  • Confidence Level:
      - The long-run proportion of confidence intervals, each built the same way from a new sample, that contain the true population parameter. For instance, a 95% confidence level means that if you take 100 samples and compute an interval from each, approximately 95 of those intervals will contain the true population parameter.

  • Confidence Interval:
      - A range of values within which the true population parameter is expected to fall. For example, a CI of (10, 20) means that the population parameter is believed to be between 10 and 20 with a certain level of confidence.

Critical Value
  • Definition:
      - A value from a statistical distribution (like the z-distribution for large samples or the t-distribution for small samples) corresponding to the desired confidence level.

  • Example:
      - For a 95% confidence interval, the critical value for the z-distribution is approximately 1.96.

Interpretation
  • A 95% confidence interval's interpretation does not imply the true parameter lies within a specific interval with 95% probability; rather, the true parameter is fixed, and the interval varies based on the sample.
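
The repeated-sampling interpretation can be illustrated by simulation. The sketch below assumes a normal population with known $\sigma$ (so the z critical value 1.96 applies); the population values, repetition count, and seed are illustrative assumptions.

```python
import math
import random
import statistics

def coverage(mu, sigma, n, repeats, z=1.96, seed=1):
    """Fraction of z-based intervals (known sigma) that contain mu."""
    rng = random.Random(seed)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(repeats):
        xbar = statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        # The parameter mu is fixed; the interval moves with each sample.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / repeats

rate = coverage(mu=-0.75, sigma=1.2, n=50, repeats=4000)
print(f"intervals containing the true mean: {rate:.1%}")
```

The printed fraction should be close to 95%, even though any single interval either contains $\mu$ or does not.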

Example of Calculation
  • Sample Data:
      - Sample size (n) = 50, Sample mean ($\bar{x}$) = -0.75 diopters, Standard deviation (s) = 1.2 diopters.

  • Interval Estimate Calculation:
      - Critical value ($t_{c}$) for 95% confidence from the t-distribution with df = 49: 2.009.
      - SEM = $\frac{s}{\sqrt{n}} = \frac{1.2}{\sqrt{50}} = 0.1697$.
      - CI: $\bar{x} \pm t_{c} \cdot SEM = -0.75 \pm 2.009 \cdot 0.1697 = (-1.091, -0.409)$.
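
The interval calculation can be reproduced in a few lines. This is a minimal sketch: the function name is hypothetical, and the critical value 2.009 is taken from the example rather than computed.

```python
import math

def t_confidence_interval(xbar, s, n, t_crit):
    """Return (lower, upper) for xbar +/- t_crit * s / sqrt(n)."""
    sem = s / math.sqrt(n)
    margin = t_crit * sem
    return xbar - margin, xbar + margin

# Values from the worked example; 2.009 is the tabulated critical value for df = 49.
lower, upper = t_confidence_interval(xbar=-0.75, s=1.2, n=50, t_crit=2.009)
print(f"95% CI: ({lower:.3f}, {upper:.3f})")
```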

Hypothesis Testing

Basic Concept
  • In hypothesis testing, you start with an assumption known as the null hypothesis.

  • The goal is to test this assumption using sample data to determine whether the statistical tests provide sufficient evidence to reject the null hypothesis.

Definitions
  • Null Hypothesis (H₀):
      - An assumption that there is no effect or difference (e.g., the mean of the population equals a specific value).

  • Alternative Hypothesis (H₁):
      - An assumption that there is an effect or a difference.

Testing Procedure
  • Involves calculating a test statistic (such as a t or chi-squared statistic) and either comparing it to a critical value or calculating a p-value to decide whether to reject or fail to reject the null hypothesis.

Hypothesis Testing: Example

Situation Setup
  • A study compares two cataract surgery techniques (Technique A and Technique B).

  • Data Collected:
      - Technique A: Mean improvement = 0.25 logMAR, SD = 0.10, Sample size = 50.
      - Technique B: Mean improvement = 0.30 logMAR, SD = 0.08, Sample size = 50.

  • Research Question:
      - Is there a difference in BCVA improvement between the two techniques?

Hypotheses Defined
  • Null Hypothesis (H0):
      - There is no difference in the mean improvement seen with the two techniques.

  • Alternative Hypothesis (H1):
      - There is a difference in mean improvements seen with the two techniques.

Test Selection
  • The test selected for the analysis is the Independent two-sample t-test.

Hypothesis Testing: Independent (Two Sample) t-test

Necessary Equations
  • T-statistic formula:
      - $t = \frac{\bar{x}_1 - \bar{x}_2}{SE}$

  • SE definition:
      - $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.

Explanation of t-statistic

  • Numerator:
      - The difference of the two means; the larger this difference, the more likely it is that the two groups are different.

  • Denominator (SE):
      - Represents variability in the estimated difference; the larger the SE, the weaker the evidence that a given difference in means reflects a real difference between the groups.

  • Variance Addition Note:
      - Variances can be added but standard deviations cannot; combine by squaring, adding, and taking the square root.

Calculating the t-statistic

  • $t = \frac{0.25 - 0.30}{\sqrt{(0.10^2/50) + (0.08^2/50)}} = \frac{-0.05}{\sqrt{0.000328}} = -2.76$

  • Degrees of Freedom:
      - For the two-sample t-test: df = $n_1 + n_2 - 2 = 50 + 50 - 2 = 98$.

  • Significance Level:
      - Select significance level (α) = 0.05.

  • Look up critical t-value:
      - For df = 98, α = 0.05 yields critical t-value ≈ ±1.984; since $t_{stat}$ (-2.76) is in the critical region, we reject $H_0$.
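
The steps of this example can be sketched in code. The function name is hypothetical, and the critical value 1.984 is copied from the table lookup in the notes rather than computed.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic using SE = sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / se

# Technique A vs. Technique B from the cataract surgery example.
t_stat = two_sample_t(0.25, 0.10, 50, 0.30, 0.08, 50)
t_crit = 1.984                      # tabulated two-sided value for df = 98, alpha = 0.05
reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Note that the variances, not the standard deviations, are divided by the sample sizes and summed before the square root is taken.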

Probability Density for Hypothesis Testing

  • Critical Values:
      - Left Tail (2.5%): -1.98, Right Tail (2.5%): 1.98, based on t-Distribution for df = 98.

Conclusion of t-test Example
  • Result Interpretation:
      - There is a statistically significant difference in BCVA improvement between the two techniques.

Significance Level (α)

  • Definition:
      - The probability of rejecting the null hypothesis when it is actually true (Type I error).

  • Common Levels:
      - 0.05, 0.01, and 0.10; example used was α = 0.05 - indicating a 5% risk of a Type I error.

Critical Region (Rejection Region)

  • Definition:
      - The set of values for the test statistic that leads to rejection of the null hypothesis, determined by α.

  • Example Context:
      - For α = 0.05, critical regions are in the extreme 2.5% of both tails of the distribution.

  • Determination:
      - The critical regions lie beyond ±1.984; the t-statistic (-2.76) falls in the critical region, leading to rejection of $H_0$.

p-value Concept

  • Definition:
      - The p-value is the probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true.

  • Interpretation for Decision Making:
      - If p-value ≤ α, reject $H_0$; otherwise, do not reject.

  • Contextual Example:
      - For a t-statistic of -2.76, p-value = 0.007; as this is less than α (0.05), $H_0$ is rejected.
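
A two-sided p-value for a t-statistic can be approximated with the standard library alone by integrating the t density numerically. This is a sketch under stated assumptions: the function names and the number of integration steps are hypothetical choices, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_stat, df, steps=20000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|)."""
    b = abs(t_stat)
    if b == 0:
        return 1.0
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return 1.0 - 2.0 * central_area

p = two_sided_p(-2.76, 98)
print(f"p = {p:.3f}, reject H0 at alpha = 0.05: {p <= 0.05}")
```

The result should agree with the p-value quoted for the example to the precision shown there.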

Reporting p-value

  • Importance of Reporting:
      - Always report the p-value along with the t-statistic, degrees of freedom, and other details; e.g., $t_{98} = -2.76$, $p = 0.007$.

  • Strength of Evidence Highlight:
      - A small p-value (< 0.05) suggests the observed data is unlikely under $H_0$, leading to its rejection.

Types of Error

Type I Error (False Positive)
  • Definition:
      - Occurs when $H_0$ is rejected even though it is true.

  • Control:
      - Controlled by the significance level (α).
      - Example: α = 0.05 implies a 5% chance of making a Type I error when $H_0$ is true.

Type II Error (False Negative)
  • Definition:
      - Occurs when $H_0$ is not rejected even though it is false.

  • Notation:
      - Probability of Type II error is denoted as β.

Understanding Trade-offs
  • Trade-off Context:
      - For a fixed sample size, reducing α lowers the risk of Type I error but increases the risk of Type II error.
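
The trade-off can be made concrete with a two-sided z-test, used here as a simplifying assumption in place of the t-test. The true effect of 2.5 standard errors and the function name are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist

def type_ii_error(alpha, shift_in_se):
    """Beta for a two-sided z-test when the true mean lies shift_in_se standard errors from H0."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Power: probability that the test statistic lands in either rejection region.
    power = z.cdf(shift_in_se - z_crit) + z.cdf(-shift_in_se - z_crit)
    return 1 - power

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> beta = {type_ii_error(alpha, 2.5):.3f}")
```

As α shrinks, the rejection region moves outward, so a real effect of the same size is missed more often and β grows.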

Hypothesis Testing: Paired t-test Example

Situation Description
  • A study investigates a new glaucoma medication.

  • Measured Data:
      - Before Medication: Mean IOP ($\bar{x}_{before}$) = 25 mmHg.
      - After Medication: Mean IOP ($\bar{x}_{after}$) = 20 mmHg.
      - Standard deviation of differences ($s_d$) = 3 mmHg.
      - Sample size = 20.

  • Research Question:
      - Does the medication reduce IOP?

Hypotheses for the Test
  • Null Hypothesis (H0):
      - No difference in the mean IOPs before and after medication.

  • Alternative Hypothesis (H1):
      - A difference exists in the mean IOPs.

Test Chosen
  • Analysis Approach:
      - A Paired t-test is utilized.

Paired t-test Equations and Calculations

Necessary Formulas
  • T-statistic Formula:
      - $t = \frac{\bar{d}}{SE_d}$, where $SE_d = \frac{s_d}{\sqrt{n}}$.

  • Measured $\bar{d}$:
      - Mean difference of IOP values before and after medication.

Calculating the T-statistic
  • Steps:
      - $t = \frac{5}{3/\sqrt{20}} = \frac{5}{0.67} \approx 7.46$.

  • Degrees of Freedom:
      - For the paired t-test: df = n - 1 = 20 - 1 = 19.

  • Significance Level:
      - We select α = 0.05.

Critical t-value and p-value Lookup
  • Critical Value:
      - From tables, for df = 19, α = 0.05, t-critical (2-sided) is ±2.093.

  • Critical Region:
      - The critical region lies beyond ±2.093.

  • p-value Extraction:
      - Based on df = 19 and t = 7.46, p = 4.66E-07 (2-sided).

  • Decision:
      - Since p < α (0.05), reject $H_0$.
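
The paired calculation can be sketched as follows. The function name is hypothetical and the critical value 2.093 is the tabulated one from the notes. Without rounding the standard error to 0.67, the statistic comes out near 7.45 rather than 7.46; the decision is unchanged.

```python
import math

def paired_t(mean_diff, sd_diff, n):
    """t statistic for paired data: mean difference over its standard error."""
    se_d = sd_diff / math.sqrt(n)
    return mean_diff / se_d

# Mean IOP before (25 mmHg) minus after (20 mmHg); SD of differences 3 mmHg; n = 20.
t_stat = paired_t(mean_diff=25 - 20, sd_diff=3, n=20)
t_crit = 2.093                     # tabulated two-sided value for df = 19, alpha = 0.05
print(f"t = {t_stat:.2f}, reject H0: {abs(t_stat) > t_crit}")
```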

Conclusion of the t-test Example
  • Final Interpretation:
      - The medication significantly reduces IOP.

Common Misinterpretations of p-value

Clarifications Needed
  • P-value vs. Probability of Hypothesis:
      - A p-value is the probability of data at least as extreme as those observed, calculated assuming $H_0$ is true; it is not the probability that $H_0$ is true.

  • P-value vs. Evidence Strength:
      - A p-value offers evidence against $H_0$ but not a measure of evidence for $H_1$.

  • P-value vs. Certainty:
      - A small p-value does not guarantee that $H_1$ is true; it only indicates that data this extreme would be unlikely if $H_0$ were true.

Statistical vs. Clinical Importance

Example Overview
  • Study Title:
      - “A Randomized Clinical Trial of Progressive Addition Lenses versus Single Vision Lenses on the Progression of Myopia in Children”.

Data Overview
  • Measurement of myopia progression by change in equivalent mean sphere over 3 years:
      - PAL group: mean of -1.28 D, SE = 0.06 D.
      - SVL group: mean of -1.48 D, SE = 0.06 D.
      - Difference PAL-SVL: mean = 0.2 D, SE = 0.08 D (significant at 5%).
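
The difference and its standard error follow from the two group summaries, assuming the groups are independent. The sketch below also uses a normal approximation for the p-value, which is an assumption of this example and not necessarily the analysis used in the study.

```python
import math
from statistics import NormalDist

pal_mean, pal_se = -1.28, 0.06
svl_mean, svl_se = -1.48, 0.06

diff = pal_mean - svl_mean                       # 0.20 D less progression with PALs
se_diff = math.sqrt(pal_se**2 + svl_se**2)       # variances add, then take the root
z = diff / se_diff
p = 2 * (1 - NormalDist().cdf(abs(z)))           # normal approximation (assumption)
print(f"difference = {diff:.2f} D, SE = {se_diff:.3f} D, z = {z:.2f}, p = {p:.3f}")
```

The difference is more than twice its standard error, so it is statistically significant at 5%, yet 0.2 D over 3 years may still be too small to matter clinically.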

Importance of Distinguishing
  • Statistical Significance:
      - Compares the mean difference (effect size) to the size of random variation (standard error).

  • Clinical Importance:
      - A significant difference must also have an effect size large enough to be meaningful.

Achieving Agreement Between Statistical and Clinical Importance

Suggested Approach
  • Outcome Measure Selection:
      - Choose clinically significant outcome measures. For instance, in myopia progression, define success as a difference of less than -1D over 3 years.

  • Statistical Analysis Framework:
      - Compare how many successes occurred in treatment vs. control, so that a statistically significant result also implies clinical relevance.

Example of Hypothesis Testing with chi-squared test

Invented Data Representation
  • Group Analysis:
      - Use a chi-squared test to determine whether the difference in success counts between the groups is significant.

  • Data Structure:
      - Less than -1D progression:
        - PAL lens: 10, SVL lens: 2.
      - More than -1D progression:
        - PAL lens: 220, SVL lens: 228.
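
A chi-squared test on these invented counts can be sketched with the standard library. The function name is hypothetical; the sketch uses the Pearson statistic without a continuity correction, and 3.841 is the standard critical value for df = 1 at α = 0.05.

```python
import math

def chi_squared_2x2(table):
    """Pearson chi-squared statistic (no continuity correction) for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if group and outcome were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: less than / more than -1D progression; columns: PAL, SVL.
counts = [[10, 2], [220, 228]]
chi2 = chi_squared_2x2(counts)
p = math.erfc(math.sqrt(chi2 / 2))   # exact upper-tail probability for df = 1
print(f"chi-squared = {chi2:.2f}, p = {p:.3f}, exceeds 3.841: {chi2 > 3.841}")
```

With counts this small in one row, an exact test or a continuity correction might be preferred in practice; the uncorrected statistic is shown only to keep the arithmetic transparent.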