Bio-stats 8.2

8.2 Analysis of Paired Samples (Dependent Samples)

This section discusses the hypothesis tests and confidence intervals for paired samples, emphasizing the assumptions, computations, and interpretations crucial for conducting these analyses.

Theorem 8.1: Hypothesis Test of Paired Samples

Assumptions
  1. Normality: The differences should come from an approximately normal distribution.

  2. Random Sampling: The data must represent a random sample of differences from the population of all possible differences.

  3. Notation:

    • Let $d$ denote the differences between paired samples.

    • Let $\bar{d}$ represent the sample mean of those differences.

    • Let $s_d$ be the sample standard deviation of the differences.

  4. Significance Level: Choose a significance level, denoted $\alpha$, typically $5\%$.

  5. Standard Deviation: The population standard deviation of the differences, $\sigma_d$, is unknown and is estimated by $s_d$.

  6. Sample Size Influence: Normality is less critical if the sample size is large (i.e., $n \geq 30$).

Null Hypothesis

The null hypothesis ($H_0$) takes the form:

  • There is no mean difference between populations.

  • Mathematically, $H_0: \mu_d = d_0$, where $d_0 = 0$.

Steps for Hypothesis Testing
A. Compute the Test Statistic

The test statistic (TS) is computed as follows:
$$\text{TS} = \frac{\bar{d} - d_0}{s_d / \sqrt{n}}$$

  • Where:

    • $\bar{d}$ is the mean of the $d$ column.

    • $s_d$ is the sample standard deviation of the differences.

    • $n$ is the sample size.
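The formula above can be evaluated by hand in R. This is a minimal sketch, assuming the before/after scores from the worked example later in this section; the variable names (`d_bar`, `s_d`, `d0`, `ts`) are illustrative choices, not part of the original notes.

```r
# Scores from the worked example later in this section
before <- c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after  <- c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)

d     <- after - before   # paired differences (After - Before)
n     <- length(d)        # number of pairs
d_bar <- mean(d)          # sample mean of the differences
s_d   <- sd(d)            # sample standard deviation (n - 1 denominator)
d0    <- 0                # hypothesized mean difference under H0

# Test statistic: (d_bar - d0) / (s_d / sqrt(n))
ts <- (d_bar - d0) / (s_d / sqrt(n))
round(ts, 3)
```

The rounded value should agree with the $t$ statistic that `t.test` reports for the same data.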

B. Compute the p-Value

The p-Value is calculated based on the alternative hypothesis $H_1$, which specifies the direction of the test.

  • Left-Sided Test:

    • Null: $H_0: \mu_d = d_0$

    • Alternative: $H_1: \mu_d < d_0$

    • p-Value: P(T < TS) = tCDF(-E99, TS, df)

  • Right-Sided Test:

    • Alternative: $H_1: \mu_d > d_0$

    • p-Value: P(T > TS) = tCDF(TS, E99, df)

  • Degrees of Freedom:

    • Calculated as $df = n - 1$.

  • Two-Sided Test:

    • Alternative: $H_1: \mu_d \neq d_0$

    • p-Value is twice the probability of the appropriate one-sided hypothesis:

    • If $TS < 0$:
      p-Value $= 2 \times P(T < TS) = 2 \times \text{tCDF}(-\text{E}99, TS, df)$

    • If $TS > 0$:
      p-Value $= 2 \times P(T > TS) = 2 \times \text{tCDF}(TS, \text{E}99, df)$
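In R, the role of tCDF is played by `pt()`, the cumulative distribution function of the $t$ distribution. The sketch below shows one way to cover the three alternatives; `p_value` is a hypothetical helper written for illustration, and the inputs `2.776` and `df = 9` are taken from the worked example later in this section.

```r
# Hypothetical helper: p-value of a t test statistic for a given alternative.
p_value <- function(ts, df, alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = pt(ts, df),                      # left-sided:  P(T < TS)
    greater   = pt(ts, df, lower.tail = FALSE),  # right-sided: P(T > TS)
    two.sided = 2 * pt(-abs(ts), df)             # twice the smaller tail
  )
}

# Right-sided p-value for TS = 2.776 with df = 9
p_value(2.776, df = 9, alternative = "greater")
```

Using `-abs(ts)` in the two-sided case is equivalent to the case split on the sign of TS above, because the $t$ distribution is symmetric about zero.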

C. Make a Decision
  • Decision Rule:

    • If $p\text{-value} < \alpha$, then reject $H_0$.

    • If $p\text{-value} \geq \alpha$, then fail to reject $H_0$.

    • It is important to note that one should never accept $H_0$, as a hypothesis test cannot definitively prove that the null hypothesis is true.

D. Interpretation
  • If $H_0$ is rejected:

    • There is sufficient evidence to support the research hypothesis $H_1$.

  • If $H_0$ is not rejected:

    • There is not sufficient evidence to support the research hypothesis $H_1$.

Theorem 8.2: Confidence Interval for Paired Samples

Assumptions

Similar to Theorem 8.1, the assumptions for paired sample confidence intervals are:

  1. The differences must come from an approximately normal distribution.

  2. The data are a random sample of differences from the population of all possible differences.

  3. The standard deviation of differences, $\sigma_d$, is unknown.

  4. Normality is less critical if the sample size is large (i.e., $n \geq 30$).

Steps for Computing Confidence Intervals
A. Compute the Point Estimate
  • The point estimate of the mean difference is given by: $\bar{d} = \frac{\sum d_i}{n}$

    • Where $d_i$ represents the individual differences and $n$ is the number of pairs.

B. Compute the Margin of Error
  • The critical value, $t_{\alpha/2}$, is determined as: $t_{\alpha/2} = | \text{invT}(\alpha/2, df) |$

    • Ensure that the critical value is positive (invT returns a negative value for a left-tail area of $\alpha/2$); the degrees of freedom are $df = n - 1$.

  • The margin of error is calculated as:
    $$\text{MOE} = t_{\alpha/2} \times \frac{s_d}{\sqrt{n}}$$

C. Confidence Interval
  • The confidence interval can then be expressed as:
    $$\bar{d} \pm \text{MOE} = (\bar{d} - \text{MOE},\ \bar{d} + \text{MOE}), \quad \text{MOE} = t_{\alpha/2} \times \frac{s_d}{\sqrt{n}}$$
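The three steps (point estimate, margin of error, interval) can be sketched by hand in R. The sketch assumes the before/after scores from the worked example later in this section and $\alpha = 0.05$; `qt()` is R's $t$ quantile function and is used here in the same role as invT. The names `t_crit`, `moe`, and `ci` are illustrative.

```r
# Scores from the worked example later in this section
before <- c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after  <- c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)

d     <- after - before
n     <- length(d)
alpha <- 0.05                            # assumed significance level

d_bar  <- mean(d)                        # A. point estimate
t_crit <- abs(qt(alpha / 2, df = n - 1)) # B. critical value, made positive
moe    <- t_crit * sd(d) / sqrt(n)       # B. margin of error

ci <- c(lower = d_bar - moe, upper = d_bar + moe)  # C. confidence interval
ci
```

The endpoints should agree with the interval that a two-sided `t.test(..., paired = TRUE)` reports for the same data.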

Interpretation
  • We are $100(1 - \alpha)\%$ confident that the true mean difference $\mu_d$ lies within the computed interval.

Practical Implications
  • On a TI calculator, both procedures are run on the list of differences $d$:

    • Paired t-test for the mean difference: STAT → TESTS → 2: T-Test.

    • Confidence interval for the mean difference: STAT → TESTS → 8: TInterval.

Example Application of Paired T-Test

Assessment of Mean Scores

  • The researchers want to determine whether viewing the educational video raises mean scores, using paired scores recorded before and after viewing.

Data Comparison

Example Data
  • Scores Before Viewing:

    • $[61, 60, 52, 74, 64, 75, 42, 63, 53, 56]$

  • Scores After Viewing:

    • $[67, 62, 54, 83, 60, 89, 44, 67, 62, 57]$

  • Differences (After - Before):

    • $d = A - B$, computed pair by pair:

    • $[67, 62, 54, 83, 60, 89, 44, 67, 62, 57] - [61, 60, 52, 74, 64, 75, 42, 63, 53, 56]$

    • Differences: $[6, 2, 2, 9, -4, 14, 2, 4, 9, 1]$

State the Hypotheses
  • Null Hypothesis ($H_0$):

    • $\mu_d = 0$ (no mean difference).

  • Alternative Hypothesis ($H_1$):

    • $\mu_d > 0$ (after scores are higher).

Direction of the Test
  • Right-tailed test, since the goal is to detect an increase in mean scores.

Test Statistic Calculation
  • Test Statistic: $\text{TS} = \frac{\bar{d} - d_0}{s_d / \sqrt{n}}$

    • With $n = 10$ pairs ($df = 9$), this gives $\text{TS} \approx 2.776$ with a corresponding p-value of $0.0108$.

    • Decision:

    • Since $0.0108 < 0.05$, reject $H_0$.
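As a worked check of the test statistic, the arithmetic can be written out from the score lists above. The ten differences (After minus Before) sum to 45, and their squared deviations from the mean sum to 236.5; the sample standard deviation formula with an $n - 1$ denominator is the standard one and is assumed here.

$$
\begin{aligned}
\bar{d} &= \frac{\sum d_i}{n} = \frac{45}{10} = 4.5, \\
s_d &= \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}} = \sqrt{\frac{236.5}{9}} \approx 5.126, \\
\text{TS} &= \frac{\bar{d} - d_0}{s_d / \sqrt{n}} = \frac{4.5 - 0}{5.126 / \sqrt{10}} \approx 2.776.
\end{aligned}
$$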

Interpretation
  • At the 5% significance level, there is sufficient evidence that the mean score after viewing the educational video is higher than the mean score before viewing.

Using R for Computation

Confidence Interval Computation

  • In R, a paired t.test with a two-sided alternative reports the two-sided confidence interval for the mean difference at the level set by conf.level.

# R code implementation
before = c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after = c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)
t.test(after, before, paired = TRUE, alternative = 'two.sided', conf.level = 0.95)
Output
  • The resulting 95% confidence interval for the mean difference is $(0.8329478, 8.1670522)$. Because the interval excludes 0, it is consistent with a nonzero mean difference.
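The interval can be checked by hand against the margin-of-error formula from Theorem 8.2. The values $\bar{d} = 4.5$ and $s_d \approx 5.126$ come from the example data, and $t_{0.025} \approx 2.262$ is the standard critical value for $df = 9$; small differences from the R output are due to rounding.

$$
\bar{d} \pm t_{\alpha/2} \times \frac{s_d}{\sqrt{n}} = 4.5 \pm 2.262 \times \frac{5.126}{\sqrt{10}} \approx 4.5 \pm 3.667 = (0.833,\ 8.167)
$$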

Paired T-Test Implementation in R

  • For confirming the analysis result:

# Execute paired t-test
before = c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after = c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)
t.test(after, before, paired = TRUE, mu = 0, conf.level = 0.95, alternative = 'greater')
Summary of R Outputs
  • Test statistic found to be $t = 2.776$, with $df = 9$, and a p-value of $0.01077$, confirming the hypothesis test findings.

  • Because alternative = 'greater' was specified, R reports a one-sided 95% confidence interval for the true mean difference: $(1.528447, \infty)$.
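The one-sided lower bound can be reproduced by hand. This is a sketch under the assumption that a one-sided interval places all of $\alpha = 0.05$ in a single tail, so the critical value is the 0.95 quantile of the $t$ distribution rather than the 0.975 quantile; the names `t_crit` and `lower` are illustrative.

```r
before <- c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after  <- c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)

d <- after - before
n <- length(d)

# One-sided critical value: the whole alpha = 0.05 sits in one tail
t_crit <- qt(0.95, df = n - 1)

# Lower confidence bound for the mean difference; the upper bound is Inf
lower <- mean(d) - t_crit * sd(d) / sqrt(n)
lower
```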

Comparison with TI Calculations

  • Verify that results from R align with TI outputs for consistency in calculations and interpretations.

Final Considerations
  • Before interpreting results, check that the assumptions of the paired procedures hold: the differences form a random sample and are approximately normal (or the sample is large, $n \geq 30$).