Bio-stats 8.2
3.2 Analysis of Paired Samples (Dependent Samples)
This section discusses the hypothesis tests and confidence intervals for paired samples, emphasizing the assumptions, computations, and interpretations crucial for conducting these analyses.
Theorem 8.1: Hypothesis Test of Paired Samples
Assumptions
Normality: The data should be from an approximately normal distribution.
Random Sampling: The data must represent a random sample of differences from the population of all possible differences.
Notation:
Let $d$ denote the differences between paired samples.
Let $\bar{d}$ represent the sample mean of those differences.
Let $s_d$ be the sample standard deviation of the differences.
Significance Level: The significance level, denoted $\alpha$, is typically set at $5\%$.
Standard Deviation: The population standard deviation of the differences, $\sigma_d$, is unknown.
Sample Size Influence: Normality is less critical if the sample size is large (i.e., $n \geq 30$).
Null Hypothesis
The null hypothesis ($H_0$) takes the form:
There is no mean difference between populations.
Mathematically, $H_0: \mu_d = d_0$, where $d_0 = 0$.
Steps for Hypothesis Testing
A. Compute the Test Statistic
The test statistic (TS) is computed as follows:
$TS = \frac{\bar{d} - d_0}{s_d / \sqrt{n}}$
Where:
$\bar{d}$ is the mean of the $d$ column.
$s_d$ is the sample standard deviation of the differences.
$n$ is the sample size.
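As a rough illustration, the statistic $(\bar{d} - d_0)/(s_d/\sqrt{n})$ can be computed by hand in R. The helper name `paired_ts` and the toy vectors below are hypothetical and are not taken from the text; this is a minimal sketch, not a replacement for `t.test`.

```r
# Hypothetical helper: paired-samples t statistic from two matched vectors.
paired_ts <- function(x, y, d0 = 0) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  d <- x - y                           # one difference per pair
  n <- length(d)                       # number of pairs
  (mean(d) - d0) / (sd(d) / sqrt(n))   # (d-bar - d0) / (s_d / sqrt(n))
}

# Illustrative toy data (assumed for this sketch only)
x <- c(5, 7, 6, 9)
y <- c(4, 5, 7, 6)
ts <- paired_ts(x, y)
print(ts)
```

The result should agree with the `statistic` component returned by `t.test(x, y, paired = TRUE)`.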
B. Compute the p-Value
The p-Value is calculated based on the alternative hypothesis $H_1$, which specifies the direction of the test.
Left-Sided Test:
Null: $H_0: \mu_d = d_0$
Alternative: $H_1: \mu_d < d_0$
p-Value: P(T < TS) = tCDF(-E99, TS, df)
Right-Sided Test:
Alternative: $H_1: \mu_d > d_0$
p-Value: P(T > TS) = tCDF(TS, E99, df)
Degrees of Freedom:
Calculated as $df = n - 1$.
Two-Sided Test:
Alternative: $H_1: \mu_d \neq d_0$
p-Value is twice the probability of the appropriate one-sided hypothesis:
If $TS < 0$:
p-Value $= 2 \times P(T < TS) = 2 \times$ tCDF(-E99, TS, df)
If $TS > 0$:
p-Value $= 2 \times P(T > TS) = 2 \times$ tCDF(TS, E99, df)
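The three p-Value cases can be mirrored in R, where `pt()` plays the role of the calculator's tCDF. The function name `paired_p_value` and the values `ts = 2.0`, `df = 9` are hypothetical, chosen only to illustrate the calls.

```r
# Hypothetical helper: p-value for a t statistic under each alternative.
paired_p_value <- function(ts, df,
                           alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    less      = pt(ts, df),                      # P(T < TS), left-sided
    greater   = pt(ts, df, lower.tail = FALSE),  # P(T > TS), right-sided
    two.sided = 2 * pt(-abs(ts), df)             # twice the smaller tail area
  )
}

# Assumed example values, not from the text
ts <- 2.0
df <- 9
p_left  <- paired_p_value(ts, df, "less")
p_right <- paired_p_value(ts, df, "greater")
p_two   <- paired_p_value(ts, df, "two.sided")
print(c(left = p_left, right = p_right, two_sided = p_two))
```

Using `-abs(ts)` for the two-sided case covers both the $TS < 0$ and $TS > 0$ branches in a single expression.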
C. Make a Decision
Decision Rule:
If p-Value $< \alpha$, then reject $H_0$.
If p-Value $\geq \alpha$, then fail to reject $H_0$.
It is important to note that one should never accept $H_0$, as a hypothesis test cannot definitively prove that the null hypothesis is true.
D. Interpretation
If $H_0$ is rejected:
There is sufficient evidence to support the research hypothesis $H_1$.
If $H_0$ is not rejected:
There is not sufficient evidence to support the research hypothesis $H_1$.
Theorem 8.2: Confidence Interval for Paired Samples
Assumptions
Similar to Theorem 8.1, the assumptions for paired sample confidence intervals are:
Data must come from an approximately normal distribution.
Data is a random sample of differences from the population of all possible differences.
The standard deviation of differences, $\sigma_d$, is unknown.
Normality is less critical if the sample size is large (i.e., $n \geq 30$).
Steps for Computing Confidence Intervals
A. Compute the Point Estimate
The point estimate of the mean difference is given by:
$\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i$
Where $d_i$ represents the individual differences and $n$ is the number of pairs.
B. Compute the Margin of Error
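The critical value, $t_{\alpha/2}$, is the value with area $\alpha/2$ to its right under the $t$ distribution. On a TI calculator it is determined as:
$t_{\alpha/2} = \left|\text{invT}(\alpha/2,\ df)\right|$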
Ensure that the critical value is positive, and degrees of freedom are calculated as $df = n - 1$.
The margin of error is calculated as:
$E = t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}$
C. Confidence Interval
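The confidence interval can then be expressed as the point estimate plus or minus the margin of error:
$\left(\bar{d} - t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}},\ \bar{d} + t_{\alpha/2} \cdot \frac{s_d}{\sqrt{n}}\right)$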
Interpretation
The interpretation states that we are $100(1 - \alpha)\%$ confident that the true mean difference $\mu_d$ lies within the specified interval.
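The interval construction can be sketched in R, with `qt()` supplying the positive critical value. The helper name `paired_ci` and the toy differences are hypothetical and serve only as an illustration.

```r
# Hypothetical helper: two-sided confidence interval for the mean difference.
paired_ci <- function(d, conf_level = 0.95) {
  n      <- length(d)
  df     <- n - 1
  alpha  <- 1 - conf_level
  d_bar  <- mean(d)                    # point estimate
  t_crit <- qt(1 - alpha / 2, df)      # positive critical value t_{alpha/2}
  margin <- t_crit * sd(d) / sqrt(n)   # margin of error
  c(lower = d_bar - margin, upper = d_bar + margin)
}

# Illustrative toy differences (assumed for this sketch only)
d <- c(1, 2, -1, 3)
ci <- paired_ci(d)
print(ci)
```

Calling `qt(1 - alpha / 2, df)` rather than `qt(alpha / 2, df)` returns the upper-tail quantile directly, so no absolute value is needed.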
Practical Implications
TI calculators provide built-in procedures for both analyses. Enter the differences in a list, then use:
T-Test for the hypothesis test on the mean difference.
TInterval for constructing the confidence interval.
Access these through STAT→TESTS→2: T-Test or 8: TInterval.
Example Application of Paired T-Test
Assessment of Mean Scores
The researchers aim to evaluate whether the educational video significantly affects mean scores observed before and after viewing.
Data Comparison
Example Data
Scores Before Viewing:
$[61, 60, 52, 74, 64, 75, 42, 63, 53, 56]$
Scores After Viewing:
$[67, 62, 54, 83, 60, 89, 44, 67, 62, 57]$
Differences (After - Before):
$d = A - B$, computed pair by pair:
$[67, 62, 54, 83, 60, 89, 44, 67, 62, 57] - [61, 60, 52, 74, 64, 75, 42, 63, 53, 56]$
Differences: $[6, 2, 2, 9, -4, 14, 2, 4, 9, 1]$
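The differences and their summary statistics can be computed directly from the two score vectors in R. The variable names here are illustrative choices.

```r
# Scores from the example above
before <- c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after  <- c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)

d <- after - before   # one difference per participant
print(d)              # 6 2 2 9 -4 14 2 4 9 1

n     <- length(d)    # number of pairs: 10
d_bar <- mean(d)      # sample mean of the differences: 4.5
s_d   <- sd(d)        # sample standard deviation, about 5.126
print(c(n = n, d_bar = d_bar, s_d = s_d))
```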
State the Hypotheses
Null Hypothesis ($H_0$):
$\mu_d = 0$ (no mean difference).
Alternative Hypothesis ($H_1$):
$\mu_d > 0$ (after scores are higher).
Direction of the Test
Right-tailed test, since the alternative hypothesis is an increase in mean scores.
Test Statistic Calculation
Test Statistic:
$TS = \frac{\bar{d} - d_0}{s_d / \sqrt{n}} = \frac{4.5 - 0}{5.126 / \sqrt{10}} \approx 2.776$
With $df = 9$, the corresponding p-Value is $P(T > 2.776) \approx 0.0108$, which is statistically significant at the 5% level.
Decision:
Since p-Value $= 0.0108 < 0.05$, reject $H_0$.
Interpretation
At the 5% significance level, there is sufficient evidence that the mean score after viewing the educational video is higher than the mean score before viewing.
Using R for Computation
Confidence Interval Computation
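In R, t.test with paired = TRUE reports a confidence interval whose form depends on the alternative argument; alternative = 'two.sided' gives the usual two-sided interval.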
# R code implementation
before = c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after = c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)
t.test(after, before, paired = TRUE, alternative = 'two.sided', conf.level = 0.95)
Output
The resulting 95% confidence interval for the mean difference $\mu_d$ is $(0.8329478, 8.1670522)$. Because the interval lies entirely above 0, it is consistent with higher scores after viewing.
Paired T-Test Implementation in R
For confirming the analysis result:
# Execute paired t-test
before = c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after = c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)
t.test(after, before, paired = TRUE, mu = 0, conf.level = 0.95, alternative = 'greater')
Summary of R Outputs
The test statistic is $t = 2.776$ with $df = 9$ and a p-value of $0.01077$, confirming the hypothesis test findings.
Because alternative = 'greater' was specified, R reports a one-sided 95% confidence interval for the true mean difference: $(1.528447, \infty)$.
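As a cross-check, the components of the `t.test` result can be compared against the by-hand formulas. This is a sketch: the names `res`, `ts_hand`, `p_hand`, and `lower_hand` are illustrative, and the one-sided lower bound is assumed to use the critical value `qt(0.95, df)`.

```r
before <- c(61, 60, 52, 74, 64, 75, 42, 63, 53, 56)
after  <- c(67, 62, 54, 83, 60, 89, 44, 67, 62, 57)

# Built-in right-sided paired t-test
res <- t.test(after, before, paired = TRUE, mu = 0,
              conf.level = 0.95, alternative = "greater")

# By-hand versions of the same quantities
d  <- after - before
n  <- length(d)
se <- sd(d) / sqrt(n)                                  # standard error of d-bar
ts_hand    <- (mean(d) - 0) / se                       # test statistic
p_hand     <- pt(ts_hand, df = n - 1, lower.tail = FALSE)  # P(T > TS)
lower_hand <- mean(d) - qt(0.95, df = n - 1) * se      # one-sided lower bound

# Each comparison should report TRUE
print(all.equal(unname(res$statistic), ts_hand))
print(all.equal(res$p.value, p_hand))
print(all.equal(res$conf.int[1], lower_hand))
```

The same three quantities are what a TI T-Test run on the list of differences reports, so this comparison doubles as a check against calculator output.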
Comparison with TI Calculations
Verify that results from R align with TI outputs for consistency in calculations and interpretations.
Final Considerations
It is critical that all assumptions for paired tests are satisfied, and proper interpretation of results should be conducted to deduce accurate conclusions.