
Chapter 7 -- Complete

Chapter 7: Comparing Two Means (Dependent Samples)

Introduction

This chapter provides an overview of comparing two means with statistical methods in the life and health sciences, emphasizing the relevance of these comparisons for experimental design and analysis. It also highlights the distinct approaches required for independent and dependent samples in statistical inference, which are essential for accurate interpretation of data.

Two-Sample Problems: Quantitative

Independent vs. Dependent Samples:

  • Importance of assessing the dependence of samples: Understanding whether samples are independent or dependent is crucial when comparing means, as it influences the choice of statistical tests and the interpretation of results.

  • The previous chapter (Chapter 6) focused primarily on independent samples, while Chapter 7 delves into the methodology for dependent samples, marking a significant conceptual shift in the analysis.

Section 7.1: Paired Designs

Need for Matching:

  • Randomization is a common practice in experimental design; however, it can sometimes leave lurking variables unevenly distributed across groups, producing confounding effects that threaten the validity of the conclusions.

  • Using matched pairs can counteract these confounding variables effectively; examples include classic before-and-after scenarios (e.g., measuring outcomes in experiments outlined in Chapter 4).

Definition of Matched Pairs Design:

  • A Matched Pairs Design compares two treatments by using pairs of similar experimental units—these units can either be individuals matched based on key characteristics or paired observations for the same unit before and after treatment.

  • Treatments within these pairs are then assigned randomly to eliminate bias and increase the reliability of the findings.

Notation for Matched Pairs

Calculating Responses:
  • Differences within each pair are calculated as (d_i = x_{1,i} - x_{2,i}), where x_{1,i} is the response to the first treatment in pair i and x_{2,i} is the response to the second treatment.

  • One-sample procedures are subsequently applied to these calculated differences to assess statistically significant distinctions.

Mean Calculations:
  • The sample mean of the differences (x̄_d) is computed in the usual way; the variance of the differences (s_d^2), however, must be computed directly from the differences themselves rather than from the variances of the two original samples.
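
A minimal sketch of these calculations in Python (the paired responses below are purely illustrative, not data from the text):

    import numpy as np

    # Illustrative paired responses: treatment 1 and treatment 2 for the same experimental units
    x1 = np.array([12.3, 11.8, 13.1, 12.9, 12.0])
    x2 = np.array([11.9, 11.5, 12.8, 12.2, 12.1])

    d = x1 - x2              # paired differences, d_i = x_{1,i} - x_{2,i}
    d_bar = d.mean()         # sample mean of the differences, x̄_d
    s_d = d.std(ddof=1)      # sample standard deviation of the differences, s_d
    print(d_bar, s_d)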

Section 7.3: Theoretical Approach for Paired Samples

Theoretical Framework:

  • This section introduces a structured process for analyzing paired differences that parallels the one-sample t-procedures, providing a clear pathway for statistical testing.

  • Key steps include computing the paired differences, determining their sample mean (x̄_d), and calculating their standard deviation (s_d).

  • Confidence intervals for the mean difference are constructed exactly as in the one-sample case, using x̄_d ± t*·s_d/√n_d, which keeps the statistical reasoning consistent.
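
A brief sketch of this parallel with SciPy (the data arrays are illustrative); applying a one-sample t-procedure to the differences gives the same result as the built-in related-samples test:

    import numpy as np
    from scipy import stats

    x1 = np.array([12.3, 11.8, 13.1, 12.9, 12.0])   # illustrative treatment-1 responses
    x2 = np.array([11.9, 11.5, 12.8, 12.2, 12.1])   # illustrative treatment-2 responses
    d = x1 - x2

    # One-sample t-test applied to the paired differences ...
    print(stats.ttest_1samp(d, popmean=0))

    # ... is equivalent to the paired (related-samples) t-test on the original pairs
    print(stats.ttest_rel(x1, x2))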

Validity Condition for Normal Approximation

Criteria for Validity:
  • A large sample of paired differences (n_d ≥ 30) allows the normal approximation through the central limit theorem, regardless of the shape of the distribution of the differences; this is essential for the validity of the parametric analysis.

  • Alternatively, for smaller samples the paired differences should follow an approximately normal (or at least symmetric) distribution; a more robust guideline accepts as few as roughly 20 observations provided the distribution of the differences is not strongly skewed.
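
One informal way to check these conditions is to examine the distribution of the differences directly; a sketch with SciPy and matplotlib (the differences are illustrative):

    import numpy as np
    from scipy import stats
    import matplotlib.pyplot as plt

    # Illustrative paired differences
    d = np.array([0.4, -0.1, 0.3, 0.7, -0.2, 0.5, 0.1, 0.6, -0.3, 0.2, 0.4, 0.1])

    stats.probplot(d, dist="norm", plot=plt)   # normal probability (Q-Q) plot
    plt.show()

    print(stats.skew(d))      # skewness near 0 suggests a roughly symmetric distribution
    print(stats.shapiro(d))   # formal normality test; interpret cautiously for small n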

Example: Aspirin Effect on Blood Clotting Time

Research Context:
  • A study was conducted to investigate the effect of aspirin on prothrombin time across 12 subjects, providing insights into potential clinical benefits or risks associated with aspirin use.

  • The primary aim was to test if aspirin significantly increases clotting time at a significance level (α = 0.05).

Test Setup:
  • A normal probability plot indicated that the population of differences is roughly normal, supporting the use of the t-procedures. The hypothesis test outline states the assumptions of a random sample and approximate normality of the differences, along with a clearly defined null hypothesis (H_0: μ_d = 0) and a one-sided alternative (H_a: μ_d > 0), since the aim is to show that aspirin increases clotting time.

  • The test yielded a test statistic of (t = 0.7400, df = 11) and a corresponding p-value of (0.2374), leading to a failure to reject the null hypothesis: there is insufficient evidence that aspirin significantly increases clotting time.
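
As a quick check, the reported p-value follows from the t distribution with 11 degrees of freedom; a one-line SciPy calculation under the one-sided alternative:

    from scipy import stats

    # Upper-tail p-value for t = 0.7400 with df = 11
    p = stats.t.sf(0.7400, df=11)
    print(round(p, 4))   # approximately 0.2374, matching the reported value up to rounding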

Confidence Interval

Calculation Procedure:
  • A 95% confidence interval was calculated as (-0.2141, 0.4301); because this range includes 0, the true mean difference in prothrombin time due to aspirin intake could be either positive or negative.

  • The interpretation is that we are 95% confident that the true average change in prothrombin time from aspirin lies within these bounds, which highlights the uncertainty about the direction of the treatment effect.
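
The interval follows the one-sample recipe x̄_d ± t*·s_d/√n_d. A sketch with SciPy; the summary values below (x̄_d ≈ 0.108, s_d ≈ 0.507, n_d = 12) are back-calculated from the reported interval and are only approximate:

    import numpy as np
    from scipy import stats

    d_bar, s_d, n_d = 0.108, 0.507, 12
    t_star = stats.t.ppf(0.975, df=n_d - 1)      # critical value for a 95% interval
    margin = t_star * s_d / np.sqrt(n_d)
    print(d_bar - margin, d_bar + margin)        # roughly (-0.214, 0.430)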

Section 7.2: Simulation Approach for Paired Samples

Introduction to Simulation:

  • This section presents an overview of using simulation-based (randomization) methods when the validity conditions for the theory-based tests are not met, providing a nonparametric route to reliable inference when the standard approach cannot be trusted.

The Randomization Effect with Matched Pairs

Randomization Process:
  • The emphasis here is on keeping individuals linked across the two treatments so that differences are assessed within pairs, while random assignment of the treatments within each pair mitigates bias.

Example: Heart Rate After Exercise

Exercise Comparison:
  • A study compared the heart rates of participants post-exercise, particularly looking at the effects of different exercise types (jumping jacks vs. bicycle kicks) using paired samples taken from 22 individuals, crucial for understanding the physiological impacts of exercise type on heart rate.

Using the Applet

  • A simulation tool (applet) is used to run 10,000 simulations, building a null distribution for the difference between treatments against which the observed difference is compared.
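
The logic behind the simulation can be sketched in Python: under the null hypothesis the treatment labels within each pair are exchangeable, so each observed difference is equally likely to carry either sign (the 22 differences below are illustrative, not the study's data):

    import numpy as np

    rng = np.random.default_rng(1)

    # Illustrative paired differences (jumping jacks minus bicycle kicks) for 22 individuals
    d = np.array([15, 8, -3, 20, 12, 5, 18, -6, 22, 9, 14,
                  7, 11, 25, -2, 16, 10, 13, 19, 4, 21, 6])
    observed = d.mean()

    # Randomization: flip the sign of each difference at random, 10,000 times
    sims = np.array([(d * rng.choice([-1, 1], size=d.size)).mean()
                     for _ in range(10_000)])

    # Two-sided p-value: proportion of simulated means at least as extreme as the observed mean
    p_value = np.mean(np.abs(sims) >= abs(observed))
    print(observed, p_value)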

Hypothesis Test Summary

Structured Outline for Hypothesis Test:
  • Assumptions: The test relies on a simple random sample of dependent (paired) observations, from which the differences are computed.

  • Hypotheses Defined: H_0: μ_d = 0 (no difference), H_a: μ_d ≠ 0 (a difference exists).

  • Test Statistic Output: An observed mean difference of (x̄_d = 11.95) was calculated, taking Group 1 (jumping jacks) minus Group 2 (bicycle kicks).

  • The resulting p-value (p = 0.0124) provides strong evidence against the null hypothesis; combined with the positive observed difference, this indicates that jumping jacks raise heart rate more than bicycle kicks do.

2SD Confidence Interval

Confidence Interval Estimate:
  • The 95% confidence interval was estimated using an approximate standard error (the 2SD method: observed mean difference ± 2 standard errors), yielding (2.156, 21.744); because the entire interval lies above 0, it is highly likely that jumping jacks raise heart rate more than bicycle kicks.
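
For reference, the 2SD interval has the form x̄_d ± 2·SE; the standard error implied by the reported endpoints is roughly SE ≈ 4.90, so the calculation is 11.95 ± 2(4.897) ≈ (2.156, 21.744).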

Final Notes: Importance of Pairing Samples

Advantages of Paired Designs:

  • This section analyzes the reasons why pairing samples can yield insights not captured through independent sample testing. A case example is provided to illustrate how neglecting paired samples can obscure underlying effects, such as the influence of biofeedback training on blood pressure.

Without Pairing: Analysis Issues

Consequences of Treating Dependent Samples Independently:
  • When dependent data are analyzed with an independent-samples test, the large variation between subjects remains in the standard error, which can obscure a genuine treatment effect and lead to incorrect conclusions.

Pairing Matters!

Emphasis on Correct Methodology:
  • This reinforces that dependent-sample procedures (such as the matched-pairs t-test) are the statistically appropriate choice for paired data, emphasizing the importance of matching the analysis to the study design; the simulation sketch below illustrates the point.
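
A minimal simulation sketch of why pairing matters (the data are synthetic: a small, consistent treatment effect on top of large subject-to-subject variation, loosely in the spirit of the biofeedback example):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)

    # Synthetic data: large variation between subjects, small consistent treatment effect
    n = 15
    baseline = rng.normal(140, 15, size=n)            # subject-specific blood pressure levels
    before = baseline + rng.normal(0, 2, size=n)
    after = baseline - 4 + rng.normal(0, 2, size=n)   # treatment lowers pressure by about 4 units

    # Independent-samples test ignores the pairing: subject-to-subject variation
    # stays in the standard error and can mask the effect
    print(stats.ttest_ind(before, after))

    # Matched-pairs test works with the within-subject differences, removing that variation
    print(stats.ttest_rel(before, after))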

Final Notes: Considerations for Proportions and Medians

Exploring Alternatives:

  • The chapter also explores methods for assessing differences in medians and nonparametric tests, such as the Wilcoxon Signed-Rank Test, which serve as viable alternatives to tests of mean differences when the normality assumptions are not met.

  • Additionally, McNemar's Test is introduced for evaluating dependent proportions, particularly pertinent for categorical data analysis.
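
Brief sketches of both alternatives, using SciPy for the Wilcoxon Signed-Rank Test and statsmodels for McNemar's Test (the differences and the 2x2 table are illustrative only):

    import numpy as np
    from scipy import stats
    from statsmodels.stats.contingency_tables import mcnemar

    # Wilcoxon signed-rank test on illustrative paired differences (no normality assumption)
    d = np.array([0.4, -0.1, 0.3, 0.7, -0.2, 0.5, 0.1, 0.6, -0.3, 0.2, 0.4, 0.8])
    print(stats.wilcoxon(d))

    # McNemar's test on an illustrative 2x2 table of paired yes/no outcomes
    # rows: outcome under condition 1; columns: outcome under condition 2
    table = np.array([[30, 12],
                      [5, 25]])
    res = mcnemar(table, exact=True)
    print(res.statistic, res.pvalue)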