Two Sample Hypothesis Testing

Set null and alternative hypotheses.
Select significance level $( \alpha )$ and calculate the test statistic.
Compute the p-value.
Make a decision based on the p-value:
- Reject the null hypothesis if the p-value is less than or equal to the significance level.
- Do not reject the null hypothesis if the p-value is greater than the significance level.

Alternative hypothesis is the new claim or question.
Null hypothesis includes the null value $( \mu_0 )$ .
- $H_A: \mu > \mu_0$, $H_0: \mu \le \mu_0$
- $H_A: \mu < \mu_0$, $H_0: \mu \ge \mu_0$
- $H_A: \mu \neq \mu_0$, $H_0: \mu = \mu_0$
If population SD known, test statistic is $Z = \frac{\overline{X} - \mu_0}{\frac{\sigma}{\sqrt{n}}}$
- Excel uses NORM.DIST for P-value
If population SD unknown, test statistic is $t = \frac{\overline{X} - \mu_0}{\frac{s}{\sqrt{n}}}$
- Excel uses T.DIST, T.DIST.RT, T.DIST.2T
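
The steps above can be sketched in Python for the known-SD case. This is a minimal illustration, not a reference implementation: the function name `one_sample_z_test`, the `alternative` argument, and the numbers in the example are all hypothetical. It uses only the standard library's `statistics.NormalDist` for the normal CDF.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(x_bar, mu_0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for a one-sample Z test with known population SD."""
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    cdf = NormalDist().cdf(z)          # P(Z <= z) under the null hypothesis
    if alternative == "greater":       # H_A: mu > mu_0
        p_value = 1 - cdf
    elif alternative == "less":        # H_A: mu < mu_0
        p_value = cdf
    else:                              # H_A: mu != mu_0
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value


# Hypothetical data (assumed for illustration): n = 36, sample mean 52,
# null value 50, known population SD 6, significance level 0.05.
alpha = 0.05
z, p_value = one_sample_z_test(x_bar=52, mu_0=50, sigma=6, n=36,
                               alternative="greater")
decision = "reject H0" if p_value <= alpha else "do not reject H0"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

With these made-up numbers the statistic is z = 2.00 and the one-sided p-value is about 0.023, so the null hypothesis is rejected at the 0.05 level.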

Most test statistic formulas assume independent populations and samples.
Independent samples: Knowing values of one sample provides no information about the other.
Mathematically, for independent events A and B, $P(A | B) = P(A)$ .
No mathematical test exists to confirm independence; decision based on problem knowledge.

Dependent (paired) samples: Observations in the two samples can be placed in one-to-one correspondence.
Samples can come from the same participant (within-participant) or from a between-participant study with a direct link between participants.
Sample sizes must be equal.
There must be a straightforward way to link each observation in sample 1 with exactly one observation in sample 2.

Take the difference in each pair, resulting in a single variable D for n pairs.
If the CLT applies, perform a one-sample t hypothesis test.
Test statistic: $t = \frac{\overline{D} - d_0}{\frac{s_D}{\sqrt{n}}}$
- $(n - 1)$ degrees of freedom.
- $d_0$ is the hypothesized mean difference.
- $s_D$ is the sample standard deviation of the differences.
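
As a rough sketch of the paired procedure, the snippet below reduces two linked samples to one variable of differences and computes the t statistic and its degrees of freedom. The function name `paired_t_statistic` and the before/after measurements are hypothetical. The p-value is left out because the Python standard library has no t distribution; it could be looked up with Excel's T.DIST functions.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(sample_1, sample_2, d_0=0.0):
    """Return (t, degrees_of_freedom) for a matched pairs t test."""
    if len(sample_1) != len(sample_2):
        raise ValueError("paired samples must have equal sizes")
    # One difference per pair: the two samples collapse into a single variable D.
    diffs = [a - b for a, b in zip(sample_1, sample_2)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)                 # sample SD, (n - 1) in the denominator
    t = (d_bar - d_0) / (s_d / sqrt(n))
    return t, n - 1


# Hypothetical before/after measurements on the same five participants.
before = [12, 15, 11, 14, 13]
after = [10, 14, 10, 11, 12]
t, df = paired_t_statistic(before, after)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

For this made-up data the differences are 2, 1, 1, 3, 1, which gives a mean difference of 1.6, a standard error of 0.4, and a t statistic of about 4.0 on 4 degrees of freedom.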

Observed value: $\overline{x}_1 - \overline{x}_2$
Null value: $d_0$
Standard error: $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Test statistic: $Z = \frac{\overline{x}_1 - \overline{x}_2 - d_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
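
The pieces above (observed value, null value, standard error) combine as in the following sketch. The function name `two_sample_z_test` and the summary statistics are assumptions made for illustration. The two-sided p-value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(x_bar_1, x_bar_2, sigma_1, sigma_2, n_1, n_2, d_0=0.0):
    """Return (z, two_sided_p_value) for independent samples with known SDs."""
    standard_error = sqrt(sigma_1**2 / n_1 + sigma_2**2 / n_2)
    z = (x_bar_1 - x_bar_2 - d_0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical summary statistics for two independent samples.
z, p_value = two_sample_z_test(x_bar_1=105, x_bar_2=100,
                               sigma_1=8, sigma_2=6,
                               n_1=64, n_2=36)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.5f}")
```

Here each variance term equals 1, so the standard error is $\sqrt{2}$ and the statistic is $5 / \sqrt{2} \approx 3.54$.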

Substitute sample standard deviations when population standard deviations are unknown:
$t = \frac{\overline{x}_1 - \overline{x}_2 - d_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

The test statistic follows an approximate t distribution.
The degrees of freedom (Satterthwaite’s approximation) are rounded down to the nearest whole number.
Test statistic: $t = \frac{\overline{x}_1 - \overline{x}_2 - d_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
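
The notes do not spell out the degrees-of-freedom formula. The sketch below assumes the standard Welch–Satterthwaite approximation, $df = \frac{(v_1 + v_2)^2}{\frac{v_1^2}{n_1 - 1} + \frac{v_2^2}{n_2 - 1}}$ with $v_i = \frac{s_i^2}{n_i}$, and rounds the result down as described above. The function name `welch_t_statistic` and the data are hypothetical.

```python
from math import floor, sqrt
from statistics import mean, variance


def welch_t_statistic(sample_1, sample_2, d_0=0.0):
    """Return (t, df, df_rounded_down) for a two-sample t test with unequal variances."""
    n_1, n_2 = len(sample_1), len(sample_2)
    v_1 = variance(sample_1) / n_1     # s_1^2 / n_1 (sample variance over n)
    v_2 = variance(sample_2) / n_2     # s_2^2 / n_2
    t = (mean(sample_1) - mean(sample_2) - d_0) / sqrt(v_1 + v_2)
    # Welch-Satterthwaite approximation (assumed here; not given in the notes).
    df = (v_1 + v_2) ** 2 / (v_1**2 / (n_1 - 1) + v_2**2 / (n_2 - 1))
    return t, df, floor(df)


# Hypothetical independent samples of unequal size.
group_1 = [20, 22, 19, 24, 25]
group_2 = [18, 17, 21, 16, 15, 19]
t, df, df_floor = welch_t_statistic(group_1, group_2)
print(f"t = {t:.3f}, df = {df:.2f}, rounded down to {df_floor}")
```

Rounding down is the conservative choice: fewer degrees of freedom give a t distribution with heavier tails and therefore a slightly larger p-value.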

Dependent samples: Matched pairs t test.
- Test statistic $t = \frac{\overline{D} - d_0}{\frac{s_D}{\sqrt{n}}}$
  - $(n - 1)$ degrees of freedom.
Independent samples, population SDs known: Two sample Z test.
- $Z = \frac{\overline{x}_1 - \overline{x}_2 - d_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Independent samples, population SDs unknown: Welch’s two sample t test with unequal variances.
- $t = \frac{\overline{x}_1 - \overline{x}_2 - d_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
  - Satterthwaite’s degrees of freedom.

P-value: Probability of getting the observed sample statistic or a more extreme one, assuming the null hypothesis is true.
If P-value $\le \alpha$, one of two things has happened:
- The null hypothesis is true and the data were unusual, or
- The null hypothesis is false and should be rejected.
We reject the null hypothesis in either case, so the decision to reject could be wrong.
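
A small simulation can make the first case concrete. The sketch below draws many samples from a population where the null hypothesis really is true and counts how often a two-sided Z test still rejects it. Every name and setting (`false_rejection_rate`, 4000 trials, n = 30, the seed) is an assumption chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def false_rejection_rate(trials=4000, n=30, mu_0=0.0, sigma=1.0,
                         alpha=0.05, seed=1):
    """Fraction of samples that reject H0: mu = mu_0 when H0 is actually true."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # The sample is drawn with mean mu_0, so the null hypothesis holds.
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu_0) / (sigma / sqrt(n))
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value <= alpha:
            rejections += 1
    return rejections / trials


rate = false_rejection_rate()
print(f"rejected a true null hypothesis in {rate:.1%} of samples")
```

The printed rate should land near the chosen significance level of 5%, since that level is the long-run share of true null hypotheses that this procedure rejects.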

Statistical significance is a consequence of the data.
If the sample mean is unusual under the null hypothesis, so that the null hypothesis is rejected at a significance level defined in advance, we call the result statistically significant.
Statistical significance has nothing to do with practical, real-world usefulness or importance.
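
To illustrate the gap between statistical and practical significance, the following sketch runs a one-sample Z test on a hypothetical, very large sample whose mean differs from the null value by only 0.005 population standard deviations. All numbers are invented for the example.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setting: a tiny true shift measured on a huge sample.
mu_0 = 100.0        # null value
sigma = 10.0        # known population SD
x_bar = 100.05      # observed sample mean, only 0.05 units above mu_0
n = 1_000_000       # very large sample size

z = (x_bar - mu_0) / (sigma / sqrt(n))
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
effect_in_sd_units = (x_bar - mu_0) / sigma

print(f"z = {z:.2f}, p-value = {p_value:.2e}")
print(f"difference = {effect_in_sd_units:.3f} population SDs")
```

The test statistic is about 5, so the result is statistically significant at any conventional level, yet the difference itself is half a percent of one standard deviation and may not matter in practice.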

The P-value is NOT the probability that the null hypothesis is true given the data!
The P-value is NOT the probability that the alternative hypothesis is false!
The P-value is NOT the probability of Type I error!
A smaller P-value does NOT indicate a larger population effect!
A larger P-value does NOT prove the null hypothesis is true!