Two-Sample Tests of Hypothesis Study Notes Handout

Introduction to Two-Sample Hypothesis Testing

Conceptual Overview: In prior studies of hypothesis testing, researchers typically compared a single sample of data against a known or assumed population standard (a single numerical value). Two-sample hypothesis testing expands this concept by comparing two separate populations.
General Objective: The goal is to determine whether the population means ( $\mu_1$ and $\mu_2$ ) of two groups differ by a statistically significant amount.
Methodology: Select two independent random samples, one from each population, and test whether the observed difference between the sample means is statistically significant or could plausibly be due to random chance.
Real-World Applications:
* Comparing real estate sales prices between male and female agents.
* Comparing average call counts between morning and afternoon customer service shifts.
* Determining if different supermarket checkout procedures result in different average times.

Independent Samples: Known Population Standard Deviations

Core Condition: This test is applied when samples are selected independently from two populations, both populations are assumed to follow a normal distribution, and the population standard deviations ( $\sigma_1$ and $\sigma_2$ ) are known.
The Test Statistic ( $z$ ): The $z$ -distribution is used. The formula for the test statistic is:

$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$

Example 1: Supermarket Checkout Methods

Scenario: FoodTown supermarket compares "Standard" cashier-assisted checkout ( $S$ ) with "Fast Lane" self-checkout ( $F$ ). The goal is to see if the standard method takes longer.
Step 1: Hypotheses:
* Null Hypothesis ( $H_0$ ): $\mu_S \leq \mu_F$ (Standard is not slower than Fast Lane).
* Alternate Hypothesis ( $H_1$ ): $\mu_S > \mu_F$ (Standard is slower than Fast Lane).
Step 2: Significance Level: $\alpha = 0.01$ .
Step 3: Test Statistic: $z$ distribution is used because distributions are assumed normal and $\sigma$ is known.
Step 4: Decision Rule: Upper-tailed test. The critical $z$ value is $2.326$ . Reject $H_0$ if $z > 2.326$ .
Step 5 & 6: Result: The computed $z$ value is $3.123$ . Because $3.123 > 2.326$ , the null hypothesis is rejected. The difference of $0.20$ minutes is statistically significant.
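P-value Reasoning: The p-value is the probability of obtaining a $z$ -value at least as large as $3.123$ when the null hypothesis is true. That upper-tail area is about $0.0009$ , which is below $\alpha = 0.01$ and leads to the same decision to reject $H_0$ .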
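
The critical value and p-value for Example 1 can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and the only inputs are the $\alpha$ and computed $z$ reported above.

```python
from statistics import NormalDist

alpha = 0.01
z_computed = 3.123  # test statistic reported in Example 1

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Upper-tailed critical value: the point with area alpha to its right.
z_critical = std_normal.inv_cdf(1 - alpha)

# Upper-tailed p-value: P(Z >= z_computed) when the null hypothesis is true.
p_value = 1 - std_normal.cdf(z_computed)

print(f"critical z = {z_critical:.3f}")
print(f"p-value    = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Comparing the p-value with $\alpha$ and comparing $z$ with the critical value are two views of the same decision rule, so they always agree.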

Example 2: Sales Associates Gender Comparison

Scenario: Tom Sevits (Appliance Patch) wants to know if men sell less than women on average.
Data:
* Men: $n_1 = 40, \bar{x}_1 = \$1,400, \sigma_1 = \$200$
* Women: $n_2 = 50, \bar{x}_2 = \$1,500, \sigma_2 = \$250$
Hypotheses:
* $H_0: \mu_1 \geq \mu_2$
* $H_1: \mu_1 < \mu_2$ (Left-tailed test)
Parameters: $\alpha = 0.05$ . Critical $z = -1.645$ .
Decision: Computed $z = -2.11$ . Since $-2.11 < -1.645$ , reject $H_0$ . Women sell significantly more on average.
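
As a check on Example 2, the known- $\sigma$ formula can be coded directly. This is a sketch, not a library routine: the function name `two_sample_z` and its parameter names are hypothetical, and the inputs are the Example 2 figures.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2):
    """z statistic for two independent samples with known population sigmas."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (mean1 - mean2) / standard_error

# Example 2: men (sample 1) versus women (sample 2).
z = two_sample_z(1400, 1500, 200, 250, 40, 50)

alpha = 0.05
z_critical = NormalDist().inv_cdf(alpha)  # left-tailed critical value
p_value = NormalDist().cdf(z)             # left-tail area below z

print(f"z = {z:.2f}, critical z = {z_critical:.3f}, p-value = {p_value:.4f}")
print("reject H0" if z < z_critical else "fail to reject H0")
```

The same function applies to any example in this section by swapping in that example's means, standard deviations, and sample sizes.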

Example 3: E-commerce Website Sales

Scenario: Testing whether Website A generates lower average daily sales than Website B.
Data:
* Website A: $n_1 = 45, \bar{x}_1 = \$3,200, \sigma_1 = \$600$
* Website B: $n_2 = 50, \bar{x}_2 = \$3,450, \sigma_2 = \$700$
Hypotheses:
* $H_0: \mu_1 \geq \mu_2$
* $H_1: \mu_1 < \mu_2$ (Left-tailed test)
Parameters: $\alpha = 0.05$ . Critical $z = -1.645$ .
Decision: Computed $z = -1.87$ . Since $-1.87 < -1.645$ , reject $H_0$ . Website B has significantly higher sales.
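
Worked arithmetic for Example 3, substituting the data into the known- $\sigma$ formula (intermediate values are rounded):

$z = \frac{3200 - 3450}{\sqrt{\frac{600^2}{45} + \frac{700^2}{50}}} = \frac{-250}{\sqrt{8000 + 9800}} \approx \frac{-250}{133.42} \approx -1.87$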

Example 4: Regional Sales Unit Comparison

Scenario: Comparing average sales (units) between two regions.
Data:
* Region 1: $n_1 = 50, \bar{x}_1 = 200, \sigma_1^2 = 25$
* Region 2: $n_2 = 60, \bar{x}_2 = 220, \sigma_2^2 = 30$
Hypotheses: $H_0: \mu_1 = \mu_2, H_1: \mu_1 \neq \mu_2$ (Two-tailed test).
Parameters: $\alpha = 0.05$ . Critical $z = \pm 1.96$ .
Decision: Computed $z = -20$ . Since $|z| = 20 > 1.96$ , reject $H_0$ . There is a significant difference.
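
Worked arithmetic for Example 4. Note that the data give population variances ( $\sigma^2$ ) rather than standard deviations, so they enter the formula without being squared:

$z = \frac{200 - 220}{\sqrt{\frac{25}{50} + \frac{30}{60}}} = \frac{-20}{\sqrt{0.5 + 0.5}} = -20$

For a two-tailed test, it is the absolute value of this statistic, $20$ , that is compared with $1.96$ .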

Independent Samples: Unknown Population Standard Deviations (Equal Variances)

Core Condition: Population standard deviations are unknown, but it is assumed that $\sigma_1 = \sigma_2$ . In this scenario, we use the $t$ distribution and a "pooled" estimate of the variance.
Pooled Sample Variance ( $s_p^2$ ): A weighted mean of the two sample variances ( $s_1^2$ and $s_2^2$ ), where the weights are the degrees of freedom provided by each sample ( $n_1 - 1$ and $n_2 - 1$ ).

Requirements and Assumptions

Sampled populations are approximately normally distributed.
Sampled populations are independent.
Standard deviations of the two populations are equal ( $\sigma_1 = \sigma_2$ ).

Pooled Test Formulas

Degrees of Freedom: $df = n_1 + n_2 - 2$
Pooled Variance Equation:

$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

Test Statistic ( $t$ ):

$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 (\frac{1}{n_1} + \frac{1}{n_2})}}$

Example 5: Owens Lawn Care Mounting Procedures

Scenario: Comparing the mean time to mount an engine using the Welles method (W) vs. the Atkins method (A).
Data:
* Method W: $n_W = 5, \bar{x}_W = 4, s_W = 2.9$
* Method A: $n_A = 6, \bar{x}_A = 5, s_A = 2.1$
Hypotheses: $H_0: \mu_W = \mu_A, H_1: \mu_W \neq \mu_A$ (Two-tailed).
Parameters: $\alpha = 0.10, df = 5 + 6 - 2 = 9$ . Critical $t = \pm 1.833$ .
Decision: The computed value is $t \approx -0.66$ , which falls between $-1.833$ and $1.833$ . Fail to reject $H_0$ . No significant difference in mean mounting times was detected.
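
The pooled-variance calculation for Example 5 can be sketched as follows. The function name `pooled_t` is hypothetical, the inputs are the rounded summary statistics listed above, and the critical value is taken from the notes rather than computed, since Python's standard library has no $t$ distribution.

```python
from math import sqrt

def pooled_t(mean1, mean2, s1, s2, n1, n2):
    """Return (pooled variance, t statistic, df) for the equal-variance t test."""
    df = n1 + n2 - 2
    # Weighted mean of the two sample variances, weighted by n - 1.
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, t, df

# Example 5: Welles method (sample 1) versus Atkins method (sample 2).
pooled_var, t, df = pooled_t(4, 5, 2.9, 2.1, 5, 6)

t_critical = 1.833  # two-tailed, alpha = 0.10, df = 9 (from the notes)
decision = "reject H0" if abs(t) > t_critical else "fail to reject H0"

print(f"pooled variance = {pooled_var:.3f}, t = {t:.3f}, df = {df}")
print(decision)
```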

Independent Samples: Unknown Population Standard Deviations (Unequal Variances)

Core Condition: Standard deviations are unknown and it is not reasonable to assume they are equal ( $\sigma_1 \neq \sigma_2$ ).
Methodology Changes: Use separate sample standard deviations ( $s_1, s_2$ ) and adjust degrees of freedom downward using a complex approximation formula.
Effect of Adjustment: Reducing degrees of freedom makes it more difficult to reject the null hypothesis (requires a larger test statistic).

Test Statistic Formula

$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

Degrees of Freedom Formula (Satterthwaite Approximation)

$df = \frac{[(\frac{s_1^2}{n_1}) + (\frac{s_2^2}{n_2})]^2}{\frac{(\frac{s_1^2}{n_1})^2}{n_1 - 1} + \frac{(\frac{s_2^2}{n_2})^2}{n_2 - 1}}$

Note: If the computed $df$ is not a whole number, round it down to the nearest integer. Rounding down is the conservative choice because a smaller $df$ gives a slightly larger critical value.

Example 6: Paper Towel Absorbency

Data: Store Brand ( $n_1=9, s_1=3.321$ ), Name Brand ( $n_2=12, s_2=1.621$ ).
Significance: $\alpha = 0.10$ . Assumption: Unequal variances.
Degrees of Freedom Calculation: The Satterthwaite formula gives $df \approx 10.86$ , which is rounded down to $10$ . Critical $t = \pm 1.812$ (two-tailed).
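
The degrees-of-freedom approximation is tedious by hand, so a short sketch may help. The function name `satterthwaite_df` is hypothetical; the inputs are the Example 6 sample sizes and standard deviations.

```python
from math import floor

def satterthwaite_df(s1, s2, n1, n2):
    """Approximate df for the unequal-variance t test (before rounding)."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Example 6: store brand (sample 1) versus name brand (sample 2).
df_raw = satterthwaite_df(3.321, 1.621, 9, 12)
df = floor(df_raw)  # round down, as the notes instruct

print(f"df before rounding = {df_raw:.2f}, df used = {df}")
```

The result always lies between the smaller of $n_1 - 1$ and $n_2 - 1$ and the pooled value $n_1 + n_2 - 2$ , which gives a quick sanity check on hand calculations.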

Example 7: Help Line Resolution Time

Scenario: Testing whether software issues ( $1$ ) take longer to resolve than hardware issues ( $2$ ).
Data:
* Software: $n_1 = 18, \bar{x}_1 = 18, s_1 = 4.2$
* Hardware: $n_2 = 12, \bar{x}_2 = 15.5, s_2 = 3.9$
Hypotheses: $H_0: \mu_1 \leq \mu_2, H_1: \mu_1 > \mu_2$ (Upper-tailed test).
Calculations: $\alpha = 0.05, df = 24.9 \rightarrow 24$ . Critical $t = 1.711$ .
Decision: Computed $t = 1.67$ . Since $1.67 < 1.711$ , fail to reject $H_0$ . The data do not provide sufficient evidence that software issues take longer to resolve.
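
Both Example 7 quantities, the test statistic and the adjusted degrees of freedom, come from the same two variance terms, so they can be computed together. This is a sketch with a hypothetical function name (`unequal_variance_t`); the critical value is copied from the notes rather than computed.

```python
from math import floor, sqrt

def unequal_variance_t(mean1, mean2, s1, s2, n1, n2):
    """Return (t statistic, unrounded Satterthwaite df) for unequal variances."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (mean1 - mean2) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

# Example 7: software issues (sample 1) versus hardware issues (sample 2).
t, df_raw = unequal_variance_t(18, 15.5, 4.2, 3.9, 18, 12)
df = floor(df_raw)

t_critical = 1.711  # upper-tailed, alpha = 0.05, df = 24 (from the notes)
decision = "reject H0" if t > t_critical else "fail to reject H0"

print(f"t = {t:.2f}, df = {df_raw:.1f} -> {df}")
print(decision)
```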

Hypothesis Testing for Dependent Samples (Paired Observations)

Core Condition: Samples are not independent; they are related or matched. This is often termed a "paired sample."
Mechanism: Instead of comparing two group means directly, we analyze the distribution of the differences ( $d$ ) between each pair of observations. This reduces the problem to a one-sample test of the differences.
Notation: $\mu_d$ represents the population mean of the distribution of differences.
Assumption: The distribution of population differences is approximately normal.

Test Statistic for Paired Samples

$t = \frac{\bar{d}}{s_d / \sqrt{n}}$

Variables:
* $n$ : The number of paired observations.
* $df$ : $n - 1$ .
* $\bar{d}$ : The mean of the differences between paired observations.
* $s_d$ : The standard deviation of the differences.

Example 8: Real Estate Appraisal Consistency

Scenario: Nickel Savings and Loan uses two firms (Schadek and Bowyer) to appraise the same $10$ homes to check for consistency.
Data: $n = 10, \bar{d} = 4.6, s_d = 4.402$ .
Hypotheses: $H_0: \mu_d = 0, H_1: \mu_d \neq 0$ (Two-tailed).
Parameters: $\alpha = 0.05, df = 9$ . Critical $t = \pm 2.262$ .
Result: Computed $t = \frac{4.6}{4.402 / \sqrt{10}} \approx 3.30$ . Since $3.30 > 2.262$ , the null hypothesis is rejected. There is a significant difference between the two appraisal firms.
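
The paired test for Example 8 reduces to a one-sample calculation on the differences, which the following sketch reproduces from the summary statistics. The function name `paired_t` is hypothetical, and the critical value is the one quoted in the notes.

```python
from math import sqrt

def paired_t(mean_diff, sd_diff, n):
    """Return (t statistic, df) for a paired-sample test of H0: mu_d = 0."""
    standard_error = sd_diff / sqrt(n)
    return mean_diff / standard_error, n - 1

# Example 8: differences between the two appraisal firms on 10 homes.
t, df = paired_t(4.6, 4.402, 10)

t_critical = 2.262  # two-tailed, alpha = 0.05, df = 9 (from the notes)
decision = "reject H0" if abs(t) > t_critical else "fail to reject H0"

print(f"t = {t:.2f}, df = {df}")
print(decision)
```

With raw data, $\bar{d}$ and $s_d$ would first be computed from the list of pairwise differences, for instance with `statistics.mean` and `statistics.stdev`.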

Practice Scenarios: Independent vs. Dependent Samples

Before and After Software: Measuring productivity scores for the same employees before and after a change. (Dependent / Paired).
Two Separate Schools: Comparing test scores of two unique groups of students from School A and School B. (Independent).
Diet Plan Weight Loss: Measuring weight of the same participants before and after a 3-month diet. (Dependent / Paired).
Two Fertilizers: Measuring yields from two separate groups of plants. (Independent).
Therapy Anxiety Levels: Measuring anxiety for the same group of patients before and after therapy. (Dependent / Paired).
New Technology Productivity: Comparing two different groups of employees after technology implementation. (Independent).

Multiple Choice Concept Check

Null Hypothesis in Two-Sample Mean Test: Generally $H_0: \mu_1 = \mu_2$ .
Use of t-distribution: Used when the population variance (or standard deviation) is unknown.
Variance Assumption Importance: Knowing if variances are equal determines the choice of the test statistic (pooled vs. separate) and the calculation of degrees of freedom.
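
The choice among the four tests in these notes follows a short sequence of questions, sketched here as a helper. The function name and its boolean parameters are hypothetical and only restate the conditions given in each section.

```python
def choose_two_sample_test(paired, sigma_known=False, equal_variances=False):
    """Name the test from these notes that matches the sampling situation."""
    if paired:
        # Dependent samples: analyze the differences as a single sample.
        return "paired t test, df = n - 1"
    if sigma_known:
        # Independent samples with known population standard deviations.
        return "two-sample z test"
    if equal_variances:
        # Independent samples, unknown but equal population variances.
        return "pooled-variance t test, df = n1 + n2 - 2"
    # Independent samples, unknown and unequal population variances.
    return "unequal-variance t test, Satterthwaite df"

# Same employees measured before and after a change: paired.
print(choose_two_sample_test(paired=True))
# Two separate groups, sigmas unknown, equal variances not assumed.
print(choose_two_sample_test(paired=False))
```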

Formula Sheet Summary Reference

Independent known $\sigma$ : $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Independent unknown $\sigma$ (Equal): $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_p^2 (\frac{1}{n_1} + \frac{1}{n_2})}}$ ; $df = n_1 + n_2 - 2$ .
Independent unknown $\sigma$ (Unequal): $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ ; $df$ uses Satterthwaite approximation.
Dependent (Paired): $t = \frac{\bar{d}}{s_d / \sqrt{n}}$ ; $df = n - 1$ .