Module 8: One-Sample Hypothesis Testing (Part 1)

Exam and Course Logistics

Office Hours Opportunities:
- Two Webex office hours this week.
- After next Monday's class, two in-person office hours (likely Monday and Tuesday at $8\text{PM}$ in the classroom) will be offered before Wednesday.
Mock Exam: A mock exam will be held next Monday during class.
- An answer key and in-person availability will follow the mock exam.
Instructor's Availability/Efficiency:
- The instructor will be available for questions and asks that meetings be kept efficient (come with specific questions).
- Instructor is moving further away but remains committed to supporting students.
Group Study: Highly recommended for efficiency; the instructor is more willing to meet with groups of $3-4$ students (Webex or in-person) than individuals.
Cheat Sheet Template: A template document to aid in preparing cheat sheets will be made available on Wednesday.
Attendance Tracking: Students should check their AT, AC, and AP records. AT tracks absences (good numbers: $0, 1, 2$ ). AC and AP track missed active learning items, costing $0.5\%$ from the $10\%$ activity grade per miss.

Choosing Between Z-test and T-test for Means

Key Condition: The decision to use a Z-test or T-test depends solely on whether the population standard deviation ( $\sigma$ ) is known.
Z-test Usage: Use a Z-test only if the question explicitly states that the population standard deviation ( $\sigma$ ) is known (e.g., "the population standard deviation is equal to $\sigma = \text{such and such}$ ").
T-test Usage: If the population standard deviation ( $\sigma$ ) is not mentioned or unknown, assume a T-test is required.
Clarification on Sample Size ( $n \ge 30$ ):
- The rule that a Z-test can be used when $n \ge 30$ (because the T distribution closely approximates the Z distribution for large samples) is not the primary condition for choosing between Z and T in this class.
- The only decisive factor is knowledge of the population standard deviation.
Applicability: This rule applies currently to confidence intervals for means and will be extended/modified for proportions (covered in Lecture $9$ ).
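The decision rule above can be sketched as a one-line check. This is only an illustration: the function name `choose_mean_test` and the convention of passing `None` when $\sigma$ is not stated are assumptions made for the sketch, not part of the lecture.

```python
from typing import Optional


def choose_mean_test(sigma: Optional[float]) -> str:
    """Return "Z" when the population standard deviation is known, else "T"."""
    # sigma=None models a problem that does not state the population
    # standard deviation. The sample size plays no role in the decision.
    return "Z" if sigma is not None else "T"


print(choose_mean_test(None))   # sigma not mentioned in the problem
print(choose_mean_test(12.0))   # sigma stated (12.0 is a made-up value)
```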

Lecture 8: One-Sample Hypothesis Testing (Part 1)

This lecture opens the concluding part of the statistical portion of the course.
After Lecture $10$ , the course will transition to more algebraic topics.

Understanding Hypothesis Testing

Definition: A hypothesis is a statement about population parameters (e.g., population mean, $\mu$ ).
Example (Cafeteria Rating):
- Parameter: The average rating ( $\mu$ ) for the campus cafeteria.
- Population: All individuals who have used the cafeteria service (not just students, but also staff, visitors, etc.).
- Practicality: It's impossible to collect data from the entire population, so sampling is necessary.

Setting Up Hypotheses

Two complementary statements are used:
- Null Hypothesis ( $H_0$ ): Always includes the equality sign.
 - E.g., $H_0: \mu = \text{some value}$ (hypothesized population mean, $\mu_0$ ).
- Alternative Hypothesis ( $H_1$ ): States the opposite of the null hypothesis.
 - E.g., $H_1: \mu \ne \text{some value}$ .
Complementary Nature: Together, $H_0$ and $H_1$ cover all possibilities ( $100\%$ ).

Types of Hypothesis Tests

Two-Sided (Two-tailed) Test:
- Used when we want to test if a parameter is different from a specific value.
- Form: $H_0: \mu = \mu_0$ vs. $H_1: \mu \ne \mu_0$ .
- Focus of today's lecture.
One-Sided (One-tailed) Test:
- Used when we are interested in whether a parameter is less than or greater than a specific value.
- Form: $H_0: \mu \le \mu_0$ vs. $H_1: \mu > \mu_0$ (or $H_0: \mu \ge \mu_0$ vs. $H_1: \mu < \mu_0$ ).
- Will be discussed on Wednesday.

Three Approaches to One-Sample Hypothesis Testing

When testing a hypothesis, there are three equivalent approaches to draw conclusions:

Approach 1: Critical Value Approach

This method compares a calculated test statistic to critical values that define rejection regions.

1. State the Null ( $H_0$ ) and Alternative ( $H_1$ ) Hypotheses:
- Example: For cafeteria ratings,
 - $H_0: \mu = 55$ (The average rating is $55$ )
 - $H_1: \mu \ne 55$ (The average rating is not $55$ )
- This is a two-tailed test because of the " $\ne$ " in $H_1$ . The hypothesized value is $\mu_0 = 55$ .
2. Specify the Desired Level of Significance ( $\alpha$ ) and Sample Size ( $n$ ):
- Significance Level ( $\alpha$ ): The probability of rejecting a true null hypothesis. By default, $\alpha = 0.05$ ( $5\%$ ), but it can be changed. Recall $\alpha = 1 - \text{Confidence Level}$ .
- Sample Size ( $n$ ): Determines the degrees of freedom. For the example, a convenience sample of $n = 6$ students was used.
3. Determine the Appropriate Test Statistic (Z or T):
- Based on whether the population standard deviation ( $\sigma$ ) is known.
- Example: Since $\sigma$ is unknown for the cafeteria rating, a T-test is used.
4. Determine the Critical Value(s):
- Degrees of Freedom ( $df$ ): For a one-sample T-test, $df = n - 1$ . For $n = 6$ , $df = 5$ .
- Using Minitab to find critical values (Slide 4):
  - Navigate to Graph > Probability Distribution Plot > View Probability > OK.
  - Select T Distribution and enter Degrees of freedom: 5.
  - Go to Shaded Area, select Probability, choose Both Tails.
  - Enter Probability = 0.05 ( $\alpha$ ).
  - The output shows the critical values. For $df = 5$ and $\alpha = 0.05$ (both tails), the critical values are $\pm 2.571$ . These values define the rejection regions (the tails) and the non-rejection region (the middle area).
5. Calculate the Test Statistic (T-stat):
- Formula: $t_{\text{stat}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$ (where $\bar{x}$ is sample mean, $\mu_0$ is hypothesized population mean, $s$ is sample standard deviation, $n$ is sample size).
- Example Calculations:
 - First, get sample descriptive statistics using Minitab (Stat > Basic Statistics > Display Descriptive Statistics).
 - Sample mean ( $\bar{x}$ ) of $6$ collected ratings: $46$ .
 - Sample standard deviation ( $s$ ) of $6$ collected ratings: $30.8$ .
 - $n = 6$ .
 - $t_{\text{stat}} = \frac{46 - 55}{30.8 / \sqrt{6}} = \frac{-9}{30.8 / 2.449} \approx \frac{-9}{12.57} \approx -0.716$ .
6. Make a Decision and Draw a Conclusion:
- Compare the calculated $t_{\text{stat}}$ to the critical values ( $\pm 2.571$ ).
- If $t_{\text{stat}}$ falls into a rejection region (i.e., less than $-2.571$ or greater than $+2.571$ ), reject $H_0$ .
- If $t_{\text{stat}}$ falls into the non-rejection region (between the critical values), do not reject $H_0$ .
- Example: The calculated $t_{\text{stat}} = -0.716$ falls between $-2.571$ and $+2.571$ . It is in the non-rejection region.
- Conclusion: Do not reject $H_0$ at the $5\%$ significance level. (The sample mean of $46$ is not significantly different from the hypothesized mean of $55$ .)
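As a cross-check outside Minitab, the critical value approach for the cafeteria example can be sketched in plain Python. This is a minimal sketch, not the course's method: the helper names `t_pdf`, `t_cdf`, and `t_critical` are made up for illustration, and Simpson's rule plus bisection stand in for Minitab's Probability Distribution Plot.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus a Simpson's-rule integral of the density from 0 to x."""
    h = x / steps  # negative when x < 0, which makes the integral negative
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Positive two-tailed critical value, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Cafeteria example from the notes
x_bar, mu_0, s, n, alpha = 46, 55, 30.8, 6, 0.05
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
t_crit = t_critical(alpha, n - 1)
reject = abs(t_stat) > t_crit  # True only if t_stat lands in a tail
print(f"t_stat = {t_stat:.3f}, critical values = +/-{t_crit:.3f}, reject H0: {reject}")
```

The comparison `abs(t_stat) > t_crit` covers both rejection regions at once, since the two critical values are symmetric around zero.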

Approach 2: P-Value Approach

This method compares the P-value (the probability of observing a test statistic at least as extreme as the one calculated, assuming $H_0$ is true) to the significance level $\alpha$ .

Steps 1-3: Same as the Critical Value Approach.
Calculate the P-value:
- Use the calculated $t_{\text{stat}}$ (e.g., $-0.716$ ).
- Using Minitab to find P-value:
  - Navigate to Graph > Probability Distribution Plot > View Probability > OK.
  - Select T Distribution and enter Degrees of freedom: 5.
  - Go to Shaded Area, select X value, choose Left Tail.
  - Enter X value = -0.716.
  - The output gives the probability for the left tail. For $-0.716$ , the left tail probability is approximately $0.253$ .
  - For a two-tailed test, double this probability: P-value = $2 \times 0.253 = 0.506$ .
Make a Decision and Draw a Conclusion:
- Rule:
  - If P-value $\le \alpha$ : Reject $H_0$ .
  - If P-value $> \alpha$ : Do Not Reject $H_0$ .
- Example: P-value ( $0.506$ ) is greater than $\alpha$ ( $0.05$ ).
- Conclusion: Do not reject $H_0$ at the $5\%$ significance level.
- Consistency: This conclusion is consistent with the critical value approach.

Interpretation of P-value (brief): The P-value is the probability of observing a test statistic as extreme as, or more extreme than, the one calculated, assuming the null hypothesis is true. (This definition will be revisited for deeper understanding later).
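The two-tailed P-value calculation can be reproduced with a short sketch. The helper names (`t_pdf`, `t_cdf`, `two_tailed_p_value`) are illustrative assumptions, and the tail area is obtained by numerical integration (Simpson's rule) rather than from Minitab.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus a Simpson's-rule integral of the density from 0 to x."""
    h = x / steps  # negative when x < 0, which makes the integral negative
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def two_tailed_p_value(t_stat, df):
    """Double the left-tail area below -|t_stat| (the t distribution is symmetric)."""
    return 2 * t_cdf(-abs(t_stat), df)


# Cafeteria example: t_stat and df taken from the notes
t_stat, df, alpha = -0.716, 5, 0.05
p_value = two_tailed_p_value(t_stat, df)
decision = "Reject H0" if p_value <= alpha else "Do not reject H0"
print(f"P-value = {p_value:.3f}: {decision}")
```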

Approach 3: Confidence Interval Approach

This method determines if the hypothesized population parameter falls within the calculated confidence interval.

Steps 1-3: Same as the Critical Value Approach.
Construct the Confidence Interval (CI):
- Formula (for T-distribution CI for means): $\bar{x} \pm t_{\text{critical}} \times \frac{s}{\sqrt{n}}$ (where $t_{\text{critical}}$ corresponds to the desired confidence level and $df$ ).
- Example Calculations:
 - $\bar{x} = 46$
 - $t_{\text{critical}} = 2.571$ (for $95\%$ confidence and $df = 5$ , from Step 4 of Critical Value Approach).
 - $s = 30.8$
 - $n = 6$
 - $CI = 46 \pm 2.571 \times \frac{30.8}{\sqrt{6}}$
 - $CI = 46 \pm 2.571 \times 12.57$
 - $CI = 46 \pm 32.32$
 - Confidence Interval: $[13.68, 78.32]$ (lower bound approx. $13.7$ , upper bound approx. $78.3$ ).
Make a Decision and Draw a Conclusion:
- Rule:
 - If the hypothesized population mean ( $\mu_0$ ) falls within the confidence interval: Do Not Reject $H_0$ .
 - If the hypothesized population mean ( $\mu_0$ ) falls outside the confidence interval: Reject $H_0$ .
- Example: The hypothesized population mean ( $\mu_0 = 55$ ) is within the calculated $95\%$ confidence interval $[13.68, 78.32]$ .
- Conclusion: Do not reject $H_0$ at the $5\%$ significance level.
- Consistency: This conclusion is consistent with both the critical value and P-value approaches.
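The interval arithmetic above can be checked with a few lines of Python. In this sketch the critical value $2.571$ is taken as given from the notes rather than computed, and the function name `t_confidence_interval` is a hypothetical helper. The bounds agree with the hand calculation up to rounding.

```python
import math


def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for x_bar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Cafeteria example: 95% confidence, df = 5, critical value from the notes
lower, upper = t_confidence_interval(x_bar=46, s=30.8, n=6, t_crit=2.571)
mu_0 = 55
decision = "Do not reject H0" if lower <= mu_0 <= upper else "Reject H0"
print(f"95% CI = [{lower:.2f}, {upper:.2f}]: {decision}")
```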

Using Minitab for All-in-One Hypothesis Testing

Minitab can perform all these calculations simultaneously, providing the T-stat, P-value, and confidence interval directly.
Steps:
1. Go to Stat > Basic Statistics > 1-Sample t...
2. Select Summarized data (or One or more samples, each in a column if you have raw data).
3. Enter the Sample size (n), Sample mean (X bar), and Standard deviation (S).
4. Check Perform hypothesis test and enter the Hypothesized mean (Mu) ( $\mu_0$ ).
5. Click Options:
  - Set the Confidence level (e.g., $95\%$ for $\alpha=5\%$ ).
  - Set the Alternative hypothesis to Mean not equal to hypothesized mean for a two-sided test.
6. Click OK.
Output: Minitab will display the T-value ( $t_{\text{stat}}$ ), P-value, and the confidence interval.
Missing Information: Minitab's 1-Sample t function does not directly provide the critical values. These must be obtained separately using Graph > Probability Distribution Plot.
Educational Value: Understanding the manual steps behind each approach is crucial before relying solely on software outputs.

Practice Example: Exam Average (One-Sample T-test)

Scenario: Instructor believes the class average for the first exam is $85$ .
Hypotheses:
- $H_0: \mu = 85$
- $H_1: \mu \ne 85$
Sample Data: A sample of $n = 10$ students yields a sample mean $\bar{x} = 83.4$ and a sample standard deviation $s = 5.7$ .
Significance Level: $\alpha = 7\%$ (or $93\%$ confidence level).
Test Type: T-test (since population standard deviation is unknown).
Minitab Application (Stat > Basic Statistics > 1-Sample t... with summarized data):
- Sample size: 10
- Sample mean: 83.4
- Standard deviation: 5.7
- Hypothesized mean: 85
- Options: Confidence level: 93.0, Alternative hypothesis: Mean not equal to hypothesized mean.
Minitab Results:
- T-Value ( $t_{\text{stat}}$ ): $-0.89$
- P-Value: $0.395$
- Confidence Interval ( $93\%$ CI): $[79.80, 86.99]$ (rounded)
Conclusions from Each Approach:
1. P-value Approach: P-value ( $0.395$ ) $> \alpha$ ( $0.07$ ).
  - Decision: Do not reject $H_0$ at the $7\%$ level.
2. Confidence Interval Approach: The hypothesized mean ( $\mu_0 = 85$ ) falls within the $93\%$ CI $[79.80, 86.99]$ .
  - Decision: Do not reject $H_0$ at the $7\%$ level.
3. Critical Value Approach:
  - Degrees of Freedom: $df = n - 1 = 10 - 1 = 9$ .
  - Finding Critical Values (Minitab: Graph > Probability Distribution Plot): For $df=9$ and $\alpha = 0.07$ (both tails), the critical values are $\approx \pm 2.073$ .
  - Comparison: The calculated $t_{\text{stat}} = -0.89$ falls between $-2.073$ and $+2.073$ (non-rejection region).
  - Decision: Do not reject $H_0$ at the $7\%$ level.
Overall Conclusion: All three approaches consistently lead to the conclusion to not reject the null hypothesis that the class average is $85$ .
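To see the three approaches side by side for this practice example, the sketch below recomputes everything from the summary statistics. It is only an illustration: `one_sample_t` and its helpers are made-up names, and numerical integration with bisection replaces Minitab's 1-Sample t routine. Because the inputs are the rounded summary statistics, the printed P-value, critical value, and interval may not match the Minitab figures quoted above to every digit. The point of the sketch is that all three decisions agree.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x): 0.5 plus a Simpson's-rule integral of the density from 0 to x."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(alpha, df):
    """Positive two-tailed critical value, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < 1 - alpha / 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def one_sample_t(n, x_bar, s, mu_0, alpha):
    """Two-sided one-sample t-test from summarized data."""
    df = n - 1
    se = s / math.sqrt(n)  # standard error of the mean
    t_stat = (x_bar - mu_0) / se
    t_crit = t_critical(alpha, df)
    return {
        "t_stat": t_stat,
        "p_value": 2 * t_cdf(-abs(t_stat), df),
        "t_crit": t_crit,
        "ci": (x_bar - t_crit * se, x_bar + t_crit * se),
    }


alpha, mu_0 = 0.07, 85
result = one_sample_t(n=10, x_bar=83.4, s=5.7, mu_0=mu_0, alpha=alpha)

# True means "reject H0" under that approach
reject = {
    "critical value": abs(result["t_stat"]) > result["t_crit"],
    "p-value": result["p_value"] <= alpha,
    "confidence interval": not (result["ci"][0] <= mu_0 <= result["ci"][1]),
}
print(result)
print(reject)
```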