Hypothesis Testing: One Sample Test (Part 1)
Introduction to Hypothesis Testing
A hypothesis is defined as a claim about a population parameter. This parameter could be, for example, the population mean ($\mu$).
Null Hypothesis ($H_0$): This is the statement being tested. It typically represents a situation of no effect or no difference, and always includes an equality sign ($=$, $\le$, or $\ge$). For example, $H_0: \mu = \mu_0$, where $\mu_0$ is the hypothesized value.
Alternative Hypothesis ($H_1$): This is the statement that one wants to prove. It contradicts the null hypothesis and never includes an equality sign. It indicates a difference or an effect.
Two-tailed test: The alternative hypothesis states that the population parameter is not equal to the hypothesized value ($H_1: \mu \ne \mu_0$). This means the rejection region is split into two tails of the distribution.
One-tailed test: The alternative hypothesis states that the population parameter is less than the hypothesized value ($H_1: \mu < \mu_0$) or greater than the hypothesized value ($H_1: \mu > \mu_0$). The rejection region is entirely in one tail of the distribution.
To make a decision on whether to reject or not reject the null hypothesis, three main approaches can be used, and they should all lead to the same conclusion:
Critical Value Approach
P-Value Approach
Confidence Interval Approach
Critical Value Approach
This approach involves comparing the calculated test statistic with pre-determined critical values based on the significance level.
Steps for the Critical Value Approach:
State the appropriate null and alternative hypotheses:
$H_0: \mu = \mu_0$ (where $\mu_0$ is the hypothesized population mean)
$H_1: \mu \ne \mu_0$ (This represents a two-tail test, where we are interested in whether the mean is different from $\mu_0$).
Specify the desired level of significance ($\alpha$) and the sample size ($n$):
The significance level ($\alpha$) is the probability of rejecting the null hypothesis when it is actually true (Type I error). A common choice is $\alpha = 0.05$.
The sample size ($n$) is determined by the data collected.
Determine the appropriate technique:
If the population standard deviation ($\sigma$) is unknown (which is very common), a t-test is typically used.
If $\sigma$ is known, a z-test would be used (though this is less frequent in practice).
Determine the critical values:
For a t-test, the critical t ($t_{critical}$) values are found using the significance level ($\alpha$) and the degrees of freedom ($df$).
Degrees of freedom ($df$): For a one-sample t-test, $df = n - 1$ (where $n$ is the sample size).
For a two-tailed test with $\alpha = 0.05$, one would look for $\alpha/2 = 0.025$ in each tail. The critical values would be $\pm t_{critical}$.
How to find t-critical values (e.g., using Minitab):
Navigate to Graph > Probability Distribution Plot > View Probability.
Select the t-distribution and specify the degrees of freedom ($df = n - 1$).
Choose Define Shaded Area By > Probability.
Select Both Tails and enter the significance level (e.g., 0.05 for $\alpha = 0.05$).
Minitab will display the critical values ($\pm t_{critical}$) marking the boundaries of the rejection regions.
Collect the data and compute the test statistic ($t_{STAT}$):
The formula for the t-test statistic for one sample is: $t_{STAT} = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$, where:
$\bar{X}$ is the sample mean
$\mu_0$ is the hypothesized population mean from $H_0$
$S$ is the sample standard deviation
$n$ is the sample size
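The steps above can be sketched in Python as a rough cross-check of the Minitab workflow. This is a minimal sketch, assuming SciPy is installed; the sample values, `mu_0 = 170`, and `alpha = 0.05` are hypothetical and do not come from the lecture.

```python
import math
import statistics

from scipy import stats  # assumption: SciPy is available

# Hypothetical sample of heights in cm (illustrative data only).
sample = [172.1, 168.4, 175.3, 171.8, 169.9, 174.2, 173.5, 170.6, 176.0, 172.7]
mu_0 = 170.0   # hypothesized mean from H0 (assumed value)
alpha = 0.05   # significance level (assumed value)

n = len(sample)
df = n - 1                        # degrees of freedom for a one-sample t-test
x_bar = statistics.mean(sample)   # sample mean
s = statistics.stdev(sample)      # sample standard deviation (n - 1 divisor)

# Two-tailed critical value: alpha/2 of the probability sits in each tail.
t_critical = stats.t.ppf(1 - alpha / 2, df)

# Test statistic: how many standard errors x_bar lies from mu_0.
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))

# Decision rule: reject H0 when the statistic falls beyond either boundary.
reject_h0 = bool(abs(t_stat) > t_critical)
print(f"t_stat = {t_stat:.3f}, t_critical = +/-{t_critical:.3f}, reject H0: {reject_h0}")
```

`stats.t.ppf` plays the role of the Probability Distribution Plot here: it returns the boundary value that leaves `alpha / 2` in the upper tail.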
P-Value Approach
This approach compares the calculated p-value directly with the significance level.
Interpreting the P-Value Approach:
P-value: The probability of obtaining a test statistic equal to or more extreme than the observed sample value, assuming the null hypothesis ($H_0$) is true.
Decision Rule:
If p-value $< \alpha$, then reject $H_0$.
If p-value $\ge \alpha$, then do not reject $H_0$.
Finding p-value (e.g., using Minitab):
Go to Graph > Probability Distribution Plot > View Probability.
Choose the t-distribution and specify the degrees of freedom ($df = n - 1$).
Select Define Shaded Area By > X Value.
Shade both tails beyond the absolute value of the calculated test statistic ($|t_{STAT}|$).
The total shaded area represents the p-value.
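The shaded-area procedure above can be approximated in code. The sketch below assumes SciPy is installed; the function name `two_tailed_p_value` and the inputs (a test statistic of 2.5 with 24 degrees of freedom) are hypothetical.

```python
from scipy import stats  # assumption: SciPy is available


def two_tailed_p_value(t_stat: float, df: int) -> float:
    """Total area in both tails beyond |t_stat| for a t-distribution with df degrees of freedom."""
    # sf is the upper-tail area; doubling it covers the symmetric lower tail.
    return 2 * stats.t.sf(abs(t_stat), df)


# Hypothetical values: t_stat = 2.5 from a sample of n = 25, so df = 24.
alpha = 0.05
p_value = two_tailed_p_value(2.5, 24)
print(f"p-value = {p_value:.4f}, reject H0: {p_value < alpha}")
```

Taking the absolute value first means the function gives the same answer whether the observed statistic is positive or negative, which matches shading both tails in Minitab.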
Confidence Interval Approach
This approach uses the confidence interval calculated from the sample data to make a decision about the null hypothesis.
Interpreting the Confidence Interval Approach:
Decision Rule:
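If the confidence interval does not contain the hypothesized population mean ($\mu_0$) from the null hypothesis ($H_0$), then reject $H_0$ at the $\alpha$ level of significance.
If the confidence interval does contain $\mu_0$, then do not reject $H_0$ at the $\alpha$ level of significance.
Confidence Interval Formula: For a t-distribution, the $(1 - \alpha)$ confidence interval for the mean is typically calculated as: $\bar{X} \pm t_{critical} \cdot \frac{S}{\sqrt{n}}$, where $t_{critical}$ is the two-tailed critical value with $df = n - 1$.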
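The interval and its decision rule can be sketched as follows. This assumes SciPy is installed; the helper name `t_confidence_interval`, the sample values, and `mu_0 = 170` are hypothetical.

```python
import math
import statistics

from scipy import stats  # assumption: SciPy is available


def t_confidence_interval(sample, alpha=0.05):
    """Return the (1 - alpha) confidence interval for the mean as (lower, upper)."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)
    # Margin of error: critical value times the standard error of the mean.
    margin = stats.t.ppf(1 - alpha / 2, n - 1) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical sample of heights in cm (illustrative data only).
sample = [172.1, 168.4, 175.3, 171.8, 169.9, 174.2, 173.5, 170.6, 176.0, 172.7]
mu_0 = 170.0  # hypothesized mean from H0 (assumed value)

lower, upper = t_confidence_interval(sample)
reject_h0 = not (lower <= mu_0 <= upper)
print(f"95% CI = ({lower:.3f}, {upper:.3f}), reject H0: {reject_h0}")
```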
Conclusion on Approaches
All three approaches (Critical Value, P-Value, and Confidence Interval) are mathematically equivalent for a two-tailed test when the confidence level is $1 - \alpha$, and should always lead to the same conclusion regarding the rejection or non-rejection of the null hypothesis.
Two-Tail t Test Using Minitab
For performing a one-sample t-test in Minitab:
Minitab Navigation for One-Sample t-Test:
Go to Stat > Basic Statistics > 1-Sample t…
In the dialog box, select the appropriate option for your data (e.g., 'One or more samples, each in a column').
Check 'Perform hypothesis test'.
Enter the 'Hypothesized mean' ($\mu_0$) corresponding to your null hypothesis.
Click Options…:
Set the 'Confidence level' (e.g., 95 for $\alpha = 0.05$).
Select the 'Alternative hypothesis' (e.g., 'Mean ≠ hypothesized mean' for a two-tailed test).
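For readers without Minitab, a roughly comparable check can be run in Python. This is a sketch, not a replacement for the Minitab dialog: it assumes SciPy 1.6 or later (for the `alternative` argument of `scipy.stats.ttest_1samp`), and the sample values and `mu_0 = 170` are hypothetical.

```python
from scipy import stats  # assumption: SciPy >= 1.6 for the `alternative` argument

# Hypothetical sample of heights in cm (illustrative data only).
sample = [172.1, 168.4, 175.3, 171.8, 169.9, 174.2, 173.5, 170.6, 176.0, 172.7]
mu_0 = 170.0   # 'Hypothesized mean' (assumed value)
alpha = 0.05   # corresponds to a 95 confidence level

# alternative="two-sided" mirrors 'Mean ≠ hypothesized mean';
# "less" and "greater" mirror the one-tailed options.
result = stats.ttest_1samp(sample, popmean=mu_0, alternative="two-sided")

print(f"t_stat = {result.statistic:.3f}, p-value = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "do not reject H0")
```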
Applying Each Approach with Minitab/Manual Calculation:
Test Statistic: The t-statistic formula is $t_{STAT} = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$. Use t in almost all cases; use z only if the population standard deviation ($\sigma$) is explicitly given.
1. Critical Value Approach:
Find $t_{critical}$ at $\alpha/2$ with $df = n - 1$ (as described in L08, Slide 4/Page 4 of the transcript).
Reject $H_0$ if $|t_{STAT}| > t_{critical}$.
2. P-Value Approach:
Using Minitab's 'Probability Distribution Plot', choose the t-distribution with $df = n - 1$.
Shade both tails beyond $|t_{STAT}|$. The total shaded area will be the p-value.
Reject $H_0$ if p-value $< \alpha$.
3. Confidence Interval Approach:
The confidence interval is calculated as $\bar{X} \pm t_{critical} \cdot \frac{S}{\sqrt{n}}$.
Reject $H_0$ if $\mu_0$ (the hypothesized mean) is outside this confidence interval.
Introduction to Hypothesis Testing
Imagine you are a detective examining a claim about a population (e.g., "the average height of students is 170 cm").
A hypothesis is this claim about a population parameter, like the population mean ($\mu$).
The Null Hypothesis ($H_0$) is your default assumption or the "status quo." It's the statement you assume to be true until proven otherwise. It always includes an equality sign ($=$, $\le$, or $\ge$). Think of it as "innocent until proven guilty." For example, $H_0: \mu = 170$ cm.
The Alternative Hypothesis ($H_1$) is what you want to prove. It contradicts the null hypothesis and never includes an equality sign. It suggests a difference or an effect exists.
Two-tailed test: You're looking for any difference (e.g., $H_1: \mu \ne 170$ cm). Your evidence could be in both extremes (tails) of the distribution, meaning the average height could be significantly lower or significantly higher than 170 cm.
One-tailed test: You're specifically looking for a difference in one direction (e.g., $H_1: \mu < 170$ cm or $H_1: \mu > 170$ cm). Your evidence would only be considered strong if it falls into one specific extreme (tail).
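The difference between the two kinds of test shows up in where the rejection boundaries sit. The sketch below assumes SciPy is installed; the function name `rejection_region` and the values `alpha = 0.05` and `df = 24` are hypothetical.

```python
from scipy import stats  # assumption: SciPy is available


def rejection_region(alternative: str, alpha: float, df: int):
    """Return (lower_cutoff, upper_cutoff); None means no cutoff on that side."""
    if alternative == "two-sided":   # H1: mu != mu_0, alpha/2 in each tail
        t_c = stats.t.ppf(1 - alpha / 2, df)
        return -t_c, t_c
    if alternative == "less":        # H1: mu < mu_0, all of alpha in the left tail
        return stats.t.ppf(alpha, df), None
    if alternative == "greater":     # H1: mu > mu_0, all of alpha in the right tail
        return None, stats.t.ppf(1 - alpha, df)
    raise ValueError(f"unknown alternative: {alternative!r}")


for alt in ("two-sided", "less", "greater"):
    print(alt, rejection_region(alt, alpha=0.05, df=24))
```

Because a one-tailed test puts all of $\alpha$ in one tail, its single cutoff sits closer to zero than either two-tailed cutoff at the same $\alpha$.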
To decide whether your evidence is strong enough to reject $H_0$, you can use three interconnected "lenses" or "filters." They are all different ways to look at the same evidence, and they will always lead to the same conclusion:
Critical Value Approach: Setting clear boundaries.
P-Value Approach: Assessing how surprising your data is.
Confidence Interval Approach: Defining a range of plausible values.
Critical Value Approach
This approach is like drawing "rejection lines" on your statistical map before you even look at your specific data's location. If your calculated data point (test statistic) falls outside these lines, you reject $H_0$.
Steps for the Critical Value Approach (How to Filter and Identify):
State your hypotheses ($H_0$ and $H_1$):
$H_0: \mu = \mu_0$ (e.g., "the average is 170 cm")
$H_1: \mu \ne \mu_0$ (e.g., "the average is NOT 170 cm" - this tells you it's a two-tail test, so you'll look for rejection in both directions).
Specify your level of significance ($\alpha$) and sample size ($n$):
$\alpha$ (e.g., 0.05 or 0.01): This is your "tolerance for error" or how rare an event must be to be considered significant. It's the probability of mistakenly rejecting $H_0$ when it's actually true (Type I error).
$n$: How many observations you have.
Determine the appropriate technique:
Rule: If the population standard deviation ($\sigma$) is unknown (which is almost always the case in real life, as we rarely know everything about a population), use a t-test.
If $\sigma$ were known, you'd use a z-test. (Focus on t-test for most scenarios).
Determine the critical values ($t_{critical}$): These are the "rejection lines." They depend on $\alpha$ and your degrees of freedom ($df$).
Degrees of freedom ($df$): For a one-sample t-test, $df = n - 1$. Think of this as the number of independent pieces of information available to estimate population variability.
For a two-tailed test with $\alpha = 0.05$, you're splitting the rejection regions into two tails, so you look for $\alpha/2 = 0.025$ in each tail. You'll find a positive and a negative $t_{critical}$.
Visualization: Imagine a bell curve. These critical values ($\pm t_{critical}$) mark the points beyond which we consider observations too extreme to have occurred by chance if $H_0$ were true.
Collect data and compute the test statistic ($t_{STAT}$):
The formula for your observed data's position on the t-distribution is: $t_{STAT} = \frac{\bar{X} - \mu_0}{S / \sqrt{n}}$, where:
$\bar{X}$ is your sample average.
$\mu_0$ is the average you hypothesized in $H_0$.
$S$ is your sample standard deviation.
$\sqrt{n}$ is the square root of your sample size.
Intuition: This formula tells you how many standard errors your sample mean ($\bar{X}$) is away from the hypothesized population mean ($\mu_0$).
Decision Rule for Critical Value Approach (How to Filter and Identify):
Reject $H_0$ if your absolute test statistic ($|t_{STAT}|$) is greater than the critical value ($t_{critical}$), that is, if $|t_{STAT}| > t_{critical}$. In other words, if your data falls beyond the rejection line in either tail, it's too extreme to support $H_0$.
Do not reject $H_0$ if $|t_{STAT}| \le t_{critical}$. Your data is not extreme enough; it falls within the "non-rejection region" where $H_0$ is plausible.
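A worked example may make the decision rule concrete. The numbers are hypothetical and not from the lecture: $n = 25$, $\bar{X} = 172.5$ cm, $S = 5$ cm, $H_0: \mu = 170$ cm, and $\alpha = 0.05$ for a two-tailed test.

```latex
\begin{align*}
t_{STAT} &= \frac{\bar{X} - \mu_0}{S/\sqrt{n}}
          = \frac{172.5 - 170}{5/\sqrt{25}}
          = \frac{2.5}{1} = 2.5 \\
df &= n - 1 = 24, \qquad t_{critical} = t_{0.025,\,24} \approx 2.0639 \\
|t_{STAT}| &= 2.5 > 2.0639 \;\Rightarrow\; \text{reject } H_0
\end{align*}
```

Here the sample mean sits 2.5 standard errors above 170 cm, which is beyond the upper rejection line, so the data is too extreme to support $H_0$ at the 0.05 level.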
P-Value Approach
This approach directly quantifies how surprising your observed data is, assuming $H_0$ is true. It's a continuous measure of evidence against $H_0$.
Interpreting the P-Value Approach (How to Filter and Identify):
P-value: This is the probability of observing sample data as extreme as, or more extreme than, what you actually got, if the null hypothesis ($H_0$) were true. A small p-value means your data would be very unlikely if $H_0$ were true, which counts as evidence against $H_0$. It is not the probability that $H_0$ itself is true.
Visualization: Imagine your test statistic ($t_{STAT}$) on the bell curve. The p-value is the total area in the tails beyond your test statistic (or its absolute value for a two-tailed test). It's the cumulative probability of observing something at least that extreme.
Decision Rule (How to Filter and Identify):
If p-value $< \alpha$, then reject $H_0$. This means your observed data is so surprising (unlikely) under $H_0$ that you decide $H_0$ must be wrong.
If p-value $\ge \alpha$, then do not reject $H_0$. Your observed data is not surprising enough to convincingly reject $H_0$.
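One way to see what "surprising under $H_0$" means is to simulate many samples from a world where $H_0$ is true and count how often the test statistic is at least as extreme as the observed one. The sketch below uses only the Python standard library. It assumes a normally distributed population, and the function name `simulated_two_tailed_p`, the observed statistic 2.5, and the sample size 25 are hypothetical.

```python
import math
import random


def simulated_two_tailed_p(t_obs: float, n: int, reps: int = 20_000, seed: int = 0) -> float:
    """Estimate P(|T| >= |t_obs|) under H0 by simulating normal samples of size n."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        # Under H0 the true mean equals mu_0. Using mu_0 = 0 and sigma = 1 loses
        # nothing, because the t statistic does not depend on location or scale.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)
        t_sim = mean / math.sqrt(var / n)
        if abs(t_sim) >= abs(t_obs):
            extreme += 1
    return extreme / reps


p_hat = simulated_two_tailed_p(t_obs=2.5, n=25)
print(f"simulated p-value is about {p_hat:.4f}")
```

The fraction printed is an estimate of the two-tailed area that Minitab shades; it varies slightly with the seed and the number of repetitions.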
Confidence Interval Approach
This approach builds a "net" or a range of plausible values for the true population mean, based on your sample data. You then check if the value hypothesized in $H_0$ falls within this net.
Interpreting the Confidence Interval Approach (How to Filter and Identify):
Confidence Interval (CI): This is a range calculated from your sample data that is likely to contain the true (unknown) population mean ($\mu$) with a certain level of confidence (e.g., 95%).
Formula: For a t-distribution, it's typically: $\bar{X} \pm t_{critical} \cdot \frac{S}{\sqrt{n}}$.
Intuition: "Based on my sample, I am confident that the true average is somewhere between X and Y."
Decision Rule (How to Filter and Identify):
If the confidence interval does not contain the hypothesized population mean ($\mu_0$) from $H_0$ (i.e., $\mu_0$ falls outside your net), then reject $H_0$.
If the confidence interval does contain $\mu_0$ (i.e., $\mu_0$ is caught in your net), then do not reject $H_0$.
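As an illustration of the net, take hypothetical summary numbers (not from the lecture): $n = 25$, $\bar{X} = 172.5$ cm, $S = 5$ cm, and a 95% confidence level, for which the critical value with $df = 24$ is approximately 2.0639.

```latex
\begin{align*}
\bar{X} \pm t_{critical} \cdot \frac{S}{\sqrt{n}}
  &= 172.5 \pm 2.0639 \cdot \frac{5}{\sqrt{25}}
   = 172.5 \pm 2.0639 \\
  &\Rightarrow\; 170.4361 \le \mu \le 174.5639
\end{align*}
```

With $H_0: \mu = 170$ cm, the hypothesized value 170 falls below the lower limit of 170.4361, so it is outside the net and $H_0$ is rejected at $\alpha = 0.05$.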
Conclusion on Approaches: The Unified Perspective
Always leading to the same conclusion: It's crucial to understand that all three approaches (Critical Value, P-Value, and Confidence Interval) are mathematically equivalent for a two-tailed test when the confidence level is $1 - \alpha$. They're just different ways of visualizing and interpreting the same statistical evidence. If you reject $H_0$ using one, you will reject it using the others. Think of them as three windows looking out at the same landscape.
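The agreement between the three approaches can be checked numerically. The sketch below assumes SciPy is installed and a two-tailed test; the function name `three_decisions` and the summary statistics passed to it are hypothetical.

```python
import math

from scipy import stats  # assumption: SciPy is available


def three_decisions(x_bar, s, n, mu_0, alpha=0.05):
    """Return the reject decision (True/False) from each approach for a two-tailed test."""
    df = n - 1
    se = s / math.sqrt(n)                          # standard error of the mean
    t_stat = (x_bar - mu_0) / se
    t_critical = stats.t.ppf(1 - alpha / 2, df)

    # 1. Critical value approach.
    by_critical_value = abs(t_stat) > t_critical
    # 2. P-value approach.
    by_p_value = 2 * stats.t.sf(abs(t_stat), df) < alpha
    # 3. Confidence interval approach.
    lower, upper = x_bar - t_critical * se, x_bar + t_critical * se
    by_confidence_interval = not (lower <= mu_0 <= upper)

    return bool(by_critical_value), bool(by_p_value), bool(by_confidence_interval)


# Hypothetical summary statistics for the height example (H0: mu = 170 cm).
print(three_decisions(x_bar=172.5, s=5.0, n=25, mu_0=170.0))
print(three_decisions(x_bar=171.0, s=5.0, n=25, mu_0=170.0))
```

Each call returns three booleans that should match one another, since all three rules are rearrangements of the same inequality.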
Two-Tail t Test Using Minitab: Your Tool for Application
Minitab simplifies the calculations, allowing you to focus on the interpretation. Here's how to apply these concepts using Minitab for a one-sample t-test:
Minitab Navigation for One-Sample t-Test:
Go to Stat > Basic Statistics > 1-Sample t…
Select 'One or more samples, each in a column' (if your data is raw) and specify your data column.
Check 'Perform hypothesis test'.
Enter the 'Hypothesized mean' ($\mu_0$), which is the value from your $H_0$ (e.g., 170 for the height example).
Click Options…:
Set the 'Confidence level' (e.g., 95 for $\alpha = 0.05$).
Select the 'Alternative hypothesis' (e.g., 'Mean ≠ hypothesized mean' for a two-tailed test, 'Mean < hypothesized mean' for a left-tailed test, etc.). This ensures Minitab calculates the correct p-value and critical values for your specific $H_1$.
Applying Each Approach with Minitab/Manual Calculation:
Test Statistic ($t_{STAT}$): Minitab will calculate this for you. Remember the rule: use $t$ in almost all cases unless the population standard deviation ($\sigma$) is explicitly given.
1. Critical Value Approach (How to identify):
From Minitab (if available graphically) or manually: Find $t_{critical}$ at $\alpha/2$ (for a two-tail test) with $df = n - 1$. Using Minitab's 'Probability Distribution Plot', you can visualize where these boundary lines fall for your chosen $\alpha$.
Decision: Reject $H_0$ if $|t_{STAT}| > t_{critical}$ (if your calculated statistic is more extreme than the boundary lines).
2. P-Value Approach (How to identify):
Minitab will directly provide the p-value in its output.
Decision: Reject $H_0$ if p-value $< \alpha$ (if your data is surprisingly unlikely to have occurred under $H_0$).
3. Confidence Interval Approach (How to identify):
Minitab will provide the confidence interval in its output.
Decision: Reject $H_0$ if $\mu_0$ (your hypothesized mean) is outside this confidence interval (if your hypothesized value is not a plausible value for the true population mean based on your data).