Study Notes on Hypothesis Testing
Hypothesis Testing in Statistics
Understanding Hypothesis Testing
Definition: Hypothesis testing is a statistical method used to make inferences or draw conclusions about a population based on sample data.
Key Terms:
Sample Data: A subset of a population that is collected for analysis.
Population Parameter: A value that represents a characteristic of an entire population (e.g., population mean, population standard deviation).
Sample Mean (x̄): The average of the sample data being analyzed.
Standard Deviation (σ): Measures the dispersion or variation of a set of values in a population.
Common Errors in Estimation
Margin of Error: The largest amount by which a sample estimate is expected to differ from the true population parameter at a given confidence level.
Example: An estimate reported as 7.98 ± 0.34 hours has a margin of error of 0.34 hours.
Point Estimate: A single value computed from a sample (such as the sample mean) that is used to estimate a population parameter.
It represents the best guess but, on its own, gives no range indicating how precise the estimate is.
Steps in Hypothesis Testing
State the Null Hypothesis (H₀): This is a presumption that there is no effect or no difference, and it serves as a starting point for statistical testing.
State the Alternative Hypothesis (H₁): This hypothesis represents the claim we seek evidence for, indicating that there is an effect or a difference.
Determine the Sample Statistics: Collect sample data and calculate statistics such as the sample mean (x̄), sample standard deviation (s), and sample size (n).
Perform the Statistical Test: Using the data, apply a statistical test suited to the scenario: a z-test when the population standard deviation is known, or a t-test when it must be estimated from the sample.
Make a Decision: Compare the p-value obtained from the statistical test to a significance level (usually α = 0.05). Reject H₀ if the p-value is below α; otherwise fail to reject H₀.
Conclusion: Report the findings in the context of the original research question.
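The steps above can be sketched in code. This is only a minimal illustration under stated assumptions: it uses a one-sample z-test with a known population standard deviation, and the function name `z_test` and every number passed to it are hypothetical, not taken from the notes.

```python
from math import sqrt
from statistics import NormalDist


def z_test(x_bar, mu_0, sigma, n, alternative="two-sided", alpha=0.05):
    """One-sample z-test; assumes sigma is the known population SD."""
    # Test statistic: how many standard errors x_bar lies from mu_0.
    z = (x_bar - mu_0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, SD 1
    if alternative == "less":
        p_value = std_normal.cdf(z)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    # Decision rule: reject H0 only when the p-value falls below alpha.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision


# Hypothetical inputs: H0: mu = 50 vs H1: mu != 50, known sigma = 6.
z, p_value, decision = z_test(x_bar=52.0, mu_0=50.0, sigma=6.0, n=36)
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision}")
```

The conclusion step is not automated here: the printed decision still has to be restated in terms of the original research question.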
Simplifying Concepts
Alternative Way to Phrase: Instead of "is my sample data," one might say, "Does the sample provide strong evidence for the alternative hypothesis?" This emphasizes looking for evidence rather than simply comparing values.
Confidence Intervals for Means Using T-Distribution
Constructed a 95% confidence interval for average student sleep hours using a sample of 50 students
Sample mean: 7.98 hours, standard deviation: 1.19 hours
Used t-distribution with 49 degrees of freedom, t-star value: 2.010
Final interval: 7.98 ± 0.34, or about 7.64 to 8.32 hours of sleep per night
Formula approach: must write out the formula and show how t-star was found on exams
Calculator method: TInterval function in Stats menu allows using either raw data or summary statistics
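As a check on the formula approach, the arithmetic can be reproduced from the summary statistics. This is a sketch only: it takes the t-star value of 2.010 from the notes as given rather than looking it up, and the variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the sleep study in the notes.
n = 50          # sample size
x_bar = 7.98    # sample mean (hours)
s = 1.19        # sample standard deviation (hours)
t_star = 2.010  # critical value for 95% confidence, 49 degrees of freedom

# Formula: x_bar ± t_star * s / sqrt(n)
standard_error = s / sqrt(n)
margin = t_star * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"margin of error = {margin:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the interval is symmetric, it is always centered on the sample mean, which is a quick sanity check on any reported bounds.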
Interpreting Confidence Intervals
Correct interpretation: 95% confident that the population mean falls within the interval
Common misconceptions addressed:
The interval does NOT capture 95% of individual data values
The interval does NOT predict where 95% of future sample means will fall
If repeating the procedure 1000 times, about 950 intervals would contain the true population mean
Used baseball player salary example to illustrate these concepts
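The "repeat the procedure 1000 times" interpretation can be illustrated with a small simulation. This is a sketch under assumptions: the population is taken to be normal with a hypothetical true mean of 8.0 and standard deviation of 1.2 (values chosen for illustration, not from the notes), and each sample has 50 observations so that t-star = 2.010 applies.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)    # fixed seed so the run is repeatable

TRUE_MEAN = 8.0   # hypothetical population mean
TRUE_SD = 1.2     # hypothetical population standard deviation
N = 50            # sample size, so df = 49
T_STAR = 2.010    # 95% critical value for 49 degrees of freedom
TRIALS = 1000


def interval_covers_truth():
    """Draw one sample, build its 95% t-interval, check if it contains TRUE_MEAN."""
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    margin = T_STAR * stdev(sample) / sqrt(N)
    return abs(mean(sample) - TRUE_MEAN) <= margin


covered = sum(interval_covers_truth() for _ in range(TRIALS))
print(f"{covered} of {TRIALS} intervals contained the true mean")
```

The count should land near 950 but will vary from run to run with a different seed. The 95% describes the long-run success rate of the procedure, not any single interval.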
Margin of Error and Sample Size Calculation
Margin of error for the sleep study interval: 0.34 hours
Calculated as: (upper bound - lower bound) / 2
To find required sample size for a given margin of error, use formula: n = (t-star × s / margin of error)²
Must round up to next whole number
Example: to achieve 95% confidence with a 0.25-hour margin of error (t-star = 2.010, s = 1.19), need a sample size of 92 students
Use the standard deviation from a previous sample, or estimate it as range/6 when none is available
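The sample size formula can be checked with a few lines of code. This sketch plugs in t-star = 2.010 and s = 1.19 from the sleep study (the 95% critical value with 49 degrees of freedom) together with a target margin of error of 0.25; the function name `required_sample_size` is illustrative.

```python
from math import ceil


def required_sample_size(t_star, s, margin_of_error):
    """n = (t_star * s / margin_of_error)^2, rounded up to a whole number."""
    return ceil((t_star * s / margin_of_error) ** 2)


n_needed = required_sample_size(t_star=2.010, s=1.19, margin_of_error=0.25)
print(n_needed)  # 92, since the unrounded value is about 91.54
```

Rounding up rather than to the nearest integer guarantees that the margin of error does not exceed the target.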
Introduction to Hypothesis Testing (Not on Exam)
Section 5.4 explicitly stated as not covered on the exam
General framework: null hypothesis (H₀: μ = μ₀) vs alternative hypothesis (H₁: μ ≠ μ₀, μ > μ₀, or μ < μ₀)
Test statistic: t = (x̄ - μ₀) / (s / √n)
P-value found using tcdf function with n-1 degrees of freedom
Example problem: testing if listening to classical music reduces maze completion time from population mean of 40 seconds
Sample of 100 students, mean time 39.1 seconds, standard deviation 4 seconds
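The maze example can be worked through numerically. This is a sketch under assumptions: the alternative is taken to be H₁: μ < 40 (since the question is whether music reduces time), and because the Python standard library has no t-distribution, the helper functions `t_pdf` and `t_cdf` below are hypothetical stand-ins for the calculator's tcdf, built by numerically integrating the t density.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry about 0."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


# Values from the maze example in the notes.
n, x_bar, s, mu_0 = 100, 39.1, 4.0, 40.0

t_stat = (x_bar - mu_0) / (s / sqrt(n))  # (39.1 - 40) / 0.4 = -2.25
p_value = t_cdf(t_stat, df=n - 1)        # left-tail area for H1: mu < 40

print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")
```

Under these assumptions the p-value comes out near 0.013, which is below α = 0.05, so H₀ would be rejected in favor of the claim that completion time is reduced.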
Exam Reminders
Must show formulas and calculations when constructing confidence intervals
Can use calculator functions but must specify which function and what values were used
For homework, may need to enter answers to 4 decimal places - use Vars menu to access full precision
Formulas will be provided on exam cheat sheet