Hypothesis testing By Dr. Ama
Hypothesis Testing
Definition: An act in statistics whereby an analyst tests an assumption regarding a population parameter. (Source: Investopedia)
Instructor: Dr. Ama Jayawardana, Senior Lecturer, General Sir John Kotelawala Defense University.
Fundamentals of Hypothesis Testing
Hypothesis testing is a technique for interpreting and drawing inferences about a population based on sample data.
It helps determine which sample data best support mutually exclusive population claims.
Hypotheses
Null Hypothesis (H0): The assumption that the event will not occur. It has no impact unless rejected.
Symbol: H0 (pronounced H-naught)
Alternate Hypothesis (H1 or Ha): The logical opposite of the null hypothesis.
t Tests
Definition: A t test is a statistical test used to compare the means of two groups.
Purpose: Used in hypothesis testing to determine if a treatment has an effect on the population or if two groups differ significantly.
t Test Example
Scenario: Testing if the mean petal length of iris flowers differs by species.
Procedure: Measure 25 petals from two different species and apply a t test with null and alternative hypotheses.
H0: The true difference between group means is zero.
Ha: The true difference is not zero.
When to Use a t Test
Conditions:
Data must be independent and approximately normally distributed.
Groups must exhibit homogeneity of variance (similar variance).
t Test Formula
t = (x1 - x2) / (s2 / n1 + s2 / n2)
Where:
t = test statistic
x1, x2 = means of the groups
s2 = pooled standard error
n1, n2 = number of samples in each group.
Sample Data for t Test
Group | Sample Size (n) | Average (X̄) | Std Dev (s) |
|---|---|---|---|
Women | 10 | 22.29 | 5.32 |
Men | 13 | 14.95 | 6.84 |
Calculating Differences
Difference in averages: (22.29 - 14.95) = 7.34.
Pooled Standard Deviation
Calculate pooled variance: For example, use:
816.55 / 21 = √38.88 = 6.24.
Test Statistic Calculation
Combine calculated values:
Final formula used:
t = 7.34 / (√(6.24 × (1/10 + 1/13))) = 2.80.
Finding Degrees of Freedom and Alpha Level
Degrees of Freedom (df): n1 + n2 - 2 = 21.
Significance level (α = 0.05). The critical t value for df=21 is 2.080.
Conclusion of t Test
Compare t value to critical t value: If t > 2.080, reject H0.
Conclusion: Significant difference in body fat between men and women.
Chi-square Test
Definition: A statistical procedure to determine the difference between observed and expected data, or to examine the relationship between categorical variables.
Types of Chi-Square Tests
Independence: Examines if two categorical variables are related.
Goodness-of-Fit: Determines if a variable is likely from a certain distribution.
Steps for Chi-Square Test of Independence
Null Hypothesis: There is no relationship between variables.
Calculate expected values based on the sample distribution and total counts.
Calculate the Chi-square statistic.
Example Data
Gender and education levels across a survey of 395 individuals.
Calculation of Test Statistic
Use the formula: χ² = Σ(O−E)²/E.
Compare χ² value to critical value to determine result significance.
One-Way ANOVA
Definition: A statistical method used to compare means across multiple groups.
Key Points:
Dependent variable must be continuous.
Independent variable must be categorical.
Assumes normality and homogeneity of variance.
Process Steps for One-Way ANOVA
Calculate group means and overall mean.
Calculate SSR (Regression Sum of Squares) and SSE (Error Sum of Squares).
Fill in ANOVA table using totals obtained.
Compare F statistic to critical F value to draw conclusions.
Example Interpretation
F test statistic results and comparing to critical F values guide null hypothesis acceptance or rejection.
Practical Application
Chi-square and t tests are steps to assess relationships and effects in real-world scenarios, like analyzing apple weights or political preferences.
Conclusion
Research methods in statistics like t tests and Chi-square tests are vital tools in drawing inferences from data.