Significance Testing and Hypothesis Testing

Testing the null hypothesis.
Example: Comparing leaf sizes from the Southwest (SW) and Northwest (NW).
Null Hypothesis (H₀): Leaf size SW = Leaf size NW (no difference).
Alternative Hypothesis (Hₐ): Leaf size SW < Leaf size NW.
Data samples:
- SW: 2.01, 1.35, 4.5, 4.32, 5.34, 1.2
- NW: 1.32, 2.63, 3.45, 7.51, 6.35, 6.45
Location and Spread Estimators (Sample):
- Mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
- Variance: $s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$
LsD (Leaf size Difference) = $\bar{x}_{SW} - \bar{x}_{NW}$
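As a rough illustration, the estimators above can be computed with Python's standard `statistics` module. The variable names (`sw`, `nw`, `lsd`) are chosen for this sketch and do not come from the notes.

```python
from statistics import mean, variance

# Leaf sizes exactly as listed in the notes.
sw = [2.01, 1.35, 4.5, 4.32, 5.34, 1.2]
nw = [1.32, 2.63, 3.45, 7.51, 6.35, 6.45]

# Location: sample mean of each group.
mean_sw, mean_nw = mean(sw), mean(nw)

# Spread: sample variance (statistics.variance uses the n - 1 denominator).
var_sw, var_nw = variance(sw), variance(nw)

# Leaf size difference: mean(SW) - mean(NW).
lsd = mean_sw - mean_nw

print(f"mean SW = {mean_sw:.3f}, mean NW = {mean_nw:.3f}")
print(f"var  SW = {var_sw:.3f}, var  NW = {var_nw:.3f}")
print(f"LsD     = {lsd:.3f}")
```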

Definition (statistic): a numerical summary that reduces the data to one value and whose possible values (its distribution) under the null hypothesis are known.
LsD = $\bar{x}_{SW} - \bar{x}_{NW} \approx -1.50$ (from the samples listed above: $3.12 - 4.62$)
Question: How certain are we that LsD is truly negative and that we did not get this value just by chance?
We can find $P(\text{LsD} \le -1.50)$ under the null hypothesis.
Remember, we just need to know the area under the LsD distribution curve up to that value.

The possible values of the statistic can be estimated empirically: the SW and NW samples are repeated many times, giving $LsD_1, LsD_2, \ldots, LsD_n$.
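The notes do not say how the repeated samples are generated. One common option, assumed here, is a permutation scheme: pool the twelve leaves, shuffle the SW/NW labels, and recompute LsD each time. The sketch below is illustrative only; `permutation_lsds`, `n_resamples`, and `seed` are hypothetical names, not part of the original material.

```python
import random

sw = [2.01, 1.35, 4.5, 4.32, 5.34, 1.2]
nw = [1.32, 2.63, 3.45, 7.51, 6.35, 6.45]


def lsd(a, b):
    """Leaf size difference: mean(a) - mean(b)."""
    return sum(a) / len(a) - sum(b) / len(b)


def permutation_lsds(a, b, n_resamples=10_000, seed=0):
    """LsD values obtained by repeatedly shuffling the group labels (assumed scheme)."""
    rng = random.Random(seed)
    pooled = list(a) + list(b)
    values = []
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        values.append(lsd(pooled[:len(a)], pooled[len(a):]))
    return values


observed = lsd(sw, nw)
null_lsds = permutation_lsds(sw, nw)

# One-sided p-value: share of shuffled LsD values at least as negative as
# the observed one (small tolerance guards against floating-point rounding).
p_value = sum(d <= observed + 1e-12 for d in null_lsds) / len(null_lsds)

print(f"observed LsD = {observed:.3f}, permutation p-value = {p_value:.3f}")
```

The shuffled values play the role of the empirical LsD distribution; the p-value is the area of that distribution up to the observed value.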
Alternatively, use a statistic whose pdf (probability density function) under the null hypothesis is a known statistical distribution:
- t: uses the Student t distribution
- F: uses the F distribution
- Z: uses the Standard normal distribution
- $\chi^2$ : uses the chi-square distribution
- Note: these statistics belong to the frequentist approach; the Bayesian approach is an alternative.

Null (H₀): There is NO association between class and survival.
Alternative (Hₐ): There IS an association between class and survival.
Expectation if the null hypothesis is true: Same proportion of people would have died in each class!
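The notes give no counts for this example, so the sketch below uses HYPOTHETICAL numbers invented purely for illustration. It also assumes the comparison is made with a chi-square test of independence, which the notes list among the available statistics but do not explicitly attach to this example. It shows how the "same proportion in each class" expectation turns into expected counts and a $\chi^2$ statistic.

```python
from math import exp

# HYPOTHETICAL counts (not from the notes): class -> (died, survived).
observed = {
    "first": (40, 60),
    "second": (50, 50),
    "third": (90, 30),
}

row_totals = {cls: sum(counts) for cls, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
grand_total = sum(row_totals.values())

# Under H0 every class shares the overall death proportion.
overall_death_rate = col_totals[0] / grand_total

chi2 = 0.0
for cls, counts in observed.items():
    for j, obs in enumerate(counts):
        expected = row_totals[cls] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (2 - 1)

# For df == 2 only, the chi-square upper-tail probability is exp(-x / 2).
p_value = exp(-chi2 / 2) if df == 2 else None

print(f"overall death rate under H0 = {overall_death_rate:.4f}")
print(f"chi2 = {chi2:.3f}, df = {df}, p-value = {p_value}")
```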

The t-test is used to compare two population means.
- Paired data: same individuals studied at two different times or under two conditions (PAIRED T-TEST).
- Independent: data collected from two separate groups (INDEPENDENT SAMPLES T-TEST).
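To make the two variants concrete, here is a minimal sketch of both t statistics using only the standard library. The independent version assumes the classic pooled-variance (equal variances) form; the function names `independent_t` and `paired_t` are made up for this example, and only the statistic and degrees of freedom are computed, not the p-value.

```python
from math import sqrt
from statistics import mean, stdev, variance


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two separate groups."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2


def paired_t(first, second):
    """t statistic and degrees of freedom for paired measurements (second - first)."""
    diffs = [y - x for x, y in zip(first, second)]
    se = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / se, len(diffs) - 1


# The leaf data come from two separate regions, so the independent form applies.
sw = [2.01, 1.35, 4.5, 4.32, 5.34, 1.2]
nw = [1.32, 2.63, 3.45, 7.51, 6.35, 6.45]
t_stat, dof = independent_t(sw, nw)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The statistic is then compared with the Student t distribution with `dof` degrees of freedom, for example via a t table or a statistics library.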

Assumptions of the t-test and how to check them.
Normality:
- Plot histograms: one plot of the paired differences for any paired data; two (one for each group) for independent samples.
- Should be roughly symmetric.
Equal population variances:
- Compare sample standard deviations: the larger should be no more than twice the smaller.
- Run a formal test for a difference in variances, e.g. the F-test, Levene's test, or the Fligner-Killeen test.
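The standard-deviation rule of thumb above is easy to check by hand. The sketch below applies it to the leaf samples from the earlier example; the variable names are illustrative, and the formal tests mentioned in the list are not implemented here.

```python
from statistics import stdev

sw = [2.01, 1.35, 4.5, 4.32, 5.34, 1.2]
nw = [1.32, 2.63, 3.45, 7.51, 6.35, 6.45]

sd_sw, sd_nw = stdev(sw), stdev(nw)

# Rule of thumb: the larger sample SD should be at most twice the smaller one.
sd_ratio = max(sd_sw, sd_nw) / min(sd_sw, sd_nw)
equal_variances_plausible = sd_ratio <= 2

print(f"SD SW = {sd_sw:.3f}, SD NW = {sd_nw:.3f}, ratio = {sd_ratio:.3f}")
print("equal variances plausible:", equal_variances_plausible)
```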
The t-test is very robust to violations of the assumptions of Normality and equal variances, particularly for moderate (i.e., >30) and larger sample sizes.
Resource: www.statstutor.ac.uk

Charts can be used to informally assess whether data is normally distributed or skewed.
The mean and median are very different for skewed data.
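A quick numeric check of this point, using two small HYPOTHETICAL data sets invented for illustration (they are not from the notes): one symmetric and one with a long right tail.

```python
from statistics import mean, median

# HYPOTHETICAL values, chosen only to contrast the two shapes.
symmetric = [2, 3, 4, 5, 6, 7, 8]
skewed = [1, 1, 2, 2, 3, 4, 30]  # one large value creates a right tail

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    gap = mean(data) - median(data)
    print(f"{name}: mean = {mean(data):.2f}, median = {median(data):.2f}, gap = {gap:.2f}")
```

A large gap between mean and median is an informal hint of skew; it complements, rather than replaces, looking at the histogram.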

Histograms show frequency distributions.
Examples: distributions of weight loss for placebo and treatment groups, and for a new drug.
Question: Do these histograms look approximately normally distributed?

There are alternative tests which do not have these assumptions.
Independent t-test: Use Mann-Whitney test if histograms of data by group are not normal.
Paired t-test: Use Wilcoxon signed-rank test if the histogram of paired differences is not normal.
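For intuition about the Mann-Whitney alternative, the sketch below computes only its U statistic by direct pair counting, applied to the leaf samples from the earlier example. The function name `mann_whitney_u` is hypothetical, and no p-value is derived here; in practice a statistics library would supply one.

```python
def mann_whitney_u(a, b):
    """U for group a: number of (x, y) pairs with x > y; ties count as 0.5."""
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


sw = [2.01, 1.35, 4.5, 4.32, 5.34, 1.2]
nw = [1.32, 2.63, 3.45, 7.51, 6.35, 6.45]

u_sw = mann_whitney_u(sw, nw)
u_nw = mann_whitney_u(nw, sw)

# The two U values always sum to the number of pairs, len(sw) * len(nw).
print(f"U(SW) = {u_sw}, U(NW) = {u_nw}, pairs = {len(sw) * len(nw)}")
```

Because U depends only on the ordering of the values, it does not rely on the normality assumption.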

To test against the null hypothesis, we first calculate a statistic.
A statistic is a numerical summary that reduces our data to one value.
In frequentist statistics, we use statistics whose possible values under the null hypothesis are known.
The researcher establishes the significance level for the test against H₀ by choosing a value of alpha (commonly 0.05 or 0.01) before looking at the result.
The decision to reject or fail to reject H₀ is usually made by comparing the p-value against the chosen threshold alpha (α): reject H₀ if the p-value is at most α, otherwise fail to reject it.
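The decision rule can be written down in a few lines. This is only a sketch: `decide` is a hypothetical helper, the p-values passed to it below are made-up inputs, and the convention "reject when p ≤ α" is assumed.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the significance level alpha chosen beforehand."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Made-up p-values, used only to exercise the rule.
for p in (0.003, 0.03, 0.2):
    print(f"p = {p}: alpha 0.05 -> {decide(p, 0.05)}; alpha 0.01 -> {decide(p, 0.01)}")
```

Note that the same p-value can lead to different decisions under α = 0.05 and α = 0.01, which is why α is fixed in advance.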