Null Hypothesis
A hypothesis that states that there is no difference and no effect, i.e. nothing is going on. Typically, we’d test our models and data against the null hypothesis in contrast to an alternative hypothesis, one that says that there is something going on.
Randomisation Tests
Assess the significance of an observed result against results generated by pure random relabelling, typically a model that represents the null hypothesis. To conduct one, the steps are (a minimal code sketch follows these steps):
Deciding on a metric to measure the effect in question (means, medians, differences, etc.)
Based on our observed data, we’d calculate the test statistic (the value we’re investigating)
To manually do a randomisation test, we’d randomly shuffle around the data labels and calculate the new test statistic based on the reshuffled data
We’d have to repeat this many times to build a good null hypothesis model
The null model shows us what the test statistic looks like under pure randomness
Lastly, we’d state the strength of our observed test statistic within the null hypothesis model (how probable is it?)
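A minimal sketch of these steps in Python, using the difference in group means as the test statistic; the data, group sizes, and number of shuffles are all made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical data for two groups (values are illustrative only)
group_a = np.array([12.1, 9.8, 11.5, 10.9, 13.2, 10.4])
group_b = np.array([9.5, 8.7, 10.1, 9.9, 8.4, 9.2])

# Steps 1-2: the metric is the difference in means; compute the observed statistic
def test_statistic(a, b):
    return a.mean() - b.mean()

observed = test_statistic(group_a, group_b)

# Steps 3-4: repeatedly shuffle the labels and recompute the statistic,
# building up the null hypothesis model
pooled = np.concatenate([group_a, group_b])
n_a = len(group_a)
null_stats = np.empty(10_000)
for i in range(null_stats.size):
    rng.shuffle(pooled)
    null_stats[i] = test_statistic(pooled[:n_a], pooled[n_a:])

# Step 5: how extreme is the observed statistic within the null model?
# (two-sided p-value: fraction of shuffled statistics at least as extreme)
p_value = np.mean(np.abs(null_stats) >= abs(observed))
print(f"observed = {observed:.3f}, p = {p_value:.4f}")
```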
P-Value
The probability, under the null hypothesis model, of a test statistic at least as extreme as the one observed. It tells us how compatible the observed test statistic is with the null hypothesis model.
The more incompatible it is with the model, the stronger the evidence against the null hypothesis.
They do not tell us the probability that the studied hypotheses are true
So, real world decisions shouldn’t be based on just p-values alone because:
They don’t tell us the magnitude of effect/difference
They don’t tell us the importance of a result
Proper inference requires full reporting & transparency
With the p-value, we can state how strong the evidence against the null hypothesis is.
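As a concrete sketch of how a p-value falls out of a null model (the null distribution and observed statistic below are hypothetical stand-ins, e.g. the null_stats array from the randomisation test above):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical null distribution and observed statistic (stand-ins for illustration)
null_stats = rng.normal(loc=0.0, scale=1.0, size=10_000)
observed = 2.1

# One-sided p-value: how often the null model produces a value at least this large
p_one_sided = np.mean(null_stats >= observed)

# Two-sided p-value: how often the null model is at least this extreme in either direction
p_two_sided = np.mean(np.abs(null_stats) >= abs(observed))
print(f"one-sided p = {p_one_sided:.4f}, two-sided p = {p_two_sided:.4f}")
```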
Type I Error
False positive: declaring that there is a difference (rejecting the null) when there is actually no difference (the null is true)
It is determined by the significance level we choose (α). The higher α is, the more likely we are to make a Type I error.
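A minimal simulation sketch of this idea: if we test at α = 0.05 when the null really is true, we should falsely reject about 5% of the time. It uses SciPy's two-sample t-test, which isn't part of these notes; the sample sizes and number of simulations are arbitrary:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha = 0.05
n_sims = 5_000

# Both samples come from the SAME distribution, so the null is true
false_positives = 0
for _ in range(n_sims):
    a = rng.normal(0.0, 1.0, size=30)
    b = rng.normal(0.0, 1.0, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:  # rejecting here is a Type I error
        false_positives += 1

print(false_positives / n_sims)  # should come out close to alpha (~0.05)
```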
Type II Error
False negative: declaring that there is no difference (failing to reject the null) when there is actually a difference (the null is false).
The power of the test is the probability that we correctly reject the null hypothesis when the alternative hypothesis is true.
Let β = P(fail to reject the null | the null is false). Then 1 − β = P(reject the null | the null is false), which is our power
The power of the test is also determined by sample size, effect size (size of difference), significance value, and experimental error variance
Type I and Type II errors are inversely related: reducing the chance of a Type I error increases the chance of a Type II error, and vice versa
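To make power concrete, here is a hedged simulation sketch: the samples now come from distributions with a real difference in means (the null is false), so the rejection rate estimates the power, 1 − β. The effect size, sample size, and choice of t-test are all assumptions for illustration:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha = 0.05
n_sims = 5_000
effect = 0.5  # true difference in means, so the null is false

rejections = 0
for _ in range(n_sims):
    a = rng.normal(0.0, 1.0, size=30)
    b = rng.normal(effect, 1.0, size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:  # correctly rejecting the null
        rejections += 1

power = rejections / n_sims  # estimate of 1 - beta
beta = 1 - power             # estimate of P(Type II error)
print(f"power ≈ {power:.3f}, beta ≈ {beta:.3f}")
```

Rerunning with a larger sample size or a larger effect raises the estimated power, matching the factors listed above.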