Hypothesis Testing - Informally
If we want to investigate a theory, we can collect data and analyze them to see whether they support or refute the theory
If we find undeniable evidence that supports a theory of ours, we can say that the theory is likely true. However, much of the time, we do not have undeniable evidence
In biostatistics, we can only talk about how much evidence we have, and we need to make a decision based on that evidence
If we find a lot of evidence, we can say that the theory is likely true
Absence of evidence is not evidence of absence
Hypothesis Testing
In hypothesis testing, a specific statement or hypothesis is generated about a population parameter
Two competing hypotheses are generated about the unknown population parameter
One reflects no difference, no change, no association, no effect, etc…
This is called the null hypothesis: H0
Default null is 0
Other null values are possible, e.g., when looking at the bioavailability of a new generic drug as compared to the known bioavailability of a branded drug
The other reflects the investigator’s belief: research or alternative hypothesis: H1 or HA
The null and alternative hypotheses are set before we collect the data
Then the sample data are analyzed: sample statistics are used to assess how compatible the data are with the null hypothesis and to determine whether they support or refute the research hypothesis
Note that we CANNOT know with certainty whether the null hypothesis is true or not
All we can say is whether there is enough evidence in favor of rejecting the null hypothesis or of failing to reject the null hypothesis
We get at that using a p-value
This reflects how likely it is to observe the sample data, or something more extreme, if the null hypothesis were true
This is a conditional probability
How surprised are you to see these data if the null is true?
If the p-value is large, you are not surprised, so you fail to reject the null
If the p-value is small, you are very surprised, so you reject the null
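The idea of "surprise" can be made concrete with a small sketch. It assumes a one-sample z-test with a known population standard deviation, which is only one of many possible tests; the function name and all numbers (null value 100, standard deviation 15, sample size 100) are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value for a one-sample z-test (sigma assumed known)."""
    # Standardize the observed mean under the null hypothesis
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability, if the null is true, of a statistic at least this far from 0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical data: same null value, two different observed sample means
z_far, p_far = z_test_p_value(sample_mean=103.0, mu0=100.0, sigma=15.0, n=100)
z_near, p_near = z_test_p_value(sample_mean=101.0, mu0=100.0, sigma=15.0, n=100)

print(f"mean 103: z = {z_far:.2f}, p = {p_far:.4f}")    # small p: surprising under the null
print(f"mean 101: z = {z_near:.2f}, p = {p_near:.4f}")  # large p: not surprising
```

A sample mean far from the null value gives a small p-value, and one close to it gives a large p-value.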
We need to determine a threshold or cutoff point (called the critical value) to decide when to favor rejecting the null hypothesis or failing to reject
In hypothesis testing, we select the critical value from the sampling distribution of the test statistic
This is done by first determining what is called the level of significance, denoted α
This is the probability of making the following error: the null is true but our conclusion is to reject it
α reflects the probability that we reject the null hypothesis (in favor of the alternative) when it is actually true:
α = P(Reject H0 | H0 is true)
The usual value for α is 0.05, or 5%
But this can be 0.01, 0.1, 0.0167, etc
If we select α = 0.05, we are allowing a 5% probability of incorrectly rejecting the null hypothesis in favor of the alternative when the null is true
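The meaning of α can be checked with a simulation. This is a sketch under stated assumptions: a two-sided z-test, normally distributed data, and a null hypothesis that really is true. The function name, sample size, number of trials, and seed are all hypothetical. Among many repeated samples, the share in which the null is rejected should be close to α.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_one_error_rate(alpha=0.05, mu0=0.0, sigma=1.0, n=30, trials=20000, seed=1):
    """Simulate samples with H0 true and count how often H0 is rejected."""
    rng = random.Random(seed)
    # Two-sided critical value from the standard normal sampling distribution
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # Data are generated with true mean mu0, so the null hypothesis holds
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:
            rejections += 1  # a Type I error: rejecting a true null
    return z_crit, rejections / trials

z_crit, rate = type_one_error_rate()
print(f"critical value = {z_crit:.3f}, observed rejection rate = {rate:.3f}")
```

The observed rejection rate varies with the seed, but it should stay near 0.05 when α = 0.05.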
Hypothesis Testing Techniques
One sample, continuous
Two independent samples, continuous
Two dependent, matched samples, continuous
More than two independent samples, continuous
One sample, dichotomous
Two independent samples, dichotomous
More than two independent samples, dichotomous
One sample, categorical or ordinal (more than 2 response options)
Two or more independent samples, categorical or ordinal
p-values
We are determining how extreme the data we observed are, assuming that the null hypothesis is true
A p-value is the estimated probability of observing a statistic value as extreme or more extreme than the one we actually observed given that the null is true (aka under the null)
A small p-value indicates that the statistic we have observed would be unlikely when the null hypothesis is true
That leads us to doubt the null
In such a case, we reject the null hypothesis
A large p-value just tells us that we have insufficient evidence to doubt the null hypothesis
In particular, it does not prove the null to be true
In such a case, we fail to reject the null hypothesis
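One way to see why a large p-value does not prove the null is to hold the observed difference fixed and change only the sample size. The sketch below again assumes a one-sample z-test with known standard deviation; the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(sample_mean, mu0, sigma, n):
    """Two-sided p-value for a one-sample z-test (sigma assumed known)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same observed difference (102 vs. a null value of 100), different sample sizes
p_small_n = two_sided_p(102.0, 100.0, 15.0, n=10)
p_large_n = two_sided_p(102.0, 100.0, 15.0, n=1000)

print(f"n = 10:   p = {p_small_n:.3f}")   # large p: too little evidence to doubt the null
print(f"n = 1000: p = {p_large_n:.6f}")   # small p: same difference, much more evidence
```

With only 10 observations the same difference gives a large p-value, which reflects how little evidence there is, not that the null is true.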
Setting up hypotheses
Hypotheses are set up before any data are collected
The research or alternative hypothesis can take one of three forms
Parameter has increased: H1: µ > µ0, where µ0 is the comparator or null value and an increase is hypothesized—this type of test is called an upper-tailed test
Parameter has decreased: H1: µ < µ0, where a decrease is hypothesized—this is called a lower-tailed test
Parameter has changed: H1: µ ≠ µ0, where a difference is hypothesized—this is called a two-tailed test
We are interested in deviations in either direction away from the hypothesized parameter value
When the sampling distribution is symmetric (e.g., z or t), the p-value from a two-sided test is double the smaller of the two one-sided p-values
The exact form of the research hypothesis depends on the investigator’s belief about the parameter of interest (that it has increased, decreased, or differs from the null value)
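The three forms of the alternative hypothesis lead to three different p-values for the same test statistic. The sketch below assumes a z statistic with a standard normal sampling distribution; the function name and the value z = 2.0 are hypothetical.

```python
from statistics import NormalDist

def p_values(z):
    """Upper-tailed, lower-tailed, and two-tailed p-values for a z statistic."""
    std_normal = NormalDist()
    upper = 1 - std_normal.cdf(z)      # H1: mu > mu0 (upper-tailed test)
    lower = std_normal.cdf(z)          # H1: mu < mu0 (lower-tailed test)
    two_sided = 2 * min(upper, lower)  # H1: mu != mu0 (two-tailed test)
    return upper, lower, two_sided

upper, lower, two_sided = p_values(2.0)  # hypothetical observed statistic
print(f"upper = {upper:.4f}, lower = {lower:.4f}, two-tailed = {two_sided:.4f}")
```

Because the standard normal distribution is symmetric, the two-tailed p-value here is twice the smaller one-sided p-value. The one-sided p-value in the opposite direction is close to 1.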