Hypothesis tests for a population mean (o known; o = population standard deviation)
Hypothesis, test statistic, P-value, statistical significance
Two sided tests and confidence intervals
A hypothesis test has a different goal than a confidence interval
Parameter of interest:
& evidence
u, the true population mean
Our goal is to determine whether there is strong enough evidence to support this claim
Our evidence comes from sample data
Our evidence is x-bar, the sample mean
Claims
On population mean, u
We can never prove that a claim is correct
We can’t know for sure that u has any particular value without measuring the entire population
But we try to reach our conclusions with a reasonably high probability of being correct
“True mean”
Population mean
“If the True mean daily vitamin C intake is 75 mg, then what is the probability of observing a sample mean at least as low as 73mg?”
We assume that the population mean really is 75 (u = 75),
Then we know X ~ N → X-bar ~ N(u, o/sqrtn)
P(X-bar <= 73) = P(Z <= (73 - u0)/(o/sqrtn)) = P(Z <= -0.50) = 0.3085
so if the True mean daily vitamin C intake of all female Canadians was 75mg, the probability of observing a sample mean at least as low as 73mg would be 30.85% (“observation is likely to occur again”)
So our evidence that u < 75 is not very strong!
Therefore the evidence is insufficient to support the doctor’s claim
IMPORTANT! We would need a sample mean x-bar even lower than 73mg to be convinced that u < 75,
30.85% is NOT the chance that the mean is 73: it is the chance of observing a sample mean at least as low as 73 if u really is 75. This probability would need to be much lower for us to be convinced.
We are NOT concluding that u = 75, or that the doctor’s claim that u < 75 is wrong
We just don’t have strong enough evidence for us to be convinced that u < 75
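The 30.85% can be checked numerically. A minimal sketch, assuming the test statistic z = -0.50 that these notes report for this example (o and n are not given for the vitamin C data, so the z-score is used directly); Python's standard-library NormalDist stands in for the Z table.

```python
from statistics import NormalDist

# Assumption: z = -0.50 is the standardized value of x-bar = 73 under u = 75,
# as reported in these notes (o and n are not given for this example).
z = -0.50

# Left-tail probability P(Z <= z) from the standard normal distribution.
p_low = NormalDist().cdf(z)

print(round(p_low, 4))  # about 0.3085
```

A probability this large means a sample mean of 73 is not surprising when u = 75.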
u0
assumed value of u
… “Is it possible that the True mean speed at the intersection really is 60km/h, and we observed a sample mean as high as 66km/hr purely by chance?”
We need to ask: if the True mean speed of motorists at the intersection is 60km/hr, then what is the probability of observing a sample mean at least as high as 66km/hr?
→ IMPORTANT NOTE: we look for P(X-bar >= 66), not P(X-bar = 66) (because any single exact value of a continuous variable has zero chance of happening.)
If the probability is low, then we have strong evidence in support of the parent council’s claim
If it is unlikely we observed a sample mean this high purely by chance, then it’s reasonable to conclude that the True mean u really is higher than 60
→ e.g. if the probability is only about 1%, we conclude the True mean is higher than 60 (we do not conclude that it equals the sample mean)
If the probability is high, then we have weak evidence in support of the parent council’s claim
If it is likely that we observed a sample mean this high purely by chance, then this evidence is not strong enough to conclude that the True mean u is higher than 60.
If we assume that u = 60, then by central limit theorem, we know
X-bar ~ approx. N(60, 15/sqrt50)
…0.0023
So if the True mean speed of vehicles at this intersection was 60 km/hr, then the probability of observing a sample mean speed at least as high as 66 would only be 0.23%
So observing a sample mean as high as we did is extremely unlikely
In other words: if u = 60, then observing a sample mean at least as high as 66 is extremely surprising
So our evidence that u > 60 is very strong!
So there is sufficient evidence to support the parent council’s claim. It is reasonable to conclude that u > 60, and so a red light camera is installed.
There is always a possibility that we will be wrong in our conclusion
Maybe the True mean really is 60, and we just happened to have an exceptional sample of unusually fast vehicles!!! The probability of this happening is low (0.23%), but not impossible!
However we are able to conclude in favour of the council’s claim (that u > 60) with a high level of certainty
The foundation and main idea of hypothesis testing can be summarized as follows:
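The 0.23% can be reproduced from the numbers in the example. A small sketch, assuming (as in the notes) u0 = 60, o = 15, n = 50 and an observed sample mean of 66; the variable names are only illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 60      # assumed True mean speed (km/hr)
sigma = 15    # population standard deviation, from N(60, 15/sqrt50)
n = 50        # sample size
x_bar = 66    # observed sample mean

# Standard deviation of the sample mean.
se = sigma / sqrt(n)

# P(X-bar >= 66) when X-bar ~ N(60, se).
p_high = 1 - NormalDist(mu0, se).cdf(x_bar)

print(round(p_high, 4))  # about 0.0023
```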
“If our initial assumption were true, then how likely would it be to observe an estimate this extreme?” , and
“An outcome that would rarely occur if an assumption were true, is good evidence that the assumption is not true”
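One way to see "how likely would it be to observe an estimate this extreme" is to simulate it. This is only a sketch for the speed example: it assumes the sample mean can be drawn directly from N(60, 15/sqrt50), which is what the central limit theorem gives approximately, and it uses a fixed seed so the run is repeatable.

```python
import random
from math import sqrt

rng = random.Random(0)       # fixed seed so the run is repeatable

mu0, sigma, n = 60, 15, 50   # values from the speed example
se = sigma / sqrt(n)
trials = 100_000

# Assumption: each simulated sample mean is drawn directly from N(mu0, se),
# the approximate distribution of X-bar when the initial assumption u = 60 is true.
extreme = sum(1 for _ in range(trials) if rng.gauss(mu0, se) >= 66)

fraction = extreme / trials
print(fraction)  # should land near 0.0023, varying slightly with the seed
```

Only a tiny fraction of simulated sample means reach 66, so an outcome like the one observed would rarely occur if u = 60 were true.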
We are now ready to formalize the process of hypothesis testing, with new vocabulary
Alternative hypothesis
The statement making the claim we are trying to support is called the alternative hypothesis, denoted Ha.
In this unit, the alternative hypothesis is always an inequality in terms of u. this could look like:
Ha: u > u0, or
Ha: u < u0, or
Ha: u /= u0
a hypothesis test is assessing the evidence in favour of the alternative hypothesis… what
e.g. Ha: u < 75
H0 → null hypothesis. the statement being tested in a hypothesis test
this is our initial assumption
in this unit, the null hypothesis is always an equality in terms of u.
H0 : u = u0
H0 is always a statement of “no difference“ or “no effect“
the hypothesis test is assessing the strength of the evidence against the null hypothesis
whart
e.g H0: u = 75
if we assume that the null hypothesis (H0) is true, then the probability of observing a sample mean x-bar at least as high/low/extreme as the one observed is called the P-value of the test
high if Ha : u > u0
low if Ha : u < u0
extreme if Ha : u /= u0
a low P-value means we have strong evidence against the null hypothesis/ in favour of the alternative hypothesis
.3085 is not a low enough P-value to reject H0 (u = 75) and to conclude that u < 75 (i.e conclude in favour of Ha)
in the speeding vehicle example, the P-value was 0.0023. we concluded this was low enough to reject our initial assumption that u = 60 (reject H0) and conclude that u > 60 (i.e. conclude in favour of Ha)
Q: how low does our P-value need to be to reject the null hypothesis in favour of the alternative hypothesis
A: it depends! before we perform a hypothesis test, we choose the level of significance of our test, denoted α (alpha, which looks like a fish, so it is written as “fish” or >°))))彡 in these notes)
if the P-value is less than or equal to “fish”, then we will reject H0 in favour of Ha
if the P-value is greater than fish, then we fail to reject H0
thus fish is the maximum P-value for which the null hypothesis will be rejected
low fish →we need stronger evidence to reject null hypothesis
level of significance (in irl…)
if something is “high stakes“, then you require strong evidence: choose a low value of fish →e.g. meds & vaccines
if something is “low stakes“, then you don’t need evidence to be that strong: higher value fish is okay
… @ course, significance level is given…
common values of fish are 0.10(10%), 0.05(5%), and 0.01(1%)
we rarely would use a value higher than 0.10 → (for the same reason that we almost never use a confidence level less than 90%)
hypothesis test steps, “P-value method“
state lvl of significance (fish) >°))))彡
statement of hypotheses, H0 and Ha
statement of the decision rule (also known as the rejection rule)
the decision rule (rejection rule) is the precise statement of what must happen in order for us to reject the null hypothesis
→ for this method, decision rule is always “reject H0 if P-value <= >°))))彡“
calculation of the test statistic
the test statistic provides a measure of the compatibility between the null hypothesis and our data
in this unit, our test statistic will always be Z:
→z = (x-bar -u0)/(o/sqrtn)… z-score…
calculate the P-value
conclusion
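The steps above can be sketched as one function. This is an illustration under the unit's assumptions (o known, test statistic Z), not an official implementation; the function name z_test and the labels "greater", "less" and "two-sided" are hypothetical names chosen here for Ha: u > u0, Ha: u < u0 and Ha: u /= u0.

```python
from math import sqrt
from statistics import NormalDist

def z_test(x_bar, mu0, sigma, n, alpha, alternative):
    """P-value method for H0: u = mu0 with sigma known.

    alternative is one of "greater", "less", "two-sided"
    (hypothetical labels for Ha: u > u0, u < u0, u /= u0).
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))      # test statistic
    left = NormalDist().cdf(z)                 # P(Z <= z)
    if alternative == "greater":
        p_value = 1 - left
    elif alternative == "less":
        p_value = left
    elif alternative == "two-sided":
        p_value = 2 * min(left, 1 - left)      # double the tail area
    else:
        raise ValueError("unknown alternative")
    reject = p_value <= alpha                  # decision rule
    return z, p_value, reject

# Speed example: x-bar = 66, u0 = 60, o = 15, n = 50, 5% level of significance.
z, p, reject = z_test(66, 60, 15, 50, alpha=0.05, alternative="greater")
print(round(z, 2), round(p, 4), reject)
```

The level of significance is passed in as an argument so that it is fixed before the P-value is computed, matching the order of the steps.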
hypothesis testing (P-value method) example (test of significance ig)
perform a hypothesis test with a 5% level of significance for the car intersection example
solution:
state the level of significance
let >°))))彡 = 0.05
interpretation: “we will be willing to conclude in favour of the council (i.e. that u > 60) only if the P-value is less than or equal to 0.05“
statement of hypotheses
H0 : u = 60 vs Ha : u > 60
in words:
H0: “The true mean speed at the intersection is equal to the posted limit, and no red light camera is needed.”
Ha: “The true mean speed at the intersection is greater than the posted limit, and a red light camera is needed.”
statement of the decision rule
reject H0 if the P-value <= (fish, significance level) = 0.05
calculation of test statistic
z score using (x-bar - u0)/(o/sqrtn)
e.g. z = 2.83
“so the sample mean that we observed is 2.83 std. deviations above our assumed population mean“
calculate P-value
P-value = P(Z >= 2.83)
= 1-P(Z < 2.83)
= 1 - 0.9977
= 0.0023
interpretation: ”if the true mean speed of vehicles at this intersection was 60, the probability of observing a sample mean at least as high as 66 would be 0.0023.”
conclusion
since the P-value = 0.0023 < significance level = 0.05, we reject the null hypothesis in favour of the alternative. At a 5% level of significance, we have sufficient statistical evidence to conclude that the true mean speed of motorists at the intersection is greater than 60km/hr
we call this example a right-sided test, since our alternative hypothesis is of the form Ha: u > u0
statistical significance
results that lead to the rejection of a null hypothesis are said to be statistically significant. yar
a statistically significant result is an effect so large that it would rarely occur by chance alone
for this reason, statistical hypothesis tests are also referred to as tests of significance. yar
hypothesis test:
perform a hypothesis test for the vitamin C example. use fish = 0.01
state the level of significance
let fish = 0.01
interpretation: “we will be willing to conclude in favour of the doctor’s claim (i.e. that u < 75) only if the P-value is less than or equal to 0.01”
statement of hypotheses
H0: u = 75 vs Ha : u < 75
H0: the true mean daily vitamin C intake for all Canadian females is equal to the recommended amount of 75mg
Ha: the true mean daily vitamin C intake for all Canadian females is less than the recommended amount of 75mg.
statement of the decision rule
reject H0 if P-value <= fish = 0.01
“*notice: the decision rule is always the same! the only change is the value of fish”
calculation of test statistic
z = (x-bar - u0)/(o/sqrtn)
z = -0.50
so the sample mean we observed is 0.50 std. deviations below our assumed population mean
calculate the P-value
P-value = P(Z <= -0.50) = 0.3085
“if the true average daily vitamin C intake for Canadian females was 75mg, the probability of observing a sample mean at least as low as 73mg would be 0.3085”
conclusion
since the P-value = 0.3085 > significance level = 0.01, we fail to reject the null hypothesis. at the 1% level of significance, we have insufficient statistical evidence that the true mean vitamin C intake for Canadian females is less than 75mg
IMPORTANT! note: the conclusion has a very specific format you must follow
this is a left-sided test, as the alternative hypothesis is of the form Ha: u < u0
we do not conclude that u = 75
notice we never conclude H0 is true. All we can say is that we do not have enough evidence to reject the null hypothesis
“NEVER SAY “ACCEPT H0“!!!” whart
P-values for left and right-sided tests
the only significant differences between left- and right-sided hypothesis tests are the alternative hypothesis, and how we calculate the P-value
a simple rule for remembering the difference between calculating the P-value for left- and right-sided tests:
the direction of the inequality in the P-value is the same as the direction of the inequality in the alternative hypothesis
Ha: u < u0 → P-value = P(Z < test stat.)
Ha: u > u0 → P-value = P(Z > test stat.)
IMPORTANT!: template for conclusion (P-value method)
since the P-value = (#) (<= / >) sig. level = (#), we (reject/fail to reject) H0
at the (#) level of significance, we have (sufficient/insufficient) evidence that (Ha) is true
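The template can be filled in mechanically. A small sketch; the function name conclusion and its claim argument (Ha written out in words) are hypothetical, and the wording simply follows the template above.

```python
def conclusion(p_value, alpha, claim):
    """Fill in the conclusion template for the P-value method.

    claim is Ha written in words (a hypothetical free-text argument).
    """
    level = f"{alpha:.0%}"   # e.g. 0.01 -> "1%"
    if p_value <= alpha:
        return (f"Since the P-value = {p_value} <= sig. level = {alpha}, "
                f"we reject H0. At the {level} level of significance, "
                f"we have sufficient evidence that {claim}.")
    return (f"Since the P-value = {p_value} > sig. level = {alpha}, "
            f"we fail to reject H0. At the {level} level of significance, "
            f"we have insufficient evidence that {claim}.")

print(conclusion(0.3085, 0.01,
                 "the true mean vitamin C intake is less than 75mg"))
```

Note that neither branch ever says "accept H0": the only two outcomes are "reject" and "fail to reject".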
two-sided tests - whart
the last type of hypothesis test we’ll look at is a two-sided test. this is a hypothesis test where the alternative hypothesis is of the form:
Ha : u /= u0
in this case, the P-value is the probability of observing a value of the sample mean at least as extreme (in either direction), (given that the null hypothesis is true)
“at least as extreme“means “at least as far away from our hypothesized population mean, u0 “
we find the P-value by doubling the probability to the left or right of our test statistic z (whichever probability is lower, i.e. whichever side is the “tail“)
(double left or right to find P-value)
z = test statistic
e.g. bottling company, underfilled/ overfilled
two sided hypothesis test: bottling company
statement of level of significance
let level of significance = 0.05
statement of hypotheses
H0 : u = 500 vs Ha : u /= 500
H0: the true mean volume of root beer in all bottles in the shipment is 500 ml
Ha: the true mean volume of root beer in all bottles in the shipment differs from 500ml
statement of the decision rule
reject H0 if P-value <= sig level = 0.05
calculation of test statistic
z = (x-bar - u0)/(o/sqrtn)
z = 1.81
calculate the P-value
P-value = 2P(Z >= 1.81) = 2(1 - P(Z < 1.81))
= 2(1 - 0.9649)
= 2(0.0351)
= 0.0702
if the true mean fill volume was 500ml, the probability of observing a sample mean at least as extreme (as far away from u0 = 500) as 502ml would be 0.0702
conclusion
since the P-value = 0.0702 > sig. level = 0.05, we fail to reject H0. at the 5% level of significance, we have insufficient evidence that the true mean of all bottles in the shipment differs from 500ml.
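The two-sided P-value can be checked the same way. A sketch that starts from the test statistic z = 1.81 given in the example (o and n are not listed in these notes, so they are not assumed here).

```python
from statistics import NormalDist

z = 1.81        # test statistic from the bottling example
alpha = 0.05    # level of significance chosen in step 1

# Upper-tail area P(Z >= 1.81), then doubled for the two-sided test.
tail = 1 - NormalDist().cdf(z)
p_value = 2 * tail

# Prints about 0.0703; the table-based answer in the notes is 0.0702 because
# the Z table rounds P(Z < 1.81) to 0.9649 before doubling.
print(round(p_value, 4), p_value <= alpha)
```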
template for P-value Interpretation
“if the (null hypothesis) were true, the probability of observing a sample mean at least as high/low/extreme as (sample mean observed) would be (P-value)“
high = for right-sided test
low = for left-sided test
extreme = for 2-sided test
P-value calculations
to summarize: the P-value for testing H0: u = u0 vs:
Ha: u > u0 is P(Z > z)
Ha: u < u0 is P(Z < z)
Ha: u /= u0 is 2P(Z > |z|)
disregard |z| if you don’t know. just remember the P-value is double the tail area
these P-values are exact if the population is normal, and approximate for large sample sizes in other cases (by the CLT)
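"Double the tail area" and 2P(Z > |z|) are the same calculation. A quick sketch that compares the two for a few z values; the helper names are hypothetical.

```python
from statistics import NormalDist

def two_sided_by_tail(z):
    """Double whichever side of z has the smaller probability."""
    left = NormalDist().cdf(z)
    return 2 * min(left, 1 - left)

def two_sided_by_abs(z):
    """The summary formula 2P(Z > |z|)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

for z in (1.81, -1.81, 0.0, -0.50):
    print(z, round(two_sided_by_tail(z), 4), round(two_sided_by_abs(z), 4))
```

Both columns agree for every z, and the sign of z does not matter for a two-sided test.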
verifying results
notice that in the root beer example, the company doesn't want to see the null hypothesis be rejected! →maybe take another sample? or make changes to the fish?
…. no! unethical. already chose the sig. level. already got the population mean and sample mean, std. dev, normal distribution. easy.
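A short illustration of why fish has to be chosen before the test: the same root beer P-value leads to different decisions at different levels. This only applies the decision rule to the P-value 0.0702 from the example at the three common levels listed earlier.

```python
p_value = 0.0702   # two-sided P-value from the root beer example

# Apply the same decision rule, "reject H0 if P-value <= alpha",
# at each of the common significance levels.
decisions = {alpha: p_value <= alpha for alpha in (0.10, 0.05, 0.01)}

for alpha, reject in decisions.items():
    print(alpha, "reject H0" if reject else "fail to reject H0")
```

H0 would be rejected at 0.10 but not at 0.05 or 0.01, so picking the level after seeing the P-value would let the data decide the conclusion you wanted.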