Unit 9 - Testing Claims

Last updated 10:50 PM on 5/19/26

26 Terms

New cards

significance test

formal procedure for comparing observed data with a claim (hypothesis) abt the population

^assessing a claim abt a parameter w/ data

New cards

null hypothesis

claim we weigh evidence against in a statistical test (H₀, the claim itself)

H₀: parameter = value

New cards

alternative hypothesis

claim abt the population that we are trying to find evidence for (H_a)

H_a: parameter </>/≠ value

^define parameter

New cards

alternative hypotheses:

one-sided vs. two-sided

states that the parameter is larger or smaller than the null hypothesis value (use > or <) (one-sided b/c the shaded area on the curve is either below or above the value)

vs.

states the parameter is different from the null hypothesis value (in general; could be either larger or smaller) (use ≠) (“may differ”) (two-sided b/c the curve is shaded both above and below the value. ex: get z=1, shade above 1 and below -1)

New cards

!!! (flip to see ‘perfect’ significance test steps)
hypotheses are made before collecting data

H₀ and H_a are essentially opposites (only 1 can be true)

hypotheses always refer to a population (use population parameter like p, μ)

an outcome that would rarely happen if the null hypothesis were true is good evidence that the null hypothesis is not true

(need multiple samples of evidence to be convincing: simulate many many times)

don’t say ‘this proves []’ or ‘the null hypothesis is correct’

careful w/ what n is
if simulate N samples of size n, n is the sample size to use! also, if given σ from this data, use this value!
if know σ, use z and σ for mean! (1-sample z test for a mean!) → z = (x̄-μ₀)/(σ/√n)
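
A minimal sketch of the known-σ formula above, z = (x̄-μ₀)/(σ/√n). The function name `one_mean_z` and all the numbers are made up for illustration; they are not from the notes.

```python
from math import sqrt
from statistics import NormalDist

def one_mean_z(x_bar, mu_0, sigma, n):
    """z statistic for a test about a mean when the population SD sigma is known."""
    return (x_bar - mu_0) / (sigma / sqrt(n))

# Hypothetical numbers: H0: mu = 100, known sigma = 15, n = 36, sample mean 105
z = one_mean_z(x_bar=105, mu_0=100, sigma=15, n=36)

# P-value for Ha: mu > 100 is the area above z on the standard normal curve
p_value = 1 - NormalDist().cdf(z)
print(round(z, 2), round(p_value, 4))
```

Note that n here is the size of one sample (36), not the number of simulated samples.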

New cards

P-value

the probability (computed assuming H₀ is true) that the statistic would take a value as extreme or more extreme than the one actually observed
^!probability of getting a sample statistic at least as extreme as the one observed if null is true

small P-value = good evidence against the null b/c observed result is unlikely to happen when H₀ is true

small P-value → not likely to happen, reject H₀ → convincing evidence for H_a

large P-value → likely to happen, fail to reject H₀ → not convincing evidence for H_a (doesn’t mean you accept the null, just that there’s no convincing evidence against it)
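
One way to see what the P-value measures is to simulate many samples assuming H₀ is true and count how often the result is at least as extreme as the observed one. This is only a sketch: the function name `simulated_p_value`, the seed, and the example numbers (60 successes out of 100, H₀: p = 0.5) are assumptions for illustration.

```python
import random

def simulated_p_value(observed_count, n, p_0, trials=5000, seed=1):
    """Estimate P(count >= observed_count) assuming H0: p = p_0 is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    as_extreme = 0
    for _ in range(trials):
        # One simulated sample of size n drawn under H0
        count = sum(1 for i in range(n) if rng.random() < p_0)
        if count >= observed_count:
            as_extreme += 1
    return as_extreme / trials

# Hypothetical example: H0: p = 0.5, Ha: p > 0.5, observed 60 successes in n = 100
p_value = simulated_p_value(observed_count=60, n=100, p_0=0.5)
print(p_value)
```

A small estimate means a count of 60 or more rarely happens when H₀ is true, which is the evidence against H₀.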

New cards

significance level α

fixed value that we regard as decisive that is used to compare against the P-value
^default use 0.05
^want lower significance level if extreme consequence e.g. death (so less likely Type I error and harder to reject H₀ in case it’s true)

New cards

statistically significant

P-value < α

^supports H_a

P-value < α → supports H_a, reject H₀ → convincing evidence for H_a

P-value ≥ α → fail to reject H₀ → not convincing evidence for H_a
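
The decision rule above can be sketched as a tiny function. The name `conclude` and the wording of the returned strings are hypothetical; on a real answer the conclusion still has to be written in context.

```python
def conclude(p_value, alpha=0.05):
    """Compare the P-value to alpha and return the matching decision."""
    if p_value < alpha:
        return "reject H0: convincing evidence for Ha"
    # P-value >= alpha: never say 'accept H0'
    return "fail to reject H0: not convincing evidence for Ha"

print(conclude(0.03))        # P-value below the default alpha of 0.05
print(conclude(0.03, 0.01))  # same P-value, stricter alpha
```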

New cards

Interpret P-value

>/<: Assuming the (H₀ in context is parameter=[] or lower/greater (opposite of H_a!)), the probability of getting a sample (statistic) of [sample statistic value] or greater/smaller (match H_a) by chance is (P-value).

≠: Assuming the (H₀ in context is parameter=[]), the probability of getting a sample (statistic) that is at least as different from (H₀ parameter value) as the proportion/mean in the sample is (P-value).

*assuming H₀ is true (can say sample statistic/proportion/mean)

New cards

Conclusion for significance test

Since P-value = [], which is < / ≥ α = [], we reject/fail to reject H₀. We do/don’t have convincing evidence to support (H_a in context).
(only need to say H₀, don’t have to define)

New cards

Type I vs. Type II error

reject null when it is actually true (probability of Type I error = α as a percent) (believe H_a over H₀) (false positive, say H_a is true but it is false)
^find convincing evidence of H_a incr/decr when it really hasn’t

vs.

fail to reject the null when it is false (believe H₀ over H_a) (false negative, deny H_a but it is true)
^don’t find convincing evidence for H_a incr/decr when it really has
^probability of Type II error (β) = # of false negatives over the # of times H₀ is false

BOTH: say the error and the consequence(s)

New cards

Relate α and probability of Type I error

incr α, incr probability of Type I error, more likely to reject H₀

decr α, less probability of Type I error, less likely (aka harder) to reject H₀ (b/c making test stricter and need stronger evidence to reject null & show statistical significance)

^probability is # of false positives over # of times H0 is true
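
A rough simulation of the idea that P(Type I error) = α: generate many samples with H₀ actually true and count how often the test rejects anyway. Everything here is an assumption for illustration (a one-sided z test for a mean with known σ, μ₀ = 50, σ = 10, n = 25, and the name `type_i_error_rate`).

```python
import random
from math import sqrt
from statistics import NormalDist, mean

def type_i_error_rate(alpha, n=25, trials=4000, seed=0):
    """Fraction of samples that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    mu_0, sigma = 50.0, 10.0                  # H0 is true: data really have mean mu_0
    z_crit = NormalDist().inv_cdf(1 - alpha)  # cutoff for Ha: mu > mu_0
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu_0, sigma) for _ in range(n)]
        z = (mean(sample) - mu_0) / (sigma / sqrt(n))
        if z > z_crit:
            rejections += 1                   # false positive
    return rejections / trials

rate_05 = type_i_error_rate(0.05)
rate_01 = type_i_error_rate(0.01)
print(rate_05, rate_01)
```

The false-positive rate should land near α, and lowering α lowers it.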

New cards

Conditions for significance test abt a proportion

randomness (ex: SRS) (say ‘random sample’ and quote the Q)
10% condition (n≤0.1N)
Large Counts condition (LCC) (np≥10 & n(1-p)≥10) (use p (H₀ value) NOT p-hat! p-hat is for a confidence interval)
- don’t meet LCC? - look at distribution of sample data, see relatively normal/no strong skew or outliers (say “since [] shows relative normality/no strong skew, I can proceed w/ z test using caution”)
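
The two numeric conditions above can be checked with a short helper. This is a sketch with a hypothetical name (`check_proportion_conditions`); the randomness condition cannot be checked by code, it has to come from how the data were collected.

```python
def check_proportion_conditions(n, p_0, population_size=None):
    """Check Large Counts (using the H0 value p_0) and the 10% condition."""
    large_counts = n * p_0 >= 10 and n * (1 - p_0) >= 10
    # If N is not given as a number, the 10% condition is left undecided (None)
    ten_percent = None if population_size is None else n <= 0.1 * population_size
    return {"large_counts": large_counts, "ten_percent": ten_percent}

# Hypothetical examples
ok = check_proportion_conditions(n=100, p_0=0.37, population_size=5000)
too_small = check_proportion_conditions(n=20, p_0=0.1)
print(ok, too_small)
```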

New cards

test statistic (proportion)

measures how far a sample statistic diverges from what we would expect if H₀ is true, in standardized units (a z-score)

test statistic = (statistic - parameter)/SD of statistic

z = (p̂-p)/√[p(1-p)/n]

^parameter is μ_p̂ (which is p), SD of statistic is σ_p̂ from sampling distribution of proportion
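
A minimal sketch of the z formula above together with the matching P-value for each kind of H_a. The function name `one_prop_z_test`, the labels "less"/"greater"/"two-sided", and the example counts (43 successes out of 100, H₀: p = 0.37) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p_0, alternative):
    """Return (p_hat, z, p_value) for a 1-proportion z test."""
    p_hat = x / n
    z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)  # SD uses the H0 value p_0
    cdf = NormalDist().cdf
    if alternative == "less":
        p_value = cdf(z)
    elif alternative == "greater":
        p_value = 1 - cdf(z)
    elif alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return p_hat, z, p_value

# Hypothetical example: H0: p = 0.37, Ha: p > 0.37, 43 successes in n = 100
p_hat, z, p_value = one_prop_z_test(x=43, n=100, p_0=0.37, alternative="greater")
print(round(p_hat, 2), round(z, 3), round(p_value, 3))
```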

New cards

Significance Test Steps
(*see how to draw curve; p₀ is the null value; written for p but the same steps apply for μ)

State

write hypotheses, define parameter (p, μ) & significance level α

Plan

check conditions (random sample - quote what the question says, 10% - say ‘reasonable to assume (n≤0.1N)’ if N is words/description)
name the proper inference procedure/test (“all conditions met to proceed with 1-proportion z-test”; “all conditions met to proceed with 1-mean t-test”)

Draw curve for significance test (shade area that matches H_a)
- proportion: get z-score, can use standard normal curve & label it N(0,1). Draw mean of 0, SDs of 1, and tick w/ z-score
- mean: get t, draw curve with 0 in middle as mean, label curve as t with degrees of freedom as subscript (ex: t₃₅). Put mean 0, SDs of 1, & tick at t
show work to get test statistic (write out formula for z or t & w/ #s plugged in) and P-value (write probability statement that matches the curve/H_a)
ex: probability statements for P-value w/ test statistic z = 2:
for Ha of <: P(z<2)
for Ha of >: P(z>2)
for Ha of ≠: P(z<-2 or z>2) = 2P(z>|2|)
^do same for t; can be any # you get for t/z
- proportion: use normalcdf(lower, upper, 0, 1) to get P-value. If H_a says less than, then test statistic is the upper bound. If H_a says greater than, then test statistic is the lower bound
  - ex: H_a: p > 0.37, get test statistic of 1.3, do normalcdf(1.3, 100, 0, 1)
- OR use [stat] Tests ‘1-PropZTest’ (plug in null hypothesis value for p0, x for # of successes, n for sample size (x/n is the sample proportion), prop is what Ha is) to get p (P-value), p̂ (x/n), z (test statistic), n (again) + says Ha at the top of the screen
- mean: use tcdf(lower, upper, df) to get P-value (only use values of t!!!)
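
For checking calculator work, here is a sketch of what the calculator's normalcdf(lower, upper, mean, SD) computes, written with Python's standard library. The Python function `normalcdf` below is a hypothetical stand-in, not the calculator's own code; the z = 1.3 example is the one from the notes.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)

z = 1.3
p_greater = normalcdf(z, 100)                # Ha says >: test statistic is the lower bound
p_less = normalcdf(-100, z)                  # Ha says <: test statistic is the upper bound
p_two_sided = 2 * normalcdf(abs(z), 100)     # Ha says not equal: double one tail
print(round(p_greater, 4), round(p_less, 4), round(p_two_sided, 4))
```

Using 100 and -100 as the far bounds mirrors the calculator habit of picking a number many SDs out.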

Conclude

answer question by comparing P-value and α

New cards

!!! ≠ is vague, don’t know if > or < → use confidence interval for two-sided hypothesis (≠) (confidence intervals can be used for one-sided or two-sided alternative hypotheses)
^z* use invNorm w/ center area C% as a decimal (if invNorm only takes a left-tail area, use (1 + C% as decimal)/2)
^t* use invT w/ area as only one tail → (1-C% as decimal)/2 (gives -t*, so drop the negative sign)
^α determines confidence level C% → C% = 100(1-α)% (for a two-sided test)
^CI has null value → fail to reject b/c H₀ is plausible
^CI doesn't have null value → statistically significant, can reject null
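
A sketch of the CI check above for a proportion: get z* from the confidence level, build the interval, and see whether the null value is inside. The function name `proportion_ci` and the numbers (43 successes out of 100, null value 0.37, α = 0.05) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence):
    """Return (low, high, z_star) for a one-sample z interval for p."""
    p_hat = x / n
    # Left-tail area that leaves (1 - C)/2 in the upper tail
    z_star = NormalDist().inv_cdf((1 + confidence) / 2)
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)  # CI uses p_hat, not p_0
    return p_hat - margin, p_hat + margin, z_star

alpha = 0.05
low, high, z_star = proportion_ci(x=43, n=100, confidence=1 - alpha)

null_value = 0.37  # hypothetical H0: p = 0.37 vs Ha: p != 0.37
decision = "fail to reject H0" if low <= null_value <= high else "reject H0"
print(round(low, 3), round(high, 3), decision)
```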

careful for H_a - read question if says less than(<)/greater than(>)/different (≠)

for mean, use t-distribution w/ n-1 degrees of freedom (NOT standard normal curve like for proportion)

conservative with degrees of freedom, round down

don’t use absolutes e.g. prove

New cards

power of a test

against a specific alternative is the probability that the test will reject H₀ at α given the H_a is true
^probability of correctly rejecting H₀ when H₀ is false/probability of finding convincing evidence for H_a when H_a is true

interpret power: probability of correctly rejecting H₀ when it is false is (power), and can conclude that the true p/μ (in context) is (H_a)
+given that (H_a w/ context), if apply test on repeated samples of the same size, about (power %) of the samples we would expect to correctly reject the null in favor of the alternative

power = 1-β (β is probability of making a Type II error (fail to reject H₀ when H₀ is false))

*want higher power so more likely to correctly reject the null

New cards

What affects power?

Sample size (incr n, incr power)
α (incr α, incr power b/c more likely to reject null)
value of the alternative parameter AKA difference btwn hypothesized and true mean (incr this, incr power)
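
The three effects above can be checked numerically. This sketch assumes a one-sided z test for a mean with known σ (H_a: μ > μ₀), where power = 1 - Φ(z_α - (μ_a - μ₀)/(σ/√n)); the function name `power_one_sided_z` and all the numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_one_sided_z(mu_0, mu_a, sigma, n, alpha):
    """Power of a z test of H0: mu = mu_0 vs Ha: mu > mu_0 when the true mean is mu_a."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # reject H0 when z > z_crit
    shift = (mu_a - mu_0) / (sigma / sqrt(n))   # how far the true mean sits from mu_0, in SDs of x-bar
    return 1 - std_normal.cdf(z_crit - shift)

base = power_one_sided_z(mu_0=100, mu_a=105, sigma=15, n=36, alpha=0.05)
bigger_n = power_one_sided_z(mu_0=100, mu_a=105, sigma=15, n=100, alpha=0.05)
bigger_alpha = power_one_sided_z(mu_0=100, mu_a=105, sigma=15, n=36, alpha=0.10)
bigger_diff = power_one_sided_z(mu_0=100, mu_a=110, sigma=15, n=36, alpha=0.05)
beta = 1 - base  # probability of a Type II error in the base case

print(round(base, 3), round(bigger_n, 3), round(bigger_alpha, 3), round(bigger_diff, 3))
```

Each change (larger n, larger α, larger difference) should give a power above the base case.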

New cards

How to avoid type I and type II error

decr α (probability of type I error)

vs.

incr α
^b/c α is likelihood to reject null. for type I want to decr α so less likely to reject null in case it’s true. for type II want to incr α so more likely to reject null in case it’s false.

New cards

How to get p-value for H_a: [] ≠ #

do 2 times the p-value you get from 1-PropZTest/normalcdf/tcdf (2 times the probability of just one side) (b/c don’t know if > or <, so want both sides (as much difference since use z or t))

New cards

Conditions for significance test abt mean

random sample (quote the Q)
10% (n≤0.1N)
(CLT:) pop. has normal distrib. OR n≥30 OR distrib. of sample data has no strong skew/outliers

New cards

test statistic (mean)

t = (statistic-parameter)/SD of statistic = (x̄-μ₀)/(s_x/√n)
^use s_x if have sample SD and no population SD
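
A sketch of the t statistic with s_x in place of σ, plus a hand-rolled stand-in for tcdf so the example needs only the standard library. `t_cdf` here is a numerical approximation (Simpson's rule), not a library function, and `math.gamma` overflows for very large df; the sample data are made up.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(t, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=2000):
    """P(T <= t): Simpson's rule on [0, |t|], then symmetry. steps must be even."""
    if t == 0:
        return 0.5
    h = abs(t) / steps
    total = t_pdf(0.0, df) + t_pdf(abs(t), df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area

def one_mean_t(sample, mu_0):
    """Return (t, df) using the sample SD s_x."""
    n = len(sample)
    t = (mean(sample) - mu_0) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical data: H0: mu = 12, Ha: mu > 12
t, df = one_mean_t([12, 15, 11, 14, 13], mu_0=12)
p_value = 1 - t_cdf(t, df)  # upper-tail area matches Ha: mu > 12
print(round(t, 3), df, round(p_value, 3))
```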

New cards

paired data

study designs that involve making 2 observations on the same individual (count n by # of individuals) OR one observation on each of 2 similar individuals (count n by # of pairs, like a block!)

^for comparative studies, for mean difference
^use a matched pairs design!

New cards

paired t procedures

paired data from measuring the same quantitative variable twice (usually test a product/program/etc, compare results, for mean difference) (after check conditions)

New cards

!!! smaller significance level α -> need stronger evidence to reject the null hypothesis
higher power gives better chance to detect a difference when it really exists.
at any significance level/desired power, need larger sample to detect a small diff btwn null and alternative parameter values than detecting a large diff

statistically significant → good evidence of a difference (can be very small) (for mean difference)
^large sample -> even small deviation from null can be significant

statistical significance has meaning (only) if you decide what diff ur seeking, design a study to search for it, and use a significance test to weigh the evidence u get

most important condition for sound conclusions from statistical inference is that the data comes from a well-designed random sample OR randomized experiment

statistical test more likely to find a significant increase in mean [] if very large sample and use 5% sig level (5% b/c less strict) (large sample incr statistical power, making tests highly sensitive to even trivial differences)

use t for test abt pop mean b/c z requires u know pop SD of σ

New cards

What to know about mean difference

null usually H₀: μ_diff = 0. look at what u are subtracting to see what the H_a is (ex: x-y, want y to be bigger, so Ha: μ_diff < 0)
when define parameter, write what you are subtracting
interpret P-value: assuming the mean diff is 0 (the avgs are the same for both things you are doing subtraction with), the probability of getting a mean difference of [] or smaller/bigger (match H_a) is [P-value]
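
A sketch of the paired setup: subtract within each pair first, then run a one-sample t on the differences. The before/after numbers and the function name `paired_t` are made up for illustration; the order of subtraction (before - after) is a choice that has to match how the parameter is defined.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (mean_diff, t, df) for H0: mu_diff = 0 using differences first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)  # n counts pairs, not individual measurements
    mean_diff = mean(diffs)
    t = (mean_diff - 0) / (stdev(diffs) / sqrt(n))
    return mean_diff, t, n - 1

# Hypothetical scores for 5 individuals measured twice
before = [10, 12, 9, 11, 13]
after = [12, 13, 11, 12, 15]

# before - after: if 'after' tends to be bigger, differences are negative, so Ha: mu_diff < 0
mean_diff, t, df = paired_t(before, after)
print(round(mean_diff, 2), round(t, 3), df)
```

The P-value would then come from tcdf with the lower tail shaded, since H_a here is μ_diff < 0.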