Unit 9 - Testing Claims

0.0(0)
Studied by 0 people
call kaiCall Kai
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
GameKnowt Play
Card Sorting

1/25

encourage image

There's no tags or description

Looks like no tags are added yet.

Last updated 10:50 PM on 5/19/26
Name
Mastery
Learn
Test
Matching
Spaced
Call with Kai

No analytics yet

Send a link to your students to track their progress

26 Terms

1
New cards

significance test

formal procedure for comparing observed data with a claim (hypothesis) abt the population

^assessing a claim abt a parameter w/ data

2
New cards

null hypothesis

claim we weigh evidence against in a statistical test (H0, the claim itself)

H0: parameter = value

3
New cards

alternative hypothesis

claim abt the population that we are trying to find evidence for (Ha)

Ha: parameter </>/≠ value

^define parameter

4
New cards

alternative hypotheses:

one-sided vs. two-sided

states larger or smaller than the null hypothesis value (use > or <) (one-sided b/c on curve either below/above value)

vs.

states the parameter is different from the null hypothesis value (in general; could be either larger or smaller) (use ≠) (“may differ”) (two-sided b/c on curve shows above and below the value. ex: get z=1, do above 1 and below -1)

5
New cards

!!! (flip to see ‘perfect’ significance test steps)
hypotheses made before collect data

H0 and Ha are essentially opposites (only 1 can be true)

hypotheses always refer to a population (use population parameter like p, μ)

an outcome that would rarely happen if the null hypothesis were true is good evidence that the null hypothesis is not true

need multiple samples of evidence to be convincing (simulate many many times)

don’t say ‘this proves []’ or ‘the null hypothesis is correct’

careful w/ what n is
if simulate N samples of size n, n is the sample size to use! also, if give σ from this data, use this value!
if know σ, use z and σ for mean! (1-mean z test!) → (x̄-μ)/(σ/√n)

knowt flashcard image
6
New cards

P-value

the probability (computed assuming H0 is true) that the statistic would take a value as extreme or more extreme than the one actually observed
^!probability of getting a particular sample statistic if null is true

small P-value = good evidence against the null b/c observed result is unlikely to happen when H0 is true

small P-value → not likely to happen, reject H0 → convincing evidence for Ha

large P-value → likely to happen, fail to reject H0 → not convincing evidence for Ha (doesn’t mean you accept the null, just that there’s no convincing evidence against it)

7
New cards

significance level α

fixed value that we regard as decisive that is used to compare against the P-value
^default use 0.05
^want lower significance level if extreme consequence e.g. death (so less likely Type I error and harder to reject H0 in case it’s true)

8
New cards

statistically significant

P-value < α

^supports Ha

.

P-value < α → supports Ha, reject H0 → convincing evidence for Ha

P-value ≥ α → fail to reject H0 → not convincing evidence for Ha

9
New cards

Interpret P-value

>/<: Assuming the (H0 in context is parameter=[] or lower/greater (opposite of Ha!)), the probability of getting a sample (statistic) of [sample statistic value] or greater/smaller (match Ha) by chance is (P-value).

≠: Assuming the (H0 in context is parameter=[]), the probability of getting a sample (statistic) that is at least as different from (H0 parameter value) as the proportion/mean in the sample is (P-value).

*assuming H0 is true ^can say sample statistic/proportion/mean

10
New cards

Conclusion for significance test

Since P-value = [], which is </> α = [], we reject/fail to reject H0. We do/don’t have convincing evidence to support (Ha in context).
(only need to say H0, don’t have to define)

11
New cards

Type I vs. Type II error

reject null when it is actually true (probability of Type I error = α as a percent) (believe Ha over H0) (false positive, say Ha is true but it is false)
^find convincing evidence of Ha incr/decr when it really hasn’t

vs.

fail to reject the null when it is false (believe H0 over Ha) (false negative, deny Ha but it is true)
^don’t find convincing evidence for Ha incr/decr when it really has
^probability is #s of type II error (# of false negatives) over the # of times when H0 is false

.

BOTH: say the error and the consequence(s)

12
New cards

Relate α and probability of Type I error

incr α, incr probability of Type I error, more likely to reject H0

decr α, less probability of Type I error, less likely (aka harder) to reject H0 (b/c making test stricter and need stronger evidence to reject null & show statistical significance)

^probability is # of false positives over # of times H0 is true

13
New cards

Conditions for significance test abt a proportion

  • randomness (ex: SRS) (say ‘random sample’ and quote the Q)

  • 10% condition (n≤0.1N)

  • LCC (np≥10 & n(1-p)≥10) (use p (H0 value) NOT p-hat! would use p-hat for confidence interval)

    • don’t meet LCC? - look at distribution of sample data, see relatively normal/no strong skew or outliers (say “since [] shows relative normality/no strong skew, I can proceed w/ z test using caution”)

14
New cards

test statistic (proportion)

measures how far a sample statistic diverges from what we would expect if H0 is true, in standardized units (a z-score)

test statistic = (statistic - parameter)/SD of statistic

z = (p̂-p)/√[(p(1-p)/n]

^parameter is μ (which is p), SD of statistic is σfrom sampling distribution of proportion

<p>measures how far a sample statistic diverges from what we would expect if H<sub>0</sub> is true, in standardized units (<u>a z-score</u>)</p><p>test statistic = (statistic - parameter)/SD of statistic</p><p><span style="color: rgb(255, 255, 255);"><mark data-color="yellow" style="background-color: yellow; color: inherit;">z = (p̂-p)/√[(p(1-p)/n]</mark></span></p><p>^parameter is μ<sub>p̂</sub> (which is p), SD of statistic is σ<sub>p̂ </sub>from sampling distribution of proportion</p>
15
New cards
<p>Significance Test Steps <br><sub><sup>*see how to draw curve, p0 is null, says p but is same for μ</sup></sub></p>

Significance Test Steps
*see how to draw curve, p0 is null, says p but is same for μ

State

  • write hypotheses, define parameter (p, μ) & significance level α

Plan

  • check conditions (random sample - quote what the question says, 10% - say ‘reasonable to assume (n≤0.1N)’ if N is words/description)

  • name the proper inference procedure/test (“all conditions met to proceed with 1-proportion z-test”; “all conditions met to proceed with 1-mean t-test”)

Do

  • Draw curve for significance test (shade area that matches Ha)

    • proportion: get z-score, can use standard normal curve & label (N(0,1). Draw mean of 0, SDs of 1, and tick w/ z-score)

    • mean: get t, draw curve with 0 in middle as mean, label curve as t with degrees of freedom as subscript (ex: t35). Put mean 0, SDs of 1, & tick is t

  • show work to get test statistic (write out formula for z or t & w/ #s plugged in) and P-value (write probability statement that matches the curve/Ha → ex: probability statement for p-value:

    for Ha of <: P(z>2)

    for Ha of >: P(z<2)

    for Ha of ≠: P(z<-|2| or z>|2|) = 2P(z>|2|)

    ^do same for t; can be any # you get for t/z

    • proportion: use normalcdf to get P-value. If Ha says less than, then test statistic is upper. if Ha says greater than, then test statistic is lower #)

      • ex: Ha: p > 0.37, get test statistic of 1.3, do normalcdf (1.3, 100, 0, 1)

    • OR use [stat] Tests ‘1-PropZTest’ (plug in null hypothesis value for p0, x for # of things, n for sample size (x/n is the proportion), prop is what Ha is) to get p (p-value), p̂ (x/n), z (test statistic), n (again) + says Ha at the top of the screen)

    • mean: use tcdf to get P-value (only use values of t!!!)

Conclude

  • answer question by comparing P-value and α

16
New cards
<p>!!! ≠ is vague, don’t know if &gt; or &lt; → use confidence interval for two-sided hypothesis (≠) (confidence intervals can be used for one-sided or two-sided alternative hypotheses)<br>^z* use invNorm w/ area as C% as a decimal<br>^t* use invT w/ area as only one tail → (1-C% as decimal)/2<br>^α determines confidence level C% → C% = 100(1-α)%<br>^CI has null value -&gt; fail to reject bc H<sub>0</sub> is plausible<br>^CI doesn't have null value -&gt; statistically significant, can reject null</p><p>careful for H<sub>a</sub> - read question if says less than(&lt;)/greater than(&gt;)/different (≠)</p><p>for mean, use t-distribution w/ n-1 degrees of freedom (NOT standard normal curve like for proportion)</p><p>conservative with degrees of freedom, <u>round down</u></p><p>don’t use absolutes e.g. prove</p>

!!! ≠ is vague, don’t know if > or < → use confidence interval for two-sided hypothesis (≠) (confidence intervals can be used for one-sided or two-sided alternative hypotheses)
^z* use invNorm w/ area as C% as a decimal
^t* use invT w/ area as only one tail → (1-C% as decimal)/2
^α determines confidence level C% → C% = 100(1-α)%
^CI has null value -> fail to reject bc H0 is plausible
^CI doesn't have null value -> statistically significant, can reject null

careful for Ha - read question if says less than(<)/greater than(>)/different (≠)

for mean, use t-distribution w/ n-1 degrees of freedom (NOT standard normal curve like for proportion)

conservative with degrees of freedom, round down

don’t use absolutes e.g. prove

17
New cards

power of a test

against a specific alternative is the probability that the test will reject H0 at α given the Ha is true
^probability of correctly rejecting H0 when H0 is false/probability of finding convincing evidence for Ha when Ha is true

interpret power: probability of correctly rejecting H0 when it is false is (power), and can conclude that the true p/μ (in context) is (Ha)
+given that (Ha w/ context), if apply test on repeated samples of the same size, about (power %) of the samples we would expect to correctly reject the null in favor of the alternative

power = 1-β (β is probability of making a Type II error (fail to reject H0 when H0 is false))

*want higher power so more likely to correctly reject the null

18
New cards

What affects power?

  • Sample size (incr n, incr power)

  • α (incr α, incr power b/c more likely to reject null)

  • value of the alternative parameter AKA difference btwn hypothesized and true mean (incr this, incr power)

19
New cards

How to avoid type I and type II error

decr α (probability of type I error)

vs.

incr α
^b/c α is likelihood to reject null. for type I want to decr α so less likely to reject null in case it’s true. for type II want to incr α so more likely to reject null in case it’s false.

20
New cards

How to get p-value for Ha: [] ≠ #

do 2 times the p-value you get from 1-PropZTest/normalcdf/tcdf (2 times the probability of just one side) (b/c don’t know if > or <, so want both sides (as much difference since use z or t))

<p>do 2 times the p-value you get from <sub><sup>1-PropZTest/</sup></sub>normalcdf/tcdf <sub><sup>(2 times the probability of just one side) </sup></sub>(b/c don’t know if &gt; or &lt;, so want both sides (<em>as much</em> difference since use z or t))</p><p></p>
21
New cards

Conditions for significance test abt mean

  • random sample (quote the Q)

  • 10% (n≤0.1N)

  • (CLT:) pop. has normal distrib. OR n≥30 OR distrib. of sample data has no strong skew/outliers

22
New cards

test statistic (mean)

t = (statistic-parameter)/SD of statistic
^use sx if have sample SD and no population SD

<p>t = (statistic-parameter)/SD of statistic <br>^use s<sub>x</sub> if have sample SD and no population SD </p>
23
New cards

paired data

study designs that involve making 2 observations on the same individual (count n by # of individuals) OR one observation on each of 2 similar individuals (count n by # of pairs, like a block!)

^for comparative studies, for mean difference
^use a matched pairs design!

24
New cards

paired t procedures

paired data from measuring the same quantitative variable twice (usually test a product/program/etc, compare results, for mean difference) (after check conditions)

25
New cards

!!! smaller significance level α -> need stronger evidence to reject the null hypothesis
higher power gives better chance to detect a difference when it really exists.
at any significance level/desired power, need larger sample to detect a small diff btwn null and alternative parameter values than detecting a large diff

statistically significant -> good evidence of a difference (can be very small) (for mean difference)
^large sample -> even small deviation from null can be significant

statistical significance has meaning (only) if you decide what diff ur seeking, design a study to search for it, and use a significance test to weigh the evidence u get

most important condition for sound conclusions from statistical inference is that the data comes from a well-designed random sample OR randomized experiment

statistical test more likely to find a significant increase in mean [] if very large sample and use 5% sig level (5% b/c less strict) (large sample incr statistical power, making tests highly sensitive to even trivial differences)

use t for test abt pop mean b/c z requires u know pop SD of σ

26
New cards

What to know about mean difference

  • null usually H0: μ = 0. look at what u are subtracting to see what the Ha is (ex: x-y, want y to be bigger, so Ha: μ < 0)

  • when define parameter, write what you are subtracting

  • interpret P-value: assuming the mean diff is 0 (the avgs are the same for both things you are doing subtraction with), the probability of getting a mean difference of [] or smaller/bigger (match Ha) is [P-value]