Inference for categorical data


23 Terms

1

Confidence interval levels

90%: z* = 1.645

95%: z* = 1.960

99%: z* = 2.576

99.5%: z* = 2.807
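The critical values above can be reproduced from the standard normal distribution. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the function name `critical_value` is illustrative and not part of the cards.

```python
from statistics import NormalDist


def critical_value(confidence_level: float) -> float:
    """Two-sided z* for a confidence level given as a fraction (e.g. 0.95)."""
    # A C-level interval leaves (1 - C) / 2 in each tail,
    # so z* is the (1 + C) / 2 quantile of the standard normal.
    return NormalDist().inv_cdf((1 + confidence_level) / 2)


for level in (0.90, 0.95, 0.99, 0.995):
    print(f"{level:.1%}: z* = {critical_value(level):.3f}")
```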

2

What does the confidence interval do

The confidence level describes how often intervals built this way would capture the population parameter across repeated samples

→ Stated as: “We are C% confident that the true proportion lies in this interval”

→ The confidence level is NOT the probability that this particular interval contains the parameter; the formula for the interval is p̂ ± ME

3

Margin of error

The distance on either side of the point estimate (p̂ for a proportion), equal to critical value × standard error

4

One proportion z-interval

Used when we estimate the proportion p of a single population from one sample
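As a rough sketch, the one-proportion z-interval is p̂ ± z* × √(p̂(1 − p̂)/n). The function name and the counts used below (120 successes out of 200) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence_level=0.95):
    """Return (low, high) for a one-proportion z-interval."""
    p_hat = successes / n
    # Critical value z* for the chosen confidence level.
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    # Standard error of p-hat, estimated from the sample itself.
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z_star * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error


# Hypothetical data: 120 successes in a sample of 200.
low, high = one_proportion_z_interval(successes=120, n=200)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```

The interval is only trustworthy when the independence and sample-size conditions hold.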

5

Assumptions

Independence: random sample (or random assignment) and n ≤ 10% of the population

Sample size: np̂ ≥ 10 and n(1 − p̂) ≥ 10
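These checks are easy to automate. The sketch below assumes the common "at least 10" threshold for successes and failures (some courses use a different cutoff); `check_conditions` and its parameters are hypothetical names.

```python
def check_conditions(successes, n, population_size=None, random_sample=True):
    """Report which conditions for a one-proportion z-procedure are met."""
    failures = n - successes  # n * (1 - p_hat)
    return {
        # Randomization cannot be verified from the numbers; the caller states it.
        "random": random_sample,
        # 10% condition; treated as met when the population size is unknown
        # (assumed large).
        "ten_percent": population_size is None or n <= 0.10 * population_size,
        # n * p_hat equals the success count, n * (1 - p_hat) the failure count.
        "large_counts": successes >= 10 and failures >= 10,
    }


# Hypothetical data: 120 successes in 200 draws from a population of 5000.
print(check_conditions(successes=120, n=200, population_size=5000))
```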

6

Steps to construct a confidence interval

  1. Identify parameter (what is p)

  2. Identify the procedure (interval type)

  3. Check conditions

  4. Calculate CI

  5. Interpret the interval in context

7

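If margin of error is too large

either the z-score (critical value) or the SE is too large

→ to fix this, we either decrease the confidence level or increase n

→ If p̂ is unknown, use the conservative p̂ = 0.5, which maximizes √(p̂(1 − p̂)), then solve ME = z* × √(p̂(1 − p̂)/n) for the minimum n needed to achieve the desired ME at the chosen confidence level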
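Solving ME = z* × √(p(1 − p)/n) for n gives the minimum sample size for a target margin of error. The sketch below defaults to p = 0.5, the conservative choice when no estimate is available; `minimum_sample_size` and `p_guess` are illustrative names.

```python
from math import ceil
from statistics import NormalDist


def minimum_sample_size(margin_of_error, confidence_level=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most the target."""
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    # Rearranged from ME = z* * sqrt(p(1 - p) / n).
    n = (z_star / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    # Always round up: rounding down would give a margin slightly too large.
    return ceil(n)


print(minimum_sample_size(0.03))  # 3-point margin at 95% confidence
```

Lowering the confidence level or accepting a larger margin both reduce the required n.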

8

If the population is approx normal

The sampling distribution of the mean is also approx normal regardless of sample size (no CLT needed; the CLT covers non-normal populations when n is large)

9

Two types of statistical inference

  1. Confidence Interval (used when our goal is to estimate a population parameter)

  2. Test of significance (used when our goal is to assess the evidence regarding a claim about the population)

10

What is a test of significance

→ A formal procedure for comparing observed data with a claim (hypothesis) whose truth we want to assess

→ hypothesis is a statement about the parameter (either μ or p), never about the sample statistic

→ result is expressed as a probability (p-value) that measures how well the data and claim agree

11

Steps to conducting a test of statistical significance

  1. State hypotheses (H₀ and Ha; they are mutually exclusive)

  2. Identify procedure and verify conditions (same conditions as for a confidence interval, but the sample-size check uses the hypothesized p₀ in place of p̂)

  3. Calculate the test statistic and find p-value

  4. Interpret p-value and conclude
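The steps above can be sketched for a one-proportion z-test. The standard error uses the hypothesized p₀ because the p-value is computed assuming H₀ is true. The function name, the `alternative` labels, and the data (120 successes in 200, H₀: p = 0.5) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided"):
    """Return (z, p_value) for H0: p = p0."""
    p_hat = successes / n
    # Standard error under H0, so it uses p0 rather than p_hat.
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    std_normal = NormalDist()
    if alternative == "greater":      # Ha: p > p0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":       # Ha: p < p0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":  # Ha: p != p0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical data: 120 successes in 200 trials, testing H0: p = 0.5 vs Ha: p > 0.5.
z, p_value = one_proportion_z_test(successes=120, n=200, p0=0.5, alternative="greater")
alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"z = {z:.3f}, p-value = {p_value:.4f}, decision: {decision}")
```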

12

What is a test statistic

A numerical value calculated from sample data; it measures how far the observed statistic falls from the value claimed by the null hypothesis, in standard-error units - we also use it to calculate the p-value

13

What is the p-value

The probability, assuming H₀ is true, of observing sample data as extreme as or more extreme than the statistic actually obtained

→ the smaller the p-value, the stronger the evidence against H₀ (it is not the probability that H₀ is true)

14

Interpreting p-value

Always reference that we calculate p-value with the assumption that H₀ is true

→ answer the question “what does p suggest?” - include it in your conclusion

15

Type I and Type II error

Type I: we reject H₀ even though it’s true

Type II: we fail to reject H₀ even though it’s false

16

P(error)

P(Type I error) = α (sig lvl)

P(Type II error) = β = 1 - power (β is hard to assess because we don’t know what the value of the parameter really is)

17

Rejecting vs failing to reject

If p ≤ α, we reject H₀

If p > α, we fail to reject H₀

18

Significance levels

α = 0.01, 0.05 or 0.10

→ use 0.05 unless specified otherwise

19

What different p-values mean

p > 0.10: weak/no evidence against H₀

0.05 < p ≤ 0.10: moderate evidence against H₀

0.01 < p ≤ 0.05: strong evidence against H₀

p ≤ 0.01: very strong evidence against H₀

20

Power

The probability that a test will correctly reject a false H₀

21

Factors that affect power

  • Sample size increases (more data = more likely to make the correct choice in both scenarios) - also decreases SE

  • Sig lvl increases (higher α = higher probability that p ≤ α, i.e. of rejecting H₀, at the cost of more Type I errors)

  • SE decreases (less variability, more chance of finding convincing evidence)

  • The true parameter is farther away from H₀ (it’s easier to find evidence with a large difference)
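The factors above can be explored numerically. This is a sketch of the power of a one-sided one-proportion z-test (H₀: p = p₀ vs Ha: p > p₀) under the normal approximation; `power_one_sided` and the example values are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(p0, p_true, n, alpha=0.05):
    """Approximate power for H0: p = p0 vs Ha: p > p0 when the truth is p_true."""
    # Smallest p_hat that leads to rejecting H0 at level alpha.
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # Sampling distribution of p_hat under the true parameter value.
    sampling_dist = NormalDist(mu=p_true, sigma=sqrt(p_true * (1 - p_true) / n))
    # Power = P(p_hat >= cutoff | p = p_true); beta = 1 - power.
    return 1 - sampling_dist.cdf(cutoff)


base = power_one_sided(p0=0.5, p_true=0.6, n=100)
print(f"baseline power:        {base:.3f}")
print(f"larger n (200):        {power_one_sided(0.5, 0.6, 200):.3f}")
print(f"larger alpha (0.10):   {power_one_sided(0.5, 0.6, 100, alpha=0.10):.3f}")
print(f"truth farther (0.70):  {power_one_sided(0.5, 0.7, 100):.3f}")
```

Each change should raise the power relative to the baseline, matching the bullets above.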

22

Conducting a significance test for a diff of proportions

  1. State hypotheses and identify sig lvl (clearly identify parameters p₁ and p₂)

  2. Identify procedure and check conditions (use the pooled proportion p̂c = combined successes / combined observations; all 4 counts n₁p̂c, n₁(1 − p̂c), n₂p̂c, n₂(1 − p̂c) must be at least 10)

  3. Calc test stat and find p-value

  4. Interpret p-value and conclude
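A rough sketch of these steps as a two-proportion z-test, with the pooled proportion p̂c used for both the conditions check and the standard error. The function name and the counts (60/100 vs 45/100) are hypothetical, and the "at least 10" cutoff is an assumed convention.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided test of H0: p1 = p2. Returns (z, p_value, counts_ok)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled proportion: combined successes / combined observations.
    p_pooled = (x1 + x2) / (n1 + n2)
    # Conditions: all four expected counts under H0 should be at least 10.
    expected = [n1 * p_pooled, n1 * (1 - p_pooled),
                n2 * p_pooled, n2 * (1 - p_pooled)]
    counts_ok = all(count >= 10 for count in expected)
    # Standard error under H0 uses the pooled proportion for both groups.
    standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, counts_ok


# Hypothetical data: 60 of 100 successes in group 1, 45 of 100 in group 2.
z, p_value, counts_ok = two_proportion_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, conditions met: {counts_ok}")
```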

23

If the CI contains 0 for diff of proportions

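There is not convincing evidence of a difference between the two proportions, because 0 (no difference) is a plausible value for p₁ − p₂; this does not prove the proportions are equal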
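A minimal sketch of a confidence interval for p₁ − p₂ that checks whether 0 falls inside it. Unlike the test, the interval uses the unpooled standard error, since it does not assume p₁ = p₂. The function name and the counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_interval(x1, n1, x2, n2, confidence_level=0.95):
    """Return (low, high) for a z-interval on p1 - p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf((1 + confidence_level) / 2)
    # Unpooled standard error: each group keeps its own sample proportion.
    standard_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin_of_error = z_star * standard_error
    return diff - margin_of_error, diff + margin_of_error


# Hypothetical data: 52 of 100 successes in group 1, 45 of 100 in group 2.
low, high = two_proportion_z_interval(52, 100, 45, 100)
contains_zero = low <= 0 <= high
print(f"95% CI for p1 - p2: ({low:.4f}, {high:.4f}); contains 0: {contains_zero}")
```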