stats ch. 7 and 8 (population vs. sample, confidence interval, hypothesis testing)

Description and Tags

Sadistic Torture Across Two Semesters

27 Terms

1

What is a population?

The whole group of individuals we wish to study (e.g. all students at WVC).

2

What is a parameter?

A number that describes a whole population, such as a population proportion (p) or a population mean (μ). e.g. the proportion of all WVC students who work part-time.

3

What is a census?

A survey of the entire population. This is usually unrealistic because the population is too big.

4

What is a sample?

Collection of individuals taken from population of interest.

5

What is a statistic?

A number calculated from a sample, like a sample median, a sample mean (x-bar), or a sample proportion. The sample proportion is the number of “yeses” / sample size and is represented by p-hat.

6

Which value can you usually never find exactly, statistic or parameter?

Parameter. You can find the value of a statistic by collecting data, but unless you take a census you can only make inferences about the parameter (generalizations about the whole population).

7

What is sampling bias?

You get a sample that doesn’t accurately represent the whole population. e.g. you only survey people who are very opinionated.

8

What is voluntary-response bias?

Sampling bias where people only respond if they feel strongly about the topic (Miku fans will respond to surveys about how much they like Miku; non-fans won’t fill it out at all).

9

What is nonresponse bias?

People who are asked to do the survey refuse to fill it out (e.g. because it asks for uncomfortable info), so the people who do respond may not represent the whole sample.

10

How do you get a random number from the calculator?

MATH → PRB → 5 → randInt(1, population size, number of results you want)
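
For practice away from the calculator, the same idea can be sketched in Python. This is only an illustration: the function names and the population size of 500 are made up for the example. Like randInt, the first function can repeat a number, so a second version that never repeats is included for picking a sample of distinct individuals.

```python
import random

def rand_int_list(low, high, count, seed=None):
    """Mimic randInt(low, high, count): `count` random whole numbers
    between low and high (inclusive). Repeats are possible."""
    rng = random.Random(seed)
    return [rng.randint(low, high) for _ in range(count)]

def sample_without_repeats(population_size, count, seed=None):
    """Pick `count` distinct ID numbers from 1..population_size."""
    rng = random.Random(seed)
    return rng.sample(range(1, population_size + 1), count)

# Hypothetical example: choose 10 people from a numbered list of 500.
picks = rand_int_list(1, 500, 10, seed=1)
unique_picks = sample_without_repeats(500, 10, seed=1)
print(picks)
print(unique_picks)
```

If a number repeats in the first list, you would normally skip it and draw again.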

11

What is the difference between accuracy and precision?

Accurate → how close you are to the target value → measured by how unbiased you are → fixed by getting a random sample

Precise → how close the values are to one another → measured by size of standard error (the smaller the better) → fixed by getting a larger sample

12

How do you calculate the population proportion based on the sampling distribution?

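The mean of the sampling distribution of p-hat is equal to the population proportion p, as long as the samples are random (unbiased). So the center of the sampling distribution tells you p.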
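
A small simulation can make this concrete. The sketch below assumes a made-up population proportion of 0.6 and a sample size of 100, draws 2000 random samples, and averages the resulting p-hats. The function name and all the numbers are hypothetical choices for the demo.

```python
import random
from statistics import mean

def simulate_p_hats(p, n, num_samples, seed=0):
    """Draw `num_samples` random samples of size n from a population whose
    true proportion of "yeses" is p, and return each sample's p-hat."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        # Each individual is a "yes" with probability p.
        yeses = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(yeses / n)
    return p_hats

p_hats = simulate_p_hats(p=0.6, n=100, num_samples=2000)
# The average of the p-hats should land very close to the true p of 0.6.
print(round(mean(p_hats), 3))
```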

13

Key to all letters used in stats?

p = population proportion, yeses / total population. Usually a decimal like 0.6. → alternatively, in 1-PropZTest (hypothesis test), it represents p-value.

p̂ = sample proportion, yeses / sample size. Also called p-hat. Usually a decimal like 0.6

x = number of “yeses” in the sample

n = total sample size

Po = assumed population proportion in the null hypothesis. This is what you guess before you do the experiment.

H0 = null hypothesis

HA = alternative hypothesis

μ = population mean

x-bar = sample mean

σ = population standard deviation

s = sample standard deviation

α = significance level

14

Standard deviation vs. standard error?

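Standard deviation = how much the individual values in your sample vary around the sample mean

Standard error = how much a statistic (like p-hat or x-bar) varies from sample to sample. Imagine taking a bunch of samples and calculating the statistic for each one; the standard deviation of those statistics is the standard error. For a sample proportion it is estimated by sqrt(p-hat × (1 − p-hat) / n).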
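
As a quick sketch, the usual standard error formula for a sample proportion, sqrt(p-hat × (1 − p-hat) / n), can be computed directly. The p-hat of 0.6 and the sample sizes are made-up values for illustration. It also shows why a larger sample is more precise: the standard error shrinks as n grows.

```python
from math import sqrt

def standard_error(p_hat, n):
    """Estimated standard error of a sample proportion."""
    return sqrt(p_hat * (1 - p_hat) / n)

# Same p-hat, bigger and bigger samples: the SE keeps getting smaller.
for n in (25, 100, 400):
    print(n, round(standard_error(0.6, n), 4))
```

Quadrupling the sample size cuts the standard error in half, because n sits under a square root.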

15

Criteria for CLT (Central Limit Theorem)?

CLT tells you that the sampling distribution of p-hat is approximately Normal when the conditions below are met. Only then can you use z-based confidence intervals and tests.

  1. Random sample

  2. Large sample (at least 10 yeses and 10 nos)

  3. Large population (at least 10x sample size)
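
The three conditions above could be checked with a small helper like the one below. It is a sketch, not a calculator feature: the function name, the argument names, and the example numbers are all made up. Whether the sample is random cannot be computed, so it is passed in as a yes/no value.

```python
def clt_conditions(n, p, population_size, random_sample=True):
    """Return which CLT conditions hold for a sample of size n, where p is
    the proportion of "yeses" used for the check (p-hat or the assumed p)."""
    expected_yes = n * p
    expected_no = n * (1 - p)
    return {
        "random sample": random_sample,
        "large sample": expected_yes >= 10 and expected_no >= 10,
        "large population": population_size >= 10 * n,
    }

# Hypothetical example: 50 students sampled from a college of 10,000.
checks = clt_conditions(n=50, p=0.6, population_size=10_000)
print(checks)
print("OK to use Normal methods:", all(checks.values()))
```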

16

What is a confidence interval?

Interval containing the most plausible values for a parameter. Written like (point estimate) ± (margin of error).

17

What is margin of error?

z-score * standard error

Multiply the SE by the z-score (critical value) for your confidence level to get the margin of error:

99% confidence level = 2.58 standard errors

95% confidence level = 1.96 (shortcut method: 2) standard errors

90% confidence level = 1.645 standard errors

80% confidence level = 1.28 standard errors
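
The z-scores listed above come from the standard Normal curve, and the margin of error then builds the interval. Here is a minimal sketch for a one-proportion confidence interval, assuming the usual SE formula sqrt(p-hat × (1 − p-hat) / n). The function names and the 60-out-of-100 example are made up.

```python
from math import sqrt
from statistics import NormalDist

def critical_z(confidence_level):
    """z-score that leaves (1 - confidence_level) / 2 in each tail."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

def proportion_ci(x, n, confidence_level=0.95):
    """Confidence interval: p-hat ± (z-score × standard error)."""
    p_hat = x / n
    se = sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = critical_z(confidence_level) * se
    return p_hat - margin_of_error, p_hat + margin_of_error

# Reproduce the z-scores from the card.
for level in (0.99, 0.95, 0.90, 0.80):
    print(level, round(critical_z(level), 3))

# Hypothetical sample: 60 "yeses" out of 100.
low, high = proportion_ci(x=60, n=100)
print(round(low, 3), round(high, 3))
```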

18

What is a confidence level?

Probability that the confidence intervals created with this process contain the true parameter. Basically: if I create a bunch of confidence intervals, what % of them capture the true value?

Does NOT apply to a single confidence interval. That one either captures the true value (100%) or doesn’t (0%). The confidence level describes the method, and the most common choice is 0.95 (95%).

19

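What should you keep in mind when calculating the sample size you need?

Always round up to the next whole number, even if the decimal is small (rounding down would give a margin of error slightly bigger than you wanted).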
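
The card does not give the sample size formula, so the sketch below assumes the common one for a proportion: n = (z-score / margin of error)² × p × (1 − p), with p = 0.5 as the cautious guess when nothing is known about p. The function name and example margins are made up. The rounding-up step is the part the card is about.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(margin_of_error, confidence_level=0.95, p_guess=0.5):
    """Smallest whole sample size that keeps the margin of error at or
    below the target (assumed formula: (z / m)^2 * p * (1 - p))."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    raw_n = (z / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    # Always round UP: you cannot survey a fraction of a person, and
    # rounding down would miss the target margin of error.
    return ceil(raw_n)

print(required_sample_size(0.03))
print(required_sample_size(0.05))
```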

20

Which proportions can be used to draw conclusions?

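Population proportion. Hypotheses and conclusions are always statements about the population proportion (p), never about the sample proportion (p-hat), which you already know from the data.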

21

How do you interpret a confidence interval for the difference between two populations (population 1 − population 2)?

(+,+) → Population 1’s proportion is significantly larger

(-,-) → Population 2’s proportion is significantly larger

(-,+) → No significant difference between the populations (the interval contains 0)
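
To see the three cases in action, here is a sketch that builds a confidence interval for p1 − p2 and reads off the signs. It assumes the standard two-proportion SE formula, sqrt(p1-hat(1 − p1-hat)/n1 + p2-hat(1 − p2-hat)/n2), which is not written on the card. The function names and sample counts are made up.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_ci(x1, n1, x2, n2, confidence_level=0.95):
    """Confidence interval for (p1 - p2) from two independent samples."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

def interpret(low, high):
    """Translate the signs of the interval's endpoints into a conclusion."""
    if low > 0:
        return "population 1 is significantly larger"
    if high < 0:
        return "population 2 is significantly larger"
    return "no significant difference (interval contains 0)"

# Hypothetical samples: 60/100 "yeses" in group 1, 45/100 in group 2.
low, high = two_prop_ci(60, 100, 45, 100)
print(round(low, 3), round(high, 3), "→", interpret(low, high))
```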

22

What is a significance level?

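The probability of rejecting the null hypothesis when it is actually true that you are willing to accept, i.e. how okay you are with making that mistake. Usually 0.05, given by alpha (α).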

23

What is a test statistic?

How many standard errors the observed sample proportion is above/below the value in the null hypothesis. The farther it is from 0 (in the direction of the alternative hypothesis), the more evidence you have against the null. Represented by z (like z-score).

Only use if the data passes CLT.
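
As a sketch, the one-proportion test statistic is z = (p-hat − Po) / sqrt(Po × (1 − Po) / n). Note that the standard error here uses the null value Po, not p-hat, because the test assumes the null is true. The function name and the 60-out-of-100 example are made up.

```python
from math import sqrt

def z_test_statistic(x, n, p0):
    """Number of standard errors p-hat sits above (+) or below (-) p0."""
    p_hat = x / n
    se_under_null = sqrt(p0 * (1 - p0) / n)  # SE computed from the null value
    return (p_hat - p0) / se_under_null

# Hypothetical data: 60 "yeses" out of 100, null hypothesis p = 0.5.
z = z_test_statistic(x=60, n=100, p0=0.5)
print(round(z, 2))
```

Here p-hat is 0.6, the standard error under the null is 0.05, so the sample sits about 2 standard errors above the null value.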

24

What is a p-value?

The probability of getting data at least as extreme as what you observed, assuming the null hypothesis is true. Represented by p.

Small p-value → z-test statistic far from 0 → data like this would be unlikely if the null were true → reject the null

Large p-value → z-test statistic close to 0 → data like this is pretty likely if the null is true → don’t reject the null
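
The link between the size of z and the size of the p-value can be seen by converting a few z-scores into two-tailed p-values with the standard Normal curve. This is just an illustrative sketch; the function name and the list of z-scores are made up.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Area in both tails of the standard Normal curve beyond |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# As z moves farther from 0, the p-value gets smaller.
for z in (0.5, 1.0, 1.96, 2.58, 3.0):
    print(z, round(two_sided_p_value(z), 4))
```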

25

What is the relationship between p-value and significance level?

p-value < significance level (α) → enough evidence to reject the null

p-value > significance level (α) → not enough evidence. Fail to reject the null

26

What are the 4 steps for hypothesis testing?

  1. Write the null and alternative hypotheses

  2. Choose a significance level and check CLT

  3. Run 1-PropZTest and find the z (test statistic) and p (p-value).

  4. Interpret the result: state whether you reject or fail to reject the null hypothesis
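
The four steps above can be sketched as one function. This is a stand-in for what 1-PropZTest reports, not the calculator's actual code. The function name, the `alternative` labels, and the example data are all made up, and only the large-sample part of the CLT check can be automated.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(x, n, p0, alternative="two-sided", alpha=0.05):
    """Step 1 is choosing p0 and `alternative`:
    H0: p = p0, HA: p > p0 ("greater"), p < p0 ("less"), or p ≠ p0."""
    # Step 2: alpha is chosen up front; check the large-sample condition.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("need at least 10 expected yeses and 10 expected nos")
    # Step 3: test statistic and p-value.
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if alternative == "greater":
        p_value = 1 - cdf(z)             # right tail
    elif alternative == "less":
        p_value = cdf(z)                 # left tail
    elif alternative == "two-sided":
        p_value = 2 * (1 - cdf(abs(z)))  # both tails
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    # Step 4: compare the p-value to alpha.
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision

# Hypothetical data: 60 "yeses" out of 100, testing H0: p = 0.5 vs HA: p > 0.5.
z, p_value, decision = one_prop_z_test(60, 100, 0.5, alternative="greater")
print(round(z, 2), round(p_value, 4), decision)
```

You still have to confirm on your own that the sample was random and that the population is at least 10x the sample size.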

27

What are the “tailed” tests?

Right-tailed test: The alternative hypothesis says the proportion is bigger than assumed (HA: p > Po). The right tail of the normal curve, beyond the test statistic, is shaded, and that area is the p-value.

Two-tailed test: The alternative hypothesis says the proportion is not equal to what is assumed (HA: p ≠ Po). Both tails of the normal curve are shaded, so the p-value is double the area of one tail.

Left-tailed test: The alternative hypothesis says the proportion is smaller than assumed (HA: p < Po). The left tail of the normal curve, below the test statistic, is shaded, and that area is the p-value.