Stats Midterm 2


45 Terms

New cards

What is the multiplication rule

multiply the chance of the first event by the chance of the second event given that the first has happened, to get the chance that both happen: P(A and B) = P(A) × P(B | A)
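A small worked check of the rule P(A and B) = P(A) × P(B | A). The two-aces example is an assumption made for illustration and is not from the cards.

```python
from fractions import Fraction

# Hypothetical example: chance that the first two cards dealt from a
# well-shuffled 52-card deck are both aces.
p_first_ace = Fraction(4, 52)               # P(A): 4 aces among 52 cards
p_second_ace_given_first = Fraction(3, 51)  # P(B | A): 3 aces left among 51 cards

# Multiplication rule: P(A and B) = P(A) * P(B | A)
p_both_aces = p_first_ace * p_second_ace_given_first
print(p_both_aces)  # 1/221
```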

New cards

How to know if events are independent

if P(A | B) = P(A) and vice versa

New cards

How to find the chance that two independent events will happen together

multiply their unconditional probabilities

New cards

When are draws independent/dependent?

independent = drawing with replacement

dependent = drawing without replacement

New cards

when are events mutually exclusive

when they cannot happen at the same time

New cards

What is the paradox of Chevalier de Méré

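de Méré thought that at least one six in 4 rolls of a die and at least one double six in 24 rolls of a pair of dice were equally likely, but they are not. The lesson: if the chance of an event is hard to find, find the chance of the opposite event and subtract it from 100%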
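A quick sketch of the "opposite event" trick applied to the two classic de Méré bets (at least one six in 4 rolls of one die, at least one double six in 24 rolls of a pair). The variable names are illustrative only.

```python
# Chance of NO six in 4 rolls is (5/6)^4, so subtract from 1.
p_at_least_one_six = 1 - (5 / 6) ** 4

# Chance of NO double six in 24 rolls of a pair is (35/36)^24.
p_at_least_one_double_six = 1 - (35 / 36) ** 24

print(round(p_at_least_one_six, 4))         # 0.5177
print(round(p_at_least_one_double_six, 4))  # 0.4914
```

The first bet is slightly better than even and the second slightly worse, which is why the two are not equivalent.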

New cards

Frequency interpretation

the long-run relative frequency of an event happening if you repeated the experiment many, many times under the same conditions

New cards

Subjective interpretation

Probability measures a person’s degree of belief or confidence in an event, given the information they have

ex: There is a 70% chance it will rain tomorrow - there are not many trials of tomorrow, it’s based on their confidence in the event

New cards

what is the binomial coefficient (equation and words)

\frac{n!}{k!\left(n-k\right)!}, that is, (number of draws)! / (successes! × failures!)

The number of ways to choose k objects out of n when their order doesn’t matter

New cards

what is the binomial formula (equation and words)

\frac{n!}{k!\left(n-k\right)!}p^{k}\left(1-p\right)^{n-k}

The chance that an event with probability p on each trial will occur exactly k times out of n trials
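A minimal sketch of the binomial formula in code. The function name and the coin example are assumptions chosen for illustration.

```python
from math import comb

def binomial_probability(n: int, k: int, p: float) -> float:
    """Chance of exactly k successes in n independent trials,
    each with success probability p."""
    # comb(n, k) is the binomial coefficient n! / (k! (n-k)!)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: exactly 2 heads in 4 tosses of a fair coin.
print(binomial_probability(4, 2, 0.5))  # 0.375
```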

New cards

4 criteria for binomial formula

fixed # of trials
two possible outcomes per trial
trials are independent
probability of success remains constant

New cards

what is a hypergeometric problem and what is the formula

a box holds r red balls and g green balls; n balls are drawn without replacement, and k is the number of green balls (successes) among them. It is like a binomial problem, but without replacement

\frac{\binom{g}{k}\binom{r}{n-k}}{\binom{g+r}{n}}
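A sketch of the hypergeometric chance, read as: (ways to choose k of the g green balls) × (ways to choose n - k of the r red balls), divided by (ways to choose n of all g + r balls). The function name and the numbers are assumptions for illustration.

```python
from fractions import Fraction
from math import comb

def hypergeometric_probability(g: int, r: int, n: int, k: int) -> Fraction:
    """Chance of exactly k green balls in n draws without replacement
    from a box with g green and r red balls."""
    return Fraction(comb(g, k) * comb(r, n - k), comb(g + r, n))

# Hypothetical example: 5 green, 3 red, draw 4, want exactly 2 green.
print(hypergeometric_probability(5, 3, 4, 2))  # 3/7
```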

New cards

expected value of x =

E(X) = \sum_{i} x_{i} P\left(X = x_{i}\right), the sum of (each possible value xi × the probability that X = xi)
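A short sketch of the expected-value sum, using one roll of a fair die as an assumed example (not from the cards).

```python
from fractions import Fraction

# Hypothetical distribution: each face 1..6 of a fair die has chance 1/6.
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# E(X) = sum over all values of (value * probability of that value)
expected_value = sum(x * p for x, p in distribution.items())
print(expected_value)  # 7/2
```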

New cards

What is SE and what is its difference from SD

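SE tells us how far we can expect X to be from E(X), i.e. the likely size of the chance error. SD describes the spread of a list of numbers (such as the tickets in the box); SE describes the spread of a chance quantity (such as a sum or average of draws)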

New cards

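What is the covariance if x and y are independent

0 (independent variables have zero covariance, although zero covariance alone does not prove independence)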

New cards

what is the law of averages

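after many tosses, the percentage of heads gets close to 50%. The number of heads need not equal the number of tails: the chance error grows in absolute terms but shrinks relative to the number of tosses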

New cards

how to calculate chance error

actual value - expected value

New cards

What to ask when making a box model

what numbers go in the box

how many of each kind

how many draws are there

New cards

expected value for sum of draws made with replacement =

number of draws x average of the box

New cards

SE for a sum of draws

√(number of draws) × (SD of box)
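A sketch that puts the two box-model formulas side by side: expected value of the sum = number of draws × average of the box, and SE of the sum = √(number of draws) × SD of the box. The box contents and number of draws are assumptions for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

box = [0, 2, 3, 4, 6]   # hypothetical tickets in the box
draws = 25              # hypothetical number of draws, made with replacement

box_average = mean(box)   # average of the box: 3
box_sd = pstdev(box)      # SD of the box (population SD): 2.0

expected_sum = draws * box_average   # 25 * 3 = 75
se_sum = sqrt(draws) * box_sd        # 5 * 2 = 10

print(expected_sum, se_sum)
```

So the sum of the 25 draws should be around 75, give or take 10 or so.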

New cards

expected value of a single draw =

average of the box

New cards

how to calculate SD when a list only has two numbers

(big - small) × √(fraction with big number × fraction with small number)
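A sketch checking the two-number shortcut, (big - small) × √(fraction with big × fraction with small), against the ordinary SD. The box below is an assumed example.

```python
from math import isclose, sqrt
from statistics import pstdev

box = [5, 1, 1, 1]   # hypothetical box with only two distinct numbers
big, small = max(box), min(box)
fraction_big = box.count(big) / len(box)       # 1/4
fraction_small = box.count(small) / len(box)   # 3/4

shortcut_sd = (big - small) * sqrt(fraction_big * fraction_small)

# The shortcut agrees with the usual (population) SD of the list.
print(isclose(shortcut_sd, pstdev(box)))  # True
```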

New cards

expected value of the sum without replacement

the same as with replacement (number of draws × average of the box), as long as the number of draws does not exceed the total number of tickets

New cards

Glivenko Cantelli Theorem

with more and more draws, an empirical histogram gets closer to the probability histogram

New cards

central limit theorem

with more and more draws, the probability histogram of the sum gets closer to the normal curve (the number of draws needed depends on how lopsided the box is)

New cards

How to use the normal approximation to get the chance of getting between 45 and 55 heads, for example?

find z scores for 44.5 and 55.5 and find the area between them
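A sketch of the calculation with the continuity correction (44.5 and 55.5). The card does not say how many tosses there are, so 100 tosses of a fair coin is an assumption made here.

```python
from math import sqrt
from statistics import NormalDist

tosses = 100    # ASSUMPTION: number of tosses is not stated on the card
p_heads = 0.5   # ASSUMPTION: fair coin

expected_heads = tosses * p_heads                          # 50
se_heads = sqrt(tosses) * sqrt(p_heads * (1 - p_heads))    # 10 * 0.5 = 5

# Convert the endpoints 44.5 and 55.5 to standard units (z scores).
z_low = (44.5 - expected_heads) / se_heads    # -1.1
z_high = (55.5 - expected_heads) / se_heads   # +1.1

# Area under the normal curve between the two z scores.
standard_normal = NormalDist()
chance = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)
print(round(chance, 4))  # 0.7287
```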

New cards

difference between statistics and parameters when sampling

a parameter is a numerical fact about the population and is usually unknown; a statistic is computed from the sample and is known. Parameters are estimated by statistics

New cards

what is non response bias

when many sampled people don’t respond to the survey and therefore are not represented in the data; non-respondents may differ from respondents

New cards

what is quota sampling

fixed number of subjects to interview per category (like age, race, gender, etc). Within the assigned quotas, anybody can be chosen (not random)

New cards

what is multistage cluster sampling

an extension of cluster sampling in that, first, clusters are randomly selected and, second, sample units within the selected clusters are randomly selected

higher SE than an SRS

New cards

what is response bias

answers are pushed in one direction by how the survey is run, for example by leading questions

New cards

% of a group in a sample =

% of that group in the population + chance error

New cards

what does the size of the chance error depend on

the size of the sample, not the size of the population

New cards

what is the chance error equal to in an SRS

its likely size is about one standard error (the chance error itself varies from sample to sample)

New cards

what is a correction factor

used to calculate SE when drawing w/o replacement: multiply the SE w/ replacement by the correction factor, √((number of tickets in box - number of draws) / (number of tickets in box - 1))
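A sketch of the correction factor using the usual finite-population formula, √((tickets in box - draws) / (tickets in box - 1)). The formula is the standard one rather than something stated on the card, and the numbers are assumptions.

```python
from math import sqrt

def correction_factor(tickets_in_box: int, draws: int) -> float:
    """Factor that converts the SE with replacement into the SE without."""
    return sqrt((tickets_in_box - draws) / (tickets_in_box - 1))

# Hypothetical example: 25 draws from a box of 100 tickets,
# with an SE of 10 if the draws were made with replacement.
se_with_replacement = 10.0
factor = correction_factor(100, 25)
se_without_replacement = se_with_replacement * factor

print(round(factor, 4))                  # 0.8704
print(round(se_without_replacement, 2))  # 8.7
```

When the box is much larger than the number of draws, the factor is close to 1 and makes little difference.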

New cards

what is a convenience sample

a non-random sample made up of whoever is easiest to reach

New cards

what is a stratified sample

population divided into groups (strata) by age, race, or another characteristic. An SRS is taken from each group.

Lower SE (less sample variability) than an SRS, because every group is guaranteed to be represented and members of the same group tend to be similar

New cards

a confidence interval is a range of what values

sample average ± z * SE

New cards

what is the confidence interval procedure

set up a box model
calculate sample average
calculate approximate SE
look up in a z table what value of z corresponds to the % confidence (% confidence = area under the curve)
calculate confidence interval
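The steps above can be sketched in code. The sample size, sample average, and sample SD below are made-up numbers, and the SE for the average is taken as (SD of box) / √(number of draws), with the sample SD standing in for the unknown SD of the box.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample summary (not from the cards).
sample_size = 100
sample_average = 50.0
sample_sd = 10.0        # used as an estimate of the SD of the box
confidence = 0.95

# Approximate SE for the sample average.
se_average = sample_sd / sqrt(sample_size)   # 10 / 10 = 1.0

# z such that the area between -z and +z equals the confidence level.
z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96

low = sample_average - z * se_average
high = sample_average + z * se_average
print(round(low, 2), round(high, 2))  # 48.04 51.96
```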

New cards

how to interpret a confidence interval

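if you repeated the sampling many times and made a 95% confidence interval from each sample, about 95% of those intervals would include the box average. The chance is in the sampling procedure, not in the box average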

New cards

what is the bootstrap method

substitute the known sample fractions for the unknown population fractions in the box when estimating the SD of the box

New cards

what happens to the SE when you multiply the number of draws by a factor

the SE for the sum is multiplied by √factor; the SE for the average or percentage is divided by √factor
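A small sketch of how the SE changes when the number of draws is multiplied by a factor: the SE for the sum goes up by √factor, while the SE for the average goes down by √factor. The box SD, number of draws, and factor are assumptions.

```python
from math import sqrt

box_sd = 2.0   # hypothetical SD of the box
draws = 25     # hypothetical starting number of draws
factor = 4     # hypothetical: use 4 times as many draws

def se_sum(number_of_draws: int) -> float:
    # SE for the sum of draws = sqrt(number of draws) * SD of box
    return sqrt(number_of_draws) * box_sd

def se_average(number_of_draws: int) -> float:
    # SE for the average = SE for the sum / number of draws
    return se_sum(number_of_draws) / number_of_draws

sum_ratio = se_sum(draws * factor) / se_sum(draws)
average_ratio = se_average(draws * factor) / se_average(draws)
print(sum_ratio, average_ratio)  # about 2.0 and 0.5, i.e. sqrt(4) and 1/sqrt(4)
```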

New cards

what are outliers and what do they do to SD

extreme measurements that lie so many SDs away from the average that the normal curve would make them practically impossible. They inflate the overall SD