AP Statistics Ultimate Review


91 Terms

New cards

Observational Studies claim….

Correlation! (not causation)

New cards

how to design an experiment

S- start with the subjects

R- randomly assign them (big bag shake well)

T- treatments (state them)

M- measure

E- each subject's response to them

C- compare their

A/P- average or proportion
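
The "randomly assign" step above can be sketched in code. This is only an illustration: the function name `randomly_assign`, the subject numbers, and the treatment labels are made up for the example.

```python
import random

def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # the "big bag, shake well" step
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical example: 20 numbered subjects, two treatments.
groups = randomly_assign(range(1, 21), ["treatment", "control"], seed=1)
```

Dealing the shuffled subjects out in turn keeps the group sizes equal (or within one of each other).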

New cards

Experiments

Can claim causation if they have control, randomization, and replication

New cards

Types of Experiments

Completely Randomized- each participant is assigned to a treatment group randomly.

Randomized block- participants are divided into blocks based on a trait, and randomization occurs within each block.

Matched Pairs- participants are paired based on similar characteristics, and each receives a different treatment. (think twins)

New cards

What are Factors and Levels

Factors- independent variables (treatments) in an experiment that are manipulated. (Ex. Exercise)

Levels- the different values or conditions of a factor. (ex. High, medium, low)

New cards

Confounding vs. Lurking Variables

Confounding- variables that affect both the treatment and the response, making it difficult to determine the true effect of the treatment.

Lurking- variables that are not included in the study, but may affect the outcome.

New cards

Measures of Spread and Outliers

IQR: middle 50% of the data found by Q3-Q1

Upper Fence/ Outliers: Q3+1.5(IQR)

Lower Fence/ Outliers: Q1-1.5(IQR)
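
A rough sketch of the fence calculation. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data (other quartile methods give slightly different answers); the function name `fences` and the data are made up for illustration.

```python
from statistics import median

def fences(data):
    """Return Q1, Q3, IQR, lower fence, upper fence."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]            # values below the median position
    upper = values[half + (n % 2):]  # values above the median position
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    return q1, q3, iqr, q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [2, 4, 5, 7, 8, 9, 10, 12, 30]  # illustrative values
q1, q3, iqr, low_fence, high_fence = fences(data)
outliers = [x for x in data if x < low_fence or x > high_fence]
```

Here Q1 = 4.5 and Q3 = 11, so the fences are -5.25 and 20.75, and 30 is flagged as an outlier.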

New cards

Normal Model Information

Unusual values (possible outliers): more than 2 standard deviations from the mean, i.e. outside Mean ± 2(standard deviation)

New cards

Describing linear association (r-value)

There is a (strength), (positive or negative), linear relationship between x and y (in context)

New cards

interpret the coefficient of determination (r²)

___% of the variation in y (in context) can be explained by the changes in x (in context)

New cards

interpret the slope

for every 1 increase in the x (in context) the predicted y (in context) increases or decreases this much

New cards

interpret the y intercept

when the x is zero the predicted y is this (list x and y in context)

New cards

extrapolation

making a prediction outside of the domain of the provided data. this is dangerous!

New cards

leverage point on LSRL

point far away from the mean of x

New cards

influential point on LSRL

point with high leverage that is not in line with the rest of the data. Removing it has a significant impact on the slope of the LSRL

New cards

Standard deviation of the residuals

average amount the actual y varies from the predicted y in LSRL

New cards

how to compute residual

Actual y- Predicted y

New cards

what does a pattern on a residual plot mean

A linear model is not appropriate

New cards

Understand how to read a computer printout

New cards

To find LSRL equation without a table

slope= r(std. deviation of y/ standard deviation of x)

then set up an equation using x bar and y bar to solve for y intercept.
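
A small sketch of that two-step process, plus a residual for one point. The summary statistics are invented for the example, and `lsrl_from_summary` is just an illustrative name.

```python
def lsrl_from_summary(r, s_x, s_y, x_bar, y_bar):
    """Slope and y-intercept of the LSRL from summary statistics."""
    slope = r * (s_y / s_x)
    # The LSRL always passes through (x_bar, y_bar), so solve for the intercept.
    intercept = y_bar - slope * x_bar
    return slope, intercept

slope, intercept = lsrl_from_summary(r=0.8, s_x=2.0, s_y=5.0, x_bar=10.0, y_bar=50.0)

predicted = intercept + slope * 12  # predicted y when x = 12
residual = 57 - predicted           # actual y minus predicted y
```

With these numbers the line is roughly predicted y = 30 + 2x, so x = 12 predicts about 54 and an actual y of 57 leaves a residual of about 3.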

New cards

Geometric Distribution

go until a success

New cards

Binomial distribution

success out of a set number of trials.

New cards

How to check for independence

P(A)= P(A|B)

New cards

what is conditional (“given”) probability / P(A | B)

P(A and B) / P(B)
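
A quick sketch tying the conditional probability formula to the independence check P(A) = P(A|B). The probabilities are made up, and `conditional` is an illustrative helper.

```python
def conditional(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B)."""
    return p_a_and_b / p_b

# Illustrative probabilities.
p_a, p_b, p_a_and_b = 0.5, 0.4, 0.2

p_a_given_b = conditional(p_a_and_b, p_b)
# Independent if knowing B does not change the probability of A.
independent = abs(p_a_given_b - p_a) < 1e-9
```

Here P(A|B) = 0.2 / 0.4 = 0.5, which matches P(A), so these events would be independent.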

New cards

What is the difference between U and Upside down U

U (∪)= “or”/addition

Upside Down U (∩)= “and”/ multiply

New cards

What is the sum of all probabilities

ONE

New cards

what are disjoint/ mutually exclusive events

Events that cannot happen at the same time.

If events are disjoint then P(A or B)= P(A)+P(B)

New cards

what are independent events

Events where the occurrence of one does not affect the probability of the other. For independent events, P(A and B) = P(A) × P(B).

New cards

If events are not disjoint (“or” formula)

P(A or B) = P(A) + P(B) - P(A and B)

New cards

If events are not independent (“and” formula)

P(A and B)= P(A) x P(B|A)
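
The general “or” and “and” formulas, sketched as two small functions. The function names and the numbers (including the card-drawing example) are illustrative only.

```python
def p_or(p_a, p_b, p_a_and_b):
    """General addition rule: subtract the overlap so it is not counted twice."""
    return p_a + p_b - p_a_and_b

def p_and(p_a, p_b_given_a):
    """General multiplication rule for events that are not independent."""
    return p_a * p_b_given_a

either = p_or(0.5, 0.4, 0.2)        # about 0.7
two_aces = p_and(4 / 52, 3 / 51)    # two aces in a row, no replacement
```

If the events are disjoint, P(A and B) is 0 and `p_or` reduces to P(A) + P(B); if they are independent, P(B|A) is just P(B).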

New cards

when to use geometric pdf

when looking for the probability of an EXACT place of the first success

New cards

when to use geometric cdf (up to place of success)

to find probability of 1st success happening on or before a certain place

New cards

when to use 1- geometric cdf (lower part you don’t want)

to find probability of 1st success happening on or after a certain place
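
A sketch of the three geometric calculations. The helper names mirror the calculator functions but are written from the standard formulas here, and p = 0.2 is just an example value.

```python
def geomet_pdf(p, k):
    """P(first success is exactly on trial k)."""
    return (1 - p) ** (k - 1) * p

def geomet_cdf(p, k):
    """P(first success is on or before trial k)."""
    return 1 - (1 - p) ** k

p = 0.2
exactly_third = geomet_pdf(p, 3)        # about 0.128
by_third = geomet_cdf(p, 3)             # about 0.488
third_or_later = 1 - geomet_cdf(p, 2)   # about 0.64
```

Note that "on or after the 3rd trial" subtracts the cdf through trial 2, the lower part you do not want.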

New cards

when to use binomial pdf

to find probability of an exact number of successes

New cards

when to use binomial cdf (up to and including the # of successes you want)

to find the probability of less than or at most a certain # of successes

New cards

when to use 1- binomial cdf (lower part you do not want)

to find the probability of more than or at least a certain # of successes
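
The binomial versions, sketched the same way. The helpers are written from the standard binomial formula, and n = 10, p = 0.5 are example values.

```python
from math import comb

def binom_pdf(n, p, k):
    """P(exactly k successes in n trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(n, p, k):
    """P(at most k successes in n trials)."""
    return sum(binom_pdf(n, p, i) for i in range(k + 1))

n, p = 10, 0.5
exactly_3 = binom_pdf(n, p, 3)
at_most_3 = binom_cdf(n, p, 3)
at_least_4 = 1 - binom_cdf(n, p, 3)  # subtract the part you do not want (0 to 3)
```

For "less than 4" you would also use the cdf through 3, since the cdf includes its upper value.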

New cards

How to find an expected value

multiply each x by its probability and add the products together.

x₁(p₁)+x₂(p₂)+…= μ_x
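
A minimal sketch of the expected value calculation, using a made-up probability distribution. `expected_value` is an illustrative name.

```python
def expected_value(outcomes, probabilities):
    """Sum of each outcome times its probability."""
    if abs(sum(probabilities) - 1) > 1e-9:
        raise ValueError("probabilities must add up to 1")
    return sum(x * p for x, p in zip(outcomes, probabilities))

# Illustrative distribution: number of heads in two fair coin flips.
mu = expected_value([0, 1, 2], [0.25, 0.5, 0.25])
```

Here the expected value works out to 0(0.25) + 1(0.5) + 2(0.25) = 1.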

New cards

What is center affected by

addition, subtraction, multiplication, and division

New cards

What are measures of spread (standard deviation) affected by

Multiplication and division

New cards

How do you combine means together

add or subtract them like normal

New cards

how do you combine (add or subtract) standard deviations

std. dev of (x ± y) = sqrt. of ((std. dev x)² + (std. dev y)²), for independent x and y. The variances add even when you subtract.
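
A sketch of combining two independent random variables. The means and standard deviations are invented, and `combine_independent` is an illustrative helper that assumes independence.

```python
from math import sqrt

def combine_independent(mean_x, sd_x, mean_y, sd_y, subtract=False):
    """Mean and standard deviation of X + Y (or X - Y) for independent X, Y."""
    mean = mean_x - mean_y if subtract else mean_x + mean_y
    sd = sqrt(sd_x**2 + sd_y**2)  # variances add either way
    return mean, sd

mean_diff, sd_diff = combine_independent(10, 3, 4, 4, subtract=True)
```

The means subtract (10 - 4 = 6), but the standard deviation is sqrt(9 + 16) = 5, not 3 - 4.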

New cards

describing the shape of a sampling distribution (for proportions)

Shape: unimodal, symmetric

Outliers: N/A

Center: μ_{p hat}= p (aka population proportion)

Spread: standard deviation of the sampling distribution = sqrt. of (p(1-p)/n)

New cards

describe the distribution (for means)

Shape: unimodal, symmetric

Outliers: N/A

Center: μ_{x bar}= μ (aka population mean)

Spread: standard deviation of the sampling distribution = (population standard deviation)/ sqrt. of n

New cards

Assumptions and conditions (proportions)

Random- stated or assumed

Independent- n ≤ 10% of population

Large Enough- np ≥ 10

n(1-p) ≥ 10
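
A rough sketch that checks the 10% and large-enough conditions and returns the center and spread of the sampling distribution of p hat. Randomness cannot be checked from numbers alone, so it is left out. The function name and the inputs are made up.

```python
from math import sqrt

def p_hat_model(p, n, population_size):
    """Center, spread, and condition checks for the sampling distribution of p hat."""
    checks = {
        "independent (10% condition)": n <= 0.10 * population_size,
        "large enough": n * p >= 10 and n * (1 - p) >= 10,
    }
    center = p
    spread = sqrt(p * (1 - p) / n)
    return center, spread, checks

center, spread, checks = p_hat_model(p=0.5, n=100, population_size=5000)
```

With p = 0.5 and n = 100 both conditions pass and the spread is about 0.05.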

New cards

Assumptions and conditions (means)

Random- stated or assumed

Independent- n ≤ 10% of population

Large Enough- n >30

New cards

what is PANIC used for and what does it stand for

runs a confidence interval

P- State the parameter(s) (μ= or p=)

A- Assumptions and Conditions

N- Name of the interval

I- find/calculate the interval

C- write conclusion

New cards

what to do if n isn’t >30 (means)

make a frequency dot plot and determine if it is

S- Somewhat Symmetric

U- Unimodal

N- No outliers

if it is and other conditions are met then approx. normal model applies

New cards

calculating a one proportion z-interval

statistic ± z*(standard error), i.e. p hat ± z*(sqrt. of (p hat(1-p hat)/n))

New cards

to find critical value (z-distribution)

invnorm(one-tail area)

ex. z* for 95% confidence interval= invnorm(.025) = -1.96, so z* = 1.96 (use the positive value)
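
The same critical value can be found outside the calculator. This sketch uses Python's standard-library `NormalDist`; `z_star` is an illustrative helper name.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Positive critical value for a confidence level such as 0.95."""
    one_tail = (1 - confidence_level) / 2
    # inv_cdf of the lower tail is negative, so take the absolute value.
    return abs(NormalDist().inv_cdf(one_tail))

z_95 = z_star(0.95)  # about 1.96
z_99 = z_star(0.99)  # about 2.576
```

A higher confidence level leaves a smaller tail, which gives a larger critical value.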

New cards

one proportion z-interval in calculator

x: number of successes

n: total number of trials

c-level: desired confidence level
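
For checking calculator output by hand, here is a sketch of what a one proportion z-interval computes from the same three inputs. The function name and the example counts are made up.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_interval(x, n, c_level):
    """Confidence interval for a proportion: p hat +/- z* times standard error."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Illustrative data: 60 successes in 100 trials, 95% confidence.
low, high = one_prop_z_interval(x=60, n=100, c_level=0.95)
```

With these inputs the interval is roughly 0.504 to 0.696.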


New cards

two proportion z-interval formula

(p hat₁ - p hat₂) ± z*(sqrt. of (p hat₁(1-p hat₁)/n₁ + p hat₂(1-p hat₂)/n₂))

New cards

two proportion z interval calculator

x1: number of successes

n1: total number of trials

x2: number of successes

n2: total number of trials

c-level: desired confidence level
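
A sketch of the two proportion z-interval from the same inputs, using the unpooled standard error. The function name and the counts are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, c_level):
    """Confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - c_level) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Illustrative data: 60/100 successes vs. 45/100 successes.
low, high = two_prop_z_interval(60, 100, 45, 100, 0.95)
zero_in_interval = low <= 0 <= high
```

With these counts the interval is roughly 0.013 to 0.287, so zero is not in the interval.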


New cards

Confidence interval 1 proportion conclusion

I am ___% confident that the true proportion of ____ is between ___% and ___%

New cards

Confidence interval 2 proportion conclusion

I am ___% confident that the true proportion of ____ is between ___% and ___% lower/higher for ___.

AND

Since zero is not in the interval there is evidence of a significant diff. between ___ and ____

Since zero is in the interval there is not evidence of a significant diff. between ___ and ___.

New cards

Effects of higher confidence level

larger critical value and wider interval

New cards

what to do if no value for p (or p hat) is given

“don't cry, use .5”

New cards

What is PHANTOMS and what does it stand for

model for hypothesis test

P- parameter(s) (p= or μ=)

H- Hypothesis (H_o= 0 or no difference, H_a= new claim)

A- Assumptions (get RIL)

N- name of test

T- find test statistic

O- obtain p-value

M- make decision (reject or fail to reject H_o)

S- state conclusion

New cards

one proportion z-test formula

z = (p hat - p₀) / sqrt. of (p₀(1-p₀)/n), where p₀ is the proportion claimed in H_o

New cards

two proportion z test calculator

x1: number of successes

n1: total number of trials

x2: number of successes

n2: total number of trials

p1: ≠p2, <p2, >p2
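
A sketch of the calculation behind a two proportion z-test. It assumes the usual pooled proportion for the standard error, since H_o says the two proportions are equal. The function name, the `alternative` labels, and the counts are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """z statistic and p-value for H_o: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    cdf = NormalDist().cdf(z)
    if alternative == "less":        # H_a: p1 < p2
        p_value = cdf
    elif alternative == "greater":   # H_a: p1 > p2
        p_value = 1 - cdf
    else:                            # H_a: p1 != p2
        p_value = 2 * min(cdf, 1 - cdf)
    return z, p_value

z, p_value = two_prop_z_test(60, 100, 45, 100)
reject_null = p_value < 0.05
```

With these counts z is about 2.12 and the two-sided p-value is about 0.034, so H_o would be rejected at alpha = .05.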

New cards

when to reject H_o

when p-value is < alpha (.05)

New cards

when to fail to reject H_o

when p-value is > alpha (.05)

New cards

when to use t- distribution

when we do not have the standard deviation of the population.

New cards

how to calculate degrees of freedom (t-distribution)

df=n-1

New cards

one mean t-interval formula

x̅ ± t*(S_x / sqrt. of n), with df= n-1

New cards

computing critical value (t distribution)

invT(one-tail area, df)

New cards

one mean t-interval calculator

STATS-

x̅: sample average

S_x: sample standard deviation

n: sample size

c-level: desired confidence level

IF GIVEN TABLE-

enter data into L1 and L2 and use “Data” instead


New cards

hypothesis test conclusion

Since p value is ___ I reject/fail to reject H_o.

There is/is not significant evidence supporting H_a (in context)

New cards

two mean t-interval formula

(x̅₁ - x̅₂) ± t*(sqrt. of (S₁²/n₁ + S₂²/n₂))

New cards

two mean t-interval conclusion

I am ___% confident that the true average of ____ is between ___ and ___ lower/higher for ___.

AND

Since zero is not in the interval there is evidence of a significant diff. between ___ and ____

Since zero is in the interval there is not evidence of a significant diff. between ___ and ___.

New cards

one mean t-test formula

t = (x̅ - μ₀) / (S_x / sqrt. of n), with df= n-1, where μ₀ is the mean claimed in H_o

New cards

one mean t test calculator

μ₀: hypothesized mean from H_o

x̅: sample mean

S_x: sample standard deviation

n: sample size

μ: ≠μ₀, <μ₀, >μ₀

New cards

2 sample t test calculator

New cards

when to use matched pairs

1 sample of subjects and 2 pieces of related data. You are interested in the DIFFERENCE between the paired data. (In your calculator you will use a 1 mean t-test/interval and enter the differences as the data.)
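
A sketch of the matched pairs idea: take the difference within each pair, then treat the differences as one sample. The before/after numbers are invented, and the test statistic assumes H_o: the mean difference is 0.

```python
from math import sqrt
from statistics import mean, stdev

# Illustrative paired data: the same 5 subjects measured twice.
before = [12, 15, 11, 14, 13]
after = [14, 18, 12, 15, 16]

differences = [a - b for a, b in zip(after, before)]  # one number per subject
d_bar = mean(differences)   # sample mean of the differences
s_d = stdev(differences)    # sample standard deviation of the differences
n = len(differences)

t_stat = d_bar / (s_d / sqrt(n))  # one mean t statistic with H_o: mean difference = 0
df = n - 1
```

Here the differences are 2, 3, 1, 1, 3, so the mean difference is 2, the standard deviation is 1, and t is about 4.47 with 4 degrees of freedom.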

New cards

Matched pairs interval conclusion

I am ___% confident that the average difference in ___ is between ___ and ___ lower/higher for ___

New cards

Matched pairs test conclusion

Since p-value is ___ I reject/ fail to reject H_o.

There is/ is not evidence that the average difference in ___ is significantly different.

New cards

Assumptions for two samples of data

Get RIL for each data sample AND include that the samples are independent of each other (in context)

New cards

Type 1 error

Rejecting the null hypothesis when it is actually true.

New cards

Type 2 error

Not rejecting the null hypothesis when it is actually false.

New cards

What is Power

the probability of rejecting the null hypothesis when it is false

New cards

What is the relationship between Power and Beta

Power+Beta=1

New cards

What is Beta

Probability of a Type 2 error

New cards

What is alpha

probability of a Type 1 error

New cards

Effects of increasing alpha

probability of a Type 1 error increases

probability of a type 2 error decreases

Power increases

New cards

Effects of increasing sample size

probability of type 1 error stays the same

probability of type 2 error decreases

power increases

New cards

when to run chi square Goodness of fit test

one sample
one categorical variable

H_o= the claimed distribution is correct

New cards

when to run chi square test of homogeneity

2 samples (from different populations)
one categorical variable (checking for sameness)

H_o= each population has the same proportion in every category of the variable

New cards

Expected values for chi square test formula

((row total)(column total))/(table total)

New cards

Chi square degrees of freedom formula

df= (number of rows-1)(number of columns-1) for a two-way table; for goodness of fit, df= number of categories-1
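
A sketch of the expected counts, the chi square statistic, and the degrees of freedom for a two-way table. The observed counts and the function names are made up for illustration.

```python
def expected_counts(table):
    """Expected count for each cell: (row total)(column total)/(table total)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_stat(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

observed = [[30, 20], [20, 30]]  # illustrative 2x2 table
expected = expected_counts(observed)
stat = chi_square_stat(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)
```

Every expected count here is (50)(50)/100 = 25, the statistic is 4, and df = (2-1)(2-1) = 1.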

New cards

when to run chi square test of independence

one sample
two categorical variables (sorted by categories)

H_o= the two variables are independent of each other/ no association/ no relationship

New cards

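Empirical rule

In a normal model, about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.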