AP Stats Ultimate Flashcards

Description and Tags

Math Medic Flashcards

Statistics

AP Statistics

110 Terms

New cards

one-sample t-interval for μ

x̄ ± t* (s/√n), df = n-1
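
A minimal sketch of the interval formula above, assuming the critical value t* has already been looked up for df = n - 1 (it is passed in, not computed). The function name and the sample numbers are hypothetical.

```python
import math

def one_sample_t_interval(xbar, s, n, t_star):
    """Return (lower, upper) for xbar ± t* · s/√n.

    t_star is the critical value for df = n - 1, taken from a t-table
    or calculator; this sketch does not compute it.
    """
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample: n = 25, x̄ = 50, s = 10.
# For 95% confidence and df = 24, a t-table gives t* ≈ 2.064.
low, high = one_sample_t_interval(50, 10, 25, 2.064)
print(round(low, 3), round(high, 3))
```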

New cards

one-sample t-test for μ

t = (x̄ - μ) / (s / √n), df = n - 1

New cards

one-sample z-interval for p

p̂ ± z* √(p̂(1-p̂) / n)
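
One way to sketch this interval in Python, using only the standard library. The function name and the counts (60 successes out of 100) are made up for illustration; z* comes from the standard normal inverse CDF.

```python
import math
from statistics import NormalDist

def one_prop_z_interval(successes, n, confidence=0.95):
    """Return (lower, upper) for p̂ ± z* · √(p̂(1-p̂)/n)."""
    p_hat = successes / n
    # z* leaves (1 - confidence)/2 in each tail of the standard normal curve
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical data: 60 successes in a random sample of 100
low, high = one_prop_z_interval(60, 100)
print(round(low, 3), round(high, 3))
```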

New cards

one-sample z-test for p

z = (p̂ - p) / √(p(1-p) / n)
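
A short sketch of the test statistic above, with a two-sided p-value added for illustration. The function name, the counts, and the choice of a two-sided alternative are assumptions, not part of the card.

```python
import math
from statistics import NormalDist

def one_prop_z_test(successes, n, p0):
    """Return (z, two-sided p-value) for H0: p = p0."""
    p_hat = successes / n
    # The standard error uses the hypothesized p0, not p̂
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5
z, p_value = one_prop_z_test(60, 100, 0.5)
print(round(z, 2), round(p_value, 4))
```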

New cards

two-sample t-interval for μ1 - μ2

(x̄1 - x̄2) ± t* √( (s1²/n1) + (s2²/n2) )

New cards

two-sample t-test for μ1 - μ2

t = (x̄1 - x̄2) / √( (s1²/n1) + (s2²/n2) )

New cards

two-sample z-interval for p1 - p2

(p̂1 - p̂2) ± z* √( (p̂1(1-p̂1) / n1) + (p̂2(1-p̂2) / n2) )

New cards

two-sample z-test for p1 - p2

z = (p̂1 - p̂2) / √( (p̂c(1-p̂c) / n1) + (p̂c(1-p̂c) / n2) ), where p̂c = (X1 + X2) / (n1 + n2)
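
The pooled calculation can be sketched as follows. The function name and the counts are hypothetical; the code only mirrors the formula on this card.

```python
import math

def two_prop_z_statistic(x1, n1, x2, n2):
    """Return (z, pooled proportion) for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled (combined) proportion: total successes over total sample size
    p_c = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    return (p1_hat - p2_hat) / se, p_c

# Hypothetical data: 60/100 successes in group 1, 45/100 in group 2
z, p_c = two_prop_z_statistic(60, 100, 45, 100)
print(round(z, 3), p_c)
```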

New cards

χ² test for homogeneity/independence

χ² = ∑ (observed - expected)² / expected, df = (rows - 1)(columns - 1)

New cards

χ² goodness of fit test

χ² = ∑ (observed - expected)² / expected, df = number of categories - 1

New cards

t-interval for slope

b ± t* SEb, df = n - 2

New cards

t-test for slope

t = (b - β) / SEb

New cards

point estimate

if a confidence interval is (A, B), the point estimate is the average of A and B, or the exact center of the confidence interval

New cards

margin of error

critical value * standard error of statistic, or B - (the point estimate) for an interval (A, B)
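
As a quick illustration, the point estimate and margin of error can be recovered from an interval (A, B) like this. The function name and the interval endpoints are made up.

```python
def point_estimate_and_margin(a, b):
    """Given an interval (a, b), return (point estimate, margin of error)."""
    point = (a + b) / 2   # exact center of the interval
    margin = b - point    # distance from the center to either endpoint
    return point, margin

# Hypothetical interval (0.52, 0.60)
point, margin = point_estimate_and_margin(0.52, 0.60)
print(round(point, 2), round(margin, 2))
```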

New cards

power

the probability a test will correctly reject the null hypothesis, given the alternative hypothesis is true

New cards

type 1 error

when the null hypothesis is true and rejected (false positive)

New cards

type 2 error

when the alternative hypothesis is true and the null hypothesis is not rejected (false negative)

New cards

interpret the confidence interval

We are C% confident that the confidence interval from [A] to [B] captures the population parameter (in context)

New cards

interpret the confidence level

In repeated random sampling with the same sample size, approximately C% of confidence intervals created will capture the population parameter.
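
A small simulation can make this interpretation concrete. The sketch below assumes a population proportion of 0.5, samples of size 100, and one-sample z-intervals for p; all names and settings are illustrative choices, and the capture rate will vary a little with the seed.

```python
import math
import random
from statistics import NormalDist

def coverage_rate(p=0.5, n=100, confidence=0.95, trials=2000, seed=1):
    """Fraction of simulated z-intervals for p that capture the true p."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    captured = 0
    for _ in range(trials):
        # Draw one random sample of size n and build its interval
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= p <= p_hat + margin:
            captured += 1
    return captured / trials

rate = coverage_rate()
print(rate)  # should land near 0.95, though not exactly on it
```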

New cards

interpret the p-value

a p-value is the probability of obtaining a test statistic as extreme or more extreme than the observed test statistic when the null hypothesis is assumed to be true

New cards

unbiased estimator

when estimating a population parameter, a statistic is unbiased if the center of the sampling distribution for the statistic is equal to the population parameter

New cards

conditions for a one-sample t-test and t-interval for μ

random: data comes from a random sample
10%: when sampling without replacement, n < 10% of the population size
normal: population distribution is normal, large sample (n > 30), or a dotplot of the sample data shows no strong skewness or outliers

New cards

conditions for a one-sample z-test and z-interval for p

random: data comes from a random sample

10%: when sampling without replacement, n < 10% of the population size

large counts: np > 10 and n(1-p) > 10 for a test, np̂ > 10 and n(1-p̂) > 10 for an interval

New cards

conditions for a two-sample t-test and t-interval for μ1 - μ2

random: data come from independent random samples or 2 groups in a randomized experiment

10%: when sampling without replacement, n < 10% of the population size for both samples

normal: for both populations, either the population distribution is normal, large sample (n > 30), or a dotplot of the sample data shows no strong skewness or outliers

New cards

conditions for a two-sample z-test and z-interval for p1 - p2

random: data come from independent random samples or 2 groups in a randomized experiment.

10%: when sampling without replacement, n < 10% of the population size for both samples

large counts: n1p̂c > 10, n1(1-p̂c) > 10, n2p̂c > 10, n2(1-p̂c) > 10 for a test, n1p̂1 > 10, n1(1-p̂1) > 10, n2p̂2 > 10, n2(1-p̂2) > 10 for an interval.

New cards

conditions for a χ² test

random: data from a random sample, separate random samples, or groups in a randomized experiment

10%: when sampling without replacement, n < 10% of the population size for all samples
large counts: all expected counts must be at least 5

New cards

conditions for a t-test or t-interval for slope

linear: true relationship between the variables is linear

independent: observations are independent; check the 10% condition when sampling without replacement

normal: responses vary normally around the regression line for all x-values

equal variance: responses have equal variance around the regression line for all x-values

random: data from a random sample or randomized experiment

LINER

New cards

why do we check conditions?

random: so we can generalize to the population from which the sample was selected

10%: so sampling without replacement is okay and we can use the stated formula for standard deviation

normal/large sample: so the sampling distribution is approximately normal

New cards

parameter

a number that describes the population (μ, p, σ)

New cards

statistic

a number that describes the sample (x̄, p̂, s)

New cards

population distribution

distribution of responses for every individual of the population

New cards

sample distribution

the distribution of responses for a single sample

New cards

sampling distribution

the distribution of values for the statistic for all possible samples of a given size from a given population

New cards

calculator function for one-sample t-interval for μ

T-Interval

New cards

calculator function for one-sample t-test for μ

T-Test

New cards

calculator function for a one-sample z-interval for p

1-PropZInt

New cards

calculator function for a one-sample z-test for p

1-PropZTest

New cards

what factors affect the width of a confidence interval?

decreases as n increases, increases as the confidence level increases

New cards

how do i make a decision based on a p-value?

if the p-value ≤ α, reject the null hypothesis

if the p-value > α, fail to reject the null hypothesis

New cards

what does it mean to reject the null hypothesis?

there is convincing statistical evidence to support the alternative hypothesis

New cards

what does it mean to fail to reject the null hypothesis?

there is not convincing statistical evidence to support the alternative hypothesis

New cards

what is the probability that a specific confidence interval captures the population parameter?

0 or 1, a confidence interval calculated from sample data either does or does not capture the population parameter

New cards

how to calculate expected counts in a χ² test for homogeneity/independence

(row total)(column total)/table total
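
The expected-count rule can be sketched for a whole two-way table, together with the χ² statistic it feeds into. The function names and the 2×2 table are hypothetical.

```python
def expected_counts(table):
    """Expected count for each cell: (row total)(column total) / table total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical 2x2 table of observed counts
observed = [[30, 20],
            [20, 30]]
expected = expected_counts(observed)               # every cell: 50 * 50 / 100 = 25
chi_sq = chi_square_statistic(observed, expected)  # four cells, each 25/25 = 1
print(expected, chi_sq)
```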

New cards

how to choose the right inference procedure

does the scenario describe mean(s), proportion(s), counts, or slope?

does the scenario describe one sample, two samples, or paired data?

does the scenario describe a test or a confidence interval?

New cards

describing a distribution

shape (skewed, bimodal, symmetric, etc.)

center (mean, or median if the distribution is skewed)

spread (variability, standard deviation for mean and interquartile range for median)

outliers (or potential outliers if you are estimating)

New cards

outlier rule

any value that falls more than 1.5IQR above Q3 or below Q1

Lower outliers < Q1 - 1.5(IQR)

Upper outliers > Q3 + 1.5(IQR)
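
A sketch of the 1.5 × IQR rule. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data (leaving out the overall median when n is odd); other software may use a different quartile rule and give slightly different fences. The function names and data are made up.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as medians of the lower and upper halves."""
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]
    # For odd n, skip the middle value so it belongs to neither half
    upper = values[half:] if len(values) % 2 == 0 else values[half + 1:]
    return median(lower), median(upper)

def outliers(data):
    """Values below Q1 - 1.5(IQR) or above Q3 + 1.5(IQR)."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

data = [2, 4, 5, 6, 7, 8, 9, 30]   # hypothetical data set
print(quartiles(data), outliers(data))
```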

New cards

how can we use a graph to compare the mean and the median?

skewed left, mean < median

roughly symmetric, mean ≈ median

skewed right, mean > median

New cards

interpreting the standard deviation

gives the typical distance of the values from the mean; the farther the values tend to fall from the mean, the larger the standard deviation

New cards

how do we describe the relationship between two variables (like in a scatterplot)

direction - positive or negative

unusual values - outliers, influential observations

form - linear or curved

strength - weak or strong

DUFS

New cards

how to find the mean, standard deviation, and 5 number summary using a calculator

stat → edit → enter data in L1 → stat → calc → 1-Var Stats → leave FreqList blank and calculate

New cards

how to calculate a least squares regression line using a calculator

stat → edit → enter x-values in L1 and y-values in L2 → stat → calc → LinReg(a+bx) → leave FreqList blank and calculate
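
For comparison with the calculator output, here is a sketch of the same least squares line ŷ = a + bx computed by hand in Python. The function name and the five data points are hypothetical.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # the line passes through (x-bar, y-bar)
    return a, b

xs = [1, 2, 3, 4, 5]   # would go in L1
ys = [2, 4, 5, 4, 5]   # would go in L2
a, b = least_squares_line(xs, ys)
print(round(a, 2), round(b, 2))
```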

New cards

What is the IQR (interquartile range)?

the difference between the third and first quartiles. Q1 and Q3 form the boundaries for the middle 50% of values in an ordered data set.

New cards

interpret the y-intercept of the least squares regression line

the predicted value of y in context when x in context is 0 is (a)

New cards

interpret the slope of the least squares regression line

for every increase of 1 unit of x in context, the predicted output of y in context increases/decreases by (b)

New cards

properties of correlation r

unitless, always between -1 and 1, greatly affected by regression outliers.

if the direction is negative, so is r. if the direction is positive, so is r.

the closer r is to -1 or 1, the stronger the relationship. the closer that r is to 0, the weaker the relationship.

gives the strength and direction of the linear relationship between 2 quantitative variables, does not apply for non-linear relationships

New cards

interpret the coefficient of determination r²

gives the percent of the variation of y in context that is explained by the least squares regression line using x = x in context

New cards

regression outlier

a point that does not follow the general trend shown in the rest of the data and has a large residual

New cards

influential point

any point that, if removed, changes the relationship substantially (creates big changes to slope and/or y-intercept)

New cards

high-leverage point

has a substantially larger or smaller x-value than the other observations have

New cards

discrete variables

can take on a countable number of values, whether they are finite or infinite

New cards

continuous variables

can take on infinitely many values, but those values cannot be counted

New cards

categorical variable

takes on values that are category names or group labels

New cards

quantitative variable

one that takes on numerical values for a measured or counted quantity

New cards

control group

a collection of experimental units that are either not given a treatment of interest or given a treatment with an inactive substance (placebo) to provide a baseline to which the treatment groups can be compared, so it can be determined if the treatments have an effect

New cards

single-blind experiment

subjects do not know which treatment they are receiving, but members of the research team do, or vice versa

New cards

double-blind experiment

neither the subjects nor the members of the research team who interact with them know which treatment a subject is receiving

New cards

explanatory variable

a variable whose levels are manipulated intentionally

New cards

response variable

an outcome that is measured after the treatments have been administered

New cards

non-random (poor) sampling methods

convenience sampling and voluntary response sampling because they do not use chance to select the individuals

New cards

experimental units

animals or objects in an experiment

New cards

subjects

humans in an experiment

New cards

nonresponse bias

selected people do not respond

New cards

undercoverage

systematically excluding people from being able to be selected

New cards

response bias

providing inaccurate responses (on purpose or by accident)

New cards

wording issues

confusing wording or question is slanted towards a particular response

New cards

can increasing the sample size correct a biased sampling method?

no, you’ll just get a bigger flawed sample.

New cards

bias

the systematic tendency to overestimate or underestimate the true population parameter

New cards

observational study

no treatment imposed

New cards

experiment

imposed treatment on experimental units or subjects

New cards

can the results be generalized to a larger population?

the results can only be generalized to the population from which the sample/subjects were randomly selected. if the sample/subjects were not randomly selected then the results can only be generalized to “people like the ones in the study”

New cards

cause and effect

if the researchers randomly assigned the subjects to treatment groups, you can make conclusions about cause and effect.

if the researchers did not randomly assign subjects to treatment groups, you cannot say that the explanatory variable caused the change in the response variable.

New cards

stratified random sample

the population is divided into separate groups (strata) based on shared attributes or characteristics (homogeneous grouping), and a simple random sample is selected from within each stratum

New cards

simple random sample (SRS)

a sample in which every group of a given size has an equal chance of being chosen

New cards

systematic random sample

sample members from a population are selected according to a random starting point and a fixed, periodic interval

New cards

confounding variable

related to the explanatory variable and influences the response variable and makes it challenging to determine cause and effect

New cards

completely randomized design

treatments are assigned to experimental units completely at random. random assignment tends to create roughly equivalent groups, so that differences in responses can be attributed to the treatments

New cards

a well designed experiment should include

comparisons of at least 2 treatment groups, one of which could be a control group

random assignment of treatments to experimental units

replication: enough experimental units in each treatment group to be able to detect a difference

control of potential confounding variables, where appropriate

New cards

matched pairs design

a special case of randomized block design in which the blocking variable is used to arrange subjects in pairs matched on one or more relevant factors. every pair receives both treatments: one treatment is randomly assigned to one member of the pair, and the remaining treatment goes to the second member. alternatively, each subject may receive both treatments.

New cards

randomized block design

treatments are assigned completely at random within each block. for each block, individuals are similar to each other with respect to at least one blocking variable in order to reduce variability of results within each treatment group and to eliminate the possibility of the blocking variable as a confounding variable

New cards

mean and standard deviation of a binomial distribution

μ_X = np

σ_X = √(np(1-p))

n is the number of trials, p is the probability of success
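
A minimal sketch of these two formulas, with the standard deviation taken as the square root of the whole product np(1-p). The function name and the values n = 100, p = 0.5 are only for illustration.

```python
import math

def binomial_mean_sd(n, p):
    """Mean and standard deviation of a binomial random variable."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))  # square root covers the whole product
    return mean, sd

mu, sigma = binomial_mean_sd(100, 0.5)
print(mu, sigma)
```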

New cards

mean and standard deviation of a geometric distribution

μ_X = 1/p

σ_X = √(1-p) / p

New cards

formula for the binomial probability P(X = x)

P(X = x) = nCx · p^x · (1-p)^(n-x)

n is the number of trials, p is the probability of success, x is the number of successes
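
The binomial formula can be sketched with `math.comb` standing in for nCx. The exponent on (1-p) is n - x, the number of failures. The function name and the example (3 successes in 5 trials with p = 0.5) are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x): exactly x successes in n independent trials."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

prob = binomial_pmf(3, 5, 0.5)   # 10 * (1/8) * (1/4)
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))
print(prob, total)   # probabilities over x = 0..5 should add to 1
```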

New cards

formula for the geometric probability P(X = x)

P(X = x) = (1-p)^(x-1) · p
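
A sketch of the geometric probability, read as x - 1 failures followed by the first success on trial x. The function name and the example values are made up.

```python
def geometric_pmf(x, p):
    """P(X = x): the first success happens on trial x."""
    return (1 - p) ** (x - 1) * p   # x - 1 failures, then one success

prob = geometric_pmf(3, 0.5)   # 0.5 * 0.5 * 0.5
print(prob)
```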

New cards

conditions for a binomial random variable

binary: two outcomes for each trial (success or failure)

independent: each trial is independent of the next

number of trials is a fixed number n

same probability of success for each trial p

New cards

conditions for a geometric random variable

binary: two outcomes for each trial (success or failure)

independent: each trial is independent of the next

trials until a success (not a fixed number)

same probability of success for each trial p

New cards

probability of “at least 1”

P(At least 1) = 1 - P(none)
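
A small sketch of the complement rule. It assumes n independent trials that each succeed with the same probability p, so that P(none) = (1-p)^n; that assumption and the die-rolling example are not part of the card.

```python
def prob_at_least_one(p, n):
    """P(at least 1 success in n independent trials) = 1 - P(none)."""
    p_none = (1 - p) ** n
    return 1 - p_none

# Hypothetical example: at least one six in 4 rolls of a fair die
prob = prob_at_least_one(1 / 6, 4)
print(round(prob, 4))
```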

New cards

law of large numbers

simulated (empirical) probabilities tend to get closer to the true probability as the number of trials increases
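
A simulation sketch of this idea, assuming a true probability of 0.5 (like a fair coin). The function name, the checkpoints, and the seed are arbitrary choices; the early proportions can wander, but the later ones should sit close to 0.5.

```python
import random

def running_proportion(p, checkpoints, seed=0):
    """Simulated proportion of successes after each checkpoint trial count."""
    rng = random.Random(seed)
    successes = 0
    results = {}
    for trial in range(1, max(checkpoints) + 1):
        successes += rng.random() < p   # True counts as 1
        if trial in checkpoints:
            results[trial] = successes / trial
    return results

results = running_proportion(0.5, {10, 100, 10_000, 100_000})
for trials in sorted(results):
    print(trials, results[trials])
```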

New cards

how can i tell if two events are independent?

P(A | B) = P(A)

P(B | A) = P(B)

P(A and B) = P(A) * P(B)

New cards

calculating conditional probability

P(A | B) = P(A and B) / P(B)
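
A sketch of the formula above, followed by a check of whether P(A | B) matches P(A). The function name and the counts (100 students, 30 seniors, 40 athletes, 12 both) are invented for illustration.

```python
import math

def conditional_probability(p_a_and_b, p_b):
    """P(A | B) = P(A and B) / P(B)."""
    return p_a_and_b / p_b

# Hypothetical counts out of 100 students:
# A = senior (30), B = plays a sport (40), both (12)
p_a, p_b, p_a_and_b = 30 / 100, 40 / 100, 12 / 100

p_a_given_b = conditional_probability(p_a_and_b, p_b)
# If P(A | B) equals P(A), knowing B tells us nothing about A
independent = math.isclose(p_a_given_b, p_a)
print(round(p_a_given_b, 2), independent)
```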

New cards

mean and standard deviation of the sum of 2 independent random variables

μ_T = μ_X + μ_Y

σ_T = √(σ_X² + σ_Y²)
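
A minimal sketch for T = X + Y with X and Y independent: the means add, and the variances (not the standard deviations) add before taking the square root. The function name and the numbers are hypothetical.

```python
import math

def sum_mean_sd(mu_x, sigma_x, mu_y, sigma_y):
    """Mean and SD of T = X + Y for independent X and Y."""
    mu_t = mu_x + mu_y
    sigma_t = math.sqrt(sigma_x ** 2 + sigma_y ** 2)  # add variances, then root
    return mu_t, sigma_t

mu_t, sigma_t = sum_mean_sd(10, 3, 20, 4)
print(mu_t, sigma_t)   # the SD is 5, not 3 + 4 = 7
```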