pth percentile
value with p% of the observations less than it
5 number summary
minimum, Q1, median, Q3, maximum
outlier
any data value >= Q3 + 1.5 IQR, or <= Q1 - 1.5 IQR
interquartile range (IQR)
Q3 - Q1
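The five-number summary, the IQR, and the 1.5 IQR outlier rule can be sketched together in Python. The data values below are made up for illustration, and the quartiles use the "inclusive" method of `statistics.quantiles`, which is only one of several quartile conventions; a textbook or calculator may report slightly different Q1 and Q3 values.

```python
import statistics

# Hypothetical observations (not from any real data set)
data = [2, 4, 4, 5, 6, 7, 8, 9, 30]

# Quartiles under the "inclusive" convention (an assumption of this sketch)
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1
five_number_summary = (min(data), q1, median, q3, max(data))

# 1.5 * IQR rule: flag values at or beyond either fence
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x <= lower_fence or x >= upper_fence]

print(five_number_summary, iqr, outliers)
```

Here only the value 30 falls outside the fences, so it is the one flagged outlier.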
standardized value / z-score
given an observation x from a distribution with mean m and standard deviation s,
z = (x - m)/s
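A small Python sketch of standardizing an observation. The scores are hypothetical, and the sketch assumes the sample standard deviation (`statistics.stdev`) plays the role of s; with a known population standard deviation you would pass that instead.

```python
import statistics

# Hypothetical test scores
scores = [70, 75, 80, 85, 90]

m = statistics.mean(scores)   # mean of the distribution
s = statistics.stdev(scores)  # sample standard deviation (assumption)


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z = z_score(90, m, s)
print(round(z, 3))
```

An observation equal to the mean always standardizes to 0, and the sign of z tells you which side of the mean the observation is on.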
explanatory variable
independent variable; influences changes in the response variable
response variable
dependent variable; depends on the explanatory variable
parameter
a numerical characteristic of a whole population
statistic
a numerical measurement of a sample (rather than the population)
correlation coefficient (r)
measures the direction and strength of the linear association between two quantitative variables (-1 <= r <= 1)
regression line
straight line that describes how a response variable changes as the explanatory variable changes
square of the correlation (r^2)
the fraction of the variation in the response variable that is explained by the least-squares regression
residual
difference between an observed value of the response variable and the predicted value by the regression line:
residual = observed y - predicted y
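The correlation, least-squares regression line, r^2, and residuals can be computed by hand on a tiny data set. This is a sketch on made-up (x, y) pairs; the variable names (`sxx`, `sxy`, and so on) are illustrative choices, not notation from the cards.

```python
import math

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]  # explanatory variable
y = [2, 4, 5, 4, 5]  # response variable

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products about the means
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)       # correlation coefficient
slope = sxy / sxx                    # least-squares slope
intercept = mean_y - slope * mean_x  # least-squares intercept

predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]  # observed y - predicted y

r_squared = r ** 2
# Fraction of the variation in y explained by the line, computed directly
explained_fraction = 1 - sum(e ** 2 for e in residuals) / syy

print(slope, intercept, r_squared)
```

The two ways of computing the explained fraction agree (up to rounding), and the least-squares residuals sum to zero.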
lurking variable
a variable that is not among the explanatory or response variables but may influence the relationships among the variables
variance
the square of the standard deviation
Simpson's Paradox
the reversal of an apparent association when some variables are (dis)aggregated/combined
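A numeric illustration of the reversal, using invented success counts for two hypothetical treatments (A and B) split by case difficulty. The labels and numbers are assumptions chosen only to make the effect visible.

```python
# (successes, trials) for each hypothetical treatment within each group
counts = {
    "A": {"easy": (18, 20), "hard": (40, 80)},
    "B": {"easy": (64, 80), "hard": (8, 20)},
}

# Success rate of each treatment within each group
within = {
    group: {t: counts[t][group][0] / counts[t][group][1] for t in counts}
    for group in ("easy", "hard")
}

# Success rate of each treatment after combining the groups
overall = {
    t: sum(s for s, _ in counts[t].values()) / sum(n for _, n in counts[t].values())
    for t in counts
}

print(within)
print(overall)
```

Treatment A has the higher success rate in both the easy and the hard group, yet B has the higher rate once the groups are combined, because A was applied mostly to hard cases.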
confounding
variables whose effects on a response variable cannot be distinguished from each other
anecdotal data
represent individual cases and are not representative of any larger group of cases
available data
data that are easily accessible/available
sample survey
a study that collects data from a sample representing the larger population
census
a study that collects data from all cases in the population of interest
observational study
a study that observes individuals without influencing the responses
experiment
a study that imposes treatments on experimental units to observe their responses
experimental units / test subjects
the individuals on which the experiment is done
treatment
a change in the explanatory variable
outcome
measured variables that are used to compare the treatments
elements of experimental design
control, randomization, repetition
block design
forming blocks of experimental units that are similar in some way (ex: gender), then randomizing within each block
matched pair design
a special case of block design, where each block consists of two experimental units that share as many attributes as possible (ex: twins)
double-blind study
a study in which neither the experimenters nor the subjects know which treatment any subject has received
simple random sample (SRS)
sampling method where all groups of size n from the population have an equal probability of being chosen
stratified random sample
sampling method that separates the population into different strata based on some attribute (ex: socioeconomic level), chooses an SRS from each stratum, and combines these samples according to the makeup of the population
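A sketch of drawing an SRS and a proportionally allocated stratified sample with Python's `random` module. The population, the "urban"/"rural" strata, the seed, and the helper name `stratified_sample` are all hypothetical. Proportional allocation with rounding is one possible way to combine the strata; for other population makeups the rounded counts may not add up exactly to the requested total.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 60 urban units and 40 rural units
population = [
    {"id": i, "stratum": "urban" if i < 60 else "rural"} for i in range(100)
]

# SRS: every group of 10 units is equally likely to be chosen
srs = rng.sample(population, 10)


def stratified_sample(pop, key, total_n, rng):
    """Take an SRS from each stratum, sized in proportion to the stratum."""
    strata = {}
    for unit in pop:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        k = round(total_n * len(members) / len(pop))  # proportional allocation
        sample.extend(rng.sample(members, k))
    return sample


strat = stratified_sample(population, "stratum", 10, rng)
print(sum(u["stratum"] == "urban" for u in strat), "urban of", len(strat))
```

The stratified sample always contains 6 urban and 4 rural units here, while the SRS only matches that makeup on average.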
multistage random sample
sampling method that selects successively smaller groups within the population in stages (ex: states -> cities -> districts -> schools -> students)
sample proportion (p̂)
proportion of 'successes' in a sample; a statistic
population proportion (p)
proportion of 'successes' in a population; a parameter whose true value is generally unknown
bias
center of the sampling distribution is not equal to the true value of the parameter
variability
the spread of the sampling distribution
institutional review board
a board that reviews all planned studies in advance to protect subjects from possible harm
informed consent
giving potential participants enough information about a study to enable them to choose whether they wish to participate
confidentiality
only the researchers can identify responses of individual subjects
anonymity
subjects are anonymous, so their names are not known even to the director of the study
random
individual outcomes of a phenomenon are uncertain, but these outcomes occur a predictable percentage of the time in large numbers of independent trials
probability
the proportion of times the event occurs in many repeated trials of a random phenomenon
independent
the outcome of one trial does not influence the outcome of any other trial
sample space
the set of all possible outcomes of the random phenomenon
event
a set of outcomes
complement of an event A
exactly the outcomes that are not in A
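Sample space, event, complement, and probability can be illustrated with two dice. This sketch assumes fair dice, so every outcome is equally likely and a probability is just a count of outcomes divided by the size of the sample space; the names `event_a` and `prob` are illustrative.

```python
from fractions import Fraction
from itertools import product

# Sample space: all ordered pairs from rolling two six-sided dice
sample_space = set(product(range(1, 7), repeat=2))

# Event A: the two dice sum to 7
event_a = {outcome for outcome in sample_space if sum(outcome) == 7}

# Complement of A: exactly the outcomes that are not in A
complement_a = sample_space - event_a


def prob(event):
    """Probability under the equally-likely-outcomes assumption."""
    return Fraction(len(event), len(sample_space))


print(prob(event_a), prob(complement_a))
```

The probabilities of an event and its complement always add to 1.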
binomial experiment
an experiment with a fixed number of independent trials, each of which results in either a success or a failure, with the same probability of success on every trial
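For a binomial experiment with n independent trials and success probability p on each trial, the probability of exactly k successes is C(n, k) p^k (1 - p)^(n - k). The card does not state this formula; it is the standard binomial probability, sketched here with hypothetical values n = 10 and p = 0.5.

```python
import math


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.5  # hypothetical: 10 fair coin flips

prob_five = binomial_pmf(5, n, p)
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(prob_five, total)
```

The probabilities over all possible success counts 0 through n sum to 1.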
Central Limit Theorem
a simple random sample of size n from a population with mean μ and standard deviation σ is drawn. when n is large, the sampling distribution of the sample mean x̄ is approximately N(μ, σ/sqrt(n))
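A simulation can make the Central Limit Theorem concrete. The sketch below assumes a strongly right-skewed population (exponential with μ = 1 and σ = 1), a sample size of n = 40, 5000 repeated samples, and a fixed seed; all of these are arbitrary choices for illustration.

```python
import math
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

mu, sigma = 1.0, 1.0  # mean and sd of an exponential(rate=1) population
n = 40                # size of each sample
reps = 5000           # number of repeated samples

# Draw many samples and record the sample mean of each one
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_sd = sigma / math.sqrt(n)

print(mean_of_means, sd_of_means, theoretical_sd)
```

Even though the population is skewed, the simulated sample means center near μ with spread close to σ/sqrt(n); a histogram of `sample_means` would look roughly normal.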
standard error
the statistic s / sqrt(n), where s is the sample standard deviation and n is the size of the sample
level C confidence interval
an interval computed from sample data and centered on the sample estimate (ex: the sample mean). the method used to build a level C confidence interval has probability C of producing an interval that contains the true value of the population parameter
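A sketch of a standard error and a 95% confidence interval for a population mean, using hypothetical summary statistics (x̄ = 80, s = 12, n = 36). It assumes the normal critical value z* is acceptable; for small samples a t critical value is the usual choice, and that is not shown here.

```python
import math
from statistics import NormalDist

# Hypothetical summary statistics from one sample
x_bar = 80.0  # sample mean
s = 12.0      # sample standard deviation
n = 36        # sample size
C = 0.95      # confidence level

se = s / math.sqrt(n)  # standard error of the sample mean

# Critical value z* leaving (1 - C)/2 in each tail of the standard normal
z_star = NormalDist().inv_cdf(1 - (1 - C) / 2)

margin = z_star * se
lower, upper = x_bar - margin, x_bar + margin

print(se, round(lower, 2), round(upper, 2))
```

The interval is centered on the sample mean, and its half-width (the margin of error) shrinks as n grows because the standard error does.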
Type I error
error occurring when we reject the null hypothesis when in fact it is true
Type II error
error occurring when we fail to reject the null hypothesis when it is in fact false
power of a statistical test
1 - probability of a Type II error (given that the null hypothesis is false)
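Type I error, Type II error, and power can be tied together with a one-sided z test of H0: μ = μ0 against Ha: μ > μ0. The sketch assumes a known population standard deviation, and the numbers (μ0 = 100, a true mean of 105, σ = 15, n = 36, α = 0.05) and the function name are hypothetical.

```python
import math
from statistics import NormalDist


def power_one_sided_z(mu0, mu1, sigma, n, alpha):
    """Power of the z test of H0: mu = mu0 vs Ha: mu > mu0 when the true mean is mu1."""
    se = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    cutoff = mu0 + z_crit * se  # reject H0 when the sample mean exceeds this
    # Type II error: the sample mean falls below the cutoff although mu = mu1
    type_2_prob = NormalDist(mu1, se).cdf(cutoff)
    return 1 - type_2_prob


alpha = 0.05  # probability of a Type I error
power = power_one_sided_z(mu0=100, mu1=105, sigma=15, n=36, alpha=alpha)
type_2 = 1 - power

print(round(power, 3), round(type_2, 3))
```

If the true mean equals μ0, the same calculation returns α, the Type I error rate, which is a quick sanity check on the function.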