statistics
the science of collecting, analyzing, and drawing conclusions from data
statistic
a specific calculation from a data set
data
information with context
sample
group we gather data from
population
overall group of interest
variable
attribute that can take different values for different individuals
categorical variable
assigns labels that place each individual into a particular group
quantitative variable
numerical values that count or measure some characteristic of each individual
frequency table
lists category names and counts for each category
relative frequency table
lists category names and the percent or proportion for each category
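Worked example (a minimal sketch using made-up data; the `colors` list is hypothetical): a frequency table holds the count for each category, and a relative frequency table divides each count by the total.

```python
from collections import Counter

# Hypothetical data: favorite colors reported by 10 students
colors = ["red", "blue", "red", "green", "blue",
          "red", "blue", "red", "green", "red"]

# Frequency table: category -> count
frequency = Counter(colors)

# Relative frequency table: category -> proportion of the total
total = len(colors)
relative_frequency = {category: count / total
                      for category, count in frequency.items()}

for category in frequency:
    print(category, frequency[category], relative_frequency[category])
```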
bar graph
shows each category as a bar, heights of bars show category frequencies or relative frequencies
pie chart
shows each category as a slice of the pie, areas of slices are proportional to the category frequencies
two-way frequency table
table of counts that summarizes data on the relationship between two categorical variables for some groups of individuals
marginal relative frequency
gives percent or proportion of individuals that have a specific value for one categorical variable
conditional relative frequency
gives percent or proportion of individuals that have a specific value for one categorical variable among individuals who share the same value of another categorical variable
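Worked example (a sketch with invented counts; the grade and job categories are assumptions for illustration): a marginal relative frequency divides a row or column total by the grand total, while a conditional relative frequency divides a single cell by the total of the group it is conditioned on.

```python
# Hypothetical two-way table: (grade, has_job) -> count
table = {
    ("junior", "yes"): 20, ("junior", "no"): 30,
    ("senior", "yes"): 35, ("senior", "no"): 15,
}

grand_total = sum(table.values())

# Marginal relative frequency: proportion of all students who are juniors
junior_total = sum(n for (grade, _), n in table.items() if grade == "junior")
marginal_junior = junior_total / grand_total

# Conditional relative frequency: proportion with a job AMONG seniors
senior_total = sum(n for (grade, _), n in table.items() if grade == "senior")
job_given_senior = table[("senior", "yes")] / senior_total

print(marginal_junior, job_given_senior)
```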
side-by-side bar graph
displays distribution of a categorical variable for each value of another categorical variable, bars are grouped together based on the condition and placed side by side
segmented bar graph
displays distribution of a categorical variable as segments of a rectangle, area of each segment is proportional to percent of individuals in the corresponding category
mosaic plot
modified segmented bar graph in which the width of each rectangle is proportional to the number of individuals in the corresponding category; the vertical axis shows percent, and no spaces are needed between the bars
mean
average value: the sum of all values divided by the number of values
median
middle value of a data set when the values are listed in order
mode
value that occurs the most
range
difference between the max and min
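Worked example (a minimal sketch; the `data` list is made up for illustration): the four summary measures above can be computed with Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical data set
data = [2, 4, 4, 5, 7, 9, 11]

data_mean = mean(data)               # sum of values / number of values
data_median = median(data)           # middle value of the ordered data
data_mode = mode(data)               # most frequent value
data_range = max(data) - min(data)   # max minus min

print(data_mean, data_median, data_mode, data_range)
```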
dotplot
shows each data value as a dot above its location on a number line
stemplots
shows each data value separated into two parts: a stem, which consists of all but the final digit, and a leaf, the final digit
histogram
shows each interval of values as a bar, heights of the bars show the frequencies or relative frequencies of values in each interval
boxplots
visual representation of the five-number summary (minimum, Q1, median, Q3, maximum)
Q1
median of numbers below median
Q3
median of numbers above the median
IQR
Q3 - Q1
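Worked example (a sketch that follows the definitions above, where Q1 and Q3 are the medians of the lower and upper halves; the helper name `quartiles` and the data are hypothetical). Software packages may use other quartile conventions, so results can differ slightly from a calculator or library.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]          # values below the median
    upper_half = ordered[(n + 1) // 2 :]    # values above the median
    return median(lower_half), median(ordered), median(upper_half)

# Hypothetical data set
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]

q1, med, q3 = quartiles(data)
iqr = q3 - q1
print(q1, med, q3, iqr)
```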
skewed to the right
right side of graph is much longer than left side
skewed to the left
left side of graph is much longer than right side
scatterplot
shows the relationship between two quantitative variables measured on the same individuals
correlation
measures the direction and strength of a linear association between two quantitative variables
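Worked example (a minimal sketch with made-up paired data; `x` and `y` are hypothetical): the correlation r can be computed from the sums of products of deviations from the means.

```python
from math import sqrt
from statistics import mean

# Made-up paired data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Sums of squared deviations and cross-products
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# r is always between -1 and 1; the sign gives the direction
r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 3))
```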
regression line
line that describes how a response variable changes as an explanatory variable changes
least-squares regression line
line that makes the sum of the squared residuals as small as possible
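Worked example (a sketch under the usual formulas slope = Sxy / Sxx and intercept = ȳ - slope · x̄; the data and variable names are hypothetical): fitting a least-squares line and computing the residuals (observed y minus predicted y).

```python
from statistics import mean

# Made-up paired data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Least-squares slope and intercept
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

# Residual = observed y - predicted y
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(round(slope, 2), round(intercept, 2))
print([round(e, 2) for e in residuals])
```

For a least-squares line the residuals sum to zero (up to rounding), which is a quick check on the arithmetic.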
residual plot
a scatterplot that displays the residuals on the vertical axis and the explanatory variable on the horizontal axis
coefficient of determination (r²)
measures the percent of variability in the response variable that is accounted for by the least-squares regression line
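Worked example (a sketch assuming r² is computed as 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the total variation of y about its mean; the data are made up).

```python
from statistics import mean

# Made-up paired data for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)
slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
intercept = y_bar - slope * x_bar

# Variation left over after fitting the line
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
# Total variation in the response variable
sst = sum((yi - y_bar) ** 2 for yi in y)

r_squared = 1 - sse / sst
print(round(r_squared, 2))
```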
high leverage point
data point whose x-value is far from the mean of x
influential point
any point that, if removed, substantially changes the slope, y-intercept, or correlation of the regression model
population
entire group of individuals we want information about
convenience sampling
selects individuals from the population who are easy to reach
voluntary response sampling
individuals choose to be in the sample by responding to a general invitation
bias
the design of a statistical study shows bias if it is very likely to underestimate or overestimate the value you want to know
random sampling
involves using a chance process to determine which members of a population are included in the sample
simple random sample
every possible group of individuals of the chosen sample size has an equal chance of being selected
stratified random sampling
divide population into groups of individuals based on common traits and choose a simple random sample from each group
cluster sampling
divide population into smaller groups and use a simple random sample to randomly choose entire clusters for the sample
systematic random sample
divide population into equal size groups, randomly decide a starting point in the first group, then take every nth individual from there
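Worked example (a rough sketch, not a survey design; the population of ID numbers 1 to 100, the two strata, and the clusters of 10 are all assumptions made for illustration): the four random sampling methods above, side by side.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: ID numbers 1..100
population = list(range(1, 101))

# Simple random sample: every group of 10 is equally likely
srs = rng.sample(population, 10)

# Stratified: split by a shared trait (here, IDs 1-50 vs 51-100),
# then take an SRS from each stratum
strata = {"low": population[:50], "high": population[50:]}
stratified = [person for group in strata.values()
              for person in rng.sample(group, 5)]

# Cluster: split into groups of 10, randomly pick whole clusters
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster_sample = [person for cluster in rng.sample(clusters, 2)
                  for person in cluster]

# Systematic: random start in the first group, then every 10th individual
step = 10
start = rng.randrange(step)
systematic = population[start::step]

print(srs, stratified, cluster_sample, systematic, sep="\n")
```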
undercoverage
when some members of the population are less likely to be chosen or cannot be chosen in a sample
nonresponse bias
when an individual chosen for the sample can’t be contacted or refuses to participate
response bias
anything in the survey design that influences the responses
sampling variability
different random samples of the same size from the same population produce different estimates
parameter
number that describes some characteristic of a population
observational study
researchers don’t assign choices or treatments, they just observe them
retrospective
examines existing data
prospective
tracks individuals into the future
experiment
deliberately imposes some treatment on individuals to measure their responses
experimental unit
the object to which a treatment is randomly assigned
factor
an explanatory variable that is manipulated and may cause a change in the response variable
levels
the amounts of each experimental factor administered
treatments
levels of the explanatory variable applied to the individuals in an experiment
confounding variable
a variable that is related to the explanatory variable and influences the response variable, and so may create a false perception of association between the two
statistically significant
when the observed results of a study are too unusual to be explained by chance alone
random process
generates outcomes determined purely by chance
trial
each occasion upon which we observe a random phenomenon
outcome
result of a trial
event
collection of outcomes
sample space
a list of all possible outcomes
theoretical probability
the likelihood of an event occurring based on expected outcomes
law of large numbers
as we repeat a random process over and over, the proportion of times that an event occurs settles down to one number
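Worked example (a simulation sketch assuming a fair coin, so the proportion of heads should settle near 0.5; the seed and checkpoints are arbitrary choices): tracking the proportion of heads as the number of flips grows.

```python
import random

rng = random.Random(42)  # arbitrary seed so the run is repeatable

heads = 0
proportions = {}  # number of flips -> proportion of heads so far
checkpoints = (10, 100, 1_000, 10_000, 100_000)

for n in range(1, 100_001):
    heads += rng.random() < 0.5   # True counts as 1 head
    if n in checkpoints:
        proportions[n] = heads / n

for n, p in proportions.items():
    print(n, round(p, 4))
```

Early proportions can be far from 0.5; the later ones should be close to it.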
conditional probability
probability an event will occur given another event has already occurred
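Worked example (a minimal sketch assuming one roll of a fair six-sided die and the standard formula P(A | B) = P(A and B) / P(B); the events chosen are hypothetical): the probability that a roll is even given that it is greater than 3.

```python
from fractions import Fraction

# Sample space for one roll of a fair die
sample_space = {1, 2, 3, 4, 5, 6}

# Events are collections of outcomes
event_even = {o for o in sample_space if o % 2 == 0}   # A
event_over_3 = {o for o in sample_space if o > 3}      # B

def probability(event):
    """Theoretical probability with equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

# P(A | B) = P(A and B) / P(B)
p_even_given_over_3 = (probability(event_even & event_over_3)
                       / probability(event_over_3))
print(p_even_given_over_3)
```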
random variable
takes on values based on the outcomes of a random event
discrete random variable
a variable that can only take a countable number of values
continuous random variable
a variable that can take on any numeric value within an interval
probability distribution
gives all possible values of a random variable and their probabilities
binomial probability
the probability of getting a certain number of successes within a SET number of trials
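Worked example (a sketch of the standard binomial formula P(X = k) = C(n, k) · p^k · (1 - p)^(n - k); the function name and the coin-flip numbers are illustrative assumptions).

```python
from math import comb

def binomial_probability(n, k, p):
    """P(exactly k successes in n independent trials, success chance p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical case: exactly 2 heads in 5 flips of a fair coin
p_two_heads = binomial_probability(n=5, k=2, p=0.5)
print(p_two_heads)
```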
geometric probability
the probability of getting the FIRST success on the xth trial
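Worked example (a sketch of the standard geometric formula P(X = x) = (1 - p)^(x - 1) · p; the function name and the die example are illustrative assumptions).

```python
def geometric_probability(x, p):
    """P(first success happens on trial x), success chance p per trial."""
    return (1 - p) ** (x - 1) * p

# Hypothetical case: first six appears on the 3rd roll of a fair die
p_first_six_on_third = geometric_probability(x=3, p=1 / 6)
print(round(p_first_six_on_third, 4))
```

The first x - 1 trials must all be failures, which is where the (1 - p)^(x - 1) factor comes from.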