population
the entire group of people, places, or things of interest
parameters
numeric values that describe the population
sample
a subset of the population, usually much smaller, from which data are collected
statistics
numeric values computed from the sample
measurement bias
recorded values that are systematically larger or smaller than the true value, typically caused by incorrectly calibrated instruments
sampling bias
the sample is not representative of the population
voluntary sampling
participants choose whether to respond, so people with strong feelings one way or the other tend to be overrepresented
convenience sampling
only a "convenient" group is surveyed; this may leave out many subgroups of the population
survivorship bias
focusing on those who made it past some process and not considering those who did not
simple random sampling
a sampling method in which every member of the population is equally likely to be chosen (more precisely, every possible sample of a given size is equally likely)
sampling frame
a list in which all members of the population are identified and given a numerical label
strata
subgroups of interest
stratified sampling
divide the population into subgroups (strata) and take a simple random sample (SRS) from each stratum
cluster sampling
the population is divided into clusters, a few clusters are chosen at random, and all members of the chosen clusters are selected and interviewed
systematic sampling
every kth subject in an ordered list is sampled, usually beginning from a randomly chosen starting point
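The four sampling methods above can be sketched in a few lines of Python. This is a minimal illustration, not a standard implementation: the function names, the frame of labels 1 to 100, the two strata, and the ten clusters are all made up for the example.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS: draw n members without replacement; each is equally likely."""
    return rng.sample(frame, n)

def stratified_sample(strata, n_per_stratum, rng):
    """Take an SRS of the same size from every stratum (dict: name -> members)."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k members, then take every kth."""
    start = rng.randrange(k)
    return frame[start::k]

rng = random.Random(0)                # fixed seed so the run is repeatable
frame = list(range(1, 101))           # hypothetical sampling frame: labels 1..100
strata = {"low": frame[:50], "high": frame[50:]}                # hypothetical strata
clusters = {c: frame[c * 10:(c + 1) * 10] for c in range(10)}   # hypothetical clusters

srs = simple_random_sample(frame, 10, rng)
strat = stratified_sample(strata, 5, rng)
clus = cluster_sample(clusters, 2, rng)
syst = systematic_sample(frame, 10, rng)
```

Note the difference in what is randomized: stratified sampling takes some members from every subgroup, while cluster sampling takes every member from only some subgroups.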
quantitative variables
numerical measurements that record “quantity” (measures or counts)
categorical variables
place each outcome into one of several mutually exclusive groups or categories (qualitative)
frequency
how often each value is observed
data distribution
the way in which data values are spread or arranged across a range of possible values
mode
the category (or value) with the highest frequency
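As a small illustration of frequency and mode, the sketch below counts a made-up list of survey responses with Python's `collections.Counter`. The data and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical categorical data: favorite color reported by six people.
responses = ["red", "blue", "red", "green", "red", "blue"]

# Frequency: how often each category is observed.
frequency = Counter(responses)

# Mode: the category with the highest frequency.
mode_category, mode_count = frequency.most_common(1)[0]
```

Here `frequency` holds red: 3, blue: 2, green: 1, so the mode is "red".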
bar chart
a graphical representation of data using rectangular bars to show the frequency of categories (categorical variables)
histograms
similar to bar charts, but display the frequency distribution of numerical data using adjacent bars of equal width (quantitative variables)
dot plots
similar to histograms but show the original data stacked as dots
shape
the overall layout of the data
center
the centrality or location of the data; where the histogram would be “balanced”
spread
the variation of the data from the minimum to the maximum value; describes whether the data are concentrated in one area or spread evenly throughout
outliers
any data points that are extremely large or small compared to the majority of the data
diversity
the range and distribution of various traits within a population (categorical)
mean
the average of a data set; the “balancing point”
median
the middle value/observation/data point of an ordered set of data (smallest to largest value)
resistant
a numerical summary that is influenced little by extreme observations (outliers)
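A quick way to see what "resistant" means is to add one extreme value to a small data set and compare how much the mean and the median move. The numbers below are made up for illustration.

```python
import statistics

data = [2, 3, 3, 4, 5]
with_outlier = data + [100]   # same data plus one extreme observation

mean_before = statistics.mean(data)               # 17 / 5 = 3.4
median_before = statistics.median(data)           # middle value = 3
mean_after = statistics.mean(with_outlier)        # 117 / 6 = 19.5
median_after = statistics.median(with_outlier)    # (3 + 4) / 2 = 3.5
```

The mean jumps from 3.4 to 19.5, while the median only moves from 3 to 3.5. The median is resistant; the mean is not.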
measures of spread
variation/variability in the data
range
the largest observation minus the smallest observation
standard deviation
a measurement of the typical/average distance of observations from the mean
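The range and standard deviation can be computed by hand as sketched below. The data set is invented, and the card does not say which divisor to use, so the sketch shows both: the sample standard deviation divides by n - 1, the population version divides by n.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

# Range: largest observation minus smallest observation.
data_range = max(data) - min(data)            # 9 - 2 = 7

# Standard deviation: based on squared distances from the mean.
mean = sum(data) / len(data)                  # 5.0
squared_deviations = [(x - mean) ** 2 for x in data]   # these sum to 32

sample_sd = math.sqrt(sum(squared_deviations) / (len(data) - 1))  # divide by n - 1
population_sd = math.sqrt(sum(squared_deviations) / len(data))    # divide by n

# The standard library gives the same answers.
assert math.isclose(sample_sd, statistics.stdev(data))
assert math.isclose(population_sd, statistics.pstdev(data))
```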
interquartile range (IQR)
measures the range of the middle 50 percent of the data
percentiles
the pth percentile is the value below which p percent of all measurements fall and above which (100 - p) percent lie
quartiles
special percentiles that divide the data into quarters
how to find IQR
IQR = Q3 - Q1
how to find the step
step = 1.5 × IQR
how to find the upper fence
upper fence = Q3 + step; observations above it are flagged as potential outliers
how to find the lower fence
lower fence = Q1 - step; observations below it are flagged as potential outliers
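The IQR, step, and fence formulas above can be chained into a simple outlier check. This is a sketch under one assumption: quartiles are found with the "median of each half" convention (the overall median is left out of both halves when the count is odd). Other textbooks and software use slightly different quartile rules and may give slightly different fences. The data and function name are made up.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]     # smallest half (middle value excluded if n is odd)
    upper = ordered[-half:]    # largest half
    return statistics.median(lower), statistics.median(upper)

data = [1, 3, 4, 5, 6, 7, 8, 9, 30]   # hypothetical observations

q1, q3 = quartiles(data)       # Q1 = 3.5, Q3 = 8.5
iqr = q3 - q1                  # IQR = Q3 - Q1 = 5.0
step = 1.5 * iqr               # step = 1.5 * IQR = 7.5
upper_fence = q3 + step        # 8.5 + 7.5 = 16.0
lower_fence = q1 - step        # 3.5 - 7.5 = -4.0

# Anything outside the fences is flagged as a potential outlier.
outliers = [x for x in data if x < lower_fence or x > upper_fence]
```

With these numbers, only 30 lies beyond the upper fence of 16, so it is the single flagged value.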
side by side bar graphs
bar graphs that compare two categorical variables
side by side box plots
separate box plots are constructed on the same axis for each level of a categorical variable
side by side dot plots
separate dot plots constructed for each level of a categorical variable
scatterplots
graphs used to study the relationship between two quantitative variables measured on the same subject or at the same point; each observation is plotted as a point (x, y)
explanatory variable
the variable thought to influence or explain changes in the other; plotted on the x-axis
response variable
the variable that is affected by, or responds to, the explanatory variable; plotted on the y-axis