data
facts / statistics collected together for reference / analysis
dataset
collection of data that can be analyzed as one unit
statistic
summarizes / analyzes data (ex: average)
variable
the characteristic being studied
numerical data
quantitative, consists of numbers
categorical data
consists of names, labels, or other nonnumerical values
discrete data
takes on only specific, countable values (typically whole numbers)
continuous data
can take on any value within a range, including decimals
nominal data
categories with no specific order
ordinal data
categories with a specific order (5 star scale, sizes, etc)
population
collection of all possible members/outcomes for a group
parameter
calculation based on the whole population
sample
a subset of the population
statistic
calculation based on the sample
inference
using a sample statistic to estimate a population parameter
simple random sample (SRS)
chooses x individuals from the whole population completely at random
stratified random sample
separates the population into homogeneous groups (strata) and then chooses an SRS from each group
convenience sample
chooses individuals that are easiest to reach
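The three sampling methods above can be sketched in a few lines. This is only an illustration: the population of 100 student records, the `year` field used as the stratum, and all function names are made up for the example.

```python
import random

# Hypothetical population: 100 student IDs, each tagged with a class year (the strata).
YEARS = ["freshman", "sophomore", "junior", "senior"]
population = [{"id": i, "year": YEARS[i % 4]} for i in range(100)]

def simple_random_sample(pop, k, rng):
    """Choose k individuals from the whole population, each equally likely."""
    return rng.sample(pop, k)

def stratified_random_sample(pop, key, k_per_stratum, rng):
    """Split the population into strata by `key`, then take an SRS from each stratum."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample

def convenience_sample(pop, k):
    """Take whoever is easiest to reach; here, simply the first k in the list."""
    return pop[:k]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
srs = simple_random_sample(population, 8, rng)
stratified = stratified_random_sample(population, "year", 2, rng)
convenience = convenience_sample(population, 8)
```

Only the stratified sample is guaranteed to include every class year; the SRS may miss one by chance, and the convenience sample always returns the same first eight people.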
sampling error
natural variation between estimates from different samples
bias
when we systematically over/under estimate a parameter because of the way the sample was collected
volunteer bias
when sample subjects are self-selected
a bar graph is for __________ data
categorical
a histogram is for __________ data
numerical
bin width
width of an interval that is used in making a histogram.
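To see what bin width does, here is a minimal sketch that counts values into equal-width bins, the way a histogram would. The function name `histogram_counts`, the `start` parameter, and the data are assumptions made for the example.

```python
import math
from collections import Counter

def histogram_counts(values, bin_width, start=0.0):
    """Count how many values fall in each bin [left, left + bin_width)."""
    counts = Counter()
    for v in values:
        # Index of the bin this value falls into, counting from `start`.
        index = math.floor((v - start) / bin_width)
        counts[start + index * bin_width] += 1
    # Return bins ordered by their left edge.
    return dict(sorted(counts.items()))

data = [1, 2, 2, 3, 5, 6, 7, 9]
counts = histogram_counts(data, bin_width=2)
for left_edge, count in counts.items():
    print(f"[{left_edge}, {left_edge + 2}): {'#' * count}")
```

A larger bin width gives fewer, taller bars; a smaller one gives more, shorter bars.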
left-skewed distribution
most of the data is on the right with a tail to the left, mean < median
approximately normal
unimodal, symmetric, mean = median
right-skewed distribution
most of the data is on the left with a tail to the right, mean > median
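A quick way to check which way the mean gets pulled in a skewed dataset is to compute it next to the median. The two small datasets below are invented for illustration; the long tail drags the mean toward it while the median stays near the bulk of the data.

```python
import statistics

# Invented data: one large value creates a tail to the right.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
# Invented data: one small value creates a tail to the left.
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]

right_mean = statistics.mean(right_skewed)
right_median = statistics.median(right_skewed)
left_mean = statistics.mean(left_skewed)
left_median = statistics.median(left_skewed)

print("right-skewed: mean", right_mean, "median", right_median)
print("left-skewed:  mean", left_mean, "median", left_median)
```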
multimodal distribution
Two or more peaks
uniform distribution
evenly spread out, no peaks
symbol for mean
x-bar (x̄)
standard deviation
roughly the typical distance of points from the mean (square root of the average squared deviation)
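A small sketch comparing the standard deviation with the plain average distance from the mean, using Python's `statistics` module. The dataset is made up; note that the two numbers are close but not identical.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # invented example values
mean = statistics.mean(data)

# Population standard deviation: square root of the average squared deviation.
manual_sd = math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))
population_sd = statistics.pstdev(data)

# Sample standard deviation divides by n - 1 instead of n.
sample_sd = statistics.stdev(data)

# Plain average distance from the mean (mean absolute deviation), for comparison.
average_distance = sum(abs(x - mean) for x in data) / len(data)

print(mean, population_sd, sample_sd, average_distance)
```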
five number summary consists of:
minimum, first quartile, median, third quartile, maximum
IQR (interquartile range)
Q3-Q1
what kind of plot shows the five-number summary
boxplot
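The five-number summary and IQR can be computed with the standard library, as sketched below. Textbooks and software use slightly different quartile rules, so the choice of `method="inclusive"` here is an assumption and results on other data may differ from a hand calculation.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # invented example values
minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1  # interquartile range: Q3 - Q1

print(minimum, q1, median, q3, maximum, iqr)
```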
measures of center and spread for approximately normal data
mean and standard deviation
measures of center and spread for skewed data
median and IQR
characteristics of a normal distribution
- perfectly symmetrical, unimodal, bell-shaped
- exactly half of the values are on the left, exactly half on the right
- mean = median = mode
68% of the data is within
1 standard deviation of the mean
95% of the data is within
2 standard deviations of the mean
99.7% of the data is within
3 standard deviations of the mean
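The 68 / 95 / 99.7 figures are rounded values for a normal distribution. One way to check them, assuming a standard normal curve, is with `statistics.NormalDist`; the helper name `share_within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1 = share_within(1)
within_2 = share_within(2)
within_3 = share_within(3)

for k, share in [(1, within_1), (2, within_2), (3, within_3)]:
    print(f"within {k} SD: {share:.1%}")
```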
standardized score is also known as a
Z-score
what is a z-score
number of standard deviations away from the mean
z-score formula
(x-mean)/standard deviation
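The z-score formula as a small function. The exam numbers (mean 70, standard deviation 10) are invented for the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Hypothetical exam: mean 70, standard deviation 10.
z_high = z_score(85, mean=70, sd=10)  # a score above the mean
z_low = z_score(60, mean=70, sd=10)   # a score below the mean
print(z_high, z_low)
```

A positive z-score means the value is above the mean, a negative one means it is below.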
sampling distribution
the probability distribution of a sample statistic over all possible samples of the same size
the mean of the sampling distribution is ___________ the mean of the population
equal to
variability of the sampling distribution __________ as sample size increases
decreases
standard deviation of the sampling distribution (of all possible sample means) is the ___________
standard error
standard error formula
standard deviation / square root of n
as sample size increases, sample means tend to be ____________ to the actual population mean
closer
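A simulation sketch of the sampling distribution of the mean. It assumes a made-up normal population (mean 50, standard deviation 10), draws many samples of size 25, and compares the spread of the sample means with the standard error formula. The seed and all constants are arbitrary choices.

```python
import math
import random
import statistics

def standard_error(sd, n):
    """Standard error of the mean: standard deviation / square root of n."""
    return sd / math.sqrt(n)

POP_MEAN, POP_SD = 50.0, 10.0  # assumed population values
SAMPLE_SIZE = 25
NUM_SAMPLES = 10_000

rng = random.Random(42)  # fixed seed so the run is repeatable

# Draw many samples and record the mean of each one.
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)  # should land near POP_MEAN
observed_se = statistics.stdev(sample_means)    # spread of the sample means
theoretical_se = standard_error(POP_SD, SAMPLE_SIZE)

print(round(mean_of_means, 2), round(observed_se, 2), theoretical_se)
```

Raising `SAMPLE_SIZE` shrinks both the theoretical and the observed standard error, which is the sense in which sample means land closer to the population mean.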
normal approximation for the sampling distribution is used if _______ or __________
- the population is normally distributed
- sample size is greater than 30
acronym for describing a dataset
C - center
U - unusual features (outliers)
S - shape
S - spread