variable
a characteristic that changes from one individual to another
categorical variable
variable that takes on values that are category names or group labels
quantitative variable
variable that takes on numerical values for a measured or counted quantity
individuals
people, animals, or things described by a set of data
frequency table
a table that shows the number of times each value of a variable occurs
relative frequency table
a table that shows the proportion (or percent) of observations that fall in each category
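As a rough illustration of the two tables above, the sketch below builds both from a small made-up data set. The `colors` list and the variable names are hypothetical and chosen only for this example.

```python
from collections import Counter

# Hypothetical categorical data: favorite color of 10 individuals
colors = ["red", "blue", "blue", "green", "red",
          "blue", "red", "blue", "green", "blue"]

# Frequency table: how many times each category occurs
frequency = Counter(colors)

# Relative frequency table: each count divided by the number of individuals
total = len(colors)
relative_frequency = {category: count / total
                      for category, count in frequency.items()}

for category in frequency:
    print(category, frequency[category], relative_frequency[category])
```

The relative frequencies should add up to 1 (or 100%), which is a quick way to check the table.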
requirements for making categorical data bar charts
1.) label axis
2.) scale axis
3.) draw bars accurately
requirements for making categorical data pie charts
1.) include legend
discrete variable
variable that can take on a countable number of values (whole numbers)
continuous variable
variable that can take on any value in an interval, so its possible values cannot be counted
advantages to a dot plot
shows every individual value in a data set; easy to see shape of the distribution
disadvantages to dot plot
difficult to make for large data sets
advantages to a stem and leaf plot
shows every individual value in a data set; easy to see shape of the distribution
disadvantages to a stem and leaf plot
difficult to make for large data sets
how to describe distribution shape
symmetric, skewed left, skewed right, uniform, normal
symmetric distribution
right and left sides of histogram are approximately mirror images
skewed left distribution
data clusters to right with tail to left
skewed right distribution
data clusters to left with tail to right
uniform distribution
all values occur with roughly the same frequency, so the graph is approximately level
bimodal distribution
data creates two clusters
unimodal distribution
data creates one cluster
how to describe a distribution
shape, center, variability (spread), unusual features
what can be used to describe the center of a distribution
mean, median, quartile 1 and 3
mean
sum of all the data divided by the number of values
median
middle value of an ordered data set (the average of the two middle values when the data size is even)
quartile 1
the median of the lower half of the data
quartile 3
the median of the upper half of the data
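The definitions of the mean, median, and quartiles above can be sketched as follows. The helper name `quartiles` and the data set are made up for illustration, and the sketch assumes the "median of each half" convention from these cards, leaving the median out of both halves when the data size is odd. Other tools, including Python's own `statistics.quantiles`, may use a different convention and give slightly different quartiles.

```python
from statistics import mean, median

def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]             # values below the median position
    upper = ordered[half + (n % 2):]   # values above the median position
    return median(lower), median(ordered), median(upper)

data = [2, 4, 4, 5, 7, 8, 9, 10, 12]   # hypothetical data set
q1, med, q3 = quartiles(data)
print("mean:", mean(data))
print("Q1:", q1, "median:", med, "Q3:", q3)
```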
what can be used to describe the variability of a distribution
range, interquartile range, standard deviation
range
the difference between the highest and lowest values in a distribution
interquartile range
The difference between the upper and lower quartiles.
standard deviation
typical distance that each value is away from the mean
standard deviation formula
$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$ (n = data size, x_i = each value, \bar{x} = mean)
variance
standard deviation squared
variance formula
$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$ (n = data size, x_i = each value, \bar{x} = mean)
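To make the two formulas concrete, here is a minimal sketch that computes the sample variance and standard deviation by hand (dividing by data size minus 1) and compares them with Python's `statistics.variance` and `statistics.stdev`. The data set is made up for the example.

```python
from math import sqrt
from statistics import stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set
n = len(data)                     # data size
x_bar = sum(data) / n             # mean

# Variance: sum of squared distances from the mean, divided by n - 1
sample_variance = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance
s = sqrt(sample_variance)

print(sample_variance, variance(data))
print(s, stdev(data))
```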
what letter represents data size
n
what letter represents the sample standard deviation
s
what symbol represents the population standard deviation
lowercase sigma (σ)
what symbol represents the population mean
mu (μ)
what symbol represents the sample mean
x bar (x̄)
outlier
A value that "lies outside" most of the other values in a set of data.
IQR method to find outliers
an outlier is a value more than 1.5 x IQR below the first quartile or a value more than 1.5 x IQR above the third quartile
standard deviation method to find outliers
an outlier is a value located 2 or more standard deviations above or below the mean
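The two outlier rules above can be sketched as follows. The function names `iqr_outliers` and `sd_outliers` and the data set are hypothetical, and the quartiles are assumed to be the medians of the lower and upper halves of the ordered data.

```python
from statistics import mean, median, stdev

def iqr_outliers(data):
    """Values more than 1.5 x IQR below Q1 or above Q3."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    q1 = median(ordered[:half])
    q3 = median(ordered[half + (n % 2):])
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in ordered if x < low_fence or x > high_fence]

def sd_outliers(data, k=2):
    """Values k or more standard deviations away from the mean."""
    x_bar = mean(data)
    s = stdev(data)
    return [x for x in sorted(data) if abs(x - x_bar) >= k * s]

data = [2, 4, 4, 5, 7, 8, 9, 10, 12, 40]   # hypothetical data set
print(iqr_outliers(data))
print(sd_outliers(data))
```

The two methods agree for this data set, but they will not always flag the same values.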
nonresistant summary statistics
summary statistics highly influenced by outliers (mean, standard deviation, and range)
resistant summary statistics
summary statistics not greatly influenced by outliers (median and IQR)
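One way to see the difference between resistant and nonresistant statistics is to add a single extreme value to a data set and compare how far the mean and the median move. The numbers below are made up for illustration.

```python
from statistics import mean, median

data = [2, 4, 4, 5, 7, 8, 9, 10, 12]   # hypothetical data set
with_outlier = data + [40]             # same data plus one extreme value

# How far each measure of center moves when the outlier is added
mean_shift = mean(with_outlier) - mean(data)
median_shift = median(with_outlier) - median(data)

print("mean moved by:", mean_shift)
print("median moved by:", median_shift)
```

In this sketch the mean moves by more than 3 units while the median moves by only half a unit.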
measures of center for a skewed distribution
median
measures of center for a symmetric distribution
mean
measures of variability for skewed distribution
IQR
measures of variability for symmetric distribution
standard deviation
5 number summary
minimum, Q1, median, Q3, maximum
advantages of a box plot
shows the five number summary and outliers; splits data into quartiles
disadvantages of a box plot
does not show every individual value; does not show shape of distribution
measure of center indications for skewed right distribution
mean is greater than median
measure of center indications for skewed left distribution
mean is less than median
measure of center indications for symmetric distribution
mean and median are approximately equal
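The relationship between the mean and the median for each shape can be checked on small made-up data sets. The three lists below are hypothetical examples built to have the named shape.

```python
from statistics import mean, median

skewed_right = [1, 2, 2, 3, 3, 3, 4, 5, 9, 18]   # tail to the right
skewed_left = [19 - x for x in skewed_right]     # mirror image: tail to the left
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]          # mirror image around 3

for name, values in [("skewed right", skewed_right),
                     ("skewed left", skewed_left),
                     ("symmetric", symmetric)]:
    print(name, "mean:", mean(values), "median:", median(values))
```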
percentile
the percent of data values less than or equal to a given value
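Following the definition above, a percentile can be sketched as a simple count. The function name `percentile_of` and the scores are hypothetical.

```python
def percentile_of(value, data):
    """Percent of data values less than or equal to the given value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

scores = [55, 60, 65, 70, 70, 75, 80, 85, 90, 95]   # hypothetical test scores
print(percentile_of(75, scores))
```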
standardized score (z-score) formula
(data value - mean)/(standard deviation)
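A minimal sketch of the z-score formula, using made-up numbers (a value of 85 from a distribution with mean 70 and standard deviation 10). The function name `z_score` is hypothetical.

```python
def z_score(value, mean, standard_deviation):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / standard_deviation

print(z_score(85, 70, 10))   # 85 is 1.5 standard deviations above the mean
print(z_score(60, 70, 10))   # 60 is 1 standard deviation below the mean
```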
what distribution shapes can percentiles and z-scores be applied to
any distribution shape
normal distribution
distribution that is mound-shaped (a bell curve) and symmetric
empirical rule
The rule gives the approximate percent of observations in a normal distribution that fall within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean
percent of data within one standard deviation of the mean in a normal distribution
68%
percent of data within two standard deviations of the mean in a normal distribution
95%
percent of data within three standard deviations of the mean in a normal distribution
99.7%
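The three percentages can be checked with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). This is only a sketch: it uses the standard normal distribution (mean 0, standard deviation 1), and the dictionary name `within` is made up.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of a normal distribution within k standard deviations of the mean
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, proportion in within.items():
    print(k, "standard deviation(s):", round(100 * proportion, 1), "%")
```

The exact values are slightly different from 68%, 95%, and 99.7%, which is why the rule is described as approximate.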
how to find percent of values to left of any x value (assumed normal distribution)
normalcdf (LB: -10^99; UB: X; Mu: mean; sigma: standard deviation)
how to find percent of values to right of any x value (assumed normal distribution)
normalcdf (LB: X ; UB: 10^99; Mu: mean; sigma: standard deviation)
how to find percent of values in between any x and y value (assumed normal distribution)
normalcdf (LB: X ; UB: Y ; Mu: mean; sigma: standard deviation)
how to find x value if given percent (assumed normal distribution)
invNorm (area: percent of data to left of X, as a decimal; Mu: mean; sigma: standard deviation)
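For readers without a graphing calculator, the four calculator steps above can be approximated with `statistics.NormalDist` (Python 3.8 or later). The function names `normalcdf` and `inv_norm` are hypothetical wrappers named after the calculator commands, and the mean of 100 and standard deviation of 15 are made-up values.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu, sigma):
    """Proportion of a normal distribution between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

def inv_norm(area, mu, sigma):
    """x value with the given proportion of the distribution to its left."""
    return NormalDist(mu, sigma).inv_cdf(area)

mu, sigma = 100, 15   # hypothetical mean and standard deviation

print(normalcdf(-1e99, 115, mu, sigma))   # proportion to the left of 115
print(normalcdf(130, 1e99, mu, sigma))    # proportion to the right of 130
print(normalcdf(85, 115, mu, sigma))      # proportion between 85 and 115
print(inv_norm(0.90, mu, sigma))          # x value with 90% of the data to its left
```

The results are proportions between 0 and 1; multiply by 100 to get a percent.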