symmetric
a distribution is symmetric if the two halves on either side of the center look approximately like mirror images of one another
variance
a measure of how far a data set is spread out - mathematically defined as the average of the squared differences from the mean
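As a rough illustration of this definition, the sketch below computes the variance of a small made-up data set. The data values and the function name `variance` are assumptions for illustration only. Note that for a sample of data the sum of squared differences is conventionally divided by n - 1 rather than n, so both versions are shown.

```python
def variance(values, sample=True):
    """Average squared difference from the mean.

    With sample=True the sum is divided by n - 1 (sample variance, s²);
    with sample=False it is divided by n (population variance).
    """
    n = len(values)
    mean = sum(values) / n
    squared_diffs = [(v - mean) ** 2 for v in values]
    return sum(squared_diffs) / (n - 1 if sample else n)


data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical data, mean is 5
pop_var = variance(data, sample=False)   # 32 / 8
samp_var = variance(data)                # 32 / 7
```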
outliers
extreme values that don’t appear to belong with the rest of the data - may be unusual values that deserve further investigation or may be mistakes
box-and-whisker plot/boxplot
a display of the 5-number summary from a set of data, including the minimum value, Q1, the median, Q3, and the maximum value of the data set
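The 5-number summary behind a boxplot can be sketched as follows. This is a minimal example under one assumption: Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data, which is only one of several quartile conventions in use. The function name `five_number_summary` and the data are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the median position
    upper = data[half + n % 2:]    # values above the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])


summary = five_number_summary([7, 15, 36, 39, 40, 41, 42, 43, 47, 49])
# summary is (7, 36, 40.5, 43, 49)
```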
s²
the notation for variance for a sample of data
ȳ
the symbol for the mean of all the values of the variable y - can be used over any variable to indicate the mean of all the values of that variable
center
a place in the distribution of a variable that you’d point to if you wanted to attempt to summarize the entire distribution with a single number - measures of center include the mean and median
uniform
a distribution is uniform if it is roughly flat
quantitative data condition
a condition that states the data in a set are values of a quantitative variable whose units are known
tails
the parts of a distribution that typically trail off on either side
quartiles
values that divide a dataset into quarters - the lower quartile (Q1) is the value with a quarter of the data below or equal to it; the upper quartile (Q3) has three quarters of the data below or equal to it; the median is sometimes referred to as Q2
skewed
a distribution is skewed if it is not symmetric and has one tail stretching out further than the other - distributions are skewed left when the longer tail is to the left and skewed right when the longer tail is to the right
shape
the way that the distribution of data on a graphical display is patterned
percentile
the nth percentile is the number that falls above n% of the data
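To make the idea concrete, the sketch below works in the reverse direction: given a value, it reports what percentage of the data falls below it. The helper name `percent_below` and the scores are assumptions for illustration; software packages use slightly different rules when computing percentiles.

```python
def percent_below(values, x):
    """Percentage of the data that falls strictly below x."""
    return 100 * sum(1 for v in values if v < x) / len(values)


scores = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]  # hypothetical scores
p = percent_below(scores, 84)
# 7 of the 10 scores are below 84, so 84 sits at about the 70th percentile
```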
s
the notation for standard deviation for a sample of data
spread
a numerical summary of how tightly the values cluster around the center - measures of spread include the range, interquartile range (Q3 - Q1), and standard deviation (square root of the variance)
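The three measures named here can be computed with Python's standard `statistics` module, as in this minimal sketch. The data are made up, and `statistics.quantiles` uses its own default quartile convention, so the interquartile range may differ slightly from a hand calculation.

```python
from statistics import quantiles, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical data

data_range = max(data) - min(data)   # range: largest minus smallest value
q1, q2, q3 = quantiles(data, n=4)    # three cut points: Q1, median, Q3
iqr = q3 - q1                        # interquartile range
s = stdev(data)                      # sample standard deviation (divides by n - 1)
```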
cumulative distribution plot
a plot that displays the fraction of a data set that lies at or below any given data value - also called an ogive plot
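The points that such a plot would display can be sketched as below. The function name `cumulative_fraction` and the data are hypothetical; the example only computes the (value, fraction) pairs and leaves the drawing to whatever plotting tool is at hand.

```python
def cumulative_fraction(values, x):
    """Fraction of the data that lies at or below x."""
    return sum(1 for v in values if v <= x) / len(values)


data = [3, 1, 4, 1, 5, 9, 2, 6]  # hypothetical data
# One point per distinct data value, in increasing order
points = [(x, cumulative_fraction(data, x)) for x in sorted(set(data))]
```

The fractions never decrease from left to right and reach 1.0 at the largest data value.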
Σ
sigma, indicating ‘the sum of’
dotplot
a graphical display of data created using dots and a number line - typically used for relatively small data sets
mode
a hump or local high point in the shape of the distribution of a variable - distributions can be unimodal, bimodal, or multimodal, and the apparent location of a mode can change as the scale of a histogram is changed
histogram
a graph that uses adjacent bars to show the distribution of a quantitative variable, showing the frequency or relative frequency of data
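The frequencies behind a histogram can be sketched by counting how many values land in each equal-width bin. The function name `bin_counts`, the bin settings, and the ages are assumptions for illustration; each bin here includes its lower edge and excludes its upper edge.

```python
def bin_counts(values, start, width, n_bins):
    """Count the values falling in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        i = int((v - start) // width)
        if 0 <= i < n_bins:      # values outside the bins are ignored
            counts[i] += 1
    return counts


ages = [21, 22, 25, 31, 33, 34, 38, 41, 47, 52]  # hypothetical ages
counts = bin_counts(ages, start=20, width=10, n_bins=4)

# Crude text histogram: one '#' per value in the bin
for i, c in enumerate(counts):
    low = 20 + 10 * i
    print(f"{low}-{low + 10}: " + "#" * c)
```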
gap
a region of a distribution where there are no values
distribution
a list or graphical display of the data, showing all possible values or intervals of the data and how often they occur
stem-and-leaf display
a data display where each value is split into a stem (the leading digit or digits) and a leaf (usually the final digit)
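A minimal sketch of building such a display, assuming non-negative whole numbers with the tens digit (or digits) as the stem and the ones digit as the leaf. The function name `stem_and_leaf` and the data are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's ones digit (leaf) under its leading digits (stem)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 23 -> stem 2, leaf 3
        stems[stem].append(leaf)
    return dict(stems)


display = stem_and_leaf([12, 15, 21, 23, 23, 30, 38])

# Print one row per stem, leaves in increasing order
for stem in sorted(display):
    print(f"{stem} | {' '.join(str(leaf) for leaf in display[stem])}")
```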