Distribution
slices up all the possible values of the variable into equal-width bins and gives the number of values (or counts) falling into each bin
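As a rough illustration of the binning idea, the sketch below counts how many values land in each equal-width bin. The function name `bin_counts` and the sample data are made up for this example, and it assumes the values are not all identical (otherwise the bin width would be zero).

```python
def bin_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins  # assumes hi > lo
    counts = [0] * num_bins
    for v in values:
        # The maximum value is placed in the last bin instead of opening a new one.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts


data = [1, 2, 2, 3, 5, 8, 9, 10]  # hypothetical sample
counts = bin_counts(data, 3)      # bins cover 1-4, 4-7, 7-10
```

How boundary values are assigned to bins is a convention; here a value on a boundary goes into the higher bin, except for the maximum.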
Histogram (relative frequency histogram)
uses adjacent bars to show the distribution of a quantitative variable. Each bar represents the frequency (or relative frequency) of values falling in each bin.
Gap
a region of the distribution where there are no values
Stem-and-Leaf Display
shows quantitative data values in a way that sketches the distribution of the data, best described in detail by example.
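Since this display is easiest to understand from an example, here is a minimal sketch. It assumes non-negative two-digit integers, with the tens digit as the stem and the ones digit as the leaf; the function name `stem_and_leaf` and the sample data are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Build display lines: stem = tens digit, leaves = sorted ones digits."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines


for line in stem_and_leaf([12, 15, 21, 24, 24, 40]):
    print(line)
```

Each row reads as a bar of a sideways histogram that keeps the individual values: the row `2 | 144` stands for 21, 24, and 24, and the empty row for stem 3 shows a gap.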
Dotplot
graphs a dot for each case against a single axis
Shape
to describe this, look for: single vs. multiple modes, symmetry vs. skewness, and outliers and/or gaps
Center
The place in the distribution of a variable that you'd point to if you wanted to attempt the impossible by summarizing the entire distribution with a single number. Measures of this include the mean and the median.
Spread
A numerical summary of how tightly the values are clustered around the center. Measures of this include the IQR and standard deviation.
Mode
A hump or local high point in the shape of the distribution of a variable. The apparent location of this can change as the scale of a histogram is changed.
Unimodal
A distribution having one mode
Bimodal
A distribution with two modes
Multimodal
A distribution with more than two modes
Uniform
A distribution that is roughly flat.
Symmetric
A distribution where two halves on either side of the center look approximately like mirror images of each other
Tails
The parts of a distribution that typically trail off on either side
Skewed
A distribution that is not symmetric, with one tail stretching out farther than the other
Outliers
Extreme values that don't appear to belong with the rest of the data. These may be unusual values that deserve further investigation or they may just be mistakes. There's no obvious way to tell. Don't delete these automatically: you have to think about them. These can affect many statistical analyses so you should always be alert for them.
Median
The middle value, with half of the data above it and half below it. If n is even, it is the average of the two middle values. It is usually paired with the IQR.
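The odd and even cases can be sketched as follows. The function name `median` is chosen for this example; Python's standard library offers `statistics.median` with the same behavior.

```python
def median(values):
    """Return the middle value of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2
```

For example, `median([3, 1, 2])` gives 2, and `median([4, 1, 3, 2])` gives 2.5.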
Range
The difference between the lowest and highest values in a data set
Quartile
one of the three values (Q1, the median, and Q3) that divide the ordered data set into four equal parts
Interquartile Range
The difference between the first and third quartiles. It is usually reported along with the median.
Percentile
the ith percentile is the number that falls above i% of the data
5-number Summary
Reports the minimum value, quartile 1, the median, quartile 3, and the maximum value
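A possible way to compute the summary, together with the range and IQR, is sketched below. Textbooks and software use slightly different rules for locating quartiles; this sketch assumes the "inclusive" method of Python's `statistics.quantiles`, so results may differ a little from hand calculations. The function name `five_number_summary` and the data are made up for illustration.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quartile rule."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, med, q3, max(values))


data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical sample
low, q1, med, q3, high = five_number_summary(data)
data_range = high - low  # range: highest minus lowest value
iqr = q3 - q1            # interquartile range: Q3 minus Q1
```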
Boxplot
Displays the five-number summary as a central box, with whiskers that extend to the most extreme non-outlying data values and any outliers shown individually
Mean
The value found by summing all the data values and dividing by the count. It is usually paired with the standard deviation
Resistant
A calculated summary on which outliers have only a small effect
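A small experiment makes the idea concrete: replace one value with an extreme one and compare how much the mean and the median move. The data values below are invented for illustration.

```python
import statistics

clean = [10, 12, 13, 15, 16]          # hypothetical data
with_outlier = [10, 12, 13, 15, 100]  # same data, largest value replaced by an outlier

mean_shift = statistics.mean(with_outlier) - statistics.mean(clean)
median_shift = statistics.median(with_outlier) - statistics.median(clean)

print("mean moved by", mean_shift)
print("median moved by", median_shift)
```

Here the median does not move at all while the mean is pulled far toward the outlier, which is why the median counts as resistant and the mean does not.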
Standard Deviation
The square root of the variance, usually reported along with the mean
Variance
The sum of squared deviations from the mean divided by the count minus one
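The two definitions translate directly into code. This is a minimal sketch with a made-up function name and sample; the standard library's `statistics.variance` and `statistics.stdev` use the same count-minus-one divisor.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical sample, mean is 5
variance = sample_variance(data)       # squared deviations sum to 32, so 32 / 7
std_dev = math.sqrt(variance)          # standard deviation: square root of the variance
```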