Quantitative Data part 1
Distribution
- What value a variable takes and how often it takes that value
Dot Plots
- Show count associated with a value
- Stacks of dots over a numerical value
- Good for small data sets
Description for FRQ:
Largest category, symmetry, skewness, clusters, gaps, outliers
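A minimal sketch of how a dot plot counts observations, assuming small integer data. The function name `dot_plot` and the quiz scores are made up for illustration, and the dots run sideways instead of stacking upward so the plot can be printed as text.

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot: one row per integer value, one dot per observation."""
    counts = Counter(values)
    rows = []
    # Walk every integer from min to max so that gaps show up as empty rows.
    for value in range(min(values), max(values) + 1):
        rows.append(f"{value:>3} | " + "." * counts[value])
    return rows

quiz_scores = [4, 5, 5, 6, 6, 6, 7, 7, 10]  # made-up data
for row in dot_plot(quiz_scores):
    print(row)
```

In the printed plot, the longest row (6) is the most frequent value, the empty rows at 8 and 9 are a gap, and 10 stands apart as a possible outlier.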
Stem and Leaf plot
- Stems are the leading digits (larger place values); each leaf is the final digit (smallest place value)
- Good for small amounts of data
Description for FRQ:
Center, spread, shape, unusual features
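A rough sketch of building a stem-and-leaf plot, assuming non-negative two-digit integers so that the stem is the tens digit and the leaf is the ones digit. The function name `stem_and_leaf` and the test scores are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stem-and-leaf rows for non-negative integers (stem = tens, leaf = ones)."""
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)
    rows = []
    # Include stems with no leaves so gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        row_leaves = leaves.get(stem, [])
        rows.append(f"{stem} | " + " ".join(str(leaf) for leaf in row_leaves))
    return rows

test_scores = [62, 67, 71, 74, 74, 78, 85, 99]  # made-up data
for row in stem_and_leaf(test_scores):
    print(row)
```

Unlike a histogram, every original value can still be read back from the plot, which is why it suits small data sets.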
Histograms
- Summarize large data sets by grouping values into intervals
- Vertical axis indicates frequencies or relative frequencies
- Horizontal axis indicates data values or ranges of data values
- The number of observations in an interval is the frequency of that interval
- Bars touch; a gap appears only where an interval has no observations
Description for FRQ:
Skewed left/right, bell/mound shape, rectangular, uni/bi modal
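A small sketch of the counting behind a histogram, assuming equal-width intervals that include their lower bound and exclude their upper bound. The function name `interval_frequencies`, its parameters, and the ages are illustrative only.

```python
def interval_frequencies(values, start, width, n_bins):
    """Count observations in n_bins equal-width intervals [lo, lo + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        # Values outside the covered intervals are ignored in this sketch.
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

ages = [21, 23, 25, 28, 31, 34, 36, 38, 39, 55]  # made-up data
counts = interval_frequencies(ages, start=20, width=10, n_bins=4)
relative = [c / len(ages) for c in counts]

print(counts)    # frequencies for 20-29, 30-39, 40-49, 50-59
print(relative)  # the same bars as proportions of all observations
```

The 40-49 interval has a frequency of 0, so that is where the drawn histogram would show a gap between bars.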
Mode
- “most”
- A value or set of values that occurs most frequently
- The mode is where a frequency distribution reaches a maximum
- Bimodal: 2 modes
- Multimodal: multiple modes
- Not always near center of distribution
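As an illustration, Python's standard `statistics.multimode` (available in Python 3.8 and later) returns every most-frequent value, which covers the unimodal and bimodal cases. The three small data sets are made up.

```python
from statistics import median, multimode

unimodal = [1, 2, 2, 3, 9]
bimodal = [1, 1, 2, 5, 5]
skewed = [0, 0, 0, 1, 2, 6, 9]

modes_uni = multimode(unimodal)   # [2]
modes_bi = multimode(bimodal)     # [1, 5]
modes_skew = multimode(skewed)    # [0]

# The mode of the skewed set sits at the low end, below the median of 1.
print(modes_uni, modes_bi, modes_skew, median(skewed))
```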
Median
- (Q2) middle value when observations are in numerical order
- If there are 2 middle numbers, the median is their average
- If the distribution is symmetrical, then the mean and median are equal
- The median is resistant because outliers change its value little or not at all
- This means that when there are outliers, the median is a better measure of the center
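A short sketch of both median cases and of resistance, using Python's standard `statistics.median`. The data values are made up.

```python
from statistics import median

odd_count = [3, 1, 7]       # sorted: 1, 3, 7 -> one middle value
even_count = [3, 1, 7, 5]   # sorted: 1, 3, 5, 7 -> two middle values

median_odd = median(odd_count)    # 3
median_even = median(even_count)  # (3 + 5) / 2 = 4.0

# Resistance: replacing the largest value with an outlier leaves the median alone.
salaries = [40, 42, 45, 47, 50]
salaries_with_outlier = [40, 42, 45, 47, 500]
median_before = median(salaries)               # 45
median_after = median(salaries_with_outlier)   # 45
```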
Mean
Population parameter: μ (mu, pronounced "mew")
Sample statistic: x̄ (x_bar)
Add all the data points and divide by number of data points
x̄ = (∑xᵢ)/n = (x₁ + x₂ + x₃ + … + xₙ)/n
n = number of data points
∑ = sum of the terms that follow
Balance point of distribution
The mean is nonresistant because outliers pull it toward themselves
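A worked sketch of the formula, the balance-point idea, and nonresistance, on a made-up data set. The variable names are only for illustration.

```python
data = [2, 4, 4, 10]  # made-up data
n = len(data)
x_bar = sum(data) / n  # (2 + 4 + 4 + 10) / 4

# Balance point: deviations from the mean cancel out, so they sum to zero.
deviation_sum = sum(x - x_bar for x in data)

# Nonresistant: replacing 10 with 100 drags the mean upward.
with_outlier = [2, 4, 4, 100]
x_bar_outlier = sum(with_outlier) / len(with_outlier)

print(x_bar, deviation_sum, x_bar_outlier)  # 5.0 0.0 27.5
```

The median of both data sets is 4, so only the mean moved when the outlier was introduced.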
Symmetry
- The right and left halves mirror each other
- The mean and median are equal
Skewness
- If the right tail is longer than the left, the distribution is right (positively) skewed, and typically mean > median
- If the left tail is longer than the right, the distribution is left (negatively) skewed, and typically mean < median
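A quick check of the mean versus median relationship on two made-up data sets, one with a long right tail and one with a long left tail. This is an illustration of the usual pattern, not a proof that it holds for every skewed distribution.

```python
from statistics import mean, median

right_skewed = [1, 2, 2, 3, 3, 4, 20]       # long tail toward large values
left_skewed = [1, 17, 18, 18, 19, 19, 20]   # long tail toward small values

right_mean, right_median = mean(right_skewed), median(right_skewed)  # mean 5, median 3
left_mean, left_median = mean(left_skewed), median(left_skewed)      # mean 16, median 18

print(right_mean > right_median)  # True: the tail pulls the mean to the right
print(left_mean < left_median)    # True: the tail pulls the mean to the left
```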
Measures of Dispersion
Used to describe spread of data/variation around a central value
Range
- The difference between the largest and smallest values (max − min)
- It is not resistant to the influence of outliers
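A minimal sketch of the range and of how one outlier changes it. The helper name `data_range` and the values are made up.

```python
def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

heights = [12, 15, 17, 20]              # made-up data
heights_with_outlier = heights + [95]   # same data plus one extreme value

range_before = data_range(heights)               # 20 - 12 = 8
range_after = data_range(heights_with_outlier)   # 95 - 12 = 83
```

Because the range depends only on the two most extreme values, a single outlier is enough to change it dramatically.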