Displaying Data…
Organizes data
Helps show similarities and differences
Categorical Variables
Bar charts
Pie charts
Two way tables
Quantitative Variables
Dot plots
Stem plots
Box plots
Histogram
Bar Graph
Vertical or horizontal bars
Can show frequency (#) or relative frequency (%)
Must have spaces between bars
What do you need in a bar graph?
Title
Y-axis label
X-axis label
Spaces in between bars
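A minimal sketch of the frequency (#) versus relative frequency (%) idea, which is what the bar heights on a bar graph represent. The `colors` data is made up for illustration.

```python
from collections import Counter

# Hypothetical categorical data: favorite colors of 10 students
colors = ["red", "blue", "blue", "green", "red",
          "blue", "red", "blue", "green", "blue"]

frequency = Counter(colors)          # counts (#) per category
total = sum(frequency.values())      # number of observations

# Relative frequency: each count as a percent of the total
relative_frequency = {
    category: count * 100 / total for category, count in frequency.items()
}

for category in sorted(frequency):
    print(f"{category}: {frequency[category]} ({relative_frequency[category]:.0f}%)")
```

Either set of numbers could be used for the bar heights; a pie chart would use only the percents.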
Pie Charts
Always in Relative Frequency (%)
Compares parts of a whole
Title
Key
Two Way Tables
Used to compare observations for two different categorical variables
Column variable (Top to bottom)
Row variable (Left to Right)
Marginal distribution
The row totals or column totals converted to % of the grand total
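A small sketch of a marginal distribution, assuming it means the row (or column) totals written as a percent of the grand total. The table values and category names are hypothetical.

```python
# Hypothetical two-way table: row variable = grade, column variable = snack
table = {
    "9th":  {"chips": 12, "fruit": 8},
    "10th": {"chips": 10, "fruit": 20},
}
columns = ["chips", "fruit"]

grand_total = sum(sum(cells.values()) for cells in table.values())

# Marginal distribution of the row variable: row totals as % of grand total
row_marginal = {
    row: sum(cells.values()) * 100 / grand_total for row, cells in table.items()
}

# Marginal distribution of the column variable: column totals as % of grand total
column_marginal = {
    col: sum(table[row][col] for row in table) * 100 / grand_total
    for col in columns
}

print(row_marginal)
print(column_marginal)
```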
Stemplots
Leaf is the final digit of the number
Stems are always on the left side of the line, leaves are on the right side
Stems are ordered least (top) to greatest (bottom)
All stems must be written
Spacing is important
No commas between each leaf
Always have a key
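The stemplot rules above can be sketched in code. This is only an illustration for non-negative whole numbers, where the stem is everything but the final digit; the `stemplot` function and the `scores` data are made up for the example.

```python
def stemplot(values):
    """Return the lines of a stemplot for non-negative whole numbers."""
    stems = {}
    for v in sorted(values):
        # Leaf is the final digit; the stem is everything before it
        stems.setdefault(v // 10, []).append(v % 10)

    lines = []
    # Write every stem from least to greatest, even one with no leaves
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))  # no commas
        lines.append(f"{stem} | {leaves}".rstrip())

    # Always include a key, built here from the smallest value
    smallest = min(values)
    lines.append(f"Key: {smallest // 10} | {smallest % 10} = {smallest}")
    return lines

scores = [42, 45, 51, 58, 58, 73, 70, 64, 91]  # hypothetical test scores
lines = stemplot(scores)
print("\n".join(lines))
```

Stem 8 has no leaves in this data, but it is still written so the gap is visible.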
Back to back stemplots
Compares two sets of data
Splitting Stems
Used when the data is clustered on one or just a few stems
2 stems for every 10 digits (leaves 0-4 and 5-9) or 5 stems for every 10 digits (leaves 0-1, 2-3, 4-5, 6-7, 8-9)
Histograms
Observations are placed into bins/bars of equal widths
Count how many observations are in each bin & draw bar to that height
Similar to a bar chart, but there are no spaces between the bars
Bin width can be changed but you need a minimum of 5 bars
Label the x-axis, y-axis, and title
Can do % or # on y-axis
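A rough sketch of how the bar heights of a histogram come from counting observations in equal-width bins. The `bin_counts` helper, the bin choices, and the `heights` data are assumptions for illustration; each bin here includes its left edge and excludes its right edge.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each equal-width bin [left, left + width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:      # ignore values outside the bins
            counts[index] += 1
    return counts

# Hypothetical heights in inches
heights = [61, 62, 64, 65, 65, 66, 67, 68, 68, 69, 70, 71, 72, 74, 75]
start, width, n_bins = 60, 3, 6      # 6 bins, each 3 inches wide
counts = bin_counts(heights, start, width, n_bins)

for i, count in enumerate(counts):
    left = start + i * width
    percent = count * 100 / len(heights)
    # The y-axis could show either the count (#) or the percent (%)
    print(f"{left}-{left + width}: {'#' * count} ({count}, {percent:.0f}%)")
```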
Describing Distributions
You must include:
Shape
Center
Spread
SHAPE- symmetric or skewed
Symmetric- left and right sides are approximate mirror images
Skewed Left- majority of the data is on the right side, but trails out to the left
Skewed Right- majority of the data is on the left side, but trails out to the right
SHAPE- modes
Unimodal- one major peak (mode) that the data clusters around
Bimodal- two major modes
Multimodal- multiple modes
Uniform- data is flat (all values occur about equally often, no clear mode)
SHAPE- unusual features
Outliers- observations away from the main distribution
Gaps- spaces between clumps of data
CENTER- mean or median
Mean- used with symmetric graphs (no outliers)
Median- skewed left/right (has outliers)
SPREAD
Standard deviation- goes with mean
IQR- goes with median
Always list the range of the data (min,max)
Center
Mean: the average (sum of the values divided by how many there are)
Only used for symmetric distributions with no outliers
Median:
Put #’s in order from least to greatest
Find middle number
If there are two middle numbers, then average the two #’s
Quartiles
The medians of the upper and lower half
First (Q1) and third (Q3) quartile
Spread
Spread: the measures of spread that go with the median are the IQR and the range
Range: written as (min, max) or as a single number, Range = max - min
IQR:
The difference between Q3 and Q1
IQR= Q3-Q1
Represents the middle 50% of the data
5 number summary
Minimum
Q1
Median
Q3
Maximum
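A sketch of the 5 number summary, IQR, and range in code. It assumes the convention that when the number of values is odd, the overall median is left out of both halves before finding Q1 and Q3; other conventions exist and can give slightly different quartiles. The data is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Minimum, Q1, median, Q3, maximum using the median-of-halves method."""
    data = sorted(values)                 # least to greatest
    n = len(data)
    lower_half = data[: n // 2]           # values below the median position
    upper_half = data[(n + 1) // 2 :]     # values above the median position
    return {
        "min": data[0],
        "Q1": median(lower_half),
        "median": median(data),
        "Q3": median(upper_half),
        "max": data[-1],
    }

data = [7, 3, 9, 12, 5, 15, 10, 8, 11]    # hypothetical
summary = five_number_summary(data)
iqr = summary["Q3"] - summary["Q1"]           # middle 50% of the data
data_range = summary["max"] - summary["min"]  # max - min

print(summary, "IQR =", iqr, "Range =", data_range)
```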
Outliers?
Finding outliers: IQR Test
Find 1.5(IQR)
Use that #: subtract it from Q1 to get A and add it to Q3 to get B
A= Q1-(1.5 times IQR)
B= Q3+(1.5 times IQR)
(A,B) is the acceptable range
Any # outside (A,B) is an outlier
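The IQR test can be sketched as below. The quartiles use the median-of-halves convention (an assumption; other quartile methods may move the fences slightly), and `iqr_fences` and the data are made up for the example.

```python
from statistics import median

def iqr_fences(values):
    """Return (A, B), the acceptable range from the 1.5 x IQR rule."""
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])
    q3 = median(data[(n + 1) // 2 :])
    iqr = q3 - q1
    a = q1 - 1.5 * iqr    # lower fence
    b = q3 + 1.5 * iqr    # upper fence
    return a, b

data = [3, 5, 7, 8, 9, 10, 12, 40]   # hypothetical
a, b = iqr_fences(data)

# Any value outside (A, B) is flagged as an outlier
outliers = [x for x in data if x < a or x > b]
print(f"Acceptable range: ({a}, {b}); outliers: {outliers}")
```

Here Q1 = 6 and Q3 = 11, so IQR = 5, 1.5(IQR) = 7.5, and the acceptable range is (-1.5, 18.5).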
Spread: Standard Deviation
SD goes with the mean
Describes how spread out the data is (Variability)
Symbol: s
The higher the # is, the more spread out the data
Properties of Standard Deviation
s=0, when all data points are the same number
s is never negative (s ≥ 0)
Standard deviation is affected by outliers
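A short check of these properties, assuming s is the sample standard deviation (dividing by n - 1), which is what Python's `statistics.stdev` computes. The `sample_sd` helper and the three small data sets are hypothetical.

```python
from math import sqrt
from statistics import stdev

def sample_sd(values):
    """Sample standard deviation: sqrt of squared deviations summed over (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

same = [4, 4, 4, 4]            # all data points are the same number
tight = [2, 4, 4, 6]           # close to the mean
with_outlier = [2, 4, 4, 30]   # one value far from the rest

print(sample_sd(same))                      # no variability at all
print(round(sample_sd(tight), 2))           # small spread
print(round(sample_sd(with_outlier), 2))    # the outlier inflates s
```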
Mean vs. Median in distributions
Symmetric
mean ≈ median
Skewed Left
mean < median
Skewed Right
mean > median
Rules:
The mean is affected by outliers
The median is resistant to outliers
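A quick illustration of the mean being pulled toward the tail while the median stays put. The three small data sets are made up so that they share the same median.

```python
from statistics import mean, median

symmetric = [1, 2, 3, 4, 5]
skewed_right = [1, 2, 3, 4, 20]    # long tail to the right
skewed_left = [-14, 2, 3, 4, 5]    # long tail to the left

for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("skewed left", skewed_left)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

All three medians are 3, but the means are 3, 6, and 0, matching the pattern above.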
Comparing Distributions
When comparing 2 distributions, you must still… state the shape, center, and spread of both distributions.
You must also compare the 3 things in each distribution
Use comparison words: higher, lower, similar, same, wider, etc.
Cannot compare mean to median
GUIDE:
The shape of D1 is ____ which is similar to/different than the shape of D2 which is ____. Both are unimodal.
The center of D1 is the mean/median of ____ which is higher/lower than the mean/median of D2 which is _____.
The IQR/SD of D1 is _____ which is wider/narrower than the IQR/SD of D2 which is _____. The range of D1 is (_,_) which is wider/narrower than the range of D2 which is (_,_).
Other: mention any outliers and gaps.
If I ADD an upper outlier, how will it affect the summary stats?
Mean-increase
Median-stays about the same (resistant)
Range-increase
IQR-stays about the same (resistant)
SD- increase
If I ADD a lower outlier, how will it affect the summary stats?
Mean-decrease
Median-stays about the same (resistant)
Range-increase
IQR-stays about the same (resistant)
SD-increase
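These effects can be checked on a small example. This is only a sketch with hypothetical data, using the median-of-halves convention for quartiles and the sample standard deviation. Note that the median and IQR can shift a little when a value is added, since the middle position moves, but far less than the mean, range, and SD do.

```python
from statistics import mean, median, stdev

def summary_stats(values):
    """Mean, median, range, IQR, and SD for a list of numbers."""
    data = sorted(values)
    n = len(data)
    q1 = median(data[: n // 2])
    q3 = median(data[(n + 1) // 2 :])
    return {
        "mean": mean(data),
        "median": median(data),
        "range": data[-1] - data[0],
        "IQR": q3 - q1,
        "SD": stdev(data),
    }

original = [10, 12, 13, 14, 15, 16, 18]      # hypothetical
before = summary_stats(original)
after = summary_stats(original + [60])       # add one upper outlier

for name in before:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

Swapping the 60 for a very small value would show the lower-outlier case: the mean drops while the range and SD still increase.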