i've seen this happen in other people's lives/now it's happening in mine !
what are the 3 features used to describe the distribution of a numerical variable?
shape
center
spread
what are the 5 shapes of a numerical distribution?
bell-shaped
left-skewed
right-skewed
bimodal
uniform
what are the measures of center and spread for symmetric and bell-shaped distributions?
mean (average) and standard deviation (variation around the mean)
x̄ is sample mean
μ is population mean
how do you calculate mean on the ti-84?
STAT → edit → enter data into L1
STAT → calc → 1: 1-Var Stats → list: L1 → enter → look for x̄
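a minimal python sketch of the same calculation, in case you want to double-check the calculator's x̄. the data values are made up for illustration.

```python
from statistics import mean

# hypothetical sample data, standing in for whatever you'd type into L1
data = [4, 8, 15, 16, 23, 42]

x_bar = mean(data)  # sample mean: sum of the values divided by how many there are
print(x_bar)
```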
what is the difference in “balancing point” between symmetric and non-symmetric data?
mean is the balancing point for both symmetric and non-symmetric data.
mean represents “typical value” well in symmetric data, but not in non-symmetric data.
what is standard deviation?
how far away observations are from the mean → “give or take” about the typical value
σ is population standard deviation
s is sample standard deviation
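a small sketch of s vs. σ using python's statistics module: stdev divides by n - 1 (sample), pstdev divides by n (population). the data is hypothetical. on the ti-84's 1-Var Stats screen these should correspond to Sx and σx.

```python
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values; their mean is 5

s = stdev(data)       # sample standard deviation (divides by n - 1)
sigma = pstdev(data)  # population standard deviation (divides by n)

print(round(s, 3), sigma)
```

s comes out a little bigger than σ for the same data, since it divides by a smaller number.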
what are some unique properties of standard deviation?
it equals zero when all data values are the same (obviously)
NOT resistant to the pull of outliers. so strong skew or outliers can greatly increase standard deviation.
share same units of measurement as the data values
a bell-shaped distribution does NOT automatically have the smallest standard deviation. the size depends on how tightly the values cluster around the mean (tighter cluster → smaller standard deviation).
what is the empirical rule for standard deviation?
if the distribution is bell-shaped,
within 1 standard deviation: about 68% of observations
within 2 standard deviations: about 95% of observations
within 3 standard deviations: about 99.7% of observations
…of the mean (μ).
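a quick sketch of turning the empirical rule into actual ranges. the function name and the example numbers (mean 100, standard deviation 15) are hypothetical, and the percentages only hold if the data is roughly bell-shaped.

```python
def empirical_rule_ranges(mu, sigma):
    """Ranges expected to hold about 68%, 95%, and 99.7% of observations,
    assuming the distribution is bell-shaped."""
    return {
        "68%": (mu - sigma, mu + sigma),
        "95%": (mu - 2 * sigma, mu + 2 * sigma),
        "99.7%": (mu - 3 * sigma, mu + 3 * sigma),
    }

# hypothetical example: mean 100, standard deviation 15
ranges = empirical_rule_ranges(100, 15)
for pct, (low, high) in ranges.items():
    print(f"about {pct} of observations fall between {low} and {high}")
```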
how do you calculate variance?
variance = (standard deviation)², so s² for a sample and σ² for a population.
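a tiny check that variance really is the standard deviation squared, using python's statistics module on made-up data.

```python
from statistics import pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values

sample_var = variance(data)  # s squared (divides by n - 1)
pop_var = pvariance(data)    # sigma squared (divides by n)

# squaring the sample standard deviation gives the sample variance
print(sample_var, stdev(data) ** 2, pop_var)
```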
what is a z-score and how is it calculated?
measure of how many standard deviations an observation is away from the mean
equation is z = (observation - mean) / standard deviation → on the TI-84 under PRGM 8: ZSCORE (if the ZSCORE program is loaded on your calculator)
JUDGING "UNUSUAL" FROM A Z-SCORE APPLIES TO BELL-SHAPED DATA ONLY!!
what z-scores are considered unusual?
unusual: greater than 2 or less than -2
definitely unusual: greater than 3 or less than -3
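a minimal sketch of the z-score formula plus the "unusual" cutoffs from this card. the function names and the example mean/standard deviation (100 and 15) are made up, and the labels only make sense for bell-shaped data.

```python
def z_score(x, mean, sd):
    """How many standard deviations x is away from the mean."""
    return (x - mean) / sd

def label(z):
    """Apply the cutoffs from the card: beyond +/-3, beyond +/-2, or neither."""
    if abs(z) > 3:
        return "definitely unusual"
    if abs(z) > 2:
        return "unusual"
    return "not unusual"

# hypothetical bell-shaped data with mean 100 and standard deviation 15
for x in (110, 135, 150):
    z = z_score(x, 100, 15)
    print(x, round(z, 2), label(z))
```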
what is the measure of center and spread for skewed distributions?
median (middle value when data is sorted from smallest to largest) and IQR (interquartile range)
what is median?
the value that is right in the middle when the data is sorted from smallest to largest OR average of the two middle numbers if sample size is even
aka 50th percentile because 50% of observations are at or below it
calculated by: enter data in L1 → STAT → calc → 1: 1-Var Stats → scroll down → look for Med
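a short sketch of the odd vs. even case using python's statistics.median, which sorts the data for you. the values are made up.

```python
from statistics import median

odd_data = [3, 1, 7]      # sorted: 1, 3, 7 -> the middle value is the median
even_data = [3, 1, 7, 5]  # sorted: 1, 3, 5, 7 -> average of the two middle values

odd_median = median(odd_data)
even_median = median(even_data)
print(odd_median, even_median)
```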
what is a percentile?
value/observation in dataset where a certain % of the observations are less than that value
eg. a value at 40th percentile means 40% of observations are less than the value
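a rough sketch of the idea, using the "less than" definition from this card (some textbooks count "at or below" instead, which can shift the answer slightly). the function name and data are hypothetical.

```python
def percent_below(data, value):
    """Percent of observations strictly less than value."""
    count = sum(1 for x in data if x < value)
    return 100 * count / len(data)

data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # made-up values

# 4 of the 10 observations are less than 50, so 50 sits at the 40th percentile
pct = percent_below(data, 50)
print(pct)
```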
what is a quartile?
special types of percentiles that divide the data into fourths
Q1 = 25th percentile
Q2 = 50th percentile (median)
Q3 = 75th percentile
what is interquartile range?
for skewed distributions: the measure of spread that goes with the median, kind of like standard deviation goes with the mean.
tells us roughly how much space the middle 50% of the data occupies (from Q1 to Q3)
IQR = Q3 - Q1
Q1, med, and Q3 all found in STAT → calc → 1: 1-Var Stats
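a sketch of finding the quartiles and IQR by hand. it uses the median-of-halves method (split the sorted data at the median, then take the median of each half), which as far as i know is what the ti-84 does. other software may use a different quartile rule and give slightly different numbers. the function name and data are made up.

```python
from statistics import median

def quartiles(data):
    """Q1, median, Q3 using the median-of-halves method: split the sorted
    data at the median (leaving the median itself out when n is odd) and
    take the median of each half."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[half + 1:] if len(xs) % 2 else xs[half:]
    return median(lower), median(xs), median(upper)

data = [1, 3, 4, 6, 7, 8, 10, 12, 50]  # made-up, right-skewed values
q1, med, q3 = quartiles(data)
iqr = q3 - q1  # width of the middle 50% of the data
print(q1, med, q3, iqr)
```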
what do you have to do to a bimodal or multimodal distribution before measuring trends?
split the data into at least 2 sub-populations like gender, age, etc. then measure trends within those groups individually.
multimodal is usually indicative of multiple groups mushed together anyway
effect of outliers on measures of center and spread?
affected by outliers (sensitive): mean, standard deviation, range (largest # - smallest #). → use on bell-shaped distributions with no outliers.
not affected by outliers (resistant): median, interquartile range (IQR). → use on skewed left and skewed right distributions, and sometimes bell-shaped distributions with outliers
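a small demo of sensitive vs. resistant, assuming made-up data: adding one big outlier drags the mean a lot but barely moves the median.

```python
from statistics import mean, median

data = [10, 12, 13, 14, 15]   # made-up values, no outliers
with_outlier = data + [100]   # same values plus one large outlier

mean_shift = mean(with_outlier) - mean(data)
median_shift = median(with_outlier) - median(data)

print("mean moved by", round(mean_shift, 2))
print("median moved by", median_shift)
```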
what is the 5-number summary and how is it calculated?
minimum, first quartile (Q1), median, third quartile (Q3), maximum
STAT → calc → 1: 1-Var Stats → scroll down → write down everything from [minX] to [maxX] in the order they appear
how do you identify potential outliers?
symmetric data: empirical rule and z-scores
non-symmetric/skewed data: use limits/fences
→ calculate IQR (Q3 - Q1)
→ calculate left limit/lower fence = Q1 - (1.5*IQR)
→ calculate right limit/upper fence = Q3 + (1.5*IQR)
→ !! values outside of this range are considered potential outliers
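the fence steps above can be sketched like this. the function name and data are hypothetical, and the quartiles are typed in by hand (they are the median-of-halves quartiles for this data), the same way you'd copy Q1 and Q3 off the calculator.

```python
def fences(q1, q3):
    """Lower and upper fences: 1.5 IQRs outside the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [1, 3, 4, 6, 7, 8, 10, 12, 50]  # made-up, right-skewed values
q1, q3 = 3.5, 11                       # quartiles for this data

lower, upper = fences(q1, q3)
# anything outside the fences is a potential outlier
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)
```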
how do you draw boxplots / box and whisker plots?
determine and plot the potential outliers
draw lines at Q1, Q3, and median
draw box connecting Q1 to Q3 (pass thru median)
leave the outliers as separate points (the whiskers don't reach them)
draw a horizontal line (whisker) from each end of the box to the smallest and largest values that aren’t outliers
what are the pros and cons of boxplots?
pros: shows typical range of values, potential outliers, and variation (esp. when comparing distributions)
cons: doesn’t show modality (# of peaks) or mean, doesn’t work for small data sets (esp. when <5 numbers)
how can you tell if a boxplot is skewed?
the “whisker” on one side will be insanely long, and the skew is toward that long side. there will probably be an outlier on the insanely long side too. (the sketch here is left-skewed.)
o ————— {{ | }} ———
how do you draw a boxplot on the ti-84 calculator?
STAT → edit → enter data in L1
2nd → Y= (statplot) → turn plot1 on → select box plot with outliers icon (bottom left corner) → Xlist = L1
ZOOM → 9: ZoomStat → use TRACE and arrow keys to navigate graph