descriptive statistics

descriptive statistics

describe what has already happened = statistics about past data
descriptive measures computed from a sample are statistics; those computed from a population are parameters

three main characteristics used for the numerical description of data?

Center, variability, and shape

Center

(average, middle, most common)

Variability

(how spread out the data is)

Shape

(symmetry, skewness, tails)

Measure of center: Mean

average

uses all data
sensitive to outliers

Measure of center: median

middle value of the sorted data

good if outliers exist

If 𝑛 is odd → middle value.

If 𝑛 is even → average of the two middle values.

Measure of center: mode

most frequent value

good for categorical or discrete data

Measure of center: midrange

(max + min) / 2

easy to compute but distorted by outliers
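
The four measures of center above can be compared on one small dataset. This is a minimal sketch using Python's standard `statistics` module; the sample values are made up for illustration, with 95 included as a deliberate outlier.

```python
import statistics

# Hypothetical sample; 95 is a deliberate outlier.
data = [2, 3, 3, 5, 7, 95]

mean = statistics.mean(data)            # uses every value, so 95 pulls it upward
median = statistics.median(data)        # n is even: average of the two middle values (3 and 5)
mode = statistics.mode(data)            # most frequent value
midrange = (max(data) + min(data)) / 2  # depends only on the two extremes

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("midrange:", midrange)
```

The median and mode stay near the bulk of the data, while the mean and especially the midrange are dragged toward the outlier.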

Measure of center: geometric mean

average for products or ratios
multiply all n values
take the n-th root of the product (values must be positive)
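
The two steps above (multiply, then take the n-th root) can be sketched as follows. The values are hypothetical; `statistics.geometric_mean` is the standard-library equivalent (Python 3.8+) and is shown only as a cross-check.

```python
import math
import statistics

values = [2, 4, 8]  # hypothetical positive values

# Step 1: multiply all n values. Step 2: take the n-th root.
product = math.prod(values)
geo_manual = product ** (1 / len(values))

# Standard-library equivalent; should agree up to floating-point rounding.
geo_builtin = statistics.geometric_mean(values)

print(geo_manual, geo_builtin)
```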

growth rate using geometric mean

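measure of center best suited to ratios, percentages, and growth
special use of the geometric mean to find the average rate of change over time
growth rate = (last value / first value)^(1/(n − 1)) − 1, where n is the number of data points (n − 1 periods of change)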
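
A small sketch of the average growth rate, assuming the usual textbook form: the (n − 1)-th root of last/first, minus 1. The revenue figures are invented for illustration.

```python
# Hypothetical yearly values: 3 data points, so 2 periods of change.
revenue = [100.0, 110.0, 121.0]

periods = len(revenue) - 1
growth_rate = (revenue[-1] / revenue[0]) ** (1 / periods) - 1

print(f"average growth per period: {growth_rate:.1%}")
```

Only the first and last values enter the calculation, so the result is the constant per-period rate that would take the series from its start to its end.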

arithmetic mean (ordinary average)

add all the values, then divide by the number of items

measure of center

Measure of center: trimmed mean

average after removing extreme high/low values
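
One way to sketch a trimmed mean is to drop a fixed count `k` of values from each end of the sorted data. The helper name `trimmed_mean` and the choice to trim by count (rather than by percentage) are assumptions made for this example.

```python
import statistics

def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest values."""
    if k < 0 or 2 * k >= len(values):
        raise ValueError("k must leave at least one value")
    ordered = sorted(values)
    return statistics.mean(ordered[k:len(ordered) - k])

data = [1, 4, 5, 6, 50]          # hypothetical data with extremes at both ends
result = trimmed_mean(data, 1)   # averages only 4, 5, 6
print(result)
```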

Measure of center

Right-skewed (positive skew): typically Mean > Median > Mode
Symmetric (e.g., normal): Mean = Median = Mode
Left-skewed (negative skew): typically Mean < Median < Mode

Measure of Variability: range

difference between the largest and smallest observations

sensitive to extreme values but easy to interpret

max - min

Measure of Variability: sample variance

sum of squared deviations from the sample mean (how far each data value is from the mean), divided by n − 1
s² = Σ(xᵢ − x̄)² / (n − 1)

Measure of Variability: standard deviation

typical spread of the data around the mean; square root of the variance
most common measure of variability; same units as the data
symbol: σ (population), s (sample)
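
Variance and standard deviation can be computed by hand and checked against the standard library. In this sketch the data are hypothetical; `statistics.variance`/`stdev` divide by n − 1 (sample), while `pvariance`/`pstdev` divide by n (population).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
n = len(data)
mean = sum(data) / n

# Sample variance: sum of squared deviations divided by n - 1.
sample_var = sum((x - mean) ** 2 for x in data) / (n - 1)
# Standard deviation: square root of the variance, in the data's own units.
sample_sd = sample_var ** 0.5

# Library cross-checks.
lib_sample_var = statistics.variance(data)   # divides by n - 1
lib_pop_sd = statistics.pstdev(data)         # divides by n

print(sample_var, sample_sd, lib_sample_var, lib_pop_sd)
```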

Measure of Variability: coefficient of variation

CV = (SD / mean) × 100%
compares spread across datasets with different units or scales
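
Because the CV is unit-free, it allows a comparison such as heights in cm against weights in kg. A minimal sketch, assuming the sample SD is used and the mean is nonzero; the function name and both datasets are hypothetical.

```python
import statistics

def coefficient_of_variation(values):
    """Sample SD expressed as a percentage of the mean (mean must be nonzero)."""
    return statistics.stdev(values) / statistics.mean(values) * 100

heights_cm = [150, 160, 170, 180, 190]  # hypothetical
weights_kg = [50, 60, 70, 80, 90]       # hypothetical

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

print(f"heights CV: {cv_heights:.1f}%")
print(f"weights CV: {cv_weights:.1f}%")
```

Both datasets have the same SD in their own units, yet the weights vary more relative to their mean, which is what the CV picks up.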

Measure of Variability: mean absolute deviation

average absolute distance of the data values from the mean
abbreviated MAD
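
A short sketch of the MAD, taking the mean as the center; the function name and data are illustrative only.

```python
def mean_absolute_deviation(values):
    """Average of |x - mean| over all values."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

# Mean is 5, so the absolute distances are 3, 1, 1, 3.
mad = mean_absolute_deviation([2, 4, 6, 8])
print(mad)
```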

standardized data

rescale data so everything is measured in terms of how far it is from the mean, using the standard deviation as the unit.

z-score

z-score says how many standard deviations a data point is from the mean = how far from average, in standard deviation units
z = (x − mean) / SD
Positive z → above the mean.
Negative z → below the mean.

Helps spot unusual data:

|z| > 2 → unusual
|z| > 3 → outlier
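
The z-score formula and the |z| thresholds above can be sketched like this. Using the sample SD is an assumption of this example, and the helper names `z_scores` and `label` are hypothetical.

```python
import statistics

def z_scores(values):
    """Standardize each value: (x - mean) / SD, using the sample SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(x - mean) / sd for x in values]

def label(z):
    """Apply the rule of thumb: |z| > 3 outlier, |z| > 2 unusual."""
    if abs(z) > 3:
        return "outlier"
    if abs(z) > 2:
        return "unusual"
    return "typical"

scores = z_scores([10, 20, 30])  # mean 20, sample SD 10
print(scores)
print(label(2.5), label(-3.2), label(0.4))
```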

empirical rule

describes how data is spread out in a normal distribution (bell curve)

±1 standard deviation from the mean → about 68% of the data falls here.
±2 standard deviations from the mean → about 95% of the data falls here.
±3 standard deviations from the mean → about 99.7% of the data falls here.

It helps you predict where most data will fall.
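
The 68 / 95 / 99.7 figures can be checked against an exact normal distribution. This sketch uses `statistics.NormalDist` from the standard library (Python 3.8+); the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of a normal distribution lying within k standard deviations of the mean.
shares = {}
for k in (1, 2, 3):
    shares[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within ±{k} SD: {shares[k]:.1%}")
```

The rule only describes data that are roughly bell-shaped; skewed data can deviate noticeably from these percentages.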

estimating sigma

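fast estimate of SD if the data is roughly normal and you only know the min & max.
common rule of thumb: σ ≈ (max − min) / 6, because about 99.7% of normal data falls within ±3σ of the mean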
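
A minimal sketch of the quick estimate, assuming the range-over-6 rule of thumb (the range of roughly normal data spans about six standard deviations). The divisor 6 is an assumption of this example; some texts divide by 4 instead.

```python
def estimate_sigma(minimum, maximum):
    """Rough SD estimate for roughly normal data: range / 6 (assumed rule of thumb)."""
    return (maximum - minimum) / 6

# Hypothetical exam scores ranging from 40 to 100.
estimate = estimate_sigma(40, 100)
print(estimate)
```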

percentiles

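position of a value relative to the rest of the sorted data: the Pth percentile is the value with about P% of the data below it

ex: 90th percentile = about 90% of values fall below it

often used to report scores on national educational tests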

quartiles

points on the scale that divide sorted data into 4 groups of equal size

Q1 (First Quartile) = 25th percentile → 25% of data is below it.
Q2 (Second Quartile) = 50th percentile = Median → half the data is below it.
Q3 (Third Quartile) = 75th percentile → 75% of data is below it.

Interquartile Range (IQR): IQR = Q3 − Q1

Quartiles show center (Q2) and spread (Q1 to Q3); the IQR is the width of the middle 50% of the data.
Helps detect outliers using fences:
- Lower Fence = Q1 − 1.5 × IQR
- Upper Fence = Q3 + 1.5 × IQR
- Any data outside the fences = outlier.
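
The fence rule can be sketched as a small helper. The quartile values and dataset below are hypothetical, and `fences` is an illustrative name.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) using the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

q1, q3 = 10, 20                      # hypothetical quartiles, so IQR = 10
lower, upper = fences(q1, q3)

data = [-8, 3, 12, 15, 18, 22, 40]   # hypothetical observations
outliers = [x for x in data if x < lower or x > upper]

print(lower, upper, outliers)
```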

Methods for Finding Quartiles

Method of Medians:
- Find Q2 (median) first.
- Q1 = median of lower half.
- Q3 = median of upper half.
Interpolation Method:
- Uses a formula to interpolate between neighboring values when the quartile position falls between two data points.
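
The method of medians can be sketched as below. One detail is an assumption of this example: when n is odd, the overall median is left out of both halves (some textbooks include it, which can shift Q1 and Q3 slightly). The function name is hypothetical.

```python
import statistics

def quartiles_by_medians(values):
    """Return (Q1, Q2, Q3) using the method of medians."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # For odd n, skip the middle value so it belongs to neither half.
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

print(quartiles_by_medians([1, 2, 3, 4, 5, 6, 7, 8]))  # even n
print(quartiles_by_medians([1, 2, 3, 4, 5, 6, 7]))     # odd n
```

Interpolation-based methods (for example `statistics.quantiles`) may return slightly different quartiles for the same data, so it helps to state which method was used.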

box plots

simple graph that shows a dataset’s center, spread, and skewness using five key numbers:

Minimum (smallest value, not an outlier)
Q1 (25th percentile)
Median (Q2) (50th percentile)
Q3 (75th percentile)
Maximum (largest value, not an outlier)

= 5-number summary.

How to Read a Boxplot

The box = middle 50% of the data (from Q1 → Q3).
The line inside the box = the median (Q2).
The “whiskers” extend to the smallest and largest values within the fences (not outliers).
Any points beyond whiskers = outliers.
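
To make the whisker rule concrete, this sketch finds where the whiskers end and which points are drawn as outliers. The quartiles are passed in as given values, and the function name and data are hypothetical.

```python
def whiskers_and_outliers(values, q1, q3):
    """Return (lower whisker end, upper whisker end, sorted outliers)."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data values still inside the fences.
    inside = [x for x in values if lower_fence <= x <= upper_fence]
    outside = sorted(x for x in values if x < lower_fence or x > upper_fence)
    return min(inside), max(inside), outside

data = [1, 10, 12, 14, 15, 17, 19, 40]   # hypothetical data
# Q1 = 11 and Q3 = 18 here (medians of the lower and upper halves).
low_end, high_end, outliers = whiskers_and_outliers(data, q1=11, q3=18)
print(low_end, high_end, outliers)
```

Note that the whiskers end at actual data values (1 and 19), not at the fences themselves (0.5 and 28.5).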

Fences (for detecting outliers)

Lower Fence = Q1 – 1.5 × IQR
Upper Fence = Q3 + 1.5 × IQR
Outliers are values outside these fences.

Shape of Distribution

Skewness = lack of symmetry (which side the longer tail is on).
- Right-skewed: tail on right.
- Left-skewed: tail on left.
Kurtosis = tail heaviness & peakedness.
- High kurtosis = heavy tails (outliers likely).
- Low kurtosis = light tails, flatter distribution.
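
Skewness and kurtosis can be put into numbers with moment-based formulas. This is a sketch using the simple population-moment versions (third and fourth central moments scaled by the variance); textbooks and software often apply sample-size adjustments, so their values may differ. The function name is hypothetical and the data must not all be equal.

```python
def moment_shape(values):
    """Return (skewness, excess kurtosis) from population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance (divide by n)
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3   # 0 for a normal distribution
    return skewness, excess_kurtosis

symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 1, 2, 2, 3, 10]

print(moment_shape(symmetric))   # skewness of 0: no longer tail on either side
print(moment_shape(right_tail))  # positive skewness: tail on the right
```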
