descriptive statistics


Description and Tags

lock in part2


31 Terms

1

descriptive statistics

  • describe what has already happened = summaries of past data

  • descriptive measures computed from a sample are statistics; those computed from a population are parameters

2

three main characteristics used for the numerical description of data?

Center, variability, and shape

3

Center

(average, middle, most common)

4

Variability

(how spread out the data is)

5

Shape

(symmetry, skewness, tails)

6

Measure of center: Mean

average

  • uses all data 

  • sensitive to outliers 

7

Measure of center: median

middle value

  • good if outliers exist

If 𝑛 is odd → middle value.

If 𝑛 is even → average of the two middle values.

8

Measure of center: mode

most frequent value

  • good for categorical or discrete data
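
A quick sketch of mean, median, and mode on one small data set, using Python's standard `statistics` module. The numbers are made up for illustration; 40 plays the role of an outlier so the mean and median visibly disagree.

```python
import statistics

data = [2, 3, 3, 5, 7, 10, 40]  # made-up sample; 40 is an outlier

mean = statistics.mean(data)      # uses every value, so 40 pulls it upward
median = statistics.median(data)  # middle of the sorted values (n is odd here)
mode = statistics.mode(data)      # most frequent value

print("mean:", mean, "median:", median, "mode:", mode)
```

Here the mean lands at 10 while the median stays at 5, which is why the median is preferred when outliers exist.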

9

Measure of center: midrange

(max + min) / 2

  • easy but distorted by outliers
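
A minimal sketch of the midrange on a made-up data set, showing how a single extreme value drags it away from the bulk of the data.

```python
data = [2, 3, 3, 5, 7, 10, 40]  # made-up sample; 40 is an outlier

# midrange = (max + min) / 2, so it depends only on the two extremes
midrange = (max(data) + min(data)) / 2

print("midrange:", midrange)
```

Most values here are 10 or less, yet the midrange comes out at 21.0 because only the smallest and largest values are used.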

10

Measure of center: geometric mean

  • average for products/ratios (multiplicative data, positive values only)

  • multiply all n values

  • take the n-th root of the product
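
The two steps above (multiply, then take the n-th root) can be sketched as follows. The values are made up, and `statistics.geometric_mean` from the standard library is used only as a cross-check.

```python
import math
import statistics

values = [2, 4, 8]  # made-up positive values

# step 1: multiply all values; step 2: take the n-th root
product = math.prod(values)
geo_mean = product ** (1 / len(values))

# the standard library should agree up to floating-point rounding
check = statistics.geometric_mean(values)

print(geo_mean, check)
```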

11

growth rate using geometric mean

  • measure of center but best for ratio, percent, growth

  • special use of geometric mean to find average rate of change across time 
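
A sketch of the average growth rate, assuming the common textbook form: the (n − 1)-th root of last value over first value, minus 1. The revenue figures are made up, and the variable names are only illustrative.

```python
revenue = [100.0, 110.0, 121.0]  # made-up yearly values (3 data points)

periods = len(revenue) - 1  # n values span n - 1 growth periods

# average growth rate per period = (last / first) ** (1 / periods) - 1
growth_rate = (revenue[-1] / revenue[0]) ** (1 / periods) - 1

print(f"average growth per period: {growth_rate:.1%}")
```

Only the first and last values enter the formula; the values in between do not change the result.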

12

arithmetic mean (the ordinary average)

add all values, then divide by the number of values

measure of center

13

Measure of center: trimmed mean

average after removing a fixed percentage of the most extreme high and low values
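
A minimal sketch of a trimmed mean. The helper `trimmed_mean` and its `trim_fraction` parameter are hypothetical names, and it assumes the same fraction is cut from each end of the sorted data (conventions vary).

```python
import statistics

def trimmed_mean(values, trim_fraction):
    """Drop trim_fraction of the values from each end, then average the rest."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # how many values to cut per end
    kept = ordered[k:len(ordered) - k]
    return statistics.mean(kept)

data = [1, 4, 5, 6, 7, 8, 9, 10, 11, 90]  # made-up sample with extremes 1 and 90

plain = statistics.mean(data)
trimmed = trimmed_mean(data, 0.10)  # cuts one value from each end

print("plain mean:", plain, "10% trimmed mean:", trimmed)
```

The trim fraction must stay below 0.5, otherwise nothing is left to average.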

14

Measure of center

  • Right-skewed (positive skew): typically Mean > Median > Mode

  • Symmetric (e.g., normal): Mean = Median = Mode

  • Left-skewed (negative skew): typically Mean < Median < Mode

15

Measure of Variability: range

difference between the largest & smallest observations

  • sensitive to extreme values but easy to interpret

max − min

16

Measure of Variability: sample variance

sum of squared deviations from the mean (a deviation = how far a data value is from the mean), divided by n − 1

17

Measure of Variability: standard deviation

  • spread of data around the mean; square root of the variance

  • most common measure of variability; same units as the data

  • symbol: σ (population), s (sample)
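
A sketch of sample variance and sample standard deviation worked out by hand, assuming the usual sample convention of dividing by n − 1. The data are made up; the `statistics` functions are there only to confirm the hand calculation.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

mean = statistics.mean(data)
squared_devs = [(x - mean) ** 2 for x in data]  # squared distance from the mean

sample_var = sum(squared_devs) / (len(data) - 1)  # divide by n - 1, not n
sample_sd = sample_var ** 0.5                     # back in the data's own units

print(sample_var, statistics.variance(data))
print(sample_sd, statistics.stdev(data))
```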

18

Measure of Variability: coefficient of variation

  • CV = (SD / mean) × 100%

  • compares spread across datasets with different units
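
A sketch of the CV comparing two made-up data sets in different units. `cv_percent` is a hypothetical helper name, and the sample standard deviation is assumed.

```python
import statistics

heights_cm = [150, 160, 170, 180, 190]  # made-up heights
weights_kg = [50, 60, 70, 80, 90]       # made-up weights

def cv_percent(values):
    """Coefficient of variation: SD as a percentage of the mean."""
    return statistics.stdev(values) / statistics.mean(values) * 100

print(f"heights CV: {cv_percent(heights_cm):.1f}%")
print(f"weights CV: {cv_percent(weights_kg):.1f}%")
```

Both lists have the same standard deviation, but the weights vary more relative to their mean, so their CV is larger.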

19

Measure of Variability: mean absolute deviation 

  • average absolute distance of the data values from the mean

  • abbreviated MAD
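
A minimal sketch of the MAD on made-up data, assuming the center is the mean.

```python
data = [2, 4, 6, 8, 10]  # made-up sample

mean = sum(data) / len(data)

# average of the absolute distances from the mean (signs are dropped)
mad = sum(abs(x - mean) for x in data) / len(data)

print("MAD:", mad)
```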

20

standardized data

rescale data so everything is measured in terms of how far it is from the mean, using the standard deviation as the unit.

21

z-score

  • z-score = how many standard deviations a data point is from the mean: z = (x − mean) / SD

  • Positive z → above the mean.

  • Negative z → below the mean.

Helps spot unusual data:

  • |z| > 2 → unusual

  • |z| > 3 → outlier
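
A sketch of z-scores with the two cutoffs from this card. The data are made up, the sample standard deviation is assumed, and `classify` is a hypothetical helper.

```python
import statistics

data = [10, 12, 11, 13, 12, 11, 10, 12, 11, 30]  # made-up sample

mean = statistics.mean(data)
sd = statistics.stdev(data)

# distance from the mean, measured in standard deviations
z_scores = [(x - mean) / sd for x in data]

def classify(z):
    """Apply the card's rule of thumb: |z| > 3 outlier, |z| > 2 unusual."""
    if abs(z) > 3:
        return "outlier"
    if abs(z) > 2:
        return "unusual"
    return "typical"

labels = [classify(z) for z in z_scores]

for x, z, label in zip(data, z_scores, labels):
    print(x, round(z, 2), label)
```

In a sample this small, one extreme value also inflates the standard deviation, so 30 is flagged as unusual without crossing the |z| > 3 line.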

22

empirical rule

describes how data is spread out in a normal distribution (bell curve)

  • ±1 standard deviation from the mean → about 68% of the data falls here.

  • ±2 standard deviations from the mean → about 95% of the data falls here.

  • ±3 standard deviations from the mean → about 99.7% of the data falls here.

It helps you predict where most data will fall.
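
The three percentages can be checked against the normal curve itself. This sketch uses `NormalDist` from Python's standard `statistics` module; the dictionary name is just illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# share of a normal distribution within k standard deviations of the mean
share_within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in share_within.items():
    print(f"within ±{k} SD: {share:.1%}")
```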

23

estimating sigma

fast estimate of SD if the data is roughly normal and you only know the min & max: σ ≈ (max − min) / 6, since nearly all normal data falls within ±3 standard deviations of the mean.
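
A sketch of the quick estimate, assuming the common range / 6 rule (the range of roughly normal data spans about six standard deviations). The min and max are made up.

```python
low, high = 40, 100  # made-up minimum and maximum

# nearly all normal data falls within ±3 SD, so the range covers about 6 SD
sigma_estimate = (high - low) / 6

print("estimated SD:", sigma_estimate)
```

This is only a rough guide; outliers or a non-normal shape can make it badly off.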

24

percentiles

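value below which a given percentage of the sorted data falls; shows the position of a value in the data

ex: 90th percentile = about 90% of values are at or below it

often used for national education tests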
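
A minimal sketch of finding where one score sits in a data set, assuming the "percent of values at or below" definition (other definitions exist). The scores are made up.

```python
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # made-up test scores
my_score = 90

# count how many scores are at or below mine, as a percentage of all scores
at_or_below = sum(1 for s in scores if s <= my_score)
percentile_rank = 100 * at_or_below / len(scores)

print(f"a score of {my_score} is at the {percentile_rank:.0f}th percentile")
```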

25

quartiles

scale points that divide sorted data into 4 groups of equal size

  • Q1 (First Quartile) = 25th percentile → 25% of data is below it.

  • Q2 (Second Quartile) = 50th percentile = Median → half the data is below it.

  • Q3 (Third Quartile) = 75th percentile → 75% of data is below it.
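
A sketch of the three quartiles using `statistics.quantiles` with its default method. The data are made up; other software or methods may give slightly different Q1 and Q3 for the same data.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # made-up sorted sample

# n=4 asks for the 3 cut points that split the data into 4 equal groups
q1, q2, q3 = statistics.quantiles(data, n=4)

print("Q1:", q1, "Q2:", q2, "Q3:", q3)
```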

26

Interquartile Range (IQR) IQR=Q3−Q1

  • Quartiles show center (Q2) and spread (Q1 to Q3); the IQR is the width of the middle 50% of the data.

  • Help detect outliers using fences:

    • Lower Fence = Q1−1.5(IQR)

    • Upper Fence = Q3+1.5(IQR)

    • Any data outside = outlier.
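
The fence rule above can be sketched as follows, on made-up data where 30 is deliberately extreme. Quartiles come from `statistics.quantiles` with its default method, which is one of several conventions.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]  # made-up sample

q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# anything outside the fences counts as an outlier
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("IQR:", iqr, "fences:", lower_fence, upper_fence, "outliers:", outliers)
```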

27

Methods for Finding Quartiles

  • Method of Medians:

    • Find Q2 (median) first.

    • Q1 = median of lower half.

    • Q3 = median of upper half.

  • Interpolation Method:

    • Estimates each quartile by interpolating between neighboring sorted values when its position is not a whole number.
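
A sketch of the method of medians. `quartiles_by_medians` is a hypothetical helper, and it assumes that when n is odd the median itself is left out of both halves (textbooks differ on this).

```python
import statistics

def quartiles_by_medians(values):
    """Return (Q1, Q2, Q3) using the median of each half of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # for odd n, skip the middle value so it belongs to neither half
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
    )

data = [3, 7, 8, 5, 12, 14, 21, 13, 18]  # made-up sample, n = 9
q1, q2, q3 = quartiles_by_medians(data)

print("Q1:", q1, "Q2:", q2, "Q3:", q3)
```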

28

box plots

simple graph that shows a dataset’s center, spread, and skewness using five key numbers:

  1. Minimum (smallest value, not an outlier)

  2. Q1 (25th percentile)

  3. Median (Q2) (50th percentile)

  4. Q3 (75th percentile)

  5. Maximum (largest value, not an outlier)

= 5-number summary.
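
A sketch of the five numbers as listed on this card, where the minimum and maximum are the most extreme values that are not outliers. It assumes the 1.5 × IQR fence rule and the default method of `statistics.quantiles`; the data are made up.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30]  # made-up sample; 30 is extreme
ordered = sorted(data)

q1, median, q3 = statistics.quantiles(ordered, n=4)
iqr = q3 - q1

# keep only the values that are not outliers under the 1.5 x IQR rule
inside = [x for x in ordered if q1 - 1.5 * iqr <= x <= q3 + 1.5 * iqr]

five_numbers = (inside[0], q1, median, q3, inside[-1])

print(five_numbers)
```

The value 30 is excluded from the maximum here, so on a box plot it would be drawn as a separate point.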

29

How to Read a Boxplot

  • The box = middle 50% of the data (from Q1 → Q3).

  • The line inside the box = the median (Q2).

  • The “whiskers” extend to the smallest and largest values within the fences (not outliers).

  • Any points beyond whiskers = outliers.

30

Fences (for detecting outliers)

  • Lower Fence = Q1 – 1.5 × IQR

  • Upper Fence = Q3 + 1.5 × IQR

  • Outliers are values outside these fences.

31

Shape of Distribution

  • Skewness = lack of symmetry (asymmetry).

    • Right-skewed: tail on right.

    • Left-skewed: tail on left.

  • Kurtosis = tail heaviness & peak.

    • High kurtosis = heavy tails (outliers likely).

    • Low kurtosis = light tails, flatter shape (outliers less likely).
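
A sketch of skewness and kurtosis as numbers, assuming the simple moment-based (population) formulas; statistical software often applies sample corrections, so its output may differ slightly. `shape_measures` is a hypothetical helper and the data are made up.

```python
def shape_measures(values):
    """Return (skewness, excess kurtosis) from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance (divide by n)
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution
    return skewness, excess_kurtosis

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 20]  # long tail on the right

sym_skew, sym_kurt = shape_measures(symmetric)
right_skew, right_kurt = shape_measures(right_skewed)

print("symmetric:", sym_skew, sym_kurt)
print("right-skewed:", right_skew, right_kurt)
```

Positive skewness means a right tail, negative means a left tail; excess kurtosis above 0 suggests heavier tails than a normal curve.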
