Statistics Chapter 3 Flashcards

Key vocabulary terms from Chapter 3 on numerically summarizing data, including measures of center, spread, position, and methods for identifying outliers.

27 Terms

1
Shape (distribution)

The distribution’s form, described by symmetry/skewness, number of peaks, clusters or gaps, and any outliers.

2
Center

A typical or representative value of a distribution; numerical measures include the mean and the median.

3
Spread

The variability or dispersion of the data, describing how far values are from each other or from the center.

4
Mean (x̄)

The sum of all observations divided by the number of observations, x̄ = (Σx)/n; x̄ denotes the sample mean and μ the population mean.

5
Median

The middle value when data are ordered; if n is odd, it is the middle observation; if even, it is the average of the two middle observations.
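
The odd/even rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation; the function name `median_by_hand` is made up for this example.

```python
def median_by_hand(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)      # the median is defined on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # n odd: the single middle observation
        return ordered[mid]
    # n even: the average of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2
```

For example, `median_by_hand([3, 1, 2])` picks the middle value of 1, 2, 3, while `median_by_hand([4, 1, 3, 2])` averages 2 and 3.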

6
Mode

The observation that occurs most frequently; often used for categorical data to indicate the most frequent category.

7
Resistant

A statistic that is little affected by outliers; the median is resistant, the mean is not.

8
Symmetric

A distribution with balanced tails; typically mean ≈ median.

9
Skewed right

A distribution with a longer tail to the right; generally mean > median.

10
Skewed left

A distribution with a longer tail to the left; generally mean < median.

11
Outlier

An observation unusually far from the rest of the data; possible causes include measurement error, an observation drawn from a different population, or a rare event.

12
Range

Difference between the largest and smallest observations; simple but sensitive to outliers.

13
Interquartile Range (IQR)

Q3 − Q1; spread of the middle 50% of data; resistant to outliers.

14
Variance

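Roughly the average squared deviation from the mean; the sample variance s² divides the sum of squared deviations by n − 1, while the population variance σ² divides by the population size N.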

15
Standard Deviation

Square root of the variance; s for a sample, σ for a population; has the same units as the data and is not resistant.
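
As a rough sketch of how s² and s are computed for a sample, the Python below uses the usual n − 1 divisor for the sample variance. The function names are illustrative only.

```python
import math

def sample_variance(values):
    """Sample variance s^2: sum of squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)

def sample_sd(values):
    """Sample standard deviation s: square root of the sample variance."""
    return math.sqrt(sample_variance(values))
```

For the data 1, 2, 3, 4, 5 the mean is 3, the squared deviations sum to 10, and dividing by n − 1 = 4 gives s² = 2.5.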

16
Five-number summary

Minimum, Q1, M (median), Q3, Maximum.
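
A minimal Python sketch of the five-number summary follows. It assumes one common by-hand convention, stated here as an assumption: Q1 and Q3 are the medians of the lower and upper halves of the ordered data, with the overall median left out of both halves when n is odd. The function name is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for at least two values."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # observations below the median position
    upper = ordered[half + n % 2:]    # observations above it (median skipped if n is odd)
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])
```

Other conventions for Q1 and Q3 exist, so software output may not match this sketch exactly.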

17
Boxplot

Graph of the five-number summary; whiskers extend to the smallest/largest non-outlier observations; outliers shown as points or asterisks.

18
Quartile

Values that divide data into four equal parts: Q1 (25th percentile), Q2 (median), Q3 (75th percentile).

19
Z-score

The number of standard deviations an observation is from the mean; z = (x − x̄)/s for a sample or z = (x − μ)/σ for a population.
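
The z-score formula translates directly into code. The sketch below is illustrative only; the function name and the example numbers are made up.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical example: an observation of 85 in data with mean 70 and SD 10
z = z_score(85, 70, 10)
```

A positive z means the observation is above the mean; a negative z means it is below.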

20
Percentile

The pth percentile is a value such that p% of observations fall at or below it; common examples are the quartiles (25th, 50th, and 75th percentiles).

21
Empirical Rule

For bell-shaped data: about 68% of observations fall within 1 SD of the mean, about 95% within 2 SD, and about 99.7% within 3 SD.
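
To make the rule concrete, this small Python sketch returns the three intervals around the mean. The function name and the example values (mean 100, SD 15) are assumptions chosen for illustration.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals expected to hold about 68%, 95%, and 99.7% of bell-shaped data."""
    return {
        k: (mean - k * sd, mean + k * sd)   # mean plus or minus k standard deviations
        for k in (1, 2, 3)
    }

# Hypothetical bell-shaped data with mean 100 and SD 15
intervals = empirical_rule_intervals(100, 15)
```

Here about 68% of observations would be expected between 85 and 115, about 95% between 70 and 130, and about 99.7% between 55 and 145. The percentages are approximations and apply only when the distribution is roughly bell-shaped.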

22
1.5 × IQR Rule (outliers)

Fences for identifying outliers: Lower fence = Q1 − 1.5·IQR; Upper fence = Q3 + 1.5·IQR; values below the lower fence or above the upper fence are flagged as potential outliers.
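
The fence calculation can be sketched as below, taking Q1 and Q3 as given (how they are computed is a separate choice). Function names and the example numbers are illustrative assumptions.

```python
def iqr_fences(q1, q3):
    """Return (lower fence, upper fence) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def flag_outliers(values, q1, q3):
    """Return the values that fall outside the fences."""
    lower, upper = iqr_fences(q1, q3)
    return [x for x in values if x < lower or x > upper]
```

With Q1 = 10 and Q3 = 20, the IQR is 10, so the fences are 10 − 15 = −5 and 20 + 15 = 35; in the data 1, 12, 15, 18, 40 only 40 lies outside them.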

23
Boxplot shape guidelines

In boxplots, symmetric distributions have median near the center with similar whiskers; skewed distributions show longer whiskers on the side of skew.

24
Describing distributions (center vs. spread)

For symmetric distributions, report the mean and SD; for skewed distributions or those containing outliers, report the median and IQR; use the same measures when comparing groups.

25
Population vs Sample notation

x̄ denotes the sample mean; μ denotes the population mean; s denotes the sample standard deviation; σ denotes the population standard deviation.

26
Outlier effect on center

Outliers tend to have a large influence on the mean but little on the median; removing an outlier can change the mean more than the median.
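
A small made-up data set shows the effect. The sketch below uses Python's standard `statistics` module; the helper name `center_shift` and the numbers are invented for illustration.

```python
from statistics import mean, median

def center_shift(values, outlier):
    """Return (change in mean, change in median) after adding one outlier."""
    with_outlier = list(values) + [outlier]
    return (mean(with_outlier) - mean(values),
            median(with_outlier) - median(values))

# Made-up data: mean 13 and median 13 before the outlier 95 is added
mean_shift, median_shift = center_shift([10, 12, 13, 14, 16], 95)
```

Adding 95 raises the mean from 13 to 160/6 (about 26.7) but moves the median only from 13 to 13.5.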

27
Quartiles by hand vs. software

By hand, Q1 and Q3 are typically found as the medians of the lower and upper halves of the ordered data; software such as JMP may use a different rule, so its quartiles can differ slightly.
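
To illustrate how quartile rules can disagree, the sketch below compares a by-hand rule with the two methods offered by Python's `statistics.quantiles`. Python stands in here only as an example of software; it is not a claim about how JMP computes quartiles. The by-hand helper assumes the median is left out of both halves when n is odd.

```python
from statistics import median, quantiles

def quartiles_by_hand(values):
    """Q1 and Q3 as medians of the lower and upper halves of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    return median(ordered[:half]), median(ordered[half + n % 2:])

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
by_hand = quartiles_by_hand(data)                       # (2.5, 7.5)
exclusive = quantiles(data, n=4, method="exclusive")    # [2.5, 5.0, 7.5]
inclusive = quantiles(data, n=4, method="inclusive")    # [3.0, 5.0, 7.0]
```

On this data the by-hand rule agrees with the "exclusive" method, while the "inclusive" method gives Q1 = 3 and Q3 = 7; all three agree on the median.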