# AP Stats Unit 2

1

categorical variable

places an individual into one of several groups or categories.

2

quantitative variable

takes numerical values for which it makes sense to find an average.

3

distribution

tells us what values the variable takes and how often it takes these values; pattern of variation.

4

data table

lists each individual along with the values of its variables.

5

frequency table

summarizes distribution in counts.

6

relative frequency table

summarizes distribution in percents.
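
As a rough illustration of the two tables above, the sketch below builds a frequency table (counts) and a relative frequency table (percents) from the same data. The `favorite` list is made-up data used only for this example.

```python
from collections import Counter

# Hypothetical data: favorite class of 10 students (made up for illustration).
favorite = ["math", "art", "math", "music", "art",
            "math", "music", "math", "art", "math"]

# Frequency table: count of individuals in each category.
freq = Counter(favorite)

# Relative frequency table: each count as a percent of all individuals.
n = len(favorite)
rel_freq = {category: 100 * count / n for category, count in freq.items()}

for category in freq:
    print(f"{category:<6} count={freq[category]:>2}  percent={rel_freq[category]:.0f}%")
```

The percents in a relative frequency table should add up to 100 (apart from rounding), which is a quick check on the arithmetic.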

7

two-way table

a table used to describe two categorical variables.

8

marginal distribution

the distribution of values of a categorical variable among all individuals described by the table.

9

conditional distribution

describes the values of one variable among individuals who have a specific value of another variable; there is a different conditional distribution for each value of the other variable.
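
To make the difference between marginal and conditional distributions concrete, here is a minimal sketch using a small two-way table. The counts, the grade levels, and the lunch choices are all invented for this example.

```python
# Hypothetical two-way table of counts (made-up numbers):
# rows = grade level, columns = preferred lunch.
table = {
    "junior": {"pizza": 30, "salad": 20},
    "senior": {"pizza": 10, "salad": 40},
}

# Grand total of all individuals in the table.
total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of lunch: column totals divided by the grand total.
lunches = ["pizza", "salad"]
marginal_lunch = {
    lunch: sum(row[lunch] for row in table.values()) / total
    for lunch in lunches
}

# Conditional distribution of lunch within each grade:
# each row's counts divided by that row's total.
conditional_lunch = {
    grade: {lunch: count / sum(row.values()) for lunch, count in row.items()}
    for grade, row in table.items()
}

print(marginal_lunch)
print(conditional_lunch)
```

In this made-up table the conditional distributions of lunch differ a lot between juniors and seniors, which is the kind of pattern that suggests the two variables are related.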

10

segmented bar graph

a "stacked" bar graph that shows parts of a whole; every bar totals 100%, so it forces us to use percents and makes groups easy to compare.

11

association

knowing the value of one variable helps predict the value of the other; e.g., high values of V1 tend to occur with high (or low) values of V2.

12

characteristics to address when describing the distribution of a quantitative variable

shape

13

outliers

14

center

15

spread

16

shape

skewness, symmetry

17

center

mean, median

18

spread

range, IQR, standard deviation

19

histogram

label the axes; use equal class (bin) widths

20

decide what to do with boundary values (does a value on a class boundary go in the next bar or the lower bar?) and apply the same rule to every bar

21

make dot plot first

22

minimum of five bins (bars)

23

relative frequency histogram

makes it easier to compare two distributions, especially when number of individuals is very different.

24

x̄ (x-bar)

mean of a sample

25

μ

mean of a population

26

resistant measures of center

median - YES, it depends only on the middle position in the ordered data, so making an extreme value more extreme does not change it

27

mean - NO, the mean is pulled toward outliers and in the direction of skewness

28

how does the shape of a distribution affect the relationship between the mean and the median?

skew right: mean > median

29

skew left: mean < median

30

symmetric: mean ≈ median
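
The three cases above can be checked on small data sets. This is only a sketch with made-up numbers chosen to have an obvious tail; real data will rarely be this clean.

```python
from statistics import mean, median

# Hypothetical data sets (made up for illustration).
symmetric = [1, 2, 3, 4, 5]
skew_right = [1, 2, 3, 4, 20]   # long tail of high values
skew_left = [-14, 2, 3, 4, 5]   # long tail of low values

for name, data in [("symmetric", symmetric),
                   ("skew right", skew_right),
                   ("skew left", skew_left)]:
    # The median stays at 3 in all three sets; only the mean moves.
    print(f"{name:<10} mean={mean(data):>4}  median={median(data)}")
```

The median is the same in all three sets, while the mean moves toward the long tail. That is the sense in which the median is resistant and the mean is not.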

31

range

max - min

32

33

quartiles

Q1 is the median of the observations to the left of the median; Q3 is the median of the observations to the right of the median

34

IQR

Q3 - Q1

35

36

outliers

Q1 - 1.5(IQR) = lower boundary

37

Q3 + 1.5(IQR) = upper boundary

38

five-number summary

minimum, Q1, median, Q3, maximum -> boxplot

39

standard deviation

the typical distance of the values in the data set from the mean.
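
As a sketch of how that "typical distance" is computed, the example below works out the sample standard deviation by hand (dividing by n - 1) and compares it with `statistics.stdev` from the Python standard library. The data values are made up for this example.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical data (made up for illustration).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Step 1: find the mean.
x_bar = mean(data)

# Step 2: square each value's deviation from the mean.
squared_devs = [(x - x_bar) ** 2 for x in data]

# Step 3: add them up, divide by n - 1, and take the square root.
s_x = sqrt(sum(squared_devs) / (len(data) - 1))

print(f"x-bar = {x_bar}")
print(f"s_x by hand      = {s_x:.3f}")
print(f"statistics.stdev = {stdev(data):.3f}")
```

Here the mean is 5 and the squared deviations add up to 32, so the standard deviation is the square root of 32/7, about 2.14.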

40

41

similarities between range, IQR, standard deviation

42

differences between range, IQR, standard deviation

range is least resistant to outliers

43

standard deviation is not resistant either; outliers inflate it

44

IQR is most resistant
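
One way to see how each measure of spread reacts to an outlier is to change a single value and recompute all three. This is a minimal sketch with made-up data; the `iqr` and `spread` helpers are hypothetical names, and the quartiles assume the median-of-halves rule.

```python
from statistics import median, stdev

def iqr(values):
    """Q3 - Q1, with quartiles found by the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    return median(data[half + n % 2:]) - median(data[:half])

def spread(values):
    """Return the range, IQR, and sample standard deviation of the values."""
    return {
        "range": max(values) - min(values),
        "iqr": iqr(values),
        "sd": stdev(values),
    }

# Hypothetical data: the second list replaces the largest value with an extreme one.
base = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]

before = spread(base)
after = spread(with_outlier)

for key in before:
    print(f"{key:<5} before={before[key]:7.2f}  after={after[key]:7.2f}")
```

With these numbers the IQR does not change at all, while the range and the standard deviation both grow to roughly ten times their original size.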

45

properties of standard deviation

measures spread about the mean; only use when mean is chosen as center.

46

Sx is always greater than or equal to 0.

47

Sx has the same measurement units as data (original observations).

48

Sx is NOT resistant.

49

factors to consider when choosing summary statistics

50
• skewed data or outliers: median, IQR (resistant)

51
• symmetric data without outliers: mean, standard deviation

52

always graph for shape (histogram)

53

four-part question

1. State the question.

54
2. Plan (set up).

55
3. Do (calculate).

56
4. Conclude (in context).

