AP Stats

Last updated 11:25 PM on 4/14/26
68 Terms

1
New cards

marginal distribution

the probability of an event occurring, p(A). It may be thought of as an unconditional probability because it is not conditioned on another event.

  • ex: the probability that a card drawn is a four: p(four) = 4/52 = 1/13

  • focuses on the margins (the total row/column)

  • can be counts or percentages

marginal relative frequency: type of marginal distribution expressed as a ratio or percentage

  • row/column total divided by table total

2
New cards

conditional distribution

p(A|B) is the probability of event A occurring, given that event B occurs.

  • ex: given that you drew a red card, what’s the probability that it’s a four? p(four|red) = 2/26 = 1/13

  • usually only as a percentage

conditional relative frequency: expressed as a ratio or percentage

  • cell divided by row/column total

interpretation: “Because the distribution of conditional relative frequencies is different for each age group, these two variables are associated.”

3
New cards

joint distribution

p(A and B): the probability that event A and event B both occur

  • ex: the probability that a card is a four and red: p(four and red) = 2/52 = 1/26

joint relative frequency: expressed as a ratio or percentage

  • cell divided by table total
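
The three card examples (marginal, conditional, joint) can be checked by counting cards. This is a minimal sketch; the deck construction and the helper name `prob` are illustrative assumptions, not part of the original cards.

```python
from fractions import Fraction
from itertools import product

# Build a standard 52-card deck as (rank, color) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suit_colors = {"hearts": "red", "diamonds": "red", "clubs": "black", "spades": "black"}
deck = [(rank, color) for rank, color in product(ranks, suit_colors.values())]

def prob(event, given=lambda card: True):
    """P(event | given), found by counting; 'given' defaults to the whole deck."""
    pool = [card for card in deck if given(card)]
    return Fraction(sum(1 for card in pool if event(card)), len(pool))

def is_four(card):
    return card[0] == "4"

def is_red(card):
    return card[1] == "red"

marginal = prob(is_four)                                 # row/column total / table total
conditional = prob(is_four, given=is_red)                # cell / row (or column) total
joint = prob(lambda card: is_four(card) and is_red(card))  # cell / table total

print(marginal, conditional, joint)  # 1/13 1/13 1/26
```

Here p(four|red) equals p(four), which is what you would expect if rank and color are not associated.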

4
New cards

individuals

things described by the data set

5
New cards

skew right/left

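it’s the side the longer tail points to (skewed right = tail stretches toward the higher values, skewed left = tail stretches toward the lower values)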

6
New cards

standard deviation formula

(variance is sd squared)
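
The formula itself is missing from this card. For reference, this is the usual sample standard deviation, assuming the n − 1 version that AP Stats uses for sample data:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}},
\qquad
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \quad \text{(variance)}
```

Here x_i is each data value, x̄ is the sample mean, and n is the number of values.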

7
New cards

adding a constant to data

mean and median shift by that constant (measures of center);

standard deviation, IQR, and range stay the same (measures of spread)

8
New cards

multiplying data by a constant

mean, median, standard deviation, IQR, and range are all multiplied by that constant (both measures of center and spread are scaled by the multiple)

9
New cards

if data is transformed by addition and multiplication…

measures of spread (standard deviation, etc) will only be affected by multiplication while measures of center (mean, median) will be affected by both
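
A quick way to convince yourself of the transformation rules is to run them on made-up numbers. The data values and the constants `a` and `b` below are hypothetical, and `summarize` is just an illustrative helper.

```python
import statistics

def summarize(values):
    """Return measures of center and spread for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
        "range": max(values) - min(values),
    }

data = [2, 4, 4, 5, 7, 9, 11]   # hypothetical values
a, b = 3, 10                    # transformation: new = a * old + b
transformed = [a * x + b for x in data]

before = summarize(data)
after = summarize(transformed)

# Center picks up both the multiplication and the addition;
# spread picks up only the multiplication.
for name in before:
    print(f"{name}: {before[name]} -> {after[name]}")
```

The mean and median come out as a × old + b, while the standard deviation and range come out as a × old.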

10
New cards

effect of adding/removing a data point or just changing it

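  • adding/removing a data point can affect both mean and median, but usually affects the mean more

  • changing a data point will change the mean, but usually not the median (the median only changes if the value moves across the middle of the data)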

11
New cards

formula for judging outliers in a data set

lower bound = Q1 - 1.5 * IQR

upper bound = Q3 + 1.5 * IQR
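
A minimal sketch of the 1.5 × IQR rule. It assumes quartiles are found as the medians of the lower and upper halves of the sorted data (with the middle value left out when n is odd); calculators and software sometimes use slightly different quartile rules. The data set is hypothetical.

```python
import statistics

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    ordered = sorted(values)
    half = len(ordered) // 2          # the middle value is excluded when n is odd
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])

def iqr_outliers(values):
    """Return (lower bound, upper bound, outliers) using the 1.5 * IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    outliers = [x for x in values if x < lower_bound or x > upper_bound]
    return lower_bound, upper_bound, outliers

data = [3, 5, 6, 7, 8, 9, 10, 12, 30]   # hypothetical values
print(iqr_outliers(data))               # (-2.75, 19.25, [30])
```

With Q1 = 5.5 and Q3 = 11, the IQR is 5.5, so anything below −2.75 or above 19.25 counts as an outlier.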

12
New cards

unimodal

distribution has a single peak

13
New cards

bimodal

distribution has 2 peaks

14
New cards

uniform distribution

roughly the same number of data points at each value (or in each interval) on the x-axis, so there is no clear peak

15
New cards

unusual features include…

outliers, gaps, and clusters

16
New cards

typical value:

summarizes a dataset with a single number representing the center or "usual" data point. Can be mean, median, or mode

17
New cards

when describing distribution…

always include context

18
New cards

important characteristics when describing distribution of quantitative data:

shape, center, variability (spread), and unusual features

19
New cards

terms for describing shape:

skewed left/right, symmetric, unimodal, bimodal, uniform

20
New cards

be specific when identifying variables

you can’t just say the variable is “pool?”; you have to say it is “whether or not the property has a pool”

21
New cards

categorical variables

takes on values that are category names or group labels

22
New cards

quantitative variable

takes on numerical values for a measured or counted quantity

  • you can find an average of those values

  • not all numerical values are quantitative

    • ex: zip code (it’s a number, but averaging zip codes is meaningless)

  • can make quantitative variables categorical by grouping values

    • ex: distance to beach → close = <1 mi, nearby = 1-3 mi, far = 3+ mi

23
New cards

categorical data

values of a categorical variable in a data set

  • ex: types of property → house, condo, or townhome

24
New cards

individuals:

people, animals, or things described by the data

25
New cards

frequency table:

number of individuals (cases) in each category

26
New cards

relative frequency table:

gives the proportion or percent of individuals (cases) in each category

27
New cards

should you leave gaps between bars in a bar graph?

yes

28
New cards

how do you display categorical data graphically?

bar chart that displays frequencies or relative frequencies or a pie chart

29
New cards

remember they will try to trick you with questions about proportions given data in counts

comparisons are easier when using relative frequencies for different size groups

30
New cards

discrete variable

can take on a countable number of values (with gaps)

  • ex: # siblings

  • counting rather than measuring

31
New cards

continuous variable

can take on infinitely many values, but those values cannot be counted (no gaps)

  • ex: height

  • measuring rather than counting

32
New cards

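do intervals on histograms include the upper end point?

they don’t (by the usual convention each interval includes its left end point but not its right one, so a value on a boundary goes in the higher bar)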

33
New cards

histogram pros and cons

pros: easier to make for large data sets, easy to see shape of distribution

cons: doesn’t show every individual value in data set

34
New cards

dot plot and stem and leaf plot pros and cons

pros: shows every individual value in data set, easy to see shape of distribution

cons: difficult for large data sets

35
New cards

IQR interpretation

“the middle 50% of the values for ____ has a range of _ (units)”

36
New cards

range interpretation

“the range of the distribution for ___ is _ (units)”

  • range is a single value (range = 9 rather than 3-12)

37
New cards

standard deviation interpretation

“The ___ from each sample typically varies by about __ (units) from the mean of __ (units)”

38
New cards

what’s used to describe the position of a distribution of quantitative data?

position: Q1 and Q3

39
New cards

what summary statistics can be used to describe the variability of a distribution of quantitative data?

variability: range, IQR, standard deviation

40
New cards

what summary statistics can be used to describe the center of a distribution?

center: mean, median

41
New cards

standard deviation method for checking outliers:

an outlier is 2 or more standard deviations above or below the mean
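
A minimal sketch of the 2-standard-deviation check, using the sample standard deviation. The data set is hypothetical and `sd_outliers` is an illustrative helper name.

```python
import statistics

def sd_outliers(values, threshold=2):
    """Return the values that are 'threshold' or more SDs away from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)     # sample standard deviation (n - 1)
    return [x for x in values if abs(x - mean) >= threshold * sd]

data = [3, 5, 6, 7, 8, 9, 10, 12, 30]   # hypothetical values
print(sd_outliers(data))                # [30]
```

Here the mean is 10 and the standard deviation is about 7.97, so values 15.94 or more away from 10 are flagged. Because the mean and standard deviation are nonresistant, the outlier itself inflates both.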

42
New cards

nonresistant

heavily affected by outliers (mean, range, standard deviation)

43
New cards

resistant

not substantially affected by outliers (median, IQR)

44
New cards

for a skewed distribution the best measures of center and variability are…

median and IQR

45
New cards

for a symmetric distribution, the best measures of center and variability are…

mean, standard deviation

46
New cards

boxplot pros and cons

pros: shows 5-num summary and outliers, splits data into quartiles

cons: doesn’t show every individual value, can hide certain features of the shape of a distribution (ex: clusters and gaps are invisible)

47
New cards

skewed right: mean __ median

>

48
New cards

skewed left: mean __ median

<

49
New cards

five-number summary includes:

min, Q1, median, Q3, max

50
New cards

how to describe shape of a distribution

“The distribution of ___ is uni/bimodal (peak at _) and (strongly) skewed towards the left/right (or symmetric)”

51
New cards

how to describe center

“The typical ___ is around ___ (units)”

52
New cards

how to describe variability:

“___ vary from a minimum value of _ to a maximum value of _ (units)”

53
New cards

how to describe unusual features:

“There is a cluster of values between _ and _, a large gap between _ and _, and (several) possible high outliers.”

54
New cards

when describing shape from a boxplot…

use the word “appears” because a boxplot doesn’t show the full shape, so we can’t be sure (ex: “the distribution appears skewed right”)

55
New cards

when comparing distributions…

compare all 4 characteristics (shape, center, variability, unusual features), use comparative words (similar, the same, greater), and include context from the problem

  • check 1.9 daily vid 1

56
New cards

percentile:

percent of data less than or equal to a given value

  • (number of data points less than or equal to the value ÷ total number of values) × 100
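
The percentile calculation as a short sketch, using the "less than or equal to" definition from this card. The scores are hypothetical.

```python
def percentile_of(value, data):
    """Percent of data points less than or equal to the given value."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

scores = [61, 70, 72, 75, 78, 80, 84, 85, 90, 95]   # hypothetical scores
print(percentile_of(84, scores))                    # 70.0
```

Seven of the ten scores are at or below 84, so a score of 84 is at the 70th percentile.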

57
New cards

percentile interpretation:

“The value of ___ is at the pth percentile. About (p) percent of the values are less than or equal to ___.”

58
New cards

standardized score:

most often a standardized z-score

59
New cards

z-score interpretation:

“The value of __ is (z-score) standard deviations above/below the mean.”

(The value of 20 ppb is 0.88 standard deviations above the mean)

60
New cards

z-score

number of standard deviations a value is above/below the mean: z = (value − mean) / standard deviation
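
A tiny sketch of the z-score calculation. The mean of 70 and standard deviation of 10 are hypothetical numbers chosen for easy arithmetic.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that 'value' lies above (+) or below (-) the mean."""
    return (value - mean) / sd

print(z_score(85, mean=70, sd=10))   # 1.5  (1.5 SDs above the mean)
print(z_score(55, mean=70, sd=10))   # -1.5 (1.5 SDs below the mean)
```

A positive z-score means the value is above the mean; a negative one means it is below.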

61
New cards

percentiles and z-scores can be calculated for what distributions?

any, not just normal distribution

62
New cards

normal distributions are determined by…

mean and standard deviation

63
New cards

empirical rule:

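in a normal distribution, about 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean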
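
The three percentages can be checked against a standard normal curve. This sketch uses `NormalDist` from Python's standard `statistics` module.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```

The areas come out to roughly 0.6827, 0.9545, and 0.9973, which the rule rounds to 68%, 95%, and 99.7%.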

64
New cards

interpretation of area under a normal curve:

“The proportion of ___ is (area)”

(The proportion of adults that are hypertension stage 1 is about 0.13)

if working backwards: “About 10% of adults are high risk, with a blood pressure of more than 122.8 mmHg”
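
Both directions (value → area, and area → value) can be sketched with `NormalDist`. The mean of 110 mmHg and standard deviation of 10 mmHg are hypothetical parameters picked for illustration; the card does not state the actual ones, and the 120 to 130 interval is also made up.

```python
from statistics import NormalDist

# Hypothetical blood pressure model; these parameters are assumptions.
blood_pressure = NormalDist(mu=110, sigma=10)

# Forward: proportion of adults between 120 and 130 mmHg (area under the curve).
area = blood_pressure.cdf(130) - blood_pressure.cdf(120)

# Backward: the value with 10% of adults above it is the 90th percentile.
cutoff = blood_pressure.inv_cdf(0.90)

print(f"proportion between 120 and 130: {area:.4f}")
print(f"top 10% cutoff: {cutoff:.1f} mmHg")
```

Going forward you subtract two cumulative areas; going backward you turn "top 10%" into "area of 0.90 to the left" before looking up the value.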

65
New cards

mosaic graph vs segmented

the widths of the bars in a mosaic graph are proportional to the number of individuals in each group; in a segmented bar graph every bar has the same width

66
New cards

what do you use to compare 2 categorical variables?

graphical: side-by-side bar graphs, segmented bar graphs, mosaic graphs

numerical: 2-way tables, conditional relative frequencies

67
New cards

what do you use to compare 2 quantitative variables?

graphical: scatterplot

numerical: correlation, linear regression, coefficient of determination
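
A rough sketch of the three numerical summaries, computed by hand from z-scores so each step is visible. The (x, y) data are hypothetical and the function names are illustrative.

```python
import statistics

def correlation(xs, ys):
    """Correlation r: sum of the products of z-scores, divided by (n - 1)."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    z_products = [((x - mean_x) / sd_x) * ((y - mean_y) / sd_y) for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

def least_squares_line(xs, ys):
    """Slope and intercept of the least-squares regression line."""
    slope = correlation(xs, ys) * statistics.stdev(ys) / statistics.stdev(xs)
    intercept = statistics.mean(ys) - slope * statistics.mean(xs)
    return slope, intercept

xs = [1, 2, 3, 4, 5]   # hypothetical explanatory values
ys = [2, 4, 5, 4, 5]   # hypothetical response values

r = correlation(xs, ys)
slope, intercept = least_squares_line(xs, ys)
print(f"r = {r:.4f}, r^2 = {r**2:.2f}, line: y-hat = {intercept:.1f} + {slope:.1f}x")
```

For these values r is about 0.77, the coefficient of determination r² is 0.60, and the line is ŷ = 2.2 + 0.6x.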

68
New cards

process for graphing 2 categorical variables

turn a table in counts into percentages → side-by-side bar graph → segmented bar graph → mosaic graph
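
The first step (counts → percentages within each group) might look like the sketch below. The two-way table is hypothetical; the groups are deliberately different sizes, which is why percentages compare better than raw counts.

```python
# Hypothetical two-way table of counts: rows are groups, columns are responses.
counts = {
    "under 30": {"yes": 45, "no": 15},
    "30 and over": {"yes": 20, "no": 20},
}

def conditional_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for group, row in table.items():
        row_total = sum(row.values())
        result[group] = {category: 100 * n / row_total for category, n in row.items()}
    return result

percentages = conditional_percentages(counts)
for group, row in percentages.items():
    print(group, row)
```

Each row of percentages becomes one bar in a segmented bar graph. Here the "yes" share is 75% in one group and 50% in the other, which suggests the two variables are associated.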