AP Statistics - Descriptive Statistics

Last updated 3:34 AM on 10/5/25

53 Terms

New cards

Intro & Quantitative Data - TYPES OF DATA

Quantitative: data in the form of numerical values
- ex> height, weight
Qualitative: data in the form of words, characteristics, etc.
- ex> fav color, birthday month

New cards

Intro & Quantitative Data - TYPES OF GRAPHS

For univariate (1 variable) data: bar graph, pie chart, histogram, line graph, stem + leaf plot, dot plot, box plot
For bivariate (studies the relationship b/w 2 variables) data: scatter plot

New cards

Intro & Quantitative Data - Distribution

→ a description of a data set that gives the frequency with which each outcome occurs among all possible outcomes

Measures of Central Tendency → where center of distribution of data lies
- mean, median, mode
Measures of Spread → amount of variation in distribution
- range, IQR, standard deviation
Shape of Distribution

New cards

Intro & Quantitative Data - Histogram

title
x-axis (+labels)
y-axis (+labels)
bars touch, measures a quantitative variable against frequency

New cards

Intro & Quantitative Data - Dot Plot

title
x-axis
dots above corresponding values to represent frequency

New cards

Intro & Quantitative Data - SHAPES OF DISTRIBUTIONS (Wherever tail is…)

…pulls the mean up or down…

New cards

Intro & Quantitative Data - SHAPES OF DISTRIBUTIONS (Skew Right)

Skew Right: most data on left
- mean > med
- high values have a big weight on mean
  - few data points to right pull mean up
- tail w/ less data on right
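
A quick numeric check of the skew-right card, as a minimal sketch. The data list is made up for illustration (it is not from the notes); it has most values on the left and one large value forming the right tail.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are small, one large value is the tail.
data = [1, 2, 2, 3, 3, 3, 4, 20]

data_mean = mean(data)      # pulled up by the single large value 20
data_median = median(data)  # middle of the ordered list; does not care how large 20 is

print(data_mean, data_median)    # 4.75 3.0
print(data_mean > data_median)   # True, matching "mean > med" for skew right
```

Mirroring the data (a few very small values instead) would give the skew-left pattern, mean < med.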

New cards

Intro & Quantitative Data - SHAPES OF DISTRIBUTIONS (Skew Left)

Skew Left: most data on right
- mean < med
- tail on left
- few data points to left pull mean down

New cards

Intro & Quantitative Data - SHAPES OF DISTRIBUTIONS (Symmetric)

mean = med

New cards

Intro & Quantitative Data - SHAPES OF DISTRIBUTIONS (Unimodal)

“one mode”
- One hump w/ highest frequency

New cards

Intro & Quantitative Data - SHAPES OF DISTRIBUTIONS (Uniform)

frequencies are about the same

New cards

Intro & Quantitative Data - SHAPES OF DISTRIBUTIONS (Bimodal)

“two modes”
- Two humps (the distribution can still be symmetric)

New cards

Intro & Quantitative Data - SHAPES OF DISTRIBUTIONS (Multimodal)

more than two modes (several humps)

New cards

Intro & Quantitative Data - SYMBOLS: Population Mean

μ (“mu”)

New cards

Intro & Quantitative Data - SYMBOLS: Sample Mean

x̄ (x-bar)
- x → any variable

New cards

Intro & Quantitative Data - SYMBOLS: Population Standard Deviation

𝛔 (sigma)

New cards

Intro & Quantitative Data - SYMBOLS: Population Variance

𝛔² (sigma squared)

New cards

Intro & Quantitative Data - SYMBOLS: Sample Standard Deviation

s

New cards

Intro & Quantitative Data - SYMBOLS: Sample Variance

s² (s squared)

New cards

Intro & Quantitative Data - MEASURES OF CENTRAL TENDENCY

Typically the mean best describes a distribution
When there are outliers or a strong skew, the median is best
- outliers and skewness affect the mean b/c the mean takes into account the weight of all values whereas the median does not
Mode is used for qualitative data (you can’t find mean/median w/o #’s)

New cards

Intro & Quantitative Data - HISTOGRAM W/ CLASSES

Class width → Range / # of classes
- (must be whole #, ALWAYS round up)
Classes: start at the minimum and add the class width to get the start of each class
MP (midpoint): (smaller number in class + larger number in class) / 2
- x-axis
Frequency: count how many data values fall in each class
- Should add up to sample size!
Relative Frequency: frequency/sample size
- Y-AXIS
Cumulative Relative Frequency: add up relative frequencies
- Always ends at 1!
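
The steps on this card can be sketched in Python. This is only an illustration under stated assumptions: the data list and the function name `frequency_table` are made up, each class is treated as "lower ≤ x < upper", and an exact quotient is also bumped up by one so the maximum still fits in the last class.

```python
import math

def frequency_table(data, num_classes):
    """Build class rows: midpoint, frequency, relative and cumulative relative frequency."""
    n = len(data)
    low = min(data)
    data_range = max(data) - low
    # ALWAYS round up: floor + 1 also bumps an exact whole-number quotient,
    # so the maximum value still lands inside the last class.
    width = math.floor(data_range / num_classes) + 1

    rows = []
    running_count = 0
    for i in range(num_classes):
        lower = low + i * width
        upper = lower + width  # assumption: class covers lower <= x < upper
        freq = sum(1 for x in data if lower <= x < upper)
        running_count += freq
        rows.append({
            "class": (lower, upper),
            "midpoint": (lower + upper) / 2,
            "frequency": freq,
            "relative_frequency": freq / n,
            "cumulative_relative_frequency": running_count / n,
        })
    return width, rows

# Hypothetical data with min 5 and max 29 (range 24), split into 5 classes.
data = [5, 7, 8, 12, 13, 15, 18, 21, 24, 29]
width, rows = frequency_table(data, 5)
for row in rows:
    print(row)
```

Here the class width comes out to 5, the frequencies add up to the sample size (10), and the last cumulative relative frequency is 1.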

New cards

Intro & Quantitative Data - MEASURES OF SPREAD

Range (max-min) = 29-5 = 24
- *The range is 24 or the range is from 5 to 29
IQR: interquartile range (Q₃ - Q₁)
Standard deviation

New cards

Intro & Quantitative Data - BOX PLOTS

List numbers in order
Find MEDIAN
- Position of the median when listed in order
  - (n + 1) / 2
Find Q₁
- Median of the lower half (values between the minimum and the median)
Find Q₃
- Median of the upper half (values between the median and the maximum)
25% of the data is within each quartile
- SIZE of quartile doesn’t matter (just indicates more or less spread)
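FOR OUTLIERS…
- Solve for outliers (values outside [Q₁ - 1.5(IQR), Q₃ + 1.5(IQR)])
- Plot each outlier as its own point and end the whisker at the next most extreme value that is not an outlier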

New cards

Intro & Quantitative Data - 5 NUMBER SUMMARY

Minimum
Q₁
Median
Q₃
Maximum
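
A minimal sketch of finding the five numbers, assuming the quartile method from the box plot card (Q₁ and Q₃ are medians of the lower and upper halves, leaving the median itself out when n is odd). The function name and data are hypothetical; calculators and software may use slightly different quartile rules.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]                                # values below the median position
    upper = xs[half + 1:] if n % 2 else xs[half:]    # skip the middle value when n is odd
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])

# Hypothetical data set (n = 10).
data = [5, 7, 8, 12, 13, 15, 18, 21, 24, 29]
print(five_number_summary(data))   # (5, 8, 14.0, 21, 29)
```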

New cards

Intro & Quantitative Data - OGIVE: CUMULATIVE RELATIVE FREQUENCY GRAPH

Plot points as a line
x-axis: MPs (class midpoints)
y-axis: Cumulative Relative Frequency
*ogives are only interpreted to the left (“this or less”)
*to go from cumulative relative frequency to a box plot, estimate the quartiles (0%, 25%, 50%, 75%, 100%)
- 0% → min
- 25% → Q₁
- 50% → Q₂
- 75% → Q₃
- 100% → max

New cards

Intro & Quantitative Data - STANDARD DEVIATION

→ the average distance each value lies from the mean

Make a table with x, (x-x̄), & (x-x̄)²
List data points under x column
Do (x-x̄) under (x-x̄) column
- Add up all the values (the sum should be 0, which is a good check)
Do (x-x̄)² under (x-x̄)² column
- Add up all the values = TOTAL VARIATION (sum of squared deviations)
𝛔² (population variance) = total variation / number of data values (N)
𝛔 (population standard deviation) = √𝛔²
- On average ____ stray ____ (𝛔) away from the mean.

New cards

Intro & Quantitative Data - FORMULAS FOR STANDARD DEVIATION

For population: 𝛔 = √( Σ(x - μ)² / N )
For sample: s = √( Σ(x - x̄)² / (n - 1) )
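
A short sketch comparing the two versions, assuming the standard definitions (divide by N for a population, by n - 1 for a sample). The function names and data are made up for illustration; Python's built-in `statistics.pstdev` and `statistics.stdev` compute the same quantities.

```python
import math
import statistics

def population_sd(xs):
    mu = sum(xs) / len(xs)
    total_variation = sum((x - mu) ** 2 for x in xs)   # sum of squared deviations
    return math.sqrt(total_variation / len(xs))        # divide by N

def sample_sd(xs):
    x_bar = sum(xs) / len(xs)
    total_variation = sum((x - x_bar) ** 2 for x in xs)
    return math.sqrt(total_variation / (len(xs) - 1))  # divide by n - 1

# Hypothetical data: mean 5, sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(data))   # 2.0
print(sample_sd(data))       # slightly larger, since it divides by 7 instead of 8
```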

New cards

Intro & Quantitative Data - CALCULATE OUTLIERS

Rule: outliers fall outside of the interval
- [Q₁ - 1.5(IQR), Q₃ + 1.5(IQR)]
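
The interval can be checked with a few lines of Python. This is a sketch with hypothetical names and data; Q₁ = 8 and Q₃ = 24 were found by hand for this list using the median-of-halves method.

```python
def outlier_fences(q1, q3):
    """Return the interval [Q1 - 1.5(IQR), Q3 + 1.5(IQR)] as a (low, high) pair."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(data, q1, q3):
    low, high = outlier_fences(q1, q3)
    return [x for x in data if x < low or x > high]

# Hypothetical data (n = 11): median 15, Q1 = 8, Q3 = 24, so IQR = 16.
data = [5, 7, 8, 12, 13, 15, 18, 21, 24, 29, 60]
print(outlier_fences(8, 24))       # (-16.0, 48.0)
print(find_outliers(data, 8, 24))  # [60]
```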

New cards

Intro & Quantitative Data - WRITE A FEW SENTENCES DESCRIBING THE DATA

center
spread
shape
unusual features (outliers, gaps, clusters)
MUST be in context ⭐

New cards

Describing Qualitative Data - BAR CHART

x-axis
y-axis: frequency
bars DO NOT touch

New cards

Describing Qualitative Data - PARETO CHART

x-axis
y-axis: Frequency
bars DO NOT touch
*Bars in descending order, highlights the mode

New cards

Describing Qualitative Data - PIE CHART

percentage = relative frequency
# of people = frequency

New cards

Describing Qualitative Data - SEGMENTED BAR GRAPH

Make a table with RELATIVE FREQUENCY & CUMULATIVE RELATIVE FREQUENCY
- Add up the relative frequencies up to and including each value to get its cumulative relative frequency
x-axis: One Bar
y-axis: Cumulative Relative Frequency
label segments of bar
*a break in the y-axis messes with the scale… can make a relative frequency look smaller than it is

New cards

Describing Qualitative Data - CONTINGENCY TABLE

2 variables
“…of the…” = denominator (the group named after “of the” is the total you divide by)
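
A small sketch of the "of the" rule. The table, its labels, and the function names are all made up for illustration; the point is only that the group named after "of the" becomes the denominator.

```python
# Hypothetical counts: rows are grade levels, columns are preferred pets.
table = {
    "junior": {"dog": 30, "cat": 20},
    "senior": {"dog": 15, "cat": 35},
}

def proportion_of_row(table, row, column):
    """'Of the <row>s, what proportion prefer <column>?' -> divide by the row total."""
    row_total = sum(table[row].values())
    return table[row][column] / row_total

def proportion_of_column(table, column, row):
    """'Of the <column> people, what proportion are <row>s?' -> divide by the column total."""
    column_total = sum(counts[column] for counts in table.values())
    return table[row][column] / column_total

print(proportion_of_row(table, "junior", "dog"))      # 30 / 50 = 0.6
print(proportion_of_column(table, "dog", "junior"))   # 30 / 45
```

The same cell (30) gives two different answers because the denominator changes with the wording.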

New cards

Comparing Distributions

Include a discussion of center, spread and shape using context and comparative statements. Include approximate values/ranges when possible.

New cards

Comparing Distributions - Comparative Statements

Comparative Statements: greater than, higher, less than, lower, equal, etc. (except shape)
- Use “whereas” only for shape
List:
- mean
- standard deviation
- sample size
- minimum value
- Q₁
- median
- Q₃
- maximum value
- outliers

New cards

Introduction to Normal Distributions - normal distribution

a bell-shaped frequency distribution curve. Most of the data values in a normal distribution tend to cluster around the mean.
- → the further away a data point is from the mean, the less likely it is to happen

New cards

Introduction to Normal Distributions - Characteristics

unimodal (one mode, one peak), symmetric (right side mirrors left side), asymptotic (approach, but never touch x-axis), mean = median = mode (center = peak → 50% data below mean, 50% data above mean)

New cards

Introduction to Normal Distributions - What does the NORMAL MODEL look like?

x-axis: mean @ center + standard deviations away
curve with asymptotic ends

New cards

Names for Normal Distributions

One of the most important examples of a continuous probability distribution is the normal distribution. Its graph is usually called a normal, bell-shaped, or Gaussian curve.

New cards

Properties of Normal Distributions - Area Under the Curve

Total area under the curve is always equal to one.
The portion of the area under the curve above a given interval represents the probability that a measurement will lie in that interval.
- area under curve = probability

New cards

Properties of Normal Distributions - EMPIRICAL RULE

The Empirical Rule applies to any normal distribution. It says:
- → about 68% of data lies within 1 std. dev. of mean
- → about 95% of data lies within 2 std. dev. of mean
- → about 99.7% of data lies within 3 std. dev. of mean
The Empirical Rule can be used to find different percentiles.
*MAKE SURE TO INCLUDE “ABOUT” WHEN ANSWERING QUESTIONS
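
The three percentages can be checked against the standard normal curve. A minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `area_within` is made up.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve within k std. dev. of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(area_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The exact areas are not whole percentages, which is why answers should say "about".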

New cards

Properties of Normal Distributions - Normal distributions vary from one another in two ways: the mean may be located anywhere on the x axis and the bell shape may be more or less spread according to the size of the standard deviation. It would be difficult to compute the area under the curve for each different combination… Z SCORES

→ a z-score tells you exactly how many std. dev. a data value is above or below the mean

New cards

Properties of Normal Distributions - Z SCORES: How does standardizing affect the center, spread and shape of the distribution?

When the data is converted to z scores the mean (center) becomes 0, the std. dev. becomes 1, shape remains the same
z = (x - μ) / 𝛔 → standardized test statistic = (statistic - parameter) / std. dev.
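
A quick sketch of standardizing a data set, using made-up data and treating it as a whole population (so `pstdev` is used for 𝛔). After converting to z-scores the mean is 0 and the std. dev. is 1.

```python
from statistics import mean, pstdev

def z_scores(data):
    """Convert each value to z = (x - mu) / sigma."""
    mu = mean(data)
    sigma = pstdev(data)
    return [(x - mu) / sigma for x in data]

# Hypothetical data with mean 5 and population std. dev. 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
zs = z_scores(data)

print(zs)           # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(mean(zs))     # 0.0
print(pstdev(zs))   # 1.0
```

The z-scores keep the same order and relative spacing as the original values, so the shape does not change.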

New cards

Properties of Normal Distributions - Z SCORES: We can use these z-scores to then…

… calculate probabilities using our z-score chart to determine the area under the curve that corresponds with each z-score.

New cards

Properties of Normal Distributions - Z SCORES: Find the specified areas!

Use 4 decimal places b/c the table uses 4 decimal places (for area under curve/probability)
- Go to z-score table using probability notation → P(z < _), P(z > _), or P(_ < z < _)
  - For “less than” (area to the left), find the z-score & read the corresponding value.
  - For “greater than” (area to the right), subtract the corresponding value from 1.
  - For between, (larger value in table) - (smaller value in table).
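
The three table lookups can be sketched with `statistics.NormalDist`, whose `cdf` gives the area to the left of a z-score (the same thing a z-table lists). The z-values and function names here are made up for illustration.

```python
from statistics import NormalDist

z_table = NormalDist()  # standard normal: mean 0, std. dev. 1

def area_left(z):
    """Area to the left of z: read the table value directly."""
    return z_table.cdf(z)

def area_right(z):
    """Area to the right of z: 1 minus the table value."""
    return 1 - z_table.cdf(z)

def area_between(z_low, z_high):
    """Area between two z-scores: larger table value minus smaller table value."""
    return z_table.cdf(z_high) - z_table.cdf(z_low)

print(round(area_left(-1.25), 4))    # 0.1056
print(round(area_right(0.5), 4))     # 0.3085
print(round(area_between(-1, 2), 4))
```

Answers from a printed table can differ from these in the 4th decimal place, because the table values are already rounded before subtracting.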

New cards

Properties of Normal Distributions - Z SCORES: < / ≤

< / ≤ are the same! (for a continuous distribution, the probability of any single exact value is 0)

New cards

Properties of Normal Distributions - ACCURACY

ACCURACY: The normal model is not accurate if going 3 std. dev. below the mean gives a negative value that is impossible in context.
- Depends on the context of the problem…
- A normal model must be able to go 3 std. dev. in both directions.

New cards

Rescaling Data - How does shift affect mean + std. dev.?

The mean increases or decreases by shift.
The std. dev. stays the same (not affected)

New cards

Rescaling Data - How does multiplier affect mean + std. dev.?

→ multiplying (scaling) data
Both mean + std. dev. get multiplied by the scalar value.

New cards

Rescaling Data - Adding a number to a distribution that is the same as the mean will…

not change the mean as it is equal to the current mean
decrease the std. dev. because there is less variability since we have added another value at the center
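
A small check of this card with made-up data: one extra value equal to the mean is appended, and the mean and std. dev. are compared before and after.

```python
from statistics import mean, pstdev

# Hypothetical data with mean 5 and population std. dev. 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
extended = data + [mean(data)]   # add one more value exactly at the mean

print(mean(data), mean(extended))       # the mean stays at 5
print(pstdev(data), pstdev(extended))   # the std. dev. drops below 2.0
```

The total squared distance from the mean is unchanged, but it is now shared over one more value, so the std. dev. gets smaller.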

New cards

Rescaling Data - *when you convert units…

…nothing is actually changing (this applies to z-scores as well)

New cards

Rescaling Data - Turn rescaling into an algebraic expression!

*substitute values you are looking for into expression… need to take into account which values are affected by shifts/multipliers.
*data points/measures of center affected by both shifts + mult. while measures of spread are only affected by mult.
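
As an algebraic sketch: if every value is rescaled as new = (multiplier)(old) + shift with a positive multiplier, then new mean = (multiplier)(old mean) + shift and new std. dev. = (multiplier)(old std. dev.). The code below checks this with a Celsius to Fahrenheit conversion; the temperatures and function name are made up for illustration.

```python
from statistics import mean, pstdev

def rescale(data, multiplier, shift):
    """Apply new = multiplier * old + shift to every data point."""
    return [multiplier * x + shift for x in data]

# Hypothetical temperatures in Celsius, converted with F = 1.8C + 32.
celsius = [10, 20, 30]
fahrenheit = rescale(celsius, 1.8, 32)

predicted_mean = 1.8 * mean(celsius) + 32   # center: affected by multiplier AND shift
predicted_sd = 1.8 * pstdev(celsius)        # spread: affected by multiplier only

print(mean(fahrenheit), predicted_mean)
print(pstdev(fahrenheit), predicted_sd)

# z-scores are the same in either unit, so converting units changes nothing about them.
z_celsius = [(x - mean(celsius)) / pstdev(celsius) for x in celsius]
z_fahrenheit = [(x - mean(fahrenheit)) / pstdev(fahrenheit) for x in fahrenheit]
```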