Cumulative AP Exam Study Guide

80 Terms

1

Statistics

The science of collecting, analyzing, and drawing conclusions from data

2

Descriptive Statistics

Methods of organizing and summarizing data

3

Inferential Statistics

Making generalizations from a sample to the population

4

Population

An entire collection of individuals or objects

5

Sample

A subset of the population selected for study

6

Variable

Any characteristic whose value may change from one individual or object to another

7

Data

Observations on one or more variables

8

Categorical Variables (Qualitative)

Non-numerical characteristics that place each individual into a category or group

9

Numerical Variables (Quantitative)

Measurements or observations of numerical data

10

Discrete Variables

Listable sets (counts)

11

Continuous Variables

Any value over an interval of values (measurements)

12

Univariate

One variable

13

Bivariate

Two variables

14

Multivariate

Many variables

15

Symmetrical Distribution

A distribution whose two sides have roughly the same shape and size (for example, a “Bell Curve”)

16

Uniform Distribution

Every class has an equal frequency (number) → “a rectangle”

17

Skewed Distribution

One side (tail) is longer than the other side. The skewness is in the direction that the tail points (left or right)

18

Bimodal Distribution

Two classes (peaks) with large frequencies are separated by at least one class with a lower frequency (“double hump camel”)

19

How to describe numerical graphs

  • Shape - overall type (symmetrical, skewed right or left, uniform, or bimodal)

  • Outliers - unusual features such as outliers, gaps, and clusters

  • Center - middle of the data (mean, median, and mode)

  • Spread - refers to variability (range, standard deviation, and IQR)

Everything must be stated in the context of the data and the situation shown in the graph. When comparing two distributions, you MUST use comparative language!

20

Parameter

Value of a population (typically unknown)

21

Statistic

A value calculated from a sample, typically used to estimate a population parameter

22

Measures of Center

  • Median - the middle point of the data (50th percentile) when the data is in numerical order. If there are two middle values (an even number of observations), average them together.

  • Mean - the arithmetic average. μ is for a population (parameter) and x̄ is for a sample (statistic).

  • Mode - the value that occurs the most in the data. There can be more than one mode, or no mode at all if all data points occur once.
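
The three measures of center can be checked with Python's standard `statistics` module. This is a small sketch; the `scores` list is made-up illustration data, not part of the study guide.

```python
from statistics import mean, median, multimode

# Hypothetical sample of quiz scores (invented for illustration).
scores = [70, 75, 80, 80, 85, 90]

sample_mean = mean(scores)         # x-bar: sum of the values divided by n
sample_median = median(scores)     # n is even, so the two middle values are averaged
sample_modes = multimode(scores)   # list of the most frequent value(s)

print(sample_mean, sample_median, sample_modes)
```

`multimode` returns a list because a data set can have more than one mode.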

23

Variability

Allows statisticians to distinguish between usual and unusual occurrences

24

Measures of Spread (variability)

  • Range - a single value (Max-Min)

  • IQR - interquartile range (Q3 – Q1)

  • Standard deviation - σ for population (parameter) & s for sample (statistic) – measures the typical or average deviation of observations from the mean – for a sample, the sum of squared deviations is divided by df = n-1 before taking the square root

    *Sum of the deviations from the mean is always zero!

  • Variance - standard deviation squared
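
A short sketch of these measures using the standard `statistics` module. The `data` list is hypothetical; `stdev` and `variance` divide by n-1 (sample), while `pstdev` divides by n (population).

```python
from statistics import mean, pstdev, stdev, variance

# Hypothetical data set (invented for illustration).
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)        # Max - Min
x_bar = mean(data)
deviations = [x - x_bar for x in data]    # these always sum to zero

s = stdev(data)            # sample standard deviation (divides by n - 1)
sigma = pstdev(data)       # population standard deviation (divides by n)
s_squared = variance(data) # sample variance, equal to s squared

print(data_range, sum(deviations), s, sigma, s_squared)
```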

25

Resistant

Not strongly affected by outliers

  • Median

  • IQR

26

Non-Resistant

  • Mean

  • Range

  • Variance

  • Standard Deviation

  • Correlation Coefficient (r)

  • Least Squares Regression Line (LSRL)

  • Coefficient of Determination (r²)

27

Comparison of mean & median based on graph type

The mean is always pulled in the direction of the skew away from the median

28

Symmetrical

The mean and the median are approximately the same value

29

Skewed Right

Mean is a larger value than the median

30

Skewed Left

The mean is smaller than the median

31

Trimmed Mean

Remove a fixed percentage of observations from both the top and the bottom of the ordered data, then average what remains. This can eliminate outliers.
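
A minimal sketch of a trimmed mean. The function name `trimmed_mean` and the sample data are hypothetical, and the trimming count is rounded down, which is one common convention rather than the only one.

```python
def trimmed_mean(values, percent):
    """Mean after dropping `percent`% of the observations from EACH end."""
    ordered = sorted(values)
    k = int(len(ordered) * percent / 100)   # how many to drop from each end
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("percent is too large: nothing left to average")
    return sum(kept) / len(kept)

# Hypothetical data with one extreme value (100).
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

plain_mean = sum(data) / len(data)
trimmed_10 = trimmed_mean(data, 10)   # drops the 1 and the 100

print(plain_mean, trimmed_10)
```

Here the 10% trim removes the extreme value, so the trimmed mean sits much closer to the bulk of the data than the ordinary mean does.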

32
term image

Adding (or subtracting) a constant and multiplying (or dividing) by a constant both change the mean

33
term image

The standard deviation is changed by multiplication (division) ONLY; adding or subtracting a constant leaves it unchanged
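
The two transformation rules above can be checked numerically. This sketch assumes a linear transformation y = a + b·x with hypothetical constants `a` and `b` and made-up data.

```python
from statistics import mean, pstdev

# Hypothetical data and constants (invented for illustration).
x = [2, 4, 6, 8]
a, b = 10, 3

shifted = [v + a for v in x]    # add a constant only
y = [a + b * v for v in x]      # add a constant and multiply by a constant

# The mean follows both operations: mean(y) = a + b * mean(x).
print(mean(x), mean(shifted), mean(y))

# The standard deviation ignores the shift and scales by |b|.
print(pstdev(x), pstdev(shifted), pstdev(y))
```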

34
term image

To find the mean of a sum or difference of random variables, just add or subtract the two (or more) means

35
term image

To find the variance of a sum or a difference, always add the variances – X & Y MUST be independent
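
Both rules for combining random variables can be verified exactly with two fair dice, which are independent by construction. The helper `mean_var` is a hypothetical name, and the dice example is an illustration chosen here, not taken from the card.

```python
from fractions import Fraction
from itertools import product

def mean_var(outcomes):
    """Exact mean and variance of a list of equally likely outcomes."""
    n = len(outcomes)
    mu = Fraction(sum(outcomes), n)
    var = sum((Fraction(v) - mu) ** 2 for v in outcomes) / n
    return mu, var

die = list(range(1, 7))
pairs = list(product(die, die))   # 36 equally likely (x, y) pairs

mu_x, var_x = mean_var(die)
mu_sum, var_sum = mean_var([x + y for x, y in pairs])
mu_diff, var_diff = mean_var([x - y for x, y in pairs])

print(mu_sum, mu_diff)     # means add or subtract
print(var_sum, var_diff)   # variances add in BOTH cases
```

Standard deviations do not add; to combine them, convert to variances first, add, then take the square root.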

36

Z-Score

A standardized score: z = (x – μ) / σ. This tells you how many standard deviations from the mean an observation is. The resulting z-scores have μ = 0 & σ = 1; if the original data are normal, they follow the standard normal curve.
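
A brief sketch of standardizing a data set, assuming the data are treated as a whole population (so `pstdev` is used). The numbers are made up for illustration.

```python
from statistics import mean, pstdev

# Hypothetical population values (invented for illustration).
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(data)
sigma = pstdev(data)

# z = (x - mu) / sigma: distance from the mean in standard deviations.
z_scores = [(x - mu) / sigma for x in data]

print(z_scores)
print(mean(z_scores), pstdev(z_scores))   # standardized data: mean 0, sd 1
```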

37

Normal Curve

Bell shaped and symmetrical curve.

  • As σ increases the curve flattens.

  • As σ decreases the curve thins.

38

Empirical Rule

(68-95-99.7) describes the approximate share of a normal population within 1σ, 2σ, and 3σ of the center μ.

  • About 68% of the population is between μ - 1σ and μ + 1σ

  • About 95% of the population is between μ - 2σ and μ + 2σ

  • About 99.7% of the population is between μ - 3σ and μ + 3σ
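
The three percentages can be reproduced from the standard normal curve with `statistics.NormalDist` from the Python standard library. This is only a numerical check of the rule, not a derivation.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

# Area under the standard normal curve between -k and +k standard deviations.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} sd: {area:.4f}")
```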

39

Boxplots

Used for medium or large numerical data sets. A boxplot does not show the original observations.

  • Always use modified boxplots where the fences are 1.5 IQRs from the ends of the box (Q1 & Q3). Points outside the fence are considered outliers.

  • Whiskers extend to the smallest & largest observations within the fences.

40

5-Number Summary

  1. Minimum

  2. Q1 (1st Quartile – 25th Percentile)

  3. Median

  4. Q3 (3rd Quartile – 75th Percentile)

  5. Maximum
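
The five values above, together with the 1.5 × IQR fence rule for modified boxplots, can be sketched as follows. Quartile conventions differ between textbooks, calculators, and software; this sketch assumes the "median of each half" rule, and the function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]          # skips the middle value when n is odd
    upper_half = ordered[(n + 1) // 2 :]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

# Hypothetical data with one suspiciously large value.
data = [1, 3, 4, 5, 6, 7, 8, 9, 10, 30]
minimum, q1, med, q3, maximum = five_number_summary(data)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]
inside = [x for x in data if lower_fence <= x <= upper_fence]
whiskers = (min(inside), max(inside))   # whiskers stop at the last points inside the fences

print((minimum, q1, med, q3, maximum), iqr, outliers, whiskers)
```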

41

Sample Space

Collection of all possible outcomes.

42

Event

Any collection (subset) of outcomes from the sample space.

43

Complement

All outcomes not in the event.

44

Union

A or B: all the outcomes in A, in B, or in both (everything inside either circle of a Venn diagram).

45

Intersection

A and B: the outcomes common to both events (the overlap in the middle of the two circles).

46

Mutually Exclusive (Disjoint)

A and B have no intersection. They cannot happen at the same time.

47

Independent

Two events are independent if knowing that one occurred does not change the probability of the other

48

Experimental Probability

The number of successes in an experiment divided by the total number of trials in the experiment.

49

Law of Large Numbers

As an experiment is repeated the experimental probability gets closer and closer to the true (theoretical) probability. The difference between the two probabilities will approach “0”.
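
A small simulation can illustrate this. The sketch assumes a fair coin (true probability 0.5) and uses a fixed seed so the run is repeatable; individual runs still fluctuate, so the gap shrinks as a trend rather than strictly at every step.

```python
import random

rng = random.Random(0)   # fixed seed: an arbitrary choice for repeatability

def experimental_probability(trials):
    """Fraction of simulated fair-coin flips that land heads."""
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

for trials in (10, 100, 10_000, 100_000):
    p_hat = experimental_probability(trials)
    print(trials, p_hat, abs(p_hat - 0.5))
```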

50

Probability Rules

  • All values are 0 ≤ P ≤ 1.

  • Probability of sample space is 1.

  • P(at least 1) = 1 – P(none)
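
The "at least one" shortcut can be confirmed by brute-force counting. The scenario here (at least one 6 in four rolls of a fair die) is an illustrative assumption, not an example from the card.

```python
from fractions import Fraction
from itertools import product

# All 6**4 equally likely results of rolling a fair die four times.
rolls = list(product(range(1, 7), repeat=4))

p_none = Fraction(sum(1 for r in rolls if 6 not in r), len(rolls))
p_at_least_one = Fraction(sum(1 for r in rolls if 6 in r), len(rolls))

print(p_none, p_at_least_one, 1 - p_none)
```

Counting "none" is the easier route because it is a single case, while "at least one" covers one, two, three, or four sixes.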

51

Complement of a Probability

P(A) + P(not A) = 1, so P(not A) = 1 - P(A)

52

Addition of Probabilities

P(A or B) = P(A) + P(B) – P(A & B)
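
A quick check of the addition rule on one roll of a fair die. The events chosen here (A = even, B = greater than 3) are hypothetical examples.

```python
from fractions import Fraction

sample_space = set(range(1, 7))
A = {n for n in sample_space if n % 2 == 0}   # even: {2, 4, 6}
B = {n for n in sample_space if n > 3}        # greater than 3: {4, 5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

direct = prob(A | B)                          # count the union directly
by_rule = prob(A) + prob(B) - prob(A & B)     # subtract the overlap counted twice

print(direct, by_rule)
```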

53

Multiplication of Probabilities

P(A & B) = P(A) · P(B)

  • If A & B are independent

54

Conditional Probability

The probability of an event given that a certain condition (another event) has occurred: P(A | B) = P(A & B) / P(B).
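
A sketch using the standard definition P(A | B) = P(A & B) / P(B) on two fair dice. The event functions below are hypothetical examples; the last lines also check the independence rule P(A & B) = P(A) · P(B) for a pair of events that really are independent.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # (first die, second die)

def prob(event):
    """Probability that `event` (a function of an outcome) is true."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def sum_is_8(o):
    return o[0] + o[1] == 8

def first_is_6(o):
    return o[0] == 6

def second_is_even(o):
    return o[1] % 2 == 0

# Conditioning on "first die is 6" changes the chance of a sum of 8.
p_a = prob(sum_is_8)
p_a_given_b = prob(lambda o: sum_is_8(o) and first_is_6(o)) / prob(first_is_6)
print(p_a, p_a_given_b)

# Independent events: the joint probability equals the product.
p_joint = prob(lambda o: first_is_6(o) and second_is_even(o))
print(p_joint, prob(first_is_6) * prob(second_is_even))
```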

55

Correlation Coefficient (r)

A quantitative assessment of the strength and direction of a linear relationship, with -1 ≤ r ≤ 1. (use ρ (rho) for population parameter)

  • Interpretation: "There is a (strength), (direction), linear association between x & y."

  • r = 0 → no linear correlation

  • 0 < |r| < 0.5 → weak

  • 0.5 ≤ |r| < 0.8 → moderate

  • 0.8 ≤ |r| ≤ 1 → strong
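
A minimal sketch that computes r from its definition and labels it with the cutoffs listed above. The function names and the data are hypothetical, and the cutoffs are this guide's rule of thumb rather than a universal standard.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def describe(r):
    """Strength and direction labels using the cutoffs from this card."""
    size = abs(r)
    if size == 0:
        return "no linear correlation"
    strength = "weak" if size < 0.5 else "moderate" if size < 0.8 else "strong"
    direction = "positive" if r > 0 else "negative"
    return f"{strength}, {direction}"

# Hypothetical bivariate data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
print(round(r, 4), describe(r))
```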

56

Least Squares Regression Line (LSRL)

A line of mathematical best fit. Minimizes the sum of the squared deviations (residuals) from the line. Used with bivariate data.

  • x is independent, the explanatory variable & y is dependent, the response variable

57

Residuals (error)

Vertical difference of a point from the LSRL: residual = observed y - predicted ŷ. All residuals sum up to “0”.
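
A sketch that fits a least squares line with the usual slope and intercept formulas and then checks that the residuals sum to zero. The function name `fit_lsrl` and the data are hypothetical.

```python
def fit_lsrl(xs, ys):
    """Return (intercept a, slope b) of the least squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar          # the line always passes through (x-bar, y-bar)
    return a, b

# Hypothetical bivariate data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_lsrl(xs, ys)
predicted = [a + b * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]   # observed - predicted

print(a, b)
print(residuals, sum(residuals))   # the sum is zero up to floating-point rounding
```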

58

Residual Plot

Scatterplot of (x (or ŷ), residual). No pattern indicates that a linear model is appropriate.

59

Coefficient of Determination (r²)

Gives the proportion of variation in y (response) that is explained by the linear relationship between x and y. Never use the adjusted r².

  • Approximately r²% of the variation in y can be explained by the LSRL of x and y.
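
One way to see what r² measures is to compare the variation left over after fitting the line (SSE) with the total variation in y (SST). The sketch below assumes that standard decomposition, r² = 1 - SSE/SST, and uses made-up data.

```python
# Hypothetical bivariate data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
     / sum((x - x_bar) ** 2 for x in xs))
a = y_bar - b * x_bar

sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))   # unexplained variation
sst = sum((y - y_bar) ** 2 for y in ys)                     # total variation in y

r_squared = 1 - sse / sst
print(f"About {r_squared:.0%} of the variation in y is explained by the LSRL.")
```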

60

Slope (b)

For each one-unit increase in x, the predicted y increases or decreases by the slope amount.

61

Extrapolation

Using the LSRL to predict values outside of the range of the original data. The LSRL should not be used this way because such predictions are unreliable.

62

Influential Points

Points that, if removed, significantly change the LSRL.

63

Outliers

Points with large residuals.

64

Census

A complete count of the population.

  • Why not to use a census?

    • Expensive

    • Impossible to do

    • With destructive sampling, a census would destroy the entire population

65

Sampling Frame

A list of every individual in the population, from which the sample is drawn.

66

SRS (Simple Random Sample)

A sample chosen so that each unit has an equal chance of being selected and every possible set of units of the same size has an equal chance of being selected.

  • Advantages: easy and unbiased.

  • Disadvantages: large variance (σ²) and must know the whole population (need a sampling frame).
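
A minimal sketch of drawing an SRS with `random.sample`, which selects without replacement so that every subset of the requested size is equally likely. The student IDs and the seed are hypothetical.

```python
import random

rng = random.Random(42)   # fixed seed (arbitrary) so the draw is repeatable

# Hypothetical sampling frame: 100 labeled students.
population = [f"student_{i:03d}" for i in range(1, 101)]

srs = rng.sample(population, k=10)   # 10 distinct units, all subsets equally likely
print(srs)
```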

67

Stratified

Divide the population into homogeneous groups called strata, then take an SRS from each stratum.

  • Advantages: more precise than an SRS and cost reduced if strata already available.

  • Disadvantages: difficult to divide into groups, more complex formulas & must know population.
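
A sketch of stratified sampling: an SRS is taken separately inside each stratum. The grade-level strata, their sizes, and the per-stratum sample size of 5 are all assumptions made for illustration.

```python
import random

rng = random.Random(7)   # fixed seed (arbitrary) for repeatability

# Hypothetical strata: students grouped by grade level.
strata = {
    grade: [f"{grade}_{i:02d}" for i in range(1, 41)]
    for grade in ("freshman", "sophomore", "junior", "senior")
}

# Take an SRS of 5 from EVERY stratum.
stratified_sample = {
    grade: rng.sample(members, k=5) for grade, members in strata.items()
}

for grade, chosen in stratified_sample.items():
    print(grade, chosen)
```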

68

Systematic

Use a systematic approach (every 50th) after choosing randomly where to begin.

  • Advantages: unbiased, the sample is evenly distributed across population & don’t need to know population.

  • Disadvantages: a large σ² and can be confounded by trends.
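
A sketch of systematic sampling with an interval of 50, matching the "every 50th" example. The population of 1,000 numbered units and the seed are hypothetical.

```python
import random

rng = random.Random(3)   # fixed seed (arbitrary) for repeatability

population = list(range(1, 1001))   # hypothetical units numbered 1..1000
interval = 50

start = rng.randrange(interval)           # random starting position among the first 50
systematic_sample = population[start::interval]   # then every 50th unit after it

print(start, systematic_sample)
```

If the population has a repeating pattern with the same period as the interval, every selected unit lands at the same point in the cycle, which is how trends can confound this method.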

69

Cluster Sample

Based on location. Select a random location and sample ALL at that location.

  • Advantages: cost is reduced, is unbiased & don’t need to know population.

  • Disadvantages: May not be representative of population and has complex formulas.
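
A sketch of cluster sampling: whole clusters are chosen at random, and everyone in a chosen cluster is sampled. The homerooms, their sizes, and the choice of 2 clusters are hypothetical.

```python
import random

rng = random.Random(11)   # fixed seed (arbitrary) for repeatability

# Hypothetical clusters: 8 homerooms of 25 students each.
clusters = {
    f"homeroom_{i}": [f"hr{i}_student_{j:02d}" for j in range(1, 26)]
    for i in range(1, 9)
}

chosen_clusters = rng.sample(sorted(clusters), k=2)   # randomly pick whole clusters
cluster_sample = [s for name in chosen_clusters for s in clusters[name]]  # take ALL members

print(chosen_clusters, len(cluster_sample))
```

In contrast with stratified sampling, randomness here picks the groups, not the individuals within them.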

70

Random Digit Table

Each entry is equally likely and each digit is independent of the rest.

71

Random # Generator

Calculator or computer program

72

Bias

Systematic error that favors a certain outcome. It has to do with the center of the sampling distribution: if the distribution is centered over the true parameter, the statistic is considered unbiased.

73

Voluntary Response

People choose for themselves whether to participate.

74

Convenience Sampling

Asking only people who are easy to reach, friendly, or comfortable to ask.

75

Undercoverage

Some group(s) are left out of the selection process.
