AP Statistics Comprehensive Review

Description and Tags

Contains U1, U2, and U3


61 Terms

1

What makes a study an Observational Study?

  • treatments are not assigned (imposed) by the researcher, so nothing is randomly assigned

    • ex: This is an observational study b/c the researchers did not assign the volunteers to watch TV or not watch TV.

2

What makes a study an Experiment?

  • treatments ARE randomly assigned

    • ex: This is an experiment because the teacher randomly assigns which students will use the physical program and which students will use the virtual program.

3

How do you conduct a Simple Random Sample?

  1. State how you will number the population

  2. State the method used (random number generator)

  3. State the range of numbers used and how many you will select. Also, communicate no repeated numbers allowed.
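
The three steps above can be sketched in Python. This is a minimal illustration, not a required method: the population of 40 students, the sample size of 5, and the function name `simple_random_sample` are all hypothetical. `random.sample` draws without replacement, which matches the "no repeated numbers" requirement.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Number the population 1..N, then draw distinct labels at random."""
    rng = random.Random(seed)
    # Step 1: number the population 1..N.
    labels = range(1, len(population) + 1)
    # Steps 2-3: a random number generator picks sample_size distinct labels
    # from that range (random.sample never repeats a label).
    chosen = rng.sample(labels, sample_size)
    return [population[label - 1] for label in chosen]

# Hypothetical population of 40 students.
students = [f"student_{i}" for i in range(1, 41)]
srs = simple_random_sample(students, 5, seed=1)
print(srs)
```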

4

When do you conduct a Stratified Random Sample?

  • Use stratification when a variable that is inherent to the subjects/units (e.g., age, gender) might influence the response being measured in the study.

5

What are the advantages and disadvantages of a Stratified Random Sample?

  • Advantage:

    • reduces variability in the response variable (state the response variable)

    • often more representative of the population as each group is proportionally represented in the sample

  • Disadvantage:

    • often time consuming to carry out

6

How do you conduct a Stratified Random Sample?

  1. Group individuals into their strata 

  2. State how you will number the individuals within each stratum (the numbering may differ from stratum to stratum)

  3. State the method used (random number generator)

  4. State the range of numbers used and how many you will select from each stratum. Also, communicate no repeated numbers allowed.
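
As a rough sketch of these steps, the code below draws a separate random sample inside each stratum. The strata (30 freshmen, 20 seniors), the per-stratum sample sizes, and the function name are hypothetical, chosen so that each group is sampled in proportion to its size.

```python
import random

def stratified_random_sample(strata, sizes, seed=None):
    """Draw a simple random sample separately within each stratum."""
    rng = random.Random(seed)
    sample = []
    for name, members in strata.items():
        # Number the individuals in this stratum 1..N (N can differ by stratum).
        labels = range(1, len(members) + 1)
        # Draw distinct labels, so no individual is chosen twice.
        chosen = rng.sample(labels, sizes[name])
        sample.extend(members[label - 1] for label in chosen)
    return sample

# Hypothetical strata: 30 freshmen and 20 seniors, sampled at 10% each.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 31)],
    "seniors": [f"S{i}" for i in range(1, 21)],
}
sizes = {"freshmen": 3, "seniors": 2}
stratified = stratified_random_sample(strata, sizes, seed=1)
print(stratified)
```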

7

When do you conduct a Cluster Random Sample?

  • Use clustering when the groups are formed based on convenience (ex. location) instead of the confounding variable 

8

What are the advantages and disadvantages of a Cluster Random Sample?

  • Advantages:

    • quicker to carry out than stratifying 

  • Disadvantages:

    • the sample may not be representative of the population 

9

How do you conduct a Cluster Random Sample?

  1. Group individuals based on proximity 

    1. Ex. If you’re taking a sample of a hotel, group the subjects by their floor 

  2. State how you will number the clusters

  3. State the method used (random number generator)

  4. State the range of numbers used and how many clusters you will select. Also, communicate no repeated numbers allowed. Sample every individual in the selected clusters.
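
A possible sketch of the hotel example, assuming the usual definition of a cluster sample: whole clusters (floors) are chosen at random and everyone in a chosen cluster is sampled. The hotel layout (6 floors, 4 rooms each) and the function name are hypothetical.

```python
import random

def cluster_random_sample(clusters, num_clusters, seed=None):
    """Randomly pick whole clusters, then sample everyone in them."""
    rng = random.Random(seed)
    cluster_names = sorted(clusters)
    # Number the clusters 1..K and draw distinct labels.
    labels = range(1, len(cluster_names) + 1)
    chosen = rng.sample(labels, num_clusters)
    chosen_names = [cluster_names[label - 1] for label in chosen]
    # Every individual in a selected cluster is included.
    sample = [person for name in chosen_names for person in clusters[name]]
    return chosen_names, sample

# Hypothetical hotel: floors 1-6, four rooms per floor.
hotel = {
    floor: [f"floor{floor}_room{room}" for room in range(1, 5)]
    for floor in range(1, 7)
}
chosen_floors, guests = cluster_random_sample(hotel, 2, seed=3)
print(chosen_floors, guests)
```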

10

When does a study have selection bias?

  • when the sample is poorly selected because no randomness was used to select it

11

What are the types of selection bias?

  • Convenience Sampling

    • sampled based on what is convenient for the researcher

      • ex. sampling the first 30 people you see 

  • Voluntary Response Sampling 

    • the people who respond are often similar to each other with respect to their views/opinions

      • ex. surveys online

12

When does a study have nonresponse bias?

  • there was random sampling

  • HOWEVER, the people selected for the study can’t be contacted or refuse to participate 

    • ex. mail people the survey, but don’t follow up if they don’t mail it back

13

When does a study have response bias?

  • participants were randomly selected AND the people responded

  • but the responses are quite inaccurate 

14

How does response bias occur?

  • Leading questions: people know what answer you want so they tell you that answer

  • Confusing questions: people don’t understand the question 

  • Awkwardness between the researcher & individual answering

15

What is the explanatory variable?

  • a variable whose levels are manipulated intentionally 

    • the independent variable 

    • EXPLAINS the response variable 

16

What is the response variable?

  • a variable that is the outcome of a study; what you are measuring 

    • dependent variable 

    • is what happens in RESPONSE to the explanatory variable being manipulated 

17

What is a confounding variable?

  • a variable that is related to the explanatory variable and possibly influences the response variable 

18

What are the criteria of a well-designed experiment?

  • comparison: must compare two or more treatment groups 

  • random assignment: experimental units/subjects must be randomly assigned to treatments 

  • control: control potential confounding variables by keeping all other variables constant for all groups 

  • replication: must have more than one experimental unit/subject in each treatment 

19

What is the purpose of random assignment?

  • random assignment creates roughly equivalent groups (provide context)

  • allows for fair comparison between (provide context about the treatments)

  • differences in the response can then be attributed to the treatment (context) instead of untested variables 
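
One way to carry out random assignment is to shuffle the subjects and deal them out to the treatments in turn. The sketch below assumes 20 hypothetical students and the two programs from the experiment example (physical and virtual); the function name is made up for illustration.

```python
import random

def randomly_assign(subjects, treatments, seed=None):
    """Shuffle the subjects, then deal them out to treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for i, subject in enumerate(shuffled):
        # Dealing in turn keeps the group sizes equal (or within one).
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical class of 20 students and two programs.
class_list = [f"student_{i}" for i in range(1, 21)]
groups = randomly_assign(class_list, ["physical", "virtual"], seed=2)
print(groups)
```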

20

What is the statistical advantage of blocking by the confounding variable?

  • blocking separates natural chance variability in responses from the differences due to the confounding variable (state with context)

  • this makes it easier to determine if one treatment is better/worse than the other treatment (makes more sense with context) 

21

Why is replication important?

  • there is more than one unit/subject in each treatment (provide context for both the unit and treatment)

  • this is important to show that the results aren’t due to random chance

22

What is a completely randomized designed experiment?

  • the most basic design

  • take a whole group of subjects/units and use a random method to assign them to treatment groups (typically of equal/similar size)

23

What is a randomized block designed experiment?

  • units/subjects are FIRST separated into blocks, and then random assignment to treatments is done within each block

  • Reason: separates natural chance variability in responses from differences due to the blocking variable. This makes it easier to determine if one treatment is really more effective than another.
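
A minimal sketch of a randomized block design: the random assignment is repeated separately inside each block, so every treatment gets a balanced share of each block. The blocks (beginner and advanced), the treatment names, and the function name are hypothetical.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Randomly assign subjects to treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {treatment: [] for treatment in treatments}
    for block_name, members in blocks.items():
        # Shuffle only the members of this block.
        shuffled = list(members)
        rng.shuffle(shuffled)
        for i, subject in enumerate(shuffled):
            treatment = treatments[i % len(treatments)]
            assignment[treatment].append((block_name, subject))
    return assignment

# Hypothetical blocks formed by skill level.
blocks = {
    "beginner": [f"B{i}" for i in range(1, 7)],
    "advanced": [f"A{i}" for i in range(1, 5)],
}
block_design = randomized_block_assignment(
    blocks, ["treatment_1", "treatment_2"], seed=4
)
print(block_design)
```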

24

What is a matched pair design?

  • Pairing where 2 similar subjects are paired together, and one subject in each pair is randomly assigned to each treatment

    • ex. pair golfers with a similar skill level

  • Pairing where each subject does both treatments, in a random order 

25

What can you generalize for an observational study?

  • can only generalize the findings of a study to the population from which the sample was selected

26

What can you generalize from a well-designed experiment?

  • results from the experiment can be generalized to others who are similar to the volunteers

  • results can be generalized to the general population when the subjects/units ARE randomly sampled from the population before randomly assigning to treatments

27

What is categorical data?

  • data is placed into one of several groups or categories

    • ex. favorite color, car model, zip code

    • appropriate graphs: bar graph, segmented bar graph, mosaic plot, pie chart

28

What is quantitative data?

  • data is numerical and it makes sense to average the values

    • ex. height, length, age, # of siblings, # of pairs of shoes you own

      • appropriate graphs: dotplot, stemplot, histogram, box plot

29

What are the characteristics of the mean?

\overline{x}=\frac{\sum x_{i}}{n}

  • add up all the values in the set, then divide by the number of values 

  • heavily influenced by outliers 

30

What are the characteristics of the median?

  • the middle value of an ordered distribution

  • not influenced by outliers 
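
To see how differently the mean and the median react to an outlier, here is a small sketch with a made-up data set. The values are hypothetical; `mean` and `median` come from Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical data set, then the same data with one extreme value added.
values = [2, 3, 4, 5, 6]
with_outlier = values + [100]

print(mean(values), median(values))              # both are 4
print(mean(with_outlier), median(with_outlier))  # mean jumps to 20, median is 4.5
```

The single outlier pulls the mean from 4 to 20, while the median only moves from 4 to 4.5.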

31

What are the characteristics of range?

  • Max - Min

  • Highly influenced by outliers

32

What are the characteristics of the Interquartile Range (IQR)?

  • Q3 - Q1

  • measures how wide the middle 50% of the data is 

  • not influenced by outliers, b/c only values toward the middle of the distribution are used in the calculation

33

How do you find outlier(s)?

  • Do the upper and lower fence test 

    • LF = Q1 - 1.5(IQR)

      • any value x < LF is an outlier 

    • UF = Q3 + 1.5(IQR)

      • any value x > UF is an outlier 
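
The fence test can be sketched as follows. One assumption to flag: quartiles here are found as the medians of the lower and upper halves of the ordered data (leaving out the middle value when n is odd), which is the convention many intro courses and calculators use; other software may compute quartiles slightly differently. The data set and function names are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # When n is odd, the middle value belongs to neither half.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), median(upper)

def find_outliers(values):
    """Apply the 1.5 * IQR fence test and return the flagged values."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in values if x < lower_fence or x > upper_fence]

# Hypothetical data: Q1 = 4, Q3 = 9, IQR = 5, fences at -3.5 and 16.5.
scores = [2, 4, 5, 7, 8, 9, 30]
print(find_outliers(scores))
```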

34

What is the five-number summary?

  1. min

  2. Q1

  3. median

  4. Q3

  5. max

35

What are the characteristics of standard deviation?

s=\sqrt{\frac{1}{n-1}\sum\left(x_{i}-\overline{x}\right)^2}

  • the typical (roughly average) distance of the values from the mean

  • since it’s related to the mean, it is heavily influenced by outliers
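
The formula above can be checked by hand in a few lines. This sketch uses a hypothetical data set and compares the hand computation with `statistics.stdev`, which also divides by n - 1.

```python
from math import sqrt
from statistics import stdev

def sample_std(values):
    """Sample standard deviation: sqrt of squared deviations over n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return sqrt(squared_deviations / (n - 1))

# Hypothetical data: mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s_by_hand = sample_std(data)
s_library = stdev(data)
print(s_by_hand, s_library)
```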

36

What does skewed left mean?

  • long tail on the left

37

What does skewed right mean?

  • long tail on the right 

38

Characteristics of a stem plot

  • you can tell the shape of the distribution

  • easily find the middle 

  • can find the unusual features 

  • can easily find the spread

39

Characteristics of a histogram

  • can’t easily find the middle 

  • can find the spread

  • can find unusual features

  • can find the shape

40

How do you describe/compare a data set?

CUSS + CONTEXT

  • center - median/mean (based on presence of outliers)

  • unusual features - gaps/outliers

  • spread - range/IQR/standard deviation (based on presence of outliers)

  • shape - unimodal/skewed/symmetric 

  • context of the problem 

  • NOTE: when comparing, use words like: greater than, less than, equivalent

41

How do you create a relative cumulative frequency graph?

  • make the graph from 0%-100%

  • take the values from the relative frequency and then add them in order

    • rf1: 36%, rf2: 29%

    • rcf1: 36%, rcf2: 65% (36% + 29%)
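
The running total can be sketched with `itertools.accumulate`. The first two relative frequencies (36% and 29%) come from the example above; the last two (20% and 15%) are hypothetical values added so the total reaches 100%.

```python
from itertools import accumulate

# Relative frequencies in percent; only 36 and 29 come from the example,
# 20 and 15 are made up so the list sums to 100.
relative_freq = [36, 29, 20, 15]

# Each cumulative value is the sum of all relative frequencies up to that class.
cumulative = list(accumulate(relative_freq))
print(cumulative)  # [36, 65, 85, 100]
```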

42

What is the important aspect to remember about relative cumulative frequency tables?

  • Whatever value is chosen, the cumulative percent counts everything up to that value 

    • ex. What percent of automobiles have a score less than 180?

      • 93%; the leftover 7% is over 180.

43

What are the characteristics of z-scores?

z=\frac{x_{i}-\overline{x}}{s}

  • measure of relative position

  • tells you how many standard deviations above or below the mean a value is 

  • explanation: (value in context) is (|z-score|) standard deviations above/below the mean (context).

    • when comparing z-scores, whoever has the highest z-score generally did better relative to their own group 
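
A short sketch of comparing two scores by their z-scores. All numbers are hypothetical: a score of 85 on a test with mean 75 and standard deviation 5, and a score of 90 on a test with mean 80 and standard deviation 10.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical scores on two different tests.
z_test_a = z_score(85, mean=75, sd=5)    # 2.0 standard deviations above the mean
z_test_b = z_score(90, mean=80, sd=10)   # 1.0 standard deviation above the mean
print(z_test_a, z_test_b)
```

Even though 90 is the higher raw score, the 85 has the higher z-score, so it is the better performance relative to its own group.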

44

What are the characteristics of a scatterplot?

  • shows the relationship between two quantitative variables 

45

How do you describe the characteristics of a linear relationship?

DUST + CONTEXT

  • Direction - positive/negative

  • Unusual features - outliers (points that fall far from the overall pattern of the data)

  • Strength 

    • strong - data points fall close to the line 

    • weak - data points are very spread out around the line 

  • Type - linear, nonlinear

  • context of the problem 

46

What are the characteristics of the correlation coefficient?

  • r only measures the strength (and direction) of a linear relationship between two quantitative variables 

  • -1 ≤ r ≤ 1

    • if r is near 0 = the linear relationship is weak

    • if r is near 1 or -1 = the linear relationship is strong

  • r has no units
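
The cards do not give a formula for r, so the sketch below assumes the standard one: the sum of the products of deviations divided by the square root of the product of the sums of squared deviations. The data are hypothetical. Rescaling x (a change of units) leaves r unchanged, which illustrates why r has no units.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for paired quantitative data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)

# Changing the units of x (multiplying by 100) does not change r.
r_rescaled = correlation([100 * x for x in xs], ys)
print(r, r_rescaled)
```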

47

What are the basic characteristics of the least squares regression line (LSRL)?

\hat{y}=a+bx

  • “y hat” = predicted y value 

  • the line of best fit that minimizes the sum of the squares of the residuals
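
A minimal sketch of fitting the LSRL by hand. The cards do not show how a and b are computed, so this assumes the standard formulas: the slope is the sum of products of deviations divided by the sum of squared x-deviations, and the intercept makes the line pass through the point of means. The data and function name are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y_hat = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_bar - b * x_bar    # intercept: line passes through (x_bar, y_bar)
    return a, b

# Hypothetical paired data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)

# Predicted y ("y hat") at x = 3, which is inside the observed x-values.
y_hat = a + b * 3
print(a, b, y_hat)
```

For these values the line works out to roughly y hat = 2.2 + 0.6x.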

48

What happens when you predict (“y hat”) with the LSRL?

  • Interpolating 

    • when the x-value is inside the domain of the data 

  • Extrapolating 

    • when the x-value is outside the domain of the data; these predictions are less reliable 

49

How do you interpret the slope/“b” of the LSRL?

  • As (x in context) increases by 1 (unit), the predicted (y in context) increases/decreases by (b).

50

How do you interpret the intercept/“a” of the LSRL?

  • When (x in context) is zero, the predicted (y in context) is (a). 

51

How do you interpret the correlation/ r?

You must address:

  • strength - moderate/weak/strong

  • type - linear/nonlinear

  • direction - positive/negative

  • context

52

How do you characterize the coefficient of determination/ r²?

  • calculated by squaring the correlation

  • Interpretation: (r² as a percent)% of the variation in (y in context) is explained by the linear relationship with (x in context).

53

What are the characteristics of a residual?

\varepsilon=y-\hat{y}

  • residual = actual - predicted

  • the vertical distances between each point on a scatterplot and the LSRL
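
A small sketch of computing residuals as actual minus predicted. It assumes a line y hat = 2.2 + 0.6x has already been fitted to the hypothetical data shown; both the data and the coefficients are illustrative. For a least squares line the residuals sum to zero, up to rounding.

```python
# Hypothetical data and an already-fitted line y_hat = 2.2 + 0.6 * x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
intercept, slope = 2.2, 0.6

# residual = actual - predicted
predicted = [intercept + slope * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

# A positive residual means the point lies above the line, negative means below.
print([round(res, 2) for res in residuals])
print(round(sum(residuals), 10))
```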

54

How do you interpret residual plots?

  • no obvious pattern = a linear model is appropriate

  • clear/obvious pattern (ex. a curve) = a linear model isn’t appropriate

    • fanning (residuals that spread out or narrow as x increases) is also a bad sign 
