AP Statistics Ultimate Guide (copy)

107 Terms

1

Categorical Variables

Variables that take on values as category names or group labels.

2

Frequency Tables

Tables showing the number of cases in each category.

3

Relative Frequency Tables

Tables showing the proportion or percentage of cases in each category.

4

Quantitative Variable

Variable that takes on numerical values for a measured quantity.

5

Discrete Quantitative Variable

Variable with a finite or countable number of values.

6

Continuous Quantitative Variable

Variable that can take any value within an interval, so its possible values are uncountably infinite.

7

Center

Value that separates the data roughly in half.

8

Spread

The scope of values from smallest to largest.

9

Clusters

Natural subgroups in which values fall.

10

Gaps

Holes where no values fall in the data.

11

Unimodal Distribution

Distribution with one peak.

12

Bimodal Distribution

Distribution with two peaks.

13

Skewed Distribution

Distribution with a long tail stretching toward higher values (skewed right) or lower values (skewed left).

14

Bell-shaped Distribution

Symmetric distribution with a center mound and sloping tails.

15

Descriptive Statistics

Presentation of data including average values, variability measures, and distribution shape.

16

Inferential Statistics

Drawing inferences from limited data.

17

Median

Middle value of an ordered data set; with an even number of values, the mean of the two middle values.

18

Mean

Average found by summing items and dividing by the number of items.

19

Variability

Key concept in statistics describing the spread of data.

20

Range

Difference between the largest and smallest values.

21

Interquartile Range (IQR)

Range of the middle 50% of data.

22

Variance

Average of squared differences from the mean; for a sample, the sum of squared differences is divided by n − 1 rather than n.

23

Standard Deviation

Square root of the variance, indicating typical distance from the mean.
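
The measures of center and variability defined in the cards above can be computed on a small data set. This is a minimal sketch using Python's standard `statistics` module; the data values are hypothetical, and the quartiles are taken as medians of the lower and upper halves, which is one common textbook convention (other software may use slightly different rules).

```python
import statistics

# Hypothetical data, chosen only for illustration.
data = sorted([2, 4, 4, 4, 5, 5, 7, 9])
n = len(data)

median = statistics.median(data)       # middle of the ordered values
mean = statistics.mean(data)           # sum of items / number of items
data_range = max(data) - min(data)     # largest minus smallest

# Quartiles as medians of the lower and upper halves (assumed convention).
lower_half = data[: n // 2]
upper_half = data[(n + 1) // 2:]
iqr = statistics.median(upper_half) - statistics.median(lower_half)

pop_variance = statistics.pvariance(data)    # divide by n
sample_variance = statistics.variance(data)  # divide by n - 1
pop_sd = statistics.pstdev(data)             # square root of pop_variance

print(median, mean, data_range, iqr, pop_variance, pop_sd)
```

Note that the sample variance (dividing by n − 1) is slightly larger than the population variance (dividing by n) for the same values.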

24

Simple Ranking

Arranging elements to determine a value's position.

25

Percentile Ranking

Indicates the percentage of values at or below a specific value.

26

Z-Score

Number of standard deviations a value is above or below the mean.
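
As a quick illustration of the z-score definition, here is a small sketch. The function name `z_score` and the scale (mean 100, standard deviation 15) are assumptions made up for the example.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# Hypothetical scale with mean 100 and standard deviation 15.
z_high = z_score(130, 100, 15)  # two standard deviations above the mean
z_low = z_score(85, 100, 15)    # one standard deviation below the mean
print(z_high, z_low)
```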

27

Parallel Boxplots

Graphical representation showing the comparison of multiple datasets, indicating median, quartiles, and outliers.

28

Normal Distribution

A bell-shaped symmetrical distribution with the mean equal to the median, following the empirical rule of 68-95-99.7 for standard deviations.
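
The 68-95-99.7 rule can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The three proportions come out close to 0.68, 0.95, and 0.997, which is where the rule gets its name.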

29

Correlation Coefficient

A numerical measure indicating the strength and direction of a linear relationship between two variables, ranging from -1 to 1.

30

Coefficient of Determination (r^2)

The proportion of variance in the response variable explained by the variation in the explanatory variable in a linear regression model.

31

Residuals

Differences between observed and predicted values (observed minus predicted) in a regression model; for a least-squares regression line, their sum is always zero.
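
The correlation coefficient, coefficient of determination, and residuals from the cards above can be tied together in one small computation. This is a sketch on hypothetical (x, y) data, fitting the least-squares line by hand rather than with a statistics package.

```python
import math

# Hypothetical paired data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / math.sqrt(sxx * syy)       # correlation coefficient, between -1 and 1
slope = sxy / sxx                    # least-squares slope
intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)

# Residual = observed y minus predicted y.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

# r^2 = proportion of the variation in y explained by the line.
r_squared = 1 - sum(e ** 2 for e in residuals) / syy

print(r, r_squared, sum(residuals))
```

Up to floating-point rounding, the residuals sum to zero and `r_squared` equals `r ** 2`.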

32

Outliers

Data points that significantly deviate from the overall pattern in a dataset, affecting the regression analysis and interpretation.

33

Influential Scores

Scores whose removal would sharply change the regression line, especially points with extreme x-values.

34

High Leverage

Points with x-values far from the mean of x-values, having the potential to strongly influence the regression line.

35

Regression Outlier

A point with a large residual compared to the others; it stands apart from the overall pattern but is not necessarily influential.

36

Correlation Coefficient (r)

Indicates the strength and direction of a linear relationship between two variables.

37

Simple Random Sample (SRS)

A sampling method where every possible sample of the desired size has an equal chance of being selected.

38

Stratified Sampling

Dividing the population into homogeneous groups and picking random samples from each stratum.

39

Cluster Sampling

Involves dividing the population into heterogeneous groups and selecting entire clusters randomly.

40

Systematic Sampling

Involves selecting every kth individual from a list after choosing a random starting point.
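
The four sampling methods in the cards above (simple random, stratified, cluster, systematic) can be sketched side by side. The population of 100 ID numbers, the two strata, and the clusters of ten are all hypothetical choices made for the illustration.

```python
import random

rng = random.Random(0)            # fixed seed so the sketch is repeatable
population = list(range(1, 101))  # hypothetical population of 100 ID numbers

# Simple random sample: every possible set of 10 IDs is equally likely.
srs = rng.sample(population, 10)

# Stratified: split into strata, then take a random sample from each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit for group in strata.values() for unit in rng.sample(group, 5)]

# Cluster: split into clusters, then select entire clusters at random.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
cluster_sample = [unit for group in rng.sample(clusters, 2) for unit in group]

# Systematic: random starting point, then every kth individual on the list.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

print(srs, stratified, cluster_sample, systematic, sep="\n")
```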

41

Sampling Variability

Refers to the natural presence of sampling error in a sample, which can be described using probability and tends to decrease with larger sample sizes.

42

Observational Studies

Involves observing and measuring without influencing the subjects, aiming to show associations between variables.

43

Experiments

Involve imposing treatments on subjects, measuring responses, and aiming to establish cause-and-effect relationships.

44

Experimental Units

Objects on which an experiment is performed; when the units are people, they are called subjects.

45

Placebo Effect

The phenomenon where individuals respond to any perceived treatment, even if it is inactive.

46

Blinding

Occurs when subjects are unaware of the treatment they are receiving.

47

Double-blinding

When both subjects and evaluators are unaware of the treatment allocation.

48

Matched Pairs Design

Compares two treatments based on responses of paired subjects, often using the same individual for both treatments.

49

Replication

Involves having more than one experimental unit in each treatment group to enhance the reliability of results.

50

Law of Large Numbers

States that as the number of trials in an experiment increases, the relative frequency of an event tends to approach its true probability.
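
A short simulation makes the Law of Large Numbers concrete. The sketch assumes a fair coin (true probability 0.5); the function name `relative_frequency` is hypothetical.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

def relative_frequency(trials):
    """Proportion of heads in the given number of simulated fair-coin flips."""
    heads = sum(rng.random() < 0.5 for _ in range(trials))
    return heads / trials

for trials in (10, 1_000, 100_000):
    print(trials, relative_frequency(trials))
```

With few flips the relative frequency can be far from 0.5; with many flips it tends to settle close to 0.5.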

51

Guess Strategy

A strategy in a standard literacy test where the test taker selects answers randomly when the correct answer is unknown.

52

Score 60-79

A range of scores in a standard literacy test considered passing but not superior, falling between 60 and 79.

53

Does not score 60-79

The probability of a test taker not achieving a score between 60 and 79 in a standard literacy test.

54

Strategy "Answer (c)" and Scores 80-100

The joint probability of a test taker choosing answer (c) and scoring between 80 and 100 in a standard literacy test.

55

Strategy "Longest Answer" or Scores 0-59

The probability of a test taker choosing the longest answer or scoring between 0 and 59 in a standard literacy test.

56

Strategy "Guess" given Score 0-59

The probability of a test taker using the guess strategy given that their score falls between 0 and 59 in a standard literacy test.

57

Scored 80-100 given Strategy "Longest Answer"

The probability of a test taker scoring between 80 and 100 given that they chose the strategy of selecting the longest answer in a standard literacy test.

58

Guess Strategy and Scoring 0-59 Independence

The assessment of whether the strategy of guessing and scoring between 0 and 59 are independent events in a standard literacy test.

59

Strategy "Longest Answer" and Scoring 80-100 Mutual Exclusivity

The evaluation of whether the strategy of choosing the longest answer and scoring between 80 and 100 are mutually exclusive events in a standard literacy test.
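
The literacy-test cards above all describe probabilities read from a two-way table of strategy by score range. The table itself is not reproduced in these cards, so the counts below are hypothetical and serve only to show how each kind of probability (complement, joint, union, conditional, independence, mutual exclusivity) would be computed.

```python
# Hypothetical counts (NOT the original table): rows are strategies, columns are score ranges.
counts = {
    "guess":    {"0-59": 20, "60-79": 10, "80-100": 10},
    "answer_c": {"0-59": 15, "60-79": 15, "80-100": 10},
    "longest":  {"0-59": 5,  "60-79": 5,  "80-100": 10},
}
total = sum(sum(row.values()) for row in counts.values())

def p_strategy(s):
    return sum(counts[s].values()) / total

def p_score(c):
    return sum(row[c] for row in counts.values()) / total

def p_both(s, c):
    return counts[s][c] / total

p_not_60_79 = 1 - p_score("60-79")                         # complement rule
p_c_and_high = p_both("answer_c", "80-100")                # joint probability
p_longest_or_low = (p_strategy("longest") + p_score("0-59")
                    - p_both("longest", "0-59"))           # general addition rule
p_guess_given_low = p_both("guess", "0-59") / p_score("0-59")               # conditional
p_high_given_longest = p_both("longest", "80-100") / p_strategy("longest")  # conditional

# Independent if conditioning on the score does not change P(guess).
independent = abs(p_guess_given_low - p_strategy("guess")) < 1e-12
# Mutually exclusive if the two events can never occur together.
mutually_exclusive = p_both("longest", "80-100") == 0

print(p_not_60_79, p_c_and_high, p_longest_or_low)
print(p_guess_given_low, p_high_given_longest, independent, mutually_exclusive)
```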

60

Cumulative Probability Distribution

A function, table, or graph linking each outcome with the probability of obtaining a value less than or equal to that outcome.

61

Normal Distribution

Provides a model for how sample statistics vary under random sampling, often calculated using z-scores.

62

Central Limit Theorem

States that for sufficiently large sample sizes, the sampling distribution of the mean will be approximately normal.
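
The Central Limit Theorem can be illustrated by simulation. The sketch below draws repeated samples from a strongly right-skewed population (an exponential distribution with mean 1 and standard deviation 1, an assumption chosen for the example) and looks at the resulting sample means.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 40                  # size of each sample
reps = 5_000            # number of samples drawn

# Each entry is the mean of one sample of size n from the skewed population.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(reps)
]

center = statistics.fmean(sample_means)  # near the population mean, 1
spread = statistics.stdev(sample_means)  # near sigma / sqrt(n) = 1 / sqrt(40)

# For a roughly normal shape, about 95% of sample means lie within 2 SDs of the center.
within_2sd = sum(abs(m - center) <= 2 * spread for m in sample_means) / reps

print(center, spread, within_2sd)
```

Even though the population is far from normal, the sample means cluster symmetrically around the population mean with a spread close to σ/√n.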

63

Biased and Unbiased Estimators

Bias means the sampling distribution is not centered on the population parameter; unbiased estimators are centered on the population parameter.

64

Sampling Distribution for Sample Proportions

Focuses on the proportion of successes in a sample, approximating a normal distribution for large sample sizes.

65

Sampling Distribution for Differences in Sample Proportions

Deals with differences obtained by subtracting sample proportions of one population from another.

66

Sampling Distribution for Sample Means

The variance of sample means is the population variance divided by the sample size (σ²/n), so the standard deviation of sample means is σ/√n.

67

Sampling Distribution

The distribution of a statistic, such as a sample mean or proportion, over all possible samples of a given size. For sample means, its mean equals the population mean and its standard deviation equals the population standard deviation divided by the square root of the sample size.

68

Confidence Interval

A range of values that is likely to contain the true population parameter with a certain level of confidence, typically expressed as (point estimate ± margin of error).
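
The form (point estimate ± margin of error) can be sketched for a one-sample z-interval for a proportion. The function name and the survey counts (120 successes out of 200) are hypothetical, and the sketch assumes the usual large-sample conditions hold.

```python
import math
from statistics import NormalDist

def one_proportion_z_interval(successes, n, confidence=0.95):
    """Point estimate plus or minus (critical value times standard error)."""
    p_hat = successes / n                                    # point estimate
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)     # margin of error
    return p_hat - margin, p_hat + margin

# Hypothetical survey: 120 successes in a sample of 200.
low, high = one_proportion_z_interval(120, 200)
print(round(low, 4), round(high, 4))
```

Raising the confidence level widens the interval; raising the sample size narrows it.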

69

Standard Error

A measure of how much the sample statistic typically varies from the population parameter, calculated as the standard deviation of the sampling distribution.

70

Normality Assumption

The assumption that the sampling distribution of sample means or proportions is approximately normal if certain conditions are met, like the sample size being large enough.

71

Type I Error

Mistakenly rejecting a true null hypothesis in hypothesis testing, with a probability denoted as α (alpha).

72

Type II Error

Mistakenly failing to reject a false null hypothesis in hypothesis testing, with a probability denoted as β (beta).

73

Power of a Test

The probability of correctly rejecting a false null hypothesis, influenced by the sample size and significance level chosen for the test.
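
To see how α, β, and power relate, here is a sketch for a one-sided z-test of a mean with known population standard deviation. The function name and all numbers (null mean 100, true mean 105, σ = 15) are hypothetical.

```python
import math
from statistics import NormalDist

def power_one_sided_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of the test H0: mu = mu0 versus Ha: mu > mu0 when the true mean is mu_true."""
    se = sigma / math.sqrt(n)
    # Reject H0 when the sample mean exceeds this cutoff (Type I error rate = alpha).
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta = P(sample mean falls below the cutoff | true mean) = P(Type II error).
    beta = NormalDist(mu_true, se).cdf(critical_mean)
    return 1 - beta

power_small_n = power_one_sided_z(100, 105, 15, n=36)
power_large_n = power_one_sided_z(100, 105, 15, n=100)
power_strict = power_one_sided_z(100, 105, 15, n=36, alpha=0.01)
print(power_small_n, power_large_n, power_strict)
```

In this setup, a larger sample raises the power, while a stricter significance level lowers it.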

74

P-value

The probability, computed assuming the null hypothesis is true, of obtaining a result at least as extreme as the one observed; the smaller the P-value, the stronger the evidence against the null hypothesis.

75

Type I error

Occurs when the null hypothesis is wrongly rejected when it is actually true.

76

Type II error

Happens when the null hypothesis is not rejected when it is false.

77

Confidence Interval

A range of values that is likely to contain the true population parameter.

78

Two-sample z-interval

A method used to estimate the difference between two population proportions.
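
A two-sample z-interval for a difference in proportions might be computed as in the sketch below. The function name and the counts (60 of 100 versus 45 of 100) are hypothetical, and the usual large-sample and independence conditions are assumed.

```python
import math
from statistics import NormalDist

def two_proportion_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Interval for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Hypothetical samples: 60 successes of 100 versus 45 successes of 100.
low, high = two_proportion_z_interval(60, 100, 45, 100)
print(round(low, 4), round(high, 4))
```

An interval that lies entirely above zero suggests the first population proportion is larger than the second.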

79

Null hypothesis

A statement about a population parameter claiming no difference, no effect, or no relationship between the variables being studied.

80

Alternative hypothesis

A statement about a population parameter claiming that there is a difference, effect, or relationship between the variables being studied.

81

t-distribution

A probability distribution that is used when the population standard deviation is unknown.

82

Standard error

An estimate of the standard deviation of a sampling distribution.

83

Significance Test

A statistical method used to determine whether there is enough evidence to reject the null hypothesis.

84

Type-I Error

Mistakenly rejecting a true null hypothesis, leading to the consequence of discouraging customers from purchasing a product that might actually deliver as advertised.

85

Confidence Interval

An estimate of a population parameter that asks for a range of values within which the true parameter is likely to fall.

86

Type-II Error

Mistakenly failing to reject a false null hypothesis, potentially resulting in missed opportunities for necessary actions or improvements.

87

Significance Level

The threshold α against which the P-value is compared in a hypothesis test: the null hypothesis is rejected when the P-value is less than α. It equals the probability of a Type I error.

88

Power

The probability of correctly rejecting a false null hypothesis, indicating the effectiveness of a test in detecting a true effect.

89

Hypothesis Test

A statistical method to assess the validity of a claim about a population parameter by comparing sample data to the null hypothesis.

90

Paired Data

Involves analyzing the differences between two related measurements, often using a one-sample analysis on the paired differences.
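
Paired data reduce to a one-sample problem on the differences. The sketch below computes the one-sample t statistic for hypothetical before and after scores from the same five individuals; finding the P-value would additionally need a t-distribution table or calculator, which is not shown here.

```python
import math
import statistics

# Hypothetical paired measurements on the same five individuals.
before = [72, 80, 65, 90, 78]
after = [75, 84, 66, 93, 82]

differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)  # sample SD of the differences
standard_error = sd_diff / math.sqrt(n)

# One-sample t statistic for H0: mean difference = 0, with n - 1 degrees of freedom.
t_stat = mean_diff / standard_error
df = n - 1
print(differences, mean_diff, t_stat, df)
```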

91

Simulation

A method to estimate the likelihood of observing a certain outcome by random chance alone, often used to determine P-values in hypothesis testing.
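
As a sketch of how simulation can estimate a P-value, suppose a coin lands heads 60 times in 100 flips and the null hypothesis is that the coin is fair. The scenario and the function name `heads_under_null` are hypothetical.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable
observed_heads = 60     # hypothetical result: 60 heads in 100 flips
flips = 100
reps = 10_000           # number of simulated experiments

def heads_under_null():
    """Number of heads in one simulated experiment, assuming the coin is fair."""
    return sum(rng.random() < 0.5 for _ in range(flips))

# Proportion of simulated experiments at least as extreme as the observed result.
extreme = sum(heads_under_null() >= observed_heads for _ in range(reps))
p_value = extreme / reps
print(p_value)
```

The estimate is one-sided (counting only results with at least 60 heads) and will vary slightly with the seed and the number of repetitions.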

92

Two-Sample T-Test

A statistical test to compare the means of two independent samples, assessing whether there is a significant difference between the population means.
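
The test statistic for a two-sample t-test can be sketched as follows. The two tiny samples are hypothetical, the standard error is the unpooled form, and the degrees of freedom use the conservative choice of the smaller sample size minus one (calculators typically use a more precise formula).

```python
import math
import statistics

# Hypothetical independent samples, kept tiny for readability.
group_a = [5, 7, 9]
group_b = [1, 2, 3]

mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

# Unpooled standard error of the difference in sample means.
standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))

# t statistic for H0: the two population means are equal.
t_stat = (mean_a - mean_b) / standard_error
conservative_df = min(len(group_a), len(group_b)) - 1
print(t_stat, conservative_df)
```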

93

Confidence Interval for the Difference of Two Means

Estimating the range within which the true difference between two population means is likely to lie, based on sample data.

94

Chi-Square Test for Goodness-of-Fit

A statistical test used to determine if there is a significant difference between observed and expected frequencies in different categories.

95

Chi-Square Statistic (χ²)

The sum, over all categories, of (observed − expected)² / expected in a chi-square test.
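
For a goodness-of-fit test, the statistic is computed by squaring each difference between observed and expected counts and dividing by the expected count. The sketch below uses a hypothetical die rolled 60 times, so a fair die would give an expected count of 10 per face.

```python
# Hypothetical counts for the six faces of a die rolled 60 times.
observed = [8, 12, 9, 11, 10, 10]
expected = [sum(observed) / len(observed)] * len(observed)  # 10 per face if fair

# Each squared difference is weighted by dividing by its expected count.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # number of categories minus one
print(chi_square, df)
```

A small statistic like this one means the observed counts are close to what a fair die would produce.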

96

P-value

The probability of obtaining a chi-square value at least as large as the one observed, assuming the null hypothesis is true.

97

Degrees of Freedom (df)

The parameter of a chi-square distribution: the number of categories minus one for a goodness-of-fit test, or (rows − 1)(columns − 1) for a two-way table.

98

Chi-Square Test for Independence

A statistical test to determine if there is a significant association between two categorical variables.

99

Chi-Square Test for Homogeneity

A test used to compare samples from two or more populations to see if they have the same distribution.
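
The tests for independence and for homogeneity share the same computation on a two-way table: each expected count is (row total × column total) / grand total. The 2 × 2 table below is hypothetical.

```python
# Hypothetical 2 x 2 table of observed counts (rows: groups, columns: categories).
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell if the row and column variables were unrelated.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (rows - 1)(columns - 1)
print(expected, chi_square, df)
```

The two tests differ in how the data were collected (one sample classified two ways versus separate samples from each population), not in the arithmetic.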

100

Sampling Distribution for the Slope

The distribution of sample slopes in linear regression models.
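
The spread of that distribution is estimated by the standard error of the slope, which feeds into a t statistic with n − 2 degrees of freedom. The sketch below computes both by hand on hypothetical (x, y) data.

```python
import math

# Hypothetical paired data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# s estimates the standard deviation of the residuals (divide by n - 2).
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))

se_slope = s / math.sqrt(sxx)  # standard error of the sample slope
t_stat = slope / se_slope      # t statistic for H0: true slope = 0
df = n - 2
print(slope, se_slope, t_stat, df)
```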