AP Statistics Ultimate Guide

98 Terms

1

Categorical Variables

Variables that take on values as category names or group labels, organized into frequency tables or represented by displays like bar graphs, dot plots, and pie charts.

2

Quantitative Variable

Variables with numerical values for measured quantities, organized into frequency tables or represented by displays like histograms, dot plots, and box plots.

3

Discrete Quantitative Variable

Takes on a countable number of values with gaps between them.

4

Continuous Quantitative Variable

Can take on any value within an interval, so there are infinitely many possible values with no gaps between them, like heights and weights.

5

Center

The value that separates the data roughly in half, indicating the middle.

6

Spread

The range of values from smallest to largest, showing the variability.

7

Clusters

Natural subgroups in the data, indicating where values fall.

8

Gaps

Holes in the distribution where no values fall.

9

Unimodal Distribution

Distribution with one peak. Bimodal Distribution: distribution with two peaks.

10

Skewed Distribution

Distribution with a longer tail stretching toward higher values (right-skewed) or toward lower values (left-skewed).

11

Bell-shaped Distribution

Symmetric with a center mound and sloping tails.

12

Descriptive Statistics

Data presentation including average values, variability measures, and distribution shape.

13

Inferential Statistics

Drawing inferences about a population from limited sample data, discussed in later units.

14

Median

Middle number in an ordered set (the average of the two middle numbers when the count is even). Mean: average found by summing the values and dividing by the number of values.

15

Variability

Key concept in statistics, described by range, interquartile range, variance, and standard deviation.
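
As a rough illustration of the four measures named on this card, the sketch below computes each one for a small made-up data set. The `scores` list and the `five_number_summary` helper are assumptions for illustration only, and the quartiles follow the common textbook convention of taking the medians of the lower and upper halves (other conventions can give slightly different quartiles).

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the two halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # When n is odd, the middle value is left out of both halves.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )


# Hypothetical data, not taken from the flashcards.
scores = [2, 4, 4, 5, 7, 9, 11]

minimum, q1, median, q3, maximum = five_number_summary(scores)
data_range = maximum - minimum          # largest minus smallest
iqr = q3 - q1                           # spread of the middle half
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance

print(data_range, iqr, variance, round(std_dev, 2))
```

For this data the range is 9, the IQR is 5, the sample variance is 10, and the sample standard deviation is about 3.16.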

16

Parallel Boxplots

Boxplots drawn side by side on a common scale to compare distributions across groups, such as stock prices across different years; each plot shows the median, quartiles, extremes (for example, the yearly low), and interquartile range.

17

Normal Distribution

A bell-shaped and symmetric distribution used to model various natural phenomena, with the mean equal to the median and points of inflection at one standard deviation from the mean.

18

Empirical Rule

Also known as the 68-95-99.7 rule; states that in a normal distribution about 68%, 95%, and 99.7% of values lie within 1, 2, and 3 standard deviations of the mean, respectively.
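
The percentages in this rule can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `proportion_within` is an assumption for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.4f}")
```

The three printed proportions are roughly 0.6827, 0.9545, and 0.9973, which round to the 68, 95, and 99.7 in the rule's name.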

19

Two-Way Table

A table displaying qualitative data from two categorical variables, often used to calculate marginal frequencies and distributions.

20

Scatterplot

A visual representation of the relationship between two quantitative variables, showing form, direction, strength, and unusual features like outliers.

21

Correlation

A measure (r) of the strength of a linear relationship between two variables, ranging from -1 to +1, with r^2 indicating the proportion of variance explained by the relationship.

22

Coefficient of Determination (r^2)

The percentage of variation in the response variable explained by the linear regression model, derived from the correlation coefficient.

23

Least Squares Regression

A method to find the best-fitting line through a set of points by minimizing the sum of squared vertical differences (residuals); the slope equals the correlation coefficient times the ratio of the standard deviations, b = r(s_y / s_x).

24

Residuals

The differences between observed and predicted values (observed minus predicted) in a regression model; for a least squares regression line, the residuals always sum to zero.
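
The correlation, least squares, and residual cards can be tied together with a short calculation. The sketch below uses made-up `x` and `y` values (an assumption for illustration), computes r from standardized values, builds the line from the slope relationship b = r(s_y / s_x), and then checks the residuals.

```python
import statistics

# Hypothetical paired data, not taken from the flashcards.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
s_x, s_y = statistics.stdev(x), statistics.stdev(y)

# Correlation: average product of z-scores, dividing by n - 1.
r = sum(
    ((xi - mean_x) / s_x) * ((yi - mean_y) / s_y) for xi, yi in zip(x, y)
) / (n - 1)

# Least squares line: slope from r, and the line passes through (mean_x, mean_y).
slope = r * s_y / s_x
intercept = mean_y - slope * mean_x

predicted = [intercept + slope * xi for xi in x]
residuals = [yi - y_hat for yi, y_hat in zip(y, predicted)]  # observed - predicted
r_squared = r ** 2

print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
print(round(sum(residuals), 10))
```

For this data the line is roughly y-hat = 2.2 + 0.6x with r^2 of about 0.6, and the residuals sum to zero up to floating-point rounding.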

25

Outliers

Data points that significantly deviate from the overall pattern in a scatterplot, often identified by large discrepancies in the response variable compared to predicted values.

26

Influential Scores

Scores whose removal would sharply change the regression line, especially points with extreme x-values.

27

High Leverage

Points with x-values far from the mean x-value, having the potential to strongly influence the regression line.

28

Regression Outlier

A point with a large residual compared to others, affecting the regression line but not necessarily influential.

29

Correlation Coefficient (r)

Indicates the strength and direction of a linear relationship between two variables.

30

Simple Random Sampling (SRS)

A sampling method where every possible sample of the desired size has an equal chance of being selected.

31

Stratified Sampling

Involves dividing the population into homogeneous groups (strata) and selecting random samples from each stratum.

32

Cluster Sampling

Divides the population into heterogeneous groups (clusters) and selects entire clusters randomly.

33

Systematic Sampling

Involves selecting every kth individual from a list after choosing a random starting point.
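
The four sampling methods on the cards above can be contrasted in a few lines of code. This is a minimal sketch under assumptions: the population of 40 ID numbers, the two strata, the four clusters, and all function names are hypothetical and chosen only for illustration.

```python
import random


def simple_random_sample(population, n, rng):
    """Every possible group of n individuals is equally likely."""
    return rng.sample(population, n)


def stratified_sample(strata, n_per_stratum, rng):
    """Take a separate random sample from each stratum."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep everyone in them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


def systematic_sample(population, k, rng):
    """Random start among the first k, then every kth individual."""
    start = rng.randrange(k)
    return population[start::k]


rng = random.Random(0)  # seeded so the sketch is repeatable
population = list(range(1, 41))  # hypothetical ID numbers 1..40
strata = {"juniors": population[:20], "seniors": population[20:]}
clusters = {f"room{i}": population[i * 10:(i + 1) * 10] for i in range(4)}

srs = simple_random_sample(population, 8, rng)
stratified = stratified_sample(strata, 4, rng)
clustered = cluster_sample(clusters, 1, rng)
systematic = systematic_sample(population, 5, rng)
print(srs, stratified, clustered, systematic, sep="\n")
```

Note the structural difference: the stratified sample always contains members of every stratum, while the cluster sample contains every member of only the chosen clusters.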

34

Sampling Variability

The natural variation of a statistic from one random sample to another (sampling error), which can be described using probability and tends to decrease with larger sample sizes.

35

Observational Studies

Studies where observations and measurements are made without influencing the subjects, aiming to show associations between variables.

36

Experiments

Studies where treatments are imposed on subjects to measure responses, aiming to establish cause-and-effect relationships.

37

Experimental Units

The objects on which an experiment is performed; when the units are people, they are called subjects.

38

Explanatory Variables

Factors in an experiment believed to affect the response variables, with different levels of treatment applied to groups.

39

Control Group

A group in an experiment that does not receive the treatment of interest, or receives a placebo, to determine the treatment's effect.

40

Placebo Effect

The phenomenon where individuals respond to any perceived treatment, even if it is inactive.

41

Blinding

When subjects are unaware of the treatment they are receiving in an experiment.

42

Double-blinding

When both subjects and evaluators are unaware of the treatment assignments in an experiment.

43

Matched Pairs Design

A design where two treatments are compared based on responses from paired subjects, often involving single subjects receiving both treatments in random order.

44

Guess Strategy

A strategy in a standard literacy test where the test taker selects answers randomly when the correct answer is unknown.

45

Score 60-79

A range of scores in a standard literacy test considered passing but not superior, falling between 60 and 79.

46

Does not score 60-79

The probability of a test taker not achieving a score between 60 and 79 in a standard literacy test.

47

Strategy "Answer (c)" and Scores 80-100

The joint probability of a test taker choosing answer (c) and scoring between 80 and 100 in a standard literacy test.

48

Strategy "Longest Answer" or Scores 0-59

The probability of a test taker either choosing the longest answer or scoring between 0 and 59 in a standard literacy test.

49

Guess Strategy given Score 0-59

The probability of a test taker using the guess strategy given that their score falls between 0 and 59 in a standard literacy test.

50

Scored 80-100 given Strategy "Longest Answer"

The probability of a test taker scoring between 80 and 100 given that they chose the strategy "longest answer" in a standard literacy test.

51

Guess Strategy and Scoring 0-59 Independence

The assessment of whether the strategy "guess" and scoring between 0 and 59 are independent events in a standard literacy test.

52

Strategy "Longest Answer" and Scoring 80-100 Mutual Exclusivity

The evaluation of whether the strategy "longest answer" and scoring between 80 and 100 are mutually exclusive events in a standard literacy test.
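
The literacy-test cards above each name a probability computed from a two-way table of strategy by score range, but the table itself is not included in this set. The sketch below therefore uses entirely hypothetical counts (an assumption, not the original data) to show how each kind of probability would be worked out.

```python
from fractions import Fraction

# Hypothetical counts: rows are strategies, columns are score ranges.
counts = {
    "guess": {"0-59": 20, "60-79": 25, "80-100": 5},
    "answer (c)": {"0-59": 10, "60-79": 30, "80-100": 10},
    "longest answer": {"0-59": 10, "60-79": 25, "80-100": 15},
}
total = sum(sum(row.values()) for row in counts.values())


def p_strategy(strategy):
    """Marginal probability of a strategy."""
    return Fraction(sum(counts[strategy].values()), total)


def p_score(score):
    """Marginal probability of a score range."""
    return Fraction(sum(row[score] for row in counts.values()), total)


def p_joint(strategy, score):
    """Probability of a strategy AND a score range together."""
    return Fraction(counts[strategy][score], total)


p_not_60_79 = 1 - p_score("60-79")                       # complement rule
p_c_and_high = p_joint("answer (c)", "80-100")           # joint probability
p_longest_or_low = (p_strategy("longest answer") + p_score("0-59")
                    - p_joint("longest answer", "0-59"))  # general addition rule
p_guess_given_low = p_joint("guess", "0-59") / p_score("0-59")  # conditional
p_high_given_longest = (p_joint("longest answer", "80-100")
                        / p_strategy("longest answer"))

# Independent only if conditioning on the score leaves P(guess) unchanged.
independent = p_guess_given_low == p_strategy("guess")
# Mutually exclusive only if the two events never occur together.
mutually_exclusive = p_joint("longest answer", "80-100") == 0

print(p_not_60_79, p_c_and_high, p_longest_or_low)
print(p_guess_given_low, p_high_given_longest, independent, mutually_exclusive)
```

With these made-up counts, P(guess | 0-59) is 1/2 while P(guess) is 1/3, so the events are not independent, and 15 test takers both chose the longest answer and scored 80-100, so those events are not mutually exclusive.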

53

Cumulative Probability Distribution

A function, table, or graph linking each outcome with the probability that the random variable is less than or equal to that outcome.
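
As a small illustration, a cumulative distribution can be built by running totals over an ordinary probability distribution. The example below assumes the number of heads in two fair coin flips; the variable names are chosen only for this sketch.

```python
from fractions import Fraction
from itertools import accumulate

# Probability distribution for the number of heads in two fair coin flips.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

outcomes = sorted(pmf)
# Running total of probabilities: P(X <= x) for each outcome x.
cdf = dict(zip(outcomes, accumulate(pmf[x] for x in outcomes)))

for x in outcomes:
    print(f"P(X <= {x}) = {cdf[x]}")
```

The cumulative probabilities here are 1/4, 3/4, and 1; the last value of any cumulative distribution is always 1.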

54

Normal Distribution

Provides a model for how sample statistics vary under random sampling, often calculated using z-scores.

55

Central Limit Theorem

States that for sufficiently large sample sizes, the sampling distribution of the sample mean will be approximately normal, regardless of the shape of the population distribution.

56

Biased and Unbiased Estimators

Bias indicates the sampling distribution is not centered on the population parameter; unbiased estimators are centered on the population parameter.

57

Sampling Distribution for Sample Proportions

Focuses on the proportion of successes in a sample, approximating a normal distribution for large sample sizes.

58

Sampling Distribution for Differences in Sample Proportions

Deals with differences obtained by subtracting sample proportions of one population from another.

59

Sampling Distribution for Sample Means

The variance of sample means is the population variance divided by the sample size (σ²/n), so the standard deviation of sample means is σ/√n.

60

Sampling Distribution

The distribution of a statistic, such as a sample mean or sample proportion, over all possible samples of a given size from a population. For sample means, it has a mean equal to the population mean and a standard deviation equal to the population standard deviation divided by the square root of the sample size.
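
A simulation can make the sampling distribution of the sample mean concrete. The sketch below is an illustration under assumptions: the skewed population is generated artificially, and the sample size of 25 and the 2,000 repetitions are arbitrary choices.

```python
import math
import random
import statistics

rng = random.Random(42)  # seeded so the sketch is repeatable

# Hypothetical, strongly right-skewed population.
population = [rng.expovariate(1.0) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 25  # size of each sample
sample_means = [
    statistics.fmean(rng.sample(population, n)) for _ in range(2_000)
]

center = statistics.fmean(sample_means)       # should be close to mu
observed_sd = statistics.stdev(sample_means)  # should be close to sigma / sqrt(n)
predicted_sd = sigma / math.sqrt(n)

print(round(mu, 3), round(center, 3))
print(round(predicted_sd, 3), round(observed_sd, 3))
```

The simulated sample means center near the population mean and their spread is close to sigma divided by the square root of n, even though the population itself is far from normal.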

61

Confidence Interval

A range of values that is likely to contain the true population parameter with a certain level of confidence, typically expressed as (point estimate ± margin of error).
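
The "point estimate ± margin of error" form can be sketched for one common case, a one-proportion z-interval. The counts below (120 successes in a sample of 200) are hypothetical, and the interval assumes the usual large-sample conditions hold.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical sample: 120 successes out of 200.
successes, n = 120, 200
confidence = 0.95

p_hat = successes / n                                    # point estimate
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value, about 1.96
standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z_star * standard_error

lower, upper = p_hat - margin_of_error, p_hat + margin_of_error
print(f"{p_hat:.3f} ± {margin_of_error:.3f} -> ({lower:.3f}, {upper:.3f})")
```

For these made-up counts the interval is roughly 0.532 to 0.668.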

62

Standard Error

A measure of how much the sample statistic typically varies from the population parameter, calculated as the standard deviation of the sampling distribution.

63

Normality Assumption

The assumption that the sampling distribution of sample means or proportions is approximately normal if certain conditions are met, like the sample size being large enough.

64

Type I Error

Mistakenly rejecting a true null hypothesis in hypothesis testing, with a probability denoted as α (alpha).

65

Type II Error

Mistakenly failing to reject a false null hypothesis in hypothesis testing, with a probability denoted as β (beta).

66

Power of a Test

The probability of correctly rejecting a false null hypothesis, influenced by the sample size and significance level chosen for the test.
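
How sample size and significance level influence power can be seen in a rough calculation. The sketch below assumes a one-sided z-test for a proportion (H0: p = p0 against Ha: p > p0) with the normal approximation; the function name and all numeric values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(p0, p_true, n, alpha):
    """Approximate power of a one-sided z-test of H0: p = p0 vs Ha: p > p0."""
    z = NormalDist()
    se_null = sqrt(p0 * (1 - p0) / n)
    # Smallest sample proportion that leads to rejecting H0.
    cutoff = p0 + z.inv_cdf(1 - alpha) * se_null
    se_true = sqrt(p_true * (1 - p_true) / n)
    # Probability the sample proportion exceeds the cutoff when p = p_true.
    return 1 - z.cdf((cutoff - p_true) / se_true)


# Hypothetical values: null proportion 0.50, true proportion 0.60.
for n in (50, 200):
    for alpha in (0.05, 0.10):
        print(n, alpha, round(power_one_sided(0.50, 0.60, n, alpha), 3))
```

Under these assumptions, power rises when the sample size increases or when a larger significance level is chosen, and the probability of a Type II error is one minus the power.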

67

P-value

A measure that helps determine the significance of results in a hypothesis test; a small P-value indicates strong evidence against the null hypothesis.

68

Type I error

Occurs when the null hypothesis is rejected when it is actually true, leading to a false positive conclusion.

69

Type II error

Occurs when the null hypothesis is not rejected when it is false, resulting in a false negative conclusion.

70

Confidence Interval

A range of values that is likely to contain the true parameter being estimated, with a specified level of confidence.

71

Difference of Two Proportions

Refers to the contrast between two population proportions, often analyzed using hypothesis tests or confidence intervals.

72

t-distribution

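A probability distribution used in place of the normal distribution when the population standard deviation is unknown and must be estimated from the sample; it has heavier tails than the normal distribution, especially for small sample sizes, and approaches the normal distribution as the degrees of freedom increase.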

73

Standard Error

An estimate of the standard deviation of a sampling distribution, often used to calculate confidence intervals and conduct hypothesis tests for means.

74

Significance Test

A statistical method used to determine whether there is enough evidence to reject the null hypothesis in favor of an alternative hypothesis.

75

Type-I Error

Mistakenly rejecting a true null hypothesis, leading to the consumer agency discouraging customers from purchasing a new brand of air-conditioning unit that could actually save on electricity consumption.

76

Confidence Interval

A range of values that is likely to contain the true parameter, such as the 95% confidence interval for the mean difference in accidents per month between two departments.

77

Type-II Error

Mistakenly failing to reject a false null hypothesis, potentially resulting in a company not making necessary fixes, affecting future sales.

78

Paired Data

Involves one-sample analysis on the differences from paired data, like finding a 90% confidence interval of the mean improvement in test scores for a SAT preparation class.

79

P-Value

A measure that helps determine the strength of the evidence against the null hypothesis, as seen in the simulation example where a recalibration of machinery was deemed necessary based on the P-value.

80

Power

The probability of correctly rejecting a false null hypothesis, contrasting with Type II error, as illustrated in the scenario where the candidate's true support was 63% but might not be recognized due to a Type II error.

81

Hypothesis Test

Involves making a claim about a population parameter and testing it, like the significance test for the difference of two means in the example of comparing computer downtimes.

82

Parameter

A characteristic of a population, such as the mean electricity usage of a new brand of air-conditioning units, denoted by μ in hypothesis testing.

83

Chi-Square Statistic

The sum of weighted squared differences between observed and expected counts used in the Chi-Square test, denoted χ²: the sum over all categories of (observed - expected)² / expected.

84

P-value

The probability of obtaining a Chi-Square value at least as large as the one observed, assuming the null hypothesis is true.

85

Degrees of Freedom (df)

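The value that selects which Chi-Square distribution is used to find the critical value or P-value: the number of categories minus one for a goodness-of-fit test, and (rows - 1)(columns - 1) for tests on a two-way table.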

86

Goodness-of-Fit Test

A test to determine if a given theoretical distribution correctly describes a situation, problem, or activity.
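
The Chi-Square statistic, degrees of freedom, and P-value cards come together in a goodness-of-fit calculation. The sketch below uses hypothetical observed counts and a hypothetical claimed distribution over three categories. The P-value line relies on a closed form that holds only when df = 2; for other degrees of freedom a table or statistical software would be needed.

```python
import math

# Hypothetical data: observed counts and the claimed (theoretical) proportions.
observed = [50, 30, 20]
claimed_proportions = [0.4, 0.4, 0.2]

n = sum(observed)
expected = [n * p for p in claimed_proportions]

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # number of categories minus one

# For df = 2 ONLY, the upper-tail probability is exactly exp(-chi_square / 2).
p_value = math.exp(-chi_square / 2)

print(round(chi_square, 3), df, round(p_value, 4))
```

Here the statistic is 5.0 with 2 degrees of freedom and a P-value of about 0.082, so at the 0.05 significance level these made-up counts would not give enough evidence to reject the claimed distribution.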

87

Chi-Square Test for Independence

A test to determine if there is a significant association between two categorical variables.

88

Chi-Square Test for Homogeneity

A test to compare samples from two or more populations to see whether the distribution of a categorical variable is the same (homogeneous) across those populations.

89

Sampling Distribution for the Slope

The distribution of the sample slope b with mean μb and standard deviation σb.

90

Confidence Interval for the Slope

An interval estimate for the slope of the regression line using t-scores with degrees of freedom n-2.

91

Confidence Interval

A range of values that is likely to contain the true slope of the regression line with a certain level of confidence.

92

Null Hypothesis (H0)

The assumption that there is no relationship or no effect in a statistical test.

93

Residuals Plot

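A graph of the residuals (observed values minus predicted values) plotted against the explanatory variable or the predicted values in a regression analysis, used to check whether a linear model is appropriate.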

94

P-Value

The probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is true.

95

Least Squares Regression Line

The line that minimizes the sum of the squared differences between the observed values and the values predicted by the line.

96

Slope

The measure of the steepness of a line, indicating the rate of change of the dependent variable with respect to the independent variable.

97

Linear Relationship

A relationship between two variables that can be represented by a straight line.

98

Scatterplot

A graph that shows the relationship between two variables by displaying data points on a two-dimensional plane.
