Stats Unit 1

Last updated 2:45 AM on 2/12/26

124 Terms

1
New cards

What is data?

set of measurements comprised of variables and cases

2
New cards

What are cases (units)?

the individuals or objects we obtain information about

3
New cards

What are variables?

characteristics of each case

4
New cards

What are categorical (qualitative) variables?

variables that divide cases into categories/groups

5
New cards

What are quantitative (numerical) variables?

variables that measure a numerical (unit) value for each case

6
New cards

What are explanatory variables?

variables that impact/explain another variable

7
New cards

What are response variables?

variables that change in response to the explanatory variable

8
New cards

What are descriptive statistics?

methods used to make sense of data by visualizing and summarizing it

9
New cards

What can we use to visualize categorical data?

frequency and relative frequency tables, proportions

10
New cards

What are frequency tables?

method that displays categorical data by showing the number of cases that fall into each group

11
New cards

What are proportions?

fraction (or percentage) of the cases that fall in a group

12
New cards

How is a proportion calculated?

number of cases in the category / sample size

13
New cards

What are relative frequency tables?

similar to frequency tables, but display the proportion rather than a number
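
A minimal sketch of a frequency table and a relative frequency table, assuming a small made-up list of eye colors (the data and function names are illustrative, not part of the deck). Each proportion is the count in a category divided by the sample size.

```python
from collections import Counter

# Hypothetical categorical data: one eye color per case.
eye_colors = ["brown", "blue", "brown", "green", "brown", "blue"]

def frequency_table(values):
    """Count how many cases fall into each category."""
    return dict(Counter(values))

def relative_frequency_table(values):
    """Proportion for each category = number in category / sample size."""
    n = len(values)
    return {category: count / n for category, count in Counter(values).items()}

freq = frequency_table(eye_colors)
rel_freq = relative_frequency_table(eye_colors)

print(freq)      # counts per category
print(rel_freq)  # proportions per category; they add up to 1
```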

14
New cards

What can we use to visualize quantitative data?

dot plots and histograms

15
New cards

What are dot plots?

graph in which the number of dots above each measurement represents the quantity of cases for that value

16
New cards

What is important about dot plots?

they’re more useful for smaller sets of data

17
New cards

What are histograms?

graph in which the height of each bar represents the number of cases within a range of values

18
New cards

What is important about histograms?

NOT the same as bar graphs, and better for large data sets

19
New cards

What does the shape of a graph give info about?

center, spread, and type of data

20
New cards

What does the term skewed refer to?

asymmetry in a graph; the direction of the skew is the side with the longer tail

21
New cards

What shape of distribution is this?

bell-shaped, symmetric, unimodal

22
New cards

What shape of distribution is this?

skewed right

23
New cards

What shape of distribution is this?

skewed left

24
New cards

What shape of distribution is this?

uniform

25
New cards

What shape of distribution is this?

asymmetric, bimodal

26
New cards

What shape of distribution is this?

symmetric, bimodal

27
New cards

What are the three measures of center?

mean, median, and mode

28
New cards

What is a mean?

average of a data set

29
New cards

How do you calculate a mean?

add up all values and divide by the number of values

30
New cards

What is a median?

middle of a data set

31
New cards

How do you calculate a median?

order data from smallest to largest; the median is the middle value, or the average of the two middle values when the number of values is even

32
New cards

What is a mode?

data value that occurs most often, can be multiple or none
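
A short sketch of the three measures of center using Python's standard `statistics` module. The `scores` list is made-up data chosen so the arithmetic is easy to check by hand.

```python
import statistics

# Hypothetical data with an even number of values and one large value (10).
scores = [2, 3, 3, 5, 7, 10]

# Mean: add up all values and divide by how many there are (30 / 6).
mean_value = statistics.mean(scores)

# Median: the data is sorted, and with six values the median is the
# average of the two middle values (3 and 5).
median_value = statistics.median(scores)

# Mode: multimode returns a list because there can be more than one mode.
# When every value occurs once, it returns all of the values.
modes = statistics.multimode(scores)

print(mean_value, median_value, modes)
```

Here the mean (5) is larger than the median (4) because the large value 10 pulls the mean upward, while the median stays put.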

33
New cards

How are mean and median related when shape of distribution is symmetric?

mean ≈ median

34
New cards

How are mean and median related when shape of distribution is skewed left?

mean < median

35
New cards

How are mean and median related when shape of distribution is skewed right?

mean > median

36
New cards

How is the mean affected when shape of distribution is skewed?

mean is pulled toward the longer tail (the direction of the skew)

37
New cards

What is an outlier?

observed value that is notably distinct from other values in a dataset

38
New cards

What is a resistant statistic?

statistic that is relatively unaffected by extreme values

39
New cards

What are some examples of resistant statistics?

median, interquartile range

40
New cards

What is a 5-number summary?

method of summarizing a dataset by its minimum, quartiles (the 25th, 50th, and 75th percentiles), and maximum

41
New cards

What does a 5-number summary consist of?

minimum - smallest number, 0%

first quartile (Q1) - median of the first half of the data, 25%

median (Q2) - middle number, 50%

third quartile (Q3) - median of the 2nd half of the data, 75%

maximum - largest number, 100%

42
New cards

How is range of a data set calculated?

from 5-number summary: max-min

43
New cards

How is interquartile range (IQR) of a dataset calculated?

from 5-number summary: Q3-Q1

44
New cards

How are the bounds for outliers calculated?

lower bound: Q1 - 1.5(IQR)

upper bound: Q3 + 1.5(IQR)
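
A sketch that ties together the 5-number summary, range, IQR, and the 1.5(IQR) outlier bounds. It assumes the "median of each half" method for quartiles, leaving the middle value out of both halves when the number of values is odd. Software packages often use slightly different quartile rules, so results can differ a little. The data and function names are illustrative.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]          # values below the median position
    upper = data[half + n % 2:]  # values above it (skips the middle value when n is odd)
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

def outlier_bounds(q1, q3):
    """Lower bound is Q1 - 1.5(IQR); upper bound is Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    return (q1 - 1.5 * iqr, q3 + 1.5 * iqr)

data = [1, 3, 4, 6, 7, 8, 10, 30]  # hypothetical data
summary = five_number_summary(data)
minimum, q1, median, q3, maximum = summary

data_range = maximum - minimum  # range = max - min
iqr = q3 - q1                   # IQR = Q3 - Q1
bounds = outlier_bounds(q1, q3)
outliers = [x for x in data if x < bounds[0] or x > bounds[1]]

print(summary, data_range, iqr, bounds, outliers)
```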

45
New cards

What is a population?

includes all individuals/objects of interest

46
New cards

What is a sample?

refers to all cases we get data from, subset of the population

47
New cards

What is a statistical inference?

inference made using data from a sample to get info/make predictions about a population

48
New cards

What is sampling variability?

idea that results vary from sample to sample: if you take a sample of five people, another sample of five people will probably not give the same results

every sample is subject to this
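
A small simulation sketch of sampling variability, assuming a made-up population of the numbers 1 to 100. It draws ten samples of five and records each sample mean; the means differ from sample to sample even though the population never changes.

```python
import random
import statistics

rng = random.Random(0)            # fixed seed so the run is repeatable
population = list(range(1, 101))  # hypothetical population: values 1..100

def sample_mean(pop, n, rng):
    """Draw n cases without replacement and return their mean."""
    return statistics.mean(rng.sample(pop, n))

# Ten different samples of five cases each.
means = [sample_mean(population, 5, rng) for _ in range(10)]

print(statistics.mean(population))  # population mean, the same every time
print(means)                        # sample means, which vary
```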

49
New cards

What is sampling bias?

bias that occurs when the method of selecting a sample causes the sample to differ from the population in some relevant way

if this occurs we can’t trust the generalizations from the sample to the population

50
New cards

What is a census?

way to collect data in which data is collected from every subject in the population

ideal way to collect data, but it is often impractical to reach every subject

51
New cards

What are simple random samples (SRS)?

way to collect data in which each unit has the same probability of being chosen

52
New cards

What are systematic samples?

way to collect data in which you choose every nth person

53
New cards

What is stratified sampling?

way to collect data in which you break the population into groups that might matter and then pull an SRS of each group

54
New cards

What are the three ways of random sampling?

simple random sampling, systematic sampling, and stratified sampling
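
A rough sketch of the three sampling methods, assuming a made-up population of 20 numbered people split into two hypothetical groups. The function names, group names, and sizes are illustrative only.

```python
import random

def simple_random_sample(population, n, rng):
    """Every unit has the same chance of being chosen."""
    return rng.sample(population, n)

def systematic_sample(population, step, rng):
    """Pick a random starting point, then take every step-th unit."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_group, rng):
    """Take a simple random sample from each group (stratum)."""
    return {name: rng.sample(members, n_per_group)
            for name, members in strata.items()}

rng = random.Random(1)
population = list(range(20))  # hypothetical people numbered 0..19
strata = {
    "first_years": population[:10],  # hypothetical group
    "seniors": population[10:],      # hypothetical group
}

srs = simple_random_sample(population, 5, rng)
systematic = systematic_sample(population, 4, rng)
stratified = stratified_sample(strata, 2, rng)

print(srs, systematic, stratified)
```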

55
New cards

What is important about random sampling?

random sampling helps keep sampling bias out of a data set; if we don’t sample randomly, bias is likely to occur

56
New cards

What are the types of bias?

dependent sampling, voluntary response, response bias, and non-response bias

57
New cards

What is dependent sampling?

sampling scheme that produces observations that are related to each other rather than independent

58
New cards

What is voluntary response?

only getting volunteers for a survey

59
New cards

What is response bias?

occurs when the wording of a question impacts the respondents’ answer

60
New cards

What is non-response bias?

when participants selected for the survey don’t complete it

often driven more by fear of how a response might affect the person being surveyed than by a lack of time or forgetting

61
New cards

What are box plots?

visualization of a 5-number summary; each section represents 25% of the data

62
New cards

What are the three measures of spread?

range, variance, standard deviation

63
New cards

What is range?

biggest value - smallest value

64
New cards

What is variance (s²)?

average squared deviation from the mean

65
New cards

What is standard deviation (s)?

typical distance of the values from the mean; the square root of the variance

66
New cards

What is the 95% rule/empirical rule?

used for data that is symmetric/bell-shaped: about 68% of values are within one s of the mean, about 95% are within two s of the mean, and about 99.7% are within three s of the mean
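
A sketch of sample variance and standard deviation computed by hand and checked against the standard `statistics` module, followed by a count of how much of the data falls within one and two standard deviations of the mean. The data is made up, and the sample variance here divides by n - 1, which is the usual convention for s². A dataset this small will only roughly follow the 68% and 95% figures.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
n = len(data)
mean = sum(data) / n

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# Standard deviation: square root of the variance.
s = math.sqrt(variance)

def fraction_within(values, center, spread, k):
    """Fraction of values within k standard deviations of the mean."""
    inside = [x for x in values if abs(x - center) <= k * spread]
    return len(inside) / len(values)

within_one = fraction_within(data, mean, s, 1)
within_two = fraction_within(data, mean, s, 2)

print(variance, statistics.variance(data))  # the two should agree
print(s, within_one, within_two)
```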

67
New cards

How do we describe quantitative data?

CUSS statements: center, unusual points, shape, spread

68
New cards

How do we describe categorical data?

talk about proportions, counts, and percents

69
New cards

What are z-scores?

tells you how many standard deviations away from the mean a data point is

70
New cards

When a z-score is further from zero, what does that mean?

the value is more extreme

71
New cards

What is the equation for a z-score?

z = (x-mean)/s
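
A minimal sketch of the z-score formula. The exam scores, mean of 70, and standard deviation of 10 are made-up numbers for illustration.

```python
def z_score(x, mean, s):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / s

# Hypothetical exam with mean 70 and standard deviation 10.
z_high = z_score(85, 70, 10)  # 15 points above the mean
z_low = z_score(55, 70, 10)   # 15 points below the mean

print(z_high, z_low)
```

The two scores are equally extreme (same distance from zero) but lie on opposite sides of the mean, which the sign of the z-score shows.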

72
New cards

What is an experiment?

study in which the researcher actively controls one or more of the explanatory variables

73
New cards

What is an observational study?

study in which the researcher doesn’t actively control the value of any variable but simply observes values as they naturally exist

74
New cards

How are two variables determined to be associated?

values of one variable tend to be related to the values of the other variable

75
New cards

How are two variables determined to be causally associated?

changing the value of one variable influences the value of the other variable

76
New cards

What is a randomized experiment?

value of explanatory variable for each unit is determined randomly, before response variable is measured

77
New cards

What are the two types of randomized experiments?

comparative and matched pairs

78
New cards

What is a randomized comparative experiment?

randomly assign cases to different treatment groups, then compare results on the response variable

79
New cards

What is a matched pairs experiment?

each case gets both treatments in random order, and we examine individual differences in the response variable between the two treatments
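
A sketch of the random step in each design, assuming eight made-up participants and two treatments labeled "A" and "B". The group names, labels, and function names are hypothetical. In the comparative design, cases are shuffled and split into groups; in the matched pairs design, every case gets both treatments and only the order is random.

```python
import random

def randomly_assign(cases, rng):
    """Randomized comparative design: shuffle cases, then split into two groups."""
    shuffled = list(cases)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

def matched_pairs_orders(cases, rng):
    """Matched pairs design: each case gets both treatments in a random order."""
    orders = {}
    for case in cases:
        order = ["A", "B"]
        rng.shuffle(order)
        orders[case] = tuple(order)
    return orders

rng = random.Random(2)
cases = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "p8"]  # hypothetical participants

groups = randomly_assign(cases, rng)
orders = matched_pairs_orders(cases, rng)

print(groups)
print(orders)
```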

80
New cards

What is a confounding variable?

third variable related to both explanatory and response variables, can offer a plausible explanation for an association between two variables of interest, reason why we utilize random assignment, difficult to avoid in observational studies

81
New cards

What is a two-way table?

shows the relationship between 2 categorical variables

82
New cards

How do we determine categorical variables are associated?

one variable changes the likelihood of certain values for another variable
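
A sketch of checking for association with a two-way table, assuming made-up data on class year and car ownership. The idea is to compare the proportion of "yes" answers within each row; if the proportions differ noticeably between rows, that suggests the variables are associated.

```python
from collections import Counter

# Hypothetical cases: (class year, owns a car).
cases = [
    ("first-year", "yes"), ("first-year", "no"),
    ("first-year", "no"), ("first-year", "no"),
    ("senior", "yes"), ("senior", "yes"),
    ("senior", "yes"), ("senior", "no"),
]

# Two-way table stored as counts for each (row, column) pair.
counts = Counter(cases)

def conditional_proportion(counts, row, col):
    """Proportion of cases in `row` that also fall in `col`."""
    row_total = sum(c for (r, _), c in counts.items() if r == row)
    return counts[(row, col)] / row_total

p_first_year = conditional_proportion(counts, "first-year", "yes")
p_senior = conditional_proportion(counts, "senior", "yes")

print(p_first_year, p_senior)
```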

83
New cards

How do we determine association between numerical and categorical variables?

compare the center of the numerical variable across the categories; if one category’s mean sits well away from another’s relative to the spread (standard deviation), that suggests an association

84
New cards

What is a correlation?

measurement used to describe how strong a linear relationship exists between numerical variables

85
New cards

What is important about correlations?

ONLY occur between numerical variables

86
New cards

What is a positive correlation?

as one variable increases, so does the other

87
New cards

What is a negative correlation?

as one variable increases, the other decreases

88
New cards

What is no correlation?

no distinct pattern/relationship

89
New cards

What are scatterplots?

visualization that allows us to see the style of correlation between two quantitative variables

90
New cards

What are the axes of scatterplots?

x: explanatory variable, y: response variable

91
New cards

How do we describe scatterplots?

DOFS statements: direction, outliers, form (linear or not), strength

92
New cards

What is a linear association?

when an explanatory variable increases, the response changes at a constant rate

93
New cards

How do we interpret correlation?

values close to -1 and 1 indicate a strong linear relationship, the sign of the correlation indicates the direction (positive or negative), and values close to 0 indicate little or no linear relationship
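
A sketch of computing the correlation r by hand, assuming the common formula that multiplies the z-scores of x and y for each case, adds the products, and divides by n - 1. The study-hours data and the function name are made up.

```python
import statistics

def correlation(xs, ys):
    """Correlation r: sum of (z-score of x) * (z-score of y), divided by n - 1."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    total = sum(((x - mean_x) / s_x) * ((y - mean_y) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

hours = [1, 2, 3, 4, 5]   # hypothetical explanatory variable
scores = [2, 4, 5, 4, 5]  # hypothetical response variable

r = correlation(hours, scores)                        # positive, fairly strong
r_perfect_negative = correlation([1, 2, 3], [6, 4, 2])  # points on a falling line

print(r, r_perfect_negative)
```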

94
New cards

What is the coefficient of determination (r²)?

proportion of the variation in the response variable that is explained by the explanatory variable

95
New cards

What is a least squares regression line?

line that best represents the relationship between two quantitative variables in a scatterplot

96
New cards

How do we calculate a least squares line?

slope: b = r(s_y / s_x)

y-intercept: a = (y bar) - b(x bar)
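
A sketch of the slope and intercept formulas, assuming made-up study-hours data and a hypothetical function name. The correlation r is computed inside the function so the block stands alone.

```python
import statistics

def least_squares_line(xs, ys):
    """Return (a, b) for the line: predicted y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    # Correlation r between x and y.
    r = sum((x - mean_x) * (y - mean_y)
            for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * (s_y / s_x)      # slope
    a = mean_y - b * mean_x  # y-intercept
    return a, b

hours = [1, 2, 3, 4, 5]   # hypothetical explanatory variable
scores = [2, 4, 5, 4, 5]  # hypothetical response variable

a, b = least_squares_line(hours, scores)
print(a, b)
```

With this data the slope works out to 0.6, so each extra hour predicts a 0.6 point increase in score, and the intercept of 2.2 is the predicted score at zero hours.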

97
New cards

How do we interpret a least squares regression line?

slope represents predicted change in response variable given a one unit increase in the explanatory variable, intercept represents predicted value for the response variable when the explanatory variable equals zero

98
New cards

What is a residual?

difference between observed and predicted values of a response variable

99
New cards

How do we determine a residual?

observed - predicted

100
New cards

What does a positive residual indicate?

the observed point lies above the least squares regression line, so the line underpredicted the value
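
A minimal sketch of a residual, assuming a hypothetical regression line with intercept 2.2 and slope 0.6 and one made-up observed point.

```python
def residual(observed, predicted):
    """Residual = observed value - predicted value."""
    return observed - predicted

# Hypothetical line: predicted = a + b * x
a, b = 2.2, 0.6
x, observed = 4, 5

predicted = a + b * x              # what the line predicts at x = 4
res = residual(observed, predicted)

# A positive residual means the point sits above the line.
print(predicted, res)
```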