Unit 2: Exploring Two-Variable Data


1

What is univariate data?

a one-variable data set

2

What is bivariate data?

a data set that describes the relationship between 2 variables

3

What is a response variable?

the variable that measures the outcome of a study

4

What is an explanatory variable?

the variable that may help predict or explain changes in a response variable

5

What does a scatterplot show?

the relationship (association) between two quantitative variables measured on the same individuals

6

Why do we study relationships between 2 variables?

to understand how changes in one variable relate to changes in another, which helps us predict and explain outcomes

7

How do you make a scatterplot?

  1. Label the axes

  2. Scale the axes

  3. Plot individual data values

8

How do you describe a scatterplot?

using DUFS – direction, unusual features, form and strength

9

What is the easiest way to lose points when making a scatterplot?

not labeling the axes

10

How do you know which variable to put on what axis?

the explanatory variable always goes on the x-axis, and the response variable goes on the y-axis; if there is no explanatory variable, either variable can go on the x-axis

11

Where do you start each axis of a scatterplot?

at a number smaller than the smallest value of that variable

12

When do 2 variables have a positive association?

when above average values of one variable tend to accompany above average values of the other variable

13

When do 2 variables have a negative association?

when above average values of one variable tend to accompany below average values of the other variable

14

When do 2 variables have no association?

if knowing one variable does not help us predict the value of the other variable

15

How do you describe direction of a scatterplot?

positive association, negative association, or no association

16

How do you describe form of a scatterplot?

linear or nonlinear

17

How do you describe strength of a scatterplot?

weak, moderate, or strong

18

How do you describe unusual features of a scatterplot?

outliers or clusters

19

What does correlation r measure?

the direction and strength of the linear relationship between two quantitative variables
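
As a rough illustration of what r measures, the sketch below computes it as the sum of the products of the standardized x and y values, divided by n - 1. The function name `correlation` and the hours/scores data are hypothetical, made up only for this example.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r: sum of z-score products divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    sx = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sy = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Standardize each value, multiply the pairs, and sum.
    products = [((x - mean_x) / sx) * ((y - mean_y) / sy) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)

# Hypothetical data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = correlation(hours, scores)
print(round(r, 3))  # positive and close to 1: a strong positive linear relationship
```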

20

What interval does r always fall between?

-1 to 1

21

What does r > 0 indicate?

a positive association

22

What does r < 0 indicate?

a negative association

23

Which values of r occur only in the case of a perfect linear relationship?

the extreme values r = -1 and r = 1

24

What do values of r close to -1 or 1 indicate?

a very strong linear relationship

25

What do values of r close to 0 indicate?

a very weak linear relationship

26

What does correlation not imply?

causation

27

What type of relationship should correlation be used for?

linear

28

True or false: Correlation is not a resistant measure of strength

true
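
To see why r is not resistant, the sketch below adds a single unusual point to a perfectly linear data set and recomputes r. The data and the helper name `correlation` are hypothetical, chosen only to make the effect easy to see.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data lying exactly on the line y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
r_clean = correlation(xs, ys)

# Add one point, (6, 0), that falls far from the pattern.
r_with_outlier = correlation(xs + [6], ys + [0])

print(round(r_clean, 3), round(r_with_outlier, 3))
```

A single point is enough to pull r from a perfect 1 down to a value near 0.14 in this made-up data set.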

29

True or false: You can determine the form of a relationship using only correlation

false

30

How do you find r on a calculator?

after entering the values in the lists, press STAT → CALC → 8: LinReg(a+bx) → Calculate

31

What does correlation require of both variables?

that they be quantitative

32

What does correlation make no distinction between?

explanatory and response variables

33

True or false: r does not change when we change the units of measurement of x, y, or both

true
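
A quick check of this property, as a sketch: the hypothetical heights and weights below are converted from inches and pounds to centimeters and kilograms, and r is computed both ways. The data and the helper name `correlation` are assumptions for illustration.

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical measurements in inches and pounds.
heights_in = [60, 62, 65, 68, 70]
weights_lb = [110, 120, 140, 150, 170]

# The same measurements in centimeters and kilograms.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]

r_original = correlation(heights_in, weights_lb)
r_converted = correlation(heights_cm, weights_kg)
print(round(r_original, 6), round(r_converted, 6))  # the two values agree
```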

34

What is a regression line?

a line that describes how a response variable (y) changes as an explanatory variable (x) changes

35

What form are regression lines expressed in?

ŷ = a + bx

36

What is the regression line used to predict?

the value of y for a given value of x

37

What is extrapolation?

the use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line

38

Why is extrapolation dangerous?

there is no guarantee the linear pattern we see will continue beyond the given data
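
One way to guard against accidental extrapolation is to check whether x lies inside the range of x-values used to fit the line before trusting a prediction. The sketch below assumes a hypothetical line ŷ = 50 + 5x fitted on x-values from 1 to 5. As a simplification it flags any x outside that range, not only values far outside it.

```python
def predict(a, b, x, x_min, x_max):
    """Return (predicted y, is_extrapolation) for the line y-hat = a + b*x."""
    y_hat = a + b * x
    # Simplification: any x outside the observed range counts as extrapolation.
    is_extrapolation = x < x_min or x > x_max
    return y_hat, is_extrapolation

# Hypothetical regression line and observed x-range.
a, b = 50.0, 5.0
x_min, x_max = 1, 5

inside = predict(a, b, 3, x_min, x_max)    # x within the data
outside = predict(a, b, 20, x_min, x_max)  # x well beyond the data
print(inside, outside)
```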

39

True or false: The regression line will pass exactly through all the points in a scatterplot

false

40

What is a residual?

the difference between an observed value of the response variable and the value of y predicted by the regression line

41

What is the equation to find a residual?

y - ŷ
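
A small worked example of the formula, assuming a hypothetical regression line ŷ = 50 + 5x and one hypothetical observation (x = 3, y = 61):

```python
def residual(y_observed, a, b, x):
    """Residual = observed y minus predicted y, with y-hat = a + b*x."""
    y_hat = a + b * x
    return y_observed - y_hat

# Hypothetical line and observation.
a, b = 50.0, 5.0
x, y = 3, 61

res = residual(y, a, b, x)
direction = "more" if res > 0 else "less"
print(f"The actual value is {abs(res)} {direction} than the value predicted with x = {x}")
```

A negative residual means the point lies below the line, so the line overpredicted that observation.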

42

How do you interpret a residual?

give the size and direction of the residual

The actual value of [response variable] is [residual value] more/less than the value predicted by the regression line with x = [explanatory variable]

43

What does a represent in the regression line equation ŷ = a + bx?

the y-intercept, the predicted value of y when x = 0

44

What does b represent in the regression line equation ŷ = a + bx?

the slope, the amount by which the predicted value of y changes when x increases by 1 unit

45

How do you interpret slope?

The predicted value of [response variable] goes up/down by b for each additional [unit of x].

46

How do you interpret the y-intercept?

The predicted [response variable] of a [individual] that has 0 [units of x] is [y-intercept].

47

What regression line do we want?

the one that minimizes the sum of the squared residuals

48

What is the least-squares regression line?

the line that makes the sum of the squared residuals as small as possible

49

What point is always guaranteed to be on the least squares regression line?

(x̄, ȳ)
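
The sketch below fits a least-squares line with the standard formulas b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄, then checks that the line passes through (x̄, ȳ). The function name `least_squares_line` and the data are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = mean_y - b * mean_x  # intercept chosen so the line hits (x-bar, y-bar)
    return a, b

# Hypothetical data: hours studied (x) and quiz score (y).
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

a, b = least_squares_line(hours, scores)
mean_x = sum(hours) / len(hours)
mean_y = sum(scores) / len(scores)

# The prediction at x-bar equals y-bar.
print(a, b, a + b * mean_x, mean_y)
```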

50

How do outliers affect the least squares regression line?

they strongly influence the line

51

What is a residual plot?

a scatterplot of the residuals on the vertical axis and the explanatory variable on the horizontal axis

52

How do you find the residual plot on a calculator?

2nd → y= → Plot 1 → Enter → Adjust settings → Zoom → 9: ZoomStat → Enter

53

How does a residual plot work?

it magnifies the deviations of the points from the line, making it easier to see unusual observations and patterns

54

What is the purpose of a residual plot?

to assess linearity with a tool other than the actual scatterplot

55

What do you look for in a residual plot?

randomly scattered points above and below the horizontal line at residual = 0, which represents the regression line

56

How can you tell if a linear model is appropriate?

if the residual plot shows no obvious patterns

57

What is the standard deviation of the residuals?

s, which gives the typical size of a prediction error (residual)

58

How do you calculate the standard deviation of the residuals?

2nd → STAT → MATH → 7: stdDev( → 2nd → STAT → RESID
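
For comparison with the calculator route, here is a sketch that computes s by hand. It assumes the common textbook convention s = √(Σ residuals² / (n - 2)). A plain sample standard deviation of the residual list divides by n - 1 instead, so the two results can differ slightly. The data, the line ŷ = 46 + 6x (the least-squares line for this made-up data), and the function name are hypothetical.

```python
from math import sqrt

def residual_std_dev(xs, ys, a, b):
    """Standard deviation of the residuals for y-hat = a + b*x (divisor n - 2)."""
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sum_sq = sum(e ** 2 for e in residuals)
    # Assumption: divide by n - 2, the usual convention for a fitted line.
    return sqrt(sum_sq / (len(xs) - 2))

# Hypothetical data and its least-squares line.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
a, b = 46.0, 6.0

s = residual_std_dev(hours, scores, a, b)
print(round(s, 2))  # typical size of a prediction error, in score units
```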

59

How do you interpret the standard deviation of the residuals?

Using the LSRL that predicts [y] using [x], we will typically be off by about s [units of y] in our predictions

60

What is the coefficient of determination?

r², which measures the percent reduction in the sum of squared residuals when using the least-squares regression line to make predictions, rather than the mean value of y
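
The "percent reduction" idea can be sketched directly: compare the sum of squared residuals from the regression line with the sum of squared deviations from the mean of y. The data, the line ŷ = 46 + 6x (the least-squares line for this made-up data), and the function name `r_squared` are hypothetical.

```python
def r_squared(xs, ys, a, b):
    """Coefficient of determination: 1 - (SS of residuals) / (SS about the mean of y)."""
    mean_y = sum(ys) / len(ys)
    # Squared prediction errors when using the regression line.
    ss_line = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Squared prediction errors when using the mean of y for every prediction.
    ss_mean = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_line / ss_mean

# Hypothetical data and its least-squares line.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
a, b = 46.0, 6.0

r2 = r_squared(hours, scores, a, b)
print(f"{r2:.1%} of the variation in score is accounted for by the line")
```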

61

How do you calculate r²?

STAT → CALC → 8: LinReg(a+bx)

62

How do you interpret r²?

[r² as a percentage]% of the variation in [y variable] is accounted for by the least-squares regression line with x = [x variable]

63

What are points with high leverage in regression?

points that have much larger or much smaller x-values than the other points in the data set

64

What is an outlier in regression?

a point that does not follow the pattern of the data and has a large residual

65

What is an influential point in regression?

any point that, if removed, substantially changes the slope, y-intercept, correlation, coefficient of determination, or standard deviation of the residuals
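
A simple way to check whether a point is influential is to fit the line with and without it and compare. The sketch below does this for the slope only, using hypothetical data in which the last point, (15, 5), has high leverage. The function name `least_squares_line` is also an assumption.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical data: five points on y = 2x plus one high-leverage point.
xs = [1, 2, 3, 4, 5, 15]
ys = [2, 4, 6, 8, 10, 5]

a_with, b_with = least_squares_line(xs, ys)
a_without, b_without = least_squares_line(xs[:-1], ys[:-1])

print(round(b_with, 3), round(b_without, 3))
```

Removing the one point changes the slope from roughly 0.08 to 2 in this made-up data, which is the kind of substantial change that marks a point as influential.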
