statistics chapter 3

1

response variable

measures an outcome of a study. also called the dependent variable.

2

explanatory variable

attempts to explain the observed outcomes. also called the independent variable.

3

how to examine data

plot the data. use numerical summaries. look for overall patterns and striking deviations (outliers). if overall pattern is regular, use a compact mathematical model to describe it.

4

scatterplot

shows the relationship between two quantitative variables measured on the same individuals. explanatory variable on x-axis. response variable on y-axis.

5

explanatory/response variables

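x (explanatory) may explain or predict changes in y (response). x is used to predict the values of y. an association alone does not prove that changes in x cause changes in y.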

6

how to make a scatterplot

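put the explanatory variable on the x-axis and the response variable on the y-axis. scale and label the axes. plot each individual as a point. then look for an overall pattern and striking deviations (outliers) and describe the form of the scatterplot.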

7

how to describe a scatterplot

form is the pattern (linear or curved or clusters). direction is the association (positive or negative). strength is how closely the points follow a clear form such as a line (strong or moderately strong or weak).

8

outlier

an individual value that falls outside the overall pattern of the relationship.

9

positively associated

when above-average values of one tend to accompany above-average values of the other and below-average values also tend to occur together.

10

negatively associated

when above-average values of one tend to accompany below-average values of the other, and vice versa.

11

how to display categorical values in a scatterplot

use a different plotting symbol or color for each category.

12

correlation

measures the direction and strength of the linear relationship between two quantitative variables. a numerical measure that supplements the graph; it does not by itself prove the relationship is linear. standardized, no units. written r.
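
A minimal sketch of how r can be computed, assuming the usual sample formula (the average product of standardized values, dividing by n - 1). The data and the names `hours` and `scores` are hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r: sum of products of standardized values, divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    products = (((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)

# Hypothetical data: hours studied and test score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

r = correlation(hours, scores)
print(round(r, 3))  # close to 1: strong, positive linear relationship
```

Because both variables are standardized before multiplying, the result carries no units.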

13

r

  1. positive=positive association between variables. negative=negative association between variables.

14
  1. makes no distinction between explanatory and response variables. x or y does not matter.

15
  1. requires that both variables be quantitative.

16
  1. always between -1 and 1.

17
  1. does not describe curved relationships.

18
  1. like mean and SD, not resistant. strongly affected by a few outlying observations.
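
The properties of r listed above can be checked numerically. This is a small sketch on made-up data (all values are hypothetical): swapping x and y, changing units, fitting a curve, and adding one outlier.

```python
from statistics import mean

def correlation(xs, ys):
    """Sample correlation r computed from sums of deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (s_xx * s_yy) ** 0.5

# Hypothetical data.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 5, 4, 6, 7]

r_xy = correlation(x, y)
r_yx = correlation(y, x)  # roles of x and y swapped
r_rescaled = correlation([v * 100 for v in x], [v + 10 for v in y])  # change of units

# A perfect curved (quadratic) relationship that r does not describe.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [v ** 2 for v in curve_x]
r_curve = correlation(curve_x, curve_y)

# One outlying observation added to the original data.
r_with_outlier = correlation(x + [7], y + [-20])

print(r_xy, r_yx, r_rescaled, r_curve, r_with_outlier)
```

In this sketch the first three values agree, the curved data give r near 0 even though y is completely determined by x, and the single outlier flips the sign of r.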

19

r=0

no linear relationship. scattered.

20

r=.99

strong, positive linear relationship.

21

r=-.99

strong, negative linear relationship.

22

how to use correlation

correlation is not a complete description of two-variable data, even when the relationship is linear. give the means and SDs of both x and y along with the correlation. conclusions based on correlation alone may need rethinking once the data are described more completely.

23

r=1, r=-1

points lie exactly on a straight line.

24

least-squares regression

a straight line that describes how a response variable y changes as an explanatory variable x changes. often used to predict the value of y for a given value x. unlike correlation, requires an explanatory variable and a response variable.

25

least-squares regression line

the line that makes the sum of the squared vertical distances of the points in a scatterplot from the line as small as possible.

26

LSRL

ŷ=a + bx
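
A sketch of how a and b can be found, assuming the standard least-squares formulas b = r(s_y / s_x) and a = ȳ - b·x̄ (these formulas are not on the cards). The data are hypothetical.

```python
from statistics import mean, stdev

def correlation(xs, ys):
    """Sample correlation r from standardized values."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y) for x, y in zip(xs, ys)) / (n - 1)

def least_squares_line(xs, ys):
    """Return (a, b) for the line y-hat = a + b*x."""
    b = correlation(xs, ys) * stdev(ys) / stdev(xs)  # slope
    a = mean(ys) - b * mean(xs)                      # intercept
    return a, b

# Hypothetical data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

a, b = least_squares_line(hours, scores)
predicted = a + b * 3.5  # predicted score for 3.5 hours
print(round(a, 2), round(b, 2), round(predicted, 2))  # 46.0 6.0 67.0
```

The line always passes through the point (x̄, ȳ), which is what the intercept formula encodes.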

27

ŷ

predicted value.

28

y

observed value.

29

r²

_____% of the variation in the response variable (y) is accounted for by the regression line. a measure of how successful the regression was in explaining the response.
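
The percentage in this statement is commonly written r² (the square of the correlation). A minimal sketch on hypothetical data, computing it two ways: as 1 minus the ratio of leftover squared error to total variation in y, and as r squared.

```python
from statistics import mean

# Hypothetical data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

x_bar, y_bar = mean(hours), mean(scores)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in scores)  # total variation in y

b = s_xy / s_xx        # least-squares slope
a = y_bar - b * x_bar  # least-squares intercept

# Variation left over after fitting the line (sum of squared residuals).
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(hours, scores))

r_squared = 1 - sse / s_yy
r = s_xy / (s_xx * s_yy) ** 0.5
percent_explained = 100 * r_squared

print(round(percent_explained, 1), round(r ** 2, 4))
```

Here about 96% of the variation in the hypothetical scores is accounted for by the regression line.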

30

correlation and slope of LSRL

along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in the predicted y.

31

residual

the difference between an observed value of the response variable and the value predicted by the regression line. y - ŷ. the mean of the least-squares residuals of a LSRL is always zero. a computed mean slightly different from zero is due to roundoff error.
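
A short sketch of computing residuals for a least-squares line and checking that their mean is zero. The data are hypothetical, and the line is fitted with the standard least-squares formulas.

```python
from statistics import mean

# Hypothetical data: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]

x_bar, y_bar = mean(hours), mean(scores)
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, scores))
     / sum((x - x_bar) ** 2 for x in hours))  # slope
a = y_bar - b * x_bar                         # intercept

predicted = [a + b * x for x in hours]                 # y-hat for each x
residuals = [y - p for y, p in zip(scores, predicted)]  # observed minus predicted

print([round(e, 2) for e in residuals])  # [0.0, 2.0, -3.0, 0.0, 1.0]
print(round(mean(residuals), 10))        # 0.0, up to roundoff
```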

32

residual plot

a scatterplot of the regression residuals against the explanatory variable. helps us assess the fit of a regression line.

33

how to make a residual plot

plot the x values on the x-axis and the residuals on the y-axis. draw a line at zero. label the axes.

34

how to examine a residual plot

  1. a curved pattern shows the relationship is not linear. thus, a straight line is an inappropriate model.

35
  1. increasing or decreasing spread about the line shows that prediction of y will be less accurate where the spread is larger (for larger x if the spread increases, for smaller x if it decreases).

36
  1. individual points with large residuals are outliers in the vertical (y) direction because they lie far from the line that describes the overall pattern.

37
  1. individual points that are extreme in the x direction may not have large residuals, but can be very important.
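
To illustrate the first point in this list, here is a sketch that fits a straight line to deliberately curved, hypothetical data (y = x²) and prints the residuals in order of x. The correlation is high, yet the residuals show a clear curved pattern.

```python
from statistics import mean

# Hypothetical curved data: y is exactly x squared.
x = [0, 1, 2, 3, 4, 5, 6]
y = [v ** 2 for v in x]

x_bar, y_bar = mean(x), mean(y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

b = s_xy / s_xx        # least-squares slope
a = y_bar - b * x_bar  # least-squares intercept
r = s_xy / (s_xx * s_yy) ** 0.5

# Points for a residual plot: (x, residual) pairs.
residual_points = [(xi, yi - (a + b * xi)) for xi, yi in zip(x, y)]

print(round(r, 3))
for xi, e in residual_points:
    print(xi, round(e, 2))  # positive at the ends, negative in the middle
```

The U-shaped residuals signal that a straight line is an inappropriate model here, even though r is about 0.96.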

38

outlier

observation that lies outside the overall pattern of the other observations.

39

influential observation

removing the observation would markedly change the result of the calculation. points that are outliers in the x direction of a scatterplot are often influential observations for the LSRL. such a point often has a small residual because it pulls the regression line toward itself.
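
A sketch of checking whether a point is influential: fit the line with and without it and compare the slopes. The data are hypothetical, with the last point placed far out in the x direction.

```python
from statistics import mean

def fit(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - b * x_bar, b

# Hypothetical data; the last observation is extreme in x.
x = [1, 2, 3, 4, 5, 15]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

a_all, b_all = fit(x, y)
a_without, b_without = fit(x[:-1], y[:-1])  # drop the extreme point

# Residuals from the line fitted to all six points.
residuals = [yi - (a_all + b_all * xi) for xi, yi in zip(x, y)]
extreme_residual = residuals[-1]
largest_other = max(abs(e) for e in residuals[:-1])

print(round(b_without, 2), round(b_all, 2))
print(round(extreme_residual, 2), round(largest_other, 2))
```

In this sketch the slope drops from about 2 to about 0.6 when the extreme point is included, yet its own residual is smaller than the largest residual among the other points.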

40

how to analyze data for two variables

  1. plot your data in a scatterplot.

41
  1. interpret what you see: direction, form, strength. linear?

42
  1. numerical summary? x bar, y bar, SD x, SD y and r?

43
  1. mathematical model? regression line?
