What are the two most powerful and versatile approaches for investigating relationships between variables?
Correlation analysis
Regression analysis
What does a scatterplot show?
The graphical relationship between two quantitative variables measured on the same individual.
What does a scatterplot display?
Form: linear or nonlinear
Direction: positive or negative
Strength: none, weak, or strong
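What does the correlation measure in a scatterplot?
The direction and strength of the linear relationship between two quantitative variables
It is the standardized covariance: r = Cov(X,Y) / (sX sY), where sX and sY are the standard deviations of X and Y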
What does covariance mean?
How X and Y vary together
What does it mean when Cov(X,Y)>0?
X and Y tend to move in the same direction
What does it mean when Cov(X,Y)<0?
X and Y tend to move in opposite directions
What does it mean when Cov(X,Y)=0?
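X and Y are uncorrelated: there is no linear relationship between them. This does not by itself mean they are independent. Independent variables have zero covariance, but zero covariance can also occur when the relationship is nonlinear.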
What does it mean when the correlation (r) is close to -1 or 1?
A value close to -1 means a strong negative linear relationship; a value close to 1 means a strong positive linear relationship. A value of zero means there is no linear relationship.
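A minimal sketch of the covariance and correlation cards above, using only the Python standard library. The data and the function names (`sample_covariance`, `correlation`) are made up for illustration, and the sample (n - 1) version of covariance is assumed. The second data set shows a case where the covariance is zero although Y depends entirely on X.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_covariance(x, y):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    mx, my = mean(x), mean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Pearson r: covariance divided by the product of the standard deviations."""
    sx = math.sqrt(sample_covariance(x, x))
    sy = math.sqrt(sample_covariance(y, y))
    return sample_covariance(x, y) / (sx * sy)

# Illustrative data: Y tends to rise with X, so the covariance is positive.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
cov_xy = sample_covariance(x, y)
r_xy = correlation(x, y)

# Zero covariance without independence: v is fully determined by u (v = u^2).
u = [-2, -1, 0, 1, 2]
v = [ui ** 2 for ui in u]
cov_uv = sample_covariance(u, v)

print(cov_xy, round(r_xy, 4), cov_uv)
```

Unlike the covariance, r has no units and always lies between -1 and 1, which is why it is the quantity used to judge strength.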
True or false: correlation (r) is an indicator of a causal relationship between variables
False
True or false: a strong correlation between two variables means that a change in one variable causes a change in the other (causation)
False
How can we establish a legitimate causal connection statistically?
through the use of designed experiments
What can we use to test the significance of a correlation?
A t-test for a single parameter
It applies when the two variables are jointly normal
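As a rough sketch of that test, the usual statistic for the null hypothesis of zero correlation is t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom. The data and function names below are illustrative assumptions, not part of the original cards.

```python
import math

def pearson_r(x, y):
    """Pearson correlation computed from sums of squared deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_t_statistic(r, n):
    """t statistic for H0: population correlation = 0, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Illustrative data (hypothetical).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
df = len(x) - 2
t_stat = correlation_t_statistic(r, len(x))
print(round(r, 4), df, round(t_stat, 4))
```

The statistic is compared with a critical value from a t table with n - 2 degrees of freedom. For 3 degrees of freedom the two-sided 5% critical value is about 3.182, so with only five points this correlation would not be declared significant.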
True or false: correlation treats two variables X and Y as equals
True
What is simple linear regression?
It describes how the dependent variable (DV) changes as a single independent variable (IV) changes. It is used as a mathematical model to predict the value of the DV from a value of the IV.
What’s the purpose of drawing a line through the points on a scatterplot?
To give a compact description of how the DV depends on the IV
In the equation of the line, Y = a + bX, what does a signify?
The intercept: the mean value of the DV when the IV is zero
In the equation of the line, Y = a + bX, what does b signify?
The slope: the average amount by which the DV changes when the IV increases by 1 unit
How do we determine the best regression line? What is the most common method?
The line should be as close as possible to all the points.
The most common method is least squares
What is the least squares method?
Choose the line that makes the sum of the squared vertical distances of the data points from the line as small as possible
What is the purpose of the least-squares regression line?
To minimize SS(error), the portion of the variation in Y left unexplained by the regression line
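A small sketch of the least squares idea for one IV, assuming the line Y = a + bX. The closed-form estimates are b = Sxy / Sxx and a = mean(Y) - b * mean(X). The data and the names `fit_line` and `ss_error` are hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept a and slope b for the line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b

def ss_error(x, y, a, b):
    """Sum of squared vertical distances of the points from the line."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Illustrative data (hypothetical).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

a, b = fit_line(x, y)
best_sse = ss_error(x, y, a, b)

# Any other line has a larger SS(error); two nudged lines as a spot check.
sse_shifted = ss_error(x, y, a + 0.1, b)
sse_tilted = ss_error(x, y, a, b - 0.1)
print(a, b, best_sse, sse_shifted, sse_tilted)
```

Comparing the fitted line with the two nudged lines is only a spot check, not a proof, but it shows what "as small as possible" means in practice.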
What are the assumptions for applying the t-test for the significance of the slope?
Normal distribution (Y is normally distributed at each value of X)
Independent observations
How is the total variation in Y partitioned?
SS(regression): Variation in Y explained by the regression line
SS(error): Variation in Y unexplained by the regression line (residual)
What does the goodness-of-fit of the regression model mean?
Proportion of the total variation in Y accounted for by the regression model.
The closer R2 is to 1, the more of the variance of the DV is explained
Example: R2 = 0.71 means that 71% of the total variance of the DV is explained by the IV
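The partition and R2 can be checked numerically. This is a sketch with made-up data; the variable names (`ss_total`, `ss_regression`, `ss_error`) are assumptions chosen to mirror the cards, with SS(total) = SS(regression) + SS(error) and R2 = SS(regression) / SS(total).

```python
# Illustrative data (hypothetical).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

# Least-squares line y = a + b*x.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sxy / sxx
a = mean_y - b * mean_x
fitted = [a + b * xi for xi in x]

# Total variation in Y and its two parts.
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_regression = sum((fi - mean_y) ** 2 for fi in fitted)
ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

r_squared = ss_regression / ss_total
print(ss_total, ss_regression, ss_error, r_squared)
```

For these numbers about 60% of the variation in Y is accounted for by the line, and the remaining 40% is residual.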
True or false: in simple regression analysis, both the F-test and the t-test are used to test the significance of the single slope, and they lead to the same conclusion
True
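One way to see why the two tests agree: with a single IV, the F statistic equals the square of the slope's t statistic. The sketch below assumes the usual formulas F = MS(regression) / MS(error) and t = b / SE(b), with SE(b) = sqrt(MS(error) / Sxx); the data are made up.

```python
import math

# Illustrative data (hypothetical).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sxy / sxx
a = mean_y - b * mean_x

ss_regression = sum((a + b * xi - mean_y) ** 2 for xi in x)
ss_error = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# One slope, so 1 df for regression and n - 2 df for error.
ms_regression = ss_regression / 1
ms_error = ss_error / (n - 2)

f_stat = ms_regression / ms_error
t_stat = b / math.sqrt(ms_error / sxx)
print(round(f_stat, 4), round(t_stat, 4), round(t_stat ** 2, 4))
```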
What are the assumptions of linear regression?
Y values are independent and sampled at random from population
Relationship between X and Y is linear (linearity)
Y is distributed normally at each value of X (normality)
Variance of Y at every value of X is the same (homogeneity of variances)
What is residual analysis and what is its purpose?
A residual is the difference between the observed Yi and the predicted Ŷi
Plot the residuals against the Xi values
Standardized residuals can also be plotted
Purpose:
Examine functional form (linear vs non-linear)
Evaluate violations of assumptions
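A sketch of computing residuals for a fitted line, with made-up data. The standardized residual here is taken as the residual divided by sqrt(MS(error)), which is an assumption: it is one simple convention, and other definitions also adjust for each point's leverage.

```python
import math

# Illustrative data (hypothetical).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b = sxy / sxx
a = mean_y - b * mean_x

# Residual = observed Yi minus predicted Yi.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Simple standardization: divide by the residual standard error.
residual_se = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
standardized = [e / residual_se for e in residuals]

# Pairs to plot: residual against the corresponding Xi value.
for xi, e, z in zip(x, residuals, standardized):
    print(xi, round(e, 3), round(z, 3))
```

In a plot of these pairs, a curved pattern suggests a nonlinear form, and a spread that widens or narrows with X suggests unequal variances.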
What is the difference between simple and multiple linear regression?
Both simple and multiple have 1 DV
Simple has 1 IV
Multiple has 2 or more IVs
In the multiple regression model, what does bj mean?
The amount by which Y changes on average when Xj changes by one unit, holding all other X variables constant
How do we determine a and bj in multiple regression?
By least squares, the same as in simple regression
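A minimal sketch of least squares with two IVs, solving the normal equations (X'X) coef = X'y by Gaussian elimination in plain Python. The data are constructed (hypothetically) so that Y = 1 + 2*X1 + 3*X2 exactly, which makes it easy to see that the fit recovers a and the bj. The helper names are illustrative, and in practice a numerical library would normally do this step.

```python
def solve_linear_system(A, rhs):
    """Solve A * coef = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    coef = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (M[i][n] - known) / M[i][i]
    return coef

def fit_multiple(columns, y):
    """Least-squares estimates [a, b1, b2, ...] via the normal equations."""
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Constructed data: y = 1 + 2*x1 + 3*x2 with no noise.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]

a, b1, b2 = fit_multiple([x1, x2], y)
print(round(a, 6), round(b1, 6), round(b2, 6))
```

Here b1 is the average change in Y per unit change in X1 with X2 held constant, matching the interpretation of bj on the card.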