GIS Quiz 3 (copy)

Covariance: measure of how two random variables vary together; its sign shows the direction of their linear relationship

Scatter plot: used to determine whether a linear correlation exists between two variables

Covariance formula: captures degree to which pairs of points systematically vary around their respective means

High positive covariance: occurs when paired x and y values both tend to be above or below their means at the same time; if one is up, the other tends to be up too

High negative covariance: occurs when paired x and y values tend to be on opposite sides of their respective means; if one is up, the other tends to be down

Zero covariance: no systematic tendencies of any sort between paired x and y values
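
Worked example (covariance): a minimal sketch of the sample covariance described in the cards above, using only plain Python. The function name `covariance`, the n - 1 divisor (sample rather than population covariance), and the small data lists are assumptions chosen for illustration, not part of the original notes.

```python
def covariance(xs, ys):
    """Sample covariance: summed products of paired deviations from the means, over n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Each product is positive when x and y sit on the same side of their means,
    # negative when they sit on opposite sides.
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return sum(products) / (n - 1)


x = [1, 2, 3, 4, 5]
y_same_side = [2, 4, 6, 8, 10]     # up when x is up
y_opposite_side = [10, 8, 6, 4, 2]  # down when x is up
y_no_tendency = [4, 1, 0, 1, 4]     # no systematic linear tendency

cov_positive = covariance(x, y_same_side)
cov_negative = covariance(x, y_opposite_side)
cov_zero = covariance(x, y_no_tendency)
print(cov_positive, cov_negative, cov_zero)
```

The three hypothetical y lists line up with the three cards: same-side deviations give a positive value, opposite-side deviations give a negative value, and deviations that cancel out give zero.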

Pearson’s correlation coefficient: measure of the strength and direction of a linear relationship between two variables, ranging between -1 and 1, with -1 representing a perfect negative linear correlation and 1 representing a perfect positive linear correlation; r is close to zero when there is no linear correlation

Caveats of correlation: only works with interval or ratio data; only demonstrates linear relationship; correlation doesn’t imply causation
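
Worked example (Pearson's r): a small sketch of the coefficient and of the "only linear" caveat above. The function name `pearson_r` and the data are illustrative assumptions; the second dataset follows an exact curve, y = (x - 3)^2, yet has no linear trend.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariation of x and y scaled by the spread of each variable."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable has no variation")
    return sxy / math.sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_linear = pearson_r(x, [2, 4, 6, 8, 10])  # points fall exactly on a rising line
r_curved = pearson_r(x, [4, 1, 0, 1, 4])   # y = (x - 3)^2, a perfect but nonlinear relationship
print(r_linear, r_curved)
```

The curved dataset gives r = 0 even though y is completely determined by x, which is why a scatter plot is worth checking before trusting r.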

Regression: quantifies the relationship between variables where causation is implied; explanatory variables are termed independent and explained variables are termed dependent

Regression models: estimate the nature of the relationship between independent and dependent variables; in simple linear regression the relationship is expressed as a linear function in slope-intercept form (y = a + bx, where a is the intercept and b is the slope)

Least squares regression: mathematical procedure for finding the best-fitting line (or curve) for a given set of points by minimizing the sum of the squared differences between actual y values and predicted y values

Residual: difference between the observed and predicted values for y; residual = observed y - predicted y 
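
Worked example (least squares and residuals): a minimal sketch of a one-predictor least squares fit and its residuals. The helper name `fit_line` and the five data points are assumptions for illustration; the closed-form slope and intercept used here are the standard ones for simple linear regression.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of (observed y - predicted y)^2, the quantity least squares minimizes."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
intercept, slope = fit_line(x, y)

predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]  # residual = observed y - predicted y
best_sse = sum_squared_residuals(x, y, intercept, slope)
print(intercept, slope, residuals, best_sse)
```

Nudging the slope or intercept away from the fitted values and calling `sum_squared_residuals` again gives a larger total, which is what "best-fitting" means here.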

R2: statistical measure of how close the data are to the fitted regression line; percentage of the response variable variation explained by the linear model; always between 0 and 100%; a higher R2 means that the model fits the data better

R2 formula: explained variation / total variation

0% R2: indicates that the model explains none of the variability of the response data around its mean

100% R2: indicates that the model explains all the variability of the response data around its mean
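
Worked example (R2): a short sketch of R2 as explained variation divided by total variation for a one-predictor least squares line. The function name `r_squared` and the datasets are assumptions for illustration.

```python
def r_squared(xs, ys):
    """R2 = explained variation / total variation for a simple least squares line."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    total = sum((y - mean_y) ** 2 for y in ys)  # variation of y around its mean
    if sxx == 0 or total == 0:
        raise ValueError("R2 is undefined when x or y has no variation")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * x for x in xs]
    explained = sum((p - mean_y) ** 2 for p in predicted)  # variation captured by the line
    return explained / total


x = [1, 2, 3, 4, 5]
r2_partial = r_squared(x, [2, 4, 5, 4, 5])   # scattered around a rising line
r2_full = r_squared(x, [2, 4, 6, 8, 10])     # every point on the line
r2_none = r_squared(x, [4, 1, 0, 1, 4])      # fitted line is flat at the mean of y
print(f"{r2_partial:.0%} {r2_full:.0%} {r2_none:.0%}")
```

The three hypothetical datasets correspond to a partial fit, the 100% case, and the 0% case described in the cards above.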

Assumptions of regression: dependent variable (more precisely, the model residuals) should be normally distributed; predictors should not be strongly correlated with each other; observations should be independent of each other; at least 10 observations per independent variable (rule of thumb)

Difference between correlation and regression: regression attempts to establish causality (a direction of dependence is assumed, not proven) and produces an entire equation, while correlation is a single statistic

Spatial regression: attempts to account for variation across a landscape with one equation

Geographically weighted regression: coefficients are allowed to vary between areas across the landscape
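
Worked example (one equation vs. locally varying coefficients): a rough sketch of the idea behind geographically weighted regression, not a full GIS implementation. It assumes locations along a single axis, a Gaussian distance kernel, and a fixed bandwidth of 1.0; the names `weighted_fit` and `local_slope` and the data are hypothetical. At each target location, nearby observations get more weight in a weighted least squares fit, so the slope can differ from place to place.

```python
import math


def weighted_fit(xs, ys, weights):
    """Weighted least squares for one predictor; returns (intercept, slope)."""
    total_w = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def local_slope(target, locations, xs, ys, bandwidth=1.0):
    """Slope fitted at `target`, weighting each observation by a Gaussian of its distance."""
    weights = [math.exp(-0.5 * ((loc - target) / bandwidth) ** 2) for loc in locations]
    return weighted_fit(xs, ys, weights)[1]


# Ten observations along a west-to-east transect (positions 0..9).
locations = list(range(10))
xs = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
# In the west (positions 0-4) y = 1 * x; in the east (positions 5-9) y = 3 * x.
ys = [x * 1 for x in xs[:5]] + [x * 3 for x in xs[5:]]

# One equation for the whole landscape: every observation weighted equally.
global_slope = weighted_fit(xs, ys, [1.0] * len(xs))[1]

# Coefficients allowed to vary: one fit per target location.
west_slope = local_slope(0, locations, xs, ys)
east_slope = local_slope(9, locations, xs, ys)
print(global_slope, west_slope, east_slope)
```

The single global slope averages the two regimes, while the local fits recover a slope near 1 in the west and near 3 in the east.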


