Exam Review Lecture Notes

Cumulative Topics

  • Constructs / Operational Definitions: Understanding how abstract concepts are defined and measured is crucial.

  • Hypotheses: A testable statement about the relationship between variables.

Types of claims: Different types of assertions made in research:

  • frequency: assertions that describe the occurrence or rate of a particular phenomenon within a specific population. They report how often something happens without suggesting any cause-and-effect relationships.

  • association: a statement that indicates a relationship between two variables, suggesting that they are interconnected or correlated in some way.

    • Unlike frequency claims, which only report how often something occurs, association claims imply that changes in one variable are related to changes in another, but they do not specify the direction or causality of the relationship.

  • causal: a statement that indicates a cause-and-effect relationship between two variables, suggesting that changes in one variable directly result in changes in another variable.

    • This type of claim must be supported by evidence demonstrating that the relationship is not only correlated but also that one variable can be shown to influence the other.

Validity and Reliability: Key aspects of measurement.

  • Validity: the extent to which a test measures what it is supposed to measure.

  • Reliability: the consistency of a measurement.

  • Scales of measurement:

    • Nominal: Categorical data without inherent order (e.g., colors, types of fruit).

    • Ordinal: Data with a meaningful order but unequal intervals (e.g., rankings).

    • Interval: Data with equal intervals but no true zero point (e.g., temperature in Celsius).

    • Ratio: Data with equal intervals and a true zero point (e.g., height, weight).

  • Central tendency: Measures like mean, median, and mode that describe the typical value in a dataset.

  • Variance: A measure of how spread out the data are: the average squared deviation from the mean.

  • Confidence intervals: A range of values likely to contain a population parameter with a certain degree of confidence.

  • T-statistics: Used to determine if there is a significant difference between the means of two groups.

  • Sample size: The number of observations in a sample.

  • Variability: The extent to which data points in a statistical distribution or data set diverge from the average value.

  • Confidence: The degree of certainty that a sample accurately reflects the population.

  • Significance:

    • P-values: The probability of obtaining results as extreme as, or more extreme than, the observed results, assuming the null hypothesis is true.

    • Confidence intervals: Provide a range within which the true population parameter is likely to fall.
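The descriptive and inferential quantities listed above can all be computed directly. A minimal Python sketch using only the standard library; the sample data are invented, and the critical t value is hard-coded for illustration (in practice you would look it up or use a stats library):

```python
import math
import statistics

# Invented sample data for illustration.
sample = [4.0, 5.5, 6.0, 5.0, 4.5, 6.5, 5.5, 5.0]

# Central tendency
mean = statistics.mean(sample)      # arithmetic average
median = statistics.median(sample)  # middle value of the sorted data

# Variability: sample variance and standard deviation (n - 1 denominator)
var = statistics.variance(sample)
sd = math.sqrt(var)

# 95% confidence interval for the mean; t_crit = 2.365 is the two-tailed
# critical t for df = n - 1 = 7 (normally looked up, hard-coded here)
n = len(sample)
se = sd / math.sqrt(n)
t_crit = 2.365
ci = (mean - t_crit * se, mean + t_crit * se)

# Independent-samples t-statistic comparing two (invented) group means,
# using the pooled-variance formula
group_a = [4.0, 5.5, 6.0, 5.0]
group_b = [6.5, 7.0, 6.0, 7.5]
ma, mb = statistics.mean(group_a), statistics.mean(group_b)
va, vb = statistics.variance(group_a), statistics.variance(group_b)
na, nb = len(group_a), len(group_b)
pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
t_stat = (ma - mb) / math.sqrt(pooled_var * (1 / na + 1 / nb))
```

A large |t| relative to its critical value, or a confidence interval that excludes the null value, is what "significant" means in the bullets above.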

Exam Preparation

  • Focus on lecture notes.

  • Review lab materials, especially "Stats Labs".

  • Read assigned chapters.

  • Review InQuizitives.

Correlation

  • A weak positive correlation has a Pearson's r close to 0 but positive (e.g., Pearson's r = 0.17).
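Pearson's r is the covariance of the two variables divided by the product of their standard deviations. A from-scratch sketch (the example data are invented):

```python
import math

def pearson_r(x, y):
    """Pearson correlation: covariance over the product of the SDs."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = math.sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - my) ** 2 for yi in y))
    return cov / (sx * sy)

# Invented scores; these happen to give a fairly strong positive r.
r = pearson_r([1, 2, 3, 4, 5], [2, 1, 3, 5, 4])
```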

Cross-Correlation

  • Cross-sectional correlation: Measures the association between two variables at the same point in time.

  • Autocorrelation: Measures the association of a variable with itself over two different time points.

  • Cross-lag correlation: Measures the association between an earlier measure of one variable and a later measure of another variable; crucial for establishing directionality.

Hypothesis Testing for Correlations

  • If there is no relationship between two variables, the expected value of Pearson's r is 0.

  • The observed r is used as a test statistic.

  • If the observed r is unlikely to occur when the null hypothesis (H₀) is true, it indicates a significant relationship between the variables.

  • Confidence intervals can be calculated.

    • If the 95% confidence interval of r does not include zero, the correlation is statistically significant.
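The confidence-interval logic above can be sketched with the standard Fisher z-transformation approach to a 95% CI for r (the r = 0.17, n = 50 values below are invented for illustration):

```python
import math

def r_confidence_interval(r, n, z_crit=1.96):
    """95% CI for a correlation via the Fisher z-transformation."""
    z = math.atanh(r)              # transform r to z
    se = 1 / math.sqrt(n - 3)      # standard error of z
    lo, hi = z - z_crit * se, z + z_crit * se
    return math.tanh(lo), math.tanh(hi)  # back-transform to r units

lo, hi = r_confidence_interval(r=0.17, n=50)
significant = not (lo <= 0 <= hi)  # significant only if the CI excludes zero
```

Here the interval straddles zero, so a weak r = 0.17 with only n = 50 would not be statistically significant.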

Choosing the Right Statistic

  • Phi Coefficient: Used to compare 2 dichotomous variables.

  • Pearson’s r: Used to compare 2 continuous variables.

  • Point-biserial r: Used to compare 1 dichotomous and 1 continuous variable.

  • Spearman’s rho: Used to compare 2 ordinal variables or 1 ordinal and 1 continuous variable.

  • Multiple regression: Used when you have multiple predictors and one outcome variable.

Multiple Regression

  • Developing a linear regression model to predict a criterion variable using 2+ predictors.

Interpreting Coefficients in Multiple Regression

  • When examining the coefficient between a variable (e.g., HS Math Grades) and College GPA, it is important to consider whether the relationship is assessed:

    • Without controlling for other variables (a simple, significant relationship).

    • Controlling for other variables (assessing the unique contribution of the variable).

Example Regression Analysis

  • Given coefficients for High School Math Grades, High School Science Grades, High School English Grades, and Sex in predicting College GPA.

  • Regression equation:
    GPA_predicted = 0.60 + 0.17(Math) + 0.03(Science) + 0.05(English) - 0.01(Sex)

  • Interpretation:

    • Controlling for the effect of Science grades, English grades, and Sex, Math grades have a significant positive relationship with College GPA, b = 0.17, β = 0.35, t = 4.74, p < .001.

    • Controlling for the effect of the covariates, a 1-unit increase in High School Math Grades predicts a 0.17 increase in College GPA.
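The fitted equation can be checked by plugging in values. A sketch for a hypothetical student; the grade scale and the 0/1 coding of Sex are assumptions for illustration, not given in the notes:

```python
def predict_gpa(math_grade, science, english, sex):
    """Apply the coefficients from the example regression equation."""
    return 0.60 + 0.17 * math_grade + 0.03 * science + 0.05 * english - 0.01 * sex

# Hypothetical student (assumed 4-point grade scale, sex coded 0/1).
gpa = predict_gpa(math_grade=3.5, science=3.0, english=4.0, sex=1)
```

Raising the Math grade by exactly 1 unit while holding the other predictors fixed raises the prediction by exactly b = 0.17, which is what "controlling for the covariates" means for a regression coefficient.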

Linear Regression Equation

  • General form: Ŷ = a + b_yx · X

    • Ŷ = the predicted value of Y (the DV).

    • X = a given value of the IV.

    • a = the "y-intercept", the predicted value of Y when X = 0 (the "regression constant").

    • b_yx = slope of the regression line (the "regression coefficient").

  • b is in the units of Y.

  • "A 1-unit increase in X predicts a b-unit increase in Y."

  • b_yx = the expected change in Y for a 1-unit change in X.

  • The intercept ensures the line goes through the center of the data.

  • X = 0 does not need to be realistic or have meaning.

  • β (beta) = the "standardized" b value: the slope when X and Y are both converted to z-scores.
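The quantities above follow from the standard least-squares formulas for bivariate regression: b = cov(X, Y) / var(X), a = mean(Y) − b · mean(X), and β = b · (SD_X / SD_Y). A sketch with invented data:

```python
import math

# Invented (X, Y) data that are roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
mx = sum(x) / n
my = sum(y) / n

sxx = sum((xi - mx) ** 2 for xi in x)                       # sum of squares of X
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))    # cross-products
syy = sum((yi - my) ** 2 for yi in y)                       # sum of squares of Y

b = sxy / sxx                    # slope: expected change in Y per 1-unit change in X
a = my - b * mx                  # intercept: predicted Y when X = 0
beta = b * math.sqrt(sxx / syy)  # standardized slope (equals Pearson's r in bivariate regression)

def predict(x_new):
    return a + b * x_new         # Y-hat = a + bX
```

Because the intercept is defined as a = mean(Y) − b · mean(X), the fitted line always passes through the point (mean(X), mean(Y)), which is the "center of the data" note above.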