Revision

Overview of the RCSI Workshop

  • Presenter: Valéria Lima Passos

  • Affiliation: School of Pharmacy and Biomolecular Sciences, RCSI

  • Contact: valerialimapassos@rcsi.ie

Model's Goodness of Fit (GOF)

  • Definition

    • A statistical measure that determines how well the model explains the variability of the data.

  • Key Metric

    • R-squared (R²)

      • Measures the proportion of variation in the dependent variable that is explained by the independent variables.

Mean Values and Their Importance

  • Mean Y

    • Definition: The average of the observed values of Y.

      • Notation:

      • $\text{Mean } Y = \frac{\sum Y}{N}$

      • Importance: Serves as a reference point for evaluating deviations from expected values.

Total Variation and Sources

  • SSTOTAL (Total Variation)

    • Definition: The total sum of squared distances from the observed values to the mean.

    • Calculation:

      • $\text{SSTOTAL} = \sum (y_i - \text{Mean } Y)^2$

  • SSREGRESSION (Explained Variation)

    • Definition: The sum of squared distances from the predicted values to the mean.

    • Calculation:

      • $\text{SSREGRESSION} = \sum (\hat{y}_i - \text{Mean } Y)^2$

  • SSRESIDUAL (Unexplained Variation)

    • Definition: The sum of squared distances from the observed to the predicted values.

    • Calculation:

      • $\text{SSRESIDUAL} = \sum (y_i - \hat{y}_i)^2$
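The three sums of squares defined above can be computed directly from a fitted line. The following is a minimal sketch using only the Python standard library; the data values and the helper names `fit_line` and `sums_of_squares` are made up for illustration and do not come from the workshop.

```python
# Toy data (hypothetical values, for illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def sums_of_squares(x, y):
    """Return (SSTOTAL, SSREGRESSION, SSRESIDUAL) for the fitted line."""
    b0, b1 = fit_line(x, y)
    mean_y = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]  # predicted values
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_regression = sum((yh - mean_y) ** 2 for yh in y_hat)
    ss_residual = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return ss_total, ss_regression, ss_residual


ss_total, ss_regression, ss_residual = sums_of_squares(x, y)
print(ss_total, ss_regression, ss_residual)
```

For this toy data the three values come out at approximately 6.0, 3.6 and 2.4, up to floating-point rounding.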

Understanding of Errors in Regression

  • Residuals

    • Definition: The distance from the observed value to the predicted value according to the regression line.

    • Significance: Measures the deviation of observed data from the model prediction.

Explore SSTOTAL and Its Components

  • Formula Relation between Components

    • $\text{SSTOTAL} = \text{SSREGRESSION} + \text{SSRESIDUAL}$

    • Importance: This relationship is vital in understanding the model's explanatory power.
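Dividing this identity by SSTOTAL links it to R². The short derivation below assumes a least-squares fit that includes an intercept, which is the setting in which the identity holds. The numbers in the last line (6, 3.6, 2.4) are made up purely for illustration.

```latex
\begin{align*}
\text{SSTOTAL} &= \text{SSREGRESSION} + \text{SSRESIDUAL} \\
1 &= \frac{\text{SSREGRESSION}}{\text{SSTOTAL}} + \frac{\text{SSRESIDUAL}}{\text{SSTOTAL}} \\
R^2 &= \frac{\text{SSREGRESSION}}{\text{SSTOTAL}} = 1 - \frac{\text{SSRESIDUAL}}{\text{SSTOTAL}} \\
R^2 &= \frac{3.6}{6} = 1 - \frac{2.4}{6} = 0.6 \quad \text{(illustrative numbers)}
\end{align*}
```

In this illustration the model would account for 60% of the variation in Y.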

Examples and Interpretation of Variability

  • Interpretation(s):

    • Proportion (in %) of variability shared between Y and X.

    • Proportion of Y variation attributable to changes in X.

    • Proportion of Y variation explained by the regression model.

    • Terminology:

      • Use Variation (SS) or Variability, NOT Variance!

Real-World Application Illustrations

  • Dietary Restrictions and Lifespan

    • A study observed that genetic differences in mice explained significantly more variability in lifespan than dietary interventions.

    • Finding: Genetic variants account for about 24% of lifespan variation, while dietary changes only contributed about 7%.

    • Implication: Lifespan and responses to dietary changes are complex polygenic traits influenced by multiple genetic factors.

  • Genetics and Political Ideology

    • Research indicates that political beliefs are approximately 40% heritable based on twin studies.

    • Note: Heritability describes population-level variability, not individual predispositions.

    • Important Distinction: Genes influence traits indirectly through personality, education, income, and intelligence.

Interaction Terms in Multiple Linear Regression

  • Definition of Interaction

    • Interaction effect: A combined influence of two or more independent variables on the dependent variable.

    • Distinction: Two variables can influence the dependent variable without interaction.

    • Prediction complexity: Cannot predict the effect of one variable without considering the other.

  • Misinterpretation of Interaction

    • The term 'interaction' is frequently misunderstood. It means that one variable's influence is conditional on the level of another.
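The idea that one variable's effect depends on the level of another can be made concrete with a small sketch. The function names and all coefficient values below are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def predict(x1, x2, b0=1.0, b1=2.0, b2=0.5, b3=1.5):
    """Linear predictor with an interaction term b3 * x1 * x2.

    All coefficients are hypothetical defaults for illustration.
    """
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def effect_of_x1(x2, **coefs):
    """Change in the prediction when x1 goes from 0 to 1, at a fixed x2."""
    return predict(1, x2, **coefs) - predict(0, x2, **coefs)


# With interaction (b3 = 1.5): the effect of x1 depends on x2.
with_interaction = [effect_of_x1(x2) for x2 in (0, 1, 2)]

# Without interaction (b3 = 0): the effect of x1 is the same at every x2.
without_interaction = [effect_of_x1(x2, b3=0.0) for x2 in (0, 1, 2)]

print(with_interaction)     # [2.0, 3.5, 5.0]
print(without_interaction)  # [2.0, 2.0, 2.0]
```

When `b3` is zero, both variables still influence the prediction, but the effect of `x1` can be stated without knowing `x2`. When `b3` is non-zero, it cannot.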

Detailed Explanation of Interaction Effects

  • Terms for Interaction Phenomenon

    • Terms include: joint effects, synergism, antagonism, interaction, effect modification, and effect measure modification.

  • Distinction from Confounders: Unlike confounding variables, modifying variables do not create false associations where none exist.
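The contrast between a confounder and an effect modifier can be illustrated with two tiny made-up datasets. In this sketch the strata labels, outcome values and the helper `difference` are all hypothetical. In the first dataset the exposure has no effect within either stratum, yet the crude comparison shows one. In the second, the exposure effect genuinely differs between strata.

```python
def mean(values):
    return sum(values) / len(values)


def difference(rows, stratum=None):
    """Mean outcome in exposed minus unexposed, optionally within one stratum.

    Each row is a tuple (exposure, stratum, outcome).
    """
    selected = [r for r in rows if stratum is None or r[1] == stratum]
    exposed = [r[2] for r in selected if r[0] == 1]
    unexposed = [r[2] for r in selected if r[0] == 0]
    return mean(exposed) - mean(unexposed)


# Confounding: exposure is more common in the "old" stratum, which has a
# higher outcome regardless of exposure.
confounded = (
    [(1, "old", 10.0)] * 3 + [(0, "old", 10.0)] * 1
    + [(1, "young", 4.0)] * 1 + [(0, "young", 4.0)] * 3
)

# Effect modification: exposure is balanced across strata, but its effect
# is present only in the "old" stratum.
modified = (
    [(1, "old", 10.0)] * 2 + [(0, "old", 6.0)] * 2
    + [(1, "young", 4.0)] * 2 + [(0, "young", 4.0)] * 2
)

print(difference(confounded), difference(confounded, "old"), difference(confounded, "young"))
print(difference(modified), difference(modified, "old"), difference(modified, "young"))
```

In the confounded data the crude difference is 3.0 although both stratum-specific differences are 0.0, which is the false association a confounder can create. In the modified data the stratum-specific differences are 4.0 and 0.0: the association is real but varies by stratum.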

Examples and Applications in Variables Analysis

  • Case Studies of Interaction Effects

    • FEV Age Unadjusted vs Adjusted

      • Overestimated differences between smokers and non-smokers due to height distribution disparities.

    • FEV Interaction Example

      • $\text{FEV}_{\text{pred}} = \beta_0 + \beta_1\,\text{smoke} + \beta_2\,\text{age} + \beta_3\,(\text{age} \times \text{smoke})$

      • Changes in interpretation of interaction terms based on age thresholds.
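One way to see how the interpretation depends on age is to write the model out separately for each smoking group. The derivation below assumes that smoke is coded 0 for non-smokers and 1 for smokers, a coding the notes do not state. The symbol $\text{age}^{*}$ is introduced here only for illustration.

```latex
\begin{align*}
\text{smoke} = 0:&\quad \text{FEV}_{\text{pred}} = \beta_0 + \beta_2\,\text{age} \\
\text{smoke} = 1:&\quad \text{FEV}_{\text{pred}} = (\beta_0 + \beta_1) + (\beta_2 + \beta_3)\,\text{age} \\
\text{difference (smokers minus non-smokers)}:&\quad \beta_1 + \beta_3\,\text{age} \\
\text{difference} = 0 \text{ at}:&\quad \text{age}^{*} = -\frac{\beta_1}{\beta_3} \quad (\beta_3 \neq 0)
\end{align*}
```

Under this coding, $\beta_1$ alone is the smoker versus non-smoker difference at age 0, $\beta_3$ is the difference in age slopes between the two groups, and the sign of the smoking difference flips at $\text{age}^{*}$.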

Gene-Environment Interaction Patterns

  • Qualitative Patterns of Gene-Environment Interaction

    • Illustrated through gene-environment interaction types as per Haldane, 1938, indicating variations in traits across environments. (trait value vs environment)

Confounding vs Interaction in Statistical Analysis

  • Illustrated with examples such as the effect of maternal age on Down syndrome incidence.

  • Adjustments must take into account both conditions that affect the outcome.

Presenting Linear Regression Results

  • Example of a linear regression analysis comparing statin users versus non-users with detailed coefficients and confidence intervals.

  • Key statistics reported in the analysis include p-values, which indicate statistical significance.
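The quantities usually reported for a regression coefficient (estimate, standard error, confidence interval, p-value) can be sketched for a single binary predictor such as statin use. Everything in this example is an assumption made for illustration: the outcome values are invented, the function `slope_summary` is hypothetical, and the interval and p-value use a large-sample normal approximation instead of the t distribution that statistical software would normally apply to a sample this small.

```python
from math import sqrt
from statistics import NormalDist


def slope_summary(x, y, level=0.95):
    """Estimate, standard error, CI and two-sided p-value for the slope.

    Uses a normal approximation (an assumption of this sketch); real
    software would use a t distribution with n - 2 degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    b0 = mean_y - b1 * mean_x
    ss_residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se = sqrt(ss_residual / (n - 2) / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(b1 / se)))
    return {"coef": b1, "se": se, "ci": (b1 - z * se, b1 + z * se), "p": p_value}


# Hypothetical data: 0 = non-user, 1 = statin user; outcome values invented.
statin = [0, 0, 0, 0, 1, 1, 1, 1]
outcome = [4.1, 3.9, 4.3, 3.7, 3.0, 3.2, 2.8, 3.0]

result = slope_summary(statin, outcome)
print(result)
```

With a 0/1 predictor, the coefficient is the difference in mean outcome between users and non-users, which is about -1.0 for these invented values.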

Discussion on Model Adjustments and Interactions

  • Discussion on whether it is reasonable to expect an interaction between diabetes and age in relation to eGFR (estimated glomerular filtration rate).

  • Clinical implications: Understanding how diabetes modifies age-related declines in kidney function.
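A simple way to look for this kind of interaction is to compare the age slope of eGFR in people with and without diabetes. In a model containing a binary diabetes indicator, age, and their product, the least-squares interaction coefficient equals the difference between the two group-specific slopes. The sketch below uses invented eGFR values that carry no clinical meaning, and the helper `slope` is hypothetical.

```python
def slope(x, y):
    """Least-squares slope of y on x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Invented values for illustration only (not real patient data).
ages = [40, 50, 60, 70]
egfr_no_diabetes = [95, 88, 81, 74]
egfr_diabetes = [90, 78, 66, 54]

slope_no_diabetes = slope(ages, egfr_no_diabetes)  # eGFR change per year of age
slope_diabetes = slope(ages, egfr_diabetes)

# Difference in slopes = interaction (age x diabetes) coefficient
interaction = slope_diabetes - slope_no_diabetes

print(slope_no_diabetes, slope_diabetes, interaction)
```

In these invented data eGFR falls by about 0.7 per year without diabetes and about 1.2 per year with diabetes, so the interaction term is about -0.5: the age-related decline is steeper in the diabetes group.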

Importance of Adjustment in Statistical Models

  • Discussions reiterate the need for model adjustments to improve accuracy.

  • Emphasis on the influence of confounding variables and their effects on perceived relationships in health-related studies.