Revision
Overview of the RCSI Workshop
Presenter: Valéria Lima Passos
Affiliation: School of Pharmacy and Biomolecular Sciences, RCSI
Contact: valerialimapassos@rcsi.ie
Model's Goodness of Fit (GOF)
Definition
A statistical measure that determines how well the model explains the variability of the data.
Key Metric
R-squared (R²)
Measures the proportion of variance in the dependent variable that can be explained by the independent variables.
Mean Values and Their Importance
Mean Y
Definition: The average of the observed values of Y.
Notation:
$\text{Mean } Y = \frac{\sum Y}{N}$
Importance: Serves as a reference point for evaluating deviations from expected values.
Total Variation and Sources
SSTOTAL (Total Variation)
Definition: The total sum of squared distances from the observed values to the mean.
Calculation: $SS_{TOTAL} = \sum (Y - \bar{Y})^2$, where $\bar{Y}$ is Mean Y.
SSREGRESSION (Explained Variation)
Definition: The sum of squared distances from the predicted values to the mean.
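Calculation: $SS_{REGRESSION} = \sum (\hat{Y} - \bar{Y})^2$, where $\hat{Y}$ is the predicted value and $\bar{Y}$ is Mean Y.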
SSRESIDUAL (Unexplained Variation)
Definition: The sum of squared distances from the observed to the predicted values.
Calculation: $SS_{RESIDUAL} = \sum (Y - \hat{Y})^2$, where $\hat{Y}$ is the predicted value.
Understanding of Errors in Regression
Residuals
Definition: The distance from the observed value to the predicted value according to the regression line.
Significance: Measures the deviation of observed data from the model prediction.
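A minimal sketch of how residuals are obtained, assuming a simple one-predictor regression. The toy data and the helper name `fit_line` are hypothetical and do not come from the workshop.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]

# Residual = observed value minus predicted value, one per observation
residuals = [yi - pi for yi, pi in zip(y, predicted)]
```

With an intercept in the model, the least-squares residuals sum to zero (up to floating-point error), so positive and negative deviations from the line balance out.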
Explore SSTOTAL and its Components
Formula Relation between Components
$SS_{TOTAL} = SS_{REGRESSION} + SS_{RESIDUAL}$
Importance: This relationship underlies R², the share of total variation that the model explains: $R^2 = SS_{REGRESSION} / SS_{TOTAL}$.
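The split of total variation into explained and unexplained parts can be checked numerically. The sketch below assumes a one-predictor least-squares fit. The data and variable names are hypothetical.

```python
# Hypothetical toy data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(y)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept for a single predictor
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * xi for xi in x]

# Three sums of squares, following the definitions in the notes
ss_total = sum((yi - mean_y) ** 2 for yi in y)                       # observed vs mean
ss_regression = sum((pi - mean_y) ** 2 for pi in predicted)          # predicted vs mean
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))    # observed vs predicted

r_squared = ss_regression / ss_total
```

For this toy data the total variation of 6 splits into 3.6 explained and 2.4 unexplained, so R² is 0.6: the model explains 60% of the variation in Y.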
Examples and Interpretation of Variability
Interpretation(s) of R²:
Proportion (in %) of variability shared between Y and X.
Proportion of Y variation attributable to changes in X.
Proportion of Y variation explained by the regression model.
Terminology:
Use Variation (SS) or Variability, NOT Variance!
Real-World Application Illustrations
Dietary Restrictions and Lifespan
A study observed that genetic differences in mice explained significantly more variability in lifespan than dietary interventions.
Finding: Genetic variants accounted for about 24% of lifespan variation, while dietary changes contributed only about 7%.
Implication: Lifespan and responses to dietary changes are complex polygenic traits influenced by multiple genetic factors.
Genetics and Political Ideology
Research indicates that political beliefs are approximately 40% heritable based on twin studies.
Note: The heritability measure describes population-level variability, not individual predispositions.
Important Distinction: Genes influence traits indirectly through personality, education, income, and intelligence.
Interaction Terms in Multiple Linear Regression
Definition of Interaction
Interaction effect: A combined influence of two or more independent variables on the dependent variable.
Distinction: Two variables can influence the dependent variable without interaction.
Prediction complexity: When variables interact, the effect of one variable cannot be predicted without considering the level of the other.
Misinterpretation of Interaction
The term 'interaction' is frequently misunderstood. It does not simply mean that two variables both affect the outcome; it means that one variable's influence is conditional on the level of the other.
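One way to make this conditional influence concrete is the standard two-predictor model with a product term. The symbols $X_1$, $X_2$ and $\beta_0, \dots, \beta_3$ are generic notation assumed here, not taken from the workshop slides.

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_1 X_2 + \varepsilon$$

$$\frac{\partial E[Y]}{\partial X_1} = \beta_1 + \beta_3 X_2$$

The expected change in Y per unit of $X_1$ therefore depends on the level of $X_2$. If $\beta_3 = 0$, the effect of $X_1$ is $\beta_1$ at every level of $X_2$: both variables may still influence Y, but they do not interact.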
Detailed Explanation of Interaction Effects
Terms for Interaction Phenomenon
Terms include: joint effects, synergism, antagonism, interaction, effect modification, and effect measure modification.
Distinction from Confounders: Unlike confounding variables, modifying variables do not create false associations where none exist.
Examples and Applications in Variables Analysis
Case Studies of Interaction Effects
FEV Age Unadjusted vs Adjusted
The unadjusted comparison overestimated the difference between smokers and non-smokers because the two groups differ in height distribution.
FEV Interaction Example
The interpretation of the interaction term changes depending on the age threshold used.
Gene-Environment Interaction Patterns
Qualitative Patterns of Gene-Environment Interaction
Illustrated through the gene-environment interaction types described by Haldane (1938), plotted as trait value against environment, showing how traits vary across environments.
Confounding vs Interaction in Statistical Analysis
Uses examples such as the effect of maternal age on Down syndrome incidence to explain how adjustment must account for both conditions affecting the outcome.
Presenting Linear Regression Results
Example of a linear regression analysis comparing statin users versus non-users with detailed coefficients and confidence intervals.
Important statistics captured in the analysis include p-values, which indicate statistical significance.
Discussion on Model Adjustments and Interactions
Discussion on whether it is reasonable to expect an interaction between diabetes and age in relation to eGFR (estimated glomerular filtration rate).
Clinical implications: Understanding how diabetes modifies age-related declines in kidney function.
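What such an interaction would look like can be sketched with made-up numbers. The values below are purely hypothetical, not clinical estimates, and the names `fit_line`, `egfr_no_diabetes` and `egfr_diabetes` are illustrative. With a binary modifier, fitting one line per group gives the same slopes as a single model containing an age × diabetes product term, and the interaction coefficient equals the difference between the two slopes.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up illustrative values, NOT clinical estimates
age = [40, 50, 60, 70, 80]
egfr_no_diabetes = [120 - 0.8 * a for a in age]   # assumed decline of 0.8 per year
egfr_diabetes = [125 - 1.2 * a for a in age]      # assumed steeper decline of 1.2 per year

_, slope_no_diabetes = fit_line(age, egfr_no_diabetes)
_, slope_diabetes = fit_line(age, egfr_diabetes)

# Interaction coefficient = how much diabetes changes the age slope
interaction = slope_diabetes - slope_no_diabetes
```

A non-zero `interaction` means the age-related decline differs by diabetes status. A value of zero would mean parallel lines, that is, diabetes shifts eGFR without modifying the effect of age.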
Importance of Adjustment in Statistical Models
Discussions reiterate the need for model adjustments to improve accuracy.
Emphasis on the influence of confounding variables and their effects on perceived relationships in health-related studies.
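The effect of adjustment can be sketched with made-up numbers in which a confounder is distributed differently across the two groups. All values and names below are hypothetical. The adjusted estimate uses residual-on-residual regression (the Frisch-Waugh-Lovell result), which gives the same coefficient as including the confounder in a multiple regression. It is a swapped-in shortcut to keep the example dependency-free, not the workshop's method.

```python
def mean(values):
    return sum(values) / len(values)

def residualize(v, z):
    """Residuals of v after a least-squares straight-line fit on z."""
    mz, mv = mean(z), mean(v)
    slope = (sum((zi - mz) * (vi - mv) for zi, vi in zip(z, v))
             / sum((zi - mz) ** 2 for zi in z))
    return [(vi - mv) - slope * (zi - mz) for zi, vi in zip(z, v)]

# Hypothetical data: the exposed group has higher confounder values
confounder = [2, 3, 4, 5, 6, 4, 5, 6, 7, 8]
exposed = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
# Assumed data-generating rule: true exposure effect is -0.2
outcome = [1 + 0.5 * z - 0.2 * g for z, g in zip(confounder, exposed)]

# Crude (unadjusted) difference in mean outcome between groups
crude = (mean([y for y, g in zip(outcome, exposed) if g == 1])
         - mean([y for y, g in zip(outcome, exposed) if g == 0]))

# Adjusted difference: remove the confounder from both outcome and exposure,
# then regress the outcome residuals on the exposure residuals
r_outcome = residualize(outcome, confounder)
r_exposed = residualize(exposed, confounder)
adjusted = (sum(a * b for a, b in zip(r_exposed, r_outcome))
            / sum(a ** 2 for a in r_exposed))
```

Here the crude difference is about +0.8 while the adjusted difference recovers the assumed effect of -0.2, so ignoring the confounder even reverses the apparent direction of the association.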