Instructor: Donald Reid (glasgow.ac.uk)
Structure of a General Linear Model
Consideration of factor and covariate explanatory variables
Model output predictions
Reporting results including statistical metrics like degrees of freedom, Mean Sums of Squares, and F-ratio
Various attributes recorded for each observation (Height, Age, Gender, Eye colour)
Sample Size: 69
The example heights data set lists gender and eye colour for each individual.
TSS = 1380.64
Research Question: Can variation in class height be explained by biological sex?
Response Variable: Class height
Explanatory Variable: Sex (factor)
Hypotheses:
Null Hypothesis (H0): Class height is not affected by sex
Alternative Hypothesis (Ha): Class height is affected by sex
Model Structure: Model fit formula - Height ~ Sex
Figure: plot of Height against Sex.
Analysis of variance table (response variable: Height):
Factor | Df | Sum Sq | Mean Sq | F-value | P-value |
Sex | 1 | 146.48 | 146.48 | 7.9519 | 0.006312 (significant) |
Residuals | 67 | 1234.16 | | | |
Coefficients for the model analyzed:
Intercept: 65.0857, Std. Error: 0.6623
Factor (Sex) MALE: 2.9854, Std. Error: 1.0587
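The coefficients give the fitted height for each sex: the intercept (c) is the fitted height for the reference level, and the MALE coefficient is the difference between males and that level
Model equation: fheight = c + aM for males; fheight = c + aF for females. Only the MALE coefficient appears in the output, so females are the reference level and aF = 0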
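The fitted values implied by the coefficient table can be worked out by hand. The sketch below does so in Python (the course itself uses R). It assumes females are the reference level, since only a MALE coefficient is listed; the function name fitted_height is hypothetical.

```python
# Coefficients copied from the model output in these notes.
INTERCEPT = 65.0857   # c: fitted height for the reference level
SEX_MALE = 2.9854     # a_M: difference between males and the reference level


def fitted_height(sex: str) -> float:
    """Return the fitted height for 'MALE' or 'FEMALE'.

    Assumption: FEMALE is the reference level, so its offset is 0.
    """
    offsets = {"FEMALE": 0.0, "MALE": SEX_MALE}
    return INTERCEPT + offsets[sex.upper()]


if __name__ == "__main__":
    for sex in ("FEMALE", "MALE"):
        print(f"{sex}: {fitted_height(sex):.4f}")
```

Under that assumption the fitted height is 65.0857 for females and 65.0857 + 2.9854 = 68.0711 for males.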
Introduction to covariate in model fitting
Response variable vs. explanatory variables
Main equation: fheight = m · (Age) + c
Calculation of fitted values at different ages
Predicting weight from height: the slope gives the change in weight per unit change in height.
Example: If a dragon's height increases by 1 foot, its weight increases by 0.3 tons.
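A covariate model predicts the response as slope times covariate plus intercept. The Python sketch below applies that to the dragon example. The slope of 0.3 tons per foot comes from the notes; the intercept is not given there, so the value used in the demonstration is purely hypothetical, as are the function names.

```python
SLOPE_TONS_PER_FOOT = 0.3  # from the dragon example in the notes


def predict(slope: float, x: float, intercept: float) -> float:
    """Fitted value of a straight-line model: slope * x + intercept."""
    return slope * x + intercept


def weight_change(height_change_ft: float) -> float:
    """Change in fitted weight (tons) for a given change in height (feet)."""
    return SLOPE_TONS_PER_FOOT * height_change_ft


if __name__ == "__main__":
    # The change in weight depends only on the slope.
    print(f"2 ft taller -> {weight_change(2):.1f} tons heavier")

    # HYPOTHETICAL intercept, for illustration only (not from the notes).
    hypothetical_intercept = 1.0
    print(f"fitted weight at 10 ft: "
          f"{predict(SLOPE_TONS_PER_FOOT, 10, hypothetical_intercept):.1f} tons")
```

The same predict function covers fheight = m · (Age) + c once m and c are read off the fitted model's coefficient table.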
Importance of conveying relevant statistics: F-statistics, significance levels, p-values
Example reporting: State the effect together with its test statistic, degrees of freedom, and p-value relative to the significance threshold
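One way to keep reported results consistent is to build the sentence from the numbers in the model output. The Python sketch below is only an illustration: the helper name report_f_test, the layout "F(df1, df2) = ..., p = ...", and the 0.05 threshold are assumptions, not a format prescribed by the course.

```python
def report_f_test(term: str, df_effect: int, df_resid: int,
                  f_value: float, p_value: float, alpha: float = 0.05) -> str:
    """Format an F-test result as a one-line report.

    alpha is the significance threshold; 0.05 is assumed here by convention.
    """
    verdict = "significant" if p_value < alpha else "not significant"
    return (f"{term}: F({df_effect}, {df_resid}) = {f_value:.2f}, "
            f"p = {p_value:.3f} ({verdict} at alpha = {alpha})")


if __name__ == "__main__":
    # Values taken from the Height ~ Sex output in these notes.
    print(report_f_test("Sex", 1, 67, 7.9519, 0.006312))
```

For the Height ~ Sex model this prints "Sex: F(1, 67) = 7.95, p = 0.006 (significant at alpha = 0.05)".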
Degrees of freedom (df): the number of unique pieces of information available to quantify variation
Total variation and explained variation are each calculated with their own df
For class height, TSS = 1380.64 and total df = 68 (sample size of 69 minus 1)
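The way the total variation and its degrees of freedom split between the factor and the residuals can be checked with a few lines of arithmetic. This Python sketch uses the sums of squares reported in the notes; the function name partition_df is hypothetical, and the rule "factor df = number of levels minus 1" assumes a single factor with two levels (Sex).

```python
def partition_df(n_obs: int, n_levels: int) -> tuple[int, int, int]:
    """Return (total df, factor df, residual df) for a one-factor model."""
    total_df = n_obs - 1            # one df is used up by the overall mean
    factor_df = n_levels - 1        # assumption: one factor with n_levels levels
    resid_df = total_df - factor_df
    return total_df, factor_df, resid_df


# Sums of squares from the Height ~ Sex output in these notes.
TSS = 1380.64   # total sum of squares
ESS = 146.48    # explained by Sex
RSS = 1234.16   # residual

if __name__ == "__main__":
    total_df, factor_df, resid_df = partition_df(n_obs=69, n_levels=2)
    print(f"df: total={total_df}, Sex={factor_df}, residual={resid_df}")
    print(f"ESS + RSS = {ESS + RSS:.2f} (TSS = {TSS:.2f})")
```

With 69 observations and two sexes this gives 68 = 1 + 67, and the sums of squares add up in the same way: 146.48 + 1234.16 = 1380.64.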
The F-ratio compares the explained mean square with the residual mean square
Formula Definition: F = Mean ESS / Mean RSS, where Mean ESS = ESS / explained df and Mean RSS = RSS / residual df
The P-value is the probability of observing an F-ratio at least as large as the one obtained if the null hypothesis is true.
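The F-ratio in the model output can be reproduced from the sums of squares and degrees of freedom. The Python sketch below does this for the Height ~ Sex model; the function name f_ratio is hypothetical. The p-value is not recomputed here, since that requires the F distribution, which statistical software such as R provides.

```python
def f_ratio(ess: float, df_effect: int, rss: float, df_resid: int) -> float:
    """F = (ESS / effect df) / (RSS / residual df)."""
    mean_ess = ess / df_effect   # explained mean square
    mean_rss = rss / df_resid    # residual mean square
    return mean_ess / mean_rss


if __name__ == "__main__":
    # Values from the Height ~ Sex output in these notes.
    f = f_ratio(ess=146.48, df_effect=1, rss=1234.16, df_resid=67)
    print(f"F = {f:.2f}")
```

The result is about 7.95, which agrees with the reported 7.9519 up to rounding of the printed sums of squares.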
You will be expected to analyse PTC data using R
Investigating links between genotype and other variables like Sex, Smoking preference, etc.
Choosing research questions, forming hypotheses, testing, reporting results, and predictive analytics for response variables.
Understand the statistical output of a General Linear Model (Reinforced in Data Analysis 2 lab)
Calculate values of your response variable for a given value of explanatory variable
Know how to report statistical results (Reinforced in Data Analysis 2 lab)
Statistical Outputs:
Model output includes key statistical metrics:
Degrees of freedom
Mean Sums of Squares
F-ratio
Research Question: Can variation in class height be explained by biological sex?
Response Variable: Class height
Explanatory Variable: Sex (factor)
Hypotheses:
Null Hypothesis (H0): Class height is not affected by sex
Alternative Hypothesis (Ha): Class height is affected by sex
Model Structure: Height ~ Sex
Sample Size: 69
Height Data: Attributes recorded alongside height include Age, Gender, and Eye colour.
Factor | Df | Sum Sq | Mean Sq | F-value | P-value |
Sex | 1 | 146.48 | 146.48 | 7.9519 | 0.006312 |
Residuals | 67 | 1234.16 | | | |
Coefficients:
Intercept: 65.0857, Std. Error: 0.6623
Factor (Sex) MALE: 2.9854, Std. Error: 1.0587
Interpretation of Coefficients:
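The coefficients give the fitted height for each sex: the intercept (c) is the fitted height for the reference level, and the MALE coefficient is the difference between males and that level.
Model equation: fheight = c + aM for males; fheight = c + aF for females. Only the MALE coefficient appears in the output, so females are the reference level and aF = 0.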
Modeling with Covariates:
Covariate Model Structure: Response variable vs. other explanatory variables.
Main equation for prediction: fheight = m · (Age) + c
Example: if height increases by 1 foot, the fitted weight increases by 0.3 tons (the slope of the model).
Importance of Reporting:
Clearly convey relevant statistics:
F-statistics
Significance levels
P-values
State each effect together with its test statistic and its p-value relative to the significance threshold.
Understand degrees of freedom (df) as the number of unique pieces of information available to quantify variation.
Competence in forming hypotheses, reporting results, and predictive analytics for response variables is developed through these concepts and tasks.