Flashcards covering R programming functions for linear modeling, statistical evaluation metrics, and epidemiological SIR model parameters.
PFU_per_ml
A measure of viral growth used in experiments looking at the in vitro growth of SARS-CoV-2 variants.
lm()
An R function used to fit a linear model, specified with a formula like y ~ x.
Coefficients
Components of a linear model object, such as the intercept or the slope of an explanatory variable, accessed via modelfit$coefficients.
Residuals
The difference between the actual data and the model's prediction (actual minus predicted); a positive value means the model underestimated the real data, while a negative value means it overestimated.
Fitted values
The predicted values generated by a linear model based on the input data, accessed via modelfit$fitted.values.
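Worked example: the cards for lm(), coefficients, residuals, and fitted values can be tied together in a minimal sketch. The data below are made up for illustration; x and y are hypothetical names, not variables from the course dataset, and modelfit simply reuses the object name from the cards.

```r
# Hypothetical toy data, invented for illustration only.
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Fit y as a linear function of x.
modelfit <- lm(y ~ x)

modelfit$coefficients    # named vector: "(Intercept)" and "x" (the slope)
modelfit$fitted.values   # the model's prediction for each observed x
modelfit$residuals       # actual minus predicted, one value per observation

# Residuals are observed minus fitted, so this comparison should hold.
all.equal(unname(modelfit$residuals), y - unname(modelfit$fitted.values))
```

A positive entry in modelfit$residuals marks a point the fitted line underestimates; a negative entry marks an overestimate.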
Wave
A period in a pandemic characterized by a distinct rise, peak, and fall in hospital admissions.
Overfitting
When a model fits the training data too closely and fails to predict new data; it memorises the "noise".
vaccine_sentiment
The percentage of people who were vaccinated, had an appointment, or would get vaccinated if given the opportunity, based on a Facebook survey.
mask_sentiment
The percentage of people who wore a mask most or all of the time in public, based on a Facebook survey.
Sum of squared error
A way of measuring model fit defined as the sum of the squares of the residuals.
Mean Squared Error (MSE)
A measure of model performance calculated as the mean of the squared residuals: mean(residuals^2). Smaller values indicate a better fit to the data.
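Worked example: the two error measures differ only in whether the squared residuals are summed or averaged. The residual values below are invented for illustration, and the names sse and mse are assumptions, not names from the course material.

```r
# Made-up residuals (actual minus predicted) for four observations.
residuals <- c(0.5, -1.0, 0.25, -0.25)

sse <- sum(residuals^2)    # sum of squared error: 0.25 + 1 + 0.0625 + 0.0625 = 1.375
mse <- mean(residuals^2)   # mean squared error: 1.375 / 4 = 0.34375

sse
mse
```

Squaring keeps positive and negative residuals from cancelling, and dividing by the number of observations makes MSE comparable across datasets of different sizes.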
Dependent variable
Also called the response variable, this is the outcome of interest that a model tries to predict (e.g., CovDp100K).
Independent variable
Also called a predictor or explanatory variable, this is a factor used within a model to explain variation in the dependent variable (e.g., pPop65, VacFullp100).
Cross-validation
A process used to evaluate a model's predictive power by splitting data into a training set to fit the model and a test set to evaluate performance.
Training data
The subset of a dataset used to build and fit the parameters of a specific model.
Test data
The subset of data used to evaluate how well a model predicts new, unseen observations.
set.seed()
An R function that sets the seed of the random number generator, so that any step relying on random numbers (such as randomly splitting data into test and training sets) gives the same result every time the code is run, for all users.
predict()
An R function used to generate predicted values for a test dataset based on a model that was previously fitted to training data.
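Worked example: a minimal sketch of a single train/test split that combines set.seed(), lm(), predict(), and MSE. Everything here is an assumption for illustration: the data are simulated, the seed value and the 80/20 split are arbitrary choices, and dat, train, test, and the other names are hypothetical.

```r
set.seed(42)   # arbitrary seed so the random split below is reproducible

# Simulated data: y depends linearly on x plus random noise.
n <- 100
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- 3 + 2 * dat$x + rnorm(n)

# Randomly choose 80 rows for training; the remaining 20 form the test set.
train_idx <- sample(seq_len(n), size = 80)
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Fit on the training data only.
modelfit <- lm(y ~ x, data = train)

# Predict the unseen test rows and score the predictions with MSE.
test_pred <- predict(modelfit, newdata = test)
test_mse  <- mean((test$y - test_pred)^2)
train_mse <- mean(modelfit$residuals^2)

train_mse
test_mse
```

A test MSE much larger than the training MSE is one sign that a model may be overfitting the training data.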
SIR model
A compartmental model used to simulate disease dynamics based on three groups: Susceptibles (S), Infecteds (I), and Recovereds (R).
Transmission rate (beta)
In an SIR model, the parameter β that represents the rate at which susceptible individuals become infected.
Recovery rate (gamma)
In an SIR model, the parameter γ that represents the rate at which infected individuals recover or are removed from the infectious pool.
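Worked example: a rough sketch of how β and γ drive the three compartments. The cards do not give the equations, so this assumes the standard textbook form of the SIR model, and it steps forward with a simple Euler approximation rather than a proper ODE solver. All parameter values and names are illustrative assumptions.

```r
# Assumed standard SIR equations (not stated on the cards):
#   dS/dt = -beta * S * I / N
#   dI/dt =  beta * S * I / N - gamma * I
#   dR/dt =  gamma * I
beta  <- 0.3    # transmission rate (illustrative value)
gamma <- 0.1    # recovery rate (illustrative value)
N     <- 1000   # total population (illustrative value)
dt    <- 0.1    # time step in days for the Euler approximation
steps <- 1000   # 1000 steps of 0.1 days = 100 days

out <- data.frame(time = (0:steps) * dt, S = NA_real_, I = NA_real_, R = NA_real_)
out$S[1] <- N - 1   # everyone susceptible except one initial case
out$I[1] <- 1
out$R[1] <- 0

for (k in seq_len(steps)) {
  S <- out$S[k]; I <- out$I[k]; R <- out$R[k]
  new_infections <- beta * S * I / N * dt   # flow from S to I in this step
  new_recoveries <- gamma * I * dt          # flow from I to R in this step
  out$S[k + 1] <- S - new_infections
  out$I[k + 1] <- I + new_infections - new_recoveries
  out$R[k + 1] <- R + new_recoveries
}

max(out$I)                   # peak number of infected individuals
out$time[which.max(out$I)]   # day on which the peak occurs
```

Because every individual leaving one compartment enters the next, S + I + R stays equal to N at every time step. Raising beta or lowering gamma in this sketch gives a taller, earlier peak.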