Week 4 - Linear-In-Parameter Functions and Validation

Last updated 6:59 PM on 5/26/26


Pros of Polynomials

Can approximate any (smooth enough) function
Linear in the parameters, so least squares has a “closed form” solution
- Well understood numerical problem
- Many software packages
Explicit - Very basic, transparent and understandable
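
As a minimal sketch of the closed-form point above, assuming a single predictor and NumPy, a polynomial can be fitted with one least-squares solve. The function name `fit_polynomial` and the data are made up for illustration.

```python
import numpy as np

def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit for a single predictor x."""
    # Design matrix with columns 1, x, x^2, ..., x^degree
    X = np.vander(x, degree + 1, increasing=True)
    # lstsq minimises ||X w - y||^2 without forming an explicit inverse
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Made-up, noise-free quadratic data
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x - 3.0 * x**2
w = fit_polynomial(x, y, degree=2)  # coefficients ordered from x^0 upwards
```

The model is a polynomial in x but linear in the coefficients w, which is why a standard linear solver is enough.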


Cons of Polynomials

Matrix inversion - cubic in computation time
For an m x m matrix, doubling m takes about 8 times longer (and needs about 4 times more memory to store the matrix)
Most terms irrelevant - unnecessary complexity
Leads to problems for high degree and dimensions
- Number of coefficients = (p + d) Choose (d) = (p + d)! / (p! d!)
- p = number of predictors
- d = degree of the polynomial
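
The coefficient count above can be checked with a short sketch using Python's standard `math.comb`. The helper name `n_coefficients` and the (p, d) pairs are chosen only for illustration.

```python
from math import comb

def n_coefficients(p, d):
    """Number of terms in a full polynomial of degree d in p predictors."""
    return comb(p + d, d)  # (p + d)! / (p! d!)

# A few illustrative (p, d) pairs to show how quickly the count grows
for p, d in [(1, 2), (2, 2), (10, 3), (100, 5)]:
    print(f"p={p}, d={d}: {n_coefficients(p, d)} coefficients")
```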


Gaussian Radial Basis Functions

Radially symmetric - only the distance from the “centre” is important
Formula: 𝜙_𝑖(𝑥) = exp( − (𝑥 − 𝑐_𝑖)² / (2 𝑙_𝑖²) )
- 𝑙_𝑖 = width parameter
- 𝑐_𝑖 = centre parameter
Decay with distance from 𝑐_𝑖
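
A minimal sketch of the formula above for a single predictor, assuming NumPy. The names `gaussian_rbf` and `rbf_design_matrix`, and the centres and widths used, are hypothetical choices for illustration.

```python
import numpy as np

def gaussian_rbf(x, centre, width):
    """Gaussian radial basis function with the given centre and width."""
    return np.exp(-((x - centre) ** 2) / (2.0 * width ** 2))

def rbf_design_matrix(x, centres, widths):
    """One column per basis function, one row per observation."""
    x = np.asarray(x, dtype=float)[:, None]          # shape (n, 1)
    centres = np.asarray(centres, dtype=float)[None, :]  # shape (1, m)
    widths = np.asarray(widths, dtype=float)[None, :]    # shape (1, m)
    return gaussian_rbf(x, centres, widths)           # shape (n, m)

# Three basis functions on made-up inputs
Phi = rbf_design_matrix([-1.0, 0.0, 1.0, 2.0],
                        centres=[-1.0, 0.0, 1.0],
                        widths=[0.5, 0.5, 0.5])
```

Each column of `Phi` can then be used as a feature in a linear-in-parameter model, in the same way as the powers of x in a polynomial.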


Local Minima

The Empirical Cost Function is RSS / MSE
For linear regression (a model linear in its parameters) this cost function is convex, so any minimum found is the global minimum
For models that are nonlinear in their parameters, the cost function is not guaranteed to be convex
- Numerical algorithms such as Gradient Descent can get stuck in local minima
Local minima can also occur for linear models when a different, non-convex cost function is used
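
To illustrate how gradient descent can end in different minima, here is a small sketch on a made-up one-parameter non-convex cost. The cost function, learning rate and starting points are assumptions chosen for the example, not taken from any particular model.

```python
def cost(w):
    # Made-up non-convex cost with two minima
    return w**4 - 3.0 * w**2 + w

def gradient(w):
    return 4.0 * w**3 - 6.0 * w + 1.0

def gradient_descent(w0, learning_rate=0.01, n_steps=2000):
    w = w0
    for _ in range(n_steps):
        w -= learning_rate * gradient(w)  # step downhill
    return w

w_from_left = gradient_descent(-2.0)   # starts left of both minima
w_from_right = gradient_descent(2.0)   # starts right of both minima
print(w_from_left, cost(w_from_left))
print(w_from_right, cost(w_from_right))
```

The two runs stop at different points with different cost values, so the result depends on the starting point. With a convex cost both runs would reach the same minimum.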


Training, testing and deploying models

A model is at most as good as the data used to create it, and it is usually not applicable outside the range of that data. Training, testing and deploying models should be done on consistent data ranges.


Overfitting the training data

Occurs when the model is fitted so closely to the training data that it fails on new data instances, even though they come from the same domain.
An overfitted model also learns the noise and random fluctuations in the training dataset
Typically shows up as a training RSS that is zero or very close to zero
- More likely with nonparametric and nonlinear models
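
A small sketch of the near-zero training RSS, assuming NumPy and made-up noisy data: a polynomial with as many coefficients as training points passes through every point, noise included. The sine target, noise level and degrees are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_function(x):
    return np.sin(np.pi * x)  # made-up underlying function

# 8 noisy training points and 50 noisy test points from the same range
x_train = np.linspace(-1.0, 1.0, 8)
y_train = true_function(x_train) + rng.normal(0.0, 0.2, size=x_train.size)
x_test = rng.uniform(-1.0, 1.0, size=50)
y_test = true_function(x_test) + rng.normal(0.0, 0.2, size=x_test.size)

def rss(degree, x_fit, y_fit, x_eval, y_eval):
    """Fit a polynomial on (x_fit, y_fit), return RSS on (x_eval, y_eval)."""
    X_fit = np.vander(x_fit, degree + 1, increasing=True)
    X_eval = np.vander(x_eval, degree + 1, increasing=True)
    w, *_ = np.linalg.lstsq(X_fit, y_fit, rcond=None)
    return float(np.sum((X_eval @ w - y_eval) ** 2))

rss_train_simple = rss(3, x_train, y_train, x_train, y_train)
# Degree 7 has 8 coefficients, enough to interpolate all 8 training points
rss_train_flexible = rss(7, x_train, y_train, x_train, y_train)
rss_test_flexible = rss(7, x_train, y_train, x_test, y_test)
print(rss_train_simple, rss_train_flexible, rss_test_flexible)
```

The degree 7 fit has essentially zero training RSS but a clearly non-zero RSS on the unseen test points, which is the signature of having fitted the noise.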


Underfitting the training data

Occurs when the model is too simple to model the domain accurately and hence cannot generalise to new data
Poor performance on training data


Hold Out Validation

AKA the train/test approach
Data is randomly split into training and test sets - typically 70:30 or 80:20
The training dataset is made up of known data which is used to train the model.
The test dataset is made up of data not seen by the machine learning methodology during training. It is used to validate the model
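Need to randomise the order of the observations before splitting, so that the split does not depend on how the data is sorted by the predictors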
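
A minimal sketch of a randomised hold-out split, assuming NumPy arrays with one row per observation. The function name `hold_out_split`, the 80:20 default and the seed are illustrative choices.

```python
import numpy as np

def hold_out_split(X, y, test_fraction=0.2, seed=0):
    """Shuffle the observations, then split into training and test sets."""
    rng = np.random.default_rng(seed)
    n = len(y)
    order = rng.permutation(n)               # random order of row indices
    n_test = int(round(test_fraction * n))
    test_idx, train_idx = order[:n_test], order[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

# Made-up data: 100 observations, one predictor
X = np.arange(100, dtype=float).reshape(-1, 1)
y = np.arange(100, dtype=float)
X_train, y_train, X_test, y_test = hold_out_split(X, y)
```

Shuffling before splitting matters here because the made-up data is sorted. Without it the test set would only cover one end of the predictor range.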


Hold Out Advantages

Computationally fast
Works well when the dataset is large and the training and test samples have the same distribution


Hold Out Disadvantages

With limited data, some information might be missed during training because part of the data is held out, resulting in high bias.
Not ideal for tuning hyperparameters


K-Fold Cross Validation

First randomise the order of the observations, then:

Divide the set of n observations into K groups of (approximately) equal size
Train K models, each on a different combination of (K-1) groups, and validate the performance of each model on the single group left out
Use the average performance over the K validations for the assessment
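
The steps above can be sketched as follows, assuming NumPy, a single predictor and a polynomial model scored by MSE. The function name `k_fold_mse` and the choice of model are assumptions for illustration.

```python
import numpy as np

def k_fold_mse(x, y, degree, k=5, seed=0):
    """Average validation MSE of a polynomial fit over K folds."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))      # randomise the observations
    folds = np.array_split(order, k)     # K groups of near-equal size
    scores = []
    for i in range(k):
        val_idx = folds[i]               # the one group left out
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        X_train = np.vander(x[train_idx], degree + 1, increasing=True)
        X_val = np.vander(x[val_idx], degree + 1, increasing=True)
        w, *_ = np.linalg.lstsq(X_train, y[train_idx], rcond=None)
        scores.append(np.mean((X_val @ w - y[val_idx]) ** 2))
    return float(np.mean(scores))        # average over the K validations

# Made-up noisy quadratic data, comparing three candidate degrees
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 40)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.1, size=x.size)
for degree in (1, 2, 6):
    print(degree, k_fold_mse(x, y, degree))
```

Comparing the averaged score across degrees is one way to use cross-validation for choosing a hyperparameter.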


Leave-one-out Cross Validation

K-Fold Cross Validation taken to the extreme, where K is equal to n - the number of data points in the set

More computationally demanding than K-Fold with a smaller K, since n models have to be trained
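
A minimal sketch of leave-one-out cross validation, assuming NumPy, a single predictor and a polynomial model. The function name `loocv_mse` is hypothetical. One model is fitted per observation, which is where the extra cost comes from.

```python
import numpy as np

def loocv_mse(x, y, degree):
    """Leave-one-out MSE of a polynomial fit: n fits for n observations."""
    n = len(y)
    errors = []
    for i in range(n):
        mask = np.arange(n) != i          # every observation except i
        X_train = np.vander(x[mask], degree + 1, increasing=True)
        w, *_ = np.linalg.lstsq(X_train, y[mask], rcond=None)
        x_i = np.vander(x[i:i + 1], degree + 1, increasing=True)
        prediction = (x_i @ w)[0]         # predict the left-out point
        errors.append((prediction - y[i]) ** 2)
    return float(np.mean(errors))

# Made-up noisy linear data
rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 25)
y = 0.5 + 1.5 * x + rng.normal(0.0, 0.05, size=x.size)
print(loocv_mse(x, y, degree=1))
```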


Key differences between Hold Out and Cross Validation

Hold-out validation never trains on the held-out data, which is wasteful when data is in short supply
Cross-validation provides a way of using all the data for both training and testing, and usually gives a more reliable estimate of generalisation performance.