CM22009 - Machine Learning

Last updated 3:01 PM on 2/2/26

29 Terms

1

Supervised Learning

A machine learning technique where an algorithm learns from a labelled dataset to make predictions or classifications on new, unseen data. The algorithm is trained on input-output pairs, where each input has a correct output label, similar to how a student learns from worked examples with the answers provided.

2

Unsupervised Learning

A type of machine learning that finds patterns and relationships in unlabelled data without any prior guidance or “ground truth”. Unlike supervised learning, which uses labelled data, unsupervised learning algorithms must discover hidden structure on their own. Common tasks include clustering, association rule mining, and dimensionality reduction.

3

Reinforcement Learning

A machine learning paradigm where an agent learns to make optimal decisions in an environment through trial and error. It learns by taking actions, receiving rewards or penalties for those actions, and then adjusting its behaviour to maximise its cumulative reward over time

4

Supervised Learning Model Examples

Rainfall Forecasting, Object Detection, Classifying Fraudulent Transactions

5

Unsupervised Learning Examples

Customer Segmentation, Anomaly Detection

6

Reinforcement Learning Examples

Personalised Learning, Traffic Control, Gaming AI

7

Linear Regression Formula

[Image: linear regression formula]
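
The image for this card did not survive the text export. As a reference point only, and assuming the common notation with weight w, bias b and parameter vector θ (these symbols are assumptions, not taken from the card), the usual forms of the linear regression model are:

```latex
\hat{y} = w x + b \qquad \text{(one feature)}

\hat{y} = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n = \theta^{\top} x, \quad x_0 = 1 \qquad \text{(several features)}
```
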
8

Linear Regression: Hypothesis Function, Cost Function, Gradient Descent Formula

[Image: linear regression hypothesis function, cost function and gradient descent update]
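
The formulas on this card were an image and are not recoverable from the export. The sketch below shows one common way the three pieces fit together for a single feature. It assumes a mean squared error cost with the 1/(2m) convention and batch gradient descent; the names `theta0`, `theta1`, `alpha` and the toy data are illustrative choices, not taken from the card.

```python
def predict(theta0, theta1, x):
    """Hypothesis: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Cost: J = (1 / 2m) * sum((h(x) - y)^2). The 1/2 factor is a convention."""
    m = len(xs)
    return sum((predict(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


def gradient_descent(xs, ys, alpha=0.05, steps=5000):
    """Batch gradient descent, updating both parameters simultaneously."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [predict(theta0, theta1, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                                # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m     # dJ/dtheta1
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Toy data generated from y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)  # should approach 1 and 2
```

The learning rate `alpha` has to be small enough for the updates to converge; on this toy data 0.05 is comfortably inside the stable range.
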
9

Regression Problem

Using statistical and machine learning methods to predict a continuous numerical value (the dependent variable) based on one or more other variables (the independent variables)

10

Classification Problem

A supervised learning task where the goal is to predict a class label for a given input, so the output is a category rather than a continuous value

11

Types of Classification

Binary, Multi-class, Multi-label

12

Binary Classification

Classification between two classes (e.g. is this email spam or ham)

13

Multi-Class Classification

Classification among more than two classes, where each input belongs to exactly one class (e.g. is this picture a dog, cat or goat)

14

Multi-Label Classification

Classification tasks where an input may have several classes associated with it at once (e.g. which colours does this image contain)
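
One way to see the difference between the three types is in how the target is encoded. The sketch below is illustrative only: the class lists and the helper names `one_hot` and `multi_hot` are assumptions made for the example.

```python
CLASSES = ["dog", "cat", "goat"]
COLOURS = ["red", "green", "blue"]


def one_hot(label, classes):
    """Multi-class target: exactly one position is 1."""
    return [1 if c == label else 0 for c in classes]


def multi_hot(labels, classes):
    """Multi-label target: any number of positions may be 1."""
    return [1 if c in labels else 0 for c in classes]


binary_target = 1                                          # spam (1) vs ham (0)
multi_class_target = one_hot("cat", CLASSES)               # [0, 1, 0]
multi_label_target = multi_hot({"red", "blue"}, COLOURS)   # [1, 0, 1]
```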

15

Logistic Regression

A statistical method used to model the probability of a binary outcome based on one or more predictor variables

16

Logistic Regression: Hypothesis Function, Log Loss, Gradient Descent

[Image: logistic regression hypothesis function, log loss and gradient descent update]
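
The formulas on this card were an image and are not recoverable from the export. As a rough sketch of the standard setup for one feature: the hypothesis passes a linear score through the sigmoid, the cost is the average log loss, and gradient descent uses the same update shape as linear regression. The parameter names, learning rate and toy data below are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(theta0, theta1, x):
    """Hypothesis: h(x) = sigmoid(theta0 + theta1 * x), read as P(y = 1 | x)."""
    return sigmoid(theta0 + theta1 * x)


def log_loss(theta0, theta1, xs, ys):
    """Log loss: J = -(1/m) * sum(y * log(h) + (1 - y) * log(1 - h))."""
    m = len(xs)
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict_proba(theta0, theta1, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / m


def gradient_descent(xs, ys, alpha=0.5, steps=2000):
    """Batch gradient descent on the log loss."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [predict_proba(theta0, theta1, x) - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1


# Made-up, centred feature with overlapping classes (not linearly separable).
xs = [-1.75, -1.25, -0.75, -0.25, 0.25, 0.75, 1.25, 1.75]
ys = [0, 0, 0, 1, 0, 1, 1, 1]
theta0, theta1 = gradient_descent(xs, ys)
print(log_loss(0.0, 0.0, xs, ys), log_loss(theta0, theta1, xs, ys))
```

The second printed value should be lower than the first, which is the loss of the untrained model (log 2).
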
17

K-Nearest Neighbour (KNN) Algorithm

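A machine learning method used for classification and regression that works by finding the “k” nearest data points to a new, unknown data point and predicting from them: a majority vote of their labels for classification, or the average of their values for regression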
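
A minimal sketch of the idea, assuming Euclidean distance (via the standard library's `math.dist`, Python 3.8+) and brute-force search over the training set. The function names and the toy data are made up for illustration.

```python
import math
from collections import Counter


def k_nearest(train, query, k):
    """Return the k (point, target) pairs closest to the query point."""
    return sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]


def knn_classify(train, query, k=3):
    """Classification: majority vote among the k nearest labels."""
    votes = Counter(label for _, label in k_nearest(train, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train, query, k=3):
    """Regression: mean of the k nearest target values."""
    neighbours = k_nearest(train, query, k)
    return sum(value for _, value in neighbours) / len(neighbours)


# Toy data, made up for illustration.
labelled = [((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
            ((6.0, 6.0), "B"), ((7.0, 7.5), "B"), ((6.5, 8.0), "B")]
priced = [((50.0,), 150.0), ((60.0,), 180.0), ((80.0,), 240.0),
          ((100.0,), 300.0), ((120.0,), 360.0)]

print(knn_classify(labelled, (2.0, 2.0)))   # A
print(knn_regress(priced, (70.0,), k=2))    # 210.0
```

Note that all the work happens at query time: there is no fitting step, but every prediction scans the full training set.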

18

Advantages of KNN

Intuitive and easy to implement, No “training” needed, Adaptable

19

Disadvantages of KNN

Computationally expensive at the testing stage, Must hold training data in memory, Curse of Dimensionality (in high dimensions points become nearly equidistant, so the “nearest” neighbours are less meaningful)

20

KNN Classification Examples

Recommendation System, Medical Diagnosis (finding similar patient cases)

21

KNN Regression Examples

House Price Prediction, Stock Price Forecasting

22

Distance Metrics

[Image: distance metrics]
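
The image for this card did not survive the export, so which metrics it listed is unknown. Assuming it covered the usual numeric metrics (Euclidean, Manhattan, and their generalisation, Minkowski), a sketch of each could look like this:

```python
import math


def euclidean(a, b):
    """Straight-line distance: sqrt(sum((a_i - b_i)^2))."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block distance: sum(|a_i - b_i|)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def minkowski(a, b, p):
    """Generalisation: p = 1 gives Manhattan, p = 2 gives Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


a, b = (0.0, 0.0), (3.0, 4.0)
print(euclidean(a, b))   # 5.0
print(manhattan(a, b))   # 7.0
```
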
23

Hamming Distance

Counts the positions at which two equal-length categorical vectors differ
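
A minimal sketch of the count, with made-up example sequences; raising an error on unequal lengths is a design choice for this example.

```python
def hamming_distance(a, b):
    """Number of positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


print(hamming_distance("GATTACA", "GACTATA"))                        # 2
print(hamming_distance(["A", "C", "B", "D"], ["A", "B", "B", "D"]))  # 1
```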

24

Jaccard Distance

[Image: Jaccard distance formula]
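
The image for this card did not survive the export. The standard definition is one minus the size of the intersection divided by the size of the union, d(A, B) = 1 - |A ∩ B| / |A ∪ B|. A small sketch follows; the baskets are made up, and treating two empty sets as distance 0 is a convention chosen for this example.

```python
def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| for two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets count as identical
    return 1.0 - len(a & b) / len(union)


basket_1 = {"bread", "milk"}
basket_2 = {"milk", "eggs", "tea"}
print(jaccard_distance(basket_1, basket_2))  # 0.75 (1 shared item out of 4 unique)
```
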
25

Why use Hamming Distance

Data is ordered and vectors are the same length; you care about position-by-position mismatches (e.g. DNA sequences, multiple-choice answers)

26

Why use Jaccard Distance

Data is set-like (unordered categories, tags, items); you care about overlap relative to the total number of unique items (e.g. customer segmentation, shopping baskets, keywords/tags in documents)

27

Bias

Measures how far a model’s predictions, averaged over different training datasets, are from the ground truth values

28

Variance

Measures how much a model’s predictions change with different training datasets

29

Bias-Variance Tradeoff

When choosing the best model to fit your data, you must balance bias against variance: a high-bias model is prone to underfitting (it misses important patterns), while a high-variance model is prone to overfitting (it captures noise)
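
For squared error, the tradeoff can be written as a decomposition of the expected prediction error at a point x. This is the standard result, stated here under the assumption that y = f(x) + ε with noise of mean 0 and variance σ², and with the expectation taken over training sets and noise; the symbols are not taken from the cards.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```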