AI

Description and Tags

Computer Science

A-Level Information Technology

Edexcel

51 Terms

New cards

Generative Models

A type of machine learning model that learns from a training set and is used to generate new data samples that resemble it.

New cards

Discriminative Models

A type of machine learning model that separates data points into different classes by learning the boundaries between them.

New cards

Language model

A type of machine learning model trained to produce a probability distribution over words. It tries to predict the most likely next word, or the word that best fills a blank in a sentence or phrase, based on the context of the given text.
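A minimal sketch of the idea, assuming the simplest possible language model: a bigram model that only looks at the previous word. The function names and the three-sentence corpus are made up for illustration; real language models use far more context.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count, for every word, how often each other word follows it."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_distribution(counts, prev):
    """Return P(next word | prev) as a dict; empty if prev was never seen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {word: n / total for word, n in followers.items()} if total else {}

# Hypothetical toy corpus
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
counts = train_bigram(corpus)
dist = next_word_distribution(counts, "the")
best = max(dist, key=dist.get)  # most probable word after "the"
print(best, dist[best])
```

Here "cat" follows "the" in 2 of 6 cases, so it is the predicted next word with probability 1/3.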

New cards

Metaverse

A virtual-reality space where users can interact with a computer-generated environment and with other users. It relies on two technologies: virtual reality (VR) and augmented reality (AR).

New cards

Purpose of Generative Models

Model the data distribution itself: the joint probability P(x, y) of data and labels, or P(x) when there are no labels

New cards

Purpose of Discriminative Models

Model the conditional probability of labels given data, P(y | x)
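The two purposes can be written side by side. The notation is an assumption for illustration: x stands for the data and y for the label. A generative model targets the joint distribution, a discriminative model targets the conditional directly, and Bayes' rule links the two:

```latex
\text{generative: } P(x, y) = P(x \mid y)\,P(y)
\qquad
\text{discriminative: } P(y \mid x)
\qquad
P(y \mid x) = \frac{P(x \mid y)\,P(y)}{P(x)}
```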

New cards

Use Cases of Generative Models

Data generation, denoising, unsupervised learning

New cards

Use Cases of Discriminative Models

Classification, supervised learning tasks

New cards

Training Focus of Generative Models

Maximize probability of observed data, Capture data structure

New cards

Training Focus of Discriminative Models

Learn decision boundary, Differentiate between classes

New cards

Linear regression

is a statistical technique used to model the relationship between a dependent variable and one or more independent variables

New cards

Implementing linear regression using the scikit-learn library

Step 1: Importing the libraries/dataset

Step 2: Data pre-processing

Step 3: Splitting the dataset into training data, validation data and test data

Step 4: Train model

Step 5: Evaluate the model
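The five steps might look roughly like this in scikit-learn. This is a sketch under assumptions: the dataset is synthetic (generated with NumPy), the 60/20/20 split is arbitrary, and standardization stands in for "pre-processing". The scaler is fitted after the split so that it only sees training data, which is why steps 2 and 3 appear swapped in the code.

```python
# Step 1: import the libraries and load a dataset (synthetic here, an assumption)
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0 + rng.normal(0, 0.5, size=200)

# Step 3: split into training (60%), validation (20%) and test (20%) data
X_train, X_temp, y_train, y_temp = train_test_split(
    X, y, test_size=0.4, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_temp, y_temp, test_size=0.5, random_state=0
)

# Step 2: pre-process; fit the scaler on the training data only
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)
X_test_s = scaler.transform(X_test)

# Step 4: train the model
model = LinearRegression().fit(X_train_s, y_train)

# Step 5: evaluate on validation data, then on the held-out test data
val_r2 = r2_score(y_val, model.predict(X_val_s))
test_mse = mean_squared_error(y_test, model.predict(X_test_s))
test_r2 = r2_score(y_test, model.predict(X_test_s))
print(f"validation R^2={val_r2:.3f}, test MSE={test_mse:.3f}, test R^2={test_r2:.3f}")
```

The validation score is the one to look at while adjusting the model; the test score is read once at the end.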

New cards

k-NN

a simple machine learning technique used for classification and regression tasks. When making a prediction for a new data point, it looks at the k closest data points in the training dataset and uses their labels (majority vote for classification) or their values (average for regression).
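A minimal sketch of k-NN classification, assuming Euclidean distance and a majority vote. The points, labels and function names are invented for illustration.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    neighbours = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )
    top_labels = [label for _, label in neighbours[:k]]
    return Counter(top_labels).most_common(1)[0][0]

# Hypothetical training data: two well-separated clusters
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["A", "A", "A", "B", "B", "B"]

pred_a = knn_predict(points, labels, (2.0, 2.0), k=3)
pred_b = knn_predict(points, labels, (6.0, 6.5), k=3)
print(pred_a, pred_b)
```

Nothing is learned ahead of time: all the work happens inside `knn_predict`, when a query arrives.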

New cards

Pros of k-NN

Simple algorithm: it needs only a value of k (often an odd number, to avoid tied votes) and a distance function (for example, Euclidean distance).
Efficient method for small datasets.
Uses “lazy learning”: the training dataset is simply stored and is only used when making predictions. There is no separate training phase, so it is quicker to set up than Support Vector Machines (SVMs) or linear regression, although each prediction is slower.

New cards

Cons of k-NN

Large datasets take longer to process.

Requires feature scaling.

Failing to scale the features can lead to wrong predictions, because features with large numeric ranges dominate the distance calculation.

Noisy data can result in overfitting or underfitting of data.
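To illustrate the feature-scaling point, here is a small sketch assuming min-max scaling (one common choice among several). The people, ages and incomes are hypothetical. Before scaling, income dominates the distance; after scaling, both features count equally and the nearest neighbour changes.

```python
import math

def min_max_scale(rows):
    """Rescale every column to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple(
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        )
        for row in rows
    ]

def nearest_index(rows, query_index):
    """Index of the row closest (Euclidean) to rows[query_index], excluding itself."""
    query = rows[query_index]
    others = [i for i in range(len(rows)) if i != query_index]
    return min(others, key=lambda i: math.dist(rows[i], query))

# Hypothetical data: (age in years, income in dollars)
people = [(25, 50_000), (60, 51_000), (27, 54_000), (65, 90_000)]

raw_nearest = nearest_index(people, 0)                    # income dominates
scaled_nearest = nearest_index(min_max_scale(people), 0)  # both features count
print(raw_nearest, scaled_nearest)
```

On the raw values the 25-year-old's nearest neighbour is the 60-year-old with a similar income (index 1); after scaling it is the 27-year-old (index 2).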

New cards

Classification

a supervised machine learning method where the model tries to predict the correct label for a given input

New cards

Regression

a supervised machine learning technique which is used to predict continuous values

a method for understanding the relationship between independent variables and a dependent variable

New cards

Types of linear regression

Simple linear regression

Multiple linear regression
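The two types differ only in the number of independent variables. As a sketch, with notation assumed for illustration (y is the dependent variable, the x terms are independent variables, the β terms are coefficients, and ε is the error term):

```latex
\text{simple: } y = \beta_0 + \beta_1 x + \varepsilon
\qquad
\text{multiple: } y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
```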

New cards

Difference between classification and regression

Classification task: predict a discrete label

Regression task: predict a continuous numeric value

New cards

Overfitting

An undesirable machine learning behavior that occurs when the model gives accurate predictions for training data but not for new data

New cards

How to solve overfitting

Increase training data, simplify model architecture, regularize model parameters.

New cards

How to solve underfitting

Increase model complexity.

Increase the number of features by performing feature engineering.

Remove noise from the data.

Increase the number of epochs or increase the duration of training to get better results.

New cards

Underfitting

an undesirable machine learning behavior that occurs when a model is too simple to capture the complexity of the data. The model cannot learn the training data effectively, which results in poor performance on both the training and the testing data

New cards

Difference between overfitting & underfitting

Overfitting:
Model is too complex; its complexity needs to be reduced
Performs well on training data but poorly on unseen data
Training accuracy is good, but validation accuracy is poor
Can happen when the model is trained on very noisy data
Low bias, high variance

Underfitting:
Model is too simple; its complexity needs to be increased
Performs poorly on both training data and unseen data
Both training accuracy and validation accuracy are poor
Can happen when there is only a very small amount of data
High bias, low variance

New cards

Reasons for Overfitting

High variance and low bias

The model is too complex

The training dataset is too small

New cards

Reasons for underfitting

High bias and low variance

The model is too simple.

The training data has not been cleaned and contains noise.

New cards

neural network

A computational model inspired by the human brain, consisting of interconnected nodes called neurons. It learns from data through a process called training, adjusting the strength of connections between neurons to improve performance. It is used for tasks such as pattern recognition, classification, and regression.

New cards

examples of neural network applications

speech and image recognition, spam email filtering, finance, and medical diagnosis

New cards

Advantages of neural networks

Parallel processing: Can handle multiple tasks at one time
Adaptability: Can learn and improve from experience
Non-linearity: Can model complex relationships
Fault tolerance: Can still work even with damaged nodes
Real-time processing: Can make quick, real-time predictions

New cards

multilayer perceptron (MLP)

a type of artificial neural network that consists of multiple layers of interconnected nodes, known as neurons. It is commonly used for classification and regression tasks.

New cards

Activation function

is a function that calculates the output of a neuron and decides whether, and how strongly, the neuron should be activated.

It is applied to the weighted sum of the neuron's inputs (plus a bias) to produce the output signal.

Its role is to turn the input values arriving at a node into the output that is fed to the next node.

Examples: sigmoid, ReLU, and tanh.
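A minimal sketch of the three example functions and of how one of them is applied to a neuron's weighted sum. The helper `neuron_output` and the input values are assumptions made for illustration.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, output in (0, 1); written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def relu(x):
    """Pass positive inputs through unchanged; output zero otherwise."""
    return max(0.0, x)

def tanh(x):
    """Hyperbolic tangent, output in (-1, 1)."""
    return math.tanh(x)

def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical neuron: the weighted sum is 0.5*1.0 + (-1.0)*2.0 + 0.5 = -1.0
inputs, weights, bias = [1.0, 2.0], [0.5, -1.0], 0.5
for fn in (sigmoid, relu, tanh):
    print(fn.__name__, neuron_output(inputs, weights, bias, fn))
```

For the same weighted sum of -1.0, ReLU outputs zero, sigmoid outputs a value below 0.5, and tanh outputs a negative value.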

New cards

Sigmoid

a mathematical function with a characteristic S-shaped curve (the sigmoid curve). The logistic sigmoid takes any real value as input and outputs a value between 0 and 1

New cards

ReLU

is a non-linear activation function that outputs the input directly if it is positive; otherwise, it outputs zero

New cards

Tanh

similar to the sigmoid activation function and has the same S-shape. This function takes any real value as input and outputs values in the range -1 to 1

New cards

Loss function

a function that calculates the error between the actual output and the desired output in the neural network

New cards

Gradient descent

an optimization algorithm for finding a local minimum of a differentiable function.

It is used to find the values of a function's parameters (coefficients) that minimize a cost function as far as possible, by repeatedly taking small steps in the direction opposite to the gradient.
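A minimal sketch of gradient descent on a one-variable function. The function f(x) = (x - 3)^2, the learning rate and the step count are assumptions chosen so the minimum (x = 3) is easy to check by hand.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# f(x) = (x - 3)^2 has derivative f'(x) = 2 * (x - 3) and its minimum at x = 3
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(minimum)
```

With this learning rate each step shrinks the distance to the minimum by a factor of 0.8, so 100 steps land very close to 3. A learning rate that is too large would overshoot instead.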

New cards

Backpropagation learning algorithm

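calculates the output by forward calculations given the input,

then calculates the error between the actual output and the desired output,

then propagates that error backwards through the network to work out how much each weight contributed to it, and updates the weights accordingly.

The aim is to minimize a loss such as the mean squared error (MSE)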
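A minimal sketch of one backpropagation loop, assuming the smallest network that still has a hidden layer: one sigmoid hidden unit feeding one linear output. The starting weights, learning rate and single training example are made up; with one example the mean squared error is just the squared error.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_step(params, x, target, lr=0.1):
    """One forward pass, one backward pass, one weight update."""
    w1, b1, w2, b2 = params

    # Forward calculation: input -> hidden unit -> output
    h = sigmoid(w1 * x + b1)
    y = w2 * h + b2

    # Error between the actual output and the desired output
    error = y - target
    loss = error ** 2

    # Backward pass: chain rule from the loss back to every parameter
    d_y = 2 * error
    d_w2 = d_y * h
    d_b2 = d_y
    d_h = d_y * w2
    d_z = d_h * h * (1 - h)  # derivative of sigmoid is h * (1 - h)
    d_w1 = d_z * x
    d_b1 = d_z

    new_params = (w1 - lr * d_w1, b1 - lr * d_b1, w2 - lr * d_w2, b2 - lr * d_b2)
    return new_params, loss

params = (0.5, 0.0, 0.5, 0.0)  # hypothetical starting weights
losses = []
for _ in range(50):
    params, loss = train_step(params, x=1.0, target=1.0)
    losses.append(loss)
print(losses[0], losses[-1])
```

The recorded loss falls from roughly 0.47 on the first step towards zero as the updates accumulate.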

New cards

GPT1

is a language model introduced by OpenAI in 2018. It uses unsupervised learning to pre-train on a large corpus of text data and can generate coherent and contextually relevant text. It has 117 million parameters and is capable of performing tasks like text completion and text generation.

New cards

GPT2

is a large language model developed by OpenAI and released in 2019 (it is a language model, not a chatbot). It is a transformer-based model with 1.5 billion parameters in its largest version, trained on a large dataset of web text. It can generate text and, without task-specific training, attempt tasks such as translation, summarization, and question answering

New cards

GPT3

is a language processing AI model developed by OpenAI and released in 2020. It is known for its ability to generate human-like text. It has been trained on a massive amount of internet text data and can perform a wide range of language-related tasks, including translation, question-answering, and text generation. It consists of 175 billion parameters, which made it one of the largest language models at the time of its release.

New cards

Main difference between GPT1, GPT2 and GPT3

Scaling. GPT-1 had 117M parameters, GPT-2 had 1.5B parameters, and GPT-3 has 175B parameters. The increase in parameters (together with more training data) allows for more complex and better language generation, making GPT-3 the most powerful of the three.

New cards

Difference between GPT and chatGPT and OpenAI

GPT is a type of large language model developed by OpenAI. It can be used for tasks like generating text, translating languages, writing many kinds of content, and answering questions.

ChatGPT is an AI chatbot developed by OpenAI. It is a variant of GPT that has been fine-tuned to be more conversational than a plain language model.

OpenAI is the AI research organization that develops these models and publishes research on AI

New cards

Full Batch Learning

Training a machine learning model using the entire dataset in each iteration
Calculate the loss function's gradient by considering all the training examples at once
Expensive for large datasets
Provides accurate parameter updates

New cards

Mini-Batch Learning

A training technique in machine learning where the dataset is divided into smaller subsets called mini-batches.
Instead of updating the model after each individual data point, the model is updated after processing each mini-batch.
This approach balances computational efficiency and model optimization, making it suitable for large datasets.
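A sketch contrasting the two schemes on the same problem, assuming a one-feature linear model y ≈ w*x + b trained with MSE. The data, learning rate, epoch count and batch size are invented for illustration. The only difference between the two runs is `batch_size`: the full dataset gives one update per epoch, mini-batches of 4 give five.

```python
import random

def gradient(w, b, batch):
    """Gradient of the mean squared error of y ≈ w*x + b over one batch."""
    n = len(batch)
    dw = sum(2 * (w * x + b - y) * x for x, y in batch) / n
    db = sum(2 * (w * x + b - y) for x, y in batch) / n
    return dw, db

def train(data, batch_size, epochs=500, lr=0.1, seed=0):
    """Return (w, b, number of parameter updates performed)."""
    rng = random.Random(seed)
    w, b, updates = 0.0, 0.0, 0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        # One update per batch; with batch_size == len(data) this is full-batch learning
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]
            dw, db = gradient(w, b, batch)
            w -= lr * dw
            b -= lr * db
            updates += 1
    return w, b, updates

# Hypothetical noise-free data on the line y = 2x + 1
data = [(i / 10, 2 * (i / 10) + 1) for i in range(20)]

full = train(data, batch_size=len(data))  # full batch learning
mini = train(data, batch_size=4)          # mini-batch learning
print(full, mini)
```

Both runs recover w ≈ 2 and b ≈ 1. The mini-batch run makes five times as many updates per pass over the data, each computed from only four examples.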

New cards

Optimizer

In machine learning, an algorithm that updates a model's parameters during training so as to minimize the loss function. Examples: stochastic gradient descent (SGD) and Adam.

New cards

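Hyperparameters

are settings chosen before training (for example the learning rate, the batch size, the number of epochs, or k in k-NN). Their values control the learning process and so influence the values of the model parameters that a learning algorithm ends up learning.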

New cards

Mean Squared Error (MSE)

Measures the average squared difference between predicted and actual values.

New cards

Binary Cross-Entropy

Used in binary classification problems to measure the dissimilarity between predicted and actual class probabilities.

New cards

Categorical Cross-Entropy

Used in multi-class classification problems to measure the dissimilarity between predicted and actual class probabilities.
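A minimal sketch of the two cross-entropy losses, assuming labels are given as 0/1 values (binary case) or one-hot lists (multi-class case). The function names and the small `eps` clipping constant are illustrative choices, used to avoid taking log(0).

```python
import math

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def categorical_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average of -sum(t * log(p)) over all examples; y_true rows are one-hot."""
    total = 0.0
    for one_hot, probs in zip(y_true, p_pred):
        total += -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))
    return total / len(y_true)

# Hypothetical predictions
good = binary_cross_entropy([1, 0], [0.9, 0.1])  # confident and right
bad = binary_cross_entropy([1, 0], [0.1, 0.9])   # confident and wrong
multi = categorical_cross_entropy([[0, 1, 0]], [[0.2, 0.7, 0.1]])
print(good, bad, multi)
```

The loss is small when the predicted probability of the true class is high, and it grows quickly when the model is confidently wrong.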

New cards

Mean Absolute Error (MAE)

Measures the average absolute difference between predicted and actual values.
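MSE and MAE can be compared on the same predictions. This is a minimal sketch with made-up values; the function names are illustrative.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical values: the errors are 1, 0, -2, 0
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 7.0]

mse = mean_squared_error(actual, predicted)   # (1 + 0 + 4 + 0) / 4 = 1.25
mae = mean_absolute_error(actual, predicted)  # (1 + 0 + 2 + 0) / 4 = 0.75
print(mse, mae)
```

Because MSE squares each error, the single error of 2 contributes four times as much as the error of 1, so MSE penalizes large mistakes more heavily than MAE does.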

New cards

List some loss functions

Mean Squared Error (MSE), Binary Cross-Entropy, Categorical Cross-Entropy, Mean Absolute Error (MAE), Hinge Loss, Log Loss, Kullback-Leibler Divergence