Week 1
What Is Machine Learning?
A computer program is said to learn from an experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
Traditional Approach to AI
Typically uses a list of handwritten logical rules
Works well if the domain is simple and well understood
Makes extensive use of the designer's domain knowledge
Machine Learning Approach to AI
Typically results in models (e.g. mathematical) of the domain
Because training takes a long time, it provides better value when the domain is complex, not well understood and has lots of data
Useful where the rules of the domain are not very clearly defined
Four Components of ML
Assumption
Model
Inference Paradigm
Inference Engine
Assumption
What we think the world looks like.
Eg An apple's drop height and its time to reach the ground are related
Model
A way of expressing the thought (mathematically)
Eg The relationship between an apple's height and time to the ground is linear, quadratic, etc.
Inference Paradigm
A framework for matching the model to the world
Eg The difference between measured and predicted time to the ground, which acts as the performance metric
Inference Engine
A way of doing the matching
Eg Tweaking the model coefficients - adjusting the model to new data
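The four components can be sketched together on the apple example. This is only an illustration under stated assumptions: the measurements are made up, the model is assumed to be linear, mean squared error stands in for the matching framework, and plain gradient descent stands in for the engine that does the matching.

```python
# Assumption: fall time depends on drop height (hypothetical measurements).
heights = [1.0, 2.0, 3.0, 4.0, 5.0]        # metres
times = [0.45, 0.64, 0.78, 0.90, 1.01]     # seconds, made-up data


def model(height, a, b):
    """Model: a linear relationship, time = a * height + b."""
    return a * height + b


def mean_squared_error(a, b):
    """Paradigm: average squared gap between measured and predicted time."""
    return sum((model(h, a, b) - t) ** 2 for h, t in zip(heights, times)) / len(heights)


def fit(steps=5000, learning_rate=0.01):
    """Engine: gradient descent that tweaks a and b to reduce the error."""
    a, b = 0.0, 0.0
    n = len(heights)
    for _ in range(steps):
        grad_a = sum(2 * (model(h, a, b) - t) * h for h, t in zip(heights, times)) / n
        grad_b = sum(2 * (model(h, a, b) - t) for h, t in zip(heights, times)) / n
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


a, b = fit()
```

Swapping the linear model for a quadratic one would change only `model`; the error measure and the fitting loop express the other two components and could stay as they are in spirit.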
Types of training (4)
Supervised
Unsupervised
Semi-supervised
Reinforcement Learning
Supervised Learning
Learn from labelled data
Give the ML algorithm a set of examples with associated labels
Classification and Regression
Classification
Goal: Predict a label or category
Output is from a fixed set of options
A type of supervised learning
Using the labelled examples it has learnt from, the model can classify new instances.
Eg. Spam filter
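A toy version of the spam filter idea, as a sketch only. The messages and labels are made up, and the word-overlap score is a deliberately simple stand-in for whatever a real filter would use.

```python
from collections import Counter

# Hypothetical labelled examples: (message, label)
training_data = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting moved to monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(text.split())
    return counts


def classify(counts, text):
    """Pick the label whose training words overlap most with the message."""
    scores = {
        label: sum(word_counts[word] for word in text.split())
        for label, word_counts in counts.items()
    }
    return max(scores, key=scores.get)


counts = train(training_data)
prediction = classify(counts, "free prize inside")
```

The output is always one of the labels seen in training, which is what makes this classification rather than regression.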
Regression
Goal: Predict a continuous number
Output can be any value along a range
Developing a model that predicts the value of an item, e.g. a house price given the number of rooms as a predictor
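A minimal sketch of the house price example using ordinary least squares on a single predictor. The room counts and prices are invented and perfectly linear, so the fit is exact; real data would not be.

```python
# Made-up data: number of rooms and price in thousands.
rooms = [2, 3, 4, 5, 6]
prices = [150, 200, 250, 300, 350]


def fit_line(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(rooms, prices)
predicted_price = slope * 7 + intercept   # prediction for an unseen 7-room house
```

The prediction can be any value along a range, not one of a fixed set of categories.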
Unsupervised Learning
Find patterns in unlabelled data
Clustering
Visualisation and dimensionality reduction
Association rule learning
Clustering
Group similar datapoints together
Eg group customers by buying behaviour
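One way to group similar datapoints is k-means, sketched here in one dimension. The technique is an assumption (the notes do not name an algorithm), and the spend figures and starting centroids are made up.

```python
def k_means_1d(values, centroids, iterations=10):
    """Repeatedly assign each value to its nearest centroid, then move centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer; no labels are provided.
monthly_spend = [12, 15, 14, 90, 95, 88, 13, 92]
centroids, clusters = k_means_1d(monthly_spend, centroids=[0.0, 100.0])
```

No labels are given; the two groups (low and high spenders) emerge from the data alone.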
Visualisation and dimensionality reduction
Goal: Simplify complex data while keeping important information
A high number of variables is hard to visualise and process
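A small sketch of the idea: project 2D points onto the single direction of greatest variance, so two variables become one. This is a principal-component style projection chosen for illustration; the points are made up and lie exactly on a line, so nothing is lost here.

```python
import math

# Made-up 2D points that lie exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def project_to_1d(points):
    """Return the 1D coordinates along the main axis and the total variance."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centred = [(x - mean_x, y - mean_y) for x, y in points]
    sxx = sum(x * x for x, _ in centred) / n
    syy = sum(y * y for _, y in centred) / n
    sxy = sum(x * y for x, y in centred) / n
    # Angle of the direction of greatest variance for a 2x2 covariance matrix.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    ux, uy = math.cos(angle), math.sin(angle)
    projected = [x * ux + y * uy for x, y in centred]
    return projected, sxx + syy


projected, total_variance = project_to_1d(points)
kept_variance = sum(p * p for p in projected) / len(projected)
```

Comparing `kept_variance` with `total_variance` shows how much information the single remaining variable keeps.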
Association Rule learning
Find interesting relationships between variables
Eg people who purchase BBQ sauce and potato chips tend to buy steak
Discovers rules like:
“If A happens, B is likely to happen”
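A rule of this form is commonly scored with support and confidence. The sketch below computes both for the BBQ example; the baskets are invented, and the function names are illustrative rather than taken from any library.

```python
# Made-up shopping baskets.
baskets = [
    {"bbq sauce", "potato chips", "steak"},
    {"bbq sauce", "potato chips", "steak", "beer"},
    {"bbq sauce", "potato chips"},
    {"bread", "milk"},
    {"potato chips", "beer"},
]


def support(itemset):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(1 for basket in baskets if itemset <= basket) / len(baskets)


def confidence(antecedent, consequent):
    """Of the baskets containing A, the fraction that also contain B."""
    return support(antecedent | consequent) / support(antecedent)


rule_support = support({"bbq sauce", "potato chips", "steak"})
rule_confidence = confidence({"bbq sauce", "potato chips"}, {"steak"})
```

Here the rule "BBQ sauce and potato chips, then steak" holds in two of the three baskets that contain the first two items.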
Semi-supervised Learning
A small amount of labelled data and a large amount of unlabelled data
Small amount of labelled data used to initially train the system
System is then used to classify the unlabelled data
Benefits of Semi-supervised Learning
Improved learning accuracy over unsupervised learning, but without the time and costs needed for supervised learning
Often used when lots of unlabelled data is available from a domain but labelling it is costly in terms of time
Eg Tagging a few friends and family members in a photos app so that the algorithm can predict who is who in the remaining photos
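The two steps (train on the few labelled points, then label the rest) can be sketched with a nearest-centroid model. The data is made up, the model choice is an assumption, and the final retraining step on the pseudo-labelled data is an optional extra (often called self-training) that the notes do not mention.

```python
# Small labelled set and larger unlabelled set (both made up).
labelled = [(1.0, "A"), (1.5, "A"), (9.0, "B"), (9.5, "B")]
unlabelled = [2.0, 2.5, 8.0, 8.5, 3.0, 7.5]


def centroids(examples):
    """Train: compute the mean value of each label."""
    sums, counts = {}, {}
    for x, label in examples:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(model, x):
    """Classify: pick the label whose centroid is closest to x."""
    return min(model, key=lambda label: abs(x - model[label]))


model = centroids(labelled)                              # step 1: train on labelled data
pseudo = [(x, predict(model, x)) for x in unlabelled]    # step 2: label the unlabelled data
model = centroids(labelled + pseudo)                     # optional: retrain on everything
```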
Reinforcement Learning
Learn by trial and error with rewards and punishments
The agent interacts with the environment: it takes actions and receives feedback
Elements of Reinforcement Learning
Environment - Physical world in which the agent operates
State - Current Situation of the agent
Reward - Feedback from the environment
Policy - method to map the agent's state to actions
Value - expected future reward that an agent would receive by taking an action in a particular state
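The elements above can be seen in a tiny tabular Q-learning sketch. Q-learning is one possible algorithm, chosen here for illustration; the five-state corridor, the reward of 1 at the right end, and all constants are made up.

```python
import random

N_STATES = 5            # states 0..4; reaching state 4 ends the episode
ACTIONS = [-1, +1]      # move left or move right
GAMMA, ALPHA = 0.9, 0.5


def step(state, action):
    """Environment: returns (next_state, reward) for an action taken in a state."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    """Learn a value for every (state, action) pair by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)              # explore randomly
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q


def policy(q, state):
    """Policy: map a state to the action with the highest learnt value."""
    return max(ACTIONS, key=lambda a: q[(state, a)])


q = train()
```

After training, `policy(q, s)` picks "move right" in every non-terminal state, because that is the action whose learnt value is highest.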
How ML systems use data to learn - two types
Batch Learning - Offline Learning
Learning on the fly - Online Learning
Batch Learning
The system is trained offline using all available data. Once trained, it is launched into production and does not learn any further.
Advantages/Disadvantages of Batch Learning
Advantage:
Problems in data are dealt with before deployment
Disadvantage:
Training can take a long time and requires lots of data
Risk of overfitting to the training domain
Uses a lot of computing resources
Learning on the fly/Online Learning
The system is fed data instances sequentially, either individually or in small groups called mini-batches
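A minimal sketch of learning from mini-batches as data arrives. It assumes a one-weight model y = w * x updated by a gradient step per mini-batch; the stream, batch size and learning rate are made up.

```python
def mini_batches(stream, size):
    """Group an incoming stream of instances into small batches."""
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def online_fit(stream, batch_size=2, learning_rate=0.05):
    """Update the weight after every mini-batch instead of waiting for all data."""
    w = 0.0
    for batch in mini_batches(stream, batch_size):
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= learning_rate * grad
    return w


# Hypothetical stream where the true relationship is y = 3x.
stream = ((x, 3.0 * x) for x in [1, 2, 1, 2] * 25)
w = online_fit(stream)
```

Each update touches only one mini-batch, so the full dataset never has to be held in memory at once.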
Advantages/Disadvantages of Online Learning
Advantages:
Each learning step is fast and cheap so the system can learn about new data on the fly as it arrives
Does not require a lot of training data
Models adapt over time, so they are less likely to stay overfitted to outdated data
Disadvantages:
Prone to learning a wrong model when the incoming data contains errors