Machine Learning (ML) is a subfield of AI in which computers learn from data.
Instead of being directly programmed with explicit instructions, an ML system automatically learns rules from data to improve its predictions and decisions.
Learning: Getting skills from experience.
Machine Learning: Getting skills from data.
Skill: Improving how well something performs (like accuracy).
Example: In healthcare, ML uses patient data to predict health outcomes.
Some problems are hard to solve with simple rules.
Example: Recognizing trees is difficult to program by hand.
ML can automatically learn rules from images to recognize trees.
ML helps build complex systems.
Use Cases:
Navigating on Mars, where humans can't easily define a solution by hand.
Recognizing speech or images, where fast decisions are needed.
High-speed trading.
There’s a pattern to learn, improving performance.
No direct programming is possible, so use ML.
Data is available for ML.
Food Data: Predict food poisoning risk using Twitter data.
Clothing Data: Suggest clothes using sales and surveys.
Housing Data: Predict energy use of buildings.
Transportation Data: Recognize traffic signs.
Recommender System Data: Predict movie ratings.
Data: Age, Gender, Salary, Years at Address, Debt.
Output: Credit Approval (yes/no).
Basic Notations:
Input: x (customer applications).
Output: y (good/bad credit risk).
Target function: f: X \rightarrow Y (the ideal, unknown approval formula).
Training examples: D = {(x_1, y_1), (x_2, y_2), …, (x_n, y_n)} (bank records).
Final hypothesis: g \approx f (the learned formula).
ML uses the data D to find a hypothesis g that is close to f.
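As a toy sketch of this setup (all feature values are made up for illustration): a few labeled records play the role of D, and g is a simple nearest-neighbor rule that imitates the unknown f.

```python
import numpy as np

# Hypothetical bank records D = {(x_i, y_i)}: features are (age, salary in k$),
# labels are +1 (good credit risk) or -1 (bad). All values are invented.
X = np.array([[25, 30], [40, 80], [35, 60], [22, 18], [50, 90]], dtype=float)
y = np.array([-1, +1, +1, -1, +1])

def g(x_new):
    """Learned hypothesis g: predict the label of the nearest training example."""
    distances = np.linalg.norm(X - x_new, axis=1)
    return y[np.argmin(distances)]

print(g(np.array([45.0, 85.0])))  # near the high-salary "good" examples -> +1
```

Here g is not f itself; it is an approximation built purely from the data D, which is exactly the point of the diagram above.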
Supervised Learning
Unsupervised Learning
Semi-Supervised Learning
Reinforcement Learning
Supervised Learning:
Has labels.
Gets direct feedback.
Predicts outcomes.
Unsupervised Learning:
No labels.
No feedback.
Finds hidden structures.
Labeled data --> ML Algorithm --> Model
Learn a model to predict future data using labeled data.
Supervised means we know the correct answers.
Classification
Predict categories using past data.
Categories are distinct groups.
Example: Spam filtering.
Binary Classification
y = {+1, -1}
Two categories.
Learn a boundary to separate classes.
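A minimal sketch of learning such a boundary, using the classic perceptron update rule on made-up, linearly separable 2D data (with y = {+1, -1} as above):

```python
import numpy as np

# Made-up, linearly separable 2D points; the last column is a constant 1
# acting as a bias term, so the boundary is the line w·x = 0.
X = np.array([[2.0, 3.0, 1.0],
              [1.0, 2.0, 1.0],
              [-1.0, -2.0, 1.0],
              [-2.0, -1.0, 1.0]])
y = np.array([+1, +1, -1, -1])

w = np.zeros(3)
for _ in range(100):                 # repeat passes until no mistakes remain
    mistakes = 0
    for x_i, y_i in zip(X, y):
        if np.sign(w @ x_i) != y_i:  # misclassified point
            w += y_i * x_i           # perceptron update rule
            mistakes += 1
    if mistakes == 0:
        break

print(np.sign(X @ w))  # predictions now match the labels y
```

On separable data like this the loop converges quickly; the learned weight vector w defines the separating boundary.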
Multiclass Classification
y = {1, 2, …, K}
Assign each new example to one of the K categories seen in the training data.
Example: Character recognition, email sorting.
Regression
y = [a, b] \subset \mathbb{R}
Predict continuous results.
Find relationships between variables.
Example: Predict student scores based on study time.
Regression Details
Fit a line that minimizes the distance (error) between the data points and the line.
Use the line to predict new outcomes.
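The two steps above can be sketched with a least-squares line fit; the (study hours, exam score) pairs are hypothetical:

```python
import numpy as np

# Fit a least-squares line to made-up (study hours, exam score) pairs,
# then use it to predict a new outcome. np.polyfit minimizes squared error.
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.array([52.0, 58.0, 65.0, 70.0, 78.0])

slope, intercept = np.polyfit(hours, scores, deg=1)
predict = lambda h: slope * h + intercept

print(round(predict(6.0), 1))  # predicted score for 6 hours of study -> 83.8
```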
Supervised: knows the answer.
Reinforcement: defines a reward.
Unsupervised: uses unlabeled data.
Explore data structure without a known outcome.
Unsupervised Learning Problems
Clustering: {x_i} \Rightarrow {C_i}, like "unsupervised classification". Example: articles to topics.
Density Estimation: {x_i} \Rightarrow p(x), like "unsupervised regression". Example: traffic reports -> dangerous areas.
Outlier Detection: {x_i} \Rightarrow {0, 1}, like "unsupervised binary classification". Example: internet logs -> intrusion alert.
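The outlier-detection mapping {x_i} \Rightarrow {0, 1} can be sketched with a simple z-score rule (the data is invented; imagine request sizes from internet logs):

```python
import numpy as np

# Flag points whose z-score (distance from the mean, measured in standard
# deviations) exceeds a threshold. 1 = outlier, 0 = normal.
x = np.array([10.0, 12.0, 11.0, 9.0, 10.0, 95.0, 11.0, 10.0])

z = np.abs(x - x.mean()) / x.std()
flags = (z > 2.0).astype(int)
print(flags)  # only the unusual value 95.0 is flagged
```

Real intrusion-detection systems use far richer models, but the input/output shape is the same: unlabeled points in, a 0/1 flag per point out.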
Clustering
Organize data into subgroups without knowing groups beforehand.
Clusters group similar objects.
Good for structuring info.
Example: Finding customer groups.
Other examples: search engines, image analysis.
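The customer-grouping example can be sketched with k-means from scikit-learn; the (annual spend, visits per month) values are purely illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

# Cluster made-up customer data into two subgroups without any labels.
customers = np.array([[100, 2], [120, 3], [110, 2],      # low-spend customers
                      [900, 20], [950, 22], [880, 18]])  # high-spend customers

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)  # same cluster label within each discovered group
```

Note that the algorithm is never told which customers belong together; it discovers the two groups from the data alone.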
Unsupervised Dimensionality Reduction
High-dimensional data is hard to store and process.
Reduce noise and compress data while keeping important info.
Data visualization: Show high-dimensional data in 2D or 3D plots.
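A minimal sketch of this, using PCA from scikit-learn on the 4-dimensional Iris features (the plotting itself is omitted):

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

# Reduce the 4-dimensional Iris features to 2 dimensions so the
# data can be shown on a 2D scatter plot.
X = load_iris().data            # shape (150, 4)
X_2d = PCA(n_components=2).fit_transform(X)
print(X_2d.shape)               # (150, 2)
```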
Uses some labeled and more unlabeled data.
Examples:
Face images with some labels -> face identifier.
Health data with some labels -> predict medicine effects.
Avoids expensive labeling.
Iris Dataset example
Samples (instances)
Features (attributes)
Class labels (targets)
Example: Sepal length, Sepal width, Petal length, Petal width, Class labels (Setosa, Versicolor, Virginica)
X = \begin{bmatrix} x_{11} & x_{12} & x_{13} & x_{14} \\ x_{21} & x_{22} & x_{23} & x_{24} \\ \vdots & \vdots & \vdots & \vdots \\ x_{150,1} & x_{150,2} & x_{150,3} & x_{150,4} \end{bmatrix}
y = {Setosa, Versicolor, Virginica}
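The dataset described above ships with scikit-learn, so the 150×4 matrix X and the three class labels can be inspected directly:

```python
from sklearn.datasets import load_iris

# Load the Iris dataset: 150 samples, 4 features, 3 class labels.
iris = load_iris()
print(iris.data.shape)          # (150, 4): the matrix X above
print(iris.feature_names)       # sepal/petal length and width
print(list(iris.target_names))  # ['setosa', 'versicolor', 'virginica']
```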
Raw Data --> Preprocessing --> Training & Test Datasets --> Training, Evaluation, Prediction
Preprocessing: cleaning data.
Learning Algorithm: picking the right method.
Model Selection: choosing the best model.
Data preprocessing is very important.
Raw data is often not ready for ML.
Many ML algorithms perform best when features are on the same scale.
Some features may be highly correlated and therefore redundant.
Reduce dimensions to compress features.
Split data into training and test sets.
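The split is typically done with scikit-learn's train_test_split; here a 70/30 split of the Iris data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Hold out 30% of the data as a test set; random_state makes it reproducible.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)
print(X_train.shape, X_test.shape)  # (105, 4) (45, 4)
```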
No Free Lunch Theorems: no single algorithm works best for every problem.
Each algorithm has biases; compare different ones to pick the best.
Measure performance with specific metrics.
Use cross-validation: split data to check how well the model works.
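A sketch of 5-fold cross-validation with scikit-learn (the choice of a 3-nearest-neighbors classifier is just an example):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# 5-fold CV: split the data into 5 parts, train on 4 and evaluate on the
# held-out part, rotating 5 times; report the average accuracy.
X, y = load_iris(return_X_y=True)
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X, y, cv=5)
print(scores.mean())  # average accuracy over the 5 folds
```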
Test the model on unseen data.
If it works well, use it to predict new data.
Apply preprocessing steps from training to new data.
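This last point matters in practice: preprocessing parameters are learned from the training data only and then reused unchanged. A minimal sketch with StandardScaler (tiny made-up numbers):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# The scaler's parameters (mean, std) are estimated from the training data
# only; the SAME transformation is then applied to new/test data.
X_train = np.array([[1.0], [2.0], [3.0]])
X_new = np.array([[2.0]])

scaler = StandardScaler().fit(X_train)  # learns mean=2.0 from the training set
print(scaler.transform(X_new))          # new data scaled with training stats
```

Fitting the scaler on the test or new data instead would leak information and make the evaluation misleading.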
Python is good for data science with many libraries.
Python can be slow, but libraries like NumPy and SciPy make it faster.
NumPy
SciPy
Pandas
Scikit-learn
TensorFlow
Keras
Visualization: matplotlib, Seaborn
Resources: Python in ML, TensorFlow course
Jupyter Notebook: A tool to write and share code, equations, and visuals.
Uses your computer's resources.
Install libraries with pip or Anaconda.
Colab: Google's cloud-based notebook.
Write and run Python code in a browser.
Edit together with others.
Save notebooks to Google Drive/GitHub.
Free to use with free GPUs.
Has common libraries pre-installed.