Meru University Machine Learning Exam Notes 2021/2022
Machine Learning Strategies
Supervised Learning: Learning from labeled data.
Unsupervised Learning: Learning from unlabeled data, finding patterns.
Reinforcement Learning: Learning through trial and error, receiving rewards or penalties.
Steps in Designing a Learning System
Define the problem.
Gather and prepare data.
Select a learning algorithm.
Train the model.
Evaluate and tune the model.
Deploy the model.
Application Areas of Machine Learning
Healthcare: Disease prediction and diagnostics.
Finance: Fraud detection and credit scoring.
Retail: Customer segmentation and recommendation systems.
Decision Tree Algorithms
Basic idea: A tree structure where internal nodes represent tests on attributes, branches represent the outcomes of those tests, and leaf nodes hold the predicted class, typically the majority class of the training examples that reach the leaf.
Bank System and Requirements
Before Deployment:
Ample historical data for model training.
Problems:
Data privacy and compliance issues.
Data bias leading to flawed predictions.
Machine Learning in Computer Vision
Example: Facial recognition involves identifying human faces in images using patterns learned from data.
Bias vs. Variance
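Bias: Error due to overly simplistic assumptions in the learning algorithm, which leads to underfitting.
Variance: Error due to the model's sensitivity to small fluctuations in the training data. Overly complex models tend to have high variance, which leads to overfitting.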
Dimensionality Reduction
Definition: The process of reducing the number of input variables in a dataset.
Methods:
Principal Component Analysis (PCA).
Singular Value Decomposition (SVD).
t-Distributed Stochastic Neighbor Embedding (t-SNE).
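As a rough illustration of how PCA and SVD relate, the sketch below computes principal components by applying SVD to mean-centered data. It assumes NumPy is available; the function name pca and the synthetic data are illustrative choices, not part of these notes.

```python
import numpy as np

def pca(X, n_components):
    """Project X onto its top principal components using SVD."""
    # PCA requires mean-centered data.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured by each retained direction.
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    projected = X_centered @ components.T
    return projected, components, explained_variance[:n_components]

# Synthetic 3-D data that mostly varies along one direction (assumed example).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + 0.1 * rng.normal(size=200),
    rng.normal(scale=0.05, size=200),
])

Z, components, variance = pca(X, n_components=1)
print(Z.shape)  # (200, 1)
```

Here three input variables are reduced to one while keeping the direction along which the data varies most.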
Find-S Algorithm
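A method used in concept learning to find the most specific hypothesis consistent with all positive training examples. It starts from the most specific hypothesis and generalizes it only as far as each positive example requires. Negative examples are ignored, so the result is consistent with them only when the data are noise-free and the target concept lies in the hypothesis space.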
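A minimal sketch of Find-S for conjunctive hypotheses over discrete attributes. The representation ("?" meaning any value is accepted) and the toy weather-style examples are assumptions made for illustration.

```python
def find_s(examples):
    """examples: list of (attribute_tuple, is_positive) pairs."""
    hypothesis = None  # None stands for the most specific hypothesis (covers nothing)
    for attributes, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative examples
        if hypothesis is None:
            # The first positive example becomes the initial hypothesis.
            hypothesis = list(attributes)
        else:
            # Generalize only the attributes that disagree with the example.
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

# Hypothetical training data: (Sky, Temperature, Humidity, Wind), label
training = [
    (("Sunny", "Warm", "Normal", "Strong"), True),
    (("Sunny", "Warm", "High", "Strong"), True),
    (("Rainy", "Cold", "High", "Strong"), False),
    (("Sunny", "Warm", "High", "Weak"), True),
]

print(find_s(training))  # ['Sunny', 'Warm', '?', '?']
```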
Decision Tree Attributes for Tennis Game Prediction
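Entropy Calculation: Entropy measures the impurity of a set of examples. At each split, the attribute with the highest information gain (the largest reduction in entropy) is selected.
Tree Pruning: A technique to reduce the complexity of a decision tree by removing branches that contribute little to accuracy on unseen data.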
Methods: Cost complexity pruning and reduced error pruning.
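The entropy and information gain calculation can be sketched as follows. The 14-example Outlook/Play data is the commonly used textbook play-tennis table (9 "Yes", 5 "No"); it is assumed here for illustration and is not given in these notes.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Reduction in entropy obtained by splitting labels on an attribute."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [label for x, label in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Assumed example data: Outlook attribute and PlayTennis label.
outlook = ["Sunny"] * 5 + ["Overcast"] * 4 + ["Rain"] * 5
play = (["No", "No", "No", "Yes", "Yes"]      # Sunny
        + ["Yes"] * 4                          # Overcast
        + ["Yes", "Yes", "Yes", "No", "No"])   # Rain

print(f"{entropy(play):.3f}")                    # 0.940
print(f"{information_gain(outlook, play):.3f}")  # 0.247
```

An attribute whose split leaves purer subsets, such as Overcast containing only "Yes", yields a higher gain and is preferred near the root.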
Candidate Elimination Algorithm
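Derives the version space, the set of all hypotheses consistent with the training instances, by maintaining two boundaries: the most specific hypotheses (S) and the most general hypotheses (G). Positive examples generalize S; negative examples specialize G.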
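A simplified sketch of candidate elimination for conjunctive hypotheses, where the specific boundary S can be kept as a single hypothesis. The attribute domains and examples are assumed toy data in the style of the common "enjoy sport" textbook exercise, and the sketch assumes noise-free data (it does not detect a collapsed version space).

```python
def covers(general, specific):
    """True if `general` accepts everything `specific` accepts ('?' = any value)."""
    return all(g == "?" or g == s for g, s in zip(general, specific))

def candidate_elimination(examples, domains):
    S = None                          # specific boundary: covers nothing yet
    G = [("?",) * len(domains)]       # general boundary: covers everything
    for x, positive in examples:
        if positive:
            # Drop general hypotheses that fail to cover the positive example.
            G = [g for g in G if covers(g, x)]
            # Minimally generalize S so that it covers x.
            S = tuple(x) if S is None else tuple(
                s if s == v else "?" for s, v in zip(S, x))
        else:
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)   # already excludes the negative example
                    continue
                # Minimally specialize g: fix one '?' to a value that differs from x.
                for i, values in enumerate(domains):
                    if g[i] != "?":
                        continue
                    for v in values:
                        candidate = g[:i] + (v,) + g[i + 1:]
                        # Keep it only if it excludes x and is still more general than S.
                        if v != x[i] and (S is None or covers(candidate, S)):
                            new_G.append(candidate)
            new_G = list(dict.fromkeys(new_G))  # remove duplicates, keep order
            # Keep only maximally general hypotheses.
            G = [g for g in new_G
                 if not any(h != g and covers(h, g) for h in new_G)]
    return S, G

# Assumed attribute domains and training examples.
domains = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change")]
examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

S, G = candidate_elimination(examples, domains)
print(S)  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
print(G)
```

Every hypothesis lying between S and G in generality is consistent with the four examples.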
Overfitting in Machine Learning
Definition: A model performs well on training data but poorly on unseen data.
Causes: Excessive model complexity, noise in the training data, insufficient training data.
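One way to observe overfitting is to fit polynomials of increasing degree to a small noisy sample and compare training and test error. The sketch below assumes NumPy and uses synthetic data (a noisy sine curve) chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    """Noisy samples of sin(pi * x) on [-1, 1] (synthetic, assumed example)."""
    x = rng.uniform(-1, 1, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on (x, y)."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

x_train, y_train = make_data(12)    # deliberately small training set
x_test, y_test = make_data(200)     # unseen data from the same process

results = {}
for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(f"degree={degree}  train MSE={results[degree][0]:.4f}  "
          f"test MSE={results[degree][1]:.4f}")
```

Training error can only fall as the degree rises, because each higher-degree model contains the lower-degree ones. A widening gap between training and test error is the sign of overfitting.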
Cross Validation vs. Hyper-Parameter Optimization
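Cross Validation: Technique for assessing how well a model will generalize to an independent dataset, typically by repeatedly splitting the data into training and validation folds and averaging the validation error.
Hyper-Parameter Optimization: The process of choosing the settings that are fixed before training (for example the learning rate, tree depth, or regularization strength) to improve performance. Unlike model parameters, hyper-parameters are not learned from the data during training. Cross validation is commonly used to score each candidate setting.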
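The two ideas are often combined: each candidate hyper-parameter value is scored by cross validation and the best-scoring value is kept. The sketch below does this for the regularization strength alpha of ridge regression, using NumPy only. The model choice, the candidate values, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds, rng):
    """Shuffle the sample indices and split them into n_folds groups."""
    return np.array_split(rng.permutation(n_samples), n_folds)

def fit_ridge(X, y, alpha):
    """Closed-form ridge regression weights (no intercept term)."""
    return np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ y)

def cross_val_mse(X, y, alpha, folds):
    """Average validation MSE over the folds for one value of alpha."""
    errors = []
    for i, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w = fit_ridge(X[train_idx], y[train_idx], alpha)
        errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(errors))

# Synthetic regression data (assumed example).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
y = X @ np.array([1.0, -2.0, 0.0, 0.0, 0.5]) + rng.normal(scale=0.5, size=60)

folds = k_fold_indices(len(y), n_folds=5, rng=rng)

# Grid search: score every candidate hyper-parameter with 5-fold cross validation.
scores = {alpha: cross_val_mse(X, y, alpha, folds)
          for alpha in (0.01, 0.1, 1.0, 10.0, 100.0)}
best_alpha = min(scores, key=scores.get)
print(best_alpha, scores[best_alpha])
```

The same folds are reused for every candidate so that the scores are comparable.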
Deep Learning Concept
A subset of machine learning involving neural networks with many layers, designed to automatically learn representations from data.
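Distance Measures in K-Nearest Neighbors
Common measures are Euclidean, Manhattan, Minkowski, and Hamming distance. They determine which training points are closest to a query point. Minkowski distance generalizes Manhattan (p = 1) and Euclidean (p = 2); Hamming distance counts differing positions and suits categorical or binary attributes.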
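The four distance measures can be written in a few lines each. This is a plain-Python sketch; the function names are illustrative.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length numeric vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def manhattan(a, b):
    """Minkowski distance with p = 1."""
    return minkowski(a, b, 1)

def euclidean(a, b):
    """Minkowski distance with p = 2."""
    return minkowski(a, b, 2)

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

print(euclidean((0, 0), (3, 4)))      # 5.0
print(manhattan((0, 0), (3, 4)))      # 7.0
print(hamming("karolin", "kathrin"))  # 3
```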
Bayes Theorem in AI
Definition: A mathematical formula that describes how to update the probabilities of hypotheses given new evidence.
Application: Used in probabilistic models such as Naive Bayes classification in machine learning to make predictions based on prior knowledge and observed data.
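Bayes theorem states P(H | E) = P(E | H) * P(H) / P(E). The sketch below applies it to a diagnostic test; the prevalence, sensitivity, and false positive rate are hypothetical numbers chosen only to show the update from prior to posterior.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) for a binary hypothesis H given evidence E.

    prior:               P(H)
    likelihood:          P(E | H)
    false_positive_rate: P(E | not H)
    """
    # Total probability of the evidence, P(E).
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical figures: 1% prevalence, 95% sensitivity, 5% false positive rate.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"{p:.3f}")  # 0.161
```

Even with an accurate test, the low prior keeps the posterior probability at about 16%. Naive Bayes applies the same update and additionally assumes that the features are conditionally independent given the class.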