Chapter 7: Machine Learning (Supervised) & Chapter 8: Unsupervised Learning


22 Terms

1

What is machine learning?

Machine learning is a set of methods that can automatically detect patterns in data, and then use the uncovered patterns to predict future data, or to perform other kinds of decision making under uncertainty.

2

What are the two types of machine learning?

  • Supervised

  • Unsupervised

3

What is classification/prediction?

  • Classification/prediction is analogous to how humans learn from past experience.

  • A computer has no “experience” of its own, so it learns from data, which represents the “past experiences” of an application domain.

4

What is the general flow process of supervised learning?

i) Training data (text, documents, images, sounds, …) → feature vectors + labels

ii) Feature vectors and labels → machine learning algorithm

iii) Machine learning algorithm → predictive model

iv) New text, document, image, or sound → feature vector → predictive model → expected label

5

Where can datasets be retrieved from?

  • Public datasets

  • Data marketplace

  • Company and organization datasets

  • Web scraping

6

What are the activities carried out in preprocessing?

Data Cleaning - Handle missing values by either removing the corresponding samples or filling them in with techniques such as mean, median, or mode imputation. Address outliers using Winsorization or outlier imputation.

Data Integration - If you have multiple data sources or datasets, you may need to integrate them into a single dataset. This typically involves handling inconsistencies in attribute names, resolving conflicts in data formats, and merging the data based on common identifiers.

Data Transformation - Common transformations include scaling numerical features to a similar range (e.g., using normalization or standardization), encoding categorical variables into numerical representations (e.g., one-hot encoding, or label encoding such as front-end development → 1, other development → 2), and transforming skewed distributions.

Imbalanced Data - Techniques such as oversampling the minority class, undersampling the majority class, or using advanced algorithms like SMOTE (Synthetic Minority Over-sampling Technique) can be employed to address the imbalance.

Time-Series Data - Handling missing or irregular timestamps, resampling or interpolating the data to a regular time interval, and creating lag features or rolling windows for capturing temporal patterns.
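The imputation and encoding steps above can be sketched in a few lines of standard-library Python; the column values below are made up for illustration:

```python
# Minimal sketch of two common preprocessing steps (standard library only).
from statistics import mean

# --- Mean imputation: fill missing values (None) with the column mean ---
ages = [25, None, 30, None, 35]
observed = [a for a in ages if a is not None]
fill = mean(observed)                       # mean of the observed values
imputed = [a if a is not None else fill for a in ages]

# --- One-hot encoding: turn a categorical column into 0/1 indicator columns ---
roles = ["front-end", "back-end", "front-end", "data"]
categories = sorted(set(roles))             # one column per category
one_hot = [[1 if r == c else 0 for c in categories] for r in roles]
```

A real pipeline would typically use library helpers for these steps, but the logic is the same: estimate a fill value from the observed data, and map each category to its own indicator column.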

7

What are the major types of machine learning algorithms?

Classification - Predicts categorical/nominal values

Regression - Predicts continuous values

8

What are the types of classification methods?

  • K-Nearest Neighbour

  • Decision Tree

  • Support Vector Machine

  • Bayesian Classification
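As an illustration, K-Nearest Neighbour can be sketched in a few lines of standard-library Python; the toy 2-D points, labels, and k=3 below are made-up values:

```python
# K-Nearest Neighbour sketch: classify a point by majority vote among
# the k closest training points (Euclidean distance).
from collections import Counter
from math import dist

train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
         ((5.0, 5.0), "B"), ((4.8, 5.2), "B")]

def knn_predict(x, k=3):
    # Sort training points by distance to x, take the k nearest,
    # and return the most common label among them.
    nearest = sorted(train, key=lambda p: dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

For example, `knn_predict((1.1, 0.9))` lands among the "A" points and is classified as "A".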

9

What are the conditions for stopping partitioning in decision tree induction?

  • All samples for a given node belong to the same class

  • There are no remaining attributes for further partitioning - majority voting is employed for classifying the leaf

  • There are no samples left

10

What is entropy?

Entropy is a measure of the randomness (impurity) in a dataset.

11

What is the aim of a decision tree?

To split the data so that entropy decreases, making predictions easier.
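A minimal sketch of entropy and of how a good split reduces it (information gain), using only the standard library; the toy labels are made up:

```python
# Shannon entropy of a set of class labels, and the information gain
# of a split: parent entropy minus the weighted entropy of the children.
from collections import Counter
from math import log2

def entropy(labels):
    # H = -sum(p * log2 p) over the class proportions p.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

parent = ["yes", "yes", "no", "no"]          # 50/50 mix: maximum randomness
left, right = ["yes", "yes"], ["no", "no"]   # a perfect split: pure children

n = len(parent)
gain = entropy(parent) - (len(left)/n * entropy(left) + len(right)/n * entropy(right))
```

Here the parent has entropy 1.0 and both children have entropy 0, so the split achieves the maximum information gain of 1.0.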

12

What are the two approaches to avoid overfitting?

Prepruning: Halt tree construction early - do not split a node if this would result in the goodness measure falling below a threshold

  • Difficult to choose an appropriate threshold

Postpruning: Remove branches from a “fully grown” tree - get a sequence of progressively pruned trees

  • Use a set of data different from the training data to decide which is the “best pruned tree”

13

What are the key characteristics of Naive Bayesian classification?

Probabilistic learning: Calculate explicit probabilities for hypothesis, among the most practical approaches to certain types of learning problems.

Incremental: Each training example can incrementally increase or decrease the probability that a hypothesis is correct. Prior knowledge can be combined with observed data.

Probabilistic prediction: Predict multiple hypotheses, weighted by their probabilities

Standard: Even when Bayesian methods are computationally intractable, they can provide a standard of optimal decision making against which other methods can be measured
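The core idea — pick the class with the highest prior times the product of per-feature likelihoods, assuming features are conditionally independent given the class — can be sketched as follows. The priors and likelihoods below are made-up numbers, not estimates from real data:

```python
# Naive Bayes sketch: score each class as prior * product of
# P(feature | class), then return the highest-scoring class.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"offer": 0.7, "meeting": 0.1},
    "ham":  {"offer": 0.1, "meeting": 0.6},
}

def classify(words):
    scores = {}
    for cls in priors:
        score = priors[cls]
        for w in words:
            score *= likelihoods[cls][w]   # independence assumption
        scores[cls] = score
    return max(scores, key=scores.get)
```

A trained classifier would estimate these probabilities from labelled data (with smoothing for unseen words), but the decision rule is exactly this comparison of posterior scores.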

14

What is a confusion matrix?

A confusion matrix is a table that is often used to describe the performance of a classification model (or "classifier") on a set of test data for which the true values are known.

15

What are the clustering methods?

  • K-Means

  • Gaussian Mixture Model

  • Mean-Shift

  • Hierarchical Clustering

16

What is the stopping/convergence criterion for K-Means?

  1. no (or minimum) re-assignments of data points to different clusters

  2. no (or minimum) change of centroids, or

  3. minimum decrease in the sum of squared error J
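A 1-D k-means sketch using criterion 2 (no change of centroids) as the stopping condition; the toy points and initial centroids are made up:

```python
# K-means in one dimension (standard library only): alternate between
# assigning points to the nearest centroid and recomputing centroids,
# stopping when the centroids no longer change.
from statistics import mean

def kmeans_1d(points, centroids, max_iters=100):
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in centroids]
        for p in points:
            i = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[i].append(p)
        # Update step: recompute each centroid as its cluster's mean.
        new_centroids = [mean(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:      # criterion 2: no centroid change
            break
        centroids = new_centroids
    return centroids, clusters

centroids, clusters = kmeans_1d([1.0, 2.0, 1.5, 10.0, 11.0, 10.5], [0.0, 5.0])
```

On this data the algorithm converges in two iterations to centroids 1.5 and 10.5.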

17

What are the limitations of K-Means?

  1. Very sensitive to the initial centroids - do many runs of k-means, each with different initial centroids.

  2. k must be chosen manually - learn the optimal k for the clustering.

  3. K-means has problems when clusters are of differing sizes, densities, or non-globular shapes.

  4. K-means has problems when the data contains outliers.

18

What are the advantages and disadvantages of mean shift?

Advantages

  • Does not assume number of clusters

  • Just a single parameter

  • Finds variable number of modes

  • Robust to outliers

Disadvantages

  • Output depends on window size

  • Computationally expensive (for a given input size, it requires a relatively large number of steps to complete)

19

What are the types of hierarchical clustering?

Agglomerative (bottom-up) clustering: Builds the dendrogram (tree) from the bottom level up, merging the closest clusters at each step.

Divisive (top-down) clustering: Starts with all data points in one cluster (the root) and recursively splits it.
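The agglomerative approach can be sketched with single linkage in plain Python: start with each point as its own cluster and repeatedly merge the closest pair until k clusters remain (toy 1-D data, made up):

```python
# Agglomerative (bottom-up) clustering with single linkage:
# cluster distance = smallest point-to-point distance between clusters.
def agglomerative_1d(points, k):
    clusters = [[p] for p in points]        # bottom level: one point each
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)      # merge the closest pair
    return clusters

clusters = agglomerative_1d([1.0, 1.2, 5.0, 5.1, 9.0], 3)
```

Recording the sequence of merges (and the distances at which they happen) is what produces the dendrogram.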

20

What are the evaluation criteria based on internal information?

Intra-cluster cohesion (compactness)

  • Cohesion measures how near the data points in a cluster are to the cluster centroid.

  • Sum of squared error (SSE) is a commonly used measure.

inter-cluster separation (isolation)

  • Separation means that different cluster centroids should be far away from one another.

21

Confusion matrix terms for predicted no, predicted yes, actual no, and actual yes

Predicted no - Negative

Predicted yes - Positive

Predicted is actual - True

Predicted is not actual - False

Actual no but predicted yes - False positive

Actual yes but predicted no - False negative

Actual yes and predicted yes - True positive

Actual no and predicted no - True negative ( True = correct, negative = no )

22

What is the formula for accuracy, precision and recall?

Accuracy = (TP + TN) / total

Precision = TP / predicted yes = TP / (TP + FP)

Recall = TP / actual yes = TP / (TP + FN)
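These formulas can be checked with a small standard-library sketch; the toy actual/predicted labels below are made up:

```python
# Count the four confusion-matrix cells from actual vs predicted labels,
# then compute accuracy, precision, and recall.
actual    = ["yes", "yes", "yes", "no", "no", "no", "no", "no"]
predicted = ["yes", "yes", "no",  "no", "no", "no", "yes", "no"]

tp = sum(a == "yes" and p == "yes" for a, p in zip(actual, predicted))
tn = sum(a == "no"  and p == "no"  for a, p in zip(actual, predicted))
fp = sum(a == "no"  and p == "yes" for a, p in zip(actual, predicted))
fn = sum(a == "yes" and p == "no"  for a, p in zip(actual, predicted))

accuracy  = (tp + tn) / len(actual)   # correct predictions / all predictions
precision = tp / (tp + fp)            # TP / predicted yes
recall    = tp / (tp + fn)            # TP / actual yes
```

Note that precision and recall can differ sharply on imbalanced data even when accuracy looks high, which is why all three are reported.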