These flashcards cover key concepts, definitions, and true/false statements from the lecture notes on machine learning techniques, including frequent itemset mining, dimensionality reduction, clustering, regularization, decision trees, support vector machines, evaluation metrics, reinforcement learning, and neural networks.
True or False: Increasing the minimum support threshold in the Apriori algorithm always results in a smaller number of frequent itemsets being discovered.
True
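A minimal brute-force sketch (toy transactions invented for illustration, skipping Apriori's candidate-pruning step) showing that raising the minimum support threshold can only shrink the set of frequent itemsets:

```python
from itertools import combinations

# Hypothetical transaction database.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def frequent_itemsets(db, min_support):
    """Return every itemset whose support meets min_support (brute force)."""
    items = sorted(set().union(*db))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            support = sum(set(candidate) <= t for t in db) / len(db)
            if support >= min_support:
                result[candidate] = support
    return result

# A higher threshold only prunes itemsets; it never admits new ones.
print(len(frequent_itemsets(transactions, 0.4)))
print(len(frequent_itemsets(transactions, 0.6)))
```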
True or False: The main objective of t-SNE is to reduce the dimensionality of the data while preserving global structures.
False
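A minimal sketch using scikit-learn's TSNE on the digits dataset (chosen here as a stand-in for any high-dimensional data); the embedding preserves local neighborhoods, so distances between far-apart clusters should not be over-interpreted:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X = load_digits().data  # 1797 samples, 64 dimensions

# t-SNE keeps nearby points nearby in 2-D; global distances are distorted.
X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)
print(X_2d.shape)  # (1797, 2)
```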
True or False: DBSCAN can identify clusters of arbitrary shape and can also detect outliers.
True
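A short sketch on scikit-learn's make_moons data (eps and min_samples are illustrative values for this toy set), showing non-convex clusters and the noise label:

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two crescent-shaped clusters that centroid-based methods handle poorly.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
print(set(labels))  # cluster ids; -1 marks points treated as noise/outliers
```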
True or False: Both Lasso and Ridge regression are techniques used to prevent overfitting by adding a penalty term to the cost function.
True
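A minimal sketch with scikit-learn on synthetic data where only the first feature matters, contrasting the two penalties; alpha, which controls the penalty strength, is an illustrative value:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = 3 * X[:, 0] + rng.normal(scale=0.1, size=100)  # only feature 0 is real

lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty: can zero out coefficients
ridge = Ridge(alpha=0.1).fit(X, y)  # L2 penalty: shrinks coefficients

print(lasso.coef_.round(2))  # sparse: irrelevant features driven to 0
print(ridge.coef_.round(2))  # small but generally nonzero coefficients
```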
True or False: The linear kernel is suitable for handling non-linearly separable data in SVM.
False
True or False: A classifier with high precision and low recall is ideal for scenarios where false negatives are more harmful than false positives.
False
True or False: In reinforcement learning, an agent always receives immediate rewards after taking an action.
False
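A tiny worked example of why this is false: with a discounted return, a reward arriving several steps later is still credited back to earlier actions (the reward sequence and gamma are made up for illustration):

```python
# Hypothetical episode: the only reward arrives at the final step.
rewards = [0, 0, 0, 1]
gamma = 0.9  # discount factor

# Discounted return from the first step: sum of gamma**k * r_k.
g = sum(gamma ** k * r for k, r in enumerate(rewards))
print(g)  # 0.9 ** 3 = 0.729, credited despite no immediate reward
```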
True or False: A single-layer perceptron can accurately classify non-linearly separable data such as the XOR problem.
False
What is the difference between closed itemsets and maximal itemsets in frequent itemset mining?
Closed frequent itemsets have no proper superset with the same support, while maximal frequent itemsets have no proper superset that is itself frequent; every maximal itemset is closed, but not vice versa.
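A small sketch over hypothetical frequent itemsets and supports, computing both notions directly from the definitions above:

```python
# Hypothetical frequent itemsets with their supports.
frequent = {
    frozenset("A"): 0.6,
    frozenset("B"): 0.5,
    frozenset("AB"): 0.5,  # same support as {B}, so {B} is not closed
}

closed = {s for s in frequent
          if not any(s < t and frequent[t] == frequent[s] for t in frequent)}
maximal = {s for s in frequent if not any(s < t for t in frequent)}

print([sorted(s) for s in closed])   # {A} and {A, B}
print([sorted(s) for s in maximal])  # only {A, B}
```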
What does a confidence value of 0.7 in an association rule {Item A} → {Item C} indicate?
It indicates that, given that a transaction contains Item A, the probability that it also contains Item C is 0.7; in other words, 70% of the transactions containing Item A also contain Item C.
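A quick check of that definition on made-up transactions (counts chosen so the numbers come out to 0.7):

```python
# Toy data: Item A appears in 10 transactions, A and C together in 7.
transactions = [{"A", "C"}] * 7 + [{"A"}] * 3 + [{"C"}] * 2

support_a = sum("A" in t for t in transactions)
support_ac = sum({"A", "C"} <= t for t in transactions)

print(support_ac / support_a)  # confidence = P(C | A) = 7 / 10 = 0.7
```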
What is the curse of dimensionality?
It refers to the phenomenon where, as the number of dimensions grows, the feature space becomes increasingly sparse and distances between points become less informative, making it difficult for models to determine relationships between points.
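A short NumPy sketch of one symptom, distance concentration: as dimensionality grows, the nearest and farthest neighbors of a point end up at nearly the same distance (uniform random data, sizes chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)

for d in (2, 10, 100, 1000):
    X = rng.uniform(size=(500, d))
    dists = np.linalg.norm(X - X[0], axis=1)[1:]  # drop the zero self-distance
    # The ratio of nearest to farthest distance approaches 1 in high dimensions.
    print(d, round(dists.min() / dists.max(), 3))
```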
What is the primary goal of PCA in dimensionality reduction?
The primary goal of PCA is to retain as much variance in the data as possible while reducing the number of dimensions.
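A minimal sketch with scikit-learn's PCA on the iris data (any numeric dataset would do), reading off how much variance each retained component explains:

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data  # 4 features

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

# Fraction of total variance retained by each principal component.
print(pca.explained_variance_ratio_)  # roughly [0.92, 0.05] for iris
```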
What are the common metrics used to evaluate impurity in decision trees?
Common metrics include entropy, which underlies information gain (how much splitting on an attribute reduces uncertainty about the class), and Gini impurity, which measures the likelihood of misclassifying a randomly chosen element if it were labeled according to the node's class distribution.
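Both metrics can be computed directly from a node's class proportions; a small sketch with binary proportions chosen for illustration:

```python
import numpy as np

def entropy(p):
    """Entropy of class proportions p, in bits; 0 for a pure node."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def gini(p):
    """Gini impurity: chance of misclassifying a randomly drawn element."""
    p = np.asarray(p, dtype=float)
    return 1.0 - np.sum(p ** 2)

print(entropy([0.5, 0.5]), gini([0.5, 0.5]))  # maximal impurity: 1.0, 0.5
print(entropy([1.0, 0.0]), gini([1.0, 0.0]))  # pure node: 0.0, 0.0
```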
What distinguishes K-Means clustering from hierarchical clustering?
K-Means partitions the data into a pre-specified number of clusters by assigning each point to the nearest centroid, while hierarchical clustering builds a nested hierarchy of clusters (a dendrogram) by merging or splitting groups according to a linkage function.
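A side-by-side sketch with scikit-learn on synthetic blobs (three clusters assumed known for the comparison):

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# K-Means: assigns each point to the nearest of k centroids.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Hierarchical: merges clusters bottom-up according to a linkage function.
hc_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(X)

print(km_labels[:10])
print(hc_labels[:10])
```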
What are the implications of high precision in a spam detection system?
High precision means most flagged emails (predicted spam) are indeed spam, reducing false positives and improving user experience.
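A quick illustration with scikit-learn's metrics on made-up predictions (1 = spam, 0 = legitimate):

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 4 actual spam emails
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]  # classifier flags 3 emails

# Precision: of the flagged emails, how many really are spam?
print(precision_score(y_true, y_pred))  # 2 of 3 flagged -> ~0.67
# Recall: of the actual spam, how much was caught?
print(recall_score(y_true, y_pred))     # 2 of 4 spam -> 0.5
```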
What is the purpose of using different kernels in SVMs?
Different kernels implicitly map the data into a higher-dimensional feature space where it may become linearly separable, allowing the SVM to find a separating hyperplane without computing the mapping explicitly (the kernel trick), which enables better generalization on non-linear data.
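A sketch of the effect on scikit-learn's concentric-circles data, which no straight line can separate in the original 2-D space:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)  # implicit non-linear feature mapping

print(linear.score(X, y))  # near chance: no separating line exists
print(rbf.score(X, y))     # near 1.0: separable in the kernel's feature space
```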
Why can a single-layer perceptron model AND and OR operations but not the XOR operation?
A single-layer perceptron can only create linear decision boundaries, which are sufficient for AND and OR but not for the non-linear boundary required for XOR.
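A small demonstration with scikit-learn (the hidden-layer size and solver are illustrative choices): the linear perceptron cannot fit XOR, while one hidden layer can:

```python
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])  # XOR: not linearly separable

# A single linear boundary gets at most 3 of the 4 points right.
p = Perceptron(max_iter=1000).fit(X, y_xor)
print(p.score(X, y_xor))  # 0.75 at best, never 1.0

# One hidden layer supplies the non-linearity XOR requires.
mlp = MLPClassifier(hidden_layer_sizes=(8,), solver="lbfgs",
                    max_iter=1000, random_state=0).fit(X, y_xor)
print(mlp.score(X, y_xor))  # usually 1.0
```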