Machine Learning

11 Terms

1

ID3 algorithm sensitivity to irrelevant features

  • Sensitivity: High

  • Technical Explanation: ID3 uses information gain to select features but can overfit by splitting on noisy or irrelevant features, especially with small datasets.

  • Simplified Analogy: Chef who picks ingredients based on popularity. Grabs useless ingredients if they seem “popular,” leading to a messy dish.
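
  • Code Sketch: a minimal sketch of how information gain can look deceptively high for a many-valued noisy feature on a small sample; the tiny random dataset below is an assumption made purely for illustration.

    import numpy as np

    def entropy(labels):
        # Shannon entropy of a label array
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def information_gain(feature, labels):
        # Expected entropy reduction from splitting on a categorical feature (ID3's criterion)
        gain = entropy(labels)
        for value in np.unique(feature):
            subset = labels[feature == value]
            gain -= len(subset) / len(labels) * entropy(subset)
        return gain

    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=10)        # assumed tiny dataset: only 10 labels
    relevant = y.copy()                    # a feature that tracks the label perfectly
    noisy = rng.integers(0, 6, size=10)    # an irrelevant feature with many random values

    print("gain(relevant):", information_gain(relevant, y))
    print("gain(noisy):   ", information_gain(noisy, y))
    # On a sample this small, the many-valued noisy feature can score a deceptively
    # high gain, which is exactly how ID3 ends up splitting on irrelevant features.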

2

K-NN with Euclidean distance sensitivity to irrelevant features

  • Sensitivity: High

  • Technical Explanation: K-NN relies on Euclidean distance, which treats all features equally. Irrelevant features add noise to the distance calculation, reducing accuracy.

  • Simplified Analogy: Chef who measures all ingredients equally. Useless ingredients mess up the recipe by adding unnecessary flavor.
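
  • Code Sketch: a minimal sketch showing irrelevant dimensions swamping a Euclidean distance; the two points and the 20 noise features are assumed for illustration.

    import numpy as np

    rng = np.random.default_rng(1)

    # Two points that are close on the single informative feature...
    a_info, b_info = np.array([0.1]), np.array([0.2])

    # ...but each also carries 20 irrelevant random features.
    a = np.concatenate([a_info, rng.normal(size=20)])
    b = np.concatenate([b_info, rng.normal(size=20)])

    print("distance on the informative feature only:", np.linalg.norm(a_info - b_info))
    print("distance with irrelevant features added: ", np.linalg.norm(a - b))
    # The noise dimensions dominate the second distance, so K-NN can no longer
    # tell that the two points are actually similar on the feature that matters.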

3

K-NN with information gain weighted Euclidean distance sensitivity to irrelevant features

  • Sensitivity: Moderate

  • Technical Explanation: Weighting by information gain reduces the impact of irrelevant features by prioritizing more informative ones, but some sensitivity remains if weights aren’t perfectly tuned.

  • Simplified Analogy: Chef who prioritizes better ingredients. Less distracted by useless ones but can still mess up if priorities aren’t perfect.
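
  • Code Sketch: a minimal sketch of a feature-weighted Euclidean distance for 1-NN, using scikit-learn's mutual_info_classif as a stand-in for information gain; the synthetic data and the single-query lookup are assumptions.

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    def weighted_euclidean(a, b, w):
        # Euclidean distance with a per-feature weight vector
        return np.sqrt(np.sum(w * (a - b) ** 2))

    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] > 0).astype(int)              # only feature 0 carries signal

    # Mutual information plays the role of information gain here (an assumed substitute).
    w = mutual_info_classif(X, y, random_state=0)
    print("feature weights:", np.round(w, 3))  # feature 0 typically dominates

    query = X[0]
    dists = [weighted_euclidean(query, x, w) for x in X[1:]]
    nearest = int(np.argmin(dists)) + 1
    print("nearest neighbour:", nearest, "- same label as query:", bool(y[nearest] == y[0]))
    # Down-weighting the four noise features keeps them from steering the neighbour
    # choice, though imperfect weights still leave some residual sensitivity.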

4

Logistic regression sensitivity to irrelevant features

  • Sensitivity: Moderate

  • Technical Explanation: Logistic regression can be affected by irrelevant features, but regularization (e.g., L1/L2) can mitigate this by shrinking coefficients of irrelevant features toward zero.

  • Simplified Analogy: Chef who can ignore bad ingredients with a trick. Downplays useless ingredients to keep the dish simple, but it isn't perfect when there's too much junk.
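
  • Code Sketch: a minimal scikit-learn sketch of L1 regularization shrinking the coefficients of irrelevant features toward zero; the synthetic dataset and the value of C are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)
    X_signal = rng.normal(size=(300, 2))
    y = (X_signal[:, 0] + X_signal[:, 1] > 0).astype(int)
    X = np.hstack([X_signal, rng.normal(size=(300, 8))])   # append 8 irrelevant features

    # L1 (lasso) regularization pushes uninformative coefficients toward zero.
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
    print(np.round(clf.coef_, 2))
    # The first two coefficients stay large; the eight trailing ones mostly end up
    # at or near zero, which is the "trick" the card alludes to.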

5

Kernelized Perceptron with Gaussian kernel sensitivity to irrelevant features

  • Sensitivity: Moderate to High

  • Technical Explanation: The Gaussian kernel maps data into a higher-dimensional space, where irrelevant features can still influence the decision boundary, especially if the kernel’s bandwidth isn’t well-tuned.

  • Simplified Analogy: Chef who remixes ingredients in a fancy blender. Useless ingredients can sneak into the mix if the blender settings aren’t tuned right.
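
  • Code Sketch: a minimal from-scratch kernelized perceptron with a Gaussian (RBF) kernel; the synthetic data, gamma, and epoch count are assumptions for illustration.

    import numpy as np

    def rbf_kernel(x, z, gamma=1.0):
        # Gaussian kernel; gamma controls the bandwidth (gamma = 1 / (2 * sigma^2))
        return np.exp(-gamma * np.sum((x - z) ** 2))

    def kernel_perceptron(X, y, gamma=1.0, epochs=5):
        # y must be in {-1, +1}; alpha[i] counts mistakes made on training point i
        alpha = np.zeros(len(X))
        for _ in range(epochs):
            for i, x in enumerate(X):
                score = sum(alpha[j] * y[j] * rbf_kernel(X[j], x, gamma) for j in range(len(X)))
                if y[i] * score <= 0:          # misclassified (or not yet separated) point
                    alpha[i] += 1
        return alpha

    rng = np.random.default_rng(4)
    X = rng.normal(size=(60, 3))               # feature 2 is pure noise
    y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)
    alpha = kernel_perceptron(X, y, gamma=1.0)
    print("mistake counts per training point:", alpha.astype(int))
    # The kernel folds *all* features, including the noisy third one, into every
    # similarity value, so a poorly tuned gamma lets that noise distort the boundary.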

6

SVM sensitivity to irrelevant features

  • Sensitivity: Low to Moderate

  • Technical Explanation: SVMs focus on support vectors and maximize margins, making them less sensitive to irrelevant features. However, without feature selection, irrelevant features can still add noise in high dimensions.

  • Simplified Analogy: Chef who focuses only on the best ingredients. Picks the most important ones for the dish, but too many useless ones can still clutter the kitchen.
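
  • Code Sketch: a minimal scikit-learn sketch comparing a linear SVM with and without added noise features; the synthetic data and C are illustrative assumptions.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    rng = np.random.default_rng(5)
    X_signal = rng.normal(size=(300, 2))
    y = (X_signal[:, 0] - X_signal[:, 1] > 0).astype(int)
    X_noisy = np.hstack([X_signal, rng.normal(size=(300, 30))])   # 30 irrelevant features

    for name, X in [("signal only", X_signal), ("with 30 noise features", X_noisy)]:
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        clf = SVC(kernel="linear", C=1.0).fit(X_tr, y_tr)
        print(f"{name:>24}: test accuracy {clf.score(X_te, y_te):.2f}, "
              f"support vectors {clf.n_support_.sum()}")
    # The margin-based objective usually keeps the accuracy drop modest, but the
    # noisy dimensions still add some error and tend to require more support vectors.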

7

2-layer MLP with 100 latent units sensitivity to irrelevant features

  • Sensitivity: Low to Moderate

  • Technical Explanation: A 2-layer MLP can learn to ignore irrelevant features by adjusting weights during training, but with many latent units, it may overfit to noise if regularization isn’t applied.

  • Simplified Analogy: Chef who learns to ignore bad ingredients over time. Adjusts the recipe to skip useless ones but might overthink and add them by mistake.
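
  • Code Sketch: a minimal scikit-learn sketch of a one-hidden-layer MLP with 100 units, comparing weak and strong L2 regularization (alpha); the synthetic data and alpha values are assumptions.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(6)
    X_signal = rng.normal(size=(400, 2))
    y = (X_signal[:, 0] * X_signal[:, 1] > 0).astype(int)
    X = np.hstack([X_signal, rng.normal(size=(400, 20))])   # 20 irrelevant features
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for alpha in (1e-5, 1e-1):    # weak vs. strong L2 penalty
        mlp = MLPClassifier(hidden_layer_sizes=(100,), alpha=alpha,
                            max_iter=2000, random_state=0).fit(X_tr, y_tr)
        print(f"alpha={alpha:g}: train {mlp.score(X_tr, y_tr):.2f}, "
              f"test {mlp.score(X_te, y_te):.2f}")
    # With 100 latent units and almost no penalty the network is free to chase the
    # noise features; a larger alpha shrinks those weights and usually narrows the gap.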

8

Pruning - effective at preventing overfitting in decision trees?

  • Definition: Pruning removes branches of the tree after it’s built by eliminating splits that don’t improve accuracy significantly, often based on a validation set or error metric.

  • Effectiveness: Effective

  • Technical Explanation: Pruning simplifies the tree by removing branches with little impact on accuracy, preventing it from memorizing noise in the training data.

  • Simplified Analogy: Gardener trims extra branches. Cuts off wild growth to stop the tree from getting too tangled and memorizing every weed.
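
  • Code Sketch: a minimal sketch using scikit-learn's cost-complexity pruning (ccp_alpha), one common post-pruning method; a full setup would choose the alpha on a validation set, as the definition above describes, and the dataset is just for illustration.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_tr, y_tr)

    for name, tree in [("unpruned", full), ("pruned", pruned)]:
        print(f"{name:>9}: leaves {tree.get_n_leaves():3d}, "
              f"train {tree.score(X_tr, y_tr):.2f}, test {tree.score(X_te, y_te):.2f}")
    # Pruning removes branches whose extra splits buy little accuracy, so the pruned
    # tree is far smaller and usually generalizes at least as well as the full one.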

9

Enforce a minimum number of samples in leaf nodes - effective at preventing overfitting in decision trees?

  • Definition: This strategy sets a minimum number of data points required in each leaf node, stopping the tree from splitting if a leaf would have too few samples.

  • Effectiveness: Effective

  • Technical Explanation: Requiring a minimum number of samples per leaf stops the tree from creating tiny, specific leaves that memorize noise, reducing unnecessary branches.

  • Simplified Analogy: Gardener ensures each branch has enough fruit. Limits tiny branches that only hold a few fruits (data points), preventing the tree from overgrowing.
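
  • Code Sketch: a minimal scikit-learn sketch comparing the default (leaves of size 1 allowed) with an enforced minimum of 20 samples per leaf; the dataset and threshold are illustrative assumptions.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for leaf in (1, 20):    # 1 = default (tiny leaves allowed), 20 = enforced minimum
        tree = DecisionTreeClassifier(min_samples_leaf=leaf, random_state=0).fit(X_tr, y_tr)
        print(f"min_samples_leaf={leaf:2d}: leaves {tree.get_n_leaves():3d}, "
              f"train {tree.score(X_tr, y_tr):.2f}, test {tree.score(X_te, y_te):.2f}")
    # Requiring 20 samples per leaf rules out the tiny, noise-memorizing leaves
    # that the unconstrained tree happily creates.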

10

Make sure each leaf node is one pure class - effective at preventing overfitting in decision trees?

  • Definition: This strategy requires each leaf node to contain samples of only one class, forcing the tree to keep splitting until purity is achieved.

  • Effectiveness: Ineffective (Increases Overfitting)

  • Technical Explanation: Forcing leaves to be pure creates more branches as the tree keeps splitting to isolate every sample, leading to memorization of noise.

  • Simplified Analogy: Gardener insists each branch has only one type of fruit. Keeps adding more branches to separate everything, making the tree overgrown and prone to memorizing every detail.
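
  • Code Sketch: a minimal scikit-learn sketch showing that a tree grown until every leaf is pure (the library's default stopping rule) memorizes injected label noise; the synthetic data are an assumption.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # flip_y=0.1 injects label noise that a pure-leaf tree is forced to memorize.
    X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                               flip_y=0.1, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    pure = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)   # grows until leaves are pure
    print("leaves:        ", pure.get_n_leaves())
    print("train accuracy:", round(pure.score(X_tr, y_tr), 2))      # essentially 1.00: noise memorized
    print("test accuracy: ", round(pure.score(X_te, y_te), 2))      # noticeably lower: overfitting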

11

Enforce a maximum depth for the tree - effective at preventing overfitting in decision trees?

  • Definition: This strategy sets a maximum number of levels (depth) the tree can grow, limiting how many splits can occur from root to leaf.

  • Effectiveness: Effective

  • Technical Explanation: Limiting the tree’s depth caps the number of branches, preventing the tree from growing too complex and memorizing noise instead of general patterns.

  • Simplified Analogy: Gardener sets a height limit for the tree. Stops it from growing too many branches, keeping it from getting tangled and over-memorizing the garden.
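
  • Code Sketch: a minimal scikit-learn sketch comparing an unlimited-depth tree with one capped at depth 4; the synthetic data and the depth cap are illustrative assumptions.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                               flip_y=0.1, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for depth in (None, 4):    # None = unlimited, 4 = enforced maximum depth
        tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
        print(f"max_depth={depth}: depth {tree.get_depth():2d}, leaves {tree.get_n_leaves():3d}, "
              f"train {tree.score(X_tr, y_tr):.2f}, test {tree.score(X_te, y_te):.2f}")
    # Capping the depth caps the number of possible splits: a little training accuracy
    # is traded for a much simpler tree that usually generalizes better.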