Machine Learning

11 Terms

1

ID3 algorithm sensitivity to irrelevant features

  • Sensitivity: High

  • Technical Explanation: ID3 uses information gain to select features but can overfit by splitting on noisy or irrelevant features, especially with small datasets.

  • Simplified Analogy: Chef who picks ingredients based on popularity. Grabs useless ingredients if they seem “popular,” leading to a messy dish.
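
  • Code Sketch: a minimal sketch of how information gain can look deceptively high for a many-valued noisy feature on a small sample; the tiny random dataset below is an assumption made purely for illustration.

    import numpy as np

    def entropy(labels):
        # Shannon entropy of a label array
        _, counts = np.unique(labels, return_counts=True)
        p = counts / counts.sum()
        return -np.sum(p * np.log2(p))

    def information_gain(feature, labels):
        # Expected entropy reduction from splitting on a categorical feature (ID3's criterion)
        gain = entropy(labels)
        for value in np.unique(feature):
            subset = labels[feature == value]
            gain -= len(subset) / len(labels) * entropy(subset)
        return gain

    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=10)        # assumed tiny dataset: only 10 labels
    relevant = y.copy()                    # a feature that tracks the label perfectly
    noisy = rng.integers(0, 6, size=10)    # an irrelevant feature with many random values

    print("gain(relevant):", information_gain(relevant, y))
    print("gain(noisy):   ", information_gain(noisy, y))
    # On a sample this small, the many-valued noisy feature can score a deceptively
    # high gain, which is exactly how ID3 ends up splitting on irrelevant features.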

2

K-NN with Euclidean distance sensitivity to irrelevant features

  • Sensitivity: High

  • Technical Explanation: K-NN relies on Euclidean distance, which treats all features equally. Irrelevant features add noise to the distance calculation, reducing accuracy.

  • Simplified Analogy: Chef who measures all ingredients equally. Useless ingredients mess up the recipe by adding unnecessary flavor.
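
  • Code Sketch: a minimal sketch showing irrelevant dimensions swamping a Euclidean distance; the two points and the 20 noise features are assumed for illustration.

    import numpy as np

    rng = np.random.default_rng(1)

    # Two points that are close on the single informative feature...
    a_info, b_info = np.array([0.1]), np.array([0.2])

    # ...but each also carries 20 irrelevant random features.
    a = np.concatenate([a_info, rng.normal(size=20)])
    b = np.concatenate([b_info, rng.normal(size=20)])

    print("distance on the informative feature only:", np.linalg.norm(a_info - b_info))
    print("distance with irrelevant features added: ", np.linalg.norm(a - b))
    # The noise dimensions dominate the second distance, so K-NN can no longer
    # tell that the two points are actually similar on the feature that matters.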

3

K-NN with information gain weighted Euclidean distance sensitivity to irrelevant features

  • Sensitivity: Moderate

  • Technical Explanation: Weighting by information gain reduces the impact of irrelevant features by prioritizing more informative ones, but some sensitivity remains if weights aren’t perfectly tuned.

  • Simplified Analogy: Chef who prioritizes better ingredients. Less distracted by useless ones but can still mess up if priorities aren’t perfect.
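
  • Code Sketch: a minimal sketch of a feature-weighted Euclidean distance for 1-NN, using scikit-learn's mutual_info_classif as a stand-in for information gain; the synthetic data and the single-query lookup are assumptions.

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    def weighted_euclidean(a, b, w):
        # Euclidean distance with a per-feature weight vector
        return np.sqrt(np.sum(w * (a - b) ** 2))

    rng = np.random.default_rng(2)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] > 0).astype(int)              # only feature 0 carries signal

    # Mutual information plays the role of information gain here (an assumed substitute).
    w = mutual_info_classif(X, y, random_state=0)
    print("feature weights:", np.round(w, 3))  # feature 0 typically dominates

    query = X[0]
    dists = [weighted_euclidean(query, x, w) for x in X[1:]]
    nearest = int(np.argmin(dists)) + 1
    print("nearest neighbour:", nearest, "- same label as query:", bool(y[nearest] == y[0]))
    # Down-weighting the four noise features keeps them from steering the neighbour
    # choice, though imperfect weights still leave some residual sensitivity.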

4

Logistic regression sensitivity to irrelevant features

  • Sensitivity: Moderate

  • Technical Explanation: Logistic regression can be affected by irrelevant features, but regularization (e.g., L1/L2) can mitigate this by shrinking coefficients of irrelevant features toward zero.

  • Simplified Analogy: Chef who can ignore bad ingredients with a trick. Downplays useless ingredients to keep the dish simple, but it isn't perfect when there's too much junk.
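
  • Code Sketch: a minimal scikit-learn sketch of L1 regularization shrinking the coefficients of irrelevant features toward zero; the synthetic dataset and the value of C are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(3)
    X_signal = rng.normal(size=(300, 2))
    y = (X_signal[:, 0] + X_signal[:, 1] > 0).astype(int)
    X = np.hstack([X_signal, rng.normal(size=(300, 8))])   # append 8 irrelevant features

    # L1 (lasso) regularization pushes uninformative coefficients toward zero.
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
    print(np.round(clf.coef_, 2))
    # The first two coefficients stay large; the eight trailing ones mostly end up
    # at or near zero, which is the "trick" the card alludes to.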

5

Kernelized Perceptron with Gaussian kernel sensitivity to irrelevant features

  • Sensitivity: Moderate to High

  • Technical Explanation: The Gaussian kernel maps data into a higher-dimensional space, where irrelevant features can still influence the decision boundary, especially if the kernel’s bandwidth isn’t well-tuned.

  • Simplified Analogy: Chef who remixes ingredients in a fancy blender. Useless ingredients can sneak into the mix if the blender settings aren’t tuned right.
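
  • Code Sketch: a minimal from-scratch kernelized perceptron with a Gaussian (RBF) kernel; the synthetic data, gamma, and epoch count are assumptions for illustration.

    import numpy as np

    def rbf_kernel(x, z, gamma=1.0):
        # Gaussian kernel; gamma controls the bandwidth (gamma = 1 / (2 * sigma^2))
        return np.exp(-gamma * np.sum((x - z) ** 2))

    def kernel_perceptron(X, y, gamma=1.0, epochs=5):
        # y must be in {-1, +1}; alpha[i] counts mistakes made on training point i
        alpha = np.zeros(len(X))
        for _ in range(epochs):
            for i, x in enumerate(X):
                score = sum(alpha[j] * y[j] * rbf_kernel(X[j], x, gamma) for j in range(len(X)))
                if y[i] * score <= 0:          # misclassified (or not yet separated) point
                    alpha[i] += 1
        return alpha

    rng = np.random.default_rng(4)
    X = rng.normal(size=(60, 3))               # feature 2 is pure noise
    y = np.where(X[:, 0] * X[:, 1] > 0, 1, -1)
    alpha = kernel_perceptron(X, y, gamma=1.0)
    print("mistake counts per training point:", alpha.astype(int))
    # The kernel folds *all* features, including the noisy third one, into every
    # similarity value, so a poorly tuned gamma lets that noise distort the boundary.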

6

SVM sensitivity to irrelevant features

  • Sensitivity: Low to Moderate

  • Technical Explanation: SVMs focus on support vectors and maximize margins, making them less sensitive to irrelevant features. However, without feature selection, irrelevant features can still add noise in high dimensions.

  • Simplified Analogy: Chef who focuses only on the best ingredients. Picks the most important ones for the dish, but too many useless ones can still clutter the kitchen.
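
  • Code Sketch: a minimal scikit-learn sketch comparing a linear SVM with and without added noise features; the synthetic data and C are illustrative assumptions.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    rng = np.random.default_rng(5)
    X_signal = rng.normal(size=(300, 2))
    y = (X_signal[:, 0] - X_signal[:, 1] > 0).astype(int)
    X_noisy = np.hstack([X_signal, rng.normal(size=(300, 30))])   # 30 irrelevant features

    for name, X in [("signal only", X_signal), ("with 30 noise features", X_noisy)]:
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        clf = SVC(kernel="linear", C=1.0).fit(X_tr, y_tr)
        print(f"{name:>24}: test accuracy {clf.score(X_te, y_te):.2f}, "
              f"support vectors {clf.n_support_.sum()}")
    # The margin-based objective usually keeps the accuracy drop modest, but the
    # noisy dimensions still add some error and tend to require more support vectors.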

7

2-layer MLP with 100 latent units sensitivity to irrelevant features

  • Sensitivity: Low to Moderate

  • Technical Explanation: A 2-layer MLP can learn to ignore irrelevant features by adjusting weights during training, but with many latent units, it may overfit to noise if regularization isn’t applied.

  • Simplified Analogy: Chef who learns to ignore bad ingredients over time. Adjusts the recipe to skip useless ones but might overthink and add them by mistake.
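
  • Code Sketch: a minimal scikit-learn sketch of a one-hidden-layer MLP with 100 units, comparing weak and strong L2 regularization (alpha); the synthetic data and alpha values are assumptions.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(6)
    X_signal = rng.normal(size=(400, 2))
    y = (X_signal[:, 0] * X_signal[:, 1] > 0).astype(int)
    X = np.hstack([X_signal, rng.normal(size=(400, 20))])   # 20 irrelevant features
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for alpha in (1e-5, 1e-1):    # weak vs. strong L2 penalty
        mlp = MLPClassifier(hidden_layer_sizes=(100,), alpha=alpha,
                            max_iter=2000, random_state=0).fit(X_tr, y_tr)
        print(f"alpha={alpha:g}: train {mlp.score(X_tr, y_tr):.2f}, "
              f"test {mlp.score(X_te, y_te):.2f}")
    # With 100 latent units and almost no penalty the network is free to chase the
    # noise features; a larger alpha shrinks those weights and usually narrows the gap.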

8

Pruning - effective at preventing overfitting in decision trees?

  • Definition: Pruning removes branches of the tree after it’s built by eliminating splits that don’t improve accuracy significantly, often based on a validation set or error metric.

  • Effectiveness: Effective

  • Technical Explanation: Pruning simplifies the tree by removing branches with little impact on accuracy, preventing it from memorizing noise in the training data.

  • Simplified Analogy: Gardener trims extra branches. Cuts off wild growth to stop the tree from getting too tangled and memorizing every weed.
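
  • Code Sketch: a minimal sketch using scikit-learn's cost-complexity pruning (ccp_alpha), one common post-pruning method; a full setup would choose the alpha on a validation set, as the definition above describes, and the dataset is just for illustration.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
    pruned = DecisionTreeClassifier(random_state=0, ccp_alpha=0.01).fit(X_tr, y_tr)

    for name, tree in [("unpruned", full), ("pruned", pruned)]:
        print(f"{name:>9}: leaves {tree.get_n_leaves():3d}, "
              f"train {tree.score(X_tr, y_tr):.2f}, test {tree.score(X_te, y_te):.2f}")
    # Pruning removes branches whose extra splits buy little accuracy, so the pruned
    # tree is far smaller and usually generalizes at least as well as the full one.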

9

Enforce a minimum number of samples in leaf nodes - effective at preventing overfitting in decision trees?

  • Definition: This strategy sets a minimum number of data points required in each leaf node, stopping the tree from splitting if a leaf would have too few samples.

  • Effectiveness: Effective

  • Technical Explanation: Requiring a minimum number of samples per leaf stops the tree from creating tiny, specific leaves that memorize noise, reducing unnecessary branches.

  • Simplified Analogy: Gardener ensures each branch has enough fruit. Limits tiny branches that only hold a few fruits (data points), preventing the tree from overgrowing.
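
  • Code Sketch: a minimal scikit-learn sketch comparing the default (leaves of size 1 allowed) with an enforced minimum of 20 samples per leaf; the dataset and threshold are illustrative assumptions.

    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for leaf in (1, 20):    # 1 = default (tiny leaves allowed), 20 = enforced minimum
        tree = DecisionTreeClassifier(min_samples_leaf=leaf, random_state=0).fit(X_tr, y_tr)
        print(f"min_samples_leaf={leaf:2d}: leaves {tree.get_n_leaves():3d}, "
              f"train {tree.score(X_tr, y_tr):.2f}, test {tree.score(X_te, y_te):.2f}")
    # Requiring 20 samples per leaf rules out the tiny, noise-memorizing leaves
    # that the unconstrained tree happily creates.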

10

Make sure each leaf node is one pure class - effective at preventing overfitting in decision trees?

  • Definition: This strategy requires each leaf node to contain samples of only one class, forcing the tree to keep splitting until purity is achieved.

  • Effectiveness: Ineffective (Increases Overfitting)

  • Technical Explanation: Forcing leaves to be pure creates more branches as the tree keeps splitting to isolate every sample, leading to memorization of noise.

  • Simplified Analogy: Gardener insists each branch has only one type of fruit. Keeps adding more branches to separate everything, making the tree overgrown and prone to memorizing every detail.
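
  • Code Sketch: a minimal scikit-learn sketch showing that a tree grown until every leaf is pure (the library's default stopping rule) memorizes injected label noise; the synthetic data are an assumption.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # flip_y=0.1 injects label noise that a pure-leaf tree is forced to memorize.
    X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                               flip_y=0.1, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    pure = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)   # grows until leaves are pure
    print("leaves:        ", pure.get_n_leaves())
    print("train accuracy:", round(pure.score(X_tr, y_tr), 2))      # essentially 1.00: noise memorized
    print("test accuracy: ", round(pure.score(X_te, y_te), 2))      # noticeably lower: overfitting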

11

Enforce a maximum depth for the tree - effective at preventing overfitting in decision trees?

  • Definition: This strategy sets a maximum number of levels (depth) the tree can grow, limiting how many splits can occur from root to leaf.

  • Effectiveness: Effective

  • Technical Explanation: Limiting the tree’s depth caps the number of branches, preventing the tree from growing too complex and memorizing noise instead of general patterns.

  • Simplified Analogy: Gardener sets a height limit for the tree. Stops it from growing too many branches, keeping it from getting tangled and over-memorizing the garden.
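
  • Code Sketch: a minimal scikit-learn sketch comparing an unlimited-depth tree with one capped at depth 4; the synthetic data and the depth cap are illustrative assumptions.

    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                               flip_y=0.1, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for depth in (None, 4):    # None = unlimited, 4 = enforced maximum depth
        tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
        print(f"max_depth={depth}: depth {tree.get_depth():2d}, leaves {tree.get_n_leaves():3d}, "
              f"train {tree.score(X_tr, y_tr):.2f}, test {tree.score(X_te, y_te):.2f}")
    # Capping the depth caps the number of possible splits: a little training accuracy
    # is traded for a much simpler tree that usually generalizes better.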