b — the number of nearest neighbors
What does the “k” in k-Nearest Neighbor represent?
a — stored training samples
What does a nearest neighbor classifier mainly use to classify a new point?
b — distance
What must be calculated between a test point and each training point?
b — it may include points from other classes
What happens when k is very large?
b — if it looks like something, it is likely that thing
Which simple idea describes the basic concept of nearest neighbor classification?
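The kNN cards above can be summed up in a minimal sketch: store the training samples, compute the distance from a test point to each of them, and let the k nearest neighbors vote. The data and helper below are hypothetical, just for illustration.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Distance from the test point to every stored training sample
    dists = np.linalg.norm(X_train - x, axis=1)
    # Labels of the k nearest neighbors, then a majority vote
    nearest = y_train[np.argsort(dists)[:k]]
    return Counter(nearest.tolist()).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.2, 0.1]), k=3))  # -> 0
```

Note how a very large k would pull in the two class-1 points as well, which is exactly the failure mode one of the cards describes.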
c — a method that repeatedly asks simple yes/no questions to make predictions
What is a Decision Tree?
c — supervised learning
According to the slides, what type of learning does a Decision Tree use?
c — the data is split into smaller and more uniform groups
In a Decision Tree, what happens each time a question is asked?
c — Boolean (yes/no) questions
In the commute example, which type of questions are used in the tree?
a — to choose attributes that give the cleanest splits
What is the purpose of selecting attributes when building a Decision Tree?
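The Decision Tree cards can be illustrated with scikit-learn's `DecisionTreeClassifier`, which learns which yes/no questions on the attributes give the cleanest splits. The Boolean commute-style attributes and labels below are hypothetical, not the exact example from the slides.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical Boolean (yes/no) attributes: [raining?, rush_hour?]
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = ["bike", "bus", "bus", "car"]  # hypothetical commute choices

# Supervised learning: the tree is fit on labeled examples and
# repeatedly splits the data into smaller, more uniform groups
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
print(tree.predict([[0, 0]])[0])  # -> "bike"
```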
a — the posterior probability of a hypothesis
What does Bayes' Theorem help us compute in classification?
d — attribute values are conditionally independent given the class
What is a key assumption of the Naive Bayes Classifier?
d — probability of a hypothesis before seeing any data
What does the prior probability represent?
d — a probability distribution over all possible classes
What does the Naive Bayes Classifier output?
b — it is simple and works well even with small datasets
What is one advantage of the Naive Bayes Classifier mentioned in the slides?
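The Naive Bayes cards can be seen in action with scikit-learn's `GaussianNB` on a tiny hypothetical dataset: the model combines the class priors with the (assumed conditionally independent) attribute likelihoods and outputs a posterior probability distribution over all classes.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Small hypothetical dataset: one numeric attribute, two classes
X = np.array([[1.0], [1.2], [0.9], [3.0], [3.1], [2.9]])
y = np.array([0, 0, 0, 1, 1, 1])

nb = GaussianNB().fit(X, y)
# Posterior distribution over the classes for a new point
probs = nb.predict_proba([[1.1]])[0]
print(probs)
```

Even with only six samples the model gives usable posteriors, which reflects the card about Naive Bayes working well on small datasets.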
a — to evaluate how well a model predicts future data
What is the main purpose of cross validation?
a — it is randomly split into training and test sets
In the test-set (hold-out) method, what typically happens to the dataset?
a — it wastes data because only part of the dataset is used for training
What is one downside of the test-set method mentioned in the slides?
c — all points except one
In Leave-One-Out Cross Validation (LOOCV), how many points are used for training each time?
b — average the errors across all folds
In k-fold Cross Validation, what do we do after computing the error for each fold?
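The cross-validation cards map directly onto scikit-learn's `cross_val_score`: the data is split into k folds, each fold takes a turn as the held-out test set, and the per-fold scores are averaged. The choice of `LogisticRegression` on Iris here is just an illustrative assumption.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

iris = load_iris()
# 5-fold cross validation: 5 train/test splits, one score per fold
scores = cross_val_score(
    LogisticRegression(max_iter=1000), iris.data, iris.target, cv=5
)
print(scores.mean())  # average the scores across all folds
```

Setting `cv` equal to the number of samples would give Leave-One-Out Cross Validation: every split trains on all points except one.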
a — to maximize the evaluation function by moving to better states
What is the main goal of hill-climbing?
b — it can get stuck in a local optimum
What is a common problem with basic hill-climbing?
c — it picks a random move from the moveset
How does randomized hill-climbing differ from normal hill-climbing?
b — accept worse moves with some probability
What is the key idea of simulated annealing?
b — natural selection and evolution
What biological process inspires the genetic algorithm?
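The local-search cards can be tied together in one sketch of simulated annealing: it behaves like randomized hill-climbing (picking a random move from the moveset) but also accepts worse moves with some probability, which helps it escape local optima. The objective, moveset, and cooling schedule below are all hypothetical choices.

```python
import math
import random

def simulated_annealing(f, x0, moveset, steps=2000, t0=1.0):
    """Maximize f; accept worse moves with probability exp(delta / T)."""
    random.seed(0)
    x, t = x0, t0
    for i in range(steps):
        candidate = x + random.choice(moveset)  # random move from the moveset
        delta = f(candidate) - f(x)
        # Always accept improvements; accept worse moves with
        # a probability that shrinks as the temperature cools
        if delta > 0 or random.random() < math.exp(delta / t):
            x = candidate
        t = t0 * (1 - (i + 1) / steps) + 1e-9  # cooling schedule
    return x

# Toy objective with a single peak at x = 3
best = simulated_annealing(lambda x: -(x - 3) ** 2, x0=0.0, moveset=[-0.1, 0.1])
print(best)
```

With the acceptance probability removed (accept only when `delta > 0`), this reduces to plain randomized hill-climbing, which can get stuck in a local optimum on bumpier objectives.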
a — to load the Iris dataset
What is the purpose of the following code: from sklearn.datasets import load_iris; iris = load_iris()?
a — train_test_split
Which function is used to split the data into train and test sets?
a — test_size
Which parameter in train_test_split specifies the proportion of the test set?
a — create an object with LogisticRegression() and then call fit()
Which of the following is the correct way to create a classification model in scikit-learn?
b — to train the model on the training data
What is the role of model.fit(X_train, y_train)?
c — to use the train/test splitting function train_test_split
What is the purpose of the following import statement: from sklearn.model_selection import train_test_split?
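The scikit-learn cards above describe one end-to-end workflow, sketched here as a single runnable example (the choice of `test_size=0.3` and `random_state=0` is an assumption for reproducibility):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the Iris dataset
iris = load_iris()

# Randomly split the data into training and test sets;
# test_size gives the proportion held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

# Create the model object, then call fit() to train it on the training data
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

print(model.score(X_test, y_test))  # accuracy on the held-out test set
```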