Which of the following would be the least effective way to represent a color (e.g., "Pink") in a dataset used in a predictive modeling task?
As an ordinal value based on its rank in an alphanumeric sorting of all colors
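The problem with an alphabetical-rank encoding is that it imposes an ordering with no predictive meaning. A minimal sketch contrasting it with a one-hot encoding (color names here are illustrative):

```python
# Sketch: ordinal-by-alphabet vs. one-hot encoding for a color feature.
# Alphabetical rank implies a meaningless order (Blue < Green < Pink < Red);
# one-hot encoding avoids imposing any order at all.
colors = ["Blue", "Green", "Pink", "Red"]

# Least effective: ordinal value from alphanumeric sort order.
ordinal = {c: i for i, c in enumerate(sorted(colors))}

# Better: one-hot vector, one indicator column per color.
def one_hot(color, vocabulary=colors):
    return [1 if c == color else 0 for c in vocabulary]

print(ordinal["Pink"])   # the rank carries no predictive meaning
print(one_hot("Pink"))
```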
Consider a dataset with the following structure:
| City | State | Date | Temperature |
|---|---|---|---|
| Berkeley | CA | Jan 25, 2018 | 11 |
Suppose we transform this dataset into one with only the features State, Month, and Temperature, where State is represented by the latitude and longitude of the state's capital, Month by a one-hot encoding, and Temperature is left as a numeric value. How many total features (columns) would this dataset have?
(hint: longitude and latitude count as two features)
15
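The count above breaks down as follows:

```python
# Worked feature count for the transformation described above.
lat_long = 2        # State encoded as (latitude, longitude) of its capital
month_one_hot = 12  # one indicator column per month
temperature = 1     # left as a single numeric column

total = lat_long + month_one_hot + temperature
print(total)  # 15
```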
Sum of Squared Errors (SSE) can be used with K-means clustering to:
(check all that apply)
K = number of clusters
n = number of data points being clustered
Choose a value of K based on the heuristic of the "elbow" method
Choose between different clusterings (for a fixed K) produced by starting with different random K-means centroids
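Both uses rely on computing SSE for a given clustering: the sum of squared distances from each point to its assigned centroid. For the elbow method you plot this value against K and pick the K where the decrease levels off. A minimal sketch with toy one-dimensional data (values are illustrative):

```python
# SSE for a clustering: sum of squared distances from each point to its
# assigned centroid. Data, centroids, and assignments are toy values.
def sse(points, centroids, assignments):
    return sum(
        (p - centroids[k]) ** 2
        for p, k in zip(points, assignments)
    )

points = [1.0, 1.5, 9.0, 9.5]
centroids = [1.25, 9.25]      # K = 2
assignments = [0, 0, 1, 1]    # index of each point's cluster
print(sse(points, centroids, assignments))  # 0.25
```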
What is the range of the silhouette score?
[-1,1]
How could a data point have a silhouette coefficient of 0?
If the data point is as close to points in its cluster as it is to points in the nearest cluster (not including its own)
How many different assignments of data points to clusters are there given n data points and K clusters? Assume a data point can only belong to a single cluster.
K^n
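Each of the n points independently chooses one of K clusters, giving K^n assignments. Enumerating them for a small case confirms the count:

```python
# Each of n points picks one of K clusters independently, so the number of
# assignments is K**n. Enumerate them for n = 3, K = 2 as a check.
from itertools import product

n, K = 3, 2
assignments = list(product(range(K), repeat=n))
print(len(assignments))  # 8 == 2**3
```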
The plot below depicts data points for a dataset of 10 credit-card-seeking individuals, 6 of whom are considered high credit risk and 4 of whom are considered low credit risk.
What is the starting Gini impurity (index) of this dataset given credit risk as the target?
[Reminder] Gini impurity (index) formula: Gini = 1 - Σ (p_i)^2, where p_i is the fraction of data points belonging to class i.
0.48
If there were as many low credit risk individuals as high credit risk individuals, what would the Gini impurity of the dataset be without any splits?
0.5
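Both answers follow from the formula above. For 6 high-risk and 4 low-risk individuals, Gini = 1 - (0.6^2 + 0.4^2) = 0.48, and an even 5/5 split gives the two-class maximum of 0.5:

```python
# Gini impurity from class counts: Gini = 1 - sum((p_i)**2) over classes i.
def gini(counts):
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

print(gini([6, 4]))  # 0.48 (up to float rounding)
print(gini([5, 5]))  # 0.5, the maximum for two classes
```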
If you were creating a decision tree based on this dataset using the C4.5 or CART algorithm, the first step would be to choose an attribute and split point that best partitioned the data points by the target value.
According to the credit risk plot, which attribute and split point would be the best choice among the following options?
Age with a split point of 35
Given enough depth (splits), a decision tree can successfully classify any training dataset with 100% accuracy.
False
Assume you are building an image classification neural network to predict whether an image is a dog, cat, or turtle. The images are 32x32 pixels and serialized into a vector of 1024 features per image. Assume there is only one hidden layer between the input and output layer, and that the hidden layer has 10 neurons (nodes). Ignoring bias terms, what is the total number of weights in this network?
10,270
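The count comes from the two fully connected layers: every input feeds every hidden neuron, and every hidden neuron feeds every output neuron.

```python
# Weight count for a 1024 -> 10 -> 3 fully connected network, ignoring biases.
inputs, hidden, outputs = 1024, 10, 3
weights = inputs * hidden + hidden * outputs
print(weights)  # 10270
```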
Using a sigmoid as the activation function for a binary classification output layer, what output value produced by the sigmoid would denote the highest uncertainty in a class prediction?
+0.5
What input value into the sigmoid function would produce the highest uncertainty output value?
0
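The two answers above are connected: the sigmoid σ(x) = 1 / (1 + e^(-x)) maps x = 0 to exactly 0.5, the point midway between the two classes.

```python
# Sigmoid activation: sigma(x) = 1 / (1 + exp(-x)).
# An input of 0 yields 0.5, the most uncertain binary prediction.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0))  # 0.5
```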
A binary classifier needs to answer the question: "Does the patient have lung cancer?" The table below shows a validation dataset's labels and predictions. Compute the precision of these predictions:
| Sample Number | Actual | Predicted |
|---|---|---|
| 1 | Normal | Cancer |
| 2 | Cancer | Cancer |
| 3 | Cancer | Cancer |
| 4 | Normal | Normal |
| 5 | Cancer | Normal |
Assume "Cancer" represents the positive class, and "Normal" represents the negative class.
Please round your answer to the 2nd decimal place.
[note: precision is a value between 0 and 1]
0.66 (with margin: 0.01)
0.666 (with margin: 0.01)
0.67 (with margin: 0.01)
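The accepted answers all come from the same computation: of the 3 "Cancer" predictions, 2 are correct, so precision = TP / (TP + FP) = 2/3.

```python
# Precision on the five validation samples from the table above.
actual    = ["Normal", "Cancer", "Cancer", "Normal", "Cancer"]
predicted = ["Cancer", "Cancer", "Cancer", "Normal", "Normal"]

tp = sum(a == "Cancer" and p == "Cancer" for a, p in zip(actual, predicted))
fp = sum(a == "Normal" and p == "Cancer" for a, p in zip(actual, predicted))

precision = tp / (tp + fp)
print(round(precision, 2))  # 0.67
```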
In which of the following prediction scenarios would it be appropriate to apply AUC as the metric?
When predicting a binary label with a probabilistic prediction
Simple aggregation (also known as a simple combiner) differs from bagging in the following ways (check all that apply):
bagging requires bootstrapping and simple aggregation does not
bagging requires that the same algorithm be used for prediction/classification and simple aggregation does not
What type of ensemble technique uses bootstrapping but modifies the probability of sampling an instance based on how well it was predicted in previously trained models:
Boosting
Which ensemble method does not allow for parallel training of the models in the ensemble:
Boosting