Data Science Fundamentals 2

95 Terms

1

An ___ feature has values that are unaffected by other features.

Input

2

An ___ feature has values affected by other features.

Output

3

Residual Error

The difference between the observed and predicted value.

4

Extrapolation

A prediction made for input values outside the range of the original data.

5

Simple Linear Regression =

\hat{y} = f(0) + mx, where f(0) is the y-intercept and m is the slope

6

Sum of Squared Errors (SSE)

The sum of the squares of all residuals.

7

Least Squares Regression Line

The simple linear regression formula that minimizes SSE.
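
As a rough illustration of the residual, SSE, and least squares cards above, the sketch below fits a line with the standard closed-form slope and intercept and then computes the SSE. The function names and the sample data are made up for this example.

```python
def fit_least_squares(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(xs, ys, intercept, slope):
    """SSE: sum of squared residuals (observed minus predicted)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical sample data.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.0]

intercept, slope = fit_least_squares(xs, ys)
sse = sum_squared_errors(xs, ys, intercept, slope)

# Any other line gives a larger SSE on the same data.
sse_other_line = sum_squared_errors(xs, ys, intercept, slope + 0.1)
```

Here the fitted slope is about 1.98 and the intercept about 0.05; nudging the slope away from the fitted value increases the SSE.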

8

Correlation Coefficient

Measures the direction and strength of a linear relationship as a value between -1 and 1.
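
A minimal sketch of the Pearson correlation coefficient, assuming that is the coefficient this card refers to. The helper name and the data are illustrative only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# A perfectly increasing and a perfectly decreasing linear relationship.
r_increasing = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_decreasing = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

The sign gives the direction of the relationship and the magnitude gives its strength, so the two examples land at opposite ends of the range.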

9

Fitted vs. Residuals Plots

Displays the predicted values against the residuals.

10

Normal Q-Q Plot

Displays the sample quantiles against the theoretical quantiles.

11

Multiple Linear Regression =

f(0) x_0 + f(1) x_1 + … + f(k) x_k, where f(i) is the coefficient of input feature x_i and x_0 = 1 (so f(0) is the intercept)

12

Simple Polynomial Regression =

f(0) x^{0} + f(1) x^{1} + … + f(k) x^{k}, where f(i) is the coefficient of x^{i}

13

Polynomial Regression Model

A regression model that describes a polynomial relationship between an input feature and the output feature.

14

Interaction Term

A term in a regression model that is the product of two or more input features.

15

Logistic Regression =

\frac{e^{b_0 + b_1 x}}{1+e^{b_0 + b_1 x}}

16

One-Hot Encoding

Transforming a categorical feature into numeric 0/1 features, one for each category.
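
The encoding described above is usually called one-hot encoding. A small sketch in plain Python, with a hypothetical helper name and made-up colour values, could look like this:

```python
def one_hot_encode(values):
    """Return (categories, rows) with one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories]
            for value in values]
    return categories, rows


categories, encoded = one_hot_encode(["red", "green", "red", "blue"])
# categories -> ['blue', 'green', 'red']
# encoded    -> [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each row contains exactly one 1, marking the category of that instance.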

17

Log-Odds = ln(\frac{p}{1-p}) =

b_0 + b_1 x

18

Odds Ratio

Compares the relative odds of an outcome given a feature.
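
The logistic regression, log-odds, and odds ratio cards above can be checked numerically. The coefficients below are hypothetical, chosen only to show that the log-odds of the predicted probability recovers b_0 + b_1 x, and that raising x by one unit multiplies the odds by e^{b_1}.

```python
import math


def logistic(x, b0, b1):
    """Predicted probability from the logistic regression formula."""
    z = b0 + b1 * x
    return math.exp(z) / (1 + math.exp(z))


def odds(p):
    return p / (1 - p)


def log_odds(p):
    return math.log(odds(p))


b0, b1 = -1.0, 0.5  # hypothetical coefficients
p_at_3 = logistic(3.0, b0, b1)
p_at_4 = logistic(4.0, b0, b1)

recovered_linear_part = log_odds(p_at_3)   # close to b0 + b1 * 3.0
odds_ratio = odds(p_at_4) / odds(p_at_3)   # close to math.exp(b1)
```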

19

A model is ___ if it is too simple to fit the given data.

Underfit

20

A model is ___ if it is so complex that it fits the noise in the given data and generalizes poorly to new data.

Overfit

21

Ideally, a model ___ pass through every point on a graph.

Shouldn’t

22

When two models fit the data comparably well, the ___ complex model is preferred over the ___ complex model.

Less, More

23

Total Error

How much the observed values differ from predicted values.

24

Bias

How much a model’s average prediction differs from the observed values.

25

Variance

How spread out a model’s predictions are when it is trained on different samples of data.

26

Irreducible Error

Error inherent to the situation, unaffected by the model.

27

A complex model will have more ___ than ___.

Variance, Bias

28

A simple model will have more ___ than ___.

Bias, Variance

29

Machine Learning Algorithm

Uses data to build a model that makes predictions.

30

Regression

A machine learning model used to predict numerical values.

31

Classification

A machine learning model used to predict categorical values.

32

Model Training

The process of estimating model parameters used to make a prediction.

33

___ data is used to fit a model.

Training

34

___ data is used to evaluate a model’s performance while working on the model.

Validation

35

___ data is used to evaluate the final model’s performance compared to other models.

Test
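
One possible way to produce the training, validation, and test sets described above is a shuffled split. The 60/20/20 proportions, the seed, and the function name below are assumptions for illustration, not a prescribed recipe.

```python
import random


def train_validation_test_split(instances, train_frac=0.6,
                                validation_frac=0.2, seed=0):
    """Shuffle the instances and cut them into three disjoint sets."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_validation = int(len(shuffled) * validation_frac)
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_validation]
    test = shuffled[n_train + n_validation:]
    return train, validation, test


train, validation, test = train_validation_test_split(range(10))
```

With ten instances this yields sets of 6, 2, and 2 instances, and every instance appears in exactly one set.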

36

Loss Function

Quantifies the difference between a model’s predictions and the observed values.

37

Regression Metric

The value returned by a loss function.

38

The lower the regression metric, the ___ the model is.

Better

39

Mean Squared Error =

\frac{1}{n} \sum (y_i - \hat{y}_{i})^{2}

40

Mean Squared Error

The average of the squared residuals; it reflects both a model’s bias and its variance.

41

Mean Absolute Error =

\frac{1}{n} \sum |y_i - \hat{y}_{i}|

42

Mean Absolute Error

Like Mean Squared Error, but less influenced by outliers because the residuals are not squared.
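
To see the outlier effect mentioned above, the sketch below computes both metrics on a small made-up dataset, then again after one observation is replaced by an outlier. The function names and numbers are illustrative.

```python
def mean_squared_error(y_true, y_pred):
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)


predicted = [2.0, 5.0, 8.0, 9.0]
observed = [3.0, 5.0, 7.0, 9.0]
observed_with_outlier = [3.0, 5.0, 7.0, 19.0]  # last value is an outlier

mse = mean_squared_error(observed, predicted)                       # 0.5
mae = mean_absolute_error(observed, predicted)                      # 0.5
mse_outlier = mean_squared_error(observed_with_outlier, predicted)  # 25.5
mae_outlier = mean_absolute_error(observed_with_outlier, predicted) # 3.0
```

A single residual of 10 raises the MSE from 0.5 to 25.5 but the MAE only from 0.5 to 3.0.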

43

Absolute Loss

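Quantifies the loss due to uncertainty in a predicted probability, as the absolute difference between the observed class and that probability.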

44

L_{abs}(y,\hat{p})=|y-\hat{p}| where y is the ___ and \hat{p} is the ___.

Observed class, Predicted probability

45

An instance is ___ if the output feature’s value is known for that instance.

Labeled

46

Supervised Learning

Training a model to predict a labeled output feature.

47

A model is ___ if the relationship between input and output features in the model is easy to explain.

Interpretable

48

A model is ___ if the outputs produced by the model match the actual outputs with new data.

Predictive

49

K-Nearest Neighbors

A supervised learning algorithm that predicts the output of a new instance using instances with similar inputs.

50

Metric

A method of determining the distance between two instances.
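
A bare-bones sketch of k-nearest neighbors classification with a pluggable distance metric. It assumes a simple majority vote among the k closest training instances; the function names and the toy data are invented for this example.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def manhattan(a, b):
    return sum(abs(ai - bi) for ai, bi in zip(a, b))


def knn_predict(train_inputs, train_labels, query, k=3, metric=euclidean):
    """Predict the label of `query` by majority vote of its k nearest neighbors."""
    # Sort labeled instances by distance to the query and keep the k closest.
    by_distance = sorted(zip(train_inputs, train_labels),
                         key=lambda pair: metric(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


train_inputs = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0),
                (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
train_labels = ["a", "a", "a", "b", "b", "b"]

near_a = knn_predict(train_inputs, train_labels, (1.2, 1.1), k=3)
near_b = knn_predict(train_inputs, train_labels, (6.8, 7.2), k=3,
                     metric=manhattan)
```

Swapping `euclidean` for `manhattan` changes how "similar inputs" is measured without changing the rest of the algorithm.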

51

Confusion Matrix

A table that summarizes the combinations of predicted and actual values.

52

Accuracy =

\frac{\text{TP} + \text{TN}}{\text{TP}+\text{FP}+\text{TN}+\text{FN}}

53

Precision =

\frac{\text{TP}}{\text{TP} + \text{FP}}

54

Recall =

\frac{\text{TP}}{\text{TP}+\text{FN}}

55

Receiver Operating Characteristic Curve (ROC Curve)

Plots the true-positive rate against the false-positive rate to show how well a classification model distinguishes between classes at various probability thresholds.

56

Area Under The ROC Curve (AUC)

A single number between 0 and 1 that summarizes the ROC curve and is used to compare the performance of classification models.

57

Naive Bayes Classification

A supervised learning classifier that uses the number of times a category occurs in a class to estimate the likelihood of an instance being in that class.

58

P(\text{class}|\text{data}) indicates ___.

The probability of an instance being in \text{class} given \text{data}.

59

Laplace Smoothing

Adds one fictional instance to each category count in a class so that a category with no instances does not get a probability of zero.
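
A small sketch of the common add-one form of Laplace smoothing, which adds one to every category count rather than only to the empty ones. That variant, the function name, and the word counts are assumptions for illustration.

```python
def smoothed_likelihoods(counts, alpha=1):
    """Estimate P(category | class) from counts with add-alpha smoothing."""
    total = sum(counts.values())
    n_categories = len(counts)
    return {category: (count + alpha) / (total + alpha * n_categories)
            for category, count in counts.items()}


# Hypothetical word counts observed in the class "spam".
spam_counts = {"free": 3, "hello": 0, "meeting": 1}

unsmoothed_hello = spam_counts["hello"] / sum(spam_counts.values())  # 0.0
likelihoods = smoothed_likelihoods(spam_counts)
# likelihoods["hello"] is 1/7 instead of 0
```

Without smoothing, the zero for "hello" would wipe out the whole product of likelihoods for the class; with smoothing, every category keeps a small non-zero probability and the estimates still sum to 1.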

60

Naive Bayes Classification assumes all categories are ___.

Independent of each other (given the class) and equally important

61

Support Vector Machine

A supervised learning algorithm that uses hyperplanes to divide data into different classes.

62

Hyperplane

A flat surface that is one dimension lower than the input feature space.

63

A dataset is ___ if a hyperplane can divide the dataset so that all instances of one class fall on one side and everything else falls on the other.

Well-Separated (linearly separable)

64

Margin

The space between a hyperplane and its support vectors.

65

Support Vectors

The closest instances to a hyperplane.

66

Vectors on the wrong side of a hyperplane are often given a ___.

Penalty

67

Hinge Function

Takes a vector’s distance from the margin as input; returns 0 if the vector is on the correct side and a penalty that grows linearly with the distance if it is on the wrong side.
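
As a sketch, the hinge function is often written as max(0, 1 - y * score), where y is the true class coded as +1 or -1 and score is the signed output of the SVM's decision function. That coding and the function name are assumptions here, not something the card specifies.

```python
def hinge_loss(y, score):
    """Hinge loss for a true class y in {-1, +1} and a signed decision score."""
    return max(0.0, 1.0 - y * score)


beyond_margin = hinge_loss(+1, 2.5)    # 0.0: correct side, outside the margin
inside_margin = hinge_loss(+1, 0.5)    # 0.5: correct side, but inside the margin
wrong_side = hinge_loss(+1, -1.0)      # 2.0: wrong side of the hyperplane
negative_class = hinge_loss(-1, -3.0)  # 0.0: correct side for the -1 class
```

In this formulation the penalty starts at the margin rather than at the hyperplane itself, so a vector that is correctly classified but inside the margin still receives a small penalty.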

68

Sensitivity/Recall

The True-Positive rate.

69

Specificity

The True-Negative rate.

70

Accuracy

The ratio of the number of correct labels to the total labels.

71

Misclassification Rate

The ratio of the number of incorrect labels to the total labels.

72

Misclassification Rate =

1 - \text{Accuracy}

73

F1 Score

A number between 0 and 1 that represents the harmonic mean of precision and recall.

74

F1 Score =

2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}

75

Sensitivity =

\frac{\text{TP}}{\text{TP}+\text{FN}}

76

Specificity =

\frac{\text{TN}}{\text{TN}+\text{FP}}
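
The classification metric formulas on the cards above can be computed directly from the four confusion matrix counts. The counts below are made up, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "misclassification_rate": 1 - accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

For these counts the accuracy is 0.85, the precision 0.8, the recall about 0.889, the specificity about 0.818, and the F1 score about 0.842.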

77

Entropy

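A measure of the impurity (uncertainty) of a set of instances: 0 when every instance belongs to the same class, and highest when the classes are evenly mixed.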

78

Steps to make a decision tree:

Calculate the entropy of the decision, split the table by each attribute into subtables and calculate their entropy, choose the attribute with the largest information gain (the lowest entropy after the split), then repeat the process on each subtable.

79

Information Gain

The entropy before a split minus the weighted entropy after the split.

80

Heuristic

The rule used to pick the splitting attribute: choose the attribute that produces the purest nodes.

81

Entropy / Expected Information needed to classify a tuple in D =

\text{Info}(D)=-\sum_{i=1}^{m}p_{i}\log_{2}(p_{i})

82

Information needed to classify D after using A to split D into v partitions=

\text{Info}_{A}(D)=\sum_{j=1}^{v}\frac{|D_{j}|}{|D|}\times\text{Info}(D_{j})

83

Information gained by branching on attribute A=

\text{Gain}(A)=\text{Info}(D)-\text{Info}_{A}(D)
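
The three formulas above translate fairly directly into code. The following is a sketch on a tiny made-up table; the attribute names, labels, and function names are all hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Info(D): expected information needed to classify a tuple in D."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Gain(A) = Info(D) - Info_A(D) for a split on `attribute`."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Info_A(D): entropy of each partition weighted by its share of D.
    info_after = sum(len(part) / len(labels) * entropy(part)
                     for part in partitions.values())
    return entropy(labels) - info_after


rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

gain_outlook = information_gain(rows, labels, "outlook")
gain_windy = information_gain(rows, labels, "windy")
best_attribute = max(["outlook", "windy"],
                     key=lambda a: information_gain(rows, labels, a))
```

Splitting on "outlook" separates the classes completely, so its gain equals the full starting entropy of 1 bit, while "windy" leaves both partitions evenly mixed and gains nothing. A decision tree builder would therefore branch on "outlook" first.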

84

When picking a distance metric for kNN, the metric doesn’t have to be the ___ on a graph.

Physical distance

85

The ___ set is used to train the model before testing it.

Training

86

The ___ set is used to test the model’s abilities after training it.

Testing

87

Picking an ___ is the 3rd step in creating a kNN model.

Evaluation Metric

88

The k in kNN represents the ___.

Number of nearest neighbors

89

Unsupervised Learning

Teaching a model to categorize data where no labels are available.

90

kMeans

An unsupervised learning technique that groups similar tuples into k clusters based on their known attributes.

91

Centroids

The center points in a cluster for kMeans.

92

Each cluster in kMeans represents an individual ___.

Group (category) of similar tuples

93

Step 3 of kMeans is to ___.

Move each centroid to the average location of the data points assigned to it

94

kMeans should repeat until ___.

The centroids move either very little or not at all.

95

kMeans has the potential to fall into an ___ or give a ___ answer.

Infinite loop, Useless
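
The kMeans cards above can be sketched as a short loop: assign each point to its nearest centroid, move each centroid to the mean of its points, and stop once the centroids barely move. The starting centroids, the tolerance, and the iteration cap (a guard against looping forever) are assumptions made for this example.

```python
def kmeans(points, centroids, max_iters=100, tol=1e-9):
    """Cluster `points` around the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iters):  # cap the iterations to avoid an infinite loop
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            distances = [sum((p - c) ** 2 for p, c in zip(point, centroid))
                         for centroid in centroids]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for centroid, cluster in zip(centroids, clusters):
            if cluster:
                new_centroids.append(
                    tuple(sum(coords) / len(cluster) for coords in zip(*cluster)))
            else:
                new_centroids.append(centroid)  # leave an empty cluster's centroid
        # Largest squared movement of any centroid in this iteration.
        shift = max(sum((a - b) ** 2 for a, b in zip(old, new))
                    for old, new in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:  # centroids moved very little or not at all
            break
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

On this toy data the centroids settle near (1.33, 1.33) and (8.33, 8.33) after two passes. A poor choice of starting centroids can still produce a clustering that is of little use, which is why the algorithm is often rerun from several starting points.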
