# Machine learning


## Tags and Description

Everything until deep learning

### 398 Terms

1

Probability

Study of uncertainty and randomness, used to model and analyze uncertainty in data.

2

A form of regularization

Ridge regression

3

Rows on a confusion matrix

Correspond to what is predicted

4

Columns on a confusion matrix

Correspond to the known truth

5

The sensitivity metric equation

True positives divided by the sum of true positives and false negatives

6

The specificity metric equation

True negatives divided by the sum of true negatives and false positives
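As a rough illustration of the sensitivity and specificity equations, here is a minimal Python sketch. The function names and the counts are hypothetical, chosen only so that the results line up with the 0.81 and 0.85 examples used in this deck.

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positives divided by (true positives + false negatives)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negatives divided by (true negatives + false positives)."""
    return tn / (tn + fp)


# Hypothetical counts for a heart-disease classifier (illustrative only).
tp, fn = 139, 32   # people with heart disease: correctly / incorrectly classified
tn, fp = 112, 20   # people without heart disease: correctly / incorrectly classified

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```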

7

If sensitivity = 0.81, what does it mean?

Example: it tells us that 81% of the people with heart disease were correctly identified by the logistic regression model

8

If specificity = 0.85, what does it mean?

It means that 85% of the people without heart disease were correctly identified

9

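When a confusion matrix has more than 2 rows, how do we calculate the sensitivity?

For each category, we divide its true positives by the true positives plus the sum of its false negatives (the samples of that category that were predicted to be any other category)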
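A minimal sketch of that calculation for a 3-category confusion matrix, assuming the layout used in this deck (rows are predictions, columns are the known truth). The function name and the counts are hypothetical.

```python
def sensitivity_for_class(matrix, k):
    """Sensitivity for category k; rows = predicted, columns = known truth."""
    true_positives = matrix[k][k]
    # False negatives: samples that truly belong to k but were predicted as another category.
    false_negatives = sum(matrix[row][k] for row in range(len(matrix)) if row != k)
    return true_positives / (true_positives + false_negatives)


# Hypothetical 3x3 confusion matrix (illustrative counts only).
matrix = [
    [8, 1, 1],
    [1, 6, 2],
    [1, 3, 7],
]

for k in range(len(matrix)):
    print(f"category {k}: sensitivity = {sensitivity_for_class(matrix, k):.2f}")
```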

10

What is the function of sensitivity and specificity?

They help us decide which machine learning method would be best for our data

11

If correctly identifying positives is the most important thing, which one should I choose? Sensitivity or specificity?

Sensitivity

12

If correctly identifying negatives is the most important thing, which one should I choose? Sensitivity or specificity?

Specificity

13

ROC

Receiver operating characteristic

14

ROC function

To provide a simple way to summarize all the information, instead of making several confusion matrices

15

The y axis, in ROC, is the same thing as

Sensitivity

16

The x axis, in ROC, is the same thing as

1 - specificity (the false positive rate)

17

True positive rate =

Sensitivity

18

False positive rate =

1 - specificity, that is, false positives divided by the sum of false positives and true negatives

19

In other words, ROC allows us to

Set the right threshold

20

The diagonal line on a ROC graph

Shows where the true positive rate equals the false positive rate, that is, where sensitivity = 1 - specificity

21

The ROC summarizes…

All of the confusion matrices that each threshold produced
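To make the "one confusion matrix per threshold" idea concrete, the sketch below computes one (false positive rate, true positive rate) point for each threshold. The function name, scores, and labels are hypothetical; a label of 1 means positive.

```python
def roc_points(scores, labels):
    """Return one (false positive rate, true positive rate) pair per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Start above every score (nothing classified positive), then lower the threshold.
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for threshold in thresholds:
        # Each threshold implies its own confusion matrix; only TP and FP are needed here.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical predicted probabilities and known truth.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

for fpr, tpr in roc_points(scores, labels):
    print(f"false positive rate = {fpr:.2f}, true positive rate = {tpr:.2f}")
```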

22

AUC

Area under the curve

23

AUC function

To compare one ROC curve to another
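One common way to get the area is the trapezoidal rule, sketched below as an assumption rather than the only option. The two curves are hypothetical lists of (false positive rate, true positive rate) points.

```python
def auc(points):
    """Area under a ROC curve given (false positive rate, true positive rate) points."""
    ordered = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        # Trapezoid between two neighboring points on the curve.
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical curves: method A bows toward the top-left corner, method B is the diagonal.
curve_a = [(0, 0), (0, 2 / 3), (1 / 3, 2 / 3), (1 / 3, 1), (1, 1)]
curve_b = [(0, 0), (1, 1)]

print(f"AUC of A = {auc(curve_a):.3f}")
print(f"AUC of B = {auc(curve_b):.3f}")
```

The curve with the larger area is the one that sits closer to the top-left corner across thresholds.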

24

Precision equation

True positives / (true positives + false positives)

25

Precision

the proportion of positive results that were correctly classified

26

Precision is not distorted by a large number of negatives because

Its calculation does not include the number of true negatives

27

Example when imbalance occurs

When studying a rare disease. In this case, the study will contain many more people without the disease than with the disease
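A small sketch of why this matters, using hypothetical counts for a rare-disease study. The false positive rate looks tiny because it is divided by a huge number of true negatives, while precision, which ignores true negatives, shows that most positive calls are wrong.

```python
def precision(tp: int, fp: int) -> float:
    """True positives / (true positives + false positives)."""
    return tp / (tp + fp)


def false_positive_rate(fp: int, tn: int) -> float:
    """False positives / (false positives + true negatives)."""
    return fp / (fp + tn)


# Hypothetical rare-disease study: 25 people with the disease, 9975 without.
tp, fn = 20, 5
fp, tn = 80, 9895

print(f"false positive rate = {false_positive_rate(fp, tn):.4f}")
print(f"precision           = {precision(tp, fp):.2f}")
```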

28

ROC Curves make it easy to

Identify the best threshold for making a decision

29

The AUC makes it easy to

Decide which categorization method is better

30

Entropy can also be used to

Build classification trees

31

Entropy is also the basis of

Mutual Information

32

Mutual Information

Quantifies the relationship between 2 things

33

Entropy is also the basis of

Relative entropy (the Kullback-Leibler distance) and cross entropy

34

Entropy is used to

quantify similarities and differences

35

If the probability is low, the surprise is

high

36

If the probability is high, the surprise is

low

37

The entropy of the result of X is

The expected surprise every time we draw from the data

38

Entropy IS

The expected value of the surprise

39

We can rewrite entropy using

The sigma notation

40

Equation for surprise

$\text{Surprise} = \log\left(\frac{1}{p(x)}\right)$

41

Equation for entropy

$\text{Entropy} = \sum_x p(x) \log\left(\frac{1}{p(x)}\right)$

42

Entropy

Is the expected value of the log of the inverse of the probability
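The surprise and entropy cards can be sketched in a few lines of Python. Using log base 2 is an assumption made here for illustration (other bases only rescale the result), and the function names are hypothetical.

```python
import math


def surprise(p: float) -> float:
    """Log of the inverse of the probability (base 2 is an assumption)."""
    return math.log2(1 / p)


def entropy(probabilities) -> float:
    """Expected value of the surprise: each surprise weighted by its probability."""
    return sum(p * surprise(p) for p in probabilities if p > 0)


print(f"surprise(0.9) = {surprise(0.9):.3f}")   # high probability, low surprise
print(f"surprise(0.1) = {surprise(0.1):.3f}")   # low probability, high surprise
print(f"entropy of a fair coin   = {entropy([0.5, 0.5]):.3f}")
print(f"entropy of a biased coin = {entropy([0.9, 0.1]):.3f}")
```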

43

R² (R squared) does not work for

Binary data (yes or no)

44

R squared works for

Continuous data

45

Mutual information is

A numeric value that gives us a sense of how closely related two variables are

46

Equation for mutual information

$\text{MI} = \sum_x \sum_y p(x, y) \log\left(\frac{p(x, y)}{p(x)\,p(y)}\right)$

47

Joint probabilities

The probability of two things occurring at the same time

48

Marginal probabilities

In contrast to a joint probability, the probability of one thing occurring, regardless of the other
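The sketch below shows how joint and marginal probabilities combine into mutual information, using the standard definition. The function name, the log base 2, and both joint tables are assumptions made for illustration.

```python
import math


def mutual_information(joint) -> float:
    """Mutual information of two variables from a table of joint probabilities."""
    # Marginal probabilities: sum the joint table across rows and across columns.
    row_marginals = [sum(row) for row in joint]
    col_marginals = [sum(col) for col in zip(*joint)]
    total = 0.0
    for i, row in enumerate(joint):
        for j, p_xy in enumerate(row):
            if p_xy > 0:
                total += p_xy * math.log2(p_xy / (row_marginals[i] * col_marginals[j]))
    return total


# Hypothetical joint tables for two yes/no variables.
unrelated = [[0.25, 0.25], [0.25, 0.25]]   # knowing one tells us nothing about the other
related = [[0.5, 0.0], [0.0, 0.5]]         # knowing one fully determines the other

print(f"unrelated: {mutual_information(unrelated):.3f}")
print(f"related:   {mutual_information(related):.3f}")
```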

49

Least squares =

Linear regression

50

squaring ensures

That each term is positive

51

Sum of squared residuals

A measure of how well the line fits the data

52

Sum of Squared Residuals function

The residuals are the differences between the real data and the line, and we are summing the square of these values

53

The sum of squared residuals should be

As low as possible

54

First step when working with bias and variance

Split the data in 2 sets, one for training and one for testing

55

How do we find the optimal rotation for the line?

We take the derivative of the sum of squared residuals. The derivative tells us the slope of that function at every point, and the best fit is where the slope is zero

56

Least squares final line

The line that minimizes the sum of the squared distances between it and the real data
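For a straight line, setting the derivatives of the sum of squared residuals to zero gives a closed-form answer. The sketch below uses that standard closed form with hypothetical data and function names, and checks that nudging the slope makes the fit worse.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_of_squared_residuals(xs, ys, intercept, slope):
    """Square each difference between the data and the line, then add them up."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data.
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

intercept, slope = fit_line(xs, ys)
best = sum_of_squared_residuals(xs, ys, intercept, slope)
rotated = sum_of_squared_residuals(xs, ys, intercept, slope + 0.5)

print(f"intercept = {intercept:.2f}, slope = {slope:.2f}")
print(f"SSR of fitted line = {best:.3f}, SSR after rotating the line = {rotated:.3f}")
```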

57

The first thing you do in linear regression

Use least squares to fit a line to the data

58

The second thing you do in linear regression

calculate r squared

59

The third thing you do in linear regression

Calculate a p-value for R²

60

Residual

The vertical distance from the line to a data point

61

SS(Mean)

Sum of squares around the mean

62

SS(Fit)

Sum of squares around the least squares fit

63
64

Linear regression is also called:

Least squares

65

What is bias?

The inability of a machine learning method, like linear regression, to capture the true relationship

66

How do we measure how well the line fits the training set?

By calculating the sum of squares: we measure the distances from the data points to the fitted line, square them, and add them up

67

How do we measure how well the line fits the testing set?

68

Overfit

When the line fits the training set well but does not fit the testing set well
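A toy sketch of overfitting, under stated assumptions: the data are hypothetical points scattered around y = 2x, and a "memorizer" that returns the y value of the nearest training point stands in for a squiggly line that passes through every training point.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope for a straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sum_of_squares(predict, xs, ys):
    """Sum of squared distances between predictions and observed values."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical training and testing sets.
train_x, train_y = [1, 2, 3, 4, 5], [2.5, 3.5, 6.8, 7.4, 10.6]
test_x, test_y = [1.4, 2.6, 3.4, 4.6], [2.8, 5.2, 6.8, 9.2]

intercept, slope = fit_line(train_x, train_y)


def straight_line(x):
    return intercept + slope * x


def memorizer(x):
    """Stand-in for a squiggly line: return the y of the nearest training x."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


for name, model in [("straight line", straight_line), ("memorizer", memorizer)]:
    train_ss = sum_of_squares(model, train_x, train_y)
    test_ss = sum_of_squares(model, test_x, test_y)
    print(f"{name}: training SS = {train_ss:.2f}, testing SS = {test_ss:.2f}")
```

The memorizer fits the training set perfectly yet does worse on the testing set than the straight line, which is the pattern this card describes.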

69

Ideal algorithm

Low bias: it accurately models the true relationship

70

Low variability

Producing consistent predictions across different datasets

71

How least squares determines the values of the equation's parameters

It chooses the values that minimize the sum of the squared residuals

72

y = y-intercept + slope × x

Linear regression

73

y = y-intercept + slope1 × x + slope2 × z

Multiple regression

74

Equation for R² (R squared)

$R^2 = \frac{SS(\text{mean}) - SS(\text{fit})}{SS(\text{mean})}$
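A minimal sketch of this equation in Python. The data, the fitted intercept and slope, and the function name are hypothetical; SS(mean) and SS(fit) follow the definitions on the earlier cards.

```python
def r_squared(xs, ys, intercept, slope):
    """(SS(mean) - SS(fit)) / SS(mean) for the line y = intercept + slope * x."""
    mean_y = sum(ys) / len(ys)
    ss_mean = sum((y - mean_y) ** 2 for y in ys)                              # around the mean
    ss_fit = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))  # around the fit
    return (ss_mean - ss_fit) / ss_mean


# Hypothetical data with an (assumed) least squares fit of intercept 0.15 and slope 1.94.
xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

print(f"R squared for the fitted line = {r_squared(xs, ys, 0.15, 1.94):.3f}")
print(f"R squared for a flat line at the mean = {r_squared(xs, ys, 5.0, 0.0):.3f}")
```

A flat line at the mean explains none of the variation, so its value is 0; a line that hugs the data gets a value close to 1.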

75

Goal of a t test

Compare means and see if they are significantly different from each other

76

Odds are NOT

Probabilities

77

ODDS are

The ratio of something happening (for example, the team winning) to something not happening (for example, the team NOT winning)

78

Logit function

The log of the ratio of the probabilities, log(p / (1 - p)); it forms the basis for logistic regression

79

log(odds)

Log of the odds
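The relationship between probabilities, odds, and log(odds) can be sketched as follows. The function names are hypothetical, and the natural log is an assumption (it is the usual choice for logistic regression).

```python
import math


def odds_from_probability(p: float) -> float:
    """Odds = probability of happening / probability of not happening."""
    return p / (1 - p)


def logit(p: float) -> float:
    """Log of the odds, log(p / (1 - p))."""
    return math.log(p / (1 - p))


def probability_from_log_odds(log_odds: float) -> float:
    """Invert the logit to get back to a probability."""
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical team that wins 5 games and loses 3: probability 5/8, odds 5 to 3.
p_win = 5 / 8
print(f"probability = {p_win:.3f}")
print(f"odds        = {odds_from_probability(p_win):.3f}")
print(f"log(odds)   = {logit(p_win):.3f}")
```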

80

What are log(odds) used for?

Log(odds) are useful for determining probabilities about win/lose, yes/no, or true/false outcomes

81

Odds ratio

The ratio of one odds to another odds

82

Relationship between the odds ratio and the log(odds ratio)

Both indicate a relationship between 2 things, for example between a mutated gene and cancer: whether or not having the mutated gene increases the odds of having cancer
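A short sketch of an odds ratio for the mutated-gene example. The counts and the function name are hypothetical; the odds ratio is simply the odds in one group divided by the odds in the other.

```python
import math


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """(a / b) / (c / d): odds in the first group divided by odds in the second."""
    return (a / b) / (c / d)


# Hypothetical counts (illustrative only).
with_gene_cancer, with_gene_no_cancer = 23, 117
without_gene_cancer, without_gene_no_cancer = 6, 210

ratio = odds_ratio(with_gene_cancer, with_gene_no_cancer,
                   without_gene_cancer, without_gene_no_cancer)
log_ratio = math.log(ratio)

print(f"odds ratio      = {ratio:.2f}")
print(f"log(odds ratio) = {log_ratio:.2f}")
```

An odds ratio above 1 (a log(odds ratio) above 0) suggests the odds of cancer are higher with the mutated gene; a significance test is still needed before drawing conclusions.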

83

Tests used to determine p-values for the log(odds ratio)

Fisher's exact test, the chi-square test, and the Wald test

84

Large r squared implies…

A large effect

85

Machine Learning

Using data to predict something

86

Example of continuous data

Weight and age

87

Example of discrete data

Genotype and astrological sign

88

Which curve is better: the one with the maximum likelihood or the minimum?

Maximum likelihood

89

Type of regression used to assess which variables are useful for classifying samples

Logistic regression

90

Examples of Generalized Linear Models (GLMs)

Logistic regression and linear models

91

The slope indicates

the rate at which the probability of a particular event occurring changes as the independent variable changes.

92

Logit function

$\log\left(\frac{p}{1 - p}\right)$; p is the middle line

93

If the coefficient estimate in logistic regression is negative, the odds are

Against the event. For example, if you don't weigh anything, the odds are against you being obese

94

If the coefficient estimate is positive, that means that

For every one-unit increase in x, the log(odds) of y increases by the value of the coefficient
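Because logistic regression coefficients are on the log(odds) scale, a one-unit increase in x adds the coefficient to the log(odds), which multiplies the odds by e raised to the coefficient. The sketch below checks this with hypothetical coefficient values and function names.

```python
import math

# Hypothetical fitted coefficients, on the log(odds) scale.
intercept = -3.48
slope = 1.83


def log_odds(x: float) -> float:
    """The fitted line: log(odds) = intercept + slope * x."""
    return intercept + slope * x


def probability(x: float) -> float:
    """Convert the log(odds) at x back into a probability."""
    return 1 / (1 + math.exp(-log_odds(x)))


increase = log_odds(2.0) - log_odds(1.0)
p1, p2 = probability(1.0), probability(2.0)
odds_multiplier = (p2 / (1 - p2)) / (p1 / (1 - p1))

print(f"increase in log(odds) per unit of x = {increase:.2f}")
print(f"odds are multiplied by {odds_multiplier:.2f} (e ** slope = {math.exp(slope):.2f})")
```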

95

In logistic regression, how do we use the z value to confirm that a coefficient is statistically significant?

The z value is greater than 2 in absolute value (for example, 2.255), which corresponds to a p-value less than 0.05 (for example, 0.0241)
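The link between the z value and the p-value in this card can be reproduced with the standard normal distribution, assuming a two-sided Wald test. The function names, the estimate, and the standard error below are hypothetical; only the z value 2.255 comes from the card.

```python
import math


def wald_z(estimate: float, std_error: float) -> float:
    """z value: the estimate divided by its standard error."""
    return estimate / std_error


def two_sided_p_value(z: float) -> float:
    """Probability of a standard normal value at least as extreme as z."""
    return math.erfc(abs(z) / math.sqrt(2))


print(f"z = {wald_z(1.8, 0.8):.2f}")                        # hypothetical estimate and SE
print(f"p-value for z = 2.255: {two_sided_p_value(2.255):.4f}")
print(f"p-value for z = 1.000: {two_sided_p_value(1.0):.4f}")
```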

96

What is the difference between the coefficients used for linear models and for logistic regression?

They are exactly the same, except that the logistic regression coefficients are in terms of log(odds)

97

In logistic regression, what is the scale of the coefficients?

Log(odds)

98

How are lines fit in linear regression?

By using least squares: we measure the residuals (the distances between the data and the line) and then square them so that negative values do not cancel out positive values

99

Line with the smallest sum of squared residuals is

The best line

100

Line with the biggest sum of squared residuals is

The worst line

