1

machine learning is the study of _________ that improve their __________ at some ________ with ___________

algorithms; performance; task; experience

2

well-defined learning task: < ____ >

P, T, E

3

machine learning is good at recognizing ________, recognizing _________, and _________

patterns; anomalies; prediction

4

deep learning is a type of __________ ________ _________

artificial neural network

5

more than 2 hidden layers makes it a _____ _______ _________ (____)

deep neural network (DNN)

6

the most popular machine learning algorithm

deep learning

7

the ______ in Python is important because it indicates a block of code

indentation

8

comments in python use ___; block comments use three ___ or ___

#; '; "

9

variable names in Python are _____ _________ and must start with a ______ or the ________ character, never a _________

case sensitive; letter; underscore; number

10

boolean in Python is declared as _______

bool

11

Python for loop syntax through myList

for x in myList:

12

Python for loop syntax for range of 10

for x in range(10):

13

used in Python to store data values in key:value pairs; they are _______ (since Python 3.7) and do not allow ________ keys

dictionaries; ordered; duplicate
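
A short sketch of both properties, using a hypothetical `ages` dictionary made up for illustration. Keys keep their insertion order, and repeating a key overwrites the earlier value rather than storing a second entry.

```python
# Hypothetical example data; "ann" appears twice on purpose.
ages = {"ann": 30, "bob": 25, "ann": 31}

print(ages)        # {'ann': 31, 'bob': 25}  (the later value wins)
print(list(ages))  # ['ann', 'bob']          (insertion order is kept)
```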

14

Python function definition

def myFunction(input):

15

what do you add to your parameter for an arbitrary number of arguments?

*
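
A minimal sketch of the `*` prefix in use. The function name `total` and the parameter name `values` are hypothetical; inside the function the arguments arrive packed into a tuple.

```python
def total(*values):
    # values is a tuple holding every positional argument that was passed
    return sum(values)

print(total(1, 2, 3))  # 6
print(total())         # 0
```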

16

declare arr [1 2 3 4] as a NumPy array

arr = np.array([1, 2, 3, 4])

17

declare 2×2 matrix (mat) as numpy array

mat = np.array([[1,1],[2,2]])

18

check the dimension of numpy array (arr) TWO WAYS

arr.ndim; arr.shape
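
A quick sketch of what the two attributes report, assuming NumPy is imported as `np`. `ndim` gives the number of axes, while `shape` gives the length along each axis. A 2×2 matrix is included for comparison; its rows are wrapped in one outer list.

```python
import numpy as np

arr = np.array([1, 2, 3, 4])
mat = np.array([[1, 1], [2, 2]])  # one outer list containing the rows

print(arr.ndim, arr.shape)  # 1 (4,)
print(mat.ndim, mat.shape)  # 2 (2, 2)
```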

19

comprehensive library for creating static, animated, and interactive visualizations in Python

matplotlib

20

a vector is a ____ _____

1D array

21

matrix transpose is an operator that _____ the matrix over its _______, in turn switching the _____ and ______

flips; diagonal; rows; columns

22

v = [a,b]

f(v) = a² + b²

what is f’(v) with respect to v?

f’(v) = [2a, 2b]

23

for derivatives with a matrix or vector, we normally multiply the ________ and the ____ ________

transpose; one vector

24

python code to find magnitude of vector x

y = x**2

s = np.sum(y)

d = np.sqrt(s)
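
The three steps above can be checked on a small vector. The sketch below uses a made-up vector `x = [3, 4]` and compares the result with `np.linalg.norm`, NumPy's built-in that computes the same quantity for a 1D array.

```python
import numpy as np

x = np.array([3.0, 4.0])        # example vector chosen for illustration

y = x**2                         # square each component
s = np.sum(y)                    # add the squares
d = np.sqrt(s)                   # take the square root

print(d)                         # 5.0
print(np.linalg.norm(x))         # 5.0
```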

25

python add/subtract vectors x and y

x + y; x - y

26

numpy dot product for x and y

np.dot(x,y)

27

matplotlib plot function for x, y

plt.plot(x, y, label='My Plot', linewidth=2.0)

28

KNN is ___ __________ _________

non parameter learning

29

non-parameter learning y = _____

parameter learning y = ____

f(X, X_train); f(X,W)

30

non-parameter learning needs the _____ ________ ________ and is very slow in ______ with almost no ______ ________

entire training dataset; inferring; training process

31

similar to using a dictionary to find definitions or synonyms

non-parameter learning

32

parameter learning requires the ____, is very _____ in _______, but takes more ______ in _______

weight; fast; inferring; time; training

33

similar to having the word in your brain to recognize it at once

parameter learning

34

tells you how far the prediction is from the ground truth

loss function

35

common loss function

Loss(y, y^) = sum((y - y^)²)

36

with different combinations of theta0 and theta1, we obtain different ______ ______, it is a ____ surface

loss values; 3D

37

loss value shows how close your _________ __________ ___________ is to the ________ ________

machine learning algorithm; ground truth

38

for a loss value, the _______ the ________

lower; better

39

machine learning aims to find the best ________ so that the ____ ________ obtains the _______ value

parameters; loss function; lowest

40

how do we get the smallest loss value

gradient descent

41

each step of gradient descent uses all of the training examples - this is known as …

batch gradient descent

42

your step size in gradient descent is known as the _________ _____

learning rate
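
The gradient descent cards can be tied together with a small sketch. It assumes a linear model y = theta0 + theta1 * x and a mean squared error loss; the function name, the toy data, and the default learning rate are illustrative choices, not taken from the cards. Every step uses all training examples, which is what makes it batch gradient descent.

```python
import numpy as np

def batch_gradient_descent(x, y, learning_rate=0.1, steps=2000):
    """Fit y = theta0 + theta1 * x by minimizing the mean squared error."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(steps):
        error = theta0 + theta1 * x - y          # computed on every training example
        grad0 = 2.0 * np.mean(error)             # d(loss)/d(theta0)
        grad1 = 2.0 * np.mean(error * x)         # d(loss)/d(theta1)
        theta0 -= learning_rate * grad0          # step size = learning rate
        theta1 -= learning_rate * grad1
    return theta0, theta1

# Toy data generated from y = 1 + 2x, so the fit should approach (1, 2).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x
theta0, theta1 = batch_gradient_descent(x, y)
print(theta0, theta1)
```

A learning rate that is too large makes the updates overshoot and diverge; one that is too small makes convergence slow.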

43

output is discrete in _________

classification

44

output is continuous in _________

regression

45

machine learning is a ____-______ approach

data driven

46

data: any __________ fact, value, text, sound, or picture that has not been _______ and __________

unprocessed; interpreted; analyzed

47

a set of data collected for machine learning based task

dataset

48

a set of data used to discover predictive relationships

training dataset

49

a set of data used to assess the strength and utility of a predictive relationship

test dataset

50

the attributes to each data sample

features

51

KNN stands for:

k nearest neighbors

52

for KNN:

calculate the ____ _________ for every _____ _______

select the ____ data points with the _________ _________

________ based on the K points (the new data point should belong to the same category as the _______ )

L-2 distance; data point; K; smallest distance; voting; majority
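
The three KNN steps above can be sketched as follows. The function name `knn_predict` and the toy training set are hypothetical. Each point has two features, which also shows that the L2 distance works with more than one feature.

```python
import numpy as np
from collections import Counter

def knn_predict(x_new, X_train, y_train, k=3):
    # Step 1: L2 distance from the new point to every training point
    distances = np.sqrt(np.sum((X_train - x_new) ** 2, axis=1))
    # Step 2: indices of the K points with the smallest distance
    nearest = np.argsort(distances)[:k]
    # Step 3: majority vote among the labels of those K points
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy training set: two well-separated groups labeled "a" and "b".
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y_train = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(np.array([0.5, 0.5]), X_train, y_train))  # a
print(knn_predict(np.array([5.5, 5.5]), X_train, y_train))  # b
```

There is no training step: the whole training set is kept and scanned at prediction time, which is why inference is slow.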

53

can you still use KNN if there is more than one feature for distance calculations?

yes

54

when setting up KNN, you can choose two parameters:

the best ______ of ___ for ________

the best _______ for ________

value; k; voting; distance; measuring

55

the parameters you set for KNN are known as _____________ and are not ________ by the machine learning _______ itself

hyperparameters; adapted; algorithm

56

a set of examples used to tune the hyperparameters

validation dataset

57

never use _____ data to _____ _______

test; train model

58

cross validation: when dataset is ______, ______ data, try each fold as _______ and _______

small; split; validation; average
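
A minimal sketch of the splitting part of cross validation, assuming the samples are addressed by index. The helper name `k_fold_indices` is hypothetical. Each fold takes one turn as the validation set while the remaining folds form the training set.

```python
import numpy as np

def k_fold_indices(n_samples, n_folds):
    # Split the sample indices into n_folds roughly equal parts
    folds = np.array_split(np.arange(n_samples), n_folds)
    for i in range(n_folds):
        val_idx = folds[i]                                   # this fold validates
        train_idx = np.concatenate(
            [folds[j] for j in range(n_folds) if j != i]     # the rest train
        )
        yield train_idx, val_idx

for train_idx, val_idx in k_fold_indices(6, 3):
    print("train:", train_idx, "validation:", val_idx)
```

In practice a model would be trained and scored once per fold, and the scores averaged to compare hyperparameter settings.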

59

cross validation is _________ in deep learning

uncommon

60

learning from labeled examples

supervised learning

61

draws inferences from datasets consisting of input data without labeled responses

unsupervised learning

62

supervised learning has pairs with an ______ object and a desired ______ value

input; output

63

unsupervised learning finds ______ ________ or _________ in data

hidden patterns; grouping

64

K-Means Algorithm:

initialize ____ _______ _______

assign _____ ______ to ________ clusters

update _______ _________ by calculating _________

repeat ___ and ___ until _________

select optimal number of ________

K center centroids; data points; nearest; center centroids; average; 2; 3; convergence; clusters
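
The first four K-Means steps above can be sketched as below. The function name `k_means`, the random initialization from existing data points, and the toy data are assumptions made for illustration. Choosing the optimal number of clusters (the last step) is not covered here.

```python
import numpy as np

def k_means(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initialize K centroids by picking K distinct data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 2: assign every data point to its nearest centroid
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(distances, axis=1)
        # Step 3: move each centroid to the average of its assigned points
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Step 4: repeat steps 2 and 3 until the centroids stop moving
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Toy data: two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centroids, labels = k_means(X, k=2)
print(centroids)
print(labels)
```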

65

non-parameter learning requires computation of all of the _______ ________, taking more ______ and ________

training dataset; time; memory

66

non-parameter/parameter, supervised/unsupervised

KNN:

K-Means:

Linear Regression:

non-parameter, supervised; non-parameter, unsupervised; parameter, supervised

67

KNN and K-Means are __________ tasks whereas linear regression is a _________ task

classification; regression

68

linear regression steps

propose model; gradient descent; get parameters and test

69

image recognition is _______; stock price prediction is ________

classification; regression

70

softmax classifier: builds upon ________ _________; _____ the score of class k to the __________ of being in this class; the __________ of being in different classes sum up to ____

linear classification; maps; probability; probabilities; 1
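
A small sketch of the mapping from class scores to probabilities. The scores are made up for illustration. Subtracting the maximum score before exponentiating is a common numerical-stability trick that does not change the result.

```python
import numpy as np

def softmax(scores):
    shifted = scores - np.max(scores)    # avoids overflow in exp; output is unchanged
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores)

probs = softmax(np.array([2.0, 1.0, 0.1]))  # hypothetical class scores
print(probs)
print(probs.sum())  # sums to 1 (up to floating-point rounding)
```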

71

loss over the dataset is the _________ ______ for all _________

average loss; examples

72

three loss functions

MAE; MSE; Cross Entropy

73

MAE: ______ ________ __________

Equation:

mean absolute error; abs(y^ - y)

74

MSE: ______ _________ _________

Equation:

mean squared error; (y^ - y)²
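
Both MAE and MSE take the per-example expression and average it over the dataset. A short sketch with made-up targets `y` and predictions `y_hat`:

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0])        # ground truth (illustrative values)
y_hat = np.array([1.5, 2.0, 1.0])    # predictions (illustrative values)

mae = np.mean(np.abs(y_hat - y))     # (0.5 + 0 + 2) / 3
mse = np.mean((y_hat - y) ** 2)      # (0.25 + 0 + 4) / 3

print(mae, mse)
```

MSE penalizes the single large error (2) much more heavily than MAE does, because the error is squared.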

75

Cross Entropy uses the _________ _____ likelihood of the __________ ________ as the loss

negative log; correct class

76

cross entropy for the following:

true label: [1 0 0 0 0]

softmax: [0.1 0.5 0.1 0.1 0.2]

-(1*log(0.1) + 0*log(0.5) + 0*log(0.1) + 0*log(0.1) + 0*log(0.2)) = -log(0.1) ≈ 2.30 (natural log)
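
The same calculation in NumPy, assuming the natural logarithm. Only the entry for the correct class contributes, because every other term is multiplied by 0.

```python
import numpy as np

true_label = np.array([1, 0, 0, 0, 0])
softmax_out = np.array([0.1, 0.5, 0.1, 0.1, 0.2])

loss = -np.sum(true_label * np.log(softmax_out))
print(loss)  # about 2.303, i.e. -log(0.1)
```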

77

Regularization:

- it is likely that different ___ have the same _____

- regularization helps to _______ ________ and avoid _________

W; loss; express preference; overfitting

78

L(W) including regularization

L(W) = data loss + regularization
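
A sketch of how the regularization term expresses a preference between weights that give the same data loss. It assumes an L2 penalty (sum of squared weights) scaled by a strength `reg_strength`; both the penalty type and the numbers are illustrative assumptions, since the card only states "data loss + regularization".

```python
import numpy as np

def total_loss(data_loss, W, reg_strength=0.1):
    # L(W) = data loss + regularization, here with an assumed L2 penalty
    return data_loss + reg_strength * np.sum(W ** 2)

x = np.ones(4)
W1 = np.array([1.0, 0.0, 0.0, 0.0])      # concentrates on one input
W2 = np.array([0.25, 0.25, 0.25, 0.25])  # spreads weight evenly

print(np.dot(W1, x), np.dot(W2, x))      # same score, so the same data loss
print(total_loss(0.5, W1))               # 0.5 + 0.1 * 1.0
print(total_loss(0.5, W2))               # 0.5 + 0.1 * 0.25
```

With the L2 penalty the evenly spread weights get the lower total loss, so they are preferred.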

79

overfitting: model tries to fit not only the __________ relation between _____ and ______ but also the _______ ________; ________ ______________ helps select simple models

regular; inputs; outputs; sampling errors; weight regularization

80

numerical gradient: __________, ______, easy to _____

analytic gradient: ______, _______, _______ prone

—> in practice we use _______ but check with _________

approximate; slow; write; exact; fast; error; analytic; numerical
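
A sketch of checking an analytic gradient against a numerical one, using f(v) = a² + b² with gradient [2a, 2b] as the test function. The helper names, the step size `h`, and the centered-difference formula are illustrative choices.

```python
import numpy as np

def f(v):
    return np.sum(v ** 2)                # f(v) = a^2 + b^2

def analytic_gradient(v):
    return 2 * v                         # exact and fast, but derived by hand

def numerical_gradient(func, v, h=1e-5):
    # Approximate and slow: two function evaluations per component
    grad = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = h
        grad[i] = (func(v + step) - func(v - step)) / (2 * h)  # centered difference
    return grad

v = np.array([3.0, -1.0])
print(analytic_gradient(v))
print(numerical_gradient(f, v))
```

If the two results disagree by more than a small tolerance, the hand-derived analytic gradient is the usual suspect.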

81

with backpropagation, given f(x, y, z), you’ll end up getting which derivatives

df/dx; df/dy; df/dz

82

in backpropagation, multiply the _________ by the ______ ___________

upstream; local gradient

83

tool used for forward and back propagation

computational graph

84

the local gradient is the _________

derivative

85

the input to the local gradient is found from __________-____________

forward-propagation

86

current gradient =

local gradient * upstream gradient
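
A worked sketch of the rule on a tiny computational graph. The function f(x, y, z) = (x + y) * z and the input values are a hypothetical example, not taken from the cards. The forward pass stores the intermediate value q, and the backward pass starts from the gradient of f with respect to itself, which is 1.

```python
# Forward pass: compute and keep the intermediate values
x, y, z = -2.0, 5.0, -4.0
q = x + y            # q = 3
f = q * z            # f = -12

# Backward pass: current gradient = local gradient * upstream gradient
df_df = 1.0          # gradient of the output with respect to itself
df_dq = z * df_df    # local gradient of q*z with respect to q is z
df_dz = q * df_df    # local gradient of q*z with respect to z is q
df_dx = 1.0 * df_dq  # local gradient of x+y with respect to x is 1
df_dy = 1.0 * df_dq  # local gradient of x+y with respect to y is 1

print(df_dx, df_dy, df_dz)  # -4.0 -4.0 3.0
```

The local gradients z and q are only known because the forward pass computed them first.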

87

what do we assume to begin back propagation if the forward pass is not clear

2

88

the input layer for a neural network

the first layer

89

the output layer for a neural network

the last layer

90

layers in between input and output layers of neural networks

hidden layers

91

neurons between ________ layers are typically connected, neurons _______ the ______ layer are not connected

adjacent; within; same

92

the input layer of a neural network is _________ meaning the ________ is the input

transparent; output

93

two parts in the neurons of hidden layer

accumulation of product; activation function

94

connection among neurons has a _______ and it is the parameter that should be ________

weight; learned

95

which layer has a “special” activation function

output

96

squashes numbers into the range [0, 1] and is popular, used in RNN

sigmoid

97

squashes numbers into the range [-1, 1] and is zero-centered, used in RNN

tanh

98

maps numbers into the range [0, infinity) and does not saturate for positive inputs, used in CNN and FCN

ReLU

99

maps numbers into the range (-infinity, infinity), does not saturate

leaky ReLU
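
The four activation functions from the cards above can be written in a few lines of NumPy. The leaky ReLU slope of 0.01 is a commonly used default and an assumption here; the cards do not give a value.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))          # output between 0 and 1

def relu(x):
    return np.maximum(0.0, x)                # zero for negative inputs

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)     # small negative slope instead of zero

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x))
print(np.tanh(x))                            # output between -1 and 1, zero-centered
print(relu(x))
print(leaky_relu(x))
```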

100

for choosing activation functions, typically choose _____, then _____ _______ if that doesn’t work; sometimes _____, not normally _______

ReLU; leaky ReLU; tanh; sigmoid
