Lecture 7: Shallow Neural Networks

Last updated 10:36 AM on 5/20/26

80 Terms

1

How does AWS define a NN?

knowt flashcard image

2

How do Feedforward NNs work?

knowt flashcard image

3

How do Convolutional NNs work?

knowt flashcard image

4

How do Recurrent NNs work?

knowt flashcard image

5

What are the two types of Feedforward NNs?

Shallow and Deep

6

In an SNN, what are the input and output layers connected by?

A single hidden layer; having only one hidden layer is what makes the network shallow

7

How many units does this Hidden Layer have?

Three units (h₁, h₂, h₃)

8

What are the inputs into the hidden layer?

knowt flashcard image

9

What are the inputs in the hidden layer then subjected to?

Activation Function, a()

10

What does the Activation Function then give?

knowt flashcard image

11

Where are the outputs from the hidden layer then fed to?

The output layer, giving the final output

12

What is the final output?

knowt flashcard image

13

How many parameters does this NN have?

10
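
The count of 10 follows from the layout in the cards above, assuming a single input and a single output: three weights and three offsets into the hidden layer, then three weights and one offset into the output layer. A minimal Python sketch of that forward pass (the parameter values, the function name `forward` and the placeholder activation are illustrative, not taken from the lecture):

```python
def forward(x, hidden_weights, hidden_offsets, output_weights, output_offset, a):
    """One input -> hidden units h_k = a(offset_k + weight_k * x) -> one output."""
    hidden = [a(b + w * x) for w, b in zip(hidden_weights, hidden_offsets)]
    return output_offset + sum(v * h for v, h in zip(output_weights, hidden))


# Illustrative values only: 3 + 3 + 3 + 1 = 10 parameters in total.
hidden_weights = [1.0, -1.0, 0.5]
hidden_offsets = [0.0, 0.0, 1.0]
output_weights = [1.0, 1.0, 1.0]
output_offset = 0.5

n_parameters = len(hidden_weights) + len(hidden_offsets) + len(output_weights) + 1
print(n_parameters)

# Placeholder activation a(z) = max(0, z); any activation function can be passed in.
y_hat = forward(2.0, hidden_weights, hidden_offsets, output_weights, output_offset,
                a=lambda z: max(0.0, z))
print(y_hat)
```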

14

What do those parameters represent?

knowt flashcard image

15

What Activation Function do we use?

Rectified Linear Unit (RELU)

16

What does the RELU do?

knowt flashcard image
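
The card's own answer is an image, so as a rough illustration: RELU is usually written a(z) = max(0, z), meaning negative inputs are set to zero and positive inputs pass through unchanged. A Python sketch, with `relu` as an illustrative name:

```python
def relu(z):
    """Return z when z is positive, otherwise 0."""
    return max(0.0, z)


for z in (-2.0, 0.0, 3.5):
    print(z, relu(z))
```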

17

How do we define the RELU function in R?

knowt flashcard image

18

How would this look with RELU?

knowt flashcard image

19

What does this plot show and how do we plot it in R?

knowt flashcard image

20

What is a combination of RELUs known as and why is it useful?

knowt flashcard image
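
One way to see why combining RELUs is useful: adding shifted and scaled RELUs gives a piecewise-linear function, and enough pieces can follow a curve closely. The sketch below builds a single "hat" shape from three RELUs; the function names and the breakpoints at 0, 1 and 2 are illustrative choices, not the lecture's example.

```python
def relu(z):
    return max(0.0, z)


def hat(x):
    """Piecewise-linear bump built from three shifted RELUs.

    Zero for x <= 0, rises to 1 at x = 1, falls back to 0 at x = 2, zero afterwards.
    """
    return relu(x) - 2.0 * relu(x - 1.0) + relu(x - 2.0)


for x in (-1.0, 0.5, 1.0, 1.5, 3.0):
    print(x, hat(x))
```

Each RELU contributes one change of slope, so a hidden layer with more units can produce more linear pieces.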

21

How does the Universal Approximation Theorem look graphically?

knowt flashcard image

22

What other Activation Function do we use?

Sigmoid

23

What other Activation Functions exist?

knowt flashcard image

24

What does the algorithm that selects the value of the parameters do?

knowt flashcard image

25

What is the “Sine Wave with Noise” example?

knowt flashcard image
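
The details of the lecture's example are in the image, so the following is only a guess at its general shape: inputs on a grid, targets equal to sin(x) plus random noise. The grid range, sample size, noise level and seed below are all assumptions made for illustration.

```python
import math
import random


def make_noisy_sine(n=100, noise_sd=0.2, seed=1):
    """Return (xs, ys) with ys = sin(xs) + Gaussian noise (all settings hypothetical)."""
    rng = random.Random(seed)  # fixed seed so the data are reproducible
    xs = [2.0 * math.pi * i / (n - 1) for i in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


xs, ys = make_noisy_sine()
print(len(xs), len(ys))
```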

26

What code is used in R to train the NN?

knowt flashcard image

27

What does each of the hyperparameters do?

knowt flashcard image

28

What does the estimated network look like?

knowt flashcard image

29

How many weights and offsets are there when the input feeds into each unit in the hidden layer?

15 weights and 15 offsets

30

How many weights and offsets are there when each unit in the hidden layer feeds into the output layer?

The algorithm then estimates values for both sets

31

What does the estimated NN plot with RELU look like?

knowt flashcard image

32

What is the Sigmoid Activation Function also known as?

Logistic Activation Function

33

In this context, how are the Sigmoid and Logistic functions related?

The same
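
For reference, the sigmoid (logistic) function is σ(z) = 1 / (1 + e^(−z)), which maps any real input to a value between 0 and 1. A small Python sketch, written in a form that avoids overflow for large negative inputs (the name `sigmoid` is illustrative):

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), computed stably for either sign of z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # safe: z is negative, so exp(z) cannot overflow
    return e / (1.0 + e)


for z in (-4.0, 0.0, 4.0):
    print(z, sigmoid(z))
```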

34

How do the Sigmoid and RELU Activation Functions compare?

The Sigmoid gives a smoother fit than RELU

35

How is the NN going to recognise the pattern that generates these data?

knowt flashcard image

36

What decision boundaries is the NN trying to form?

knowt flashcard image

37

What is the first step in establishing the pattern?

knowt flashcard image

38

What do the hyperparameters represent?

Decay sets the value of δ

39

What does the trained network look like?

knowt flashcard image

40

What do we mean by the network being “fully-connected”?

Each of the inputs (two in this example) is connected to each of the 10 units in the hidden layer

41

What are the weights in this example?

Each unit in the hidden layer is connected to the single unit in the output layer: this gives another set of weights

42

What are the biases in this example?

Output layer also has a bias

43

How many parameters are there and what do the different colours represent?

knowt flashcard image
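
The card's answer is an image, but the count can be worked out from the architecture described in the preceding cards, assuming every hidden unit and the output unit each carry one bias. A Python sketch (the function name `parameter_breakdown` is illustrative):

```python
def parameter_breakdown(n_inputs, n_hidden, n_outputs=1):
    """Count weights and biases in a fully-connected network with one hidden layer."""
    counts = {
        "hidden_weights": n_inputs * n_hidden,   # every input to every hidden unit
        "hidden_biases": n_hidden,               # one bias per hidden unit
        "output_weights": n_hidden * n_outputs,  # every hidden unit to every output
        "output_biases": n_outputs,              # one bias per output unit
    }
    counts["total"] = sum(counts.values())
    return counts


# Two inputs, 10 hidden units, one output: 20 + 10 + 10 + 1 = 41 parameters.
print(parameter_breakdown(2, 10))
```

Under the same assumptions, two inputs with 50 hidden units give 201 parameters, and with 200 hidden units 801.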

44

What does this tell us about the NN?

Overfitting

45

What do the fitted Decision Boundaries look like?

knowt flashcard image

46

What do the fitted Decision Boundaries look like when combined with the data?

knowt flashcard image

47

What does each dot represent in the image?

Every blue dot in the pink zone and every red dot in the blue zone is a prediction error

48

What happens in this example when we increase the network to 50 units?

knowt flashcard image

49

What does the estimated NN for 50 units look like?

knowt flashcard image

50

What do the fitted Decision Boundaries now look like?

knowt flashcard image

51

What do the more complex fitted Decision Boundaries with the data look like?

knowt flashcard image

52

What happens when we increase the size of the network to 200 units?

knowt flashcard image

53

What does the estimated NN look like for 200 units?

knowt flashcard image

54

What do the fitted Decision boundaries now look like?

The fitted Decision Boundaries now capture the checkerboard very well

55

What do the new fitted Decision Boundaries with the data look like?

knowt flashcard image

56

How did the NN achieve this?

Trained to minimise a loss function

57

What is the resulting loss function?

knowt flashcard image

58

What does this simplify to if δ = 0?

The loss function simplifies to the Binary Cross-Entropy Loss function (similar logic to the entropy index)
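
As a sketch of the idea: for one observation with label y and predicted probability p, binary cross-entropy is −[y log(p) + (1 − y) log(1 − p)], so it equals −log(p) when y = 1 and −log(1 − p) when y = 0. The penalised version below assumes the penalty is δ times the sum of squared weights, a common way of writing weight decay; the lecture's exact formula is in the card image and may differ (for example in averaging rather than summing). Function names and the `eps` clipping are illustrative.

```python
import math


def binary_cross_entropy(y, p, eps=1e-12):
    """Loss for one observation; p is clipped away from 0 and 1 to keep log finite."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def penalised_loss(ys, ps, weights, delta):
    """Summed cross-entropy plus an assumed penalty delta * sum(w^2)."""
    data_term = sum(binary_cross_entropy(y, p) for y, p in zip(ys, ps))
    penalty = delta * sum(w * w for w in weights)
    return data_term + penalty


# A confident correct prediction costs little; a confident wrong one costs a lot.
print(binary_cross_entropy(1, 0.99), binary_cross_entropy(1, 0.01))
```

With `delta=0` the penalty vanishes and only the cross-entropy term remains.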

59

If y = 1, what is the loss function equal to?

knowt flashcard image

60

And what do these results tell us?

knowt flashcard image

61

If y = 0, what is the loss function equal to?

knowt flashcard image

62

And what do the results tell us?

knowt flashcard image

63

Therefore, what is the conclusion about having δ = 0?

knowt flashcard image

64

term image

knowt flashcard image

65

For parameter ω_j, what is the optimality condition?

knowt flashcard image

66

What is the first step in solving this numerically?

knowt flashcard image

67

What do we use these parameter values for? What is that process known as?

knowt flashcard image

68

What do we calculate next and what is this process known as?

knowt flashcard image

69

How do we calculate the gradient for the Backward Pass?

Using Back Propagation

70

What is Back Propagation?

knowt flashcard image

71

What do we do to the value of ω_j?

knowt flashcard image

72

What do we do to the value of ω_j?

knowt flashcard image

73

How do we determine the amount by which the value of ω_j changes?

knowt flashcard image

74

Now that the parameter values have been updated, what do we do next?

knowt flashcard image

75

When does the algorithm stop updating the values of the parameters?

knowt flashcard image
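
The cycle described in the cards above (start from initial parameter values, forward pass, backward pass, update, repeat until a stopping rule is met) can be sketched with plain gradient descent. To keep it short, the example uses a single sigmoid unit rather than a full network, so the gradient is written out by hand instead of using Back Propagation. The learning rate `eta`, tolerance `tol`, `max_iter`, the zero starting values and the toy data are all assumptions; the lecture's stopping rule is in the card image.

```python
import math


def train(xs, ys, eta=0.5, tol=1e-6, max_iter=10_000):
    """Fit p = sigmoid(w * x + b) by gradient descent on mean binary cross-entropy."""
    w, b = 0.0, 0.0  # initial parameter values (arbitrary choice)
    n = len(xs)
    losses = []
    for _ in range(max_iter):
        # Forward pass: predictions and loss at the current parameter values.
        ps = [1.0 / (1.0 + math.exp(-(w * x + b))) for x in xs]
        loss = -sum(y * math.log(p) + (1 - y) * math.log(1.0 - p)
                    for y, p in zip(ys, ps)) / n
        losses.append(loss)
        # Stop once the loss has (almost) stopped improving.
        if len(losses) > 1 and losses[-2] - loss < tol:
            break
        # Backward pass: gradient of the loss with respect to each parameter.
        grad_w = sum((p - y) * x for x, y, p in zip(xs, ys, ps)) / n
        grad_b = sum(p - y for y, p in zip(ys, ps)) / n
        # Update: move each parameter against its gradient, scaled by eta.
        w -= eta * grad_w
        b -= eta * grad_b
    return w, b, losses


w, b, losses = train([-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1])
print(len(losses), losses[0], losses[-1])
```

A parameter with a positive gradient is decreased and one with a negative gradient is increased; `eta` controls the size of each step.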

76

(Example) Why do we need to pre-process this data and how do we do this?

knowt flashcard image

77

How do we calculate the scaled target variable in the training and test data?

knowt flashcard image
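
The lecture's scaling formula is in the image and is not reproduced here. One common approach, shown purely as an assumption, is min-max scaling in which the minimum and maximum come from the training data and are then reused unchanged for the test data. The function names and the example prices are hypothetical.

```python
def fit_min_max(train_values):
    """Return the (min, max) of the training target, to be reused for test data."""
    return min(train_values), max(train_values)


def scale(values, lo, hi):
    """Map values so that lo -> 0 and hi -> 1."""
    return [(v - lo) / (hi - lo) for v in values]


def unscale(scaled, lo, hi):
    """Invert scale(), e.g. to report predictions in original units."""
    return [s * (hi - lo) + lo for s in scaled]


train_prices = [100.0, 200.0, 300.0]  # hypothetical values
test_prices = [150.0, 400.0]          # hypothetical values
lo, hi = fit_min_max(train_prices)
print(scale(train_prices, lo, hi), scale(test_prices, lo, hi))
```

Test values outside the training range fall outside [0, 1], which is expected when the training statistics are reused.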

78

How do we determine the fit of the NN generated across multiple units?

knowt flashcard image

79

What do the MSEs tell us?

There is evidence of overfitting the training data for networks with more than 20 units

80

How might the predicted house price against the actual house price look for the chosen training model with 15 units?

knowt flashcard image