Lecture 7: Shallow Neural Networks


Last updated 10:36 AM on 5/20/26

80 Terms

1

How does AWS define a NN?

knowt flashcard image
2

How do Feedforward NN work?

knowt flashcard image
3

How do Convolutional NN work?

knowt flashcard image
4

How do Recurrent NN work?

knowt flashcard image
5

What are the two types of Feedforward NN?

Shallow and Deep

6

In a SNN, what are the input and output layer connected by?

Single Hidden Layer; single layer makes it shallow

7

How many units does this Hidden Layer have?

Three units (h1, h2, h3)

8

What are the inputs into the hidden layer?

knowt flashcard image
9

What are the inputs in the hidden layer then subjected to?

Activation Function, a()

10

What does the Activation Function then give?

knowt flashcard image
11

Where are the outputs from the hidden layer then fed to?

The output layer, giving the final output

12

What is the final output?

knowt flashcard image
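
The forward computation described in the preceding cards (inputs to the hidden layer, activation, then the output layer) can be sketched as below. This is a minimal Python illustration, not the lecture's own R code; the names `hidden_weights`, `hidden_biases`, `output_weights` and `output_bias`, and all numeric values, are assumptions made for the example.

```python
def relu(z):
    """Rectified linear unit: passes positive values, clips the rest to 0."""
    return max(0.0, z)

def forward(x, hidden_weights, hidden_biases, output_weights, output_bias,
            activation=relu):
    """One input -> one hidden layer -> one output (a shallow network)."""
    # Each hidden unit applies the activation to its own bias + weight * input.
    hidden = [activation(b + w * x)
              for w, b in zip(hidden_weights, hidden_biases)]
    # The output layer is a bias plus a weighted sum of the hidden outputs.
    return output_bias + sum(v * h for v, h in zip(output_weights, hidden))

# Hypothetical parameter values for a network with three hidden units.
y_hat = forward(2.0,
                hidden_weights=[1.0, -1.0, 0.5],
                hidden_biases=[0.0, 1.0, -2.0],
                output_weights=[2.0, 3.0, 4.0],
                output_bias=0.5)
print(y_hat)  # 4.5 (only the first hidden unit is active for x = 2)
```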
13

How many parameters does this NN have?

10
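
The count of 10 can be checked with a short Python sketch, assuming a fully connected network with one input, three hidden units, one output, and a bias on every hidden and output unit. The function name `count_parameters` is hypothetical.

```python
def count_parameters(n_inputs, n_hidden, n_outputs=1):
    """Count weights and biases in a fully connected one-hidden-layer network."""
    # Each hidden unit has one weight per input plus one bias.
    hidden_layer = n_inputs * n_hidden + n_hidden
    # Each output unit has one weight per hidden unit plus one bias.
    output_layer = n_hidden * n_outputs + n_outputs
    return hidden_layer + output_layer

print(count_parameters(1, 3))  # 10
```

Under the same assumptions, two inputs and ten hidden units would give `count_parameters(2, 10)`, which is 41.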

14

What do those parameters represent?

knowt flashcard image
15

What Activation Function do we use?

Rectified Linear Unit (RELU)

16

What does the RELU do?

knowt flashcard image
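
As a rough illustration, the standard definition of the RELU returns its input when the input is positive and 0 otherwise. The sketch below is in Python rather than the R used in the lecture, and the test inputs are arbitrary.

```python
def relu(z):
    """Return z when z is positive, otherwise 0."""
    return max(0.0, z)

values = [relu(z) for z in (-2.0, 0.0, 3.5)]
print(values)  # [0.0, 0.0, 3.5]
```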
17

How can we define the RELU function in R?

knowt flashcard image
18

How would this look with RELU?

knowt flashcard image
19

What does this plot show and how do we plot it in R?

knowt flashcard image
20

What is a combination of RELUs known as and why is it useful?

knowt flashcard image
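
One way to see why combining RELUs is useful: a weighted sum of shifted RELUs is a continuous piecewise linear function, so adding more units adds more "kinks" with which to follow a target curve. The Python sketch below builds a triangular bump from three RELUs; the particular weights and shifts are chosen only for illustration and do not come from the lecture.

```python
def relu(z):
    return max(0.0, z)

def tent(x):
    """Sum of three shifted RELUs: rises on [0, 1], falls on [1, 2], else 0."""
    return relu(x) - 2.0 * relu(x - 1.0) + relu(x - 2.0)

points = [tent(x) for x in (-1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 3.0)]
print(points)  # [0.0, 0.0, 0.5, 1.0, 0.5, 0.0, 0.0]
```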
21

How does the Universal Approximation Theorem look graphically?

knowt flashcard image
22

What other Activation Function do we use?

Sigmoid

23

What other Activation Functions exist?

knowt flashcard image
24

What does the algorithm that selects the value of the parameters do?

knowt flashcard image
25

What is the “Sine Wave with Noise” example?

knowt flashcard image
26

What code is used in R to train the NN?

knowt flashcard image
27

What does each of the hyperparameters do?

knowt flashcard image
28

What does the estimated network look like?

knowt flashcard image
29

How many weights and offsets are there when the input feeds into each unit in the hidden layer?

15 weights and 15 offsets

30

How many weights and offsets are there when each unit in the hidden layer feeds into the output layer?

The algorithm then estimates values for both sets

31

What does the estimated NN plot with RELU look like?

knowt flashcard image
32

What is the Sigmoid Activation Function also known as?

Logistic Activation Function

33

In this context, how are the Sigmoid and Logistic functions related?

The same
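
For reference, the sigmoid (logistic) function maps any real number z to 1 / (1 + exp(-z)), a value strictly between 0 and 1. Below is a minimal Python sketch; the branch on the sign of z is an implementation choice made here to avoid overflow for very negative inputs, not something stated in the lecture.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use the equivalent form exp(z) / (1 + exp(z)).
    e = math.exp(z)
    return e / (1.0 + e)

print(sigmoid(0.0))  # 0.5
```

Unlike the RELU, the curve has no kink, which is consistent with the smoother fits noted for the sigmoid.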

34

How do the Sigmoid and RELU Activation Functions compare?

Sigmoid gives a smoother fit than RELU

35

How is the NN going to recognise the pattern that generates these data?

knowt flashcard image
36

What decision boundaries is the NN trying to form?

knowt flashcard image
37

What is the first step in establishing the pattern?

knowt flashcard image
38

What do the hyperparameters represent?

Decay sets the value of δ

39

What does the trained network look like?

knowt flashcard image
40

What do we mean by the network being “fully-connected”?

Each of the inputs (two in this example) is connected to each of the 10 units in the hidden layer

41

What are the weights in this example?

Each unit in the hidden layer is connected to the single unit in the output layer: this gives another set of weights

42

What are the biases in this example?

Output layer also has a bias

43

How many parameters are there and what do the different colours represent?

knowt flashcard image
44

What does this tell us about the NN?

Overfitting

45

What do the fitted Decision Boundaries look like?

knowt flashcard image
46

What do the fitted Decision Boundaries look like when combined with the data?

knowt flashcard image
47

What does each dot represent in the image?

Every blue dot in the pink zone, and every red dot in the blue zone is a prediction error

48

What happens in this example when we increase the network to 50 units?

knowt flashcard image
49

What does the estimated NN for 50 units look like?

knowt flashcard image
50

What do the fitted Decision Boundaries now look like?

knowt flashcard image
51

What do the more complex fitted Decision Boundaries with the data look like?

knowt flashcard image
52

What happens when we increase the size of the network to 200 units?

knowt flashcard image
53

What does the estimated NN look like for 200 units?

knowt flashcard image
54

What do the fitted Decision boundaries now look like?

The fitted Decision Boundaries now capture the checkerboard very well

55

What do the new fitted Decision Boundaries with the data look like?

knowt flashcard image
56

How did the NN achieve this?

Trained to minimise a loss function

57

What is the resulting loss function?

knowt flashcard image
58

What does this simplify to if δ = 0?

Loss function simplifies to the Binary Cross-Entropy Loss function (similar logic to the entropy index)
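
A minimal Python sketch of the binary cross-entropy for one observation, where `y` is the true label (0 or 1) and `p` is the predicted probability that y = 1. The `penalised_loss` function assumes the δ term is a penalty of δ times the sum of squared weights added to the summed cross-entropy; the lecture's exact formula is on an image card and may differ in form or scaling.

```python
import math

def binary_cross_entropy(y, p):
    """Loss for one observation; p must lie strictly between 0 and 1."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def penalised_loss(ys, ps, weights, delta):
    """Summed cross-entropy plus an assumed penalty delta * sum(w^2)."""
    data_term = sum(binary_cross_entropy(y, p) for y, p in zip(ys, ps))
    penalty = delta * sum(w * w for w in weights)
    return data_term + penalty

# With y = 1 the loss reduces to -log(p): small when p is near 1.
confident_right = binary_cross_entropy(1, 0.9)
confident_wrong = binary_cross_entropy(1, 0.1)
```

With `delta = 0` the penalty vanishes and `penalised_loss` equals the summed binary cross-entropy.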

59

If y = 1, what is the loss function equal to?

knowt flashcard image
60

And what do these results tell us?

knowt flashcard image
61

If y = 0, what is the loss function equal to?

knowt flashcard image
62

And what do the results tell us?

knowt flashcard image
63

Therefore, what is the conclusion about having δ = 0?

knowt flashcard image
64
term image
knowt flashcard image
65

For parameter ωj, what is the optimality condition?

knowt flashcard image
66

What is the first step in solving this numerically?

knowt flashcard image
67

What do we use these parameter values for? What is that process known as?

knowt flashcard image
68

What do we calculate next and what is this process known as?

knowt flashcard image
69

How do we calculate the gradient for the Backward Pass?

Using Back Propagation

70

What is Back Propagation?

knowt flashcard image
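
Back propagation is, in essence, the chain rule applied from the loss back through the layers. The Python sketch below uses a deliberately tiny network (one input, one RELU hidden unit, one output) with a squared-error loss, and checks the chain-rule gradients against finite differences. The network, the loss, and every numeric value are assumptions for illustration only.

```python
def relu(z):
    return max(0.0, z)

def loss(params, x, y):
    w1, b1, w2, b2 = params
    y_hat = b2 + w2 * relu(b1 + w1 * x)
    return 0.5 * (y_hat - y) ** 2

def backprop(params, x, y):
    w1, b1, w2, b2 = params
    # Forward pass, keeping the intermediate values.
    z = b1 + w1 * x
    h = relu(z)
    y_hat = b2 + w2 * h
    # Backward pass: propagate dLoss/d(.) from the output towards the input.
    d_yhat = y_hat - y
    d_w2, d_b2 = d_yhat * h, d_yhat
    d_h = d_yhat * w2
    d_z = d_h * (1.0 if z > 0 else 0.0)  # derivative of RELU
    d_w1, d_b1 = d_z * x, d_z
    return [d_w1, d_b1, d_w2, d_b2]

def numerical_gradient(params, x, y, eps=1e-6):
    """Central finite differences, used only to check backprop."""
    grads = []
    for j in range(len(params)):
        up, down = list(params), list(params)
        up[j] += eps
        down[j] -= eps
        grads.append((loss(up, x, y) - loss(down, x, y)) / (2 * eps))
    return grads

params = [0.5, 0.1, -1.5, 0.2]  # hypothetical w1, b1, w2, b2
analytic = backprop(params, x=2.0, y=1.0)
numeric = numerical_gradient(params, x=2.0, y=1.0)
```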
71

What do we do to the value of ωj?

knowt flashcard image
72

What do we do to the value of ωj?

knowt flashcard image
73

How do we determine the amount by which the value of ωj changes?

knowt flashcard image
74

Now that the parameter values have been updated, what do we do next?

knowt flashcard image
75

When does the algorithm stop updating the values of the parameters?

knowt flashcard image
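
The update cycle covered in the preceding cards (evaluate the loss, compute the gradient, move each ωj against its gradient, repeat until a stopping rule is met) can be sketched in Python as follows. The learning rate, the stopping rule (loss change below a tolerance, or a maximum number of iterations) and the toy quadratic loss are all assumptions; the lecture's own rule is on an image card and may differ.

```python
def gradient_descent(loss_fn, grad_fn, omega, learning_rate=0.1,
                     tol=1e-12, max_iter=1000):
    """Repeat omega_j <- omega_j - learning_rate * dLoss/d(omega_j)."""
    omega = list(omega)
    previous_loss = loss_fn(omega)
    for iteration in range(1, max_iter + 1):
        gradient = grad_fn(omega)
        # A positive gradient lowers omega_j; a negative gradient raises it.
        omega = [w - learning_rate * g for w, g in zip(omega, gradient)]
        current_loss = loss_fn(omega)
        # Assumed stopping rule: the loss has effectively stopped changing.
        if abs(previous_loss - current_loss) < tol:
            break
        previous_loss = current_loss
    return omega, iteration

# Toy loss with a known minimum at omega = (3, -1); not a neural network.
def toy_loss(omega):
    return (omega[0] - 3.0) ** 2 + (omega[1] + 1.0) ** 2

def toy_grad(omega):
    return [2.0 * (omega[0] - 3.0), 2.0 * (omega[1] + 1.0)]

omega_hat, n_iter = gradient_descent(toy_loss, toy_grad, [0.0, 0.0])
```

In a real network `grad_fn` would be supplied by back propagation rather than written by hand.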
76

(Example) Why do we need to pre-process this data and how do we do this?

knowt flashcard image
77

How do we calculate the scaled target variable in the training and test data?

knowt flashcard image
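
One common way to scale a target variable is min-max scaling, with the minimum and maximum taken from the training data only and then reused for the test data. Whether the lecture uses this or another scaling is not recoverable from the card, so the Python sketch below is an assumption; the numbers are made up and the function names are hypothetical.

```python
def fit_min_max(values):
    """Learn the scaling range from the training data only."""
    return min(values), max(values)

def scale(values, lo, hi):
    return [(v - lo) / (hi - lo) for v in values]

def unscale(values, lo, hi):
    """Map scaled predictions back to the original units."""
    return [v * (hi - lo) + lo for v in values]

train_y = [100.0, 150.0, 300.0]  # made-up target values
test_y = [200.0, 350.0]

lo, hi = fit_min_max(train_y)
train_scaled = scale(train_y, lo, hi)  # [0.0, 0.25, 1.0]
test_scaled = scale(test_y, lo, hi)    # [0.5, 1.25]
```

Test values can fall outside [0, 1] because the range comes from the training data alone.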
78

How do we determine the fit of the NN generated across multiple units?

knowt flashcard image
79

What do the MSEs tell us?

There is evidence of overfitting the training data for networks with more than 20 units
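
A rough Python sketch of this kind of comparison: compute the mean squared error (MSE) on training and test data for each network size and pick the size with the lowest test MSE. A training MSE that keeps falling while the test MSE rises is the usual sign of overfitting. The values in `results` are placeholders invented for the example and are not the lecture's figures.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def best_by_test_mse(results):
    """results maps hidden-layer size -> (train_mse, test_mse)."""
    return min(results, key=lambda size: results[size][1])

example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])  # (0 + 0 + 4) / 3

# Placeholder numbers only: training error keeps falling, test error turns up.
results = {2: (0.30, 0.32), 4: (0.20, 0.25), 8: (0.10, 0.40)}
chosen_size = best_by_test_mse(results)
```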

80

How might the predicted house price against the actual house price look for the chosen training model with 15 units?

knowt flashcard image