3c&d NeuralNets & Reinforcement

Last updated 8:17 AM on 5/10/26

50 Terms

1
New cards

What is a limitation of perceptrons?

Many useful functions, such as XOR, are not linearly separable.

2
New cards

What is the structure of a two-layer neural network?

It consists of input units, hidden units, and output units.

3
New cards

What is the purpose of continuous activation functions?

To replace the discontinuous step function with a differentiable function such as sigmoid or tanh, so that the error can be minimized by gradient descent.

4
New cards

What is the formula for the error function in neural networks?

E = 1/2 Σ_i (z_i - t_i)², where z_i is the actual output and t_i is the target output.
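
As a minimal sketch, the error formula on this card can be computed for a single training item as follows. The function name `sum_squared_error` and the sample numbers are illustrative assumptions, not part of the card set.

```python
def sum_squared_error(outputs, targets):
    """E = 1/2 * sum over i of (z_i - t_i)^2 for one training item."""
    return 0.5 * sum((z - t) ** 2 for z, t in zip(outputs, targets))

# Two output units: the first is off by 0.5, the second is exactly right.
example_error = sum_squared_error([0.5, 1.0], [1.0, 1.0])
```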

5
New cards

What is gradient descent in the context of neural networks?

A method that minimizes the error function by repeatedly adjusting each weight in the direction of steepest descent, that is, along the negative gradient of the error.

6
New cards

What is the learning rate in gradient descent?

A parameter (η) that sets the size of each step taken toward minimizing the error: w ← w - η ∂E/∂w.
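
The sketch below shows how the learning rate η scales each gradient descent step. It assumes a one-dimensional toy error E(w) = (w - 3)², chosen only for illustration; `gradient_descent` is a hypothetical helper name.

```python
def gradient_descent(grad, w0, eta, steps):
    """Repeatedly apply w <- w - eta * dE/dw and return the final weight."""
    w = w0
    for _ in range(steps):
        w = w - eta * grad(w)
    return w

# Toy error E(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
w_final = gradient_descent(lambda w: 2.0 * (w - 3.0), w0=0.0, eta=0.1, steps=100)
```

With this toy error, a smaller η takes more steps to reach the minimum, while an η above 1.0 overshoots further on every step and diverges.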

7
New cards

What is backpropagation?

A method for calculating the gradient of the loss function with respect to each weight by propagating errors backward through the network.

8
New cards

What is the XOR problem in neural networks?

The XOR function cannot be learned by a single-layer perceptron but can be learned by a two-layer neural network.
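
One way to see that a hidden layer is enough is to write a two-layer network for XOR by hand. The weights below are chosen manually for illustration rather than learned, and the step activation and the names `step_fn` and `xor_net` are assumptions of this sketch.

```python
def step_fn(s):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if s > 0 else 0

def xor_net(x1, x2):
    h_or = step_fn(x1 + x2 - 0.5)        # hidden unit 1 computes OR
    h_and = step_fn(x1 + x2 - 1.5)       # hidden unit 2 computes AND
    return step_fn(h_or - h_and - 0.5)   # output fires for OR but not AND

xor_outputs = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

Each hidden unit on its own is a linearly separable function; XOR only appears when the output unit combines them.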

9
New cards

What is the significance of the chain rule in neural networks?

It allows for the efficient computation of partial derivatives needed for backpropagation.
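
To make the chain rule concrete, here is a minimal sketch of backpropagation on the smallest possible network: one input, one sigmoid hidden unit, and one sigmoid output, with the squared error E = 1/2 (z - t)². The network shape, the names `w` and `v` for the two weights, and the finite-difference check are assumptions added for illustration.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(w, v, x):
    h = sigmoid(w * x)   # hidden activation
    z = sigmoid(v * h)   # network output
    return h, z

def error(w, v, x, t):
    _, z = forward(w, v, x)
    return 0.5 * (z - t) ** 2

def backprop(w, v, x, t):
    """Gradients of E with respect to w and v, obtained via the chain rule."""
    h, z = forward(w, v, x)
    delta_out = (z - t) * z * (1.0 - z)         # dE/d(v*h)
    grad_v = delta_out * h
    delta_hid = delta_out * v * h * (1.0 - h)   # error propagated back to w*x
    grad_w = delta_hid * x
    return grad_w, grad_v

def numeric_grad(w, v, x, t, eps=1e-6):
    """Central finite differences, used only to check the analytic result."""
    gw = (error(w + eps, v, x, t) - error(w - eps, v, x, t)) / (2 * eps)
    gv = (error(w, v + eps, x, t) - error(w, v - eps, x, t)) / (2 * eps)
    return gw, gv

analytic = backprop(0.4, -0.7, 1.5, 1.0)
numeric = numeric_grad(0.4, -0.7, 1.5, 1.0)
```

The backward pass reuses `delta_out` when computing `delta_hid`, which is what makes the computation efficient in larger networks.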

10
New cards

-

-

11
New cards

What is the purpose of data augmentation in neural networks?

To create additional training items by transforming existing data to cover various scenarios.

12
New cards

What is the role of momentum in training neural networks?

To dampen oscillations and accelerate convergence by adding a momentum factor to weight updates.
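
A common formulation of momentum keeps a running weight change δw and updates it as δw ← α δw - η ∂E/∂w before applying w ← w + δw. The sketch below assumes that formulation and a toy error E(w) = (w - 3)²; the values of `eta` and `alpha` are illustrative.

```python
def momentum_descent(grad, w0, eta, alpha, steps):
    """Gradient descent with a momentum factor alpha on the previous update."""
    w, delta = w0, 0.0
    for _ in range(steps):
        delta = alpha * delta - eta * grad(w)   # blend old update with new gradient
        w = w + delta
    return w

# Toy error E(w) = (w - 3)^2 with gradient 2 * (w - 3).
w_momentum = momentum_descent(lambda w: 2.0 * (w - 3.0),
                              w0=0.0, eta=0.1, alpha=0.9, steps=500)
```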

13
New cards

What is the Adam optimization algorithm?

An optimization method that maintains running averages of the gradient and of the squared gradient, and uses them to adapt the step size for each weight.
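
A single Adam update for one weight can be sketched as below, following the commonly published update rule with bias correction. The step size `eta=0.1` is chosen for illustration (library defaults are usually much smaller), and `adam_step` is a hypothetical helper name.

```python
import math

def adam_step(w, grad, m, v, t, eta=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single weight; t is the 1-based step count."""
    m = beta1 * m + (1.0 - beta1) * grad          # running average of the gradient
    v = beta2 * v + (1.0 - beta2) * grad * grad   # running average of its square
    m_hat = m / (1.0 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1.0 - beta2 ** t)
    w = w - eta * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# First step from w = 0 with gradient -6: the move is about eta in size,
# regardless of how large the gradient is.
w1, m1, v1 = adam_step(w=0.0, grad=-6.0, m=0.0, v=0.0, t=1)
```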

14
New cards

What is the purpose of weight initialization in neural networks?

To set the initial weights to small random values, which breaks the symmetry between hidden units and keeps the activation functions out of their saturated regions.

15
New cards

What is the function of hidden units in a neural network?

To learn complex representations of the input data by transforming it through multiple layers.

16
New cards

What is the significance of the sigmoid function in neural networks?

It is a continuous activation function that maps input values to a range between 0 and 1.

17
New cards

What is the purpose of re-scaling inputs in neural network training?

To ensure that all input values are within a similar range, improving the efficiency of training.
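
One simple way to re-scale inputs is min-max scaling of each input feature to the range [0, 1]. This is only one of several options (standardizing to zero mean and unit variance is another), and `rescale_columns` is a hypothetical helper name.

```python
def rescale_columns(rows):
    """Min-max scale each column of a list of rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0   # constant column maps to 0
         for x, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# The second feature is about 1000 times larger than the first before scaling.
scaled = rescale_columns([[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0]])
```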

18
New cards

What are the three methods to prevent overfitting in neural networks?

Limit the number of hidden nodes/connections, limit training time using a validation set, and apply weight decay.

19
New cards

What is the output of a neural network?

The predicted value based on the input data processed through the network.

20
New cards

What is the purpose of the loss function in neural networks?

To quantify the difference between the actual output and the target output, guiding the training process.

21
New cards

What is the role of the activation function in a neural network?

To introduce non-linearity into the model, allowing it to learn complex patterns.

22
New cards

-

-

23
New cards

What is the function of the output layer in a neural network?

To produce the final predictions or classifications based on the processed inputs.

24
New cards

What is the effect of a flat error landscape in weight space?

Training progresses slowly, because the gradient is close to zero and weight adjustments barely change the error.

25
New cards

What is the purpose of using multiple hidden units in a neural network?

To increase the network's capacity to learn and represent complex functions.

26
New cards

What is the main difference between Reinforcement Learning and Supervised Learning?

Reinforcement Learning learns from interactions with the environment to maximize cumulative rewards, while Supervised Learning uses labeled training data to predict outcomes.

27
New cards

What is the goal of an agent in Reinforcement Learning?

To find an optimal policy π* that maximizes cumulative reward.

28
New cards

What is the Exploration vs. Exploitation dilemma?

The trade-off between choosing the best-known action (exploitation) and trying new actions to discover their rewards (exploration).
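
A common way to balance the two is an ε-greedy rule: explore with a small probability ε, otherwise exploit. The sketch below assumes a list of estimated action values; `epsilon_greedy` and the sample values are illustrative.

```python
import random

def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))                       # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])   # exploit

rng = random.Random(0)
greedy_choice = epsilon_greedy([0.2, 0.8, 0.5], epsilon=0.0, rng=rng)
random_choice = epsilon_greedy([0.2, 0.8, 0.5], epsilon=1.0, rng=rng)
```

With `epsilon=0.0` the rule always exploits; with `epsilon=1.0` it always explores.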

29
New cards

What does Temporal Difference Learning involve?

It updates the value of a state based on the immediate reward plus the discounted value of the next state.
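
The update described here is often written V(s) ← V(s) + η [r + γ V(s') - V(s)]. The sketch below assumes that TD(0) form; the state names and numbers are made up for illustration.

```python
def td0_update(values, state, reward, next_state, eta, gamma):
    """Move V(state) toward reward + gamma * V(next_state)."""
    td_target = reward + gamma * values[next_state]
    values[state] += eta * (td_target - values[state])
    return values[state]

values = {"a": 0.0, "b": 1.0}
# Target is 1.0 + 0.9 * 1.0 = 1.9, and V(a) moves halfway there from 0.
new_value = td0_update(values, "a", reward=1.0, next_state="b", eta=0.5, gamma=0.9)
```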

30
New cards

What is Q-Learning?

A model-free reinforcement learning algorithm that learns the Q-function: the value of taking each action in each state.
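
As a minimal sketch, tabular Q-learning can be run on a tiny deterministic environment. The four-state chain, the reward of 1 for reaching the last state, the purely random exploration, and the values of `eta` and `gamma` are all assumptions made for this example.

```python
import random

N_STATES, GOAL = 4, 3    # states 0..3; state 3 is terminal
LEFT, RIGHT = 0, 1

def step(state, action):
    """Deterministic chain: move left or right; reward 1 on reaching GOAL."""
    nxt = max(0, state - 1) if action == LEFT else state + 1
    return nxt, (1.0 if nxt == GOAL else 0.0)

def q_learning(episodes=500, eta=0.5, gamma=0.9, seed=0, max_steps=10_000):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.choice((LEFT, RIGHT))       # purely random exploration
            s2, r = step(s, a)
            target = r + gamma * max(q[s2])     # q[GOAL] stays 0 (terminal)
            q[s][a] += eta * (target - q[s][a])
            s = s2
            if s == GOAL:
                break
    return q

q_table = q_learning()
greedy_policy = [max((LEFT, RIGHT), key=lambda a: q_table[s][a])
                 for s in range(GOAL)]
```

The agent never uses a model of `step` when updating; it only sees sampled transitions, which is what "model-free" refers to.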

31
New cards

What is a Value Function in Reinforcement Learning?

A function that estimates the expected cumulative reward for an agent starting from a given state and following a certain policy.

32
New cards

What is the K-Armed Bandit Problem?

A scenario in reinforcement learning where an agent must choose between multiple actions (slot machines) to maximize rewards in a single state.

33
New cards

What is a stochastic policy?

A policy that incorporates randomness in action selection, often used in environments where deterministic strategies perform poorly.

34
New cards

What is the purpose of a policy π in Reinforcement Learning?

To define the action that an agent will take in each state.

35
New cards

What is the difference between finite horizon and infinite discounted reward models?

The finite-horizon model sums rewards over a fixed number of steps, while the infinite discounted model sums rewards over all future steps, weighting the reward k steps ahead by γ^k with 0 ≤ γ < 1.

36
New cards

What is the significance of the learning rate η in Q-Learning?

It determines how much the estimated Q-value is updated towards the new estimate.

37
New cards

What is the role of the environment in Reinforcement Learning?

The environment provides feedback to the agent in the form of states and rewards based on the actions taken.

38
New cards

What is the main challenge of delayed reinforcement?

Rewards from actions may not be received until several time steps later, complicating the learning process.

39
New cards

What does the term 'Bootstrapping' refer to in Value Function Learning?

The process of iteratively improving the value function estimate based on its own previous estimates.

40
New cards

What is the relationship between Q-values and Value Function?

The Q-value Q(s, a) is the expected cumulative reward from taking action a in state s and following the policy afterwards; the optimal value of a state is the best Q-value available there, V*(s) = max_a Q*(s, a).

41
New cards

What is the significance of the discount factor γ in reinforcement learning?

It determines the present value of future rewards, influencing the agent's preference for immediate versus delayed rewards.
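
The effect of γ can be seen by computing the discounted sum Σ_k γ^k r_k for the same reward sequence under different discount factors. The function name and the reward sequence below are illustrative.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma^k * r_k, where k counts steps into the future."""
    return sum((gamma ** k) * r for k, r in enumerate(rewards))

rewards = [1.0, 1.0, 1.0]
near_sighted = discounted_return(rewards, gamma=0.5)   # 1 + 0.5 + 0.25
undiscounted = discounted_return(rewards, gamma=1.0)   # plain sum of rewards
```

A smaller γ makes later rewards count for less, so the agent prefers immediate reward.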

42
New cards

What are the types of environments in Reinforcement Learning?

Environments can be passive, active, deterministic, stochastic, known, or unknown.

43
New cards

What is the purpose of using a random action occasionally in reinforcement learning?

To ensure convergence to the optimal strategy by exploring less preferred actions.

44
New cards

What is the theoretical result regarding Q-learning convergence?

Q-learning will eventually converge to the optimal policy for any deterministic Markov decision process, assuming an appropriately randomized strategy that keeps trying every action in every state.

45
New cards

What are the main limitations of theoretical results in reinforcement learning?

Delayed reinforcement slows learning, and the results require a finite search space, so convergence is slow when that space is large.

46
New cards

What does the term 'average reward' refer to in models of optimality?

The long-term average reward obtained by an agent as the time horizon approaches infinity.

47
New cards

What is the role of the reward function R in reinforcement learning?

It defines the immediate reward received after taking an action in a given state.

48
New cards

What is the difference between deterministic and stochastic environments?

Deterministic environments have predictable outcomes for actions, while stochastic environments have random outcomes.

49
New cards

What is Behavioral Cloning in the context of supervised learning?

Learning actions based on a training set of situation-action pairs.

50
New cards

What does the term 'policy improvement' refer to in reinforcement learning?

The process of refining a policy to increase the expected cumulative reward.