Flashcards for Reinforcement Learning Lecture
Reinforcement Learning
Goal-directed learning from interaction: a system adapts to new situations and learns from past experience through action and feedback in order to reach a desired state.
RL Problem
Formalizing reinforcement learning, analyzing its bounds, and mapping concrete applications to the abstract reinforcement learning problem.
RL Solutions
Algorithms to solve defined reinforcement learning problems and methods to approximate these solutions.
RL Field of Study
Everything surrounding reinforcement learning, including problem definition, solutions, preprocessing, and related aspects.
Trial-and-error search
A key aspect of RL: good actions must be discovered through a loop of action and feedback, because gradient descent cannot be applied directly and the space of possible solutions is too large to enumerate.
Delayed Reward
A key aspect of RL: the value of an action may only become apparent later, which requires planning and non-greedy behavior.
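One standard way to make "delayed reward" precise, not stated on the card itself but following Sutton & Barto's notation (G_t for the return, R_t for the reward, gamma for the discount factor), is the discounted return:

```latex
% Discounted return: the reward arriving k+1 steps after time t is
% weighted by gamma^k, so an action is credited for rewards that
% arrive long after it was taken.
G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots
    = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}, \qquad 0 \le \gamma < 1
```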
Supervised Learning vs RL
Supervised learning relies on correct actions and labels given a priori, a fixed set of examples, separate training and testing phases, and passive involvement. RL, in contrast, evaluates actions during deployment, gathers new examples as it goes, interacts directly with the environment, and focuses on causal action.
Unsupervised Learning vs RL
Unsupervised learning aims to discover hidden structure in data, again with separate phases and passive involvement. RL instead aims to maximize reward, interacts directly with the environment, and can use unsupervised learning as a subtask.
Exploitation-Exploration Tradeoff
The balance between exploiting actions already known to yield high reward and exploring new actions that might yield even more. Unlike supervised and unsupervised learning, which can serve as subproblems within RL, RL tackles the whole real-world problem and must manage this tradeoff (see the sketch below).
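A minimal sketch of how this tradeoff is often handled in practice, using epsilon-greedy action selection; the function name, the NumPy usage, and the toy values are illustrative assumptions, not from the lecture:

```python
import numpy as np

def epsilon_greedy(q_values: np.ndarray, epsilon: float,
                   rng: np.random.Generator) -> int:
    """Explore with probability epsilon, otherwise exploit.

    q_values holds the current estimate of each action's value.
    """
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))   # explore: random action
    return int(np.argmax(q_values))               # exploit: best-known action

rng = np.random.default_rng(0)
q = np.array([0.1, 0.5, 0.2])
action = epsilon_greedy(q, epsilon=0.1, rng=rng)  # usually picks action 1
```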
Agent (in RL)
The component in RL that senses its environment, observes its state, and takes actions.
Environment (in RL)
The component in RL that provides feedback based on an agent’s actions, changes over time, and is affected by both external events and the agent’s actions.
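One hypothetical way to express the agent-environment loop as code; the `RandomAgent`/`CoinEnv` classes and their method names are assumptions for illustration, not an API from the lecture:

```python
import random

class RandomAgent:
    """Toy agent: ignores the state and acts uniformly at random."""
    def __init__(self, n_actions):
        self.n_actions = n_actions

    def act(self, state):
        return random.randrange(self.n_actions)

    def observe(self, state, reward):
        pass  # a learning agent would update its policy/values here

class CoinEnv:
    """Toy environment: action 1 pays reward 1, action 0 pays 0; ends after 10 steps."""
    def reset(self):
        self.t = 0
        return 0  # initial state

    def step(self, action):
        self.t += 1
        return 0, float(action == 1), self.t >= 10  # (next_state, reward, done)

def run_episode(agent, env):
    state = env.reset()
    total, done = 0.0, False
    while not done:
        action = agent.act(state)               # agent senses state, takes action
        state, reward, done = env.step(action)  # environment gives feedback
        agent.observe(state, reward)            # feedback closes the loop
        total += reward
    return total

print(run_episode(RandomAgent(2), CoinEnv()))   # about 5.0 on average
```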
Policy
Maps from environment state to action, perhaps stochastically.
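A minimal sketch of one way a stochastic policy can be realized, as a softmax over hypothetical per-state action preferences; all names and values here are illustrative assumptions:

```python
import numpy as np

def stochastic_policy(state: int, prefs: np.ndarray,
                      rng: np.random.Generator) -> int:
    """Sample an action from a softmax over per-state preferences."""
    logits = prefs[state]
    probs = np.exp(logits - logits.max())  # subtract max for stability
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
prefs = np.array([[2.0, 0.0],   # hypothetical preferences in state 0
                  [0.0, 2.0]])  # ... and in state 1
print(stochastic_policy(0, prefs, rng))  # mostly 0, occasionally 1
```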
Reward
Encodes the long-term goal via short-term sensations (rewards); relatively easy to define and to observe or estimate.
Value
Represents the long-term value (expected cumulative reward) of an environment state or action; hard to define and hard to estimate.
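One common way such value estimates are learned is a tabular TD(0) update; the card does not name this method, so treat the following as a standard illustration rather than the lecture's algorithm:

```python
def td0_update(V, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Tabular TD(0): nudge V[state] toward reward + gamma * V[next_state].

    V is a dict mapping states to estimated long-term value (illustrative).
    alpha is the step size, gamma the discount factor.
    """
    target = reward + gamma * V.get(next_state, 0.0)
    V[state] = V.get(state, 0.0) + alpha * (target - V.get(state, 0.0))
    return V

V = {}
V = td0_update(V, state="s0", reward=1.0, next_state="s1")
print(V)  # {'s0': 0.1}
```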
Model of the Environment
Enables the agent to hypothesize about future states of the environment (e.g., planning) and can be a physics simulation environment or an ML prediction model.
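A minimal sketch of using such a model for planning via one-step lookahead; the `model(state, action)` signature and the toy two-state world are assumptions for illustration:

```python
def plan_one_step(model, state, actions, V, gamma=0.9):
    """Pick the action whose model-predicted outcome scores best.

    model(state, action) -> (predicted_next_state, predicted_reward)
    V: dict of estimated state values (empty here, so reward decides).
    """
    def score(action):
        next_state, reward = model(state, action)  # hypothesize, don't act
        return reward + gamma * V.get(next_state, 0.0)
    return max(actions, key=score)

# Toy deterministic model: moving to "goal" pays 1, anything else pays 0.
model = lambda s, a: (a, 1.0 if a == "goal" else 0.0)
print(plan_one_step(model, "start", ["goal", "other"], V={}))  # -> 'goal'
```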
Interaction in RL
The key difference from other learning paradigms: the agent's actions both change the environment (causal effects) and determine which data is collected during learning.