Introduction to Reinforcement Learning

Description and Tags

These flashcards cover key concepts and terminology related to Reinforcement Learning, as outlined in the lecture notes.

21 Terms

Reinforcement Learning

A type of machine learning where an agent learns to take actions in an environment to maximize cumulative rewards.

Agent

The learner or decision maker in a reinforcement learning problem that interacts with the environment.

Environment

The external system with which the agent interacts and within which it operates.

Reward Signal

A numeric signal received by the agent from the environment, used to evaluate the success of its actions.

Markov Decision Process (MDP)

A mathematical framework for modeling sequential decision-making, defined by states, actions, transition probabilities, and rewards, in which the next state and reward depend only on the current state and action.
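
As an illustration, a small MDP can be written down as plain data. The two-state layout, the state and action names, and the `TRANSITIONS` table below are invented for this sketch and do not come from the lecture notes.

```python
# Hypothetical two-state MDP, for illustration only.
# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
TRANSITIONS = {
    "cool": {
        "slow": [(1.0, "cool", 1.0)],
        "fast": [(0.5, "cool", 2.0), (0.5, "warm", 2.0)],
    },
    "warm": {
        "slow": [(0.5, "cool", 1.0), (0.5, "warm", 1.0)],
        "fast": [(1.0, "warm", -10.0)],
    },
}


def expected_reward(state, action):
    """Expected immediate reward for taking `action` in `state`."""
    return sum(p * r for p, _, r in TRANSITIONS[state][action])


def is_valid_mdp(transitions):
    """Check that outgoing probabilities sum to 1 for every (state, action)."""
    return all(
        abs(sum(p for p, _, _ in outcomes) - 1.0) < 1e-9
        for actions in transitions.values()
        for outcomes in actions.values()
    )
```

Storing transitions as explicit (probability, next state, reward) triples keeps all four ingredients of the definition visible in one table.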

Optimal Policy

A policy (a mapping from states to actions) that maximizes the expected cumulative reward over time.

Exploration vs. Exploitation

The trade-off in reinforcement learning between exploring new actions to find better rewards and exploiting known actions that yield good rewards.
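
One common way to balance the two is an epsilon-greedy rule. The card does not name a specific method, so treat this as one possible sketch; the function name `epsilon_greedy` and the use of a plain list of action-value estimates are assumptions made for the example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list of estimated action values.

    With probability `epsilon` explore (uniform random action);
    otherwise exploit (action with the highest current estimate).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Setting `epsilon` to 0 gives pure exploitation, and setting it to 1 gives pure exploration.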

Q-Learning

A model-free, value-based reinforcement learning algorithm that learns the value of taking each action in each state.
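
A minimal sketch of the standard tabular Q-learning update follows. It assumes a dictionary keyed by (state, action) pairs; the function name, argument names, and default values for `alpha` (learning rate) and `gamma` (discount factor) are choices made for this example, not taken from the lecture notes.

```python
from collections import defaultdict


def q_learning_update(Q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Value of the best action in the next state (0 if the episode ended).
    best_next = 0.0 if done else max(Q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    # Move the old estimate a fraction alpha toward the target.
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]


Q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
```

The update uses only an observed transition (state, action, reward, next state), which is why no model of the environment is needed.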

Learning Rate (α)

A parameter that determines how much of the newly acquired information overrides old information in the learning process.
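
The effect of α can be seen in a one-line incremental update. The helper name `blend` is hypothetical and chosen only for this sketch.

```python
def blend(old_estimate, new_sample, alpha):
    """Move an estimate toward a new sample by a fraction alpha."""
    return old_estimate + alpha * (new_sample - old_estimate)
```

With `alpha = 0` the old estimate is kept unchanged, with `alpha = 1` it is replaced outright by the new sample, and values in between mix the two.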

Discount Factor (γ)

A parameter between 0 and 1 that weights future rewards: values near 0 favor immediate rewards, while values near 1 give long-term rewards nearly equal weight.
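
To make the weighting concrete, the sketch below computes a discounted return for a short list of rewards. The function name `discounted_return` and the sample reward lists in the usage note are assumptions for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    total = 0.0
    # Work backwards so each step computes G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total
```

For three rewards of 1, `gamma = 0` counts only the first reward, `gamma = 1` counts all three equally, and `gamma = 0.5` gives 1 + 0.5 + 0.25 = 1.75.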

State (s)

The current situation in which the agent finds itself within the environment.

Action (a)

A decision made by the agent that affects the state of the environment.

Exploit

To choose the actions currently believed to be best, based on what has already been learned, in order to collect reward now.

Explore

To try new actions or strategies to gather more information that may lead to better long-term rewards.

Model-based RL

Reinforcement learning that uses a model of the environment to make decisions.

Model-free RL

Reinforcement learning where the agent learns directly from its experiences without a model of the environment.

Value Function

A function that estimates the expected cumulative reward obtained by starting in a given state and then following a given policy.

Q-value function

A function that estimates the expected cumulative reward for taking a specific action in a specific state and following a policy thereafter.
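
In symbols, the value function and the Q-value function can be written as follows. The notation is an assumption of this sketch rather than something given in the lecture notes: π is the policy, γ the discount factor, and r_{t+1} the reward received after step t.

```latex
V^{\pi}(s) = \mathbb{E}_{\pi}\Big[ \sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\Big|\, s_0 = s \Big]

Q^{\pi}(s, a) = \mathbb{E}_{\pi}\Big[ \sum_{t=0}^{\infty} \gamma^{t} r_{t+1} \,\Big|\, s_0 = s,\ a_0 = a \Big]

V^{\pi}(s) = \sum_{a} \pi(a \mid s)\, Q^{\pi}(s, a)
```

The last line shows how the two are related: the value of a state is the policy-weighted average of the Q-values of the actions available in it.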

Blame Attribution Problem

The challenge of determining which specific action, among the many taken earlier, was responsible for a received reward or punishment; more commonly called the credit assignment problem.

Trajectory

A sequence of states, actions, and rewards produced by following a policy over time.

Episode

A sequence of agent-environment interactions that starts in an initial state and ends at a terminal state, for example when the goal is reached or a failure occurs.
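
The sketch below runs one episode in a made-up one-dimensional corridor and records the resulting trajectory. The corridor layout, the reward values, and the names `run_episode` and `always_right` are all hypothetical and serve only to illustrate the terms.

```python
def run_episode(policy, start=2, goal=4, pit=0, max_steps=100):
    """Run one episode in a hypothetical 1-D corridor with cells 0..4.

    The episode ends at `goal` (reward +1), at `pit` (reward -1),
    or after `max_steps` steps. Returns the trajectory as a list of
    (state, action, reward) tuples.
    """
    state, trajectory = start, []
    for _ in range(max_steps):
        action = policy(state)  # -1 = step left, +1 = step right
        next_state = state + action
        if next_state == goal:
            reward = 1.0
        elif next_state == pit:
            reward = -1.0
        else:
            reward = 0.0
        trajectory.append((state, action, reward))
        state = next_state
        if state in (goal, pit):  # terminal state reached
            break
    return trajectory


def always_right(state):
    """A fixed policy that steps right in every state."""
    return +1


print(run_episode(always_right))  # [(2, 1, 0.0), (3, 1, 1.0)]
```

Here the agent is the policy function, the environment is the corridor logic inside the loop, and the returned list is the trajectory of a single episode.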
