20 vocabulary flashcards focused on reinforcement learning, MDPs, Q-learning, and related concepts from the lecture notes.
Machine Learning
A subset of artificial intelligence that allows machines to learn and improve from experience without being explicitly programmed.
Supervised Learning
A category of machine learning where models are trained using labeled data.
Unsupervised Learning
A category of machine learning where models infer patterns from unlabeled data.
Reinforcement Learning
A type of machine learning where an agent learns to behave in an environment by taking actions and observing results.
Agent
The learner and decision maker in RL; it takes actions and learns by trial and error.
Environment
The world through which the agent moves and interacts.
Action
Any permissible move the agent can take in a given state.
State
The current condition or situation returned by the environment.
Reward
The instantaneous feedback from the environment evaluating the last action.
Policy
The strategy the agent uses to decide the next action based on the state.
Value
The expected long-term return from a state, with future rewards discounted, as opposed to the immediate reward.
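As a rough illustration of "return with discount applied", the sketch below sums a reward sequence with each reward weighted by the discount factor raised to its time step. The function name `discounted_return` and the sample rewards are hypothetical, chosen only for this example.

```python
def discounted_return(rewards, gamma):
    """Sum rewards, weighting the reward at step t by gamma ** t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Hypothetical episode: no reward for two steps, then 100 at the goal.
episode_rewards = [0, 0, 100]
print(discounted_return(episode_rewards, gamma=0.5))  # 0.5**2 * 100 = 25.0
```

With a smaller gamma the late reward of 100 counts for less, which is how value differs from the immediate reward.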
Action-value (Q)
A value function that also accounts for the current action: Q(s,a) is the expected discounted return from taking action a in state s.
Markov Decision Process (MDP)
The mathematical framework for modeling decision making in RL, defined by states, actions, transitions between states, and rewards.
Graph
A network of nodes connected by edges used to model relationships, such as rooms and doors.
Node
A vertex in a graph that represents a state, e.g., a room.
Edge
A connection between two nodes, e.g., a door linking rooms.
Door
A two-way link between rooms that enables movement.
Instant Reward
The reward value attached to a single transition (arrow) between states.
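The graph cards above (node, edge, door, instant reward) can be sketched as a small lookup table. The three-room layout, the goal room, and the reward values 0 and 100 are assumptions made up for this example, not taken from the lecture notes.

```python
# Hypothetical layout: rooms 0, 1, 2 are nodes; room 2 is the goal.
doors = [(0, 1), (1, 2)]  # each door is a two-way edge
GOAL = 2

# Instant reward per transition (arrow): 100 for stepping into the goal, else 0.
rewards = {}
for a, b in doors:
    for src, dst in ((a, b), (b, a)):
        rewards[(src, dst)] = 100 if dst == GOAL else 0


def actions(state):
    """Rooms reachable from `state` through a single door."""
    return sorted(dst for (src, dst) in rewards if src == state)


print(actions(1))       # rooms next to room 1
print(rewards[(1, 2)])  # reward for moving from room 1 into the goal
```

Each door yields two arrows, one per direction, so the same door can carry different instant rewards depending on which way the agent walks through it.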
Q-Learning
A reinforcement learning algorithm that learns Q-values for state-action pairs from experience.
Gamma (γ)
The discount factor in Q-learning, between 0 and 1, that weighs future rewards against immediate rewards: values near 0 favor immediate reward, values near 1 favor long-term reward.
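To make the Q-learning and gamma cards concrete, here is a minimal tabular sketch. It uses the standard update Q(s,a) ← Q(s,a) + α [r + γ max Q(s',a') − Q(s,a)]. The learning rate `alpha`, the three-room chain, the reward values, and purely random exploration are all assumptions for illustration; none of them comes from the lecture notes.

```python
import random
from collections import defaultdict

# Hypothetical chain of rooms 0 <-> 1 <-> 2; an action is "move to that room".
REWARDS = {(0, 1): 0, (1, 0): 0, (1, 2): 100, (2, 1): 0}
GOAL = 2


def actions(state):
    """Rooms reachable from `state` in one move."""
    return sorted(dst for (src, dst) in REWARDS if src == state)


def q_update(Q, s, a, r, s_next, alpha, gamma):
    """One Q-learning step: move Q(s,a) toward r + gamma * max Q(s_next, .)."""
    best_next = max((Q[(s_next, a2)] for a2 in actions(s_next)), default=0.0)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def train(episodes=200, alpha=1.0, gamma=0.5, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)  # unseen state-action pairs start at 0
    for _ in range(episodes):
        s = rng.choice([0, 1])  # start in a non-goal room
        while s != GOAL:
            a = rng.choice(actions(s))  # purely random exploration
            q_update(Q, s, a, REWARDS[(s, a)], a, alpha, gamma)
            s = a  # moving to room `a` makes it the new state
    return Q


Q = train()
for pair in sorted(Q):
    print(pair, Q[pair])
```

In this toy setup the learned Q-values shrink by a factor of gamma for each extra step away from the goal, so a greedy agent that always picks the largest Q-value walks toward room 2.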