This set of flashcards covers key vocabulary and concepts in Artificial Intelligence, focusing on Markov Decision Processes and sequential decision-making frameworks.
Markov Decision Process (MDP)
A sequential decision problem in a fully observable, stochastic environment with a Markovian transition model and additive rewards.
Utility Function (U(s))
A function that assigns a value reflecting how desirable a state is, guiding preference in decision making.
Maximum Expected Utility (MEU)
The principle that an agent should choose the action with the highest expected utility from available actions.
Transition Model (P(s’ | s, a))
The probability of ending up in a new state s’ given the agent was in state s and took action a.
Policy (π)
A mapping from states to actions that specifies what the agent should do in every state it might reach; following a policy fully determines the agent's behavior.
Rewards (R(s, a, s’))
The feedback received by an agent after transitioning from state s to s’ via action a, which can be positive or negative.
Expected Utility
The average utility of potential outcomes of an action, weighted by the probabilities of those outcomes occurring.
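As a formula (using the transition model and utility function defined on the other cards), the expected utility of taking action a in state s can be written:

```latex
EU(a \mid s) = \sum_{s'} P(s' \mid s, a)\, U(s')
```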
Discount Factor (𝛾)
A value between 0 and 1 used to prioritize immediate rewards over future rewards in the context of decision making.
Value Iteration
An algorithm used to compute the optimal policy by iteratively updating the utility of states based on expected utilities.
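As an illustration, here is a minimal value-iteration sketch. The two-state MDP (its states, actions, transition probabilities, and rewards) is invented purely for this example:

```python
# Minimal value iteration on a hypothetical two-state MDP.
# All states, actions, transitions, and rewards below are invented for illustration.
GAMMA = 0.9   # discount factor
THETA = 1e-6  # convergence threshold

# P[(s, a)] = list of (probability, next_state, reward) triples
P = {
    ("s0", "stay"): [(1.0, "s0", 0.0)],
    ("s0", "go"):   [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    ("s1", "stay"): [(1.0, "s1", 2.0)],
    ("s1", "go"):   [(1.0, "s0", 0.0)],
}
states = ["s0", "s1"]
actions = ["stay", "go"]

U = {s: 0.0 for s in states}
while True:
    delta = 0.0
    for s in states:
        # Bellman update: best action's expected reward plus discounted utility
        best = max(
            sum(p * (r + GAMMA * U[s2]) for p, s2, r in P[(s, a)])
            for a in actions
        )
        delta = max(delta, abs(best - U[s]))
        U[s] = best
    if delta < THETA:  # utilities have converged
        break

# Extract the optimal policy: pick the action with the highest expected utility
policy = {
    s: max(actions,
           key=lambda a: sum(p * (r + GAMMA * U[s2]) for p, s2, r in P[(s, a)]))
    for s in states
}
print(U, policy)
```

Here the agent learns to move to s1 and stay there, since staying in s1 yields a reward of 2 per step.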
Policy Iteration
An algorithm that finds an optimal policy through repeated evaluation and improvement of policy utilities.
Bellman Equation
Expresses the relationship between the utility of a state and the expected rewards plus the utility of neighboring states.
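Written out in the notation of the cards above, one standard form of the equation is:

```latex
U(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\big[ R(s, a, s') + \gamma\, U(s') \big]
```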
Q-function
Also known as action-utility function; it estimates the expected utility of taking a specific action in a given state.
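In symbols (a standard form, consistent with the Bellman equation on the previous card), and showing its relationship to the utility function:

```latex
Q(s, a) = \sum_{s'} P(s' \mid s, a)\,\big[ R(s, a, s') + \gamma \max_{a'} Q(s', a') \big],
\qquad U(s) = \max_{a} Q(s, a)
```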
Environment History
A sequence of states and actions that an agent follows during its interaction with the environment.
Convergence
The point at which state utilities stop changing (within a tolerance) between successive iterations of value or policy iteration.
Non-terminal State
A state in which the decision-making process continues rather than ends; in many example MDPs (such as grid worlds), non-terminal states carry a small negative reward to encourage reaching a goal quickly.