This set of flashcards covers key vocabulary and concepts in Artificial Intelligence, focusing on Markov Decision Processes and sequential decision-making frameworks.
Markov Decision Process (MDP)
A sequential decision problem in a fully observable, stochastic environment with a Markovian transition model and additive rewards.
Utility Function (U(s))
A function that assigns a value reflecting how desirable a state is, guiding preference in decision making.
Maximum Expected Utility (MEU)
The principle that an agent should choose the action with the highest expected utility from available actions.
Transition Model (P(s’ | s, a))
The probability of ending up in a new state s’ given the agent was in state s and took action a.
Policy (π)
A mapping from states to actions that specifies what the agent should do in every state it might reach; following a policy fully determines the agent's behavior.
Rewards (R(s, a, s’))
The feedback received by an agent after transitioning from state s to s’ via action a, which can be positive or negative.
Expected Utility
The average utility of potential outcomes of an action, weighted by the probabilities of those outcomes occurring.
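As a formula (using the transition model and utility function defined on the other cards), the expected utility of taking action a in state s can be written:

```latex
EU(a \mid s) = \sum_{s'} P(s' \mid s, a)\, U(s')
```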
Discount Factor (𝛾)
A value between 0 and 1 used to prioritize immediate rewards over future rewards in the context of decision making.
Value Iteration
An algorithm used to compute the optimal policy by iteratively updating the utility of states based on expected utilities.
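As an illustration, here is a minimal value-iteration sketch. The two-state MDP (its states, actions, transition probabilities, and rewards) is invented purely for this example:

```python
# Minimal value iteration on a hypothetical two-state MDP.
# All states, actions, transitions, and rewards below are invented for illustration.
GAMMA = 0.9   # discount factor
THETA = 1e-6  # convergence threshold

# P[(s, a)] = list of (probability, next_state, reward) triples
P = {
    ("s0", "stay"): [(1.0, "s0", 0.0)],
    ("s0", "go"):   [(0.8, "s1", 1.0), (0.2, "s0", 0.0)],
    ("s1", "stay"): [(1.0, "s1", 2.0)],
    ("s1", "go"):   [(1.0, "s0", 0.0)],
}
states = ["s0", "s1"]
actions = ["stay", "go"]

U = {s: 0.0 for s in states}
while True:
    delta = 0.0
    for s in states:
        # Bellman update: best action's expected reward plus discounted utility
        best = max(
            sum(p * (r + GAMMA * U[s2]) for p, s2, r in P[(s, a)])
            for a in actions
        )
        delta = max(delta, abs(best - U[s]))
        U[s] = best
    if delta < THETA:  # utilities have converged
        break

# Extract the optimal policy: pick the action with the highest expected utility
policy = {
    s: max(actions,
           key=lambda a: sum(p * (r + GAMMA * U[s2]) for p, s2, r in P[(s, a)]))
    for s in states
}
print(U, policy)
```

Here the agent learns to move to s1 and stay there, since staying in s1 yields a reward of 2 per step.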
Policy Iteration
An algorithm that finds an optimal policy through repeated evaluation and improvement of policy utilities.
Bellman Equation
Expresses the relationship between the utility of a state and the expected rewards plus the utility of neighboring states.
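Written out in the notation of the cards above, one standard form of the equation is:

```latex
U(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,\big[ R(s, a, s') + \gamma\, U(s') \big]
```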
Q-function
Also known as action-utility function; it estimates the expected utility of taking a specific action in a given state.
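In symbols (a standard form, consistent with the Bellman equation on the previous card), and showing its relationship to the utility function:

```latex
Q(s, a) = \sum_{s'} P(s' \mid s, a)\,\big[ R(s, a, s') + \gamma \max_{a'} Q(s', a') \big],
\qquad U(s) = \max_{a} Q(s, a)
```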
Environment History
A sequence of states and actions that an agent follows during its interaction with the environment.
Convergence
The point at which state utilities stop changing (within a tolerance) between successive iterations of value or policy iteration.
Non-terminal State
A state in which the decision-making process continues rather than ends; in many example MDPs (such as grid worlds), non-terminal states carry a small negative reward to encourage reaching a goal quickly.