Reinforcement Learning Applications in Trading - Vocabulary Flashcards

Vocabulary flashcards covering key concepts, terms, and examples from the reinforcement learning for trading lecture.


45 Terms

1. Reinforcement Learning (RL)

An AI method where an agent learns to make decisions by interacting with an environment, receiving rewards, and aiming to maximize cumulative return without supervised labels.

2. Deep Reinforcement Learning

RL that uses neural networks to approximate value or policy functions, replacing explicit Q-tables with deep learning models.

3. Q-table

A table that stores the expected future reward (Q-value) for each state-action pair in traditional Q-learning.
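
As a concrete sketch, here is a Q-table and its Q-learning update in Python; the three states, three actions, and the alpha/gamma values are illustrative assumptions, not the lecture's code:

    import numpy as np

    # Illustrative state and action sets (assumed for this sketch)
    states = ["trending_up", "trending_down", "flat"]
    actions = ["long", "short", "hold"]

    # One Q-value per state-action pair, initialized to zero
    Q = np.zeros((len(states), len(actions)))

    alpha, gamma = 0.1, 0.9  # learning rate and discount factor (assumed values)

    def update(s, a, reward, s_next):
        """Q-learning update: move Q[s, a] toward reward + discounted best future value."""
        Q[s, a] += alpha * (reward + gamma * Q[s_next].max() - Q[s, a])

    # Example: in state 0 ("trending_up"), action 0 ("long") earned a reward of 1.0
    update(0, 0, 1.0, s_next=2)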

4. Bellman Equation

The core RL relation that defines the optimal Q-value as the immediate reward plus the discounted best future value.
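
For reference, the relation in symbols (written here for a deterministic reward r(s, a) and next state s'; gamma is the discount factor from the next card):

    Q^*(s, a) = r(s, a) + \gamma \max_{a'} Q^*(s', a')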

5. Gamma (discount factor)

A factor between 0 and 1 that weighs future rewards; gamma = 1 weighs the full future horizon, while gamma = 0 considers only the immediate reward.
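
A quick worked example of discounting (the reward stream and gamma = 0.9 are made up for illustration):

    # Each future reward is weighted by gamma raised to its delay
    rewards = [1.0, 1.0, 1.0]
    gamma = 0.9
    G = sum(r * gamma**t for t, r in enumerate(rewards))
    # G = 1.0 + 0.9 + 0.81 = 2.71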

6. Greedy

Choosing the action with the highest estimated value based on current information, favoring immediate reward.

7. State

The current description of the environment, e.g., market conditions such as volatility, momentum, and liquidity.

8. Action

The decision the agent can take, such as go long, go short, or hold in trading.

9. Reward

Feedback received after taking an action, used to guide learning; in trading it can be profit or a virtual score.

10. Environment

The external system the agent interacts with; in trading, the market constitutes the environment.

11. Backward Induction

Solving the Bellman equation by starting from the end of the horizon and iterating backward to compute values.
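
A minimal sketch of the backward sweep on a toy finite-horizon problem; the rewards, transitions, and horizon below are made-up numbers chosen only to show the mechanics:

    import numpy as np

    T = 3                                        # horizon length (assumed)
    reward = np.array([[1.0, 0.0], [0.0, 2.0]])  # reward[state, action], illustrative
    next_state = np.array([[0, 1], [1, 0]])      # deterministic transitions, illustrative
    gamma = 0.9

    V = np.zeros(2)                         # value at the end of the horizon is zero
    for t in reversed(range(T)):
        Q = reward + gamma * V[next_state]  # Q[s, a] = r(s, a) + gamma * V(s')
        V = Q.max(axis=1)                   # best achievable value at step t
    # After the loop, V holds the optimal values at the start of the horizon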

12. Deep Q-Network (DQN)

A neural network that approximates the Q-table, enabling deep reinforcement learning in complex environments.
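
A minimal sketch of the idea in PyTorch, assuming (purely for illustration) a state of 5 market features and 3 actions (long, short, hold); the network outputs one Q-value per action, playing the role of a Q-table row:

    import torch
    import torch.nn as nn

    n_features, n_actions = 5, 3  # assumed state size and action count

    # Small multilayer perceptron approximating Q(state, .)
    q_net = nn.Sequential(
        nn.Linear(n_features, 32),
        nn.ReLU(),
        nn.Linear(32, 32),
        nn.ReLU(),
        nn.Linear(32, n_actions),
    )

    state = torch.randn(1, n_features)  # one example market state
    q_values = q_net(state)             # one Q-value per action
    action = q_values.argmax(dim=1)     # greedy action choice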

13. Activation Function

A nonlinear function applied to a neuron's weighted input sum to produce its output, enabling the network to represent complex relationships.

14. Sigmoid

An S-shaped activation function mapping inputs to the 0–1 range, commonly used in neural networks.
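
For reference, the formula sigma(x) = 1 / (1 + e^(-x)) in NumPy:

    import numpy as np

    def sigmoid(x):
        """S-shaped squashing function mapping any real input into (0, 1)."""
        return 1.0 / (1.0 + np.exp(-x))

    sigmoid(np.array([-2.0, 0.0, 2.0]))  # approximately [0.12, 0.50, 0.88]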

15. Hidden Layer

A layer of neurons between input and output that transforms inputs into more abstract features.

16. Weights

Learnable parameters that scale inputs in a neural network, determining the network output.

17. Bias

A constant input added to neurons to improve learning flexibility and stability.

18. Loss Function

A measure of the difference between the network output and the true target; lower loss means better predictions.
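
For example, mean squared error, one common choice of loss, compares predictions with targets (the numbers are arbitrary):

    import numpy as np

    predictions = np.array([1.0, 2.0, 3.0])
    targets = np.array([1.5, 2.0, 2.0])
    mse = np.mean((predictions - targets) ** 2)  # (0.25 + 0.0 + 1.0) / 3 ≈ 0.417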

19. Optimizer

Algorithm used to adjust weights to minimize the loss; examples include SGD and Adam.

20. Adam Optimizer

An adaptive learning rate optimizer that adjusts step sizes during training for faster convergence.
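
A minimal single training step with PyTorch's Adam optimizer; the tiny network, batch size, and targets are placeholders so the sketch is self-contained:

    import torch
    import torch.nn as nn

    net = nn.Sequential(nn.Linear(5, 16), nn.ReLU(), nn.Linear(16, 3))  # placeholder model
    optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)             # adaptive step sizes
    loss_fn = nn.MSELoss()

    # One gradient step: predict, measure the loss, backpropagate, update the weights
    pred = net(torch.randn(8, 5))   # batch of 8 made-up inputs
    target = torch.randn(8, 3)      # placeholder targets
    loss = loss_fn(pred, target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()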

21. Overfitting

When a model captures noise rather than the underlying pattern, especially problematic in finance and hard to cure.

22. Supervised Learning

Learning from labeled data where the model predicts known targets; includes regression and classification.

23. Unsupervised Learning

Learning from data without labels to discover structure, such as PCA and clustering.

24. Regression

A supervised learning task where the target is continuous, e.g., predicting a price.

25. Classification

A supervised learning task where the target is discrete classes, e.g., buy/hold/sell decisions.

26. Principal Component Analysis (PCA)

An unsupervised dimensionality reduction method that finds orthogonal components capturing variance.
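
A minimal usage sketch with scikit-learn; the random return matrix is a stand-in for real data:

    import numpy as np
    from sklearn.decomposition import PCA

    returns = np.random.randn(250, 10)       # 250 days x 10 assets, stand-in data
    pca = PCA(n_components=3)                # keep the 3 largest-variance components
    components = pca.fit_transform(returns)  # shape (250, 3)
    print(pca.explained_variance_ratio_)     # share of variance captured by each component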

27. Autocorrelation

The correlation of a time series with its past values at a given lag, indicating repeating patterns.

28. Autocorrelation Function (ACF)

A function that quantifies the correlation of a series with its lagged versions across lags.
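
One quick way to estimate it for a return series with pandas (the random series stands in for real returns):

    import numpy as np
    import pandas as pd

    returns = pd.Series(np.random.randn(500))              # stand-in return series
    acf = [returns.autocorr(lag=k) for k in range(1, 11)]  # correlation at lags 1..10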

29. Walk-Forward Optimization

Backtesting by training on a rolling window and testing on the period that follows, repeated through time to avoid look-ahead bias.
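
A minimal sketch of the windowing logic; the window lengths and the fit/evaluate callables are placeholders, not the lecture's code:

    # Fit on one window of history, test on the period immediately after it, then roll forward
    def walk_forward(data, fit, evaluate, train_len=252, test_len=63):
        results = []
        start = 0
        while start + train_len + test_len <= len(data):
            train = data[start : start + train_len]
            test = data[start + train_len : start + train_len + test_len]
            model = fit(train)                     # only past data is used for fitting
            results.append(evaluate(model, test))  # tested on data the model never saw
            start += test_len                      # roll the window forward
        return results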

30. Look-Ahead Bias

Unintentionally using future information to train or test a model, inflating apparent performance.

31. Stationarity

A property where a time series has constant mean and variance over time; returns are often stationary while prices are not.

32. Lévy Stable / Cauchy Distributions

Heavy-tailed distributions used to model non-Gaussian market returns, whose variance can be infinite or undefined.

33. Marshmallow Experiment

A metaphor for delayed gratification in RL, where waiting yields larger future rewards.

34. Keynesian Beauty Contest

Predicting what others think will win, illustrating markets as a contest of predicting other participants' expectations.

35. Reward Function

The RL signal that assigns value to actions to shape behavior; it can incorporate profits, risk, and holding time.
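
One illustrative way (not the lecture's exact formula) to encode such a reward, combining profit with penalties for risk and holding time:

    def trade_reward(pnl, drawdown, bars_held, risk_penalty=0.5, time_penalty=0.01):
        """Illustrative reward: profit minus penalties for drawdown and holding time."""
        return pnl - risk_penalty * drawdown - time_penalty * bars_held

    trade_reward(pnl=2.0, drawdown=1.0, bars_held=10)  # 2.0 - 0.5 - 0.1 = 1.4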

36. Gamify Trades

Treating each trade as a game with states, actions, and scores to train RL agents.

37. Sine Wave Testing

Using a simple sine wave as a proxy price to validate RL learning before real markets.
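
Generating such a proxy price series takes one line of NumPy; the base level, amplitude, and period are arbitrary choices:

    import numpy as np

    t = np.arange(1000)
    price = 100 + 10 * np.sin(2 * np.pi * t / 50)  # base 100, amplitude 10, period 50 bars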

38. Features

Input variables describing market state, such as OHLCV, indicators, time features, and position size.

39. Market State

The current set of features describing market conditions used as input to RL.

40. Action Space

The set of possible actions the agent can take, such as long, short, or hold.

41. Returns vs Prices

In ML for finance, returns are preferred as inputs because they are more stationary than prices.
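
Converting prices to returns is straightforward with pandas; either simple or log returns can be used (the price values are arbitrary):

    import numpy as np
    import pandas as pd

    prices = pd.Series([100.0, 101.0, 99.5, 100.5])
    simple_returns = prices.pct_change().dropna()            # (p_t - p_{t-1}) / p_{t-1}
    log_returns = np.log(prices / prices.shift(1)).dropna()  # log(p_t / p_{t-1})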

42. Nonlinear Correlations

Relationships in data that are not captured by linear autocorrelation but can be learned by deep models.

43. Q-Learning vs Deep Q-Learning

Q-Learning uses a Q-table; Deep Q-Learning replaces the table with a neural network.

44. Pong (RL Example)

A classic reinforcement learning demonstration where an agent learns to play a simple game.

45. AlphaGo

A landmark reinforcement learning system that mastered the game of Go through self-play and learning.