Model-based Deep Reinforcement Learning for Dynamic Portfolio Optimization

Pengqian Yu , Joon Sern Lee , Ilya Kulyatin , Zekun Shi , Sakyasingha Dasgupta

Authors on Pith no claims yet

classification 💻 cs.LG cs.AIstat.ML

keywords tradingagentlearningmodulearchitecturedatadeepdesign

read the original abstract

Dynamic portfolio optimization is the process of sequentially allocating wealth to a collection of assets in some consecutive trading periods, based on investors' return-risk profile. Automating this process with machine learning remains a challenging problem. Here, we design a deep reinforcement learning (RL) architecture with an autonomous trading agent such that, investment decisions and actions are made periodically, based on a global objective, with autonomy. In particular, without relying on a purely model-free RL agent, we train our trading agent using a novel RL architecture consisting of an infused prediction module (IPM), a generative adversarial data augmentation module (DAM) and a behavior cloning module (BCM). Our model-based approach works with both on-policy or off-policy RL algorithms. We further design the back-testing and execution engine which interact with the RL agent in real time. Using historical {\em real} financial market data, we simulate trading with practical constraints, and demonstrate that our proposed model is robust, profitable and risk-sensitive, as compared to baseline trading strategies and model-free RL agents from prior work.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SBCA: Cross-Modal BERT-driven Actor-Critic for Multi-Asset Portfolio Optimization
q-fin.CP 2026-05 unverdicted novelty 6.0

SBCA is a reinforcement learning framework using BERT cross-modal fusion and Actor-Critic to integrate price data with sentiment text for multi-asset portfolio optimization with practical trading constraints.