Model-based Deep Reinforcement Learning for Dynamic Portfolio Optimization

Ilya Kulyatin; Joon Sern Lee; Pengqian Yu; Sakyasingha Dasgupta; Zekun Shi

arXiv: 1901.08740 · v1 · pith:OFGYNOHPnew · submitted 2019-01-25 · 💻 cs.LG · cs.AI · stat.ML

Pengqian Yu, Joon Sern Lee, Ilya Kulyatin, Zekun Shi, Sakyasingha Dasgupta

classification 💻 cs.LG cs.AI stat.ML

keywords trading, agent, learning, module, architecture, data, deep, design


Dynamic portfolio optimization is the process of sequentially allocating wealth to a collection of assets over consecutive trading periods, based on investors' return-risk profile. Automating this process with machine learning remains a challenging problem. Here, we design a deep reinforcement learning (RL) architecture with an autonomous trading agent that makes investment decisions and takes actions periodically, based on a global objective. In particular, rather than relying on a purely model-free RL agent, we train our trading agent using a novel RL architecture consisting of an infused prediction module (IPM), a generative adversarial data augmentation module (DAM) and a behavior cloning module (BCM). Our model-based approach works with both on-policy and off-policy RL algorithms. We further design the back-testing and execution engine, which interacts with the RL agent in real time. Using real historical financial market data, we simulate trading with practical constraints, and demonstrate that our proposed model is robust, profitable and risk-sensitive compared to baseline trading strategies and model-free RL agents from prior work.
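
The sequential allocation described in the abstract can be illustrated with a minimal sketch. This is not the paper's back-testing engine: the function name `simulate_wealth`, the proportional `cost_rate` on turnover, and the convention that the agent starts already invested in its first allocation are all assumptions made here for illustration. The sketch only shows how wealth compounds when portfolio weights are chosen period by period.

```python
import numpy as np

def simulate_wealth(weights, price_relatives, cost_rate=0.0, initial_wealth=1.0):
    """Track wealth over consecutive trading periods (illustrative only).

    weights[t]         : allocation chosen at the start of period t (sums to 1)
    price_relatives[t] : end/start price ratio of each asset over period t
    cost_rate          : hypothetical proportional cost charged on turnover
    """
    weights = np.asarray(weights, dtype=float)
    price_relatives = np.asarray(price_relatives, dtype=float)

    wealth = initial_wealth
    history = [wealth]
    held = weights[0]  # assumption: already invested in the first allocation
    for w, y in zip(weights, price_relatives):
        # Pay a cost proportional to how far we rebalance away from drifted holdings.
        turnover = np.abs(w - held).sum()
        wealth *= 1.0 - cost_rate * turnover
        # Wealth grows by the weighted average of the asset price relatives.
        growth = float(w @ y)
        wealth *= growth
        # Holdings drift with prices until the next rebalance.
        held = w * y / growth
        history.append(wealth)
    return np.array(history)

# Two assets, two periods, equal weights throughout.
w = [[0.5, 0.5], [0.5, 0.5]]
y = [[1.2, 1.0], [1.0, 1.0]]
no_cost = simulate_wealth(w, y)
with_cost = simulate_wealth(w, y, cost_rate=0.01)
```

In this toy run the first asset gains 20% in the first period, so rebalancing back to equal weights in the second period incurs turnover; with a nonzero `cost_rate` the final wealth is slightly lower than in the cost-free case. An RL agent in this setting would be the component that chooses each row of `weights`.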


Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

SBCA: Cross-Modal BERT-driven Actor-Critic for Multi-Asset Portfolio Optimization
q-fin.CP 2026-05 unverdicted novelty 6.0

SBCA is a reinforcement learning framework using BERT cross-modal fusion and Actor-Critic to integrate price data with sentiment text for multi-asset portfolio optimization with practical trading constraints.

Multi-Agent Deep Reinforcement Learning for Liquidation Strategy Analysis
q-fin.TR 2019-06 unverdicted novelty 5.0

The authors extend the Almgren-Chriss model to a multi-agent setting and apply deep reinforcement learning to simulate and optimize liquidation strategies under practical constraints.

Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation
q-fin.CP 2025-02 unverdicted novelty 3.0

CausalGAN + SAC RL pipeline generates synthetic bond yield data; fine-tuned Qwen2.5-7B LLM produces trading signals, with reported MAE 0.103, 60% profit rate, and LLM score 3.37/5.