arxiv: 1901.08740 · v1 · pith:OFGYNOHPnew · submitted 2019-01-25 · 💻 cs.LG · cs.AI · stat.ML

Model-based Deep Reinforcement Learning for Dynamic Portfolio Optimization

classification 💻 cs.LG · cs.AI · stat.ML
keywords trading · agent · learning · module · architecture · data · deep · design
Dynamic portfolio optimization is the process of sequentially allocating wealth to a collection of assets over consecutive trading periods, based on investors' return-risk profile. Automating this process with machine learning remains a challenging problem. Here, we design a deep reinforcement learning (RL) architecture with an autonomous trading agent such that investment decisions and actions are made periodically and autonomously, based on a global objective. In particular, rather than relying on a purely model-free RL agent, we train our trading agent using a novel RL architecture consisting of an infused prediction module (IPM), a generative adversarial data augmentation module (DAM) and a behavior cloning module (BCM). Our model-based approach works with both on-policy and off-policy RL algorithms. We further design a back-testing and execution engine that interacts with the RL agent in real time. Using real historical financial market data, we simulate trading with practical constraints, and demonstrate that our proposed model is robust, profitable and risk-sensitive, as compared to baseline trading strategies and model-free RL agents from prior work.
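
To make the opening definition concrete, the sketch below shows one plausible way to simulate sequential allocation over consecutive trading periods. It is not the paper's agent, back-testing engine, or cost model: the function name `simulate_rebalancing`, the equal-weight starting allocation, and the proportional transaction cost are all assumptions chosen for illustration. An RL agent would supply `target_weights` period by period; here they are simply given as input.

```python
from typing import List, Sequence


def simulate_rebalancing(
    price_relatives: Sequence[Sequence[float]],
    target_weights: Sequence[Sequence[float]],
    initial_wealth: float = 1.0,
    cost_rate: float = 0.0,
) -> List[float]:
    """Return the wealth path of a long-only portfolio rebalanced each period.

    price_relatives[t][i] is close/open for asset i in period t.
    target_weights[t][i] is the fraction of wealth put in asset i at the
    start of period t (non-negative, summing to 1).
    cost_rate is a hypothetical proportional cost charged on turnover.
    """
    n_assets = len(price_relatives[0])
    # Assumption: the portfolio starts equally weighted across assets.
    held = [1.0 / n_assets] * n_assets
    wealth = initial_wealth
    path = [wealth]

    for relatives, weights in zip(price_relatives, target_weights):
        if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("weights must be non-negative and sum to 1")

        # Pay a cost proportional to how much of the portfolio is traded.
        turnover = sum(abs(w - h) for w, h in zip(weights, held))
        wealth *= 1.0 - cost_rate * turnover

        # Wealth grows by the weighted average of the asset price relatives.
        growth = sum(w * r for w, r in zip(weights, relatives))
        wealth *= growth

        # Holdings drift with prices before the next rebalance.
        held = [w * r / growth for w, r in zip(weights, relatives)]
        path.append(wealth)

    return path


if __name__ == "__main__":
    relatives = [[1.10, 1.00], [1.00, 0.90]]
    weights = [[0.5, 0.5], [0.5, 0.5]]
    print(simulate_rebalancing(relatives, weights))
```

In this framing, the "return-risk profile" mentioned in the abstract would enter through the objective used to choose the weights, which the sketch deliberately leaves out.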

This paper has not been read by Pith yet.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SBCA: Cross-Modal BERT-driven Actor-Critic for Multi-Asset Portfolio Optimization

    q-fin.CP 2026-05 unverdicted novelty 6.0

    SBCA is a reinforcement learning framework using BERT cross-modal fusion and Actor-Critic to integrate price data with sentiment text for multi-asset portfolio optimization with practical trading constraints.

  2. Multi-Agent Deep Reinforcement Learning for Liquidation Strategy Analysis

    q-fin.TR 2019-06 unverdicted novelty 5.0

    The authors extend the Almgren-Chriss model to a multi-agent setting and apply deep reinforcement learning to simulate and optimize liquidation strategies under practical constraints.

  3. Predicting Liquidity-Aware Bond Yields using Causal GANs and Deep Reinforcement Learning with LLM Evaluation

    q-fin.CP 2025-02 unverdicted novelty 3.0

    CausalGAN + SAC RL pipeline generates synthetic bond yield data; fine-tuned Qwen2.5-7B LLM produces trading signals, with reported MAE 0.103, 60% profit rate, and LLM score 3.37/5.