When to Trust Your Model: Model-Based Policy Optimization
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.LG (2)
verdicts: UNVERDICTED (2)
representative citing papers
Return-conditional diffusion models for policies outperform offline RL on benchmarks by circumventing dynamic programming and enable constraint or skill composition.
citing papers explorer
- Plan Before You Trade: Inference-Time Optimization for RL Trading Agents
  FPILOT optimizes pre-trained RL trading policies at inference time using forecasted price trajectories to improve portfolio allocations and risk-adjusted returns on the DJ30 benchmark.
- Is Conditional Generative Modeling all you need for Decision-Making?
  Return-conditional diffusion models for policies outperform offline RL on benchmarks by circumventing dynamic programming and enable constraint or skill composition.