Learning a belief representation for delayed reinforcement learning
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Model-Based Reinforcement Learning under Random Observation Delays
A delay-aware model-based RL framework with sequential belief filtering handles random out-of-sequence observations in POMDPs and outperforms MDP baselines while showing robustness to delay shifts.