Proposes S2AC and SDAC, streaming RL algorithms that match batch performance on benchmarks, and introduces a principled method for batch-to-streaming policy transitions.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Towards Batch-to-Streaming Deep Reinforcement Learning for Continuous Control
Proposes S2AC and SDAC, streaming RL algorithms that match batch performance on benchmarks, and introduces a principled method for batch-to-streaming policy transitions.