One subgoal at a time: Zero-shot generalization to arbitrary linear temporal logic requirements in multi-task reinforcement learning. arXiv preprint arXiv:2508.01561, 2025
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Decoupled Behavioral Cloning for Scalable Inductive Generalization in RL from Specifications
DIBS decouples task policy learning via RL from evolution function learning via behavioral cloning to achieve more stable training and better generalization than prior RL and meta-RL methods for inductive generalization from specifications.