TCE bridges domain gaps in offline RL by selectively using source data or generating target-aligned transitions via a dual score-based model, outperforming baselines in experiments.
Mujoco: A physics engine for model-based control
4 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
Target-Aligned Bellman Backup (TABB) improves cross-domain offline RL by selecting source transitions according to their contribution to accurate target-domain Bellman target estimation.
SSE improves long-horizon goal-conditioned RL by using failure and partial-success transitions to identify unreliable subgoals, streamline high-level planning, and outperform prior hierarchical methods on benchmarks.
citing papers explorer
-
Bridging Domain Gaps with Target-Aligned Generation for Offline Reinforcement Learning
TCE bridges domain gaps in offline RL by selectively using source data or generating target-aligned transitions via a dual score-based model, outperforming baselines in experiments.
-
Target-Aligned Bellman Backup for Cross-domain Offline Reinforcement Learning
Target-Aligned Bellman Backup (TABB) improves cross-domain offline RL by selecting source transitions according to their contribution to accurate target-domain Bellman target estimation.
-
Strict Subgoal Execution: Reliable Long-Horizon Planning in Hierarchical Reinforcement Learning
SSE improves long-horizon goal-conditioned RL by using failure and partial-success transitions to identify unreliable subgoals, streamline high-level planning, and outperform prior hierarchical methods on benchmarks.
- Coding Agent Is Good As World Simulator