MuJoCo: A physics engine for model-based control
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
representative citing papers
Target-Aligned Bellman Backup (TABB) improves cross-domain offline RL by selecting source transitions according to their contribution to accurate target-domain Bellman target estimation.
SSE improves long-horizon goal-conditioned RL by using failure and partial-success transitions to identify unreliable subgoals, streamline high-level planning, and outperform prior hierarchical methods on benchmarks.
An agentic framework generates executable physics simulation code from text prompts via coordinated planning, coding, visual, and physics agents that iterate to satisfy both prompt fidelity and physical constraints.
citing papers explorer
- Bridging Domain Gaps with Target-Aligned Generation for Offline Reinforcement Learning
TCE bridges domain gaps in offline RL by selectively using source data or generating target-aligned transitions via a dual score-based model, outperforming baselines in experiments.
- Target-Aligned Bellman Backup for Cross-domain Offline Reinforcement Learning
Target-Aligned Bellman Backup (TABB) improves cross-domain offline RL by selecting source transitions according to their contribution to accurate target-domain Bellman target estimation.
- Coding Agent Is Good As World Simulator
An agentic framework generates executable physics simulation code from text prompts via coordinated planning, coding, visual, and physics agents that iterate to satisfy both prompt fidelity and physical constraints.