FAN achieves state-of-the-art offline RL performance on robotic tasks by anchoring flow policies and using single-sample noise-conditioned Q-learning, with proven convergence and reduced runtimes.
arXiv preprint arXiv:2512.03973 (2025)
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Sobolev-trained diffusion policies using trajectories and feedback gains provide warm-starts that reduce trajectory optimization solving time by 2x to 20x while avoiding compounding errors.
citing papers explorer
- Towards Efficient and Expressive Offline RL via Flow-Anchored Noise-conditioned Q-Learning
  FAN achieves state-of-the-art offline RL performance on robotic tasks by anchoring flow policies and using single-sample noise-conditioned Q-learning, with proven convergence and reduced runtimes.
- Accelerating trajectory optimization with Sobolev-trained diffusion policies
  Sobolev-trained diffusion policies using trajectories and feedback gains provide warm-starts that reduce trajectory optimization solving time by 2x to 20x while avoiding compounding errors.