Proceedings of the 27th International Conference on Machine Learning (ICML-10)
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.LG (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Towards Efficient and Expressive Offline RL via Flow-Anchored Noise-conditioned Q-Learning
FAN simplifies expressive flow policies and distributional critics in offline RL through single-iteration behavior regularization and single-sample noise conditioning, and claims state-of-the-art performance with lower training and inference time.