However, the fundamental distinction lies in the generative policy class, which dictates optimization complexity and intermediate value construction

explicitly learns the intermediate value via contrastive energy prediction, is the most similar approach to Q-Flow · 2025

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Q-Flow: Stable and Expressive Reinforcement Learning with Flow-Based Policy

cs.LG · 2026-05-13 · unverdicted · novelty 5.0

Q-Flow bridges stability and expressivity in flow-based RL policies by propagating terminal trajectory values to intermediate states for gradient-based optimization.


Q-Flow: Stable and Expressive Reinforcement Learning with Flow-Based Policy cs.LG · 2026-05-13 · unverdicted · none · ref 18
