More approaches include sequence modeling (Chen et al., 2021; Janner et al

or via pessimistic value learning (Kumar et al., 2020

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Q-Flow: Stable and Expressive Reinforcement Learning with Flow-Based Policy

cs.LG · 2026-05-13 · unverdicted · novelty 6.0 · ref 13

Q-Flow enables stable optimization of expressive flow-based policies in RL by propagating terminal values along deterministic flow dynamics to intermediate states for gradient updates without solver unrolling.
