Flow Q-Learning, 2025

Seohong Park, Qiyang Li, Sergey Levine · 2025

2 Pith papers cite this work.

representative citing papers

Fisher Decorator: Refining Flow Policy via a Local Transport Map

cs.LG · 2026-04-20 · unverdicted · novelty 6.0

Fisher Decorator refines flow policies in offline RL via a local transport map and Fisher-matrix quadratic approximation of the KL constraint, yielding controllable error near the optimum and SOTA benchmark results.

ISEP: Implicit Support Expansion for Offline Reinforcement Learning via Stochastic Policy Optimization

cs.LG · 2026-05-18 · unverdicted · novelty 5.0

ISEP expands action support in offline RL via value interpolation between data and policy samples, then uses stochastic policy optimization to avoid mode collapse in the resulting multimodal objective.

citing papers explorer

Showing 2 of 2 citing papers.

Fisher Decorator: Refining Flow Policy via a Local Transport Map · cs.LG · 2026-04-20 · unverdicted · none · ref 19
ISEP: Implicit Support Expansion for Offline Reinforcement Learning via Stochastic Policy Optimization · cs.LG · 2026-05-18 · unverdicted · none · ref 29
