Mean Flow Policy Optimization (MFPO) uses few-step flow-based models for RL policies and achieves performance on par with or better than diffusion-based methods while substantially lowering training and inference time on MuJoCo and DeepMind Control Suite.
Communications in Statistics-Simulation and Computation18(3), 1059–1076 (1989)
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.LG 2years
2026 2roles
background 1polarities
background 1representative citing papers
Latent diffusability is quantified by decomposing the MMSE rate along diffusion trajectories into Fisher Information and Fisher Information Rate, with three geometric penalties (dimensional compression, tangential distortion, curvature injection) identified as sources of failure.
citing papers explorer
-
Mean Flow Policy Optimization
Mean Flow Policy Optimization (MFPO) uses few-step flow-based models for RL policies and achieves performance on par with or better than diffusion-based methods while substantially lowering training and inference time on MuJoCo and DeepMind Control Suite.
-
Understanding Latent Diffusability via Fisher Geometry
Latent diffusability is quantified by decomposing the MMSE rate along diffusion trajectories into Fisher Information and Fisher Information Rate, with three geometric penalties (dimensional compression, tangential distortion, curvature injection) identified as sources of failure.