Consistency models
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Citing papers
- Score-Based One-step MeanFlow Policy Optimization
  SOM is an actor-critic algorithm that constructs the target velocity field for one-step MeanFlow policies directly from the Q-function via score estimation and the probability flow ODE. The authors report state-of-the-art results on locomotion tasks with reduced training and inference time.
- Hierarchical Variational Policies for Reward-Guided Diffusion
  A hierarchical variational formulation amortizes test-time guidance in diffusion models to achieve strong quality-speed tradeoffs with significantly reduced inference compute.