
OMP: One-step Meanflow Policy with Directional Alignment

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it
abstract

Robot manipulation has increasingly adopted data-driven generative policy frameworks, yet the field faces a persistent trade-off: diffusion models suffer from high inference latency, while flow-based methods often require complex architectural constraints. Although the MeanFlow paradigm offers a path to single-step inference in the image generation domain, its direct application to robotics is impeded by critical theoretical pathologies, specifically spectral bias and gradient starvation in low-velocity regimes. To overcome these limitations, we propose the One-step MeanFlow Policy (OMP), a novel framework designed for high-fidelity, real-time manipulation. We introduce a lightweight directional alignment mechanism that explicitly synchronizes predicted velocities with true mean velocities. Furthermore, we implement a Differential Derivation Equation (DDE) to approximate the Jacobian-Vector Product (JVP) operator, which decouples the forward and backward passes and significantly reduces memory complexity. Extensive experiments on the Adroit and Meta-World benchmarks demonstrate that OMP outperforms state-of-the-art methods in success rate and trajectory accuracy, particularly in high-precision tasks, while retaining the efficiency of single-step generation.
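
The abstract names two mechanisms without giving their formulas: a directional alignment term and a difference-based approximation of the JVP. The sketch below illustrates the general idea of each in plain Python. It is an assumption-laden illustration, not the paper's method: the cosine form of the alignment loss, the central-difference form of the JVP estimate, and the names `cosine_alignment_loss`, `finite_difference_jvp`, and `toy_field` are all hypothetical.

```python
import math

def cosine_alignment_loss(pred, target, eps=1e-8):
    """Assumed alignment term: 1 - cos(pred, target).

    Close to 0 when the two vectors point the same way and close to 2
    when they point in opposite directions, regardless of magnitude.
    """
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / (norm_p * norm_t + eps)

def finite_difference_jvp(f, x, v, h=1e-5):
    """Central-difference estimate of J_f(x) @ v.

    Uses two plain forward evaluations of f and no automatic
    differentiation, which is the generic way a difference scheme
    can stand in for an exact JVP operator.
    """
    x_plus = [xi + h * vi for xi, vi in zip(x, v)]
    x_minus = [xi - h * vi for xi, vi in zip(x, v)]
    f_plus, f_minus = f(x_plus), f(x_minus)
    return [(a - b) / (2.0 * h) for a, b in zip(f_plus, f_minus)]

def toy_field(x):
    # Hypothetical stand-in for a velocity network: f(x, y) = (x*y, x + y^2)
    return [x[0] * x[1], x[0] + x[1] ** 2]

# The Jacobian of toy_field at (1, 2) is [[2, 1], [1, 4]],
# so the product with v = (1, 0) is approximately (2, 1).
jvp_estimate = finite_difference_jvp(toy_field, [1.0, 2.0], [1.0, 0.0])
```

Because the difference scheme only needs forward evaluations, nothing from the tangent computation has to be stored for the backward pass, which is one plausible reading of the memory saving the abstract describes.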

fields

cs.RO 1

years

2026 1

verdicts

UNVERDICTED 1


representative citing papers

MARS Policy: Multimodality Only When It Matters

cs.RO · 2026-05-28 · unverdicted · novelty 5.0

MARS policy adaptively activates multimodal generation only when beneficial in robotic tasks, claiming 16.67% higher success and 83.20% lower inference latency than baselines in real-world tests.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • MARS Policy: Multimodality Only When It Matters cs.RO · 2026-05-28 · unverdicted · none · ref 3 · internal anchor
