Action-to-Action Flow Matching

· 2026 · cs.RO · arXiv 2602.07322

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

open full Pith review browse 3 citing papers arXiv PDF

abstract

Diffusion-based policies have recently achieved remarkable success in robotics by formulating action prediction as a conditional denoising process. However, the standard practice of sampling from random Gaussian noise often requires multiple iterative steps to produce clean actions, leading to high inference latency that incurs a major bottleneck for real-time control. In this paper, we challenge the necessity of uninformed noise sampling and propose Action-to-Action flow matching (A2A), a novel policy paradigm that shifts from random sampling to initialization informed by the previous proprioceptive action. Unlike existing methods that treat proprioceptive action feedback as static conditions, A2A leverages historical proprioceptive sequences, embedding them into a high-dimensional latent space as the starting point for action generation. This design bypasses costly iterative denoising while effectively capturing the robot's physical dynamics and temporal continuity. Extensive experiments demonstrate that A2A exhibits high training efficiency, fast inference speed, and improved generalization. Notably, A2A enables high-quality action generation in as few as a single inference step, and exhibits superior robustness to visual perturbations and enhanced generalization to unseen configurations. Lastly, we also extend A2A to video generation, demonstrating its broader versatility in temporal modeling. Project site: https://lorenzo-0-0.github.io/A2A_Flow_Matching.

representative citing papers

Feedback World Model Enables Precise Guidance of Diffusion Policy

cs.RO · 2026-05-15 · unverdicted · novelty 6.0

Feedback world model closes the prediction-observation loop at inference time to correct errors and improve diffusion policy performance under distribution shift in robotics.

FLASH: Efficient Visuomotor Policy via Sparse Sampling

cs.RO · 2026-05-15 · unverdicted · novelty 6.0

FLASH Policy uses sparse Legendre polynomial trajectory fitting and history-anchored flow matching to enable single-step inference for visuomotor control, reporting 31.4 ms per-episode latency and >=92% success on five simulated plus two real manipulation tasks.

WarmPrior: Straightening Flow-Matching Policies with Temporal Priors

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

Replacing Gaussian noise with a temporally grounded prior from recent actions straightens flow-matching paths and improves success rates in robotic manipulation and prior-space RL.

citing papers explorer

Showing 3 of 3 citing papers.

Feedback World Model Enables Precise Guidance of Diffusion Policy cs.RO · 2026-05-15 · unverdicted · none · ref 5 · internal anchor
Feedback world model closes the prediction-observation loop at inference time to correct errors and improve diffusion policy performance under distribution shift in robotics.
FLASH: Efficient Visuomotor Policy via Sparse Sampling cs.RO · 2026-05-15 · unverdicted · none · ref 33 · internal anchor
FLASH Policy uses sparse Legendre polynomial trajectory fitting and history-anchored flow matching to enable single-step inference for visuomotor control, reporting 31.4 ms per-episode latency and >=92% success on five simulated plus two real manipulation tasks.
WarmPrior: Straightening Flow-Matching Policies with Temporal Priors cs.LG · 2026-05-13 · unverdicted · none · ref 4 · internal anchor
Replacing Gaussian noise with a temporally grounded prior from recent actions straightens flow-matching paths and improves success rates in robotic manipulation and prior-space RL.

Action-to-Action Flow Matching

fields

years

verdicts

representative citing papers

citing papers explorer