Diffusion policy: Visuomotor policy learning via action diffusion
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields: cs.RO 8
years: 2026 8
roles: background 2
polarities: background 2
representative citing papers
citing papers explorer
- Guided Streaming Stochastic Interpolant Policy
  Derives optimal inference-time guidance for stochastic interpolant policies via Kolmogorov equation analysis, enabling reactive streaming robot control with training-free and training-based mechanisms.
- A Principled Approach for Creating High-fidelity Synthetic Demonstrations for Imitation Learning
  DMP retargeting within 3DGS scenes preserves expert motion shape and phase to create diverse yet high-fidelity demonstrations, yielding lower deviation, fewer collisions, and higher downstream policy success than planner-based synthesis on Spot manipulator tasks.
- Tube Diffusion Policy: Reactive Visual-Tactile Policy Learning for Contact-rich Manipulation
  Tube Diffusion Policy learns observation-conditioned feedback flows around nominal action chunks to enable fast reactive control in visual-tactile contact-rich manipulation.
- WARPED: Wrist-Aligned Rendering for Robot Policy Learning from Egocentric Human Demonstrations
  WARPED synthesizes realistic wrist-view observations from monocular egocentric human videos via foundation models, hand-object tracking, retargeting, and Gaussian Splatting to train visuomotor policies that match teleoperation success rates on five tabletop tasks with 5-8x less collection effort.
- Unleashing the Potential of Diffusion Models for End-to-End Autonomous Driving
  The paper introduces Hyper Diffusion Planner (HDP), a diffusion-based E2E AD framework that identifies insights on loss space, trajectory representation and data scaling, adds RL post-training, and reports 10x performance gains over 200 km of real-world testing across 6 scenarios.
- Action Hallucination in Generative Vision-Language-Action Models
  Generative VLAs hallucinate physically invalid actions due to topological, precision, and horizon mismatches between model architectures and feasible robot behavior.
- CLAMP: Contrastive Learning for 3D Multi-View Action-Conditioned Robotic Manipulation Pretraining
  CLAMP pretrains 3D multi-view encoders with contrastive learning on point clouds and actions, then initializes diffusion policies for more sample-efficient fine-tuning on robotic tasks.
- TAIL-Safe: Task-Agnostic Safety Monitoring for Imitation Learning Policies
  TAIL-Safe learns a Lipschitz Q-function from visibility, recognizability, and graspability criteria in a Gaussian Splatting twin to define an empirical safe set for IL policies and recovers unsafe actions via Nagumo-inspired gradient ascent.