Roboengine: Plug-and-play robot data augmentation with semantic robot segmentation and background generation

Chengbo Yuan et al · 2025 · arXiv 2503.18738

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

read on arXiv browse 6 citing papers

citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

DockAnywhere: Data-Efficient Visuomotor Policy Learning for Mobile Manipulation via Novel Demonstration Generation

cs.RO · 2026-04-16 · unverdicted · novelty 7.0

DockAnywhere lifts single demonstrations to diverse docking points via structure-preserving augmentation and point-cloud spatial editing to improve viewpoint generalization in visuomotor policies for mobile manipulation.

A Principled Approach for Creating High-fidelity Synthetic Demonstrations for Imitation Learning

cs.RO · 2026-05-02 · unverdicted · novelty 6.0

DMP retargeting within 3DGS scenes preserves expert motion shape and phase to create diverse yet high-fidelity demonstrations, yielding lower deviation, fewer collisions, and higher downstream policy success than planner-based synthesis on Spot manipulator tasks.

Vidar: Embodied Video Diffusion Model for Generalist Manipulation

cs.LG · 2025-07-17 · unverdicted · novelty 6.0

Vidar shows that a video diffusion prior continuously pre-trained on 750K multi-view robot trajectories plus a label-free masked inverse dynamics adapter can generalize manipulation to new robot embodiments with 1% of typical demonstration data.

StableIDM: Stabilizing Inverse Dynamics Model against Manipulator Truncation via Spatio-Temporal Refinement

cs.RO · 2026-04-20 · unverdicted · novelty 5.0

StableIDM stabilizes inverse dynamics models under manipulator truncation by combining robot-centric masking, directional spatial feature aggregation, and temporal dynamics refinement, yielding 12.1% higher strict action accuracy on AgiBot and 9.7-17.6% gains in real-robot tasks.

SEVO: Semantic-Enhanced Virtual Observation for Robust VLA Manipulation via Active Illumination and Data-Centric Collection

cs.RO · 2026-05-11 · conditional · novelty 4.0

SEVO raises ACT and SmolVLA pick-and-place success from 30-35% to 75-85% in novel environments by using active illumination, semantic cues, and diversified teleoperation data.

Vision-Language-Action in Robotics: A Survey of Datasets, Benchmarks, and Data Engines

cs.RO · 2026-04-24 · unverdicted · novelty 3.0

A survey of VLA robotics research identifies data infrastructure as the primary bottleneck and distills four open challenges in representation alignment, multimodal supervision, reasoning assessment, and scalable data generation.

citing papers explorer

Showing 6 of 6 citing papers.

DockAnywhere: Data-Efficient Visuomotor Policy Learning for Mobile Manipulation via Novel Demonstration Generation cs.RO · 2026-04-16 · unverdicted · none · ref 26
DockAnywhere lifts single demonstrations to diverse docking points via structure-preserving augmentation and point-cloud spatial editing to improve viewpoint generalization in visuomotor policies for mobile manipulation.
A Principled Approach for Creating High-fidelity Synthetic Demonstrations for Imitation Learning cs.RO · 2026-05-02 · unverdicted · none · ref 34
DMP retargeting within 3DGS scenes preserves expert motion shape and phase to create diverse yet high-fidelity demonstrations, yielding lower deviation, fewer collisions, and higher downstream policy success than planner-based synthesis on Spot manipulator tasks.
Vidar: Embodied Video Diffusion Model for Generalist Manipulation cs.LG · 2025-07-17 · unverdicted · none · ref 21
Vidar shows that a video diffusion prior continuously pre-trained on 750K multi-view robot trajectories plus a label-free masked inverse dynamics adapter can generalize manipulation to new robot embodiments with 1% of typical demonstration data.
StableIDM: Stabilizing Inverse Dynamics Model against Manipulator Truncation via Spatio-Temporal Refinement cs.RO · 2026-04-20 · unverdicted · none · ref 50
StableIDM stabilizes inverse dynamics models under manipulator truncation by combining robot-centric masking, directional spatial feature aggregation, and temporal dynamics refinement, yielding 12.1% higher strict action accuracy on AgiBot and 9.7-17.6% gains in real-robot tasks.
SEVO: Semantic-Enhanced Virtual Observation for Robust VLA Manipulation via Active Illumination and Data-Centric Collection cs.RO · 2026-05-11 · conditional · none · ref 14
SEVO raises ACT and SmolVLA pick-and-place success from 30-35% to 75-85% in novel environments by using active illumination, semantic cues, and diversified teleoperation data.
Vision-Language-Action in Robotics: A Survey of Datasets, Benchmarks, and Data Engines cs.RO · 2026-04-24 · unverdicted · none · ref 29
A survey of VLA robotics research identifies data infrastructure as the primary bottleneck and distills four open challenges in representation alignment, multimodal supervision, reasoning assessment, and scalable data generation.

Roboengine: Plug-and-play robot data augmentation with semantic robot segmentation and background generation

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer