
arxiv: 2503.03464 · v2 · pith:Z2772OFMnew · submitted 2025-03-05 · 💻 cs.RO

Generative Artificial Intelligence in Robotic Manipulation: A Survey

classification 💻 cs.RO
keywords data, generation, models, challenges, generative, layer, robotic, survey

This survey provides a comprehensive review of recent advances in generative learning models for robotic manipulation, addressing key challenges in the field. Robotic manipulation faces critical bottlenecks: insufficient data and inefficient data acquisition, long-horizon and complex task planning, and the multi-modal reasoning needed for robust policy learning across diverse environments. To tackle these challenges, this survey introduces several generative model paradigms, including Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), diffusion models, probabilistic flow models, and autoregressive models, highlighting their strengths and limitations. The applications of these models are categorized into three hierarchical layers: the Foundation Layer, focusing on data generation and reward generation; the Intermediate Layer, covering language, code, visual, and state generation; and the Policy Layer, emphasizing grasp generation and trajectory generation. Each layer is explored in detail, along with notable works that have advanced the state of the art. Finally, the survey outlines future research directions and challenges, emphasizing the need for more efficient data utilization, better handling of long-horizon tasks, and stronger generalization across diverse robotic scenarios. All related resources, including research papers, open-source data, and projects, are collected for the community at https://github.com/GAI4Manipulation/AwesomeGAIManipulation

This paper has not been read by Pith yet.

Forward citations

Cited by 10 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 accept novelty 7.0

    3D generation for embodied AI is shifting from visual realism toward interaction readiness, organized into data generation, simulation environments, and sim-to-real bridging roles.

  2. Off the Rails: Hijacking the Scoring Head in Generative End-to-End Driving Planners with Safety-Violating Adversarial Perturbations

    cs.RO 2026-06 unverdicted novelty 6.0

    Derail adversarial perturbations hijack the scoring head in generative E2E driving planners, flipping safe to unsafe trajectory selection with 39-80% score drops and up to 50% collision rates.

  3. From Reaction to Anticipation: Proactive Failure Recovery through Agentic Task Graph for Robotic Manipulation

    cs.RO 2026-05 unverdicted novelty 6.0

    AgentChord models manipulation tasks as directed graphs enriched with anticipatory recovery branches, using specialized agents to enable immediate, low-latency failure responses and improve success on long-horizon bim...

  4. PhyRoGen: Synthetic Generation of Physical Robot Manipulation Puzzles Using Procedural Content Generation

    cs.RO 2026-06 unverdicted novelty 5.0

    PhyRoGen uses procedural content generation to create 24 solvable physical robot manipulation puzzles with interlocking dependencies, shown solvable by sampling-based planners and a KUKA robot in simulation.

  5. VLAMotor: Test-Guided Enhancement of Vision-Language-Action Models via Agent-Based Data Synthesis

    cs.RO 2026-05 unverdicted novelty 5.0

    VLAMotor exposes VLA failures via distance-aware uncertainty testing and synthesizes agent-planned repair data to fine-tune models, reporting 49.25% success rate gains in simulation and 57.5% on hardware.

  6. EmbodiedClaw: Conversational Workflow Execution for Embodied AI Development

    cs.RO 2026-04 unverdicted novelty 5.0

    EmbodiedClaw automates embodied AI development workflows through conversation, reducing manual effort and improving consistency and reproducibility.

  7. Large VLM-based Vision-Language-Action Models for Robotic Manipulation: A Survey

    cs.RO 2025-08 unverdicted novelty 5.0

    This survey organizes large VLM-based VLA models for robotic manipulation into monolithic and hierarchical paradigms, reviews their integrations and datasets, and outlines future directions.

  8. Genie Sim PanoRecon: Fast Immersive Scene Generation from Single-View Panorama

    cs.RO 2026-04 unverdicted novelty 4.0

    A feed-forward Gaussian-splatting system reconstructs photo-realistic 3D scenes from single-view panoramas in seconds via cube-map decomposition and depth-aware fusion for robotic simulation use.

  9. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 unverdicted novelty 3.0

    The survey organizes 3D generation for embodied AI into data generators for assets, simulation environments for interaction, and sim-to-real bridges, noting a shift toward interaction readiness and listing bottlenecks...

  10. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 unverdicted novelty 2.0

    The paper surveys 3D generation techniques for embodied AI and robotics, categorizing them into data generation, simulation environments, and sim-to-real bridging while identifying bottlenecks in physical validity and...