arxiv: 2512.01204 · v4 · pith:54V2UTPVnew · submitted 2025-12-01 · 💻 cs.CV

TabletopGen: Tabletop Scene Generation and Interactive Simulation for Robotic Manipulation

classification 💻 cs.CV
keywords manipulation · data · scene · robotic · tabletopgen · generation · simulation · tabletop

Simulation provides a low-cost, scalable pathway to large-scale robotic manipulation data collection. However, existing 3D scene generation methods can rarely be applied directly to manipulation data synthesis, as their generated scenes often lack instance-level interactivity and physical plausibility. Focusing on tabletop manipulation, we propose TabletopGen, a training-free and automated tabletop scene generation and interactive simulation engine. Starting from text or a single image, we first obtain independent 3D object models via generative instance extraction. Second, we introduce a novel pose and scale alignment approach that recovers a collision-free scene layout using a Differentiable Rotation Optimizer and a Top-View Spatial Alignment mechanism. Finally, we assemble the generated scene in a physics simulator with collision geometry, yielding a stable, interactable environment for synthesizing multimodal manipulation data. Extensive experiments and user studies demonstrate that TabletopGen achieves state-of-the-art performance in visual fidelity, layout accuracy, and physical plausibility. Furthermore, we validate the executability of the collected trajectories on a real robotic arm via zero-shot real-to-sim-to-real policy transfer, indicating that TabletopGen can serve as a reliable data engine for robotic manipulation data synthesis.
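The abstract names a collision-free layout as the goal of the alignment stage but gives no algorithmic detail. As a rough illustration of what "collision-free in the top view" can mean, the sketch below separates overlapping axis-aligned footprints on the table plane. It is not the paper's Differentiable Rotation Optimizer or Top-View Spatial Alignment mechanism; the `Footprint` type, the `overlap` and `resolve_collisions` helpers, and the push-apart heuristic are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Footprint:
    """Axis-aligned top-view footprint of one object (hypothetical type)."""
    name: str
    cx: float  # centre x on the table plane, in metres
    cy: float  # centre y on the table plane, in metres
    w: float   # extent along x
    d: float   # extent along y


def overlap(a: Footprint, b: Footprint) -> tuple[float, float]:
    """Return penetration depths (ox, oy); both positive means the boxes overlap."""
    ox = (a.w + b.w) / 2 - abs(a.cx - b.cx)
    oy = (a.d + b.d) / 2 - abs(a.cy - b.cy)
    return ox, oy


def resolve_collisions(items: list[Footprint],
                       max_iters: int = 100,
                       margin: float = 1e-3) -> bool:
    """Push overlapping footprints apart along the axis of least penetration.

    Returns True once a full pass finds no overlaps, False if max_iters
    passes were not enough (a simple heuristic, not guaranteed to converge).
    """
    for _ in range(max_iters):
        moved = False
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                a, b = items[i], items[j]
                ox, oy = overlap(a, b)
                if ox <= 0 or oy <= 0:
                    continue  # separated along at least one axis
                moved = True
                if ox <= oy:
                    # Split the correction evenly between the two objects.
                    shift = (ox + margin) / 2
                    sign = 1.0 if a.cx <= b.cx else -1.0
                    a.cx -= sign * shift
                    b.cx += sign * shift
                else:
                    shift = (oy + margin) / 2
                    sign = 1.0 if a.cy <= b.cy else -1.0
                    a.cy -= sign * shift
                    b.cy += sign * shift
        if not moved:
            return True
    return False


if __name__ == "__main__":
    scene = [
        Footprint("mug", 0.00, 0.0, 0.10, 0.10),
        Footprint("plate", 0.05, 0.0, 0.20, 0.20),
    ]
    print(resolve_collisions(scene), scene)
```

A real pipeline would work with oriented or mesh-level collision geometry and would also keep objects inside the table bounds; the axis-aligned boxes here only keep the example short.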

Forward citations

Cited by 9 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. One Video, One World: Turning Monocular Video into Physical 4D Scenes

    cs.CV 2026-06 unverdicted novelty 8.0

    OVOW reconstructs instance-level, simulation-ready 4D mesh scenes from monocular video via a four-stage training-free pipeline and introduces a new benchmark for structured Video-to-4D evaluation.

  2. REST3D: Reconstructing Physically Stable 3D Scenes from a Single Image

    cs.CV 2026-05 unverdicted novelty 7.0

    REST3D reconstructs physically stable 3D scenes from single images via agentic scene-tree understanding and physics-constrained optimization.

  3. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 accept novelty 7.0

    3D generation for embodied AI is shifting from visual realism toward interaction readiness, organized into data generation, simulation environments, and sim-to-real bridging roles.

  4. Perceive-then-Plan: Layout-as-Policy for Monocular 3D Scene Layout Estimation

    cs.CV 2026-05 unverdicted novelty 6.0

    Introduces Layout-as-Policy (LaP) to turn 3D layout estimation into an iterative policy-learning refinement process for better physical coherence.

  5. STABLE: Simulation-Ready Tabletop Layout Generation via a Semantics-Physics Dual System

    cs.CV 2026-05 unverdicted novelty 6.0

    STABLE generates simulation-ready tabletop scenes by alternating a semantic LLM reasoner for task-aligned coarse layouts with a physics corrector for physical plausibility using progressive scene expansion.

  6. V-CAGE: Vision-Closed-Loop Agentic Generation Engine for Robotic Manipulation

    cs.RO 2026-04 unverdicted novelty 6.0

    V-CAGE automates the creation of scalable, high-quality robotic manipulation datasets through context-aware scene construction, closed-loop visual verification, and perceptually-driven compression.

  7. WorldAct: Activating Monolithic 3D Worlds into Interactive-Ready Object-Centric Scenes

    cs.CV 2026-05 unverdicted novelty 5.0

    WorldAct activates monolithic 3D worlds into interactive scenes via multimodal agent-guided decomposition, geometrically aligned mesh reconstruction, and 3D inpainting.

  8. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 unverdicted novelty 3.0

    The survey organizes 3D generation for embodied AI into data generators for assets, simulation environments for interaction, and sim-to-real bridges, noting a shift toward interaction readiness and listing bottlenecks...

  9. 3D Generation for Embodied AI and Robotic Simulation: A Survey

    cs.RO 2026-04 unverdicted novelty 2.0

    The paper surveys 3D generation techniques for embodied AI and robotics, categorizing them into data generation, simulation environments, and sim-to-real bridging while identifying bottlenecks in physical validity and...