Learning to Infer and Execute 3D Shape Programs

Andrew Luo; Jiajun Wu; Joshua B. Tenenbaum; Kevin Ellis; William T. Freeman; Xingyuan Sun; Yonglong Tian

arxiv: 1901.02875 · v3 · pith:4U62HILWnew · submitted 2019-01-09 · 💻 cs.CV · cs.AI · cs.GR · cs.LG

Yonglong Tian, Andrew Luo, Xingyuan Sun, Kevin Ellis, William T. Freeman, Joshua B. Tenenbaum, Jiajun Wu

classification 💻 cs.CV · cs.AI · cs.GR · cs.LG

keywords shape · programs · shapes · infer · execute · geometry · higher-level · low-level

Human perception of 3D shapes goes beyond reconstructing them as a set of points or a composition of geometric primitives: we also effortlessly understand higher-level shape structure such as the repetition and reflective symmetry of object parts. In contrast, recent advances in 3D shape sensing focus more on low-level geometry but less on these higher-level relationships. In this paper, we propose 3D shape programs, integrating bottom-up recognition systems with top-down, symbolic program structure to capture both low-level geometry and high-level structural priors for 3D shapes. Because there are no annotations of shape programs for real shapes, we develop neural modules that not only learn to infer 3D shape programs from raw, unannotated shapes, but also to execute these programs for shape reconstruction. After initial bootstrapping, our end-to-end differentiable model learns 3D shape programs by reconstructing shapes in a self-supervised manner. Experiments demonstrate that our model accurately infers and executes 3D shape programs for highly complex shapes from various categories. It can also be integrated with an image-to-shape module to infer 3D shape programs directly from an RGB image, leading to 3D shape reconstructions that are both more accurate and more physically plausible.
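To make the idea of "executing" a shape program concrete, here is a minimal sketch in Python. It is an illustrative toy, not the paper's DSL or its neural executor: the statement types `Cuboid` and `Repeat`, the `execute` function, and the 8-voxel table are all hypothetical names and values chosen for this example. It only shows how a short symbolic program with repetition can expand into low-level voxel geometry.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple, Union

Voxel = Tuple[int, int, int]


@dataclass
class Cuboid:
    """Hypothetical draw statement: box with corner (x, y, z) and size (w, h, d)."""
    x: int
    y: int
    z: int
    w: int
    h: int
    d: int


@dataclass
class Repeat:
    """Hypothetical loop statement: run `body` `times` times, shifted by `step` each time."""
    times: int
    step: Voxel
    body: List["Statement"]


Statement = Union[Cuboid, Repeat]


def execute(program: List[Statement], offset: Voxel = (0, 0, 0)) -> Set[Voxel]:
    """Expand a program into the set of occupied voxel coordinates."""
    filled: Set[Voxel] = set()
    ox, oy, oz = offset
    for stmt in program:
        if isinstance(stmt, Cuboid):
            # Fill every voxel inside the box, shifted by the current offset.
            for i in range(stmt.w):
                for j in range(stmt.h):
                    for k in range(stmt.d):
                        filled.add((ox + stmt.x + i, oy + stmt.y + j, oz + stmt.z + k))
        elif isinstance(stmt, Repeat):
            # Re-run the loop body with an offset that grows on each iteration.
            dx, dy, dz = stmt.step
            for n in range(stmt.times):
                filled |= execute(stmt.body, (ox + n * dx, oy + n * dy, oz + n * dz))
    return filled


# A toy table: one top slab plus four legs written as two nested loops.
table_program: List[Statement] = [
    Cuboid(0, 4, 0, 8, 1, 8),                      # table top
    Repeat(2, (6, 0, 0), [                         # repeat along x
        Repeat(2, (0, 0, 6), [                     # repeat along z
            Cuboid(0, 0, 0, 2, 4, 2),              # a single leg
        ]),
    ]),
]

voxels = execute(table_program)
print(len(voxels))
```

The four legs are never listed individually; the nested `Repeat` statements encode their regular layout, which is the kind of higher-level structure the abstract contrasts with a flat set of points or primitives.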

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Compositionality and the lexicon in evolutionary semantics
cs.CL 2026-06 unverdicted novelty 7.0

A co-evolutionary model of lexical meanings and composition under simplicity and accuracy pressures shows conservativity in quantifiers emerges as an efficient abstraction on the Pareto frontier.

Text-Driven 3D Indoor Scene Synthesis in Non-Manhattan Environments
cs.AI 2026-07 unverdicted novelty 3.0

SPG-Layout combines statistical object priors with hierarchical large-object-first placement to produce physically plausible text-driven 3D scenes in non-Manhattan rooms and outperforms baselines on a new 500-scene benchmark.