arXiv: 2503.04496 · v2 · submitted 2025-03-06 · 💻 cs.GR · cs.CV · cs.LG

Learning to Place Objects with Programs and Iterative Self Training

classification 💻 cs.GR · cs.CV · cs.LG
keywords scene · programs · system · object · placement · introduce · location · performance

In this work we study indoor scene object placement. Given a 3D indoor scene and an object, the task is to predict placement locations within the scene. Empirical observations of data-driven approaches to the problem show their tendency to miss placement modes. We introduce a system which helps to address this flaw. We design a Domain Specific Language (DSL) that specifies object relational constraints. Upon execution, programs from our language predict possible placements from a partial scene and object. We design a generative model which writes these programs automatically. Available 3D scene datasets do not contain programs to train on, and naively extracted programs only predict the original placement location of scene objects. Training on these programs results in subpar performance so we introduce a new program bootstrapping algorithm that improves our system's performance compared to the naive approach. To quantify our qualitative observations, we introduce a new evaluation procedure which captures how well a system models per-object location distributions. We ask human annotators to label all the possible places an object can go in a scene and compare this set against locations produced by the system in question. Our system produces per-object location distributions more consistent with human annotators than those produced by existing data-driven approaches and a zero-shot approach using an LLM. While other systems degrade in performance when training data is sparse, our system does not degrade to the same degree.
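To make the two central ideas of the abstract concrete, here is a minimal sketch in Python. It is not the paper's DSL or its evaluation metric: the grid representation, the `adjacent_to` constraint, the `execute` function, and the precision/recall comparison are all assumptions chosen for illustration. It shows how a program built from relational constraints can return several valid placements at once, and how such a set can be compared against a set of human-annotated locations.

```python
from typing import Callable, List, Set, Tuple

Cell = Tuple[int, int]  # (x, y) index on a hypothetical floor grid
Constraint = Callable[[Cell], bool]


def adjacent_to(anchor: Cell) -> Constraint:
    """Hypothetical relational constraint: the cell shares an edge with the anchor."""
    ax, ay = anchor
    return lambda c: abs(c[0] - ax) + abs(c[1] - ay) == 1


def execute(program: List[Constraint], free_cells: Set[Cell]) -> Set[Cell]:
    """Run a toy 'program' (a list of constraints) over the free cells of a scene.

    Returns every free cell that satisfies all constraints, so one program
    can cover several placement modes instead of a single location.
    """
    return {c for c in free_cells if all(k(c) for k in program)}


def precision_recall(predicted: Set[Cell], annotated: Set[Cell]) -> Tuple[float, float]:
    """Assumed comparison: set overlap between predictions and human labels."""
    hits = len(predicted & annotated)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(annotated) if annotated else 0.0
    return precision, recall


# Toy 3x3 room with a bed at (1, 1); every other cell is free.
bed = (1, 1)
free = {(x, y) for x in range(3) for y in range(3)} - {bed}

# Program: "place the object next to the bed".
predicted = execute([adjacent_to(bed)], free)

# Made-up human labels: either side of the bed.
annotated = {(0, 1), (2, 1)}
p, r = precision_recall(predicted, annotated)
```

In this toy scene the program proposes all four cells around the bed, so it recovers both annotated locations while also proposing two that the annotators did not mark. A model that only reproduces the single original placement of an object would miss at least one of the annotated modes, which is the failure the abstract describes.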

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SDesc3D: Towards Layout-Aware 3D Indoor Scene Generation from Short Descriptions

    cs.CV · 2026-04 · unverdicted · novelty 7.0

    SDesc3D produces more plausible 3D indoor scenes from short texts by augmenting inputs with multi-view structural priors, functionality-aware grounding, and iterative self-rectification.