ArchSIBench is a new benchmark dataset and evaluation suite that measures vision-language models on architectural spatial intelligence across 17 subtasks, showing most models lag human baselines especially in transformation and configuration.
Spatialgen: Layout-guided 3d indoor scene generation.arXiv preprint arXiv:2509.14981, 3
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 5verdicts
UNVERDICTED 5roles
background 2polarities
background 2representative citing papers
VoxScene is a new anchor-conditioned voxel diffusion model that synthesizes collision-free 3D indoor scene arrangements via discrete volumetric occupancies and uses the grids for asset retrieval.
H-OmniStereo trains a stereo matcher on 2.8 million synthetic equirectangular pairs and adds a heading-aligned normal prior to improve zero-shot accuracy and generalization on out-of-domain and real omnidirectional data.
Rein3D generates photorealistic, globally consistent 3D indoor scenes by using a restore-and-refine process where radial panoramic videos are restored via diffusion models and then used to update a 3D Gaussian field.
DataEvolver introduces a reusable framework with generation-time self-correction and validation-time self-expansion loops that improves visual datasets, shown to outperform baselines on an object-rotation task.
citing papers explorer
-
ArchSIBench: Benchmarking the Architectural Spatial Intelligence of Vision-Language Models
ArchSIBench is a new benchmark dataset and evaluation suite that measures vision-language models on architectural spatial intelligence across 17 subtasks, showing most models lag human baselines especially in transformation and configuration.
-
VoxScene: Anchor-Conditioned Voxel Diffusion for Indoor Scene Arrangement
VoxScene is a new anchor-conditioned voxel diffusion model that synthesizes collision-free 3D indoor scene arrangements via discrete volumetric occupancies and uses the grids for asset retrieval.
-
H-OmniStereo: Zero-Shot Omnidirectional Stereo Matching with Heading-Aligned Normal Priors
H-OmniStereo trains a stereo matcher on 2.8 million synthetic equirectangular pairs and adds a heading-aligned normal prior to improve zero-shot accuracy and generalization on out-of-domain and real omnidirectional data.
-
Rein3D: Reinforced 3D Indoor Scene Generation with Panoramic Video Diffusion Models
Rein3D generates photorealistic, globally consistent 3D indoor scenes by using a restore-and-refine process where radial panoramic videos are restored via diffusion models and then used to update a 3D Gaussian field.
-
DataEvolver: Let Your Data Build and Improve Itself via Goal-Driven Loop Agents
DataEvolver introduces a reusable framework with generation-time self-correction and validation-time self-expansion loops that improves visual datasets, shown to outperform baselines on an object-rotation task.