QuadricFormer: Scene as Superquadrics for 3D Semantic Occupancy Prediction
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (4)
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
representative citing papers
Diffusion for 3D shapes is moved from dense geometry to compact superquadric parameter sets, cutting state size to roughly 7 KB per shape and enabling faster generation plus new editing capabilities.
DVGT-2 is a streaming vision-geometry-action model that jointly reconstructs dense 3D geometry and plans trajectories online, achieving better reconstruction than prior batch methods while transferring directly to planning benchmarks without fine-tuning.
TFusionOcc uses a family of Student's t-distribution T-primitives and a T-mixture model for multi-sensor 3D occupancy prediction, reporting state-of-the-art results on nuScenes.
citing papers explorer
- VISA: VLM-Guided Instance Semantic Auditing for 3D Occupancy World Models
  VISA improves closed-set 3D occupancy mIoU on nuScenes by using VLM instance audits as reliability-weighted semantic supervisors during training of existing world models.
- Rethinking 3D Shape Generation: Diffusion over Superquadrics
  Diffusion for 3D shapes is moved from dense geometry to compact superquadric parameter sets, cutting state size to roughly 7 KB per shape and enabling faster generation plus new editing capabilities.
- DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale
  DVGT-2 is a streaming vision-geometry-action model that jointly reconstructs dense 3D geometry and plans trajectories online, achieving better reconstruction than prior batch methods while transferring directly to planning benchmarks without fine-tuning.
- TFusionOcc: T-Primitive Based Object-Centric Multi-Sensor Fusion Framework for 3D Occupancy Prediction
  TFusionOcc uses a family of Student's t-distribution T-primitives and a T-mixture model for multi-sensor 3D occupancy prediction, reporting state-of-the-art results on nuScenes.