UniOcc: Unifying Vision-Centric 3D Occupancy Prediction with Geometric and Semantic Rendering

Jiaming Liu; Kuiyuan Yang; Li Liu; Longlong Wang; Mingjie Pan; Peixiang Huang; Shanghang Zhang; Shaoqing Xu; Zhiyi Lai

arxiv: 2306.09117 · v1 · pith:U2JQRV47new · submitted 2023-06-15 · 💻 cs.CV · cs.AI

Mingjie Pan, Li Liu, Jiaming Liu, Peixiang Huang, Longlong Wang, Shanghang Zhang, Shaoqing Xu, Zhiyi Lai, Kuiyuan Yang



keywords: occupancy, prediction, semantic, uniocc, challenge, fine-grained, labels, method


In this technical report, we present our solution, named UniOcc, for the Vision-Centric 3D occupancy prediction track of the nuScenes Open Dataset Challenge at CVPR 2023. Existing occupancy prediction methods primarily optimize projected features in 3D volume space using 3D occupancy labels. However, generating these labels is complex and expensive (it relies on 3D semantic annotations), and because the labels are limited by voxel resolution, they cannot provide fine-grained spatial semantics. To address this limitation, we propose a novel Unifying Occupancy (UniOcc) prediction method that explicitly imposes a spatial geometry constraint and adds fine-grained semantic supervision through volume ray rendering. Our method significantly enhances model performance and demonstrates promising potential for reducing human annotation costs. Given the laborious nature of annotating 3D occupancy, we further introduce a Depth-aware Teacher Student (DTS) framework that improves prediction accuracy using unlabeled data. Our solution achieves 51.27% mIoU on the official leaderboard with a single model, placing 3rd in the challenge.
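To make the idea of supervision through volume ray rendering concrete, here is a minimal sketch of standard NeRF-style compositing along a single camera ray. It is an illustration under stated assumptions, not the report's implementation: the function name `render_ray`, its arguments, and the choice of sample spacing are all hypothetical, and the abstract does not specify the exact formulation UniOcc uses.

```python
import math
from typing import List, Sequence, Tuple


def render_ray(
    densities: Sequence[float],
    depths: Sequence[float],
    semantics: Sequence[Sequence[float]],
) -> Tuple[float, List[float], List[float]]:
    """Composite per-sample density and class scores along one ray.

    Hypothetical helper: standard volume-rendering weights, assumed here
    for illustration rather than taken from the UniOcc report.
    """
    n = len(densities)
    num_classes = len(semantics[0])
    transmittance = 1.0  # probability the ray reaches the current sample
    weights: List[float] = []
    for i in range(n):
        # Spacing to the next sample; the last sample reuses the previous spacing.
        if i + 1 < n:
            delta = depths[i + 1] - depths[i]
        else:
            delta = depths[i] - depths[i - 1] if n > 1 else 1.0
        # Opacity of this sample, then its contribution weight.
        alpha = 1.0 - math.exp(-densities[i] * delta)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    # Expected depth and expected class scores under the ray weights.
    rendered_depth = sum(w * t for w, t in zip(weights, depths))
    rendered_semantics = [
        sum(w * s[c] for w, s in zip(weights, semantics))
        for c in range(num_classes)
    ]
    return rendered_depth, rendered_semantics, weights


# A ray whose third sample is effectively opaque: the rendered depth and
# class scores are dominated by that sample.
depth, sem, w = render_ray(
    densities=[0.0, 0.0, 100.0, 0.0],
    depths=[1.0, 2.0, 3.0, 4.0],
    semantics=[[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
)
```

In a rendering-supervised setup of this kind, the rendered depth and class scores would typically be compared against 2D depth and semantic targets, which gives per-pixel supervision finer than the voxel grid; how the report weights or combines such losses is not stated in the abstract.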



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Monocular 3D Occupancy Perception for Robots on Sidewalks via Hybrid 2D-3D Learning
cs.RO · 2026-06 · unverdicted · novelty 6.0

WalkOCC bootstraps pseudo 3D occupancy labels from paired LiDAR-RGB sequences and jointly trains on unpaired monocular images for sidewalk robots, plus introduces the Sidewalk3D dataset.