SToRe3D delivers up to 3x faster inference for multi-view 3D object detection in ViTs by selecting relevant 2D tokens and 3D queries via mutual relevance heads with only marginal accuracy loss.
First, our evaluation is restricted to the nuScenes dataset and camera-only multi-view 3D detection
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
SToRe3D: Sparse Token Relevance in ViTs for Efficient Multi-View 3D Object Detection
SToRe3D delivers up to 3x faster inference for multi-view 3D object detection in ViTs by selecting relevant 2D tokens and 3D queries via mutual relevance heads with only marginal accuracy loss.