In arXiv preprint arXiv:2203.17054

BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection · 2022 · arXiv 2203.17054

10 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

Learning Ego-Centric BEV Representations from a Perspective-Privileged View: Cross-View Supervision for Online HD Map Construction

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

Cross-View Supervision transfers geometric and topological priors from ego-aligned overhead perspectives into camera-based BEV encoders via feature-space alignment, yielding up to 44% relative mAP gains at long range on nuScenes.

SimPB++: Simultaneously Detecting 2D and 3D Objects from Multiple Cameras

cs.CV · 2026-05-03 · unverdicted · novelty 6.0

SimPB++ unifies multi-view 2D perspective and 3D BEV object detection in one model via an interactive hybrid decoder, reporting state-of-the-art results on nuScenes and long-range detection up to 150 m on Argoverse2.

CAM3DNet: Comprehensively mining the multi-scale features for 3D Object Detection with Multi-View Cameras

cs.CV · 2026-04-18 · unverdicted · novelty 6.0

CAM3DNet outperforms prior camera-based 3D detectors on nuScenes, Waymo and Argoverse by using three new modules to better mine multi-scale spatiotemporal features from 2D queries and pyramid maps.

RQR3D: Reparametrizing the regression targets for BEV-based 3D object detection

cs.CV · 2025-05-23 · unverdicted · novelty 6.0

RQR3D reparametrizes oriented bounding box regression in BEV 3D detection as regressing a horizontal box plus corner offsets and achieves SOTA camera-radar performance on nuScenes with 67.5 NDS and 59.7 mAP.

SemLT3D: Semantic-Guided Expert Distillation for Camera-only Long-Tailed 3D Object Detection

cs.CV · 2026-04-20 · unverdicted · novelty 5.0

SemLT3D introduces semantic-guided expert distillation with a language MoE module and CLIP projection to enrich features for long-tailed classes in camera-only 3D detection.

Revisiting Token Compression for Accelerating ViT-based Sparse Multi-View 3D Object Detectors

cs.CV · 2026-04-16 · conditional · novelty 5.0

SEPatch3D makes ViT-based 3D object detectors up to 57% faster than StreamPETR via dynamic patch sizing and cross-granularity enhancement while keeping comparable accuracy on nuScenes and Argoverse 2.

Not All Agents Matter: From Global Attention Dilution to Risk-Prioritized Game Planning

cs.CV · 2026-04-07 · unverdicted · novelty 5.0

GameAD models autonomous driving as a risk-prioritized game among agents via Risk-Aware Topology Anchoring, Minimax Risk-Aware Sparse Attention and related components, yielding safer trajectories than prior end-to-end methods on nuScenes and Bench2Drive.

Multi-Modal Sensor Fusion using Hybrid Attention for Autonomous Driving

cs.CV · 2026-04-06 · unverdicted · novelty 5.0

MMF-BEV fuses camera and radar branches with deformable self- and cross-attention, outperforming unimodal baselines on the VoD 4D radar dataset through a two-stage training process.

BEVPredFormer: Spatio-temporal Attention for BEV Instance Prediction in Autonomous Driving

cs.CV · 2026-04-03 · unverdicted · novelty 5.0

BEVPredFormer uses attention-based temporal processing and 3D camera projection to match or exceed prior methods on nuScenes for BEV instance prediction.

Fast-BEV++: Fast by Algorithm, Deployable by Design

cs.CV · 2025-12-09 · unverdicted · novelty 5.0

Fast-BEV++ achieves at least 3x speedup over Fast-BEV, a new SOTA of 0.488 NDS on nuScenes 3D detection, and over 134 FPS inference by redesigning the core transformation pipeline and adding a learnable depth module.

citing papers explorer

Showing 10 of 10 citing papers.

Learning Ego-Centric BEV Representations from a Perspective-Privileged View: Cross-View Supervision for Online HD Map Construction
cs.CV · 2026-05-12 · unverdicted · none · ref 12

SimPB++: Simultaneously Detecting 2D and 3D Objects from Multiple Cameras
cs.CV · 2026-05-03 · unverdicted · none · ref 34

CAM3DNet: Comprehensively mining the multi-scale features for 3D Object Detection with Multi-View Cameras
cs.CV · 2026-04-18 · unverdicted · none · ref 6

RQR3D: Reparametrizing the regression targets for BEV-based 3D object detection
cs.CV · 2025-05-23 · unverdicted · none · ref 14

SemLT3D: Semantic-Guided Expert Distillation for Camera-only Long-Tailed 3D Object Detection
cs.CV · 2026-04-20 · unverdicted · none · ref 14

Revisiting Token Compression for Accelerating ViT-based Sparse Multi-View 3D Object Detectors
cs.CV · 2026-04-16 · conditional · none · ref 15

Not All Agents Matter: From Global Attention Dilution to Risk-Prioritized Game Planning
cs.CV · 2026-04-07 · unverdicted · none · ref 9

Multi-Modal Sensor Fusion using Hybrid Attention for Autonomous Driving
cs.CV · 2026-04-06 · unverdicted · none · ref 9

BEVPredFormer: Spatio-temporal Attention for BEV Instance Prediction in Autonomous Driving
cs.CV · 2026-04-03 · unverdicted · none · ref 12

Fast-BEV++: Fast by Algorithm, Deployable by Design
cs.CV · 2025-12-09 · unverdicted · none · ref 2
