{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2021:V4JMSFK6F45TDOC6NPE4ETERC6","short_pith_number":"pith:V4JMSFK6","schema_version":"1.0","canonical_sha256":"af12c9155e2f3b31b85e6bc9c24c91178aa5ca4c79c1d5eb6e50a9faa3f6c61e","source":{"kind":"arxiv","id":"2112.11790","version":3},"attestation_state":"computed","paper":{"title":"BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"BEVDet detects 3D objects in bird-eye-view by reusing standard modules plus custom data augmentation and upgraded NMS.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Dalong Du, Guan Huang, Junjie Huang, Yun Ye, Zheng Zhu","submitted_at":"2021-12-22T10:48:06Z","abstract_excerpt":"Autonomous driving perceives its surroundings for decision making, which is one of the most complex scenarios in visual perception. The success of paradigm innovation in solving the 2D object detection task inspires us to seek an elegant, feasible, and scalable paradigm for fundamentally pushing the performance boundary in this area. To this end, we contribute the BEVDet paradigm in this paper. BEVDet performs 3D object detection in Bird-Eye-View (BEV), where most target values are defined and route planning can be handily performed. We merely reuse existing modules to build its framework but "},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"2112.11790","kind":"arxiv","version":3},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.CV","submitted_at":"2021-12-22T10:48:06Z","cross_cats_sorted":[],"title_canon_sha256":"22725f03a1294bed6ad437d3134bc6201141149ed621d202df7fb1a5cc5d1244","abstract_canon_sha256":"b4e975b47487059937fd96bd00e5f76acd9f85c939ce8e73d436f37c498caaa7"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:52.339703Z","signature_b64":"LFT1Zn+34Z86C+khX2rVTMiqZfxLfxLeLM/K6DdmAB+cMg4TFguJ9jK7+9T+pHJEqFynMtW1LFn8OMDEDaaYDw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"af12c9155e2f3b31b85e6bc9c24c91178aa5ca4c79c1d5eb6e50a9faa3f6c61e","last_reissued_at":"2026-05-17T23:38:52.339275Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:52.339275Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"BEVDet detects 3D objects in bird-eye-view by reusing standard modules plus custom data augmentation and upgraded NMS.","cross_cats":[],"primary_cat":"cs.CV","authors_text":"Dalong Du, Guan Huang, Junjie Huang, Yun Ye, Zheng Zhu","submitted_at":"2021-12-22T10:48:06Z","abstract_excerpt":"Autonomous driving perceives its surroundings for decision making, which is one of the most complex scenarios in visual perception. The success of paradigm innovation in solving the 2D object detection task inspires us to seek an elegant, feasible, and scalable paradigm for fundamentally pushing the performance boundary in this area. To this end, we contribute the BEVDet paradigm in this paper. BEVDet performs 3D object detection in Bird-Eye-View (BEV), where most target values are defined and route planning can be handily performed. We merely reuse existing modules to build its framework but "},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"BEVDet-Base scores 39.3% mAP and 47.2% NDS, significantly exceeding all published results. With a comparable inference speed, it surpasses FCOS3D by a large margin of +9.8% mAP and +10.0% NDS.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the exclusive data augmentation strategy and upgraded NMS will deliver consistent gains on unseen environments and datasets without introducing hidden biases or requiring extensive per-dataset retuning.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"BEVDet achieves 39.3% mAP and 47.2% NDS on nuScenes val set with a fast BEV-based multi-camera 3D detector that outperforms FCOS3D while using far less compute in its tiny variant.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"BEVDet detects 3D objects in bird-eye-view by reusing standard modules plus custom data augmentation and upgraded NMS.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"7f039aa23bd69fc8c6f26f89f615b96e98821587e2688cd70f8b22933107d0a9"},"source":{"id":"2112.11790","kind":"arxiv","version":3},"verdict":{"id":"31df25e2-e85a-4bb0-9d5e-9a7e308fabc7","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T14:30:46.098411Z","strongest_claim":"BEVDet-Base scores 39.3% mAP and 47.2% NDS, significantly exceeding all published results. With a comparable inference speed, it surpasses FCOS3D by a large margin of +9.8% mAP and +10.0% NDS.","one_line_summary":"BEVDet achieves 39.3% mAP and 47.2% NDS on nuScenes val set with a fast BEV-based multi-camera 3D detector that outperforms FCOS3D while using far less compute in its tiny variant.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the exclusive data augmentation strategy and upgraded NMS will deliver consistent gains on unseen environments and datasets without introducing hidden biases or requiring extensive per-dataset retuning.","pith_extraction_headline":"BEVDet detects 3D objects in bird-eye-view by reusing standard modules plus custom data augmentation and upgraded NMS."},"references":{"count":64,"sample":[{"doi":"","year":2020,"title":"In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","work_id":"5a67f25e-fce0-4d85-8451-c8ee4ad8134c","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2019,"title":"IEEE Transactions on Pattern Analysis and Machine In- telligence (2019)","work_id":"a26396a0-acef-4950-b032-fdc9317acc83","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2020,"title":"In: Proceedings of the European Conference on Computer Vision","work_id":"bfa3a195-e0d3-476f-b23f-1b565a759cfa","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2019,"title":"In: Pro- ceedings of the IEEE Conference on Computer Vision and Pattern Recognition","work_id":"ae30c326-2273-4efd-821e-e47c0576ae08","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2020,"title":"In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition","work_id":"d25c5d1d-9808-4a00-a6fa-cc651fbab0bd","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":64,"snapshot_sha256":"1a8911548b9b0163f488f0cd441744091db3de0e1b358641f145dcd4ad217208","internal_anchors":2},"formal_canon":{"evidence_count":2,"snapshot_sha256":"e0862ba810c8c89a1eefe7a283181f26f048cf1740e8d0176d76d3a2f9634ab4"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2112.11790","created_at":"2026-05-17T23:38:52.339348+00:00"},{"alias_kind":"arxiv_version","alias_value":"2112.11790v3","created_at":"2026-05-17T23:38:52.339348+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2112.11790","created_at":"2026-05-17T23:38:52.339348+00:00"},{"alias_kind":"pith_short_12","alias_value":"V4JMSFK6F45T","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_16","alias_value":"V4JMSFK6F45TDOC6","created_at":"2026-05-18T12:33:33.725879+00:00"},{"alias_kind":"pith_short_8","alias_value":"V4JMSFK6","created_at":"2026-05-18T12:33:33.725879+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":30,"internal_anchor_count":30,"sample":[{"citing_arxiv_id":"2512.08237","citing_title":"Fast-BEV++: Fast by Algorithm, Deployable by Design","ref_index":3,"is_internal_anchor":true},{"citing_arxiv_id":"2605.18059","citing_title":"Bench2Drive-Robust: Benchmarking Closed-Loop Autonomous Driving under Deployment Perturbations","ref_index":26,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16911","citing_title":"VGGT-Occ: Geometry-Grounded and Density-Aware Gated Fusion for 3D Occupancy Prediction","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2505.17732","citing_title":"RQR3D: Reparametrizing the regression targets for BEV-based 3D object detection","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2506.07002","citing_title":"BePo: Dual Representation for 3D Occupancy Prediction","ref_index":15,"is_internal_anchor":true},{"citing_arxiv_id":"2507.04503","citing_title":"U-ViLAR: Uncertainty-Aware Visual Localization for Autonomous Driving via Differentiable Association and Registration","ref_index":16,"is_internal_anchor":true},{"citing_arxiv_id":"2510.12796","citing_title":"DriveVLA-W0: World Models Amplify Data Scaling Law in Autonomous Driving","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2512.23421","citing_title":"DriveLaW:Unifying Planning and Video Generation in a Latent Driving World","ref_index":32,"is_internal_anchor":true},{"citing_arxiv_id":"2602.06400","citing_title":"TFusionOcc: T-Primitive Based Object-Centric Multi-Sensor Fusion Framework for 3D Occupancy Prediction","ref_index":38,"is_internal_anchor":true},{"citing_arxiv_id":"2603.01558","citing_title":"TopoMaskV3: 3D Mask Head with Dense Offset and Height Predictions for Road Topology Understanding","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2603.11566","citing_title":"R4Det: 4D Radar-Camera Fusion for High-Performance 3D Object Detection","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12743","citing_title":"Still Camouflage, Moving Illusion: View-Induced Trajectory Manipulation in Autonomous Driving","ref_index":17,"is_internal_anchor":true},{"citing_arxiv_id":"2604.00813","citing_title":"DVGT-2: Vision-Geometry-Action Model for Autonomous Driving at Scale","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2604.02930","citing_title":"BEVPredFormer: Spatio-temporal Attention for BEV Instance Prediction in Autonomous Driving","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12297","citing_title":"EgoEV-HandPose: Egocentric 3D Hand Pose Estimation and Gesture Recognition with Stereo Event Cameras","ref_index":46,"is_internal_anchor":true},{"citing_arxiv_id":"2605.11594","citing_title":"PointForward: Feedforward Driving Reconstruction through Point-Aligned Representations","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2605.05072","citing_title":"Height-Guided Projection Reparameterization for Camera-LiDAR Occupancy","ref_index":15,"is_internal_anchor":true},{"citing_arxiv_id":"2605.04355","citing_title":"InterFuserDVS: Event-Enhanced Sensor Fusion for Safe RL-Based Decision Making","ref_index":26,"is_internal_anchor":true},{"citing_arxiv_id":"2605.05072","citing_title":"Height-Guided Projection Reparameterization for Camera-LiDAR Occupancy","ref_index":15,"is_internal_anchor":true},{"citing_arxiv_id":"2605.01924","citing_title":"SimPB++: Simultaneously Detecting 2D and 3D Objects from Multiple Cameras","ref_index":33,"is_internal_anchor":true},{"citing_arxiv_id":"2604.17915","citing_title":"OneDrive: Unified Multi-Paradigm Driving with Vision-Language-Action Models","ref_index":21,"is_internal_anchor":true},{"citing_arxiv_id":"2604.12918","citing_title":"Radar-Camera BEV Multi-Task Learning with Cross-Task Attention Bridge for Joint 3D Detection and Segmentation","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"2604.08074","citing_title":"DinoRADE: Full Spectral Radar-Camera Fusion with Vision Foundation Model Features for Multi-class Object Detection in Adverse Weather","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2604.04797","citing_title":"Multi-Modal Sensor Fusion using Hybrid Attention for Autonomous Driving","ref_index":8,"is_internal_anchor":true},{"citing_arxiv_id":"2604.05449","citing_title":"Not All Agents Matter: From Global Attention Dilution to Risk-Prioritized Game Planning","ref_index":10,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":2,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6","json":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6.json","graph_json":"https://pith.science/api/pith-number/V4JMSFK6F45TDOC6NPE4ETERC6/graph.json","events_json":"https://pith.science/api/pith-number/V4JMSFK6F45TDOC6NPE4ETERC6/events.json","paper":"https://pith.science/paper/V4JMSFK6"},"agent_actions":{"view_html":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6","download_json":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6.json","view_paper":"https://pith.science/paper/V4JMSFK6","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2112.11790&json=true","fetch_graph":"https://pith.science/api/pith-number/V4JMSFK6F45TDOC6NPE4ETERC6/graph.json","fetch_events":"https://pith.science/api/pith-number/V4JMSFK6F45TDOC6NPE4ETERC6/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6/action/timestamp_anchor","attest_storage":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6/action/storage_attestation","attest_author":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6/action/author_attestation","sign_citation":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6/action/citation_signature","submit_replication":"https://pith.science/pith/V4JMSFK6F45TDOC6NPE4ETERC6/action/replication_record"}},"created_at":"2026-05-17T23:38:52.339348+00:00","updated_at":"2026-05-17T23:38:52.339348+00:00"}