Learning Adaptive Pseudo-Label Selection for Semi-Supervised 3D Object Detection
Pith reviewed 2026-05-18 11:48 UTC · model grok-4.3
The pith
Two networks trained on pseudo-label alignment with ground truth enable automatic adaptive selection of high-quality labels for semi-supervised 3D object detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a learnable pseudo-labeling module can be supervised by the geometric alignment of pseudo-labels to ground-truth boxes. The module is built from a quality-assessment network that performs score fusion and a threshold network that produces context-adaptive decisions. Combined with soft supervision that prioritizes cleaner labels, it selects high-precision pseudo-labels while preserving wider contextual coverage and higher recall than fixed-threshold or earlier dynamic methods.
What carries the argument
A learnable pseudo-labeling module containing a quality-assessment network and a context-adaptive threshold network, both supervised by the alignment between pseudo-labels and ground-truth bounding boxes.
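The selection step this module performs can be pictured with a minimal sketch. It assumes the quality-assessment network has already produced one fused score per pseudo-label and the threshold network one threshold per pseudo-label; the function name `select_pseudo_labels` and all numbers are hypothetical, and the paper's networks are not reproduced here.

```python
import numpy as np

def select_pseudo_labels(quality, thresholds):
    """Keep each pseudo-label whose fused quality score reaches its own threshold."""
    quality = np.asarray(quality, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    return quality >= thresholds

# Toy values, invented for illustration only.
quality = np.array([0.82, 0.55, 0.40, 0.71])        # stand-in for fused quality scores
adaptive_thr = np.array([0.70, 0.50, 0.45, 0.75])   # stand-in for per-object thresholds
fixed_thr = np.full_like(quality, 0.60)             # a single hand-set cutoff

adaptive_mask = select_pseudo_labels(quality, adaptive_thr)
fixed_mask = select_pseudo_labels(quality, fixed_thr)
print(adaptive_mask, fixed_mask)
```

The only difference between the two calls is whether the threshold varies per object, which is the property the review attributes to the context-adaptive network.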
If this is right
- The method produces higher overall performance than prior SS3DOD approaches on the KITTI and Waymo benchmarks.
- Selected pseudo-labels achieve high precision together with broader coverage of contexts and improved recall rates.
- Soft supervision lets the student network focus training on cleaner labels even when some pseudo-label noise remains.
- Context-aware thresholds replace the need for hand-tuned confidence cutoffs.
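One plausible reading of soft supervision is a per-label weighting of the student loss. The sketch below is an assumption, not the paper's formulation (which this page does not specify): it normalizes a weighted sum of per-label losses, with hypothetical weights standing in for pseudo-label quality.

```python
import numpy as np

def soft_weighted_loss(per_label_loss, weights, eps=1e-8):
    """Weighted mean of per-label losses; larger weights mean more trusted labels."""
    per_label_loss = np.asarray(per_label_loss, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * per_label_loss) / (np.sum(weights) + eps))

# Toy values, invented for illustration only.
losses = np.array([0.2, 1.5, 0.4])    # the second label is noisy and has a large loss
weights = np.array([0.9, 0.1, 0.8])   # hypothetical quality-derived weights

soft = soft_weighted_loss(losses, weights)
plain = float(losses.mean())
print(soft, plain)
```

In this toy case the noisy label contributes little to the weighted loss, so the weighted value falls below the plain mean.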
Where Pith is reading between the lines
- The same alignment-based supervision could be tested on other pseudo-labeling pipelines outside 3D detection.
- Removing manual threshold search may simplify scaling to new unlabeled datasets or sensor setups.
- The two-network design might be combined with different teacher-student architectures to check for further gains.
Load-bearing premise
Alignment of pseudo-labels with ground-truth bounding boxes supplies reliable and sufficient supervision for the quality-assessment and threshold networks without introducing harmful bias.
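To make the alignment signal concrete, the following sketch computes intersection-over-union between a pseudo-label box and a ground-truth box. It is a simplification under stated assumptions: boxes are axis-aligned and given as (x_min, y_min, z_min, x_max, y_max, z_max), whereas 3D detection benchmarks normally use rotated boxes, and the function name is hypothetical.

```python
import numpy as np

def iou_3d_axis_aligned(box_a, box_b):
    """IoU of two axis-aligned 3D boxes (x_min, y_min, z_min, x_max, y_max, z_max)."""
    a = np.asarray(box_a, dtype=float)
    b = np.asarray(box_b, dtype=float)
    lo = np.maximum(a[:3], b[:3])                  # lower corner of the overlap
    hi = np.minimum(a[3:], b[3:])                  # upper corner of the overlap
    inter = np.prod(np.clip(hi - lo, 0.0, None))   # zero if the boxes do not overlap
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    union = vol_a + vol_b - inter
    return float(inter / union) if union > 0 else 0.0

gt_box = (0, 0, 0, 2, 2, 2)
pseudo_box = (1, 0, 0, 3, 2, 2)   # shifted by half a box length along x
iou = iou_3d_axis_aligned(pseudo_box, gt_box)
print(iou)
```

A value like this, computed on labeled frames, is the kind of target the premise relies on; it is unavailable on unlabeled frames.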
What would settle it
On a held-out validation set with full ground truth, measure whether the adaptive module's selected labels show measurably higher precision at the same or higher recall than a fixed-threshold baseline; if no consistent gain appears across distance and class bins, the adaptive selection claim is refuted.
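The check described above can be sketched as a per-bin precision and recall comparison. Everything in the block is an assumption for illustration: the helper names, the IoU cutoff of 0.7 for a correct pseudo-label, the 30 m bin edge, and the toy scores and thresholds, which stand in for a real detector and a learned threshold network.

```python
import numpy as np

def precision_recall(selected, is_good):
    """Precision and recall of a boolean selection against boolean 'good' labels."""
    selected = np.asarray(selected, dtype=bool)
    is_good = np.asarray(is_good, dtype=bool)
    tp = np.sum(selected & is_good)
    precision = tp / selected.sum() if selected.sum() else 0.0
    recall = tp / is_good.sum() if is_good.sum() else 0.0
    return float(precision), float(recall)

def per_bin_report(selected, is_good, distance, edges):
    """Return {bin_index: (precision, recall)} for distance bins split at `edges`."""
    selected = np.asarray(selected, dtype=bool)
    is_good = np.asarray(is_good, dtype=bool)
    bins = np.digitize(distance, edges)
    return {int(b): precision_recall(selected[bins == b], is_good[bins == b])
            for b in np.unique(bins)}

# Toy held-out data, invented for illustration only.
gt_iou = np.array([0.80, 0.30, 0.75, 0.60, 0.20, 0.90])
score = np.array([0.90, 0.65, 0.50, 0.45, 0.30, 0.85])
distance = np.array([5.0, 10.0, 35.0, 40.0, 45.0, 12.0])
is_good = gt_iou >= 0.7   # assumed IoU cutoff for a correct pseudo-label

fixed_sel = score >= 0.6                                        # one global threshold
adaptive_sel = score >= np.where(distance < 30.0, 0.70, 0.48)   # stand-in for learned thresholds

edges = [30.0]   # bin 0: nearer than 30 m, bin 1: 30 m or farther
fixed_report = per_bin_report(fixed_sel, is_good, distance, edges)
adaptive_report = per_bin_report(adaptive_sel, is_good, distance, edges)
print(fixed_report, adaptive_report)
```

On real data the same report would be produced per class as well as per distance bin, and the claim would be refuted if the adaptive rows showed no consistent gain.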
Original abstract
Semi-supervised 3D object detection (SS3DOD) aims to reduce costly 3D annotations utilizing unlabeled data. Recent studies adopt pseudo-label-based teacher-student frameworks and demonstrate impressive performance. The main challenge of these frameworks is in selecting high-quality pseudo-labels from the teacher's predictions. Most previous methods, however, select pseudo-labels by comparing confidence scores over thresholds manually set. The latest works tackle the challenge either by dynamic thresholding or refining the quality of pseudo-labels. Such methods still overlook contextual information e.g. object distances, classes, and learning states, and inadequately assess the pseudo-label quality using partial information available from the networks. In this work, we propose a novel SS3DOD framework featuring a learnable pseudo-labeling module designed to automatically and adaptively select high-quality pseudo-labels. Our approach introduces two networks at the teacher output level. These networks reliably assess the quality of pseudo-labels by the score fusion and determine context-adaptive thresholds, which are supervised by the alignment of pseudo-labels over GT bounding boxes. Additionally, we introduce a soft supervision strategy that can learn robustly under pseudo-label noises. This helps the student network prioritize cleaner labels over noisy ones in semi-supervised learning. Extensive experiments on the KITTI and Waymo datasets demonstrate the effectiveness of our method. The proposed method selects high-precision pseudo-labels while maintaining a wider coverage of contexts and a higher recall rate, significantly improving relevant SS3DOD methods.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a semi-supervised 3D object detection (SS3DOD) framework that augments teacher-student pseudo-labeling with a learnable module containing a quality-assessment network (via score fusion) and a threshold network that produces context-adaptive thresholds. Both networks are supervised by alignment between teacher pseudo-labels and ground-truth boxes on labeled data; a soft-supervision strategy is added to let the student down-weight noisy labels. Experiments on KITTI and Waymo are reported to yield higher-precision pseudo-labels, broader contextual coverage, and improved recall over prior SS3DOD baselines.
Significance. If the empirical gains are reproducible and the learned selector generalizes, the work would meaningfully advance pseudo-label selection in 3D detection by replacing hand-tuned or non-contextual thresholds with data-driven, context-aware components. The soft-supervision mechanism is a constructive addition for noise robustness.
major comments (2)
- [Abstract / §3] Abstract and §3 (method): the claim that the two networks learn a generalizable quality function rests on supervision obtained solely from GT alignment on labeled data. Because this supervision is unavailable on the unlabeled distribution, it is unclear whether the resulting thresholds and quality scores avoid systematic under- or over-selection when object distances, classes, or learning states differ from the labeled subset; this directly underpins the stated improvements in recall and contextual coverage.
- [§4] §4 (experiments): the abstract asserts that the method is 'significantly improving relevant SS3DOD methods' yet provides no numerical deltas, no ablation tables isolating the contribution of the quality-assessment network from that of the threshold network, and no direct comparison against recent dynamic-thresholding baselines; without these, the central performance claim cannot be verified.
minor comments (2)
- Clarify the precise input features and output heads of the two networks and how their predictions are fused at inference time on unlabeled frames.
- Specify the exact formulation of the soft-supervision loss (e.g., weighting scheme or temperature) and whether it is applied only to the student or also back to the teacher.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and indicate the revisions we will make.
Point-by-point responses
- Referee: [Abstract / §3] Abstract and §3 (method): the claim that the two networks learn a generalizable quality function rests on supervision obtained solely from GT alignment on labeled data. Because this supervision is unavailable on the unlabeled distribution, it is unclear whether the resulting thresholds and quality scores avoid systematic under- or over-selection when object distances, classes, or learning states differ from the labeled subset; this directly underpins the stated improvements in recall and contextual coverage.
  Authors: The quality-assessment and threshold networks are supervised exclusively on labeled data where GT alignments are available. These networks learn to map contextual features (object distances, classes, and model learning state) to quality scores and adaptive thresholds. The same feature extraction is used for unlabeled data, so the learned mapping is applied directly. Experiments on KITTI and Waymo, which contain diverse distances and classes, show higher recall and broader context coverage than baselines, providing empirical evidence that systematic under- or over-selection is mitigated. In revision we will add a paragraph in §3 explicitly discussing the generalization assumption and its empirical support.
  Revision: partial
- Referee: [§4] §4 (experiments): the abstract asserts that the method is 'significantly improving relevant SS3DOD methods' yet provides no numerical deltas, no ablation tables isolating the contribution of the quality-assessment network from that of the threshold network, and no direct comparison against recent dynamic-thresholding baselines; without these, the central performance claim cannot be verified.
  Authors: The current abstract statement is qualitative. Section 4 already reports mAP and recall gains over multiple SS3DOD baselines on both KITTI and Waymo. To address the request directly, we will (1) insert concrete numerical deltas into the abstract, (2) add ablation tables that isolate the quality-assessment network from the threshold network, and (3) include comparisons against recent dynamic-thresholding methods. These changes will be incorporated in the revised manuscript.
  Revision: yes
Circularity Check
No significant circularity in derivation chain
full rationale
The paper's central mechanism trains quality-assessment and threshold networks via explicit supervision from alignment between teacher pseudo-labels and ground-truth boxes on labeled data, then applies the learned context-adaptive thresholds to unlabeled pseudo-labels. This external GT-derived signal is independent of the model's own predictions on the target unlabeled distribution and does not reduce any claimed prediction or result to a fitted input or self-definition by construction. No equations or steps in the abstract or description equate the output selection to the input supervision through renaming or tautology. The soft-supervision strategy for the student is likewise a standard noise-robust loss rather than a circular re-use of the same fitted values. The overall framework therefore remains self-contained with respect to external benchmarks on KITTI and Waymo.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Pseudo-label quality can be reliably assessed and thresholded by learned score fusion and context features.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: PQE encodes teacher scores … to $Q(x^s_i)$ predicting GT-IoU; trained by $\mathcal{L}_{\mathrm{PQE}} = \mathrm{MSE}\big(Q(x^s_i),\ \mathrm{IoU}(b_i, b^{\mathrm{GT}}_i)\big)$
- IndisputableMonolith/Foundation/BranchSelection.lean, branch_selection
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: CTE learns $T(c_i, d_i \mid \theta_t)$ via $\mathcal{L}_{\mathrm{thr}}$ on false-positive/negative cases w.r.t. $\tau_{\mathrm{iou}}$
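The two quoted passages can be illustrated with a small numeric sketch. It assumes the PQE loss is a plain mean squared error between predicted quality and GT-IoU, as the first passage states, and it interprets the false-positive/negative cases of the second passage as selection errors relative to an IoU cutoff. The function names, the cutoff value 0.7, and the toy numbers are hypothetical, and the actual form of the threshold loss is not given on this page.

```python
import numpy as np

def pqe_mse_loss(pred_quality, gt_iou):
    """Mean squared error between predicted quality and ground-truth IoU."""
    pred = np.asarray(pred_quality, dtype=float)
    gt = np.asarray(gt_iou, dtype=float)
    return float(np.mean((pred - gt) ** 2))

def threshold_errors(pred_quality, threshold, gt_iou, tau_iou=0.7):
    """Flag selection errors: kept-but-bad (false positive), dropped-but-good (false negative)."""
    selected = np.asarray(pred_quality, dtype=float) >= np.asarray(threshold, dtype=float)
    good = np.asarray(gt_iou, dtype=float) >= tau_iou   # tau_iou is an assumed cutoff
    return selected & ~good, ~selected & good

# Toy values, invented for illustration only.
q = np.array([0.90, 0.60, 0.30])     # predicted quality Q
iou = np.array([0.80, 0.50, 0.75])   # IoU with the matched GT box
thr = np.array([0.50, 0.50, 0.50])   # thresholds T

loss = pqe_mse_loss(q, iou)
false_pos, false_neg = threshold_errors(q, thr, iou)
print(loss, false_pos, false_neg)
```

In this reading, a threshold loss would push T up on the false-positive case and down on the false-negative case; how the paper weights or combines those cases is not stated here.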
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.