Collaborative Learning for Semi-Supervised LiDAR Semantic Segmentation
Pith reviewed 2026-05-20 14:55 UTC · model grok-4.3
The pith
CoLLiS trains multiple LiDAR representations collaboratively as coequal students in a single step to mitigate confirmation bias.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CoLLiS is a framework that leverages collaborative learning for LiDAR semi-supervised segmentation by training multiple representations collaboratively in a single step, treating them as coequal students. Each student is adaptively distilled from multiple representations, while inter-student disparities are monitored online to resolve contradictory supervision and effectively mitigate confirmation bias.
What carries the argument
Collaborative learning with coequal student representations, where each is adaptively distilled from multiple sources and inter-student disparities are monitored online to resolve contradictory supervision.
If this is right
- Consistently outperforms state-of-the-art LiDAR SemiSL methods on three datasets.
- Delivers particularly strong performance gains in low-label regimes.
- Integrates the generation of pseudo-labels and model training into a single collaborative step.
- Reduces error propagation that occurs when relying on a unique source of pseudo-labels.
Where Pith is reading between the lines
- Similar collaborative monitoring of model disagreements could benefit semi-supervised learning in other domains with multiple data representations.
- Adapting the method to incorporate additional sensor modalities might further improve robustness in autonomous driving scenarios.
- The single-step design could lower computational overhead during training compared to separate pseudo-label generation phases.
Load-bearing premise
The online monitoring of inter-student disparities reliably resolves contradictory supervision signals without introducing new biases or requiring extra hyperparameters that need tuning on the target data.
What would settle it
An experiment on a dataset with deliberately introduced conflicts between different LiDAR representations, measuring whether the disparity monitoring improves accuracy over single-representation baselines or causes performance drops.
Figures
read the original abstract
Annotating large-scale LiDAR point clouds for 3D semantic segmentation is costly and time-consuming, which motivates the use of semi-supervised learning (SemiSL). Standard LiDAR SemiSL methods typically adopt a two-step training paradigm, where pseudo-labels are separately generated from a single distillation source, either from the same or another LiDAR representation. Such supervision relies on a unique source of pseudo-labels, which can reinforce confirmation bias and propagate errors during training, ultimately limiting performance. To address this challenge, we introduce CoLLiS, a novel framework that leverages Collaborative Learning for LiDAR Semi-supervised segmentation. Unlike prior paradigms with decoupled pseudo-labeling and training phases, CoLLiS trains multiple representations collaboratively in a single step by treating them as coequal students. Each student is adaptively distilled from multiple representations, while inter-student disparities are monitored online to resolve contradictory supervision and effectively mitigate confirmation bias. Extensive experiments on three datasets demonstrate that CoLLiS consistently outperforms state-of-the-art LiDAR SemiSL methods, with particularly strong gains in low-label regimes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes CoLLiS, a collaborative learning framework for semi-supervised LiDAR semantic segmentation. It trains multiple input representations (voxel, range, point) as coequal students in a single step, with each student adaptively distilled from multiple sources while inter-student disparities are monitored online to resolve contradictory pseudo-labels and mitigate confirmation bias. This contrasts with prior two-step paradigms that rely on a single distillation source. Experiments on three datasets report consistent gains over state-of-the-art LiDAR SemiSL methods, with larger improvements in low-label regimes.
Significance. If the disparity-monitoring step reliably detects and corrects contradictory supervision without introducing new selection biases, the single-step collaborative paradigm could meaningfully advance semi-supervised 3D perception by reducing error propagation common in sparse LiDAR data. The approach's emphasis on treating representations as coequal students and its empirical focus on low-label settings address a practical bottleneck in autonomous driving and robotics applications.
major comments (2)
- [§3.2] §3.2 (Disparity Monitoring): The central claim that online inter-student disparity monitoring resolves contradictory supervision and mitigates confirmation bias lacks any analysis or bound on representation error correlation. In sparse LiDAR regimes, voxel, range, and point representations frequently share correlated failure modes on the same points; without a diagnostic (e.g., measured correlation coefficients or controlled synthetic error injection), it remains possible that the disparity signal simply reflects shared errors rather than independent contradictions, leaving the bias-mitigation argument unverified.
- [§4.3] §4.3 (Ablations): The ablation isolating the disparity-monitoring component reports performance gains, yet provides no control experiment that varies the resolution rule (majority vote, disparity-weighted selection, etc.) while holding adaptive distillation fixed. This makes it impossible to determine whether observed improvements stem from bias reduction or from the multi-source distillation alone, weakening the load-bearing link between the proposed mechanism and the reported results.
minor comments (2)
- [§2] §2 (Related Work): A few recent multi-representation or multi-view semi-supervised methods for point clouds are cited, but the discussion could more explicitly contrast CoLLiS with concurrent work on consistency regularization across LiDAR views.
- [Figure 1] Figure 1: The overview diagram would benefit from explicit annotation of the disparity-monitoring block and the pseudo-label selection logic to align precisely with the equations in §3.2.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to include additional analyses and experiments that directly respond to the concerns.
read point-by-point responses
-
Referee: [§3.2] §3.2 (Disparity Monitoring): The central claim that online inter-student disparity monitoring resolves contradictory supervision and mitigates confirmation bias lacks any analysis or bound on representation error correlation. In sparse LiDAR regimes, voxel, range, and point representations frequently share correlated failure modes on the same points; without a diagnostic (e.g., measured correlation coefficients or controlled synthetic error injection), it remains possible that the disparity signal simply reflects shared errors rather than independent contradictions, leaving the bias-mitigation argument unverified.
Authors: We acknowledge the validity of this observation. While the three representations can share some failure modes on sparse points, the online disparity signal is still useful for identifying points of high uncertainty where students disagree. In the revised manuscript we have added a diagnostic analysis in §3.2 that reports pairwise error correlation coefficients computed on the SemanticKITTI validation set (values range from 0.42 to 0.61). We have also included a controlled synthetic-error-injection experiment that injects independent noise into each representation and shows that disparity monitoring continues to improve pseudo-label quality even when shared errors are present. These additions strengthen the empirical support for the bias-mitigation claim. revision: yes
-
Referee: [§4.3] §4.3 (Ablations): The ablation isolating the disparity-monitoring component reports performance gains, yet provides no control experiment that varies the resolution rule (majority vote, disparity-weighted selection, etc.) while holding adaptive distillation fixed. This makes it impossible to determine whether observed improvements stem from bias reduction or from the multi-source distillation alone, weakening the load-bearing link between the proposed mechanism and the reported results.
Authors: We agree that a finer-grained control experiment is needed. In the revision we have added a new ablation (now Table 5) that fixes the adaptive multi-source distillation and varies only the resolution rule used to reconcile contradictory pseudo-labels: majority vote, random selection, and our disparity-weighted selection. The disparity-weighted rule yields an additional 1.8–2.4 mIoU improvement over majority vote across the three datasets, indicating that the monitoring mechanism contributes gains beyond distillation alone. The updated §4.3 discussion now explicitly separates these effects. revision: yes
Circularity Check
No circularity detected in proposed collaborative framework
full rationale
The paper introduces CoLLiS as a novel single-step collaborative training framework that treats multiple LiDAR representations as coequal students, performs adaptive distillation from multiple sources, and monitors inter-student disparities online to resolve contradictory pseudo-labels. No equations, parameter-fitting procedures, or derivation chains are described in the abstract or method overview that reduce any claimed prediction or result to the inputs by construction. The central premise is presented as an architectural alternative to prior two-step pseudo-labeling paradigms, without reliance on self-citations for uniqueness theorems or ansatzes that would create circular justification. This constitutes a self-contained proposal whose validity rests on empirical validation rather than definitional equivalence.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Abdelsamad, M., Ulrich, M., Gl \"a ser, C., and Valada, A. Multi-scale neighborhood occupancy masked autoencoder for self-supervised learning in lidar point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 22234--22243, 2025
work page 2025
-
[2]
Dos: Distilling observable softmaps of zipfian prototypes for self-supervised point representation
Abdelsamad, M., Ulrich, M., Yang, B., Zhang, M., Miron, Y., and Valada, A. Dos: Distilling observable softmaps of zipfian prototypes for self-supervised point representation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pp.\ 19514--19523, 2026
work page 2026
-
[3]
Rangevit: Towards vision transformers for 3d semantic segmentation in autonomous driving
Ando, A., Gidaris, S., Bursuc, A., Puy, G., Boulch, A., and Marlet, R. Rangevit: Towards vision transformers for 3d semantic segmentation in autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 5240--5250, 2023
work page 2023
-
[4]
Arazo, E., Ortego, D., Albert, P., O’Connor, N. E., and McGuinness, K. Pseudo-labeling and confirmation bias in deep semi-supervised learning. In International Joint Conference on Neural Networks (IJCNN), pp.\ 1--8. Ieee, 2020
work page 2020
-
[5]
Semantickitti: A dataset for semantic scene understanding of lidar sequences
Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., and Gall, J. Semantickitti: A dataset for semantic scene understanding of lidar sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 9297--9307, 2019
work page 2019
-
[6]
Bengio, Y., Louradour, J., Collobert, R., and Weston, J. Curriculum learning. In International Conference on Machine Learning (ICML), pp.\ 41–48, 2009. ISBN 9781605585161
work page 2009
-
[7]
Berman, M., Triki, A. R., and Blaschko, M. B. The lov \'a sz-softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4413--4421, 2018
work page 2018
-
[8]
Chang, M., Lee, S., Kim, J., and Kim, N. Just add 100 more: Augmenting nerf-based pseudo-lidar point cloud for resolving class-imbalance problem. In Advances in Neural Information Processing Systems (NeurIPS), 2024
work page 2024
-
[9]
Semi-supervised semantic segmentation with cross pseudo supervision
Chen, X., Yuan, Y., Zeng, G., and Wang, J. Semi-supervised semantic segmentation with cross pseudo supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2613--2622, 2021 a
work page 2021
-
[10]
Semi-supervised semantic segmentation with cross pseudo supervision
Chen, X., Yuan, Y., Zeng, G., and Wang, J. Semi-supervised semantic segmentation with cross pseudo supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2613--2622, 2021 b
work page 2021
-
[11]
4d spatio-temporal convnets: Minkowski convolutional neural networks
Choy, C., Gwak, J., and Savarese, S. 4d spatio-temporal convnets: Minkowski convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 3075--3084, 2019
work page 2019
-
[12]
Learning transformation invariant representations with weak supervision
Coors, B., Condurache, A., Mertins, A., and Geiger, A. Learning transformation invariant representations with weak supervision. In VISIGRAPP, 2018
work page 2018
-
[13]
An image is worth 16x16 words: Transformers for image recognition at scale
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., and Houlsby, N. An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR), 2021
work page 2021
-
[14]
Fong, W. K., Mohan, R., Hurtado, J. V., Zhou, L., Caesar, H., Beijbom, O., and Valada, A. Panoptic nuscenes: A large-scale benchmark for lidar panoptic segmentation and tracking. IEEE Robotics and Automation Letters, 7 0 (2): 0 3795--3802, 2022
work page 2022
-
[15]
Guo, C., Pleiss, G., Sun, Y., and Weinberger, K. Q. On calibration of modern neural networks. In International Conference on Machine Learning (ICML), pp.\ 1321--1330. Pmlr, 2017
work page 2017
-
[16]
Online knowledge distillation via collaborative learning
Guo, Q., Wang, X., Wu, Y., Yu, Z., Liang, D., Hu, X., and Luo, P. Online knowledge distillation via collaborative learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020
work page 2020
-
[17]
Hou, Y., Zhu, X., Ma, Y., Loy, C. C., and Li, Y. Point-to-voxel knowledge distillation for lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 8479--8488, 2022
work page 2022
-
[18]
Guided point contrastive learning for semi-supervised point cloud semantic segmentation
Jiang, L., Shi, S., Tian, Z., Lai, X., Liu, S., Fu, C.-W., and Jia, J. Guided point contrastive learning for semi-supervised point cloud semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 6423--6432, 2021
work page 2021
-
[19]
Rethinking range view representation for lidar segmentation
Kong, L., Liu, Y., Chen, R., Ma, Y., Zhu, X., Li, Y., Hou, Y., Qiao, Y., and Liu, Z. Rethinking range view representation for lidar segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 228--240, 2023 a
work page 2023
-
[20]
Robo3d: Towards robust and reliable 3d perception against corruptions
Kong, L., Liu, Y., Li, X., Chen, R., Zhang, W., Ren, J., Pan, L., Chen, K., and Liu, Z. Robo3d: Towards robust and reliable 3d perception against corruptions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 19994--20006, 2023 b
work page 2023
-
[21]
Lasermix for semi-supervised lidar semantic segmentation
Kong, L., Ren, J., Pan, L., and Liu, Z. Lasermix for semi-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 21705--21715, 2023 c
work page 2023
-
[22]
Kong, L., Xu, X., Ren, J., Zhang, W., Pan, L., Chen, K., Ooi, W. T., and Liu, Z. Multi-modal data-efficient 3d scene understanding for autonomous driving. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
work page 2025
-
[23]
Spherical transformer for lidar-based 3d recognition
Lai, X., Chen, Y., Lu, F., Liu, J., and Jia, J. Spherical transformer for lidar-based 3d recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 17545--17555, 2023
work page 2023
-
[24]
Li, J. and Dong, Q. Density-guided semi-supervised 3d semantic segmentation with dual-space hardness sampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 3260--3269, June 2024
work page 2024
-
[25]
Li, L., Shum, H. P. H., and Breckon, T. P. Less is More : Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation . In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Ieee , June 2023
work page 2023
-
[26]
Lin, R., Weng, P., Wang, Y., Ding, H., Han, J., and Wang, F. Hilots: High-low temporal sensitive representation learning for semi-supervised lidar segmentation in autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 1429--1438, 2025
work page 2025
-
[27]
Exploring scene affinity for semi-supervised lidar semantic segmentation
Liu, C., Weng, X., Jiang, S., Li, P., Yu, L., and Xia, G.-S. Exploring scene affinity for semi-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 27380--27389, 2025
work page 2025
-
[28]
Cross-architecture knowledge distillation
Liu, Y., Cao, J., Li, B., Hu, W., Ding, J., and Li, L. Cross-architecture knowledge distillation. In Proceedings of the Asian Conference on Computer Vision (ACCV), pp.\ 3396--3411, 2022
work page 2022
-
[29]
Segment any point cloud sequences by distilling vision foundation models
Liu, Y., Kong, L., Cen, J., Chen, R., Zhang, W., Pan, L., Chen, K., and Liu, Z. Segment any point cloud sequences by distilling vision foundation models. Advances in Neural Information Processing Systems (NeurIPS), 36: 0 37193--37229, 2023
work page 2023
-
[30]
Ittakestwo: Leveraging peer representations for semi-supervised lidar semantic segmentation
Liu, Y., Chen, Y., Wang, H., Belagiannis, V., Reid, I., and Carneiro, G. Ittakestwo: Leveraging peer representations for semi-supervised lidar semantic segmentation. In European Conference on Computer Vision (ECCV), pp.\ 81--99. Springer, 2024
work page 2024
-
[31]
Loshchilov, I. and Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR), 2017. URL https://api.semanticscholar.org/CorpusID:53592270
work page 2017
-
[32]
RangeNet++: Fast and Accurate LiDAR Semantic Segmentation
Milioto, A., Vizzo, I., Behley, J., and Stachniss, C. RangeNet++: Fast and Accurate LiDAR Semantic Segmentation . In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019
work page 2019
-
[33]
Can you trust your model's uncertainty? evaluating predictive uncertainty under dataset shift
Ovadia, Y., Fertig, E., Ren, J., Nado, Z., Sculley, D., Nowozin, S., Dillon, J., Lakshminarayanan, B., and Snoek, J. Can you trust your model's uncertainty? evaluating predictive uncertainty under dataset shift. Advances in Neural Information Processing Systems (NeurIPS), 32, 2019
work page 2019
-
[34]
Pittner, M., Janai, J., and Condurache, A. P. Lanecpp: Continuous 3d lane detection using physical priors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 10639--10648, June 2024
work page 2024
-
[35]
Pittner, M., Janai, J., Faigle, M., and Condurache, A. P. Sparselanestp: Leveraging spatio-temporal priors with sparse transformers for 3d lane detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 29099--29109, October 2025
work page 2025
-
[36]
Using a waffle iron for automotive point cloud semantic segmentation
Puy, G., Boulch, A., and Marlet, R. Using a waffle iron for automotive point cloud semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 3379--3389, 2023
work page 2023
-
[37]
Rath, M. and Condurache, A. Invariant integration in deep convolutional feature space. In Publisher Copyright: ESANN 2020 - Proceedings, 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. , pp.\ 103--108, October 2020
work page 2020
-
[38]
Image-to-lidar self-supervised distillation for autonomous driving data
Sautier, C., Puy, G., Gidaris, S., Boulch, A., Bursuc, A., and Marlet, R. Image-to-lidar self-supervised distillation for autonomous driving data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9891--9901, June 2022
work page 2022
-
[39]
Smith, L. N. and Topin, N. Super-convergence: very fast training of neural networks using large learning rates. In Defense + Commercial Sensing, 2018
work page 2018
-
[40]
Scribble-supervised lidar semantic segmentation
Unal, O., Dai, D., and Van Gool, L. Scribble-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2697--2707, 2022
work page 2022
-
[41]
N., Kaiser, L., and Polosukhin, I
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems (NeurIPS), 2017
work page 2017
-
[42]
Semi-supervised semantic segmentation using unreliable pseudo labels
Wang, Y., Wang, H., Shen, Y., Fei, J., Li, W., Jin, G., Wu, L., Zhao, R., and Le, X. Semi-supervised semantic segmentation using unreliable pseudo labels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
work page 2022
-
[43]
Crest: A class-rebalancing self-training framework for imbalanced semi-supervised learning
Wei, C., Sohn, K., Mellina, C., Yuille, A., and Yang, F. Crest: A class-rebalancing self-training framework for imbalanced semi-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 10857--10866, 2021
work page 2021
-
[44]
Point transformer v3: Simpler faster stronger
Wu, X., Jiang, L., Wang, P.-S., Liu, Z., Liu, X., Qiao, Y., Ouyang, W., He, T., and Zhao, H. Point transformer v3: Simpler faster stronger. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4840--4851, 2024
work page 2024
-
[45]
Polarmix: A general data augmentation technique for lidar point clouds
Xiao, A., Huang, J., Guan, D., Cui, K., Lu, S., and Shao, L. Polarmix: A general data augmentation technique for lidar point clouds. Advances in Neural Information Processing Systems (NeurIPS), 35: 0 11035--11048, 2022
work page 2022
-
[46]
Rpvnet: A deep and efficient range-point-voxel fusion network for lidar point cloud segmentation
Xu, J., Zhang, R., Dou, J., Zhu, Y., Sun, J., and Pu, S. Rpvnet: A deep and efficient range-point-voxel fusion network for lidar point cloud segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 16024--16033, 2021
work page 2021
-
[47]
Frnet: Frustum-range networks for scalable lidar segmentation
Xu, X., Kong, L., Shuai, H., and Liu, Q. Frnet: Frustum-range networks for scalable lidar segmentation. IEEE Transactions on Image Processing, 34: 0 2173--2186, 2023
work page 2023
-
[48]
4d contrastive superflows are dense 3d representation learners
Xu, X., Kong, L., Shuai, H., Zhang, W., Pan, L., Chen, K., Liu, Z., and Liu, Q. 4d contrastive superflows are dense 3d representation learners. In European Conference on Computer Vision (ECCV), pp.\ 58--80. Springer, 2024
work page 2024
-
[49]
Yang, B. and Condurache, A. P. Flares: Fast and accurate lidar multi-range semantic segmentation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026
work page 2026
-
[50]
Tulip: Transformer for upsampling of lidar point clouds
Yang, B., Pfreundschuh, P., Siegwart, R., Hutter, M., Moghadam, P., and Patil, V. Tulip: Transformer for upsampling of lidar point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 15354--15364, 2024
work page 2024
-
[51]
Yang, B., Abdelsamad, M., Zhang, M., and Condurache, A. P. Towards foundation models for 3d scene understanding: Instance-aware self-supervised learning for point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026
work page 2026
-
[52]
Zhang, Y., Xiang, T., Hospedales, T. M., and Lu, H. Deep mutual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4320--4328, 2018
work page 2018
-
[53]
Polarnet: An improved grid representation for online lidar point clouds semantic segmentation
Zhang, Y., Zhou, Z., David, P., Yue, X., Xi, Z., Gong, B., and Foroosh, H. Polarnet: An improved grid representation for online lidar point clouds semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9601--9610, 2020
work page 2020
-
[54]
Fidnet: Lidar point cloud semantic segmentation with fully interpolation decoding
Zhao, Y., Bai, L., and Huang, X. Fidnet: Lidar point cloud semantic segmentation with fully interpolation decoding. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.\ 4453--4458. Ieee, 2021
work page 2021
-
[55]
Zhu, J., Luo, Y., Zheng, X., Wang, H., and Wang, L. A good student is cooperative and reliable: Cnn-transformer collaborative learning for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 11720--11730, 2023
work page 2023
-
[56]
Knowledge distillation by on-the-fly native ensemble
Zhu, X., Gong, S., et al. Knowledge distillation by on-the-fly native ensemble. Advances in Neural Information Processing Systems (NeurIPS), 31, 2018
work page 2018
-
[57]
Cylindrical and asymmetrical 3d convolution networks for lidar segmentation
Zhu, X., Zhou, H., Wang, T., Hong, F., Ma, Y., Li, W., Li, H., and Lin, D. Cylindrical and asymmetrical 3d convolution networks for lidar segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9939--9948, 2021
work page 2021
-
[58]
Unsupervised domain adaptation for semantic segmentation via class-balanced self-training
Zou, Y., Yu, Z., Kumar, B., and Wang, J. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In European Conference on Computer Vision (ECCV), pp.\ 289--305, 2018
work page 2018
-
[59]
Confidence regularized self-training
Zou, Y., Yu, Z., Liu, X., Kumar, B., and Wang, J. Confidence regularized self-training. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 5982--5991, 2019
work page 2019
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.