Collaborative Learning for Semi-Supervised LiDAR Semantic Segmentation

Alexandru Paul Condurache; Bin Yang

arxiv: 2605.17135 · v1 · pith:JBJ6GTZ2new · submitted 2026-05-16 · 💻 cs.CV

Collaborative Learning for Semi-Supervised LiDAR Semantic Segmentation

Bin Yang , Alexandru Paul Condurache This is my paper

Pith reviewed 2026-05-20 14:55 UTC · model grok-4.3

classification 💻 cs.CV

keywords semi-supervised learningLiDAR semantic segmentationcollaborative learningconfirmation biaspseudo-labeling3D point cloud segmentationmulti-representation learning

0 comments

The pith

CoLLiS trains multiple LiDAR representations collaboratively as coequal students in a single step to mitigate confirmation bias.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard LiDAR semi-supervised methods use a two-step process that creates pseudo-labels from one source, which tends to reinforce mistakes through confirmation bias. CoLLiS changes this by training several different representations at once, each acting as a student that draws adaptive knowledge from the others. It watches the differences between these students in real time to sort out conflicting guidance. This unified approach leads to better results than earlier techniques, especially when only a small amount of labeled data is available. Readers would care because expensive labeling of large LiDAR datasets limits many real-world 3D perception systems.

Core claim

CoLLiS is a framework that leverages collaborative learning for LiDAR semi-supervised segmentation by training multiple representations collaboratively in a single step, treating them as coequal students. Each student is adaptively distilled from multiple representations, while inter-student disparities are monitored online to resolve contradictory supervision and effectively mitigate confirmation bias.

What carries the argument

Collaborative learning with coequal student representations, where each is adaptively distilled from multiple sources and inter-student disparities are monitored online to resolve contradictory supervision.

If this is right

Consistently outperforms state-of-the-art LiDAR SemiSL methods on three datasets.
Delivers particularly strong performance gains in low-label regimes.
Integrates the generation of pseudo-labels and model training into a single collaborative step.
Reduces error propagation that occurs when relying on a unique source of pseudo-labels.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar collaborative monitoring of model disagreements could benefit semi-supervised learning in other domains with multiple data representations.
Adapting the method to incorporate additional sensor modalities might further improve robustness in autonomous driving scenarios.
The single-step design could lower computational overhead during training compared to separate pseudo-label generation phases.

Load-bearing premise

The online monitoring of inter-student disparities reliably resolves contradictory supervision signals without introducing new biases or requiring extra hyperparameters that need tuning on the target data.

What would settle it

An experiment on a dataset with deliberately introduced conflicts between different LiDAR representations, measuring whether the disparity monitoring improves accuracy over single-representation baselines or causes performance drops.

Figures

Figures reproduced from arXiv: 2605.17135 by Alexandru Paul Condurache, Bin Yang.

**Figure 1.** Figure 1: (a) Prior LiDAR SemiSL methods (Kong et al., 2023c; Chen et al., 2021b; Li et al., 2023; Liu et al., 2024) adopt a decoupled two-step design, where pseudo-labels are generated from a single source of LiDAR representation for supervision. (b) CoLLiS trains multiple LiDAR representations collaboratively in a single step, treating all models as coequal students and enabling adaptive knowledge transfer from a… view at source ↗

**Figure 2.** Figure 2 [PITH_FULL_IMAGE:figures/full_fig_p001_2.png] view at source ↗

**Figure 3.** Figure 3: Overview of CoLLiS. ⃝1 The labeled dataset Dl is repeated to match the size of the unlabeled dataset Du. A batch from each is sampled and optionally mixed using a random mixing strategy with non-fixed probability. ⃝2 The sampled point clouds are transformed into multiple LiDAR representations. ⃝3 Each model is trained with a composite loss consisting of a labeled loss (Ll), a regularization term (Lreg), an… view at source ↗

**Figure 4.** Figure 4: We qualitatively evaluate Cylinder3D (Zhu et al., 2021) on SemanticKITTI. Predictions are obtained from models trained under 10% label protocol. Ground-truth labels are color-coded based on class categories. Incorrect predictions are shown in red, while correct predictions are shown in gray. More qualitative results are provided in Appendix C.2. spatially consistent predictions with fewer isolated misclas… view at source ↗

**Figure 5.** Figure 5: Quality of pseudo-labels. Values are averaged over every 500 iterations, and training is conducted on nuScenes (Fong et al., 2022) with 1% labeled data. by continuous degradation across all representations. This behavior indicates that conventional collaborative learning struggles to recover once representation drift occurs. In contrast, CoLLiS maintains stable and consistent performance improvements acro… view at source ↗

**Figure 7.** Figure 7: Ablation study on initialization of mixing probability qm (left) and step size of CDA (right) with nuScenes (20% labels). thus improve generalization. However, excessively small step sizes risk biased updates due to insufficient samples. Setting the step size to 50 balances training stability with the flexibility of dynamic augmentation. Adaptive weights As shown in Tab. 7, adaptive weights yield clear imp… view at source ↗

**Figure 8.** Figure 8: Visualization of failure cases. From left to right, columns show the scene overview, ground-truth annotations, and model predictions. Points colored in light blue correspond to the bicycle class. The plot on the left reports the IoU for this class. modest improvements, they remain insufficient to close the performance gap. These results suggest that the bottleneck arises from intrinsic data scarcity rather… view at source ↗

**Figure 9.** Figure 9: Examples of different mixing strategies using two LiDAR point clouds (distinguished by green and red). Mixed point clouds are visualized in bird’s-eye view (top) and range view (bottom). To additionally evaluate the impact of architectural diversity independently of representational differences, we further test CoLLiS in a setting where the same LiDAR representation is processed by two networks with heter… view at source ↗

**Figure 10.** Figure 10: Qualitative results on SemanticKITTI (Behley et al., 2019). All models are trained under the 10% label protocol. We use Hard Confidence Voting (HCV) to fuse students’ outputs. Ground-truth labels are color-coded based on class categories. Incorrect predictions are shown in red, while correct predictions are shown in gray. 16 [PITH_FULL_IMAGE:figures/full_fig_p016_10.png] view at source ↗

read the original abstract

Annotating large-scale LiDAR point clouds for 3D semantic segmentation is costly and time-consuming, which motivates the use of semi-supervised learning (SemiSL). Standard LiDAR SemiSL methods typically adopt a two-step training paradigm, where pseudo-labels are separately generated from a single distillation source, either from the same or another LiDAR representation. Such supervision relies on a unique source of pseudo-labels, which can reinforce confirmation bias and propagate errors during training, ultimately limiting performance. To address this challenge, we introduce CoLLiS, a novel framework that leverages Collaborative Learning for LiDAR Semi-supervised segmentation. Unlike prior paradigms with decoupled pseudo-labeling and training phases, CoLLiS trains multiple representations collaboratively in a single step by treating them as coequal students. Each student is adaptively distilled from multiple representations, while inter-student disparities are monitored online to resolve contradictory supervision and effectively mitigate confirmation bias. Extensive experiments on three datasets demonstrate that CoLLiS consistently outperforms state-of-the-art LiDAR SemiSL methods, with particularly strong gains in low-label regimes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

CoLLiS shifts to single-step collaborative training across multiple LiDAR representations with online disparity monitoring to reduce confirmation bias, and the reported gains in low-label regimes are the main practical takeaway.

read the letter

The central point is that this work replaces the usual decoupled pseudo-label generation plus training loop with a joint setup where different representations act as coequal students. Each gets adaptive distillation from the others, and inter-student disagreements are tracked during training to pick better supervision signals and limit error reinforcement. That single-step structure is the clearest departure from prior LiDAR SemiSL pipelines mentioned in the abstract. The experiments on three datasets show consistent outperformance, with the largest lifts when labeled data is scarce, which lines up with the real annotation cost in autonomous driving applications. That focus on low-label performance is useful and worth noting. The collaborative angle and the online monitoring step are presented as the mechanisms that make the difference. On the soft side, the abstract stays high-level and gives no equations, loss formulations, or ablation breakdowns for how the disparity resolution actually works or how the adaptive weights are computed. Without those details it is hard to judge whether the monitoring step genuinely separates independent errors from shared ones. The stress-test concern about correlated failure modes across voxel, range, and point representations is plausible and would need checking in the full text, since LiDAR sparsity can make some error patterns overlap. If the paper only shows overall accuracy numbers without targeted analysis on that point, the bias-mitigation claim stays partly unverified. This paper is mainly for people already working on semi-supervised point-cloud segmentation who want a practical tweak to existing multi-representation setups. A reader focused on autonomous systems data efficiency would get the most out of the low-label results. It has enough of a distinct training paradigm and experimental support to go to a serious referee rather than a desk reject, though the review would likely press for clearer method exposition and checks on representation error correlation.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes CoLLiS, a collaborative learning framework for semi-supervised LiDAR semantic segmentation. It trains multiple input representations (voxel, range, point) as coequal students in a single step, with each student adaptively distilled from multiple sources while inter-student disparities are monitored online to resolve contradictory pseudo-labels and mitigate confirmation bias. This contrasts with prior two-step paradigms that rely on a single distillation source. Experiments on three datasets report consistent gains over state-of-the-art LiDAR SemiSL methods, with larger improvements in low-label regimes.

Significance. If the disparity-monitoring step reliably detects and corrects contradictory supervision without introducing new selection biases, the single-step collaborative paradigm could meaningfully advance semi-supervised 3D perception by reducing error propagation common in sparse LiDAR data. The approach's emphasis on treating representations as coequal students and its empirical focus on low-label settings address a practical bottleneck in autonomous driving and robotics applications.

major comments (2)

[§3.2] §3.2 (Disparity Monitoring): The central claim that online inter-student disparity monitoring resolves contradictory supervision and mitigates confirmation bias lacks any analysis or bound on representation error correlation. In sparse LiDAR regimes, voxel, range, and point representations frequently share correlated failure modes on the same points; without a diagnostic (e.g., measured correlation coefficients or controlled synthetic error injection), it remains possible that the disparity signal simply reflects shared errors rather than independent contradictions, leaving the bias-mitigation argument unverified.
[§4.3] §4.3 (Ablations): The ablation isolating the disparity-monitoring component reports performance gains, yet provides no control experiment that varies the resolution rule (majority vote, disparity-weighted selection, etc.) while holding adaptive distillation fixed. This makes it impossible to determine whether observed improvements stem from bias reduction or from the multi-source distillation alone, weakening the load-bearing link between the proposed mechanism and the reported results.

minor comments (2)

[§2] §2 (Related Work): A few recent multi-representation or multi-view semi-supervised methods for point clouds are cited, but the discussion could more explicitly contrast CoLLiS with concurrent work on consistency regularization across LiDAR views.
[Figure 1] Figure 1: The overview diagram would benefit from explicit annotation of the disparity-monitoring block and the pseudo-label selection logic to align precisely with the equations in §3.2.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to include additional analyses and experiments that directly respond to the concerns.

read point-by-point responses

Referee: [§3.2] §3.2 (Disparity Monitoring): The central claim that online inter-student disparity monitoring resolves contradictory supervision and mitigates confirmation bias lacks any analysis or bound on representation error correlation. In sparse LiDAR regimes, voxel, range, and point representations frequently share correlated failure modes on the same points; without a diagnostic (e.g., measured correlation coefficients or controlled synthetic error injection), it remains possible that the disparity signal simply reflects shared errors rather than independent contradictions, leaving the bias-mitigation argument unverified.

Authors: We acknowledge the validity of this observation. While the three representations can share some failure modes on sparse points, the online disparity signal is still useful for identifying points of high uncertainty where students disagree. In the revised manuscript we have added a diagnostic analysis in §3.2 that reports pairwise error correlation coefficients computed on the SemanticKITTI validation set (values range from 0.42 to 0.61). We have also included a controlled synthetic-error-injection experiment that injects independent noise into each representation and shows that disparity monitoring continues to improve pseudo-label quality even when shared errors are present. These additions strengthen the empirical support for the bias-mitigation claim. revision: yes
Referee: [§4.3] §4.3 (Ablations): The ablation isolating the disparity-monitoring component reports performance gains, yet provides no control experiment that varies the resolution rule (majority vote, disparity-weighted selection, etc.) while holding adaptive distillation fixed. This makes it impossible to determine whether observed improvements stem from bias reduction or from the multi-source distillation alone, weakening the load-bearing link between the proposed mechanism and the reported results.

Authors: We agree that a finer-grained control experiment is needed. In the revision we have added a new ablation (now Table 5) that fixes the adaptive multi-source distillation and varies only the resolution rule used to reconcile contradictory pseudo-labels: majority vote, random selection, and our disparity-weighted selection. The disparity-weighted rule yields an additional 1.8–2.4 mIoU improvement over majority vote across the three datasets, indicating that the monitoring mechanism contributes gains beyond distillation alone. The updated §4.3 discussion now explicitly separates these effects. revision: yes

Circularity Check

0 steps flagged

No circularity detected in proposed collaborative framework

full rationale

The paper introduces CoLLiS as a novel single-step collaborative training framework that treats multiple LiDAR representations as coequal students, performs adaptive distillation from multiple sources, and monitors inter-student disparities online to resolve contradictory pseudo-labels. No equations, parameter-fitting procedures, or derivation chains are described in the abstract or method overview that reduce any claimed prediction or result to the inputs by construction. The central premise is presented as an architectural alternative to prior two-step pseudo-labeling paradigms, without reliance on self-citations for uniqueness theorems or ansatzes that would create circular justification. This constitutes a self-contained proposal whose validity rests on empirical validation rather than definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no identifiable free parameters, axioms, or invented entities; the central claim rests on the unverified effectiveness of disparity monitoring.

pith-pipeline@v0.9.0 · 5712 in / 1067 out tokens · 47691 ms · 2026-05-20T14:55:54.064175+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

59 extracted references · 59 canonical work pages

[1]

Multi-scale neighborhood occupancy masked autoencoder for self-supervised learning in lidar point clouds

Abdelsamad, M., Ulrich, M., Gl \"a ser, C., and Valada, A. Multi-scale neighborhood occupancy masked autoencoder for self-supervised learning in lidar point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 22234--22243, 2025

work page 2025
[2]

Dos: Distilling observable softmaps of zipfian prototypes for self-supervised point representation

Abdelsamad, M., Ulrich, M., Yang, B., Zhang, M., Miron, Y., and Valada, A. Dos: Distilling observable softmaps of zipfian prototypes for self-supervised point representation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pp.\ 19514--19523, 2026

work page 2026
[3]

Rangevit: Towards vision transformers for 3d semantic segmentation in autonomous driving

Ando, A., Gidaris, S., Bursuc, A., Puy, G., Boulch, A., and Marlet, R. Rangevit: Towards vision transformers for 3d semantic segmentation in autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 5240--5250, 2023

work page 2023
[4]

E., and McGuinness, K

Arazo, E., Ortego, D., Albert, P., O’Connor, N. E., and McGuinness, K. Pseudo-labeling and confirmation bias in deep semi-supervised learning. In International Joint Conference on Neural Networks (IJCNN), pp.\ 1--8. Ieee, 2020

work page 2020
[5]

Semantickitti: A dataset for semantic scene understanding of lidar sequences

Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., and Gall, J. Semantickitti: A dataset for semantic scene understanding of lidar sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 9297--9307, 2019

work page 2019
[6]

Curriculum learning

Bengio, Y., Louradour, J., Collobert, R., and Weston, J. Curriculum learning. In International Conference on Machine Learning (ICML), pp.\ 41–48, 2009. ISBN 9781605585161

work page 2009
[7]

R., and Blaschko, M

Berman, M., Triki, A. R., and Blaschko, M. B. The lov \'a sz-softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4413--4421, 2018

work page 2018
[8]

Just add 100 more: Augmenting nerf-based pseudo-lidar point cloud for resolving class-imbalance problem

Chang, M., Lee, S., Kim, J., and Kim, N. Just add 100 more: Augmenting nerf-based pseudo-lidar point cloud for resolving class-imbalance problem. In Advances in Neural Information Processing Systems (NeurIPS), 2024

work page 2024
[9]

Semi-supervised semantic segmentation with cross pseudo supervision

Chen, X., Yuan, Y., Zeng, G., and Wang, J. Semi-supervised semantic segmentation with cross pseudo supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2613--2622, 2021 a

work page 2021
[10]

Semi-supervised semantic segmentation with cross pseudo supervision

Chen, X., Yuan, Y., Zeng, G., and Wang, J. Semi-supervised semantic segmentation with cross pseudo supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2613--2622, 2021 b

work page 2021
[11]

4d spatio-temporal convnets: Minkowski convolutional neural networks

Choy, C., Gwak, J., and Savarese, S. 4d spatio-temporal convnets: Minkowski convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 3075--3084, 2019

work page 2019
[12]

Learning transformation invariant representations with weak supervision

Coors, B., Condurache, A., Mertins, A., and Geiger, A. Learning transformation invariant representations with weak supervision. In VISIGRAPP, 2018

work page 2018
[13]

An image is worth 16x16 words: Transformers for image recognition at scale

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., and Houlsby, N. An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR), 2021

work page 2021
[14]

K., Mohan, R., Hurtado, J

Fong, W. K., Mohan, R., Hurtado, J. V., Zhou, L., Caesar, H., Beijbom, O., and Valada, A. Panoptic nuscenes: A large-scale benchmark for lidar panoptic segmentation and tracking. IEEE Robotics and Automation Letters, 7 0 (2): 0 3795--3802, 2022

work page 2022
[15]

Guo, C., Pleiss, G., Sun, Y., and Weinberger, K. Q. On calibration of modern neural networks. In International Conference on Machine Learning (ICML), pp.\ 1321--1330. Pmlr, 2017

work page 2017
[16]

Online knowledge distillation via collaborative learning

Guo, Q., Wang, X., Wu, Y., Yu, Z., Liang, D., Hu, X., and Luo, P. Online knowledge distillation via collaborative learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020

work page 2020
[17]

C., and Li, Y

Hou, Y., Zhu, X., Ma, Y., Loy, C. C., and Li, Y. Point-to-voxel knowledge distillation for lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 8479--8488, 2022

work page 2022
[18]

Guided point contrastive learning for semi-supervised point cloud semantic segmentation

Jiang, L., Shi, S., Tian, Z., Lai, X., Liu, S., Fu, C.-W., and Jia, J. Guided point contrastive learning for semi-supervised point cloud semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 6423--6432, 2021

work page 2021
[19]

Rethinking range view representation for lidar segmentation

Kong, L., Liu, Y., Chen, R., Ma, Y., Zhu, X., Li, Y., Hou, Y., Qiao, Y., and Liu, Z. Rethinking range view representation for lidar segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 228--240, 2023 a

work page 2023
[20]

Robo3d: Towards robust and reliable 3d perception against corruptions

Kong, L., Liu, Y., Li, X., Chen, R., Zhang, W., Ren, J., Pan, L., Chen, K., and Liu, Z. Robo3d: Towards robust and reliable 3d perception against corruptions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 19994--20006, 2023 b

work page 2023
[21]

Lasermix for semi-supervised lidar semantic segmentation

Kong, L., Ren, J., Pan, L., and Liu, Z. Lasermix for semi-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 21705--21715, 2023 c

work page 2023
[22]

T., and Liu, Z

Kong, L., Xu, X., Ren, J., Zhang, W., Pan, L., Chen, K., Ooi, W. T., and Liu, Z. Multi-modal data-efficient 3d scene understanding for autonomous driving. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

work page 2025
[23]

Spherical transformer for lidar-based 3d recognition

Lai, X., Chen, Y., Lu, F., Liu, J., and Jia, J. Spherical transformer for lidar-based 3d recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 17545--17555, 2023

work page 2023
[24]

and Dong, Q

Li, J. and Dong, Q. Density-guided semi-supervised 3d semantic segmentation with dual-space hardness sampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 3260--3269, June 2024

work page 2024
[25]

Li, L., Shum, H. P. H., and Breckon, T. P. Less is More : Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation . In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Ieee , June 2023

work page 2023
[26]

Hilots: High-low temporal sensitive representation learning for semi-supervised lidar segmentation in autonomous driving

Lin, R., Weng, P., Wang, Y., Ding, H., Han, J., and Wang, F. Hilots: High-low temporal sensitive representation learning for semi-supervised lidar segmentation in autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 1429--1438, 2025

work page 2025
[27]

Exploring scene affinity for semi-supervised lidar semantic segmentation

Liu, C., Weng, X., Jiang, S., Li, P., Yu, L., and Xia, G.-S. Exploring scene affinity for semi-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 27380--27389, 2025

work page 2025
[28]

Cross-architecture knowledge distillation

Liu, Y., Cao, J., Li, B., Hu, W., Ding, J., and Li, L. Cross-architecture knowledge distillation. In Proceedings of the Asian Conference on Computer Vision (ACCV), pp.\ 3396--3411, 2022

work page 2022
[29]

Segment any point cloud sequences by distilling vision foundation models

Liu, Y., Kong, L., Cen, J., Chen, R., Zhang, W., Pan, L., Chen, K., and Liu, Z. Segment any point cloud sequences by distilling vision foundation models. Advances in Neural Information Processing Systems (NeurIPS), 36: 0 37193--37229, 2023

work page 2023
[30]

Ittakestwo: Leveraging peer representations for semi-supervised lidar semantic segmentation

Liu, Y., Chen, Y., Wang, H., Belagiannis, V., Reid, I., and Carneiro, G. Ittakestwo: Leveraging peer representations for semi-supervised lidar semantic segmentation. In European Conference on Computer Vision (ECCV), pp.\ 81--99. Springer, 2024

work page 2024
[31]

and Hutter, F

Loshchilov, I. and Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR), 2017. URL https://api.semanticscholar.org/CorpusID:53592270

work page 2017
[32]

RangeNet++: Fast and Accurate LiDAR Semantic Segmentation

Milioto, A., Vizzo, I., Behley, J., and Stachniss, C. RangeNet++: Fast and Accurate LiDAR Semantic Segmentation . In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019

work page 2019
[33]

Can you trust your model's uncertainty? evaluating predictive uncertainty under dataset shift

Ovadia, Y., Fertig, E., Ren, J., Nado, Z., Sculley, D., Nowozin, S., Dillon, J., Lakshminarayanan, B., and Snoek, J. Can you trust your model's uncertainty? evaluating predictive uncertainty under dataset shift. Advances in Neural Information Processing Systems (NeurIPS), 32, 2019

work page 2019
[34]

Pittner, M., Janai, J., and Condurache, A. P. Lanecpp: Continuous 3d lane detection using physical priors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 10639--10648, June 2024

work page 2024
[35]

Pittner, M., Janai, J., Faigle, M., and Condurache, A. P. Sparselanestp: Leveraging spatio-temporal priors with sparse transformers for 3d lane detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 29099--29109, October 2025

work page 2025
[36]

Using a waffle iron for automotive point cloud semantic segmentation

Puy, G., Boulch, A., and Marlet, R. Using a waffle iron for automotive point cloud semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 3379--3389, 2023

work page 2023
[37]

and Condurache, A

Rath, M. and Condurache, A. Invariant integration in deep convolutional feature space. In Publisher Copyright: ESANN 2020 - Proceedings, 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. , pp.\ 103--108, October 2020

work page 2020
[38]

Image-to-lidar self-supervised distillation for autonomous driving data

Sautier, C., Puy, G., Gidaris, S., Boulch, A., Bursuc, A., and Marlet, R. Image-to-lidar self-supervised distillation for autonomous driving data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9891--9901, June 2022

work page 2022
[39]

Smith, L. N. and Topin, N. Super-convergence: very fast training of neural networks using large learning rates. In Defense + Commercial Sensing, 2018

work page 2018
[40]

Scribble-supervised lidar semantic segmentation

Unal, O., Dai, D., and Van Gool, L. Scribble-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2697--2707, 2022

work page 2022
[41]

N., Kaiser, L., and Polosukhin, I

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems (NeurIPS), 2017

work page 2017
[42]

Semi-supervised semantic segmentation using unreliable pseudo labels

Wang, Y., Wang, H., Shen, Y., Fei, J., Li, W., Jin, G., Wu, L., Zhao, R., and Le, X. Semi-supervised semantic segmentation using unreliable pseudo labels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

work page 2022
[43]

Crest: A class-rebalancing self-training framework for imbalanced semi-supervised learning

Wei, C., Sohn, K., Mellina, C., Yuille, A., and Yang, F. Crest: A class-rebalancing self-training framework for imbalanced semi-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 10857--10866, 2021

work page 2021
[44]

Point transformer v3: Simpler faster stronger

Wu, X., Jiang, L., Wang, P.-S., Liu, Z., Liu, X., Qiao, Y., Ouyang, W., He, T., and Zhao, H. Point transformer v3: Simpler faster stronger. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4840--4851, 2024

work page 2024
[45]

Polarmix: A general data augmentation technique for lidar point clouds

Xiao, A., Huang, J., Guan, D., Cui, K., Lu, S., and Shao, L. Polarmix: A general data augmentation technique for lidar point clouds. Advances in Neural Information Processing Systems (NeurIPS), 35: 0 11035--11048, 2022

work page 2022
[46]

Rpvnet: A deep and efficient range-point-voxel fusion network for lidar point cloud segmentation

Xu, J., Zhang, R., Dou, J., Zhu, Y., Sun, J., and Pu, S. Rpvnet: A deep and efficient range-point-voxel fusion network for lidar point cloud segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 16024--16033, 2021

work page 2021
[47]

Frnet: Frustum-range networks for scalable lidar segmentation

Xu, X., Kong, L., Shuai, H., and Liu, Q. Frnet: Frustum-range networks for scalable lidar segmentation. IEEE Transactions on Image Processing, 34: 0 2173--2186, 2023

work page 2023
[48]

4d contrastive superflows are dense 3d representation learners

Xu, X., Kong, L., Shuai, H., Zhang, W., Pan, L., Chen, K., Liu, Z., and Liu, Q. 4d contrastive superflows are dense 3d representation learners. In European Conference on Computer Vision (ECCV), pp.\ 58--80. Springer, 2024

work page 2024
[49]

and Condurache, A

Yang, B. and Condurache, A. P. Flares: Fast and accurate lidar multi-range semantic segmentation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026

work page 2026
[50]

Tulip: Transformer for upsampling of lidar point clouds

Yang, B., Pfreundschuh, P., Siegwart, R., Hutter, M., Moghadam, P., and Patil, V. Tulip: Transformer for upsampling of lidar point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 15354--15364, 2024

work page 2024
[51]

Yang, B., Abdelsamad, M., Zhang, M., and Condurache, A. P. Towards foundation models for 3d scene understanding: Instance-aware self-supervised learning for point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

work page 2026
[52]

M., and Lu, H

Zhang, Y., Xiang, T., Hospedales, T. M., and Lu, H. Deep mutual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4320--4328, 2018

work page 2018
[53]

Polarnet: An improved grid representation for online lidar point clouds semantic segmentation

Zhang, Y., Zhou, Z., David, P., Yue, X., Xi, Z., Gong, B., and Foroosh, H. Polarnet: An improved grid representation for online lidar point clouds semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9601--9610, 2020

work page 2020
[54]

Fidnet: Lidar point cloud semantic segmentation with fully interpolation decoding

Zhao, Y., Bai, L., and Huang, X. Fidnet: Lidar point cloud semantic segmentation with fully interpolation decoding. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.\ 4453--4458. Ieee, 2021

work page 2021
[55]

A good student is cooperative and reliable: Cnn-transformer collaborative learning for semantic segmentation

Zhu, J., Luo, Y., Zheng, X., Wang, H., and Wang, L. A good student is cooperative and reliable: Cnn-transformer collaborative learning for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 11720--11730, 2023

work page 2023
[56]

Knowledge distillation by on-the-fly native ensemble

Zhu, X., Gong, S., et al. Knowledge distillation by on-the-fly native ensemble. Advances in Neural Information Processing Systems (NeurIPS), 31, 2018

work page 2018
[57]

Cylindrical and asymmetrical 3d convolution networks for lidar segmentation

Zhu, X., Zhou, H., Wang, T., Hong, F., Ma, Y., Li, W., Li, H., and Lin, D. Cylindrical and asymmetrical 3d convolution networks for lidar segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9939--9948, 2021

work page 2021
[58]

Unsupervised domain adaptation for semantic segmentation via class-balanced self-training

Zou, Y., Yu, Z., Kumar, B., and Wang, J. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In European Conference on Computer Vision (ECCV), pp.\ 289--305, 2018

work page 2018
[59]

Confidence regularized self-training

Zou, Y., Yu, Z., Liu, X., Kumar, B., and Wang, J. Confidence regularized self-training. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 5982--5991, 2019

work page 2019

[1] [1]

Multi-scale neighborhood occupancy masked autoencoder for self-supervised learning in lidar point clouds

Abdelsamad, M., Ulrich, M., Gl \"a ser, C., and Valada, A. Multi-scale neighborhood occupancy masked autoencoder for self-supervised learning in lidar point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 22234--22243, 2025

work page 2025

[2] [2]

Dos: Distilling observable softmaps of zipfian prototypes for self-supervised point representation

Abdelsamad, M., Ulrich, M., Yang, B., Zhang, M., Miron, Y., and Valada, A. Dos: Distilling observable softmaps of zipfian prototypes for self-supervised point representation. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 40, pp.\ 19514--19523, 2026

work page 2026

[3] [3]

Rangevit: Towards vision transformers for 3d semantic segmentation in autonomous driving

Ando, A., Gidaris, S., Bursuc, A., Puy, G., Boulch, A., and Marlet, R. Rangevit: Towards vision transformers for 3d semantic segmentation in autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 5240--5250, 2023

work page 2023

[4] [4]

E., and McGuinness, K

Arazo, E., Ortego, D., Albert, P., O’Connor, N. E., and McGuinness, K. Pseudo-labeling and confirmation bias in deep semi-supervised learning. In International Joint Conference on Neural Networks (IJCNN), pp.\ 1--8. Ieee, 2020

work page 2020

[5] [5]

Semantickitti: A dataset for semantic scene understanding of lidar sequences

Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., and Gall, J. Semantickitti: A dataset for semantic scene understanding of lidar sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 9297--9307, 2019

work page 2019

[6] [6]

Curriculum learning

Bengio, Y., Louradour, J., Collobert, R., and Weston, J. Curriculum learning. In International Conference on Machine Learning (ICML), pp.\ 41–48, 2009. ISBN 9781605585161

work page 2009

[7] [7]

R., and Blaschko, M

Berman, M., Triki, A. R., and Blaschko, M. B. The lov \'a sz-softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4413--4421, 2018

work page 2018

[8] [8]

Just add 100 more: Augmenting nerf-based pseudo-lidar point cloud for resolving class-imbalance problem

Chang, M., Lee, S., Kim, J., and Kim, N. Just add 100 more: Augmenting nerf-based pseudo-lidar point cloud for resolving class-imbalance problem. In Advances in Neural Information Processing Systems (NeurIPS), 2024

work page 2024

[9] [9]

Semi-supervised semantic segmentation with cross pseudo supervision

Chen, X., Yuan, Y., Zeng, G., and Wang, J. Semi-supervised semantic segmentation with cross pseudo supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2613--2622, 2021 a

work page 2021

[10] [10]

Semi-supervised semantic segmentation with cross pseudo supervision

Chen, X., Yuan, Y., Zeng, G., and Wang, J. Semi-supervised semantic segmentation with cross pseudo supervision. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2613--2622, 2021 b

work page 2021

[11] [11]

4d spatio-temporal convnets: Minkowski convolutional neural networks

Choy, C., Gwak, J., and Savarese, S. 4d spatio-temporal convnets: Minkowski convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 3075--3084, 2019

work page 2019

[12] [12]

Learning transformation invariant representations with weak supervision

Coors, B., Condurache, A., Mertins, A., and Geiger, A. Learning transformation invariant representations with weak supervision. In VISIGRAPP, 2018

work page 2018

[13] [13]

An image is worth 16x16 words: Transformers for image recognition at scale

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., and Houlsby, N. An image is worth 16x16 words: Transformers for image recognition at scale. International Conference on Learning Representations (ICLR), 2021

work page 2021

[14] [14]

K., Mohan, R., Hurtado, J

Fong, W. K., Mohan, R., Hurtado, J. V., Zhou, L., Caesar, H., Beijbom, O., and Valada, A. Panoptic nuscenes: A large-scale benchmark for lidar panoptic segmentation and tracking. IEEE Robotics and Automation Letters, 7 0 (2): 0 3795--3802, 2022

work page 2022

[15] [15]

Guo, C., Pleiss, G., Sun, Y., and Weinberger, K. Q. On calibration of modern neural networks. In International Conference on Machine Learning (ICML), pp.\ 1321--1330. Pmlr, 2017

work page 2017

[16] [16]

Online knowledge distillation via collaborative learning

Guo, Q., Wang, X., Wu, Y., Yu, Z., Liang, D., Hu, X., and Luo, P. Online knowledge distillation via collaborative learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020

work page 2020

[17] [17]

C., and Li, Y

Hou, Y., Zhu, X., Ma, Y., Loy, C. C., and Li, Y. Point-to-voxel knowledge distillation for lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 8479--8488, 2022

work page 2022

[18] [18]

Guided point contrastive learning for semi-supervised point cloud semantic segmentation

Jiang, L., Shi, S., Tian, Z., Lai, X., Liu, S., Fu, C.-W., and Jia, J. Guided point contrastive learning for semi-supervised point cloud semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 6423--6432, 2021

work page 2021

[19] [19]

Rethinking range view representation for lidar segmentation

Kong, L., Liu, Y., Chen, R., Ma, Y., Zhu, X., Li, Y., Hou, Y., Qiao, Y., and Liu, Z. Rethinking range view representation for lidar segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 228--240, 2023 a

work page 2023

[20] [20]

Robo3d: Towards robust and reliable 3d perception against corruptions

Kong, L., Liu, Y., Li, X., Chen, R., Zhang, W., Ren, J., Pan, L., Chen, K., and Liu, Z. Robo3d: Towards robust and reliable 3d perception against corruptions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 19994--20006, 2023 b

work page 2023

[21] [21]

Lasermix for semi-supervised lidar semantic segmentation

Kong, L., Ren, J., Pan, L., and Liu, Z. Lasermix for semi-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 21705--21715, 2023 c

work page 2023

[22] [22]

T., and Liu, Z

Kong, L., Xu, X., Ren, J., Zhang, W., Pan, L., Chen, K., Ooi, W. T., and Liu, Z. Multi-modal data-efficient 3d scene understanding for autonomous driving. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

work page 2025

[23] [23]

Spherical transformer for lidar-based 3d recognition

Lai, X., Chen, Y., Lu, F., Liu, J., and Jia, J. Spherical transformer for lidar-based 3d recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 17545--17555, 2023

work page 2023

[24] [24]

and Dong, Q

Li, J. and Dong, Q. Density-guided semi-supervised 3d semantic segmentation with dual-space hardness sampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 3260--3269, June 2024

work page 2024

[25] [25]

Li, L., Shum, H. P. H., and Breckon, T. P. Less is More : Reducing Task and Model Complexity for 3D Point Cloud Semantic Segmentation . In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Ieee , June 2023

work page 2023

[26] [26]

Hilots: High-low temporal sensitive representation learning for semi-supervised lidar segmentation in autonomous driving

Lin, R., Weng, P., Wang, Y., Ding, H., Han, J., and Wang, F. Hilots: High-low temporal sensitive representation learning for semi-supervised lidar segmentation in autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 1429--1438, 2025

work page 2025

[27] [27]

Exploring scene affinity for semi-supervised lidar semantic segmentation

Liu, C., Weng, X., Jiang, S., Li, P., Yu, L., and Xia, G.-S. Exploring scene affinity for semi-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 27380--27389, 2025

work page 2025

[28] [28]

Cross-architecture knowledge distillation

Liu, Y., Cao, J., Li, B., Hu, W., Ding, J., and Li, L. Cross-architecture knowledge distillation. In Proceedings of the Asian Conference on Computer Vision (ACCV), pp.\ 3396--3411, 2022

work page 2022

[29] [29]

Segment any point cloud sequences by distilling vision foundation models

Liu, Y., Kong, L., Cen, J., Chen, R., Zhang, W., Pan, L., Chen, K., and Liu, Z. Segment any point cloud sequences by distilling vision foundation models. Advances in Neural Information Processing Systems (NeurIPS), 36: 0 37193--37229, 2023

work page 2023

[30] [30]

Ittakestwo: Leveraging peer representations for semi-supervised lidar semantic segmentation

Liu, Y., Chen, Y., Wang, H., Belagiannis, V., Reid, I., and Carneiro, G. Ittakestwo: Leveraging peer representations for semi-supervised lidar semantic segmentation. In European Conference on Computer Vision (ECCV), pp.\ 81--99. Springer, 2024

work page 2024

[31] [31]

and Hutter, F

Loshchilov, I. and Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR), 2017. URL https://api.semanticscholar.org/CorpusID:53592270

work page 2017

[32] [32]

RangeNet++: Fast and Accurate LiDAR Semantic Segmentation

Milioto, A., Vizzo, I., Behley, J., and Stachniss, C. RangeNet++: Fast and Accurate LiDAR Semantic Segmentation . In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019

work page 2019

[33] [33]

Can you trust your model's uncertainty? evaluating predictive uncertainty under dataset shift

Ovadia, Y., Fertig, E., Ren, J., Nado, Z., Sculley, D., Nowozin, S., Dillon, J., Lakshminarayanan, B., and Snoek, J. Can you trust your model's uncertainty? evaluating predictive uncertainty under dataset shift. Advances in Neural Information Processing Systems (NeurIPS), 32, 2019

work page 2019

[34] [34]

Pittner, M., Janai, J., and Condurache, A. P. Lanecpp: Continuous 3d lane detection using physical priors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 10639--10648, June 2024

work page 2024

[35] [35]

Pittner, M., Janai, J., Faigle, M., and Condurache, A. P. Sparselanestp: Leveraging spatio-temporal priors with sparse transformers for 3d lane detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 29099--29109, October 2025

work page 2025

[36] [36]

Using a waffle iron for automotive point cloud semantic segmentation

Puy, G., Boulch, A., and Marlet, R. Using a waffle iron for automotive point cloud semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 3379--3389, 2023

work page 2023

[37] [37]

and Condurache, A

Rath, M. and Condurache, A. Invariant integration in deep convolutional feature space. In Publisher Copyright: ESANN 2020 - Proceedings, 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. , pp.\ 103--108, October 2020

work page 2020

[38] [38]

Image-to-lidar self-supervised distillation for autonomous driving data

Sautier, C., Puy, G., Gidaris, S., Boulch, A., Bursuc, A., and Marlet, R. Image-to-lidar self-supervised distillation for autonomous driving data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9891--9901, June 2022

work page 2022

[39] [39]

Smith, L. N. and Topin, N. Super-convergence: very fast training of neural networks using large learning rates. In Defense + Commercial Sensing, 2018

work page 2018

[40] [40]

Scribble-supervised lidar semantic segmentation

Unal, O., Dai, D., and Van Gool, L. Scribble-supervised lidar semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 2697--2707, 2022

work page 2022

[41] [41]

N., Kaiser, L., and Polosukhin, I

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems (NeurIPS), 2017

work page 2017

[42] [42]

Semi-supervised semantic segmentation using unreliable pseudo labels

Wang, Y., Wang, H., Shen, Y., Fei, J., Li, W., Jin, G., Wu, L., Zhao, R., and Le, X. Semi-supervised semantic segmentation using unreliable pseudo labels. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022

work page 2022

[43] [43]

Crest: A class-rebalancing self-training framework for imbalanced semi-supervised learning

Wei, C., Sohn, K., Mellina, C., Yuille, A., and Yang, F. Crest: A class-rebalancing self-training framework for imbalanced semi-supervised learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 10857--10866, 2021

work page 2021

[44] [44]

Point transformer v3: Simpler faster stronger

Wu, X., Jiang, L., Wang, P.-S., Liu, Z., Liu, X., Qiao, Y., Ouyang, W., He, T., and Zhao, H. Point transformer v3: Simpler faster stronger. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4840--4851, 2024

work page 2024

[45] [45]

Polarmix: A general data augmentation technique for lidar point clouds

Xiao, A., Huang, J., Guan, D., Cui, K., Lu, S., and Shao, L. Polarmix: A general data augmentation technique for lidar point clouds. Advances in Neural Information Processing Systems (NeurIPS), 35: 0 11035--11048, 2022

work page 2022

[46] [46]

Rpvnet: A deep and efficient range-point-voxel fusion network for lidar point cloud segmentation

Xu, J., Zhang, R., Dou, J., Zhu, Y., Sun, J., and Pu, S. Rpvnet: A deep and efficient range-point-voxel fusion network for lidar point cloud segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 16024--16033, 2021

work page 2021

[47] [47]

Frnet: Frustum-range networks for scalable lidar segmentation

Xu, X., Kong, L., Shuai, H., and Liu, Q. Frnet: Frustum-range networks for scalable lidar segmentation. IEEE Transactions on Image Processing, 34: 0 2173--2186, 2023

work page 2023

[48] [48]

4d contrastive superflows are dense 3d representation learners

Xu, X., Kong, L., Shuai, H., Zhang, W., Pan, L., Chen, K., Liu, Z., and Liu, Q. 4d contrastive superflows are dense 3d representation learners. In European Conference on Computer Vision (ECCV), pp.\ 58--80. Springer, 2024

work page 2024

[49] [49]

and Condurache, A

Yang, B. and Condurache, A. P. Flares: Fast and accurate lidar multi-range semantic segmentation. Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2026

work page 2026

[50] [50]

Tulip: Transformer for upsampling of lidar point clouds

Yang, B., Pfreundschuh, P., Siegwart, R., Hutter, M., Moghadam, P., and Patil, V. Tulip: Transformer for upsampling of lidar point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 15354--15364, 2024

work page 2024

[51] [51]

Yang, B., Abdelsamad, M., Zhang, M., and Condurache, A. P. Towards foundation models for 3d scene understanding: Instance-aware self-supervised learning for point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026

work page 2026

[52] [52]

M., and Lu, H

Zhang, Y., Xiang, T., Hospedales, T. M., and Lu, H. Deep mutual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 4320--4328, 2018

work page 2018

[53] [53]

Polarnet: An improved grid representation for online lidar point clouds semantic segmentation

Zhang, Y., Zhou, Z., David, P., Yue, X., Xi, Z., Gong, B., and Foroosh, H. Polarnet: An improved grid representation for online lidar point clouds semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9601--9610, 2020

work page 2020

[54] [54]

Fidnet: Lidar point cloud semantic segmentation with fully interpolation decoding

Zhao, Y., Bai, L., and Huang, X. Fidnet: Lidar point cloud semantic segmentation with fully interpolation decoding. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp.\ 4453--4458. Ieee, 2021

work page 2021

[55] [55]

A good student is cooperative and reliable: Cnn-transformer collaborative learning for semantic segmentation

Zhu, J., Luo, Y., Zheng, X., Wang, H., and Wang, L. A good student is cooperative and reliable: Cnn-transformer collaborative learning for semantic segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 11720--11730, 2023

work page 2023

[56] [56]

Knowledge distillation by on-the-fly native ensemble

Zhu, X., Gong, S., et al. Knowledge distillation by on-the-fly native ensemble. Advances in Neural Information Processing Systems (NeurIPS), 31, 2018

work page 2018

[57] [57]

Cylindrical and asymmetrical 3d convolution networks for lidar segmentation

Zhu, X., Zhou, H., Wang, T., Hong, F., Ma, Y., Li, W., Li, H., and Lin, D. Cylindrical and asymmetrical 3d convolution networks for lidar segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp.\ 9939--9948, 2021

work page 2021

[58] [58]

Unsupervised domain adaptation for semantic segmentation via class-balanced self-training

Zou, Y., Yu, Z., Kumar, B., and Wang, J. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training. In European Conference on Computer Vision (ECCV), pp.\ 289--305, 2018

work page 2018

[59] [59]

Confidence regularized self-training

Zou, Y., Yu, Z., Liu, X., Kumar, B., and Wang, J. Confidence regularized self-training. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp.\ 5982--5991, 2019

work page 2019