Toward Robust Semi-supervised Regression via Dual-stream Knowledge Distillation

Hezhe Qiao; Lin Chen; Wei Huang; Ye Su


Semi-supervised regression (SSR), which aims to predict continuous scores for samples while reducing the reliance on large-scale labeled data, has recently attracted considerable attention across various applications, including computer vision, natural language processing, audio analysis, and medical analysis. Existing SSR methods typically train models with scarce labeled data by introducing constraint-based regularization or ordinal ranking to mitigate overfitting. However, these approaches often fail to fully exploit the abundance of unlabeled samples. Although consistency-driven pseudo-labeling methods attempt to incorporate unlabeled data, their performance is highly sensitive to pseudo-label quality and noisy predictions. To address these challenges, we propose a Dual-stream Knowledge Distillation framework (DKD), which is specifically designed for SSR to distill both continuous-valued knowledge and distributional information. This design better preserves regression magnitude information and improves sample efficiency. Specifically, in DKD, the teacher is optimized solely with ground-truth labels for label distribution estimation, while the student learns from a mixture of real labels and teacher-generated pseudo targets on unlabeled data. The distillation process enables effective supervision transfer, allowing the student to leverage pseudo labels more robustly. Furthermore, we introduce a Decoupled Distribution Alignment (DDA) module, which separately aligns the target and non-target distributions between the teacher and student. To improve the reliability of non-target knowledge transfer, DDA incorporates a variance-guided non-target distribution alignment strategy that adaptively downweights uncertain teacher predictions, thereby enhancing the student's ability to mitigate noise in pseudo-label supervision and learn a better-calibrated regression predictor.
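The abstract gives no formulas, so the following is only a minimal sketch of how the two distillation streams and the variance-guided weighting could fit together on unlabeled samples. It rests on several assumptions that are not taken from the paper: the regression range is discretized into bins so that teacher and student each output a distribution, the continuous-valued stream is a squared error between expected values, the "target" bin is the one closest to the teacher's pseudo target, and uncertain teachers are downweighted by `exp(-variance)`. The names `dual_stream_loss` and `bin_centers` are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dual_stream_loss(teacher_logits, student_logits, bin_centers, eps=1e-8):
    """Illustrative loss on unlabeled samples (not the authors' formulation).

    teacher_logits, student_logits: arrays of shape (N, K) over K bins.
    bin_centers: array of shape (K,) with the value each bin represents.
    Returns (mean loss, per-sample non-target weights).
    """
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)

    # Continuous-valued stream (assumed): match expected regression values.
    y_t = p_t @ bin_centers          # teacher pseudo target
    y_s = p_s @ bin_centers          # student prediction
    value_loss = (y_s - y_t) ** 2

    # Teacher predictive variance; the exp(-var) weighting is an assumption.
    var_t = np.maximum(p_t @ bin_centers**2 - y_t**2, 0.0)
    weight = np.exp(-var_t)

    # Target bin (assumed): the bin closest to the teacher pseudo target.
    rows = np.arange(p_t.shape[0])
    target = np.abs(bin_centers[None, :] - y_t[:, None]).argmin(axis=1)
    pt, ps = p_t[rows, target], p_s[rows, target]

    # Target alignment: binary KL between (target, rest) probabilities.
    target_kl = (pt * np.log((pt + eps) / (ps + eps))
                 + (1 - pt) * np.log((1 - pt + eps) / (1 - ps + eps)))

    # Non-target alignment: KL between distributions renormalized
    # over the non-target bins only.
    mask = np.ones_like(p_t, dtype=bool)
    mask[rows, target] = False
    q_t = np.where(mask, p_t, 0.0) / (1 - pt + eps)[:, None]
    q_s = np.where(mask, p_s, 0.0) / (1 - ps + eps)[:, None]
    non_target_kl = (q_t * np.log((q_t + eps) / (q_s + eps))).sum(axis=1)

    total = value_loss + target_kl + weight * non_target_kl
    return total.mean(), weight
```

In this sketch a teacher whose distribution is spread across many bins receives a smaller non-target weight than a sharply peaked one, which is one simple way to realize the adaptive downweighting of uncertain teacher predictions that the abstract describes. On labeled samples the student would additionally be trained against ground-truth labels, which is omitted here.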
