Dual-head knowledge distillation partitions the linear classifier into separate heads for logit and probability losses to exploit logits without causing classification head collapse.
In Proceedings of the Conference on Computer Vision and Pattern Recognition
Cited by 1 Pith paper (field: cs.CV; year: 2024; verdict: unverdicted).
Representative citing paper:
Dual-Head Knowledge Distillation: Enhancing Logits Utilization with an Auxiliary Head
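The split described in the summary can be illustrated with a minimal forward-pass sketch. It assumes a shared feature vector feeding two linear heads: the main head receives the probability-based losses (cross-entropy and temperature-scaled KL divergence), and the auxiliary head receives a logit-level loss. The mean-squared-error logit loss, the temperature value, and all function names here are hypothetical stand-ins chosen for illustration, not the paper's actual formulation.

```python
import math


def linear(features, weights, bias):
    """Apply one linear head and return one logit per class."""
    return [sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(weights, bias)]


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with optional temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def logit_mse(student_logits, teacher_logits):
    """Assumed logit-level loss: mean squared error on raw logits."""
    n = len(student_logits)
    return sum((s - t) ** 2 for s, t in zip(student_logits, teacher_logits)) / n


def dual_head_losses(features, label, teacher_logits,
                     main_head, aux_head, temperature=4.0):
    """Compute the two losses, each routed to its own head.

    main_head and aux_head are (weights, bias) pairs over the same features.
    """
    main_logits = linear(features, *main_head)
    aux_logits = linear(features, *aux_head)

    # Probability-based losses touch only the main classification head.
    ce = cross_entropy(softmax(main_logits), label)
    kd = temperature ** 2 * kl_divergence(
        softmax(teacher_logits, temperature),
        softmax(main_logits, temperature),
    )

    # The logit-based loss touches only the auxiliary head.
    logit = logit_mse(aux_logits, teacher_logits)

    return {"probability": ce + kd, "logit": logit}


if __name__ == "__main__":
    features = [1.0, 2.0]
    head = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0])
    print(dual_head_losses(features, 2, [1.0, 2.0, 3.0], head, head))
```

In this sketch, changing the auxiliary head's weights leaves the probability loss unchanged, which is the separation the summary refers to: the logit loss cannot directly distort the head used for classification.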