Presents a layer- and point-wise projection mapping for manifold-based logit distillation combined with LoRA to enable low-parameter student training with reported WER gains.
Teacher-student compression with generative adversar- ial networks.arXiv preprint arXiv:1812.02271, 2018a
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.LG 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Logit Distillation on Manifolds: Mapping by Learning
Presents a layer- and point-wise projection mapping for manifold-based logit distillation combined with LoRA to enable low-parameter student training with reported WER gains.