ABKD: Pursuing a proper allocation of the probability mass in knowledge distillation via α-β-divergence

Guanghui Wang, Zhiyong Yang, Zitai Wang, Shi Wang, Qianqian Xu, Qingming Huang · 2025

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

cs.CL · 2026-02-03 · unverdicted · novelty 7.0

Showing 1 of 1 citing paper.

Towards Distillation-Resistant Large Language Models: An Information-Theoretic Perspective · cs.CL · 2026-02-03 · unverdicted · none · ref 16
A learned transformation matrix minimizes the conditional mutual information (CMI) in teacher logits, degrading distillation performance while preserving task accuracy.