arXiv: 2605.13143 · 2 revisions
On the Generalization of Knowledge Distillation: An Information-Theoretic View