On the Generalization of Knowledge Distillation: An Information-Theoretic View
Knowledge distillation generalization bounds are derived via a new distillation divergence that measures the difference between the teacher's and the student's kernels; the bounds become tighter when the teacher's loss landscape is flat.
In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
2 Pith papers cite this work. Polarity classification is still in progress.
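The paper's distillation divergence is not reproduced here; as loose intuition for "teacher-student kernel difference", the sketch below compares normalized Gram (kernel) matrices computed from teacher and student features on a shared batch. The linear kernel, the Frobenius norm, and all names (`gram`, `kernel_divergence`) are illustrative assumptions, not definitions from the paper.

```python
import numpy as np

def gram(feats):
    """Linear-kernel Gram matrix K[i, j] = <f_i, f_j> for an (n, d) array."""
    return feats @ feats.T

def kernel_divergence(teacher_feats, student_feats):
    """Frobenius distance between norm-scaled teacher and student Gram
    matrices on the same n inputs. Illustrative only: the paper defines
    its own distillation divergence."""
    kt = gram(teacher_feats)
    ks = gram(student_feats)
    kt = kt / np.linalg.norm(kt)
    ks = ks / np.linalg.norm(ks)
    return np.linalg.norm(kt - ks)

# Toy usage: random stand-ins for teacher/student representations of 8 inputs.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 64))
student = rng.normal(size=(8, 16))
print(kernel_divergence(teacher, student))
```

Normalizing each Gram matrix before differencing makes the toy measure insensitive to overall feature scale, so it reflects the shape of the induced similarity structure rather than its magnitude.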
Citing papers:
- MLE-UVAD: Minimal Latent Entropy Autoencoder for Fully Unsupervised Video Anomaly Detection
  An autoencoder trained with a minimal latent entropy loss enables fully unsupervised video anomaly detection: normal latent embeddings concentrate in a compact region, so anomalous frames yield poor reconstructions (a toy sketch of this loss follows the list).
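Below is a minimal sketch of the idea in the MLE-UVAD summary: an autoencoder whose training loss adds a latent entropy penalty to the reconstruction error. The entropy term here is a Gaussian log-determinant proxy, and every name, size, and weight (`TinyAE`, `latent_entropy_proxy`, the 0.1 coefficient) is an assumption for illustration, not the paper's actual model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAE(nn.Module):
    """Minimal autoencoder over flattened frames (sizes are arbitrary)."""
    def __init__(self, d_in=1024, d_lat=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(),
                                 nn.Linear(128, d_lat))
        self.dec = nn.Sequential(nn.Linear(d_lat, 128), nn.ReLU(),
                                 nn.Linear(128, d_in))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

def latent_entropy_proxy(z, eps=1e-4):
    """Gaussian upper bound on the entropy of a batch of latents:
    0.5 * logdet(cov(z)) up to an additive constant. Driving it down
    concentrates the embeddings. This proxy is an assumption; the
    paper's entropy loss may be defined differently."""
    zc = z - z.mean(dim=0, keepdim=True)
    cov = zc.T @ zc / (z.shape[0] - 1)
    cov = cov + eps * torch.eye(z.shape[1], device=z.device)
    return 0.5 * torch.logdet(cov)

model = TinyAE()
x = torch.randn(64, 1024)              # stand-in batch of flattened frames
x_hat, z = model(x)
loss = F.mse_loss(x_hat, x) + 0.1 * latent_entropy_proxy(z)
loss.backward()
# At test time, frames with high reconstruction error are flagged as anomalous.
```

The log-determinant proxy is one of several ways to penalize latent entropy (histogram or kernel density estimates are alternatives); it is used here because it is differentiable and a few lines long, which matches the summary's mechanism of concentrating normal embeddings so that anomalies reconstruct poorly.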