Dual-forward path teacher knowledge distillation: Bridging the capacity gap between teacher and student

Li, T · 2025 · arXiv 2506.18244

2 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

LIFT and PLACE: A Simple, Stable, and Effective Knowledge Distillation Framework for Lightweight Diffusion Models

cs.CV · 2026-05-19 · unverdicted · novelty 6.0 · 3 refs

LIFT decomposes distillation into coarse linear alignment followed by fine refinement, while PLACE adds error-based local adaptation. Together they allow stable training of 1.3M-parameter students (1.6% of the teacher's size) to an FID of 15.73 across diffusion and flow models.

Beyond Dark Knowledge: Mixup-Based Distillation for Reliable Predictions

cs.CV · 2026-06-10 · unverdicted · novelty 5.0

Mixup applied only to the student during KD induces independent linearity acquisition that reduces overconfidence by an order of magnitude while improving accuracy, with calibration transferring separately from accuracy.
