LIFT decomposes distillation into coarse linear alignment followed by fine refinement, while PLACE adds error-based local adaptation. Together they allow stable training of 1.3M-parameter students (1.6% of teacher size) to FID 15.73 across diffusion and flow models.
Dual-forward path teacher knowledge distillation: Bridging the capacity gap between teacher and student
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
Mixup applied only to the student during KD induces independent linearity acquisition that reduces overconfidence by an order of magnitude while improving accuracy, with calibration transferring separately from accuracy.