GDMD replaces raw-sample rewards with distillation-gradient rewards in RL-guided diffusion distillation, yielding 4-step models that surpass their multi-step teachers on GenEval and human preference metrics.
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition
4 Pith papers cite this work. Polarity classification is still indexing.
years
2026 4verdicts
UNVERDICTED 4representative citing papers
D-OPSD formulates supervised fine-tuning of step-distilled diffusion models as on-policy self-distillation by having the model act as both teacher (with multimodal context) and student (with text-only context) on its own roll-outs.
LucidNFT combines a new LR-referenced consistency reward, decoupled normalization, and a real-degradation dataset to improve perceptual quality in flow-matching super-resolution while preserving input fidelity.
OmniJigsaw is a self-supervised proxy task that reconstructs shuffled audio-visual clips via joint integration, sample-level selection, and clip-level masking strategies, yielding gains on 15 video, audio, and reasoning benchmarks.
citing papers explorer
No citing papers match the current filters.