Variational diffusion models. Advances in Neural Information Processing Systems, 34:21696–21707, 2021.
3 Pith papers cite this work.
years: 2026 (3)
citing papers explorer
- Step-level Denoising-time Diffusion Alignment with Multiple Objectives
  MSDDA derives a closed-form optimal reverse denoising distribution for multi-objective diffusion alignment that is exactly equivalent to step-level RL fine-tuning with no approximation error.
- Rethinking Cross-Layer Information Routing in Diffusion Transformers
  DAR replaces residual addition in DiTs with learnable timestep-adaptive non-incremental aggregation of sublayer outputs, improving FID by 2.11 on ImageNet 256x256 and accelerating convergence by 8.75x.
- Efficient Diffusion Distillation via Embedding Loss
  Embedding Loss aligns feature distributions via MMD in random network embeddings to boost one-step diffusion distillation, reaching SOTA FID of 1.475 on CIFAR-10 unconditional generation.