JFDL allows pre-trained Consistency Models to perform guided image generation post-hoc by aligning flow distributions, reducing FID scores on CIFAR-10 and ImageNet without needing a teacher model.
Prodiff: Progressive fast diffusion model for high-quality text-to-speech
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Continuous diffusion spoken language models follow scaling laws for loss and phoneme divergence and generate emotive multi-speaker speech at 16B scale, though long-form coherence stays difficult.
citing papers explorer
-
Post-Hoc Guidance for Consistency Models by Joint Flow Distribution Learning
JFDL allows pre-trained Consistency Models to perform guided image generation post-hoc by aligning flow distributions, reducing FID scores on CIFAR-10 and ImageNet without needing a teacher model.
-
Scaling Properties of Continuous Diffusion Spoken Language Models
Continuous diffusion spoken language models follow scaling laws for loss and phoneme divergence and generate emotive multi-speaker speech at 16B scale, though long-form coherence stays difficult.