Diffa: Large language diffusion models can listen and understand

Jiaming Zhou, Hongjie Chen, Shiwan Zhao, Jian Kang, Jie Li, Enzhi Wang, Yujie Guo, Haoqin Sun, Hui Wang, Aobo Kong, et al · 2025 · arXiv 2507.18452

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

representative citing papers

LaDA-Band: Language Diffusion Models for Vocal-to-Accompaniment Generation

cs.SD · 2026-04-13 · unverdicted · novelty 6.0

LaDA-Band applies discrete masked diffusion with dual-track conditioning and progressive training to generate vocal-to-accompaniment tracks that improve acoustic authenticity, global coherence, and dynamic orchestration over prior baselines.

Scaling Properties of Continuous Diffusion Spoken Language Models

cs.CL · 2026-04-27 · unverdicted · novelty 5.0

Continuous diffusion spoken language models follow scaling laws for loss and phoneme divergence and generate emotive multi-speaker speech at 16B scale, though long-form coherence stays difficult.

citing papers explorer

Showing 2 of 2 citing papers.

LaDA-Band: Language Diffusion Models for Vocal-to-Accompaniment Generation cs.SD · 2026-04-13 · unverdicted · none · ref 60
LaDA-Band applies discrete masked diffusion with dual-track conditioning and progressive training to generate vocal-to-accompaniment tracks that improve acoustic authenticity, global coherence, and dynamic orchestration over prior baselines.
Scaling Properties of Continuous Diffusion Spoken Language Models cs.CL · 2026-04-27 · unverdicted · none · ref 44
Continuous diffusion spoken language models follow scaling laws for loss and phoneme divergence and generate emotive multi-speaker speech at 16B scale, though long-form coherence stays difficult.

Diffa: Large language diffusion models can listen and understand

fields

years

verdicts

representative citing papers

citing papers explorer