AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining

Haohe Liu, Yi Yuan, Xubo Liu, Xinhao Mei, Qiuqiang Kong, Qiao Tian, Yuping Wang, Wenwu Wang, Yuxuan Wang, Mark D Plumbley · arXiv 2305.12708

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

AudioCALM: Continuous Autoregressive Language Modeling for Universal Audio Generation

eess.AS · 2026-06-22 · unverdicted · novelty 7.0

AudioCALM presents a continuous autoregressive framework with flow-matching prediction and A-MoME architecture that unifies speech, sound, and music generation while matching modality-specific state-of-the-art performance.

citing papers explorer

AudioCALM: Continuous Autoregressive Language Modeling for Universal Audio Generation

eess.AS · 2026-06-22 · unverdicted · none · ref 20