Uniflow-audio: Unified flow matching for audio generation from omni-modalities.ArXiv, abs/2509.24391, 2025

Xuenan Xu, Jiahao Mei, Zihao Zheng, Ye Tao, Zeyu Xie, et al · 2025 · arXiv 2509.24391

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

representative citing papers

VidAudio-Bench: Benchmarking V2A and VT2A Generation across Four Audio Categories

cs.SD · 2026-04-12 · unverdicted · novelty 7.0

VidAudio-Bench benchmarks V2A and VT2A models across four audio categories, revealing poor speech/singing performance and a tension between visual alignment and text instruction following.

Omni2Sound: Towards Unified Video-Text-to-Audio Generation

cs.SD · 2026-01-06 · unverdicted · novelty 7.0

A single DiT-based diffusion model unifies video-to-audio, text-to-audio, and joint video-text-to-audio generation, supported by a new 470k-pair dataset and three-stage progressive training that resolves task competition.

citing papers explorer

Showing 2 of 2 citing papers.

VidAudio-Bench: Benchmarking V2A and VT2A Generation across Four Audio Categories cs.SD · 2026-04-12 · unverdicted · none · ref 64
VidAudio-Bench benchmarks V2A and VT2A models across four audio categories, revealing poor speech/singing performance and a tension between visual alignment and text instruction following.
Omni2Sound: Towards Unified Video-Text-to-Audio Generation cs.SD · 2026-01-06 · unverdicted · none · ref 29
A single DiT-based diffusion model unifies video-to-audio, text-to-audio, and joint video-text-to-audio generation, supported by a new 470k-pair dataset and three-stage progressive training that resolves task competition.

Uniflow-audio: Unified flow matching for audio generation from omni-modalities.ArXiv, abs/2509.24391, 2025

fields

years

verdicts

representative citing papers

citing papers explorer