arxiv: 2606.20101 · 2 revisions
Hybrid Diffusion Transformer for Instruction-Guided Audio Editing via Rectified Flow