MMAE is a new multitask audio editing benchmark showing that leading models achieve under 5% exact match rate, with 0% on complex mixed-modality tasks.
MMEDIT: A Unified Framework for Multi-Type Audio Editing via Audio Language Model
2 Pith papers cite this work.
Years: 2026 (2)
Representative citing papers
UNISON: A Unified Sound Generation and Editing Framework via Deep LLM Fusion
UNISON introduces a unified latent diffusion framework with layer-wise LLM fusion and channel-mask task encoding for multiple speech and sound generation and editing tasks.