CodecSep performs prompt-driven universal sound separation directly in neural audio codec latents by combining a frozen DAC backbone with a lightweight FiLM-conditioned Transformer masker driven by CLAP embeddings, yielding efficiency gains over AudioSep.
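As a rough illustration of the FiLM-conditioned masking idea named in the summary, the sketch below scales and shifts latent features with values derived from a prompt embedding, then applies a sigmoid mask to the latents. It is not the authors' implementation: the Transformer layers are omitted, random arrays stand in for DAC latents and CLAP embeddings, and the function name `film_mask`, the projection matrices `w_gamma` and `w_beta`, and all dimensions are hypothetical.

```python
import numpy as np

def film_mask(latents, prompt_emb, w_gamma, w_beta):
    """Apply FiLM conditioning, then a sigmoid mask, to codec latents.

    latents:    (T, D) array standing in for frozen-codec latent frames
    prompt_emb: (E,) array standing in for a text-prompt embedding
    w_gamma, w_beta: (E, D) hypothetical projection matrices
    """
    gamma = prompt_emb @ w_gamma               # (D,) per-channel scale
    beta = prompt_emb @ w_beta                 # (D,) per-channel shift
    modulated = gamma * latents + beta         # FiLM: feature-wise affine transform
    mask = 1.0 / (1.0 + np.exp(-modulated))    # sigmoid, values in (0, 1)
    return mask * latents                      # masked latents for the prompted source

# Toy inputs (assumed shapes, random values; no real codec or text encoder involved)
rng = np.random.default_rng(0)
T, D, E = 50, 16, 8
latents = rng.standard_normal((T, D))
prompt_emb = rng.standard_normal(E)
w_gamma = rng.standard_normal((E, D)) * 0.1
w_beta = rng.standard_normal((E, D)) * 0.1

separated = film_mask(latents, prompt_emb, w_gamma, w_beta)
```

In a full system along the lines the summary describes, the masked latents would be passed to the frozen codec's decoder to produce a waveform; that step is left out here.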
SDR - half-baked or well done? In ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 626-630.
CodecSep: Prompt-Driven Universal Sound Separation on Neural Audio Codec Latents