ArtifactNet extracts codec residuals from spectrograms with a 4M-parameter network to detect AI music at F1=0.9829 and 1.49% FPR on unseen tracks from 22 generators, outperforming larger baselines.
Music source separation in the waveform domain
7 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
The Spheres dataset provides multitrack orchestral recordings with isolated instrument stems and acoustic characterizations to support supervised machine learning for music source separation in the classical domain.
EnCodec is an end-to-end trained streaming neural audio codec that uses a single multiscale spectrogram discriminator and a gradient-normalizing loss balancer to achieve higher fidelity than prior methods at the same bitrates for 24 kHz mono and 48 kHz stereo audio.
A Conformer-conditioned decoder-only language model generates discrete tokens via a neural audio codec to separate four music stems, reaching near state-of-the-art perceptual quality and top NISQA on vocals in MUSDB18-HQ tests.
Diffusion-based refinement followed by consistency distillation improves music source separation quality and inference speed across U-Net and BS-RoFormer backbones on Slakh2100 and MUSDB18.
citing papers explorer
-
The Spheres Dataset: Multitrack Orchestral Recordings for Music Source Separation and Information Retrieval
The Spheres dataset provides multitrack orchestral recordings with isolated instrument stems and acoustic characterizations to support supervised machine learning for music source separation in the classical domain.