Stylus achieves training-free music style transfer on Mel-spectrograms by repurposing image diffusion models via style-key injection in self-attention plus phase-preserving reconstruction, outperforming baselines by 34.1% in content preservation and 25.7% in perceptual quality per 2,925 human raters.
For MusicTI [6], we trained the style encoder following the author’s protocol; for MusicGen [9], content audio served as the melody guide with text-based style descriptions.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1)
Years: 2024 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
Repurposing Image Diffusion Models for Training-Free Music Style Transfer on Mel-spectrograms