Multi-source diffusion models for simultaneous music generation and separation
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: method 1
citation-polarity summary
fields: cs.SD 2
years: 2026 2
verdicts: UNVERDICTED 2
roles: method 1
polarities: use method 1
representative citing papers
MAGE unifies text, visual, and audio-conditioned music generation and editing in one flow-based latent model with dynamic modality masking and cross-gated control.
citing papers explorer
- Remix the Timbre: Diffusion-Based Style Transfer Across Polyphonic Stems
MixtureTT performs direct per-stem timbre transfer on polyphonic mixtures via a shared diffusion transformer, outperforming single-stem baselines on SATB choral data while eliminating cascaded separation errors.
- MAGE: Modality-Agnostic Music Generation and Editing
MAGE unifies text, visual, and audio-conditioned music generation and editing in one flow-based latent model with dynamic modality masking and cross-gated control.