The MTG-Jamendo Dataset for Automatic Music Tagging
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
Instrumental Text-to-Music Generation with Auxiliary Conditioning Branches
Auxiliary lyric and timbre branches improve instrumental text-to-music generation quality in a controlled DiT setting even with degenerate inputs, outperforming parameter-reallocated depth variants and external baselines in objective and MOS evaluations.