SongEval: A benchmark dataset for song aesthetics evaluation
6 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 6
roles: baseline 1
polarities: baseline 1
citing papers explorer
- Polyphonia: Zero-Shot Timbre Transfer in Polyphonic Music with Acoustic-Informed Attention Calibration
  Polyphonia improves zero-shot stem-specific timbre transfer in polyphonic music by 15.5% target alignment via acoustic-informed attention calibration that uses probabilistic priors to set coarse boundaries.
- MIDI-Informed Singing Accompaniment Generation in a Compositional Song Pipeline
  MIDI-SAG generates consistent long-form singing accompaniments by feeding symbolic MIDI timing, chords, and structure labels into a compositional pipeline built from pre-trained modules.
- APEX: Large-scale Multi-task Aesthetic-Informed Popularity Prediction for AI-Generated Music
  APEX jointly predicts engagement-based popularity and five aesthetic quality dimensions for AI-generated music, improving human preference prediction on out-of-distribution generative systems.
- SongBench: A Fine-Grained Multi-Aspect Benchmark for Song Quality Assessment
  SongBench is a new fine-grained benchmark for song quality assessment with seven dimensions and an expert-annotated dataset of 11,717 samples showing high correlation with professional ratings.
- LaDA-Band: Language Diffusion Models for Vocal-to-Accompaniment Generation
  LaDA-Band applies discrete masked diffusion with dual-track conditioning and progressive training to generate vocal-to-accompaniment tracks that improve acoustic authenticity, global coherence, and dynamic orchestration over prior baselines.
- Zero-Effort Image-to-Music Generation: An Interpretable RAG-based VLM Approach
  A zero-training VLM framework generates music from images via ABC notation, multi-modal RAG, and self-refinement while providing text and visual explanations for the outputs.