CoMelSinger introduces a discrete token-based zero-shot SVS framework built on MaskGCT, with coarse-to-fine contrastive learning and an SVT module to improve melody control and reduce prosody leakage.
SingMOS: An extensive open-source singing voice dataset for MOS prediction
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
representative citing papers
MOS-Bench benchmark shows that existing SSQA models struggle with out-of-domain generalization and that training on multiple diverse datasets improves robustness.
MusicJudge is a modality-guided framework that performs block-aligned multimodal analysis for singing quality assessment by coupling lyrics with pitch-rhythm fidelity via multi-signal matching and Modality-Guided LoRA fine-tuning.
MOS models match humans on acoustic degradation but are insensitive to prosodic errors, and they show a double dissociation on speaker characteristics: a mean F0 bias alongside insensitivity to rate and F0 variability.
citing papers explorer
- Listening Like a Judge: A Music-Aware Framework for Automatic Singing Performance Evaluation