Audio large language models can be descriptive speech quality evaluators
2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background 2
Citation-polarity summary
Years: 2026 (2)
Verdicts: UNVERDICTED 2
Roles: background 2
Polarities: background 2
Representative citing papers
JASTIN is an instruction-driven audio evaluation system that achieves state-of-the-art correlation with human ratings on speech, sound, music, and out-of-domain tasks without task-specific retraining.
citing papers explorer
- Character Beyond Speech: Leveraging Role-Playing Evaluation in Audio Large Language Models via Reinforcement Learning
  RoleJudge is a multidimensional evaluation framework for speech-character alignment in audio LLMs, backed by the RoleChat dataset and multi-stage RL training with standard alignment to reduce reward issues.
- JASTIN: Aligning LLMs for Zero-Shot Audio and Speech Evaluation via Natural Language Instructions
  JASTIN is an instruction-driven audio evaluation system that achieves state-of-the-art correlation with human ratings on speech, sound, music, and out-of-domain tasks without task-specific retraining.