SQ-LLM, trained on the new SpeechEval dataset of 32k multilingual clips with 128k annotations, enables LLMs to perform interpretable multi-task speech quality evaluation including assessment, comparison, improvement suggestions, and deepfake detection.
On a technical level, clarity is excellent, with no audible distortion, and the pacing and flow are well-managed, though slight volume adjustments could enhance consistency.
1 Pith paper cites this work (field: cs.SD; year: 2025; verdict: unverdicted). Representative citing paper:
SpeechLLM-as-Judges: Towards General and Interpretable Speech Quality Evaluation