EVA-Bench introduces a simulation-plus-scoring framework for voice agents that reveals no tested system exceeds 0.5 on both accuracy and experience metrics at pass@1.
Judgments concerning the valence of inter-turn silence across speakers of American English, Italian, and Japanese.Discourse Processes, 48(5):331–354, 2011
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.SD 1years
2026 1verdicts
ACCEPT 1representative citing papers
citing papers explorer
-
EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents
EVA-Bench introduces a simulation-plus-scoring framework for voice agents that reveals no tested system exceeds 0.5 on both accuracy and experience metrics at pass@1.