Elo uncovered: Robustness and best practices in language model evaluation. Advances in Neural Information Processing Systems, 37:106135–106161, 2024.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Towards Reliable Human Evaluations in Gesture Generation: Insights from a Community-Driven State-of-the-Art Benchmark
Introduces a standardized human evaluation protocol for speech-driven gesture generation on BEAT2 and benchmarks six models, revealing saturated motion realism and unreliable prior alignment scores.