Audio2Tool is a new benchmark dataset that shows speech models perform well on simple commands but degrade sharply on compositional tasks and realistic acoustic noise.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2verdicts
UNVERDICTED 2representative citing papers
Raon-OpenTTS provides an open 510K-hour curated speech dataset and DiT-based TTS models up to 1B parameters that achieve competitive WER and speaker similarity on benchmarks versus closed models trained on millions of hours.
citing papers explorer
-
Raon-OpenTTS: Open Models and Data for Robust Text-to-Speech
Raon-OpenTTS provides an open 510K-hour curated speech dataset and DiT-based TTS models up to 1B parameters that achieve competitive WER and speaker similarity on benchmarks versus closed models trained on millions of hours.