τ-voice: Benchmarking full-duplex voice agents on real-world domains

Soham Ray, Keshav Dhandhania, Victor Barres, Karthik Narasimhan · 2026 · arXiv 2603.13686

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background: 1 · dataset: 1

citation-polarity summary

background: 1 · use dataset: 1

representative citing papers

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents

cs.SD · 2026-05-13 · unverdicted · novelty 7.0 · 2 refs

EVA-Bench supplies a simulation engine for bot-to-bot voice dialogues plus two composite metrics (EVA-A for accuracy, EVA-X for experience), evaluated on 213 enterprise scenarios. It shows that no tested system exceeds 0.5 on both pass@1 scores.

TOBench: A Task-Oriented Omni-Modal Benchmark for Real-World Tool-Using Agents

cs.AI · 2026-05-16 · unverdicted · novelty 5.0

MM-ToolBench introduces 100 closed-loop multimodal tasks across two domains with 27 MCP servers and 324 tools, where agents must execute, inspect artifacts, and revise before final output.

A Survey of Audio Reasoning in Multimodal Foundation Models

eess.AS · 2026-05-20 · unverdicted · novelty 2.0

A survey that provides a unified formulation of audio reasoning and reviews advances across Audio-to-Text, Audio-to-Speech, Audio-Visual, and Agentic paradigms while discussing challenges and future directions.
