MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
GBC treats multi-agent LLM workflows as differentiable graphs to enable token-level attribution and targeted optimization, with reported gains on MultiWOZ and τ-bench.
citing papers explorer
- SEATauBench: Adapting Tool-Agent-User Evaluation Into Low-Resource Southeast Asian Languages
SEATauBench is the first agent benchmark for SEA languages, finding that performance holds for language-only changes but degrades sharply with full domain localization.