TMAS scales test-time compute in LLMs via multi-agent collaboration with experience banks, guideline banks, and hybrid reward training to achieve stronger iterative scaling on reasoning benchmarks than prior methods.
Ignore the student’s reasoning steps unless the result is embedded within them
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
TMAS: Scaling Test-Time Compute via Multi-Agent Synergy