Title resolution pending

Maintain the core intent while enhancing professionalism

1 Pith paper cites this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

GTA-2: Benchmarking General Tool Agents from Atomic Tool-Use to Open-Ended Workflows

cs.CL · 2026-04-17 · conditional · novelty 7.0

GTA-2 benchmark shows frontier models achieve below 50% on atomic tool tasks and only 14.39% success on realistic long-horizon workflows, with execution harnesses like Manus providing substantial gains.

citing papers explorer

Showing 1 of 1 citing paper.

GTA-2: Benchmarking General Tool Agents from Atomic Tool-Use to Open-Ended Workflows

cs.CL · 2026-04-17 · conditional · none · ref 78
