Thirty-seventh Conference on Neural Information Processing Systems , year=

Toolformer: Language Models Can Teach Themselves to Use Tools , author=

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

browse 4 citing papers

representative citing papers

cs.AI · 2024-08-15 · conditional · novelty 7.0

Meta Agent Search uses a meta-agent to iteratively program novel agentic systems in code, producing agents that outperform state-of-the-art hand-designed ones across coding, science, and math while transferring across domains and models.

Design and Report Benchmarks for Knowledge Work

cs.AI · 2026-05-22 · unverdicted · novelty 6.0

Proposes a three-step benchmark design method (define work activity, specify tested setting, score work product) derived from work studies and O*NET, demonstrated via three case analyses.

JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents

cs.AI · 2026-04-20 · unverdicted · novelty 5.0

JTPRO co-optimizes prompts and tool descriptions via reflection to raise overall success rate by 5-20% over baselines on multi-tool benchmarks.

Towards Self-Improving Error Diagnosis in Multi-Agent Systems

cs.MA · 2026-04-19 · unverdicted · novelty 5.0

ErrorProbe introduces a self-improving pipeline for attributing semantic failures in LLM multi-agent systems to specific agents and steps via anomaly detection, backward tracing, and tool-grounded validation with verified episodic memory.

citing papers explorer

Showing 4 of 4 citing papers.

Automated Design of Agentic Systems cs.AI · 2024-08-15 · conditional · none · ref 71
Meta Agent Search uses a meta-agent to iteratively program novel agentic systems in code, producing agents that outperform state-of-the-art hand-designed ones across coding, science, and math while transferring across domains and models.
Design and Report Benchmarks for Knowledge Work cs.AI · 2026-05-22 · unverdicted · none · ref 110
Proposes a three-step benchmark design method (define work activity, specify tested setting, score work product) derived from work studies and O*NET, demonstrated via three case analyses.
JTPRO: A Joint Tool-Prompt Reflective Optimization Framework for Language Agents cs.AI · 2026-04-20 · unverdicted · none · ref 6
JTPRO co-optimizes prompts and tool descriptions via reflection to raise overall success rate by 5-20% over baselines on multi-tool benchmarks.
Towards Self-Improving Error Diagnosis in Multi-Agent Systems cs.MA · 2026-04-19 · unverdicted · none · ref 5
ErrorProbe introduces a self-improving pipeline for attributing semantic failures in LLM multi-agent systems to specific agents and steps via anomaly detection, backward tracing, and tool-grounded validation with verified episodic memory.

Thirty-seventh Conference on Neural Information Processing Systems , year=

fields

years

verdicts

representative citing papers

citing papers explorer