Canonical reference
TPTU: Task planning and tool usage of large language model-based AI agents
100% of citing Pith papers cite this work as background.
Citation polarity: background (5)
citing papers explorer
- Agent-First Tool API: A Semantic Interface Paradigm for Enterprise AI Agent Systems
  The Agent-First Tool API paradigm raises AI agent task success from 64% to 88% and cuts human interventions by 72.7% through semantic phases, structured contracts, and risk governance in a production enterprise system.
- Talk to Your Slides: High-Efficiency Slide Editing via Language-Driven Structured Data Manipulation
  Talk-to-Your-Slides uses language-driven structured data manipulation with a hierarchical architecture to edit slides, reporting 34% faster processing, 34% better instruction fidelity, and 87% lower cost than GUI-based baselines on text-centric tasks while releasing the TSBench benchmark.
- A Survey on Large Language Model based Autonomous Agents
  A survey of LLM-based autonomous agents that proposes a unified framework for their construction and reviews applications in social science, natural science, and engineering along with evaluation methods and future directions.
- Hierarchical Alignment: Enforcing Hierarchical Instruction-Following in LLMs through Logical Consistency
  NSHA improves LLM handling of hierarchical instruction conflicts by combining solver-guided constraint satisfaction at inference with distillation of those decisions into model parameters at training.
- From Human Memory to AI Memory: A Survey on Memory Mechanisms in the Era of LLMs
  The paper surveys human memory categories, maps them to LLM memory, and proposes a new three-dimension (object, form, time) categorization into eight quadrants to organize existing work and highlight open problems.
- Bridging Perception and Action: A Lightweight Multimodal Meta-Planner Framework for Robust Earth Observation Agents
  The LMMP framework improves tool-calling accuracy and task success rates for Earth observation agents by grounding plans in multimodal features and remote sensing expert knowledge via a two-stage training process.
- A Survey on the Memory Mechanism of Large Language Model based Agents
  A systematic review of memory designs, evaluation methods, applications, limitations, and future directions for LLM-based agents.
- A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications