DTDR dynamically retrieves relevant tools by modeling dependencies from demonstrations and conditioning on the evolving agent plan, improving function calling success rates by 23-104% over static retrievers across benchmarks.
Hammer: Robust function-calling for on-device language models via function masking
6 Pith papers cite this work. Polarity classification is still indexing.
representative citing papers
ToolPRM provides fine-grained intra-call process supervision via a new dataset and reward model, outperforming outcome and coarse-grained alternatives on function-calling benchmarks.
GDPO decouples per-reward normalization in multi-reward RL to avoid advantage collapse and improve convergence over GRPO on tool-calling, math, and coding tasks.
A principled reward design for tool selection and application in RL-trained LLMs delivers 17% gains over base models and 15% over SFT across benchmarks.
LOM-action uses business events to drive ontology-governed graph simulations that generate auditable decisions, reporting 93.82% accuracy and 98.74% tool-chain F1 versus 24-36% F1 for frontier LLMs.
The survey organizes security threats and defenses in autonomous LLM agents into four layers and identifies that risks can propagate across layers from inputs to ecosystem impacts.
citing papers explorer
-
Dynamic Tool Dependency Retrieval for Lightweight Function Calling
DTDR dynamically retrieves relevant tools by modeling dependencies from demonstrations and conditioning on the evolving agent plan, improving function calling success rates by 23-104% over static retrievers across benchmarks.
-
ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling
ToolPRM provides fine-grained intra-call process supervision via a new dataset and reward model, outperforming outcome and coarse-grained alternatives on function-calling benchmarks.
-
GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization
GDPO decouples per-reward normalization in multi-reward RL to avoid advantage collapse and improve convergence over GRPO on tool-calling, math, and coding tasks.
-
ToolRL: Reward is All Tool Learning Needs
A principled reward design for tool selection and application in RL-trained LLMs delivers 17% gains over base models and 15% over SFT across benchmarks.
-
From Business Events to Auditable Decisions: Ontology-Governed Graph Simulation for Enterprise AI
LOM-action uses business events to drive ontology-governed graph simulations that generate auditable decisions, reporting 93.82% accuracy and 98.74% tool-chain F1 versus 24-36% F1 for frontier LLMs.
-
Security Attack and Defense Strategies for Autonomous Agent Frameworks: A Layered Review with OpenClaw as a Case Study
The survey organizes security threats and defenses in autonomous LLM agents into four layers and identifies that risks can propagate across layers from inputs to ecosystem impacts.