hub Canonical reference

LLM With Tools: A Survey

· 2024 · arXiv 2409.18807

Canonical reference. 100% of citing Pith papers cite this work as background.

12 Pith papers citing it

Background 100% of classified citations

citation-role summary

background 7

citation-polarity summary

background 7

representative citing papers

Tools as Continuous Flow for Evolving Agentic Reasoning

cs.AI · 2026-05-08 · unverdicted · novelty 7.0

FlowAgent models tool chaining as continuous latent trajectory generation with conditional flow matching to deliver global planning, formal utility bounds, and better robustness on long-horizon tasks, plus a new plan-level benchmark.

Revisiting the Travel Planning Capabilities of Large Language Models

cs.AI · 2026-05-05 · unverdicted · novelty 7.0

LLMs extract explicit constraints effectively but struggle with implicit open-world requirements, structural biases in plans, and ineffective self-correction during travel planning.

From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems

cs.MA · 2025-06-05 · accept · novelty 7.0

A survey that defines Compound AI Systems, proposes a multi-dimensional taxonomy based on component roles and orchestration strategies, reviews four foundational paradigms, and identifies key challenges for future research.

GRAFT: Graph-Tokenized LLMs for Tool Planning

cs.LG · 2026-05-12 · unverdicted · novelty 6.0

GRAFT internalizes tool dependency graphs via dedicated special tokens in LLMs and applies on-policy context distillation to achieve higher exact sequence matching and dependency legality than prior external-graph methods.

Automating Structural Analysis Across Multiple Software Platforms Using Large Language Models

cs.SE · 2026-04-10 · unverdicted · novelty 6.0

A two-stage multi-agent LLM converts structural inputs to JSON then platform-specific scripts for ETABS, SAP2000, and OpenSees, achieving over 90% accuracy on 20 frame problems across ten trials.

Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning

cs.LG · 2026-05-11 · unverdicted · novelty 5.0 · 2 refs

SLIM dynamically optimizes the active external skill set in agentic RL via leave-one-skill-out marginal contribution estimates and lifecycle operations, delivering a 7.1% average gain over baselines on ALFWorld and SearchQA while showing some skills remain externally useful.

Evaluating Generative Models as Interactive Emergent Representations of Human-Like Collaborative Behavior

cs.RO · 2026-05-05 · unverdicted · novelty 5.0 · 2 refs

Embodied LLM agents exhibit emergent collaborative behaviors indicating mental models of partners in a color-matching game, detected via LLM judges and supported by positive user feedback.

Red Skills or Blue Skills? A Dive Into Skills Published on ClawHub

cs.CL · 2026-03-19 · unverdicted · novelty 4.0

Analysis of ClawHub shows language-based functional divides in agent skills, with over 30% flagged suspicious and submission-time documentation enabling 73% accurate risk prediction.

Exploiting Web Search Tools of AI Agents for Data Exfiltration

cs.CR · 2025-10-10 · unverdicted · novelty 4.0

Indirect prompt injection attacks remain effective on LLMs using web search tools, allowing data exfiltration and exposing ongoing weaknesses in current model defenses.

Large Language Model-Brained GUI Agents: A Survey

cs.AI · 2024-11-27 · unverdicted · novelty 4.0

A survey consolidating frameworks, data practices, large action models, benchmarks, applications, and research gaps in LLM-brained GUI agents.

Large Language Model Agent: A Survey on Methodology, Applications and Challenges

cs.CL · 2025-03-27 · accept · novelty 3.0

A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.

LLM-Powered AI Agent Systems and Their Applications in Industry

cs.AI · 2025-05-22 · unverdicted · novelty 2.0

A survey categorizing LLM-powered agent systems into software-based, physical, and hybrid types, covering industrial applications and challenges such as latency and security.

citing papers explorer

Showing 12 of 12 citing papers.

Tools as Continuous Flow for Evolving Agentic Reasoning · cs.AI · 2026-05-08 · unverdicted · none · ref 4
Revisiting the Travel Planning Capabilities of Large Language Models · cs.AI · 2026-05-05 · unverdicted · none · ref 14
From Standalone LLMs to Integrated Intelligence: A Survey of Compound AI Systems · cs.MA · 2025-06-05 · accept · none · ref 156
GRAFT: Graph-Tokenized LLMs for Tool Planning · cs.LG · 2026-05-12 · unverdicted · none · ref 1
Automating Structural Analysis Across Multiple Software Platforms Using Large Language Models · cs.SE · 2026-04-10 · unverdicted · none · ref 15
Dynamic Skill Lifecycle Management for Agentic Reinforcement Learning · cs.LG · 2026-05-11 · unverdicted · none · ref 48 · 2 links
Evaluating Generative Models as Interactive Emergent Representations of Human-Like Collaborative Behavior · cs.RO · 2026-05-05 · unverdicted · none · ref 29 · 2 links
Red Skills or Blue Skills? A Dive Into Skills Published on ClawHub · cs.CL · 2026-03-19 · unverdicted · none · ref 17
Exploiting Web Search Tools of AI Agents for Data Exfiltration · cs.CR · 2025-10-10 · unverdicted · none · ref 3
Large Language Model-Brained GUI Agents: A Survey · cs.AI · 2024-11-27 · unverdicted · none · ref 13
Large Language Model Agent: A Survey on Methodology, Applications and Challenges · cs.CL · 2025-03-27 · accept · none · ref 113
LLM-Powered AI Agent Systems and Their Applications in Industry · cs.AI · 2025-05-22 · unverdicted · none · ref 104
