The tool calls should be useful for the purpose and correctly align with the specified task, while unnecessary, irrelevant, or incorrect ones should not be executed.

Effective Tool Use Requirement: The tools should be utilized strategically to collect useful information, take effective actions for answering the question or accomplishing the

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Identifying the Risks of LM Agents with an LM-Emulated Sandbox

cs.AI · 2023-09-25 · unverdicted · novelty 7.0

ToolEmu uses LM-based tool emulation to test LM agents on 36 high-stakes tools and 144 cases, revealing that even the safest agent fails 23.9% of the time.

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents

cs.CL · 2024-03-05 · conditional · novelty 6.0

The InjecAgent benchmark demonstrates that tool-integrated LLM agents are vulnerable to indirect prompt injection attacks: attacks succeed against ReAct-prompted GPT-4 24% of the time, and at nearly twice that rate when the attacker instructions are reinforced.

citing papers explorer

Showing 2 of 2 citing papers.

Identifying the Risks of LM Agents with an LM-Emulated Sandbox

cs.AI · 2023-09-25 · unverdicted · none · ref 12

InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents

cs.CL · 2024-03-05 · conditional · none · ref 10
