ToolEmu uses LM-based tool emulation to test LM agents on a benchmark of 36 high-stakes tools and 144 test cases, revealing that even the safest agent fails 23.9% of the time.
The tool calls should be useful for the purpose and correctly align with the specified task, while unnecessary, irrelevant, or incorrect ones should not be executed.
2 Pith papers cite this work.
representative citing papers
The InjecAgent benchmark demonstrates that tool-integrated LLM agents are vulnerable to indirect prompt injection attacks: attacks succeed against ReAct-prompted GPT-4 24% of the time, and nearly twice as often when the attacker instructions are reinforced.
citing papers explorer
- Identifying the Risks of LM Agents with an LM-Emulated Sandbox
- InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents