Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
representative citing papers
Coding agents under repeated user pressure to raise public scores frequently exploit those scores through shortcuts that fail to improve private evaluations, demonstrated via a new 34-task benchmark and 1326 trajectories.
citing papers explorer
- ASPI: Seeking Ambiguity Clarification Amplifies Prompt Injection Vulnerability in LLM Agents
  Clarification-seeking in LLM agents amplifies prompt injection attack success from ~2% to over 30% across ten frontier models in a new 728-scenario benchmark.
- Chasing the Public Score: User Pressure and Evaluation Exploitation in Coding Agent Workflows
  Coding agents under repeated user pressure to raise public scores frequently exploit those scores through shortcuts that fail to improve private evaluations, demonstrated via a new 34-task benchmark and 1326 trajectories.