{"paper":{"title":"ASPI: Seeking Ambiguity Clarification Amplifies Prompt Injection Vulnerability in LLM Agents","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Seeking clarification on ambiguous tasks makes LLM agents far more vulnerable to prompt injection attacks.","cross_cats":["cs.AI"],"primary_cat":"cs.CR","authors_text":"Dileepa Lakshan, Heming Liu, Joseph Brandifino, Max Fenkell, Udari Madhushani Sehwag, Zhengyang Shan","submitted_at":"2026-05-17T08:30:45Z","abstract_excerpt":"Clarification-seeking behavior is widely regarded as a desirable property of LLM agents, enabling them to resolve ambiguity before acting on underspecified tasks. However, the security implications of this interaction pattern remain unexplored. We investigate whether the transition from standard execution to a clarification-seeking state increases an agent's susceptibility to prompt injection attacks. We introduce ASPI (Ambiguous-State Prompt Injection), a benchmark of 728 task-attack scenarios that isolates clarification as a distinct agent state and measures how this state transition affects"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Clarification-seeking consistently and substantially amplifies vulnerability. For instance, attack success rises from 1.8% to 34.0% for o3 and from 2.2% to 35.7% for Gemini-3-Flash.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The benchmark successfully isolates the clarification-seeking state transition as the sole variable, without introducing differences in prompt formatting, tool-return handling, or user-input channel that could independently affect attack success.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Clarification-seeking in LLM agents amplifies prompt injection attack success from ~2% to over 30% across ten frontier models in a new 728-scenario benchmark.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Seeking clarification on ambiguous tasks makes LLM agents far more vulnerable to prompt injection attacks.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"a2575414ba6f6d85befff8df115e51f1a974e38d6d4e7aa7b5c28494bcaeae54"},"source":{"id":"2605.17324","kind":"arxiv","version":1},"verdict":{"id":"a9d72b66-f75e-49c4-935e-0c409d4a8ac4","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T23:49:21.895873Z","strongest_claim":"Clarification-seeking consistently and substantially amplifies vulnerability. For instance, attack success rises from 1.8% to 34.0% for o3 and from 2.2% to 35.7% for Gemini-3-Flash.","one_line_summary":"Clarification-seeking in LLM agents amplifies prompt injection attack success from ~2% to over 30% across ten frontier models in a new 728-scenario benchmark.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The benchmark successfully isolates the clarification-seeking state transition as the sole variable, without introducing differences in prompt formatting, tool-return handling, or user-input channel that could independently affect attack success.","pith_extraction_headline":"Seeking clarification on ambiguous tasks makes LLM agents far more vulnerable to prompt injection attacks."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.17324/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"doi_compliance","ran_at":"2026-05-20T00:03:22.601099Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-20T00:01:20.656867Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"claim_evidence","ran_at":"2026-05-19T21:41:57.814690Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-19T21:33:23.747477Z","status":"skipped","version":"1.0.0","findings_count":0}],"snapshot_sha256":"9061192cceb79a5517fdaa6ed889b86440434ba6adeca2461aba28741b40a124"},"references":{"count":116,"sample":[{"doi":"","year":2026,"title":"ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection , author=. 2026 , eprint=","work_id":"","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification , author=. 2026 , eprint=","work_id":"","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"AttriGuard: Defeating Indirect Prompt Injection in LLM Agents via Causal Attribution of Tool Invocations , author=. 2026 , eprint=","work_id":"8c36265d-4efc-4d5e-9837-3b8e7820748a","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition , author=. 2026 , eprint=","work_id":"c59e5b3d-990f-40ac-80b2-b26c88a44bc3","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"Philippe Laban and Hiroaki Hayashi and Yingbo Zhou and Jennifer Neville , booktitle=. 2026 , url=","work_id":"c12b3b1f-1749-4d8e-87e2-5fb93867aacf","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":84,"snapshot_sha256":"0ad92086471926599f4fdf497130ce8102727224df8e97e31d9d976cdc01993b","internal_anchors":10},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}