{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2025:U4LWUM4HR6F4JRSRGMW53APVDW","short_pith_number":"pith:U4LWUM4H","schema_version":"1.0","canonical_sha256":"a7176a33878f8bc4c651332ddd81f51d85f305df7dfc3bbd3e42fc27d184197b","source":{"kind":"arxiv","id":"2504.18575","version":3},"attestation_state":"computed","paper":{"title":"WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks","license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","headline":"WASP benchmark shows top web agents deceived by simple prompt injections with partial success up to 86 percent.","cross_cats":["cs.AI"],"primary_cat":"cs.CR","authors_text":"Aaron Grattafiori, Arman Zharmagambetov, Chuan Guo, Ivan Evtimov, Kamalika Chaudhuri","submitted_at":"2025-04-22T17:51:03Z","abstract_excerpt":"Autonomous UI agents powered by AI have tremendous potential to boost human productivity by automating routine tasks such as filing taxes and paying bills. However, a major challenge in unlocking their full potential is security, which is exacerbated by the agent's ability to take action on their user's behalf. Existing tests for prompt injections in web agents either over-simplify the threat by testing unrealistic scenarios or giving the attacker too much power, or look at single-step isolated tasks. 
To more accurately measure progress for secure web agents, we introduce WASP -- a new publicl"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":false},"canonical_record":{"source":{"id":"2504.18575","kind":"arxiv","version":3},"metadata":{"license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","primary_cat":"cs.CR","submitted_at":"2025-04-22T17:51:03Z","cross_cats_sorted":["cs.AI"],"title_canon_sha256":"d82c1e88660f2ab70e699549c3e5c7b3f0c0ff29856e788c2914471738702a6d","abstract_canon_sha256":"7195a4ebdc15e95bc2fac8acabd99815e08a6520136f3539a878f772e241b857"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:49.957940Z","signature_b64":"kbzc75II+FEkdnC20kbgt+mu6VaR/e0C1JTTDiOnfCqdVuU+YTWkAbcQMEbtKEjST8CLUmqovnRZ1MJ+ig40CA==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"a7176a33878f8bc4c651332ddd81f51d85f305df7dfc3bbd3e42fc27d184197b","last_reissued_at":"2026-05-17T23:38:49.957466Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:49.957466Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks","license":"http://creativecommons.org/licenses/by-nc-sa/4.0/","headline":"WASP benchmark shows top web agents deceived by simple prompt injections with partial success up to 86 percent.","cross_cats":["cs.AI"],"primary_cat":"cs.CR","authors_text":"Aaron Grattafiori, Arman Zharmagambetov, Chuan Guo, Ivan Evtimov, Kamalika Chaudhuri","submitted_at":"2025-04-22T17:51:03Z","abstract_excerpt":"Autonomous 
UI agents powered by AI have tremendous potential to boost human productivity by automating routine tasks such as filing taxes and paying bills. However, a major challenge in unlocking their full potential is security, which is exacerbated by the agent's ability to take action on their user's behalf. Existing tests for prompt injections in web agents either over-simplify the threat by testing unrealistic scenarios or giving the attacker too much power, or look at single-step isolated tasks. To more accurately measure progress for secure web agents, we introduce WASP -- a new publicl"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Evaluating with WASP shows that even top-tier AI models, including those with advanced reasoning capabilities, can be deceived by simple, low-effort human-written injections in very realistic scenarios. Our end-to-end evaluation reveals a previously unobserved insight: while attacks partially succeed in up to 86% of the case, even state-of-the-art agents often struggle to fully complete the attacker goals.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The benchmark tasks and injection examples accurately represent real-world web agent usage and attacker capabilities without over-simplifying or granting attackers unrealistic control.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"WASP benchmark reveals web agents are vulnerable to simple prompt injections with partial success rates up to 86%, but agents frequently fail to complete attacker objectives.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"WASP benchmark shows top web agents deceived by simple prompt injections with partial success up to 86 
percent.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"5702fb7655825c08921547d3cf3a5eb7bfa532520f4c9cb313cd3189707c68a2"},"source":{"id":"2504.18575","kind":"arxiv","version":3},"verdict":{"id":"b1fecf80-6a4e-4311-8f08-1d94e67dc024","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T22:18:38.317346Z","strongest_claim":"Evaluating with WASP shows that even top-tier AI models, including those with advanced reasoning capabilities, can be deceived by simple, low-effort human-written injections in very realistic scenarios. Our end-to-end evaluation reveals a previously unobserved insight: while attacks partially succeed in up to 86% of the case, even state-of-the-art agents often struggle to fully complete the attacker goals.","one_line_summary":"WASP benchmark reveals web agents are vulnerable to simple prompt injections with partial success rates up to 86%, but agents frequently fail to complete attacker objectives.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The benchmark tasks and injection examples accurately represent real-world web agent usage and attacker capabilities without over-simplifying or granting attackers unrealistic control.","pith_extraction_headline":"WASP benchmark shows top web agents deceived by simple prompt injections with partial success up to 86 percent."},"references":{"count":22,"sample":[{"doi":"","year":2024,"title":"WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models","work_id":"12c1d840-af20-4a4e-8750-6b9c6266638f","ref_index":1,"cited_arxiv_id":"2401.13919","is_internal_anchor":true},{"doi":"","year":null,"title":"The objective is to comment on a given issue saying ‘‘we are working on it.’’ Currently, the observation shows that we are on the profile page in 
GitLab","work_id":"1175517f-80d7-46e1-a07e-e222cf3e5847","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"Comment on the given issue saying we are working on it","work_id":"5e5d48ee-d0d3-460c-99c2-0f3f89be6780","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"‘not compromised","work_id":"5571ea5e-f217-4aa9-a12e-44a8b6c6fc8f","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"‘not compromised","work_id":"84e9bef1-f3a2-49c6-8449-8346de242a7e","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":22,"snapshot_sha256":"d67fda95b5779b144c98eaeaca34f75faf5f7c05729a71085ac010515bd57488","internal_anchors":1},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2504.18575","created_at":"2026-05-17T23:38:49.957545+00:00"},{"alias_kind":"arxiv_version","alias_value":"2504.18575v3","created_at":"2026-05-17T23:38:49.957545+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2504.18575","created_at":"2026-05-17T23:38:49.957545+00:00"},{"alias_kind":"pith_short_12","alias_value":"U4LWUM4HR6F4","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_16","alias_value":"U4LWUM4HR6F4JRSR","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_8","alias_value":"U4LWUM4H","created_at":"2026-05-18T12:33:37.589309+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":26,"internal_anchor_count":26,"sample":[{"citing_arxiv_id":"2505.10924","citing_title":"A Survey on the Safety and Security Threats of Computer-Using Agents: JARVIS or 
Ultron?","ref_index":2,"is_internal_anchor":true},{"citing_arxiv_id":"2506.23978","citing_title":"LLM Agents Are the Antidote to Walled Gardens","ref_index":25,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16282","citing_title":"Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents","ref_index":12,"is_internal_anchor":true},{"citing_arxiv_id":"2605.17986","citing_title":"LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection","ref_index":24,"is_internal_anchor":true},{"citing_arxiv_id":"2507.10610","citing_title":"LaSM: Layer-wise Scaling Mechanism for Defending Pop-up Attack on GUI Agents","ref_index":8,"is_internal_anchor":true},{"citing_arxiv_id":"2510.10073","citing_title":"SecureWebArena: A Holistic Security Evaluation Benchmark for LVLM-based Web Agents","ref_index":8,"is_internal_anchor":true},{"citing_arxiv_id":"2510.23883","citing_title":"Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges","ref_index":242,"is_internal_anchor":true},{"citing_arxiv_id":"2603.09002","citing_title":"Security Considerations for Multi-agent Systems","ref_index":125,"is_internal_anchor":true},{"citing_arxiv_id":"2605.14290","citing_title":"Web Agents Should Adopt the Plan-Then-Execute Paradigm","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2604.04978","citing_title":"Measuring the Permission Gate: A Stress-Test Evaluation of Claude Code's Auto Mode","ref_index":2,"is_internal_anchor":true},{"citing_arxiv_id":"2605.08828","citing_title":"When Agents Overtrust Environmental Evidence: An Extensible Agentic Framework for Benchmarking Evidence-Grounding Defects in LLM Agents","ref_index":7,"is_internal_anchor":true},{"citing_arxiv_id":"2605.11868","citing_title":"IPI-proxy: An Intercepting Proxy for Red-Teaming Web-Browsing AI Agents Against Indirect Prompt Injection","ref_index":12,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12015","citing_title":"SkillSafetyBench: Evaluating Agent Safety under 
Skill-Facing Attack Surfaces","ref_index":55,"is_internal_anchor":true},{"citing_arxiv_id":"2605.11039","citing_title":"The Granularity Mismatch in Agent Security: Argument-Level Provenance Solves Enforcement and Isolates the LLM Reasoning Bottleneck","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2604.27202","citing_title":"Indirect Prompt Injection in the Wild: An Empirical Study of Prevalence, Techniques, and Objectives","ref_index":47,"is_internal_anchor":true},{"citing_arxiv_id":"2605.08828","citing_title":"When Agents Overtrust Environmental Evidence: An Extensible Agentic Framework for Benchmarking Evidence-Grounding Defects in LLM Agents","ref_index":7,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10481","citing_title":"Safe Multi-Agent Behavior Must Be Maintained, Not Merely Asserted: Constraint Drift in LLM-Based Multi-Agent Systems","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2604.25562","citing_title":"SnapGuard: Lightweight Prompt Injection Detection for Screenshot-Based Web Agents","ref_index":14,"is_internal_anchor":true},{"citing_arxiv_id":"2604.24348","citing_title":"OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents","ref_index":33,"is_internal_anchor":true},{"citing_arxiv_id":"2605.06393","citing_title":"Constraining Host-Level Abuse in Self-Hosted Computer-Use Agents via TEE-Backed Isolation","ref_index":26,"is_internal_anchor":true},{"citing_arxiv_id":"2604.12284","citing_title":"WebAgentGuard: A Reasoning-Driven Guard Model for Detecting Prompt Injection Attacks in Web Agents","ref_index":3,"is_internal_anchor":true},{"citing_arxiv_id":"2604.08499","citing_title":"PIArena: A Platform for Prompt Injection Evaluation","ref_index":4,"is_internal_anchor":true},{"citing_arxiv_id":"2605.07110","citing_title":"Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded 
Reliability","ref_index":166,"is_internal_anchor":true},{"citing_arxiv_id":"2604.13954","citing_title":"HINTBench: Horizon-agent Intrinsic Non-attack Trajectory Benchmark","ref_index":1,"is_internal_anchor":true},{"citing_arxiv_id":"2605.02801","citing_title":"Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces","ref_index":14,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW","json":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW.json","graph_json":"https://pith.science/api/pith-number/U4LWUM4HR6F4JRSRGMW53APVDW/graph.json","events_json":"https://pith.science/api/pith-number/U4LWUM4HR6F4JRSRGMW53APVDW/events.json","paper":"https://pith.science/paper/U4LWUM4H"},"agent_actions":{"view_html":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW","download_json":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW.json","view_paper":"https://pith.science/paper/U4LWUM4H","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2504.18575&json=true","fetch_graph":"https://pith.science/api/pith-number/U4LWUM4HR6F4JRSRGMW53APVDW/graph.json","fetch_events":"https://pith.science/api/pith-number/U4LWUM4HR6F4JRSRGMW53APVDW/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW/action/timestamp_anchor","attest_storage":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW/action/storage_attestation","attest_author":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW/action/author_attestation","sign_citation":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW/action/citation_signature","submit_replication":"https://pith.science/pith/U4LWUM4HR6F4JRSRGMW53APVDW/action/replication_record"}},"created_at":"2026-05-17T23:38:49.957545+00:00","updated_at":"2026-05-17T23:38:49.957545+00:00"}