When benign inputs lead to severe harms: Eliciting unsafe unintended behaviors of computer-use agents

Jaylen Jones, Zhehao Zhang, Yuting Ning, Eric Fosler-Lussier, Pierre-Luc St-Charles, Yoshua Bengio, Dawn Song, Yu Su, Huan Sun · 2026 · arXiv 2602.08235

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

citation-role summary

background 3 method 1

citation-polarity summary

support 2 background 1 use method 1

representative citing papers

The Authorization-Execution Gap Is a Major Safety and Security Problem in Open-World Agents

cs.CR · 2026-05-10 · conditional · novelty 6.0

Open-world agents suffer from an Authorization-Execution Gap arising from delegation incompleteness, channel corruption, and composition fragmentation, requiring dynamic runtime integrity checks instead of only upfront filters or post-hoc audits.

When Routine Chats Turn Toxic: Unintended Long-Term State Poisoning in Personalized Agents

cs.CR · 2026-05-07 · unverdicted · novelty 6.0

Routine user chats can unintentionally poison the long-term state of personalized LLM agents, causing authorization drift, tool escalation, and unchecked autonomy, as measured by a new benchmark and reduced by the StateGuard defense.

PageGuide: Browser extension to assist users in navigating a webpage and locating information

cs.HC · 2026-04-26 · accept · novelty 6.0

PageGuide grounds LLM answers in webpage DOM elements using visual overlays for find, guide, and hide modes, yielding measurable gains in a 94-user study.

Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability

cs.CL · 2026-05-08 · unverdicted · novelty 4.0

The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.

citing papers explorer

Showing 4 of 4 citing papers.

The Authorization-Execution Gap Is a Major Safety and Security Problem in Open-World Agents cs.CR · 2026-05-10 · conditional · none · ref 16
Open-world agents suffer from an Authorization-Execution Gap arising from delegation incompleteness, channel corruption, and composition fragmentation, requiring dynamic runtime integrity checks instead of only upfront filters or post-hoc audits.
When Routine Chats Turn Toxic: Unintended Long-Term State Poisoning in Personalized Agents cs.CR · 2026-05-07 · unverdicted · none · ref 13
Routine user chats can unintentionally poison the long-term state of personalized LLM agents, causing authorization drift, tool escalation, and unchecked autonomy, as measured by a new benchmark and reduced by the StateGuard defense.
PageGuide: Browser extension to assist users in navigating a webpage and locating information cs.HC · 2026-04-26 · accept · none · ref 20
PageGuide grounds LLM answers in webpage DOM elements using visual overlays for find, guide, and hide modes, yielding measurable gains in a 94-user study.
Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability cs.CL · 2026-05-08 · unverdicted · none · ref 176
The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.

When benign inputs lead to severe harms: Eliciting unsafe unintended behaviors of computer-use agents

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer