Mirrorguard: Toward secure computer-use agents via simulation-to-real reasoning correction

Wenqi Zhang, Yulin Shen, Changyue Jiang, Jiarun Dai, Geng Hong, Xudong Pan · 2026 · arXiv 2601.12822

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

citation-role summary

background 1

citation-polarity summary

support 1

representative citing papers

WARD: Adversarially Robust Defense of Web Agents Against Prompt Injections

cs.CR · 2026-05-14 · unverdicted · novelty 5.0

WARD is a guard model trained on 177K web samples and adversarially hardened via attacker-guard co-evolution to achieve high recall on prompt injections with low false positives and no added latency.

Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability

cs.CL · 2026-05-08 · unverdicted · novelty 4.0

The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.

citing papers explorer

Showing 2 of 2 citing papers.

WARD: Adversarially Robust Defense of Web Agents Against Prompt Injections cs.CR · 2026-05-14 · unverdicted · none · ref 79
WARD is a guard model trained on 177K web samples and adversarially hardened via attacker-guard co-evolution to achieve high recall on prompt injections with low false positives and no added latency.
Securing Computer-Use Agents: A Unified Architecture-Lifecycle Framework for Deployment-Grounded Reliability cs.CL · 2026-05-08 · unverdicted · none · ref 188
The paper develops a unified framework that organizes computer-use agent reliability around perception-decision-execution layers and creation-deployment-operation-maintenance stages to map security and alignment interventions.

Mirrorguard: Toward secure computer-use agents via simulation-to-real reasoning correction

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer