Caution for the environment: Multimodal agents are susceptible to environmental distractions

Xinbei Ma, Yiting Wang, Yao Yao, Tongxin Yuan, Aston Zhang, Zhuosheng Zhang, Hai Zhao · 2024 · arXiv 2408.02544

5 Pith papers cite this work. Polarity classification is still indexing.

5 Pith papers citing it

read on arXiv browse 5 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

GUIGuard-Bench: Toward a General Evaluation for Privacy-Preserving GUI Agents

cs.CR · 2026-01-26 · unverdicted · novelty 8.0

GUIGuard-Bench is a new benchmark with annotated GUI screenshots that measures privacy recognition, planning fidelity under protection, and utility impact for trajectory-based GUI agents.

HLL: Can Agents Cross Humanity's Last Line of Verification?

cs.AI · 2026-06-01 · unverdicted · novelty 7.0

HLL is a new benchmark that evaluates eight frontier multimodal agents on closed-loop interactive CAPTCHA solving, showing sharp performance drops under realism stressors and trace validation.

Turing Test on Screen: A Benchmark for Mobile GUI Agent Humanization

cs.AI · 2026-02-24 · unverdicted · novelty 7.0

The work creates a new benchmark for humanizing GUI agent touch dynamics via a MinMax detector-agent model, a mobile touch dataset, and methods showing agents can match human behavior without losing task performance.

Mobile GUI Agents under Real-world Threats: Are We There Yet?

cs.CR · 2025-07-06 · conditional · novelty 6.0

Introduces an app-content instrumentation framework and benchmark showing that examined GUI agents suffer 42.0% and 36.1% average misleading rates from third-party content in dynamic and static tests respectively.

LaSM: Layer-wise Scaling Mechanism for Defending Pop-up Attack on GUI Agents

cs.CR · 2025-07-13 · conditional · novelty 5.0

LaSM is a layer-wise scaling mechanism that amplifies attention and MLP modules in critical layers to defend GUI agents against pop-up attacks by correcting attention misalignment.

citing papers explorer

Showing 1 of 1 citing paper after filters.

GUIGuard-Bench: Toward a General Evaluation for Privacy-Preserving GUI Agents cs.CR · 2026-01-26 · unverdicted · none · ref 32
GUIGuard-Bench is a new benchmark with annotated GUI screenshots that measures privacy recognition, planning fidelity under protection, and utility impact for trajectory-based GUI agents.

Caution for the environment: Multimodal agents are susceptible to environmental distractions

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer