Seeing is deceiving: Exploitation of visual pathways in multi-modal language models.arXiv preprint arXiv:2411.05056, 2024

Pete Janowczyk, Linda Laurier, Ave Giulietta, Arlo Octavia, Meade Cleti · 2024 · arXiv 2411.05056

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

read on arXiv browse 2 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

GUIGuard-Bench: Toward a General Evaluation for Privacy-Preserving GUI Agents

cs.CR · 2026-01-26 · unverdicted · novelty 8.0

GUIGuard-Bench is a new benchmark with annotated GUI screenshots that measures privacy recognition, planning fidelity under protection, and utility impact for trajectory-based GUI agents.

ORCA: An Agentic Reasoning Framework for Hallucination and Adversarial Robustness in Vision-Language Models

cs.CV · 2025-09-18 · unverdicted · novelty 6.0 · 2 refs

ORCA is an agentic reasoning framework that enhances factual accuracy and adversarial robustness of pretrained LVLMs via an Observe-Reason-Critique-Act loop with small vision models, reporting accuracy gains of up to 40% on hallucination benchmarks and 20% under adversarial perturbations.

citing papers explorer

Showing 2 of 2 citing papers.

GUIGuard-Bench: Toward a General Evaluation for Privacy-Preserving GUI Agents cs.CR · 2026-01-26 · unverdicted · none · ref 19
GUIGuard-Bench is a new benchmark with annotated GUI screenshots that measures privacy recognition, planning fidelity under protection, and utility impact for trajectory-based GUI agents.
ORCA: An Agentic Reasoning Framework for Hallucination and Adversarial Robustness in Vision-Language Models cs.CV · 2025-09-18 · unverdicted · none · ref 14 · 2 links
ORCA is an agentic reasoning framework that enhances factual accuracy and adversarial robustness of pretrained LVLMs via an Observe-Reason-Critique-Act loop with small vision models, reporting accuracy gains of up to 40% on hallucination benchmarks and 20% under adversarial perturbations.

Seeing is deceiving: Exploitation of visual pathways in multi-modal language models.arXiv preprint arXiv:2411.05056, 2024

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer