Reframing LLM Agent Security as an Agent-Human Interaction Problem

2026 · cs.CR · arXiv 2605.24309

3 Pith papers cite this work. Polarity classification is still indexing.


abstract

We argue that LLM agent security is fundamentally an agent-human interaction (AHI) problem, not a purely algorithmic one. To substantiate this position, we conduct a systematic analysis of 59 academic papers, 21 production agent systems, and 26 security plugins as of April 2026. Our analysis reveals a striking pattern: the three widely deployed human-centric security mechanisms (policy specification, runtime approval, and scope configuration) dominate industry practice, each adopted by at least 14 of 21 systems (14, 15, and 16, respectively), while the categories most heavily studied in academia (intent anchoring and trust labeling) see zero production deployment. Yet current human participation mechanisms are far from satisfactory: they suffer from a fundamental trade-off between cognitive burden and security guarantees, leaving users caught between approval fatigue and uncontrolled agent autonomy. We make three contributions. First, through a systematic comparison of LLM-based and human-based intent alignment, we argue that human participation in agent security decisions is indispensable given current capabilities. Second, we quantify a pronounced industry-academia mismatch: the security mechanisms that practitioners actually deploy receive scant research attention, while the approaches that researchers favor remain undeployed. Third, we propose a three-direction research agenda and call for AHI security to be recognized as a first-class research citizen, one that demands its own design principles, evaluation methods, and theoretical foundations.

representative citing papers

One Goal, Many Commands: Characterizing Denylist Fragility in AI Agents

cs.CR · 2026-06-14 · unverdicted · novelty 7.0

ShellSieve, an LLM-driven pipeline, detects command denylist fragility in terminal AI agents and finds 69.0-98.6% of 1,709 GitHub-collected denylists to be bypassable.

Janus: a Playground for User-Involved Agentic Permission Management

cs.AI · 2026-07-01 · unverdicted · novelty 6.0

Janus is a publicly available playground system and evaluation harness for testing user-involved permission management designs in AI agents, demonstrating benefits of user input and the need for context-sensitive approaches.

Oversight Has a Capacity: Calibrating Agent Guards to a Subjective, Fatiguing Human

cs.AI · 2026-06-08 · unverdicted · novelty 3.0

Human oversight for LLM agent actions is capacity-limited by subjective disagreement (kappa 0.52) and fatigue, producing an inverted-U safety curve and vulnerability to flooding attacks in a modeling study.

citing papers explorer


One Goal, Many Commands: Characterizing Denylist Fragility in AI Agents · cs.CR · 2026-06-14 · unverdicted · none · ref 48 · internal anchor
Janus: a Playground for User-Involved Agentic Permission Management · cs.AI · 2026-07-01 · unverdicted · none · ref 12 · internal anchor
Oversight Has a Capacity: Calibrating Agent Guards to a Subjective, Fatiguing Human · cs.AI · 2026-06-08 · unverdicted · none · ref 17 · internal anchor
