Agentic LLMs are vulnerable to indirect prompt injections that bypass standard defenses, but representation engineering on internal hidden states can detect attacks with high accuracy before actions are taken.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Your Agent is More Brittle Than You Think: Uncovering Indirect Injection Vulnerabilities in Agentic LLMs