Error messages in the Model Context Protocol can be systematically mutated across seven dimensions to triple indirect prompt injection success rates, reaching up to 100% compliance on four frontier models.
The illusion of role separation: Hidden shortcuts in llm role learning (and how to fix them).arXiv preprint arXiv:2505.00626,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
VATS: Exploiting Implicit Authority in Error-Path Injection via Systematic Mutation
Error messages in the Model Context Protocol can be systematically mutated across seven dimensions to triple indirect prompt injection success rates, reaching up to 100% compliance on four frontier models.