Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain

· 2025 · cs.CR · arXiv 2510.05159

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

open full Pith review browse 1 citing papers arXiv PDF

abstract

While finetuning AI agents on interaction data -- such as web browsing or tool use -- improves their capabilities, it also introduces critical security vulnerabilities within the agentic AI supply chain. We show that adversaries can effectively poison the data collection pipeline at multiple stages to embed hard-to-detect backdoors that, when triggered, cause unsafe or malicious behavior. We formalize three realistic threat models across distinct layers of the supply chain: direct poisoning of finetuning data, pre-backdoored base models, and environment poisoning, a novel attack vector that exploits vulnerabilities specific to agentic training pipelines. Evaluated on two widely adopted agentic benchmarks, all three threat models prove effective: poisoning only a small number of demonstrations is sufficient to embed a backdoor that causes an agent to leak confidential user information with over 80\% success.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Conjunctive Prompt Attacks in Multi-Agent LLM Systems

cs.MA · 2026-04-17 · unverdicted · novelty 7.0

Conjunctive prompt attacks split adversarial elements across agents and routing paths in multi-agent LLM systems, evading isolated defenses and succeeding through topology-aware optimization.

citing papers explorer

Showing 1 of 1 citing paper.

Conjunctive Prompt Attacks in Multi-Agent LLM Systems cs.MA · 2026-04-17 · unverdicted · none · ref 5 · internal anchor
Conjunctive prompt attacks split adversarial elements across agents and routing paths in multi-agent LLM systems, evading isolated defenses and succeeding through topology-aware optimization.

Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer