Commercial LLM agents are already vulnerable to simple yet dangerous attacks

Li A, Zhou Y, Raghuram VC, Goldstein T, Goldblum M · 2025 · arXiv 2502.08586

7 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 3

citation-polarity summary

background 3

representative citing papers

When the Manual Lies: A Realistic Benchmark to Evaluate MCP Poisoning Attacks for LLM Agents

cs.CR · 2026-05-22 · unverdicted · novelty 7.0

Introduces MCP-TDP benchmark showing near-100% attack success on models like GPT-4o for tool description poisoning and proposes reactive self-correction defense.

OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents

cs.CL · 2026-04-27 · unverdicted · novelty 7.0

OS-SPEAR is a new evaluation toolkit that tests 22 OS agents and identifies trade-offs between efficiency and safety or robustness.

LLM-AutoDP: Automatic Data Processing via LLM Agents for Model Fine-tuning

cs.LG · 2026-01-28 · unverdicted · novelty 6.0

LLM agents iteratively generate and optimize data processing strategies for fine-tuning, delivering over 80% win rates versus unprocessed data and 65% versus LLM-based AutoML baselines while cutting search time by up to 10x.

Why Trust Your Agent? Empirical Security Gains from TRiSM-Guided Agentic Workflows in Healthcare

cs.CR · 2026-06-27 · unverdicted · novelty 4.0

TRiSM-guided agentic workflows reduced RAG poisoning attack success from 31% to 10%, data-field injection from 42% to 25%, eliminated network injection, and raised report accuracy from 72.5% to 86.5% across five LLMs and 800 generations.

Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

cs.AI · 2025-10-27 · unverdicted · novelty 4.0

A survey that taxonomizes threats to agentic AI, reviews benchmarks and evaluation methods, discusses technical and governance defenses, and identifies open challenges.

Large Language Model Agent: A Survey on Methodology, Applications and Challenges

cs.CL · 2025-03-27 · accept · novelty 3.0

A survey that deconstructs LLM agent systems via a methodology-centered taxonomy linking design principles to emergent behaviors, applications, and challenges.

Towards trustworthy agentic AI: a comprehensive survey of safety, robustness, privacy, and system security

cs.AI · 2026-05-17 · unverdicted · novelty 2.0

A survey that maps risks along the agent workflow and consolidates metrics and benchmarks for safety, robustness, privacy, and security in agentic AI.

citing papers explorer


When the Manual Lies: A Realistic Benchmark to Evaluate MCP Poisoning Attacks for LLM Agents

cs.CR · 2026-05-22 · unverdicted · none · ref 9

OS-SPEAR: A Toolkit for the Safety, Performance, Efficiency, and Robustness Analysis of OS Agents

cs.CL · 2026-04-27 · unverdicted · none · ref 34

LLM-AutoDP: Automatic Data Processing via LLM Agents for Model Fine-tuning

cs.LG · 2026-01-28 · unverdicted · none · ref 23

Why Trust Your Agent? Empirical Security Gains from TRiSM-Guided Agentic Workflows in Healthcare

cs.CR · 2026-06-27 · unverdicted · none · ref 21

Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges

cs.AI · 2025-10-27 · unverdicted · none · ref 44

Large Language Model Agent: A Survey on Methodology, Applications and Challenges

cs.CL · 2025-03-27 · accept · none · ref 175

Towards trustworthy agentic AI: a comprehensive survey of safety, robustness, privacy, and system security

cs.AI · 2026-05-17 · unverdicted · none · ref 99
