DRIP-R is a new benchmark showing that frontier LLMs systematically disagree on how to resolve identical ambiguous retail policy scenarios, highlighting ambiguity as a core challenge for agent decision-making.
DoomArena: A framework for testing AI agents against evolving security threats
6 Pith papers cite this work.
representative citing papers
Agentic browsers are vulnerable to 20 web and LLM attacks, 18 of which are implemented; these expose five failure modes across four major LLMs and call for redesign before safe deployment.
Signal-Driven Observation decouples observation from action frequency in long-horizon web agents by invoking selective task-relevant DOM reads only on signals such as URL changes or action failures.
The paper defines and evaluates Trojan Hippo attacks on LLM agent memory, showing 85-100% success in data exfiltration across backends and reduced rates with defenses at varying utility costs.
Adversaries can poison finetuning data, base models, or environments to backdoor AI agents, achieving over 80% success in leaking confidential information on two agentic benchmarks.
A synthesis of 247 papers on LLM agent security identifies prompt injection and tool hijacking as dominant threats, notes weakly compositional defenses, and argues for trust boundaries and realistic evaluations.
citing papers explorer
Malice in Agentland: Down the Rabbit Hole of Backdoors in the AI Supply Chain
Adversaries can poison finetuning data, base models, or environments to backdoor AI agents, achieving over 80% success in leaking confidential information on two agentic benchmarks.