Benchmarking the Robustness of Agentic Systems to Adversarially-Induced Harms

2025 · arXiv 2508.16481

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

dataset 1

citation-polarity summary

background 1

representative citing papers

Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents

cs.CY · 2026-04-11 · accept · novelty 8.0

This paper delivers the first systematic taxonomy and cross-benchmark consistency analysis of 40 agent safety benchmarks, finding broad but shallow risk coverage, no ranking concordance across evaluations, and that benchmark choice systematically alters reported safety.

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

cs.CR · 2026-04-25 · unverdicted · novelty 7.0

A new 7x4 taxonomy organizes agentic AI security threats by architectural layer and persistence timescale, revealing under-explored upper layers and missing defenses after surveying 116 papers.

Position: A Three-Layer Probabilistic Assume-Guarantee Architecture Is Structurally Required for Safe LLM Agent Deployment

cs.AI · 2026-05-18 · unverdicted · novelty 5.0

A three-layer probabilistic assume-guarantee architecture is structurally required for safe LLM agent deployment.

citing papers explorer

Showing 3 of 3 citing papers.

Taxonomy and Consistency Analysis of Safety Benchmarks for AI Agents

cs.CY · 2026-04-11 · accept · none · ref 41

A Systematic Survey of Security Threats and Defenses in LLM-Based AI Agents: A Layered Attack Surface Framework

cs.CR · 2026-04-25 · unverdicted · none · ref 143

Position: A Three-Layer Probabilistic Assume-Guarantee Architecture Is Structurally Required for Safe LLM Agent Deployment

cs.AI · 2026-05-18 · unverdicted · none · ref 43
