Liu, et al., TraceAegis: Provenance-based anomaly detection for AI agent execution traces (2025)

TraceAegis: Securing LLM-based agents via hierarchical, behavioral anomaly detection · 2025 · arXiv 2510.11203

4 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Inverting the Shield: Systematically Generating Safety Tests from Policy Specifications

cs.AI · 2026-05-24 · unverdicted · novelty 7.0

POLARIS formalizes policies in FOL, constructs a Semantic Policy Graph to discover compositional violations, and generates natural-language tests, reporting higher coverage and attack success than baselines.

Runtime Skill Audit: Targeted Runtime Probing for Agent Skill Security

cs.CR · 2026-06-10 · unverdicted · novelty 6.0

Runtime Skill Audit introduces targeted runtime probing to detect malicious LLM agent skills, reporting 90% accuracy and resilience to self-evolving attacks on 100 skills versus static baselines.

AgentShield: Deception-based Compromise Detection for Tool-using LLM Agents

cs.CR · 2026-05-10 · unverdicted · novelty 6.0

AgentShield uses layered deception traps in LLM agent tool interfaces to detect indirect prompt injection compromises with 90.7-100% success on commercial models, zero false positives, and cross-lingual transfer without retraining.

Content-Aware Attack Detection in LLM Agent Tool-Call Traffic: An Empirical Study of Features, Architectures, and Evaluation Protocols

cs.CR · 2026-05-11 · unverdicted · novelty 5.0 · 3 refs

Content embeddings from SBERT enable AUROC above 0.89 for attack detection in MCP tool-call sessions, with tree ensembles on pooled embeddings reaching 0.975 and outperforming GNNs when using task-stratified splits instead of random ones.

