Recognition: 1 theorem link · Lean theorem
Towards Security-Auditable LLM Agents: A Unified Graph Representation
Pith reviewed 2026-05-11 01:20 UTC · model grok-4.3
The pith
Agent-BOM models LLM agentic systems as hierarchical attributed directed graphs to enable security auditing and root-cause analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Agent-BOM models an agentic system as a hierarchical attributed directed graph that separates static capability bases, such as models, tools, and long-term memory, from dynamic runtime semantic states, such as goals, reasoning trajectories, and actions. These layers connect through semantic edges and security attributes, transforming fragmented execution traces into queryable audit paths for path-level risk assessment.
What carries the argument
Agent-BOM: a hierarchical attributed directed graph with static capability layers and dynamic semantic-state layers joined by semantic edges and security attributes to produce auditable paths.
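As a concrete reading of this structure, the separation of static capability layers from dynamic semantic-state layers can be sketched as a small data model. This is an illustrative assumption, not the paper's published schema: the node kinds, layer labels, and attribute keys below are invented here for clarity.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    layer: str    # "static" (model, tool, long-term memory) or "dynamic" (goal, reasoning, action)
    kind: str     # e.g. "tool", "long_term_memory", "action" -- names assumed, not from the paper
    attrs: dict = field(default_factory=dict)

@dataclass
class Edge:
    src: str
    dst: str
    relation: str                                 # semantic edge label, e.g. "invokes", "reads"
    security: dict = field(default_factory=dict)  # security attributes, e.g. {"trust": "untrusted"}

@dataclass
class AgentBOM:
    nodes: dict = field(default_factory=dict)     # node_id -> Node
    edges: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, edge: Edge) -> None:
        self.edges.append(edge)

    def successors(self, node_id: str) -> list:
        # Outgoing semantic edges from one node: the unit step of an audit path.
        return [e for e in self.edges if e.src == node_id]
```

Under this reading, an audit path is simply a walk over `successors`, filtered by edge relations and security attributes.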
If this is right
- Graph queries can assess risks against the OWASP Agentic Top 10 by traversing semantic paths.
- Reconstruction of stealthy chains becomes possible, including cross-session memory poisoning and capability supply-chain hijacking.
- Fragmented traces convert into structured audit paths usable for root-cause analysis.
- The representation supplies a unified foundation for security adjudication in multi-agent ecosystems.
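A hedged sketch of what such a path-level query might look like (none of this appears in the paper; the adjacency-list encoding is an assumption): a depth-first traversal enumerates candidate audit paths from a suspect node, such as a poisoned memory entry, to any sensitive sink, such as an unauthorized tool action.

```python
def audit_paths(graph, start, is_target, max_depth=10):
    """Enumerate simple paths from `start` to any node matching `is_target`.

    `graph` maps a node id to a list of successor node ids; a real
    Agent-BOM traversal would additionally filter on semantic-edge
    types and security attributes.
    """
    found, stack = [], [(start, [start])]
    while stack:
        node, path = stack.pop()
        if is_target(node) and len(path) > 1:
            found.append(path)
            continue
        if len(path) >= max_depth:
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # keep paths simple (no cycles)
                stack.append((nxt, path + [nxt]))
    return found
```

For example, on a toy graph `{"poisoned_memory": ["reasoning"], "reasoning": ["tool_call"]}`, querying for `tool_call` recovers the full chain through the reasoning node.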
Where Pith is reading between the lines
- Auditing plugins could be extended to emit Agent-BOM graphs in real time for continuous monitoring rather than post-hoc review.
- The same separation of static and dynamic layers might apply to non-LLM autonomous systems that share similar state-evolution problems.
- Quantitative risk scores could be attached to the security attributes to enable automated prioritization of audit paths.
Load-bearing premise
A hierarchical attributed directed graph can faithfully capture cognitive-state evolution, capability bindings, memory contamination, and cascading risks without significant information loss from live executions.
What would settle it
A concrete counter-example would be a live multi-agent execution whose reconstructed Agent-BOM graph fails to link a memory-contamination event to a subsequent unauthorized action, leaving the attack chain incomplete.
Original abstract
LLM-based agentic systems are rapidly evolving to perform complex autonomous tasks through dynamic tool invocation, stateful memory management, and multi-agent collaboration. However, this semantics-driven execution paradigm creates a severe semantic gap between low-level physical events and high-level execution intent, making post-hoc security auditing fundamentally difficult. Existing representation mechanisms, including static SBOMs and runtime logs, provide only fragmented evidence and fail to capture cognitive-state evolution, capability bindings, persistent memory contamination, and cascading risk propagation across interacting agents. To bridge this gap, we propose Agent-BOM, a unified structural representation for agent security auditing. Agent-BOM models an agentic system as a hierarchical attributed directed graph that separates static capability bases, such as models, tools, and long-term memory, from dynamic runtime semantic states, such as goals, reasoning trajectories, and actions. These layers are connected through semantic edges and security attributes, transforming fragmented execution traces into queryable audit paths. Building on Agent-BOM, we develop a graph-query-based paradigm for path-level risk assessment and instantiate it with the OWASP Agentic Top 10. We further implement an auditing plugin in the OpenClaw environment to construct Agent-BOM from live executions. Evaluation on representative real-world agentic attack scenarios shows that Agent-BOM can reconstruct stealthy attack chains, including cross-session memory poisoning and tool misuse, capability supply-chain hijacking and unexpected code execution, multi-agent ecosystem hijacking, and privilege and trust abuse. These results demonstrate that Agent-BOM provides a unified and auditable foundation for root-cause analysis and security adjudication in complex agentic ecosystems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Agent-BOM, a hierarchical attributed directed graph representation for LLM-based agentic systems that separates static capability bases (models, tools, long-term memory) from dynamic runtime semantic states (goals, reasoning trajectories, actions). These are linked via semantic edges and security attributes to enable queryable audit paths. The work develops a graph-query paradigm for path-level risk assessment aligned with the OWASP Agentic Top 10, implements an OpenClaw plugin to construct Agent-BOM from live executions, and evaluates it on four representative attack scenarios (cross-session memory poisoning and tool misuse; capability supply-chain hijacking and unexpected code execution; multi-agent ecosystem hijacking; and privilege and trust abuse), claiming successful reconstruction of stealthy attack chains.
Significance. If the graph construction can be shown to preserve dynamic states with bounded loss, Agent-BOM could offer a practical unified structure for root-cause analysis and security adjudication in agentic ecosystems, addressing the semantic gap between low-level events and high-level intent that current SBOMs and logs leave unaddressed. The proposal is timely given the rapid adoption of autonomous LLM agents.
major comments (2)
- [Evaluation] Evaluation section: The central claim that Agent-BOM 'can reconstruct stealthy attack chains' rests on qualitative descriptions of four hand-selected scenarios; no quantitative fidelity metrics (e.g., recall of reasoning steps, coverage of memory-state transitions, or edge-attribute completeness) are reported to bound information loss when mapping live execution traces to the attributed graph. This is load-bearing for the assertion of faithful capture of cognitive-state evolution and cascading risks.
- [Agent-BOM Construction] Agent-BOM definition and OpenClaw plugin: No formal construction rules, mapping algorithm, or attribute schema are provided for transforming runtime events into hierarchical nodes and semantic edges. Without these, it is impossible to assess whether the separation of static and dynamic layers systematically omits critical elements such as transient capability bindings or multi-agent trust propagation.
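The fidelity metrics requested in the first comment could be made concrete along these lines (a sketch only; the event identifiers and annotation format are assumed, not taken from the paper):

```python
def reconstruction_recall(ground_truth, reconstructed):
    """Fraction of annotated ground-truth events (reasoning steps,
    memory-state transitions, edge attributes) that appear in the
    reconstructed graph; 1.0 means no loss on the annotated events."""
    gt = set(ground_truth)
    if not gt:
        return 1.0
    return len(gt & set(reconstructed)) / len(gt)
```

Reporting this per event class (steps, transitions, attributes) across all scenarios would bound the information loss the comment is concerned with.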
minor comments (2)
- [Abstract] The abstract and introduction refer to 'representative real-world agentic attack scenarios' without clarifying selection criteria or coverage relative to the full OWASP Agentic Top 10.
- [Agent-BOM] Notation for graph attributes (e.g., security attributes on edges) is introduced but not summarized in a single table or definition list, making it harder to follow how they support the query paradigm.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major comment point by point below, acknowledging where the manuscript can be strengthened through revisions.
Point-by-point responses
-
Referee: [Evaluation] Evaluation section: The central claim that Agent-BOM 'can reconstruct stealthy attack chains' rests on qualitative descriptions of four hand-selected scenarios; no quantitative fidelity metrics (e.g., recall of reasoning steps, coverage of memory-state transitions, or edge-attribute completeness) are reported to bound information loss when mapping live execution traces to the attributed graph. This is load-bearing for the assertion of faithful capture of cognitive-state evolution and cascading risks.
Authors: We agree that the current evaluation relies primarily on qualitative case studies of four scenarios to demonstrate reconstruction of attack chains. While such detailed qualitative analysis is standard in security research for illustrating complex, stealthy paths where ground truth is inherently interpretive, we recognize that quantitative fidelity metrics would better bound potential information loss. In the revised manuscript, we will add quantitative evaluations, including precision/recall for reasoning step and memory transition reconstruction, as well as attribute completeness scores derived from annotated execution traces. revision: yes
-
Referee: [Agent-BOM Construction] Agent-BOM definition and OpenClaw plugin: No formal construction rules, mapping algorithm, or attribute schema are provided for transforming runtime events into hierarchical nodes and semantic edges. Without these, it is impossible to assess whether the separation of static and dynamic layers systematically omits critical elements such as transient capability bindings or multi-agent trust propagation.
Authors: Sections 3 and 4 describe the Agent-BOM structure and the OpenClaw plugin's mapping from runtime events, with illustrative examples. We acknowledge that a more formal specification would improve reproducibility and allow explicit evaluation of completeness for elements like transient bindings. In the revision, we will add a formal definition of the construction algorithm (including pseudocode), a complete attribute schema, and explicit discussion of how transient and multi-agent elements are handled. revision: yes
Circularity Check
No significant circularity in the Agent-BOM modeling proposal
full rationale
The paper introduces Agent-BOM as a conceptual modeling framework that represents agentic systems via a hierarchical attributed directed graph separating static and dynamic elements. No equations, derivations, fitted parameters, predictions, or self-referential reductions appear in the provided text or abstract. The central claim rests on qualitative reconstruction of attack scenarios rather than any mathematical chain that collapses to its inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes, and the proposal does not rename known results or smuggle assumptions through prior work. The framework is therefore self-contained as a descriptive representation without circular reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLM agentic systems can be modeled as hierarchical attributed directed graphs that separate static capability bases from dynamic runtime semantic states
invented entities (1)
-
Agent-BOM
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean; IndisputableMonolith/Cost/FunctionalEquation.lean; IndisputableMonolith/Foundation/AlexanderDuality.lean (theorems: reality_from_one_distinction; washburn_uniqueness_aczel; alexander_duality_circle_linking) · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
Agent-BOM models an agentic system as a hierarchical attributed directed graph that separates static capability bases... from dynamic runtime semantic states... connected through semantic edges and security attributes
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Generative agents: Interactive simulacra of human behavior
Joon Sung Park, Joseph O’Brien, Carrie Jun Cai, Meredith Ringel Morris, Percy Liang, and Michael S Bernstein. Generative agents: Interactive simulacra of human behavior. InProceedings of the 36th annual acm symposium on user interface software and technology, pages 1–22, 2023
work page 2023
-
[2]
The landscape of emerging ai agent architectures for reasoning, planning, and tool calling: A survey
Tula Masterman, Sandi Besen, Mason Sawtell, and Alex Chao. The landscape of emerging ai agent architectures for reasoning, planning, and tool calling: A survey. arXiv preprint arXiv:2404.11584, 2024
-
[3]
Martín Abadi, Mihai Budiu, Ulfar Erlingsson, and Jay Ligatti. Control-flow integrity principles, implementations, and applications. ACM Transactions on Information and System Security (TISSEC), 13(1):1–40, 2009
work page 2009
-
[4]
Watson: Abstracting behaviors from audit logs via semantic context
Wajih Ul Hassan, Mohammad A Noureddine, Pubali Datta, and Adam Bates. Watson: Abstracting behaviors from audit logs via semantic context. In28th Annual Network and Distributed System Security Symposium (NDSS), 2021
work page 2021
-
[5]
Zhaorun Chen, Zhen Xiang, Chaowei Xiao, Dawn Song, and Bo Li. Agentpoison: Red-teaming llm agents via poisoning memory or knowledge bases. Advances in Neural Information Processing Systems, 37:130185–130213, 2024
work page 2024
-
[6]
Feng He, Tianqing Zhu, Dayong Ye, Bo Liu, Wanlei Zhou, and Philip S Yu. The emerged security and privacy of llm agent: A survey with case studies.ACM Computing Surveys, 58(6):1–36, 2025
work page 2025
-
[7]
Impacts of software bill of materials (sbom) generation on vulnerability detection
Eric O’Donoghue, Brittany Boles, Clemente Izurieta, and Ann Marie Reinhold. Impacts of software bill of materials (sbom) generation on vulnerability detection. In Proceedings of the 2024 Workshop on Software Supply Chain Offensive Research and Ecosystem Defenses, pages 67–76, 2024
work page 2024
-
[8]
Sabato Nocera, Massimiliano Di Penta, Fatima Ahmed, Simone Romano, and Giuseppe Scanniello. What we know about aiboms: Results from a multivocal literature review on artificial intelligence bill of materials. ACM Transactions on Software Engineering and Methodology, 2025
work page 2025
-
[9]
Guide to computer security log management.NIST special publication, 92:1–72, 2006
Karen Kent and Murugiah Souppaya. Guide to computer security log management.NIST special publication, 92:1–72, 2006
work page 2006
-
[10]
Agenttrace: A structured logging framework for agent workflows
Abdullah AlSayyad et al. Agenttrace: A structured logging framework for agent workflows. In OpenReview / preprint version available, 2026
work page 2026
-
[11]
Agentops: Enabling observability of llm agents
Liming Dong, Qinghua Lu, and Liming Zhu. Agentops: Enabling observability of llm agents.arXiv preprint arXiv:2411.05285, 2024
-
[12]
Prov-agent: Unified provenance for tracking ai agent interactions in agentic workflows
Renan Souza, Amal Gueroudji, Stephen DeWitt, Daniel Rosendo, Tirthankar Ghosal, Robert Ross, Prasanna Balaprakash, and Rafael Ferreira Da Silva. Prov-agent: Unified provenance for tracking ai agent interactions in agentic workflows. In 2025 IEEE International Conference on eScience (eScience), pages 467–473. IEEE, 2025
work page 2025
-
[13]
Owasp top 10 for agentic applications for 2026
OWASP GenAI Security Project. Owasp top 10 for agentic applications for 2026. https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/, 2026. Accessed: 2026-05-07
work page 2026
-
[14]
Openclaw 2026.2.6 release notes
OpenClaw Team. Openclaw 2026.2.6 release notes. https://github.com/ openclaw/openclaw/releases/tag/v2026.2.6, 2026
work page 2026
-
[15]
Samuel T King and Peter M Chen. Backtracking intrusions. In Proceedings of the nineteenth ACM symposium on Operating systems principles (SOSP), pages 223–236, 2003
work page 2003
-
[16]
SLEUTH: Real-time attack scenario reconstruction from COTS audit data
Md Nahid Hossain, Sadegh M. Milajerdi, Junao Wang, Birhanu Eshete, Rigel Gjomemo, R. Sekar, Scott D. Stoller, and V. N. Venkatakrishnan. SLEUTH: Real-time attack scenario reconstruction from COTS audit data. In 26th USENIX Security Symposium, pages 487–504, 2017
work page 2017
-
[17]
The w3c prov family of specifications for modelling provenance metadata
Paolo Missier, Khalid Belhajjame, and James Cheney. The w3c prov family of specifications for modelling provenance metadata. In Proceedings of the 16th International Conference on Extending Database Technology, pages 773–776, 2013
work page 2013
-
[18]
Holmes: real-time apt detection through correlation of suspicious information flows
Sadegh M Milajerdi, Rigel Gjomemo, Birhanu Eshete, R. Sekar, and V. N. Venkatakrishnan. Holmes: real-time apt detection through correlation of suspicious information flows. In 2019 IEEE Symposium on Security and Privacy (SP), pages 113–130. IEEE, 2019
work page 2019
-
[19]
Provenance-aware storage systems
Kiran-Kumar Muniswamy-Reddy, David A Holland, Uri Braun, and Margo Seltzer. Provenance-aware storage systems. InProceedings of the annual conference on USENIX ’06 Annual Technical Conference, pages 43–56, 2006
work page 2006
-
[20]
The minimum elements for a software bill of materials (sbom)
National Telecommunications and Information Administration (NTIA). The minimum elements for a software bill of materials (sbom). Technical report, US Department of Commerce, 2021
work page 2021
-
[21]
React: Synergizing reasoning and acting in language models
Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. React: Synergizing reasoning and acting in language models. InThe Eleventh International Conference on Learning Representations (ICLR), 2023
work page 2023
-
[22]
Autogen: Enabling next-gen llm applications via multi-agent conversation
Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, et al. Autogen: Enabling next-gen llm applications via multi-agent conversation. InThe Twelfth International Conference on Learning Representations (ICLR), 2024
work page 2024
-
[23]
Memgpt: Towards llms as operating systems
Charles Packer, Vivian Fang, Shishir G Patil, Kevin Lin, Sarah Wooders, and Joseph E Gonzalez. Memgpt: Towards llms as operating systems. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2024
work page 2024
-
[24]
Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, and Yongfeng Zhang. Agent security bench (asb): Formalizing and benchmarking attacks and defenses in llm- based agents.International Conference on Learning Representations, 2024
work page 2024
-
[25]
Pierre Peigné, Mikolaj Kniejski, Filip Sondej, Matthieu David, Jason Hoelscher-Obermaier, Christian Schroeder de Witt, and Esben Kran. Multi-agent security tax: Trading off security and collaboration capabilities in multi-agent systems. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pages 27573–27581, 2025
work page 2025
-
[26]
Awakening the web’s sleeper agents: Misusing service workers for privacy leakage
Soroush Karami, Panagiotis Ilia, and Jason Polakis. Awakening the web’s sleeper agents: Misusing service workers for privacy leakage. NDSS, 2021
work page 2021
-
[27]
Cyclonedx v1.5: Machine learning bill of materials (ml-bom), 2023
OWASP CycloneDX. Cyclonedx v1.5: Machine learning bill of materials (ml-bom), 2023
work page 2023
-
[28]
Elt: Efficient log-based troubleshooting system for cloud computing infrastructures
Kamal Kc and Xiaohui Gu. Elt: Efficient log-based troubleshooting system for cloud computing infrastructures. In 2011 IEEE 30th International Symposium on Reliable Distributed Systems, pages 11–20. IEEE, 2011
work page 2011
-
[29]
Dapper, a large-scale distributed systems tracing infrastructure.Google Technical Report, 2010
Benjamin H Sigelman, Luiz André Barroso, Mike Burrows, Pat Stephenson, Manoj Plakal, Donald Beaver, Saul Jaspan, and Chandan Shanbhag. Dapper, a large-scale distributed systems tracing infrastructure. Google Technical Report, 2010
work page 2010
-
[30]
Cloud Native Computing Foundation (CNCF). Opentelemetry: High-quality, ubiquitous, and portable telemetry to enable effective observability, 2024
work page 2024
-
[31]
The prov data model and abstract syntax notation
Luc Moreau, Paolo Missier, Khalid Belhajjame, Reza B’Far, James Cheney, Sam Coppens, Stephen Cresswell, Yolanda Gil, Paul Groth, Graham Klyne, et al. The prov data model and abstract syntax notation. Technical report, World Wide Web Consortium (W3C), 2013
work page 2013
-
[32]
LangChain. Langsmith: Unified platform for debugging, testing, evaluating, and monitoring llm applications, 2023
work page 2023
-
[33]
Agent-Sentry: Bounding LLM Agents via Execution Provenance
Rohan Sequeira, Stavros Damianakis, Umar Iqbal, and Konstantinos Psounis. Agent-sentry: Bounding llm agents via execution provenance. arXiv preprint arXiv:2603.22868, 2026
work page internal anchor Pith review arXiv 2026
-
[34]
Langchain: Build context-aware, reasoning applications, 2022
LangChain AI. Langchain: Build context-aware, reasoning applications, 2022
work page 2022
-
[35]
Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. A survey on large language model based autonomous agents.Frontiers of Computer Science, 18(6):186345, 2024
work page 2024
-
[36]
Language models are few-shot learners
Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. In Advances in Neural Information Processing Systems (NeurIPS), 2020
work page 2020
-
[37]
Plan-and-solve prompting: Improving zero-shot chain-of-thought reasoning by large language models
Lei Wang, Wanyu Xu, Yihuai Lan, Zhiqiang Hu, Yunshi Lan, Roy Ka-Wei Lee, and Ee-Peng Lim. Plan-and-solve prompting: Improving zero-shot chain-of-thought reasoning by large language models. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023
work page 2023
-
[38]
Reflexion: Language agents with verbal reinforcement learning
Noah Shinn, Federico Cassano, Ashwin Gopinath, Karthik Narasimhan, and Shunyu Yao. Reflexion: Language agents with verbal reinforcement learning. In Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
work page 2023
-
[39]
Toolformer: Language models can teach themselves to use tools
Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. Toolformer: Language models can teach themselves to use tools. In Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS), 2023
work page 2023
-
[40]
Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw
Zijun Wang, Haoqin Tu, Letian Zhang, Hardy Chen, Juncheng Wu, Xiangyan Liu, Zhenlong Yuan, Tianyu Pang, Michael Qizhe Shieh, Fengze Liu, et al. Your agent, their asset: A real-world safety analysis of openclaw.arXiv preprint arXiv:2604.04759, 2026
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[41]
Trinityguard: A unified framework for safeguarding multi-agent systems
Kai Wang, Biaojie Zeng, Zeming Wei, Chang Jin, Hefeng Zhou, Xiangtian Li, Chao Yang, Jingjing Qu, Xingcheng Xu, and Xia Hu. Trinityguard: A unified framework for safeguarding multi-agent systems. arXiv preprint arXiv:2603.15408, 2026
-
[42]
Natalie Shapira, Chris Wendler, Avery Yen, Gabriele Sarti, Koyena Pal, Olivia Floody, Adam Belfki, Alex Loftus, Aditya Ratan Jannali, Nikhil Prakash, et al. Agents of chaos.arXiv preprint arXiv:2602.20021, 2026
work page internal anchor Pith review arXiv 2026
-
[43]
Model cards for model reporting
Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. Model cards for model reporting. InProceedings of the conference on fairness, accountability, and transparency, pages 220– 229, 2019
work page 2019
-
[44]
Wiebe Vandendriessche, Jordi Thijsman, Laurens D’hooge, Bruno Volckaert, and Merlijn Sebrechts. Aibomgen: Generating an ai bill of materials for secure, transparent, and compliant model training. arXiv preprint arXiv:2601.05703, 2026
-
[45]
Shuhan Liu, Jiayuan Zhou, Xing Hu, Filipe Roseiro Cogo, Xin Xia, and Xiaohu Yang. An empirical study on vulnerability disclosure management of open source software systems.ACM Transactions on Software Engineering and Methodology, 34(7):1–31, 2025
work page 2025
-
[46]
Identifying affected libraries and their ecosystems for open source software vulnerabilities
Susheng Wu, Wenyan Song, Kaifeng Huang, Bihuan Chen, and Xin Peng. Identifying affected libraries and their ecosystems for open source software vulnerabilities. InProceedings of the IEEE/ACM 46th International Conference on Software Engineering, pages 1–12, 2024
work page 2024
-
[47]
Sofonias Yitagesu, Zhenchang Xing, Xiaowang Zhang, Zhiyong Feng, Tingting Bi, Linyi Han, and Xiaohong Li. Systematic literature review on software security vulnerability information extraction. ACM Transactions on Software Engineering and Methodology, 35(4):1–52, 2026
work page 2026
-
[48]
Uncovering cwe-cve-cpe relations with threat knowledge graphs
Zhenpeng Shi, Nikolay Matyunin, Kalman Graffi, and David Starobinski. Uncovering cwe-cve-cpe relations with threat knowledge graphs. ACM Transactions on Privacy and Security, 27(1):1–26, 2024
work page 2024
-
[49]
Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, 35:24824–24837, 2022
work page 2022
-
[50]
Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Tom Griffiths, Yuan Cao, and Karthik Narasimhan. Tree of thoughts: Deliberate problem solving with large language models. Advances in neural information processing systems, 36:11809–11822, 2023
work page 2023
-
[51]
Watson: A cognitive observability framework for the reasoning of llm-powered agents
Benjamin Rombaut, Sogol Masoumzadeh, Kirill Vasilevski, Dayi Lin, and Ahmed E Hassan. Watson: A cognitive observability framework for the reasoning of llm-powered agents. In2025 40th IEEE/ACM International Conference on Automated Software Engineering (ASE), pages 739–751. IEEE, 2025
work page 2025
-
[52]
Agentsight: System-level observability for ai agents using ebpf
Yusheng Zheng, Yanpeng Hu, Tong Yu, and Andi Quinn. Agentsight: System-level observability for ai agents using ebpf. InProceedings of the 4th Workshop on Practical Adoption Challenges of ML for Systems, pages 110–115, 2025
work page 2025
-
[53]
Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, and Yang Zhang. “Do anything now”: Characterizing and evaluating in-the-wild jailbreak prompts on large language models. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pages 1671–1685, 2024
work page 2024
-
[54]
Pleak: Prompt leaking attacks against large language model applications
Bo Hui, Haolin Yuan, Neil Gong, Philippe Burlina, and Yinzhi Cao. Pleak: Prompt leaking attacks against large language model applications. InProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pages 3600–3614, 2024
work page 2024
-
[55]
Xue Tan, Hao Luan, Mingyu Luo, Xiaoyan Sun, Ping Chen, and Jun Dai. Revprag: Revealing poisoning attacks in retrieval-augmented generation through llm activation analysis.arXiv preprint arXiv:2411.18948, 2024
-
[56]
Don’t listen to me: Understanding and exploring jailbreak prompts of large language models
Zhiyuan Yu, Xiaogeng Liu, Shunning Liang, Zach Cameron, Chaowei Xiao, and Ning Zhang. Don’t listen to me: Understanding and exploring jailbreak prompts of large language models. In33rd USENIX Security Symposium (USENIX Security 24), pages 4675–4692, 2024
work page 2024
-
[57]
Tong Liu, Yingjie Zhang, Zhe Zhao, Yinpeng Dong, Guozhu Meng, and Kai Chen. Making them ask and answer: Jailbreaking large language models in few queries via disguise and reconstruction. In33rd USENIX Security Symposium (USENIX Security 24), pages 4711–4728, 2024
work page 2024
-
[58]
Prompt perturbation in retrieval-augmented generation based large language models
Zhibo Hu, Chen Wang, Yanfeng Shu, Hye-Young Paik, and Liming Zhu. Prompt perturbation in retrieval-augmented generation based large language models. InProceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 1119–1130, 2024
work page 2024
-
[59]
Kairos: Practical intrusion detection and investigation using whole-system provenance
Zijun Cheng, Qiujian Lv, Jinyuan Liang, Yan Wang, Degang Sun, Thomas Pasquier, and Xueyuan Han. Kairos: Practical intrusion detection and investigation using whole-system provenance. In 2024 IEEE Symposium on Security and Privacy (SP), pages 3533–3551. IEEE, 2024
work page 2024
-
[60]
Flash: A comprehensive approach to intrusion detection via provenance graph representation learning
Mati Ur Rehman, Hadi Ahmadi, and Wajih Ul Hassan. Flash: A comprehensive approach to intrusion detection via provenance graph representation learning. In2024 IEEE Symposium on Security and Privacy (SP), pages 3552–3570. IEEE, 2024
work page 2024
-
[61]
Provenance tracking in large-scale machine learning systems
Gabriele Padovani, Valentine Anantharaj, and Sandro Fiore. Provenance tracking in large-scale machine learning systems. In Workshop Proceedings of the 54th International Conference on Parallel Processing, pages 167–174, 2025
work page 2025
-
[62]
Prompt provenance: Toward traceable llm interactions.Available at SSRN 5682942, 2025
Tyler Procko, Lynn Vonder Haar, Timothy Elvira, and Omar Ochoa. Prompt provenance: Toward traceable llm interactions. Available at SSRN 5682942, 2025
work page 2025
-
[63]
Using llms to infer provenance information
Abdullah Hamed Almuntashiri, Luis-Daniel Ibáñez, and Adriane Chapman. Using llms to infer provenance information. In Proceedings of the ProvenanceWeek 2025, pages 1–10. 2025
work page 2025
-
[64]
Rui Sheng, Yukun Yang, Chuhan Shi, Yanna Lin, Zixin Chen, Huamin Qu, and Furui Cheng. Dills: Interactive diagnosis of llm-based multi-agent systems via layered summary of agent behaviors. In Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems, pages 1–17, 2026
work page 2026
-
[65]
Trace: Trajectory-aware comprehensive evaluation for deep research agents
Yanyu Chen, Jiyue Jiang, Jiahong Liu, Yifei Zhang, Xiao Guo, and Irwin King. Trace: Trajectory-aware comprehensive evaluation for deep research agents. In Proceedings of the ACM Web Conference 2026, pages 2524–2534, 2026
work page 2026
discussion (0)