VIGIL: Runtime Enforcement of Behavioral Specifications in AI Agent Skills

Bosi Zhang; Hanzhi Liu; Hongbo Wen; Peiran Wang; Yanju Chen; Ying Li; Yuan Tian; Yu Feng

arxiv: 2606.26524 · v1 · pith:XHND3YFZnew · submitted 2026-06-25 · 💻 cs.CR

Ying Li, Yanju Chen, Hongbo Wen, Bosi Zhang, Hanzhi Liu, Peiran Wang, Yu Feng, Yuan Tian

Pith reviewed 2026-06-26 04:40 UTC · model grok-4.3

classification 💻 cs.CR

keywords runtime enforcement · AI agents · behavioral policies · SMT constraints · policy language · trace monitoring · LLM agents · security

The pith

VIGIL enforces AI agent behavioral policies by translating natural-language specifications into SMT constraints over finite execution traces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

VIGIL is a runtime enforcement framework that monitors LLM agent executions against policies derived from skill specifications and operator constraints. It targets the contextual granularity challenge: a monitor must decide which events to observe and how far across a multi-action trace to reason. The system uses a policy language to capture temporal dependencies, argument constraints, and value flows, which are then translated into SMT constraints. Evaluation on real tasks in office, operational, and engineering domains reports detection of policy violations with over 95% recall and a false-positive rate below 10%. This matters because agentic systems can affect files, communication channels, and cyber-physical devices, while the natural-language specifications that bound skill behavior are not executable on their own.

Core claim

VIGIL checks an agent's actual execution trace against behavioral policies from skill specifications, operator-defined constraints, and global rules spanning multiple skills. To make policies executable, it introduces a policy language that captures context-specific enforcement requirements over agent-tool events, including temporal dependencies, argument constraints, and value-flow conditions. The language is paired with symbolic evaluation rules that translate policies into SMT constraints over finite traces, allowing detection of violations that depend on event order, argument relationships, or cross-call value flow.
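To make the three policy ingredients named above concrete, here is a minimal plain-Python sketch of checking a finite trace for a temporal dependency, an argument constraint, and a value-flow condition. It is not VIGIL's policy language and does not use an SMT solver; the `Event` type, the helper names, the action names, and the example trace are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Event:
    """One agent-tool event (hypothetical shape, not VIGIL's trace format)."""
    action: str
    args: dict = field(default_factory=dict)
    result: Any = None


def requires_before(trace, trigger, prerequisite):
    """Temporal dependency: each `trigger` needs an earlier `prerequisite`."""
    seen, bad = False, []
    for i, e in enumerate(trace):
        if e.action == prerequisite:
            seen = True
        elif e.action == trigger and not seen:
            bad.append(i)
    return bad


def arg_in(trace, action, key, allowed):
    """Argument constraint: argument `key` of `action` must be in `allowed`."""
    return [i for i, e in enumerate(trace)
            if e.action == action and e.args.get(key) not in allowed]


def no_flow(trace, source, sink, key):
    """Value flow: a result of `source` must not reach `key` of a later `sink`."""
    tainted, bad = set(), []
    for i, e in enumerate(trace):
        if e.action == sink and e.args.get(key) in tainted:
            bad.append(i)
        if e.action == source and e.result is not None:
            tainted.add(e.result)
    return bad


def check(trace, policy):
    """Return {statement name: offending event indices}; empty means safe."""
    verdict = {}
    for name, statement in policy.items():
        bad = statement(trace)
        if bad:
            verdict[name] = bad
    return verdict


# Hypothetical trace and policy, for illustration only.
trace = [
    Event("read_file", {"path": "/data/report.docx"}, result="SECRET-123"),
    Event("send_email", {"to": "ops@example.com", "body": "SECRET-123"}),
    Event("validate"),
    Event("delete_file", {"path": "/tmp/cache"}),
]
policy = {
    "validate_before_send": lambda t: requires_before(t, "send_email", "validate"),
    "delete_only_tmp": lambda t: arg_in(t, "delete_file", "path", {"/tmp/cache"}),
    "no_file_to_email": lambda t: no_flow(t, "read_file", "send_email", "body"),
}
violations = check(trace, policy)
```

In this sketch the second event violates two statements at once: it is sent before any validation, and its body carries a value returned by an earlier read. Neither violation is visible to a filter that inspects the `send_email` call in isolation, which is the multi-action case the core claim is about.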

What carries the argument

Policy language paired with symbolic evaluation rules that translate policies into SMT constraints over finite traces.
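As a rough illustration of what translating policies into constraints over a finite trace could look like, the block below writes two example statements as quantifier-free formulas. This is one plausible reading, not the paper's encoding: the accessors act(i), ret(i), and arg_body(j), and the action names send, validate, and read, are assumptions introduced here. Only the last line, that a trace satisfies a policy when it satisfies every statement, is taken from the paper's Figure 3 caption.

```latex
% Illustrative only: a finite trace \tau = e_1 \dots e_n with hypothetical
% accessors act(i), ret(i), arg_body(j) for the action, result, and
% "body" argument of an event.
\begin{align*}
\varphi_{\text{order}} &\;\equiv\; \bigwedge_{j=1}^{n}\Bigl(\mathrm{act}(j)=\texttt{send}
  \;\rightarrow\; \bigvee_{i=1}^{j-1}\mathrm{act}(i)=\texttt{validate}\Bigr)\\
\varphi_{\text{flow}} &\;\equiv\; \bigwedge_{1\le i<j\le n}\Bigl(\mathrm{act}(i)=\texttt{read}
  \,\wedge\,\mathrm{act}(j)=\texttt{send}
  \;\rightarrow\; \mathrm{ret}(i)\neq\mathrm{arg}_{\texttt{body}}(j)\Bigr)\\
\tau \models P &\;\iff\; \bigwedge_{\varphi\in P}\tau\models\varphi
\end{align*}
```

Because n is fixed for a finite trace, quantification over positions unrolls into finite conjunctions and disjunctions. Under this reading, a solver would be asked whether the trace facts together with the negation of a statement are satisfiable, and a satisfying position j would serve as a localized witness.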

Load-bearing premise

Natural-language skill specifications and operator constraints can be faithfully captured in the proposed policy language and translated to SMT constraints over finite traces without significant loss of intended meaning or introduction of spurious violations.

What would settle it

Finding a realistic multi-action agent trace where a clear policy violation is missed by the SMT encoding or where a compliant trace triggers a false positive violation alert.

Figures

Figures reproduced from arXiv: 2606.26524 by Bosi Zhang, Hanzhi Liu, Hongbo Wen, Peiran Wang, Yanju Chen, Ying Li, Yuan Tian, Yu Feng.

**Figure 1.** Autonomous-driving perception example: (a) the requirements and their formalization, (b) skill interaction, and …

**Figure 2.** VIGIL’s end-to-end lifecycle: each pending invocation is checked against the trace prefix; an unsafe verdict blocks it and returns a localized witness. … the execution history: a critical validation event that simply never happened. The challenges. This temporal stealth exposes why traditional, single-call defenses fail: the violation exists in the semantic gap between an abstract human rule and the messy …

**Figure 3.** The VIGIL framework: an execution L and specification D become a verdict, safe (τ ⊨ P) or unsafe with the offending statements and events. A policy P = {φ₁, …, φₘ} is a finite set of statements, and the trace must obey every one. We write τ ⊨ φ when the trace satisfies a statement, and τ ⊨ P when it satisfies them all. The checking problem is to decide, given a trace τ and a policy P, whether τ ⊨ …

**Figure 4.** The VIGIL policy language. Its symbols are drawn from the run’s signature Σ = (A, K, V): the actions, argument names, and values the run exhibits. … ψ ∧ ψ′ combines conditions. A pattern’s symbols come from the run’s signature Σ = (A, K, V): α ⊆ A is a set of the run’s actions, S ⊆ V a set of its values, and k ∈ K one of its argument names. A pattern can therefore name only what the run exhibits; a ter…

**Figure 5.** Symbolic compilation of the policy language, top down: a policy to the conjunction of its statements (P …

**Figure 6.** Cross-benchmark balanced enforcement error, …

**Figure 7.** VIGIL’s two error boundaries. (a) A false negative at the boundary of trace observability: the policy-relevant steps run inside the script body and produce no trace facts. (b) A false positive from compilation fidelity: the generated statement drops the qualifier bulk and fires on a benign single-event edit …

**Figure 8.** The vocabulary exposed to COMPILE saturates, so VIGIL’s overhead stays stable as traces grow. … it is highly optimizable. Deployments can cache compiled policies per skill, utilize faster model serving, or deploy small local distilled models to reduce end-to-end latency, while the determinism of the sub-second VERIFY path remains uncompromised. Takeaway III. The online check decides in 0.27 s; 92.5% of the …

**Figure 9.** Abridged prompt template used for specification …

Original abstract

Agentic systems increasingly act through third-party skills, allowing model-generated decisions to affect files, communication channels, and cyber-physical devices. These skills often include natural-language specifications that define access permissions, disclosure limits, execution privileges, and required preconditions. Although such specifications describe the intended boundaries of skill behavior, they do not by themselves provide executable runtime enforcement. Enforcing them raises a contextual granularity challenge: even when a policy is written for a particular task context, a monitor must still decide which events to observe, what state to retain, how far across the execution to reason, and where to intervene. Choosing the wrong granularity can either block benign executions or miss violations that emerge only across multiple actions. Most existing enforcement mechanisms, however, assume a fixed event model or enforcement point. In this work, we present VIGIL, an end-to-end runtime enforcement framework for agentic systems. VIGIL checks an agent's actual execution trace against behavioral policies from skill specifications, operator-defined constraints, and global rules spanning multiple skills. To make such policies executable, VIGIL introduces a policy language that captures context-specific enforcement requirements over agent-tool events, including temporal dependencies, argument constraints, and value-flow conditions. The language is paired with symbolic evaluation rules that translate policies into SMT constraints over finite traces, allowing VIGIL to detect violations that depend on event order, argument relationships, or cross-call value flow rather than relying on fixed single-call filters. On real LLM-agent runs spanning office-document, operational, and engineering tasks, VIGIL detects policy violations with over 95% recall and a false-positive rate below 10%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

VIGIL gives a usable policy language plus SMT checker for multi-event agent behaviors and reports solid detection numbers, but the abstract leaves evaluation methodology and translation fidelity almost completely opaque.

The main thing to know is that VIGIL defines a policy language for agent-tool events that includes temporal order, argument constraints, and value flow across calls, then translates those policies into SMT constraints over finite traces for runtime checking. On traces from real LLM agents doing office, operational, and engineering tasks it claims over 95% recall and under 10% false positives.

The language and the SMT encoding are the concrete new pieces. Most prior runtime monitors use fixed single-call rules; this one tries to handle dependencies that only appear across several actions. That matches the practical problem the abstract describes, where skills come with natural-language specs that need enforcement at the right granularity.

The evaluation numbers are the part that needs more scrutiny. The abstract gives no information on how violations were labeled, what the collection process for traces looked like, or what baselines were compared against. Without those details it is hard to know whether the reported recall holds when policies are written at different levels of detail or when the agent behavior changes.

The concern about semantic loss during translation from natural-language specs, stated above as the load-bearing premise, also lands. The metrics are measured against the formal policies after translation; if that step drops or adds conditions, the numbers reflect enforcement of the translated artifact rather than the original intent. The abstract does not show an independent check that the SMT encoding preserves meaning on the reported tasks.

This is for people working on securing tool-using agents. A reader who needs a starting point for context-sensitive monitoring will find the language design and the SMT approach useful even if the current evidence is thin. The work is coherent enough on its own terms to go to serious referees, though any review will have to press hard on the evaluation section and the translation step.

Referee Report

3 major / 2 minor

Summary. VIGIL is an end-to-end runtime enforcement framework for AI agent skills that introduces a policy language capturing context-specific requirements (temporal dependencies, argument constraints, value-flow conditions) from natural-language skill specs, operator constraints, and global rules. Policies are translated via symbolic evaluation rules into SMT constraints over finite traces, enabling detection of violations that span multiple actions rather than relying on fixed single-call filters. The central empirical claim is that, on real LLM-agent execution traces from office-document, operational, and engineering tasks, VIGIL achieves >95% recall with a <10% false-positive rate.

Significance. If the translation from natural-language specifications to the policy language and SMT encoding is faithful and the evaluation methodology is sound, the work would offer a concrete mechanism for contextual runtime enforcement in agentic systems that interact with third-party skills, addressing a gap left by fixed event models. The symbolic approach to handling cross-call value flow and order-dependent constraints is a technical contribution worth noting if the reported metrics are reproducible and not artifacts of the formalization step itself.

major comments (3)

[Abstract / Evaluation] Abstract and evaluation description: the headline result (>95% recall, <10% FPR) is measured on policies that have already been translated from natural-language specifications into the proposed policy language and SMT constraints. No independent validation (e.g., manual audit of a sample of translated policies against original intent, or comparison against unenforced baseline traces) is described to confirm that the encoding preserves preconditions, value-flow conditions, or temporal dependencies without introducing spurious violations or dropping intended constraints. This is load-bearing for the central claim because any systematic mismatch directly affects the reported recall/FPR figures.
[Evaluation] Evaluation setup: the abstract reports strong empirical numbers but supplies no information on how ground-truth violations were labeled, what trace collection protocol was used, what baselines (if any) were compared against, or how policy granularity was varied. Without these details it is impossible to assess whether the 95% recall holds under different task distributions or policy formulations.
[Symbolic evaluation rules] § on symbolic translation rules: the claim that the SMT encoding handles 'cross-call value flow' and 'context-specific enforcement requirements' without loss of meaning is central, yet the manuscript provides no formal argument or empirical check that the finite-trace encoding is semantics-preserving for the multi-action scenarios in the office-document and engineering tasks.

minor comments (2)

[Policy language definition] Notation for the policy language elements (e.g., how temporal operators and value-flow predicates are written) should be introduced with a small concrete example early in the paper to improve readability.
[Abstract] The abstract states 'over 95% recall' and 'below 10%' without confidence intervals or per-task breakdowns; adding these would strengthen the presentation even if the underlying data are sound.
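The request for confidence intervals on the headline metrics can be made concrete with a short sketch. It computes recall and false-positive rate from confusion counts and attaches a Wilson score interval to a proportion. The counts are hypothetical placeholders, not figures from the paper, and the Wilson interval is just one common choice of interval.

```python
import math


def rates(tp, fn, fp, tn):
    """Recall = TP / (TP + FN); false-positive rate = FP / (FP + TN)."""
    return tp / (tp + fn), fp / (fp + tn)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Hypothetical counts, for illustration only (not the paper's data).
tp, fn, fp, tn = 96, 4, 9, 91
recall, fpr = rates(tp, fn, fp, tn)
recall_lo, recall_hi = wilson_interval(tp, tp + fn)
fpr_lo, fpr_hi = wilson_interval(fp, fp + tn)
```

With only a hundred or so labeled violations, the interval around a point estimate near 95% is several percentage points wide, which is why per-task breakdowns with sample sizes would help a reader judge the headline numbers.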

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments correctly identify gaps in validation and methodological transparency that affect the strength of the central empirical claims. We address each point below and will revise the manuscript accordingly.

Referee: [Abstract / Evaluation] Abstract and evaluation description: the headline result (>95% recall, <10% FPR) is measured on policies that have already been translated from natural-language specifications into the proposed policy language and SMT constraints. No independent validation (e.g., manual audit of a sample of translated policies against original intent, or comparison against unenforced baseline traces) is described to confirm that the encoding preserves preconditions, value-flow conditions, or temporal dependencies without introducing spurious violations or dropping intended constraints. This is load-bearing for the central claim because any systematic mismatch directly affects the reported recall/FPR figures.

Authors: We acknowledge that the manuscript does not describe an independent validation of the translation step. The reported metrics reflect end-to-end enforcement on translated policies. In revision we will add a dedicated subsection reporting a manual audit of 50 sampled policies (two reviewers, inter-rater agreement measured) that compares translated SMT constraints against original natural-language intent, plus a baseline comparison of unenforced traces to quantify the effect of enforcement.
Revision: yes

Referee: [Evaluation] Evaluation setup: the abstract reports strong empirical numbers but supplies no information on how ground-truth violations were labeled, what trace collection protocol was used, what baselines (if any) were compared against, or how policy granularity was varied. Without these details it is impossible to assess whether the 95% recall holds under different task distributions or policy formulations.

Authors: The Evaluation section of the full manuscript describes trace collection from LLM-agent executions on the three task domains and ground-truth labeling by expert review against the source specifications. We agree the abstract and high-level summary omit these details. The revision will expand the abstract with a brief protocol summary and insert a new 'Evaluation Methodology' subsection covering trace collection, labeling procedure, inter-rater process, baselines (naive per-action filters and rule-based monitors), and experiments that vary policy granularity.
Revision: yes

Referee: [Symbolic evaluation rules] § on symbolic translation rules: the claim that the SMT encoding handles 'cross-call value flow' and 'context-specific enforcement requirements' without loss of meaning is central, yet the manuscript provides no formal argument or empirical check that the finite-trace encoding is semantics-preserving for the multi-action scenarios in the office-document and engineering tasks.

Authors: The manuscript presents the symbolic rules and applies them to multi-action traces but supplies neither a standalone formal semantics argument nor an isolated empirical check of preservation. We will add an appendix containing a formal argument that the finite-trace SMT encoding is semantics-preserving for the supported policy constructs (temporal, argument, and value-flow) and an empirical validation on a curated set of synthetic multi-action traces that isolate cross-call value flow and ordering constraints.
Revision: yes
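The first response promises a two-reviewer audit with inter-rater agreement measured, without naming a statistic. As a sketch under that gap, the block below computes Cohen's kappa, one common agreement measure for two raters. The label names and the ratings are hypothetical and do not come from the paper or the rebuttal.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if each rater labeled independently at their own rates.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[l] * count_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical audit of ten translated policies (illustration only).
reviewer_1 = ["faithful"] * 8 + ["lossy"] * 2
reviewer_2 = ["faithful"] * 7 + ["lossy"] * 3
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

Reporting kappa alongside raw agreement matters here because most sampled translations are likely to be judged faithful, and raw agreement alone looks high whenever one label dominates.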

Circularity Check

0 steps flagged

No significant circularity; evaluation on external real runs is independent of internal definitions.

The paper's central empirical claims (>95% recall, <10% FPR) are presented as measured on real LLM-agent executions across tasks. No self-definitional reductions, fitted inputs renamed as predictions, or load-bearing self-citation chains are identifiable from the provided text. The policy language and SMT translation are introduced as a new mechanism, but performance is not shown to be forced by construction from those mechanisms alone. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no explicit free parameters, axioms, or invented entities; the framework relies on standard SMT solving and trace monitoring assumptions not detailed here.

Reference graph

Works this paper leans on

69 extracted references · 18 linked inside Pith

[1]

Equipping agents for the real world with Agent Skills,

Anthropic, “Equipping agents for the real world with Agent Skills,” https://www.anthropic.com/engineering/equipping-agents-for-the-rea l-world-with-agent-skills, 2025

2025
[2]

GitHub Copilot now supports Agent Skills,

GitHub, “GitHub Copilot now supports Agent Skills,” https://github .blog/changelog/2025-12-18-github-copilot-now-supports-agent-ski lls/, 2025

2025
[3]

Level up your agents: Announcing Google’s official skills repository,

Google Cloud, “Level up your agents: Announcing Google’s official skills repository,” https://cloud.google.com/blog/topics/developers-p ractitioners/level-up-your-agents-announcing-googles-official-skill s-repository, 2026

2026
[4]

AWS Agent Toolkit: Skills,

Amazon Web Services, “AWS Agent Toolkit: Skills,” https://docs.a ws.amazon.com/agent-toolkit/latest/userguide/skills.html, 2026

2026
[5]

NVIDIA verified agent skills provide capability gover- nance for AI agents,

NVIDIA, “NVIDIA verified agent skills provide capability gover- nance for AI agents,” https://developer.nvidia.com/blog/nvidia-verif ied-agent-skills-provide-capability-governance-for-ai-agents/, 2026

2026
[6]

Echoleak: The first real-world zero-click prompt injection exploit in a production llm system,

P. Reddy and A. S. Gujral, “Echoleak: The first real-world zero-click prompt injection exploit in a production llm system,” inProceedings of the AAAI Symposium Series, vol. 7, no. 1, 2025, pp. 303–311

2025
[7]

ForcedLeak: AI agent risks exposed in Salesforce Agentforce,

Noma Security, “ForcedLeak: AI agent risks exposed in Salesforce Agentforce,” https://noma.security/blog/forcedleak-agent-risks-expos ed-in-salesforce-agentforce/, 2025

2025
[8]

A Meta AI security researcher said an OpenClaw agent ran amok on her inbox,

J. Bort, “A Meta AI security researcher said an OpenClaw agent ran amok on her inbox,” https://techcrunch.com/2026/02/23/a-meta-ai-s ecurity-researcher-said-an-openclaw-agent-ran-amok-on-her-inbox/, 2026

2026
[9]

No attack required: Semantic fuzzing for specification violations in agent skills,

Y . Li, H. Wen, Y . Chen, H. Liu, Y . Tian, and Y . Feng, “No attack required: Semantic fuzzing for specification violations in agent skills,” arXiv preprint arXiv:2605.13044, 2026

Pith/arXiv arXiv 2026
[10]

A survey of autonomous driving: Common practices and emerging technologies,

E. Yurtsever, J. Lambert, A. Carballo, and K. Takeda, “A survey of autonomous driving: Common practices and emerging technologies,” IEEE access, vol. 8, pp. 58 443–58 469, 2020

2020
[11]

Skillsbench: Benchmarking how well agent skills work across diverse tasks,

X. Li, W. Chen, Y . Liu, S. Zheng, X. Chen, Y . He, Y . Li, B. You, H. Shen, J. Sunet al., “Skillsbench: Benchmarking how well agent skills work across diverse tasks,”arXiv preprint arXiv:2602.12670, 2026

Pith/arXiv arXiv 2026
[12]

Build agents you can trust across any framework with open evals and a control standard,

Microsoft, “Build agents you can trust across any framework with open evals and a control standard,” https://devblogs.microsoft.com/f oundry/build-2026-open-trust-stack-ai-agents/, 2026

2026
[13]

Overview of NVIDIA OpenShell,

NVIDIA, “Overview of NVIDIA OpenShell,” https://docs.nvidia.co m/openshell/about/overview, 2026

2026
[14]

Agentspec: Customizable runtime enforcement for safe and reliable llm agents,

H. Wang, C. M. Poskitt, and J. Sun, “Agentspec: Customizable runtime enforcement for safe and reliable llm agents,”arXiv preprint arXiv:2503.18666, 2025

Pith/arXiv arXiv 2025
[15]

Pro- gent: Programmable privilege control for llm agents,

T. Shi, J. He, Z. Wang, H. Li, L. Wu, W. Guo, and D. Song, “Pro- gent: Programmable privilege control for llm agents,”arXiv preprint arXiv:2504.11703, 2025

Pith/arXiv arXiv 2025
[16]

Contextual agent security: A policy for every purpose,

L. Tsai and E. Bagdasarian, “Contextual agent security: A policy for every purpose,” inProceedings of the 2025 Workshop on Hot Topics in Operating Systems, 2025, pp. 8–17

2025
[17]

Policy compiler for secure agentic systems,

N. Palumbo, S. Choudhary, J. Choi, P. Chalasani, and S. Jha, “Policy compiler for secure agentic systems,”arXiv e-prints, pp. arXiv–2602, 2026

2026
[18]

Defeating prompt injec- tions by design,

E. Debenedetti, I. Shumailov, T. Fan, J. Hayes, N. Carlini, D. Fabian, C. Kern, C. Shi, A. Terzis, and F. Tramèr, “Defeating prompt injec- tions by design,”arXiv preprint arXiv:2503.18813, 2025

Pith/arXiv arXiv 2025
[19]

Securing ai agents with information-flow control,

M. Costa, B. Köpf, A. Kolluri, A. Paverd, M. Russinovich, A. Salem, S. Tople, L. Wutschitz, and S. Zanella-Béguelin, “Securing ai agents with information-flow control,”arXiv preprint arXiv:2505.23643, 2025

Pith/arXiv arXiv 2025
[20]

Identifying the risks of lm agents with an lm-emulated sandbox,

Y . Ruan, H. Dong, A. Wang, S. Pitis, Y . Zhou, J. Ba, Y . Dubois, C. Maddison, and T. Hashimoto, “Identifying the risks of lm agents with an lm-emulated sandbox,” inInternational Conference on Learn- ing Representations, vol. 2024, 2024, pp. 27 031–27 098

2024
[21]

Policy-invisible violations in llm-based agents,

J. Wu and M. Gong, “Policy-invisible violations in llm-based agents,” arXiv preprint arXiv:2604.12177, 2026

Pith/arXiv arXiv 2026
[22]

Satisfiability modulo theories,

C. Barrett and C. Tinelli, “Satisfiability modulo theories,” inHand- book of model checking. Springer, 2018, pp. 305–343

2018
[23]

Skill-inject: Measuring agent vulnerability to skill file attacks,

D. Schmotz, L. Beurer-Kellner, S. Abdelnabi, and M. An- driushchenko, “Skill-inject: Measuring agent vulnerability to skill file attacks,”arXiv preprint arXiv:2602.20156, 2026

Pith/arXiv arXiv 2026
[24]

Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for llm agents,

E. Debenedetti, J. Zhang, M. Balunovic, L. Beurer-Kellner, M. Fis- cher, and F. Tramèr, “Agentdojo: A dynamic environment to evaluate prompt injection attacks and defenses for llm agents,”Advances in Neural Information Processing Systems, vol. 37, pp. 82 895–82 920, 2024

2024
[25]

Safeagentbench: A benchmark for safe task planning of embodied llm agents,

S. Yin, X. Pang, Y . Ding, M. Chen, Y . Bi, Y . Xiong, W. Huang, Z. Xiang, J. Shao, and S. Chen, “Safeagentbench: A benchmark for safe task planning of embodied llm agents,”arXiv preprint arXiv:2412.13178, 2024

arXiv 2024
[26]

Databricks Agent Skills,

Databricks, “Databricks Agent Skills,” https://github.com/databricks/ databricks-agent-skills, 2025, accessed: 2026-06-11

2025
[27]

Testing handbook skills for Claude Code,

Trail of Bits, “Testing handbook skills for Claude Code,” https://gith ub.com/trailofbits/skills/tree/main/plugins/testing-handbook-skills, 2025, accessed: 2026-06-10

2025
[28]

Anthropic skills,

Anthropic, “Anthropic skills,” https://github.com/anthropics/skills/tre e/main/skills, 2026

2026
[29]

Model context protocol specification,

Model Context Protocol, “Model context protocol specification,” http s://modelcontextprotocol.io, 2024, accessed: June 26, 2026

2024
[30]

Enforceable security policies,

F. B. Schneider, “Enforceable security policies,”ACM Transactions on Information and System Security (TISSEC), vol. 3, no. 1, pp. 30– 50, 2000

2000
[31]

Edit automata: Enforcement mechanisms for run-time security policies,

J. Ligatti, L. Bauer, and D. Walker, “Edit automata: Enforcement mechanisms for run-time security policies,”International Journal of Information Security, vol. 4, no. 1, pp. 2–16, 2005

2005
[32]

Patterns in property specifications for finite-state verification,

M. B. Dwyer, G. S. Avrunin, and J. C. Corbett, “Patterns in property specifications for finite-state verification,” inProceedings of the 21st international conference on Software engineering, 1999, pp. 411–420

1999
[33]

Linear temporal logic and linear dynamic logic on finite traces,

G. De Giacomo and M. Y . Vardi, “Linear temporal logic and linear dynamic logic on finite traces,” inProceedings of the Twenty-Third In- ternational Joint Conference on Artificial Intelligence (IJCAI). AAAI Press, 2013, pp. 854–860

2013
[34]

Z3: An efficient smt solver,

L. De Moura and N. Bjørner, “Z3: An efficient smt solver,” inIn- ternational conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 2008, pp. 337–340

2008
[35]

Foray: Towards effective attack synthesis against deep logical vulnerabilities in DeFi protocols,

H. Wen, H. Liu, J. Song, Y . Chen, W. Guo, and Y . Feng, “Foray: Towards effective attack synthesis against deep logical vulnerabilities in DeFi protocols,” inProceedings of the 2024 ACM SIGSAC Con- ference on Computer and Communications Security (CCS), 2024

2024
[36]

Nemo-evaluator-launcher skills,

NVIDIA, “Nemo-evaluator-launcher skills,” https://github.com/NVI DIA/skills/tree/8c40eff71464e661df027f547c4a7d0f69fe3693/skills/ NeMo-Evaluator-Launcher, 2026

2026
[37]

Cloudflare skills,

Cloudflare, “Cloudflare skills,” https://github.com/cloudflare/skills, 2026

2026
[38]

Microsoft skill bundle,

Microsoft, “Microsoft skill bundle,” https://github.com/microsoft/ski lls/, 2026

2026
[39]

Agentc- group: Understanding and controlling OS resources of AI agents,

Y . Zheng, J. Fan, Q. Fu, Y . Yang, W. Zhang, and A. Quinn, “Agentc- group: Understanding and controlling OS resources of AI agents,” arXiv preprint arXiv:2602.09345, 2026

arXiv 2026
[40]

Clawhub,

ClawHub, “Clawhub,” https://clawhub.ai/, 2026, accessed: 2026-06- 12

2026
[41]

Benchflow,

BenchFlow, “Benchflow,” https://github.com/benchflow-ai/benchflo w, 2026, accessed: 2026-06-11

2026
[42]

{AddressSanitizer}: A fast address sanity checker,

K. Serebryany, D. Bruening, A. Potapenko, and D. Vyukov, “{AddressSanitizer}: A fast address sanity checker,” in2012 USENIX annual technical conference (USENIX ATC 12), 2012, pp. 309–318

2012
[43]

AddressSanitizer — Clang documentation,

LLVM Project, “AddressSanitizer — Clang documentation,” https: //clang.llvm.org/docs/AddressSanitizer.html, accessed: 2026-06-11

2026
[44]

The attack and defense landscape of agentic ai: A comprehensive survey,

J. Kim, X. Liu, Z. Wang, S. Qiu, B. Li, W. Guo, and D. Song, “The attack and defense landscape of agentic ai: A comprehensive survey,” arXiv preprint arXiv:2603.11088, 2026

arXiv 2026
[45]

Sok: Attack and defense landscape of agentic ai systems,

J. Kim, W. Guo, and D. Song, “Sok: Attack and defense landscape of agentic ai systems,” in35th USENIX Security Symposium (USENIX Security 26), 2026

2026
[46]

Ghost in the agent: Re- defining information flow tracking for llm agents,

Y . Cai, W. Tang, C. Wen, and S. Qin, “Ghost in the agent: Re- defining information flow tracking for llm agents,”arXiv preprint arXiv:2604.23374, 2026

Pith/arXiv arXiv 2026
[47]

Agentarmor: Enforcing program analysis on agent runtime trace to defend against prompt injection,

P. Wang, Y . Liu, Y . Lu, Y . Cai, H. Chen, Q. Yang, J. Zhang, J. Hong, and Y . Wu, “Agentarmor: Enforcing program analysis on agent runtime trace to defend against prompt injection,”arXiv preprint arXiv:2508.01249, 2025

arXiv 2025
[48]

Agentsentry: Mitigating indirect prompt injection in llm agents via temporal causal diagnostics and context purification,

T. Zhang, Y . Xu, J. Wang, K. Guo, X. Xu, B. Xiao, Q. Guan, J. Fan, J. Liu, Z. Liuet al., “Agentsentry: Mitigating indirect prompt injection in llm agents via temporal causal diagnostics and context purification,”arXiv preprint arXiv:2602.22724, 2026

arXiv 2026
[49]

Agent-diff: Bench- marking llm agents on enterprise api tasks via code execution with state-diff-based evaluation,

H. M. Pysklo, A. Zhuravel, and P. D. Watson, “Agent-diff: Bench- marking llm agents on enterprise api tasks via code execution with state-diff-based evaluation,”arXiv preprint arXiv:2602.11224, 2026

Pith/arXiv arXiv 2026
[50]

The keynote trust-management system version 2,

M. Blaze, J. Feigenbaum, J. Ioannidis, and A. Keromytis, “The keynote trust-management system version 2,” RFC Editor, Tech. Rep. RFC 2704, September 1999

1999
[51]

The ponder policy specification language,

N. Damianou, N. Dulay, E. Lupu, and M. Sloman, “The ponder policy specification language,” inInternational Workshop on Policies for Distributed Systems and Networks. Springer, 2001, pp. 18–38

2001
[52]

Cedar: A new language for expressive, fast, safe, and analyzable authorization,

J. W. Cutler, C. Disselkoen, A. Eline, S. He, K. Headley, M. Hicks, K. Hietala, E. Ioannidis, J. Kastner, A. Mamatet al., “Cedar: A new language for expressive, fast, safe, and analyzable authorization,” Proceedings of the ACM on Programming Languages, vol. 8, no. OOPSLA1, pp. 670–697, 2024

2024
[53]

Policies and permissions in aws identity and access management,

Amazon Web Services, “Policies and permissions in aws identity and access management,” https://docs.aws.amazon.com/IAM/latest/Use rGuide/access_policies.html, 2026, accessed: 2026-06-06

2026
[54]

What is azure role-based access control (Azure RBAC)?

Microsoft Azure, “What is azure role-based access control (Azure RBAC)?” https://learn.microsoft.com/en-us/azure/role-based-acces s-control/overview, 2026, accessed: 2026-06-06

2026
[55]

Iam overview,

Google Cloud, “Iam overview,” https://cloud.google.com/iam/docs/ overview, 2026, accessed: 2026-06-06

2026
[56] J. Backes, P. Bolignano, B. Cook, C. Dodge, A. Gacek, K. Luckow, N. Rungta, O. Tkachuk, and C. Varming, “Semantic-based automated reasoning for aws access policies using smt,” in 2018 Formal Methods in Computer Aided Design (FMCAD). IEEE, 2018, pp. 1–9
[57] R. Xu and Y. Yan, “Agent skills for large language models: Architecture, acquisition, security, and the path forward,” arXiv preprint arXiv:2602.12430, 2026
[58] Y. Jiang, D. Li, H. Deng, B. Ma, X. Wang, Q. Wang, and G. Yu, “Sok: Agentic skills–beyond tool use in llm agents,” arXiv preprint arXiv:2602.20867, 2026
[59] Z. Li, J. Wu, X. Ling, X. Cui, and T. Luo, “Towards secure agent skills: Architecture, threat taxonomy, and security analysis,” arXiv preprint arXiv:2604.02837, 2026
[60] Y. Liu, W. Wang, R. Feng, Y. Zhang, G. Xu, G. Deng, Y. Li, and L. Zhang, “Agent skills in the wild: An empirical study of security vulnerabilities at scale,” arXiv preprint arXiv:2601.10338, 2026
[61] Y. Liu, Z. Chen, Y. Zhang, G. Deng, Y. Li, J. Ning, Y. Zhang, and L. Y. Zhang, “Malicious agent skills in the wild: A large-scale security empirical study,” arXiv preprint arXiv:2602.06547, 2026
[62] Y. Jiang, Y. Zhang, M. Backes, X. Shen, and Y. Zhang, “Harmfulskillbench: How do harmful skills weaponize your agents?” arXiv preprint arXiv:2604.15415, 2026
[63] H. Zhuang, H. Xing, Y. Zhou, Y. Ma, Y. Huang, Y. Shen, Y. Han, and X. Zhang, “Agenttrap: Measuring runtime trust failures in third-party agent skills,” arXiv preprint arXiv:2605.13940, 2026
[64] Z. Guo, Z. Chen, X. Nie, J. Lin, Y. Zhou, and W. Zhang, “Skillprobe: Security auditing for emerging agent skill marketplaces via multi-agent collaboration,” arXiv preprint arXiv:2603.21019, 2026
[65] H. Wen, Y. Li, H. Liu, C. Shou, Y. Chen, Y. Tian, and Y. Feng, “Semia: Auditing agent skills via constraint-guided representation synthesis,” arXiv preprint arXiv:2605.00314, 2026

Table 5. Representative policy categories and formalizations in SB+SI. Columns: Category, Example policy, Formalization. Required event property: Period search must include flux uncertain...
SECURITY_PROTOCOL: the skill’s natural-language policy text
observed_vocabulary: actions, argument roles, command options, predicates, scripts, and text samples recovered from the trace
signature: the allowed condition operators and temporal templates
policy_language: the target schema and declared role set.

Grounding contract:
- Draft policy statements only; do not classify the trace.
- Use only literals and symbols present in the inputs.
- Do not emit event ids, witness ids, SMT variables, expected verdicts, violation labels, or other trace-checking artifacts.
- Emit Unsupported when a rule cannot be...
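The grounding contract above can be read as a set of mechanical checks on each drafted statement. The following is a minimal sketch of two of those checks, assuming a hypothetical statement shape: `DraftStatement`, `FORBIDDEN_KEYS`, `check_grounding`, and the sample vocabulary are illustrative names, not part of VIGIL's policy language. Statements of kind `Unsupported` skip the literal check, because the source truncates the condition under which they are emitted.

```python
from dataclasses import dataclass, field

# Hypothetical field names for the trace-checking artifacts the contract forbids.
FORBIDDEN_KEYS = {"event_id", "witness_id", "smt_variable",
                  "expected_verdict", "violation_label"}


@dataclass
class DraftStatement:
    """One drafted policy statement (assumed shape, not VIGIL's schema)."""
    kind: str                                    # e.g. "Require" or "Unsupported"
    literals: list[str] = field(default_factory=list)
    extra: dict[str, str] = field(default_factory=dict)


def check_grounding(stmt: DraftStatement, vocabulary: set[str]) -> list[str]:
    """Return human-readable contract violations for one statement."""
    problems = []
    # Contract rule: the draft must not carry trace-checking artifacts.
    for key in sorted(stmt.extra.keys() & FORBIDDEN_KEYS):
        problems.append(f"forbidden artifact: {key}")
    # Unsupported statements are not checked for grounded literals.
    if stmt.kind == "Unsupported":
        return problems
    # Contract rule: use only literals and symbols present in the inputs.
    for lit in stmt.literals:
        if lit not in vocabulary:
            problems.append(f"ungrounded literal: {lit}")
    return problems


vocab = {"read_file", "send_email", "path", "recipient"}
ok = DraftStatement("Require", ["read_file", "path"])
bad = DraftStatement("Require", ["upload_blob"], {"event_id": "e3"})
print(check_grounding(ok, vocab))   # []
print(check_grounding(bad, vocab))  # ['forbidden artifact: event_id', 'ungrounded literal: upload_blob']
```

A check of this kind would run on the drafted policy before any trace is evaluated, which matches the contract's first rule that drafting and trace classification stay separate.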
