SkillGuard: A Permission-Centric Framework for Agent Skill Security

Dianshu Liao; Kaiwen Yang; Shidong Pan; Tianyi Zhang; Xiaoyu Sun; Zhenchang Xing

Review: 3 major objections · 6 minor · cited by 1

Skills should be treated as permission-bearing programs whose context influence and runtime actions are jointly gated by manifests and deny-by-default mediation.

Reviewed by Pith at T0; open to challenge. T0 means a machine referee read the full paper against a public rubric.

T0 review · grok-4.5

2026-07-14 18:30 UTC pith:VWVRACYD

arXiv:2606.03024v2 · pith:VWVRACYD · submitted 2026-06-02 · cs.CR cs.SE

keywords: LLM agents, agent skills, permission framework, runtime enforcement, access control, skill security, AI agent security, deny-by-default

verification ladder: T0 review · T1 audit · T2 compute · T3 formal · T4 reserved

The pith

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Agent skills are not just prompt snippets or tool wrappers: they reshape what an agent knows and what it does. SkillGuard argues that this dual role makes skills a first-class security principal, and that current defenses fail because they either scan skill files statically or police individual tool calls without linking declared skill intent to live behavior. The framework requires each skill to declare capabilities in a manifest, then mediates every sensitive action at runtime under a deny-by-default policy composed from workspace defaults, the manifest, and user grants, with extra inference for shell commands. On a real-world skill corpus the permission taxonomy covers nearly all observed protected objects, and on SkillInject the system lowers attack success while leaving benign task completion almost unchanged. The practical claim is that skill-centric permissions can give skill marketplaces a workable least-privilege boundary without rewriting the agent runtime.

Core claim

Treating skills as permission-bearing executable artifacts and jointly governing context influence and action side effects through manifests, runtime access control, user-mediated grants, deny-by-default enforcement, and capability inference reduces SkillInject attack success from 32.37% to 23.02% under contextual injections and from 25.56% to 16.67% under obvious injections, while benign task success falls only from 86.96% to 85.51%.
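The reported percentages can be restated as absolute and relative reductions, which makes the size of the security gain easier to compare with the utility cost. The short calculation below uses only the figures quoted in the core claim; the dictionary keys and function name are illustrative.

```python
# Reported SkillInject figures in percent, copied from the core claim:
# (without SkillGuard, with SkillGuard).
REPORTED = {
    "ASR, contextual injections": (32.37, 23.02),
    "ASR, obvious injections": (25.56, 16.67),
    "TSR, benign tasks": (86.96, 85.51),
}


def deltas(before, after):
    """Return (absolute drop in percentage points, relative drop in percent)."""
    absolute = before - after
    relative = 100.0 * absolute / before
    return round(absolute, 2), round(relative, 1)


SUMMARY = {name: deltas(*pair) for name, pair in REPORTED.items()}

for name, (abs_pp, rel_pct) in SUMMARY.items():
    print(f"{name}: -{abs_pp} pp ({rel_pct}% relative)")
```

Under these figures, attack success falls by roughly nine percentage points in both injection settings, while benign task success falls by about one and a half points.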

What carries the argument

Dual-plane skill permission model: a JSON skill manifest declares capabilities (context loading, file/network/secrets, execution, delegation, policy changes), and a PreToolUse mediation pipeline maps host tool calls to those capabilities, composes workspace/manifest/session grants, and blocks undeclared or constraint-mismatched actions, with a mini-agent refining shell commands into lower-level capabilities.
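To make the mediation pipeline concrete, here is a minimal sketch of deny-by-default composition. It assumes a most-restrictive-wins rule across workspace, manifest, and session grants; the paper's actual composition order is not stated in this review, so that rule, the `Grant` fields, the tool names, and the `file.read` capability are all hypothetical. Only `web.post` is a capability name that appears in the review text.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

# Higher number = more restrictive effect.
SEVERITY = {"allow": 0, "confirm": 1, "deny": 2}

# Hypothetical mapping from host tool names to capabilities.
TOOL_TO_CAPABILITY = {"Read": "file.read", "WebPost": "web.post"}


@dataclass(frozen=True)
class Grant:
    source: str             # "workspace", "manifest", or "session"
    capability: str
    effect: str             # "allow", "confirm", or "deny"
    target_glob: str = "*"  # single illustrative constraint on the tool input


def mediate(grants, tool, target):
    """Return (decision, policy_source) for one tool call."""
    capability = TOOL_TO_CAPABILITY.get(tool)
    matching = [
        g for g in grants
        if g.capability == capability and fnmatch(target, g.target_glob)
    ]
    if not matching:
        # Undeclared capability or constraint mismatch: deny by default.
        return ("deny", "default")
    strictest = max(matching, key=lambda g: SEVERITY[g.effect])
    return (strictest.effect, strictest.source)


EXAMPLE_GRANTS = [
    Grant("manifest", "file.read", "allow", "docs/*"),
    Grant("workspace", "file.read", "deny", "docs/secrets/*"),
    Grant("manifest", "web.post", "confirm", "https://api.example.com/*"),
]
```

In this sketch an unmapped tool, an undeclared capability, and a declared capability whose constraint does not match the input all end in the same default denial, which is the fail-safe behavior the review describes.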

Load-bearing premise

Blocking actions that fall outside a skill’s declared capability set is enough to stop most skill attacks, even when the attack reuses permissions the skill already needs for legitimate work.

What would settle it

On a held-out set of skill injections whose malicious side effects stay inside the legitimate capability surface already declared for the skill (for example mass-forwarding mail with declared send APIs), measure whether attack success stays high; if SkillGuard cannot reduce those successes without also collapsing benign task completion, the central security claim fails.

If this is right

Skill marketplaces can require machine-checkable manifests the way app stores require permission declarations, so users and hosts see least-privilege scope before install.
Agent scaffolds that expose tool-call lifecycle hooks can host the same enforcement without modifying model weights or core agent code.
Auto-generated manifests at ~91% capability F1 make authoring burden low enough that governance can scale with marketplace growth.
Residual attacks that stay inside declared permissions imply a need for task-alignment checks beyond pure capability allowlists.
Audit logs of every mediated call (capability, policy source, decision) become a portable observability layer for compliance and debugging.
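The audit-log point can be illustrated with a small record type. The sketch below assumes a JSON Lines log with one entry per mediated call; the three fields named in the list (capability, policy source, decision) come from the text, while the `skill` and `tool` fields, the `exec.shell` capability, and the example entries are hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AuditRecord:
    skill: str
    tool: str
    capability: str
    policy_source: str  # e.g. "workspace", "manifest", "session", "default"
    decision: str       # "allow", "confirm", or "deny"


def to_jsonl(records):
    """Serialize records as JSON Lines, one mediated call per line."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


def denied_capabilities(jsonl):
    """Return the sorted capabilities that were denied at least once."""
    rows = [json.loads(line) for line in jsonl.splitlines() if line]
    return sorted({r["capability"] for r in rows if r["decision"] == "deny"})


# Illustrative entries only; not taken from the paper's logs.
LOG = to_jsonl([
    AuditRecord("email-api", "WebPost", "web.post", "manifest", "allow"),
    AuditRecord("email-api", "Bash", "exec.shell", "default", "deny"),
])
```

Because each line is self-describing, a log in this shape can be filtered by any standard JSON tooling without knowledge of the host agent.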

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Permission manifests will likely become a de facto interchange format across agent platforms once a few large skill catalogs adopt them, similar to mobile app manifests.
The residual ASR when attacks reuse legitimate APIs suggests the next research bottleneck is intent/task alignment, not finer capability taxonomies alone.
Unattended auto-approve of confirm decisions understates the human-in-the-loop defense; interactive deployments may see stronger protection and higher friction.
Shell mini-agent analysis is a soft spot: if command-to-capability inference is incomplete, both false allows and false blocks will concentrate on script-heavy skills.
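The last point, on command-to-capability inference, can be made concrete with a deliberately naive stand-in. The paper uses an LLM mini-agent; the rule table below is not the paper's method, and the command table, capability names, and `exec.unknown` marker are hypothetical. It is included only to show how incomplete inference yields both failure modes.

```python
import shlex

# Hypothetical command-to-capability table; capability names are illustrative.
COMMAND_CAPABILITIES = {
    "cat": {"file.read"},
    "rm": {"file.delete"},
    "curl": {"network.request"},
}
SEPARATORS = {"|", "&&", "||", ";"}
UNKNOWN = "exec.unknown"


def infer_capabilities(command):
    """Return the set of capabilities a shell command line appears to need."""
    capabilities = set()
    expect_command = True  # the next token starts a new pipeline segment
    for token in shlex.split(command):
        if token in SEPARATORS:
            expect_command = True
        elif expect_command:
            # Unrecognized commands map to UNKNOWN so a caller can escalate.
            capabilities |= COMMAND_CAPABILITIES.get(token, {UNKNOWN})
            expect_command = False
    return capabilities
```

With whitespace around the pipe, `cat notes.txt | curl -d @- https://example.com` yields both `file.read` and `network.request`. Written as `cat notes.txt|curl ...`, `shlex.split` keeps `notes.txt|curl` as one token and the sketch reports only `file.read`, a false allow. A script invocation such as `bash install.sh` yields only `exec.unknown`, which a strict policy would block even when the script is harmless.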

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit.

Desk Editor's Note

Solid systems paper that makes skills a real permission principal; security gains are real but modest, and the dual-plane story is stronger on actions than on context.

The useful takeaway is simple: skills are a new security principal, and SkillGuard is a concrete Android-style permission stack for them—manifest DSL, deny-by-default PreToolUse mediation, session grants, and a mini-agent that expands shell commands into lower-level capabilities. That package is the real contribution, not a huge ASR collapse.

What they do well is systems craft. The dual-plane framing (context influence vs action side effects) is the right problem statement, and they actually build the missing middle between Progent-style tool policies and AgentBound-style MCP manifests. Taxonomy work on 315 SkillsMP skills is careful (dual annotation, κ=0.656, ~99.8% object coverage). Manifest generation hits 91% F1 with high recall, which is the safer failure mode. ASR drops ~9 points on SkillInject with almost no utility loss (86.96% → 85.51% TSR). They also ship artifacts and acknowledge residual attacks that reuse already-declared capabilities (email mass-forward is the honest example).

Soft spots, in proportion: the dual-plane claim is only partly stress-tested. LOAD_CONTEXT and instruction admission are in the taxonomy, but the reported wins look like action denial of undeclared capabilities, not a separate context-admission ablation. Unattended runs auto-approve every confirm, so user consent and session-grant discipline are designed but not measured. Single model/scaffold, modest effect sizes, and no task-alignment check once a capability is declared. Those are evaluation limits, not a broken design.

Math/data/citations look fine for an empirical systems paper—no circular scoring, related work is placed fairly, no load-bearing contradiction. This is for people building agent runtimes, skill marketplaces, or MCP-style sandboxes. Worth a serious referee; I’d engage, cite the dual-plane + manifest idea, and push them on context-plane measurement and multi-model generality rather than desk-reject.

Referee Report

3 major / 6 minor

Summary. SkillGuard proposes a skill-centric permission framework for LLM agent ecosystems, treating skills as permission-bearing executable artifacts that must be governed on both a context plane (instructions and loaded content that reshape reasoning) and an action plane (tool calls and side effects). The system comprises a JSON skill-manifest DSL, a multi-group capability taxonomy, PreToolUse-style runtime mediation, user confirm grants, deny-by-default policy composition, a shell permission-generation mini-agent, and audit logging. Empirically, dual-annotator analysis of 315 SkillsMP skills reports 99.76% predefined protected-object coverage; automated manifest generation on 23 clean SkillInject skills reaches 91.0% capability F1; and on SkillInject, SkillGuard reduces ASR from 32.37% to 23.02% (contextual) and 25.56% to 16.67% (obvious) while TSR falls only from 86.96% to 85.51% on eligible benign tasks.

Significance. If the dual-plane framing and runtime design hold, this is a timely systems contribution: skills are becoming a first-class distribution unit for agent behavior, and prior work largely secures either tool calls (Progent, AgentBound) or context integrity (A2AS) without a skill-level principal that links declared intent, context admission, and side effects. Strengths include a concrete, inspectable DSL; a taxonomy validated by dual human annotation with reported agreement; honest residual-attack case analysis; paired ASR/TSR measurement against a no-guard baseline; and promised artifacts. The absolute ASR reductions are modest, but the paper correctly positions SkillGuard as a practical foundation rather than a complete defense. Credit is due for measuring utility preservation and for documenting over-declaration and capability-reuse bypasses rather than only reporting headline ASR drops.

major comments (3)

The central dual-plane claim (§1–§2.1, Abstract) is only partially supported by the evaluation. Table 1 includes LOAD_CONTEXT and related Agent Ecosystem permissions, but RQ3 (§8, Table 4) measures success via whether adversarial behavior occurs under PreToolUse capability checks; residual successes (e.g., INST-26_email-api_task1 mass-forward using already-declared web.post / external_api.call) are action-plane capability reuse, not evidence that context admission was jointly enforced. Please either (i) add an ablation that isolates context-plane mediation (instruction/context admission vs. action denial), or (ii) narrow claims so that dual-plane governance is presented as an architectural goal with action-plane results as the primary empirical support.
§5.1 states that every confirm decision is auto-approved as a one-time allow in unattended runs, so only deny paths are stress-tested. User Interaction (§4.3)—Allow once / Allow this session / Deny, and non-widening session grants—is a core module of the claimed framework, yet RQ3–RQ4 never measure consent burden, grant-scope correctness, or attacks that rely on over-broad user approvals. This is load-bearing for the complete-mediation and least-privilege story. At minimum, report how many mediated calls would have required confirm under the generated manifests, and discuss how auto-approve biases ASR/TSR; ideally add a small interactive or simulated-consent study.
§8 acknowledges that SkillGuard cannot separate legitimate from adversarial use of declared capabilities, yet the paper’s strongest claim still frames skill-level capability declarations plus tool-call mediation as a sufficient security principal for skill attacks. Given residual ASR of 16.67–23.02% and the email-api / similar bypass class, the Discussion should more sharply bound what permission manifests can and cannot stop (e.g., task-alignment or data-flow checks for ambiguous declared permissions), and avoid implying that dual-plane skill permissions close the skill-injection problem rather than reduce a subset of undeclared-capability attacks.
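The second major comment asks how many mediated calls would have required confirmation and how auto-approval biases the results. A minimal sketch of that bookkeeping follows, assuming the raw decision for each mediated call is available as a string; the decision list is hypothetical and is not data from the paper.

```python
from collections import Counter


def confirm_rate(decisions):
    """Fraction of mediated calls whose raw decision was "confirm"."""
    return decisions.count("confirm") / len(decisions) if decisions else 0.0


def resolve(decisions, on_confirm):
    """Final outcome counts when every "confirm" is answered with `on_confirm`."""
    if on_confirm not in ("allow", "deny"):
        raise ValueError("on_confirm must be 'allow' or 'deny'")
    return Counter(on_confirm if d == "confirm" else d for d in decisions)


# Hypothetical raw decisions for one run; not data from the paper.
RAW = ["allow", "confirm", "confirm", "deny", "allow"]
AUTO_APPROVE = resolve(RAW, "allow")   # the unattended setting described in §5.1
CAUTIOUS_USER = resolve(RAW, "deny")   # a simulated user who rejects every prompt
```

The two resolutions bracket what a real user could do with the same prompts. Reporting attack and task success under both would show how much of the measured outcome depends on the auto-approve policy.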

minor comments (6)

§6 reports Cohen’s κ=0.656 as “strong agreement”; under Landis & Koch this is conventionally “substantial.” Please use standard terminology.
RQ2 (§7, Table 3): 56.5% of manifests are over-declared. Briefly discuss operational impact (prompt fatigue, user habituation) beyond noting that over-declaration is “safer.”
Threats to validity (§11.2) correctly note single model (MiMo-V2.5-Pro) and single scaffold (Claude Code). A short qualitative argument why tool-to-capability mapping should transfer would strengthen external validity claims in §11.1.
Ensure abstract and body statistics match exactly (skill count, coverage %, ASR/TSR). Any stale abstract numbers should be reconciled before camera-ready.
Figures 4–6 use log-scale token/time axes; add median/mean annotations in the figure or caption so readers need not rely only on prose.
Minor polish: “PerT ool U se” / spacing artifacts in Figure 2 labels; “decently maintaining” in the short abstract is informal for a journal tone.
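The terminology point in the first minor comment can be checked mechanically. The sketch below computes Cohen's κ for two annotators and maps a value to the Landis and Koch (1977) descriptive bands; the function names are illustrative, and the thresholds are the conventional ones from that paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's marginal label frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def landis_koch_band(kappa):
    """Map kappa to the Landis and Koch (1977) descriptive label."""
    if kappa < 0.0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")
```

Calling `landis_koch_band(0.656)` returns `"substantial"`, matching the referee's suggested wording.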

Circularity Check

0 steps flagged

Empirical systems paper: ASR/TSR and taxonomy coverage are measured against external benchmarks and human annotations, not forced by construction from the framework's own definitions.

full rationale

SkillGuard's load-bearing claims are empirical comparisons, not first-principles derivations. Taxonomy coverage (RQ1) is measured by first extracting protected objects from skill text without consulting the taxonomy, then mapping them—so coverage is not tautological. Manifest quality (RQ2) is scored against independently authored human reference manifests. Defense and utility (RQ3–RQ4) compare a SkillGuard-enabled agent to an unguarded baseline on the external SkillInject benchmark under fixed auto-approve-confirm settings. Design principles cite Saltzer/Schroeder and Android as external inspiration; no uniqueness theorem, fitted constant, or self-citation chain forces the reported ASR/TSR reductions. Residual attack successes when adversaries reuse already-declared capabilities are limitations of the security principal, not circularity. No step reduces a claimed prediction to its own inputs by construction.

Axiom & Free-Parameter Ledger

3 free parameters · 5 axioms · 4 invented entities

Load-bearing content is mostly design axioms and invented policy machinery rather than fitted physical constants. Security claims rest on classical access-control principles applied to skills, on the adequacy of a fixed capability taxonomy, and on treating the tool-call hook as the complete mediation point. Free parameters are discrete design knobs (effects, protection levels, auto-approve experimental policy), not continuous fits to ASR.

free parameters (3)

permission effect defaults (allow/confirm/deny) and protection levels (normal/dangerous/system/redact) = DSL defaults every permission to confirm in generation; levels assigned by taxonomy design
Hand-chosen policy vocabulary and severity labels that determine when user prompts fire and how risky each capability is considered; not derived from data.
experimental auto-approve of confirm decisions = all confirm → one-time allow
Unattended evaluation policy that converts confirm→allow-once; changes measured security relative to real user mediation.
SkillsMP sampling (5 skills × 63 categories = 315) = 315 skills
Corpus size and stratified sample used to claim taxonomy expressiveness; different sample could shift coverage.
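The first free parameter, defaulting every generated permission to confirm, can be sketched as follows. The effect and protection-level vocabularies are the ones listed above; the manifest field names (`skill`, `permissions`, `capability`, `level`, `effect`) and the level assigned to each capability are hypothetical, since the paper's DSL syntax is not reproduced in this review.

```python
import json

EFFECTS = ("allow", "confirm", "deny")
PROTECTION_LEVELS = ("normal", "dangerous", "system", "redact")


def generate_manifest(skill, capability_levels, default_effect="confirm"):
    """Build a manifest dict in which every permission gets `default_effect`."""
    if default_effect not in EFFECTS:
        raise ValueError(f"unknown effect: {default_effect}")
    permissions = []
    for capability, level in sorted(capability_levels.items()):
        if level not in PROTECTION_LEVELS:
            raise ValueError(f"unknown protection level: {level}")
        permissions.append(
            {"capability": capability, "level": level, "effect": default_effect}
        )
    return {"skill": skill, "permissions": permissions}


# Illustrative input; the level assignments are assumptions, not the paper's.
MANIFEST = generate_manifest(
    "email-api", {"web.post": "dangerous", "external_api.call": "normal"}
)
MANIFEST_JSON = json.dumps(MANIFEST, indent=2)
```

Because the default effect is a single argument, changing it shifts every generated permission at once, which is why the ledger treats it as a design knob rather than a fitted value.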

axioms (5)

domain assumption: Saltzer–Schroeder least privilege, complete mediation, and fail-safe defaults apply to heterogeneous natural-language+code skill packages the same way they apply to classical software principals.
Stated as design foundation in §3; not proved for LLM agents that can be steered by context alone.
domain assumption: Agent skills are analogous to Android apps as third-party executable artifacts that must declare capabilities, be granted them, and have their use mediated.
§3–4.1 analogy that justifies the manifest-centric design.
ad hoc to paper: The host tool-call boundary (SessionStart / PreToolUse / SkillLoaded hooks) is a sufficient complete-mediation point for both action side effects and skill-driven context loading.
Enforcement architecture in §4.2–4.4; pure in-context influence that never becomes a mediated tool call is only partially covered.
ad hoc to paper: A fixed multi-group capability taxonomy can name essentially all protected objects skills target in the wild.
RQ1 premise; empirically supported at 99.76% on the sampled corpus but still a closed-vocabulary assumption.
domain assumption: LLM-as-judge scoring of ASR and TSR is an adequate outcome measure for defense effectiveness and utility.
§5.1 protocol following prior SkillInject/DDIPE practice.

invented entities (4)

Skill Manifest + SkillGuard policy DSL (capability, effect, constraints, session state) · no independent evidence
purpose: Declare and compose skill authority independently of host tool names.
Core artifact introduced in §4.1 and Figure 3; no independent standard yet outside this paper.
Dual-plane governance model (context plane + action plane) for skills · no independent evidence
purpose: Frame why tool-only or context-only defenses are incomplete for skills.
Problem formalization in §1–2; analytical construct rather than measured natural kind.
SkillGuard permission taxonomy (8 groups / protected objects / specific permissions) · no independent evidence
purpose: Canonical vocabulary for manifests and runtime mapping.
Table 1; coverage is evaluated but the taxonomy itself is author-defined.
Permission-generation mini-agent for shell-style execution · no independent evidence
purpose: Infer lower-level capabilities of commands and referenced scripts before allow.
§4.4 enforcement refinement; effectiveness shown only inside this system’s pipeline.

pith-pipeline@v1.1.0-grok45 · 22492 in / 3770 out tokens · 44176 ms · 2026-07-14T18:30:52.415032+00:00 · methodology

Original abstract

Skills extend LLM agents with reusable instructions, scripts, data, and tool bindings. This shift makes skills a new security principal in agent systems: a skill can alter the agent's reasoning before any tool is called, and it can also steer the agent toward actions with concrete side effects. However, current skill ecosystems lack a permission model that captures this dual role. Existing defenses either inspect skill files before use or constrain individual tool calls during execution, leaving the connection between skill-level intent, contextual influence, and runtime behavior weakly governed. In this paper, we present SkillGuard, a skill-centric permission framework that treats skills as permission-bearing executable artifacts. SkillGuard introduces a dual-plane governance model that jointly regulates context influence and action side effects through skill manifests, runtime permission control, user interaction, and policy enforcement. We evaluate the permission taxonomy expressiveness on 1,260 real-world skills, and 99.93% of observed protected objects are covered. In adversarial evaluations on SkillInject dataset, SkillGuard reduces attack success rate from 35.3% to 20.7% for contextual injections and from 36.7% to 18.0% for obvious injections, while decently maintaining benign task completion. These results suggest that SkillGuard, as a skill-centric permission framework, can provide a practical foundation for improving the security of agent skill ecosystems.

Figures

Figures reproduced from arXiv: 2606.03024 by Dianshu Liao, Kaiwen Yang, Shidong Pan, Tianyi Zhang, Xiaoyu Sun, Zhenchang Xing.

**Figure 2.** The workflow of SkillGuard. … requested permission is allowed under the session’s permission list, establishing a coarse-grained capability boundary. The access-control decision follows a deterministic order. If the requested capability is not declared, the declarations exist but their constraints do not match the tool input, or it requests a dangerous permission, the user will be informed to make the decis…

**Figure 3.** Abstract syntax of the SkillGuard policy DSL.

**Figure 5.** Per-run token distributions for the SkillGuard …

**Figure 6.** Per-run wall-clock duration distributions for the …

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Are You Still the Agent I Authorized? Earned Authority under a Fixed Ceiling for Evolving Agents
cs.AI 2026-07 conditional novelty 6.5

Evolving agents may change active authority only beneath an immutable user-issued effect ceiling, and a transition envelope decides whether the old grant survives mutation at all.

Reference graph

Works this paper leans on

36 extracted references · 13 linked inside Pith · cited by 1 Pith paper

[1] 2026. Agent Skills Marketplace. https://skillsmp.com/
[2] 2026. The Artifacts of SkillGuard. https://github.com/Dianshu-Liao/SkilLGuard
[3] Amirhossein Abaskohi, Amrutha Varshini Ramesh, Shailesh Nanisetty, Chirag Goel, David Vazquez, Christopher Pal, Spandana Gella, Giuseppe Carenini, and Issam H Laradji. 2025. AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery. arXiv preprint arXiv:2504.07421 (2025)
[4] Anthropic. 2025. Claude Sonnet 4.6. https://www.anthropic.com/claude/sonnet
[5] Anthropic. 2025. What is the Model Context Protocol (MCP)? https://modelcontextprotocol.io/docs/getting-started/intro
[6] Anthropic. 2026. Claude Code Docs. https://code.claude.com/docs/en/overview
[7] Bhavyansh. 2026. MCP vs Agent Skills: Which AI Architecture Pattern to Use. https://bhavyansh001.medium.com/mcp-vs-agent-skills-which-ai-architecture-pattern-to-use-mcp-deepdive-03-6a42185d9e7b
[8] Christoph Bühler, Matteo Biagiola, Luca Di Grazia, and Guido Salvaneschi. 2026. AgentBound: Securing Execution Boundaries of AI Agents. In Proceedings of the 34th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE) (Montreal, Canada, 2026-07) (2026, Vol. 3). ACM, New York, NY, USA, Article FSE096... doi:10.1145/3808103
[9] Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents. ArXiv abs/2406.13352 (2024)
[10] Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. More than you’ve asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models. ArXiv abs/2302.12173 (2023). https://api.semanticscholar.org/CorpusID:257102404
[11] Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. Not What You’ve Signed Up For: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection. Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security (2023). https://api.semanticscholar.org/CorpusID:258546941
[12] Tingxu Han, Yi Zhang, Wei Song, Chunrong Fang, Zhenyu Chen, Youcheng Sun, and Lijie Hu. 2026. SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering? arXiv preprint arXiv:2603.15401 (2026)
[13] Yinghan Hou and Zongyou Yang. 2026. Skillsieve: A hierarchical triage framework for detecting malicious ai agent skills. arXiv preprint arXiv:2604.06550 (2026)
[14] Xiaojun Jia, Jie Liao, Simeng Qin, Jindong Gu, Wenqi Ren, Xiaochun Cao, Yang Liu, and Philip Torr. 2026. Skillject: Automating stealthy skill-based prompt injection for coding agents with trace-driven closed-loop refinement. In The 6th Workshop of Adversarial Machine Learning on Computer Vision: Safety of Vision-Language Agents
[15] Yanna Jiang, Delong Li, Haiyu Deng, Baihe Ma, Xu Wang, Qin Wang, and Guangsheng Yu. 2026. SoK: Agentic Skills–Beyond Tool Use in LLM Agents. arXiv preprint arXiv:2602.20867 (2026)
[16] J Richard Landis and Gary G. Koch. 1977. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics 33, 2 (1977), 363–74
[17] Xiangyi Li, Wenbo Chen, Yimin Liu, Shenghan Zheng, Xiaokun Chen, Yifeng He, Yubo Li, Bingran You, Haotian Shen, Jiankai Sun, et al. 2026. SkillsBench: Benchmarking how well agent skills work across diverse tasks. arXiv preprint arXiv:2602.12670 (2026)
[18] George Ling, Shanshan Zhong, and Richard Huang. 2026. Agent skills: A data-driven analysis of claude skills for extending large language model functionality. arXiv preprint arXiv:2602.08004 (2026)
[19] Pei Liu, Li Li, Yanjie Zhao, Xiaoyu Sun, and John Grundy. 2020. Androzooopen: Collecting large-scale open source android apps for the research community. In Proceedings of the 17th International Conference on Mining Software Repositories. 548–552
[20] Yi Liu, Zhihao Chen, Yanjun Zhang, Gelei Deng, Yuekang Li, Jianting Ning, Ying Zhang, and Leo Yu Zhang. 2026. Malicious agent skills in the wild: A large-scale security empirical study. arXiv preprint arXiv:2602.06547 (2026)
[21] Eugene Neelou, Ivan Novikov, Max Moroz, Om Narayan, Tiffany Saade, Mika Ayenson, Ilya Kabanov, Jen Ozmen, Edward Lee, Vineeth Sai Narajala, et al.
[22] A2AS: agentic AI runtime security and Self-Defense. arXiv preprint arXiv:2510.13825 (2025)
[23] Linyue Pan, Lexiao Zou, Shuo Guo, Jingchen Ni, and Hai-Tao Zheng. 2026. Natural-language agent harnesses. arXiv preprint arXiv:2603.25723 (2026)
[24] Yubin Qu, Yi Liu, Tongcheng Geng, Gelei Deng, Yuekang Li, Leo Yu Zhang, Ying Zhang, and Lei Ma. 2026. Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems
[25] Brandon Radosevich and John Halloran. 2025. Mcp safety audit: Llms with the model context protocol allow major security exploits. arXiv preprint arXiv:2504.03767 (2025)
[26] Jerome H Saltzer and Michael D Schroeder. 1975. The protection of information in computer systems. Proc. IEEE 63, 9 (1975), 1278–1308
[28] David Schmotz, Luca Beurer-Kellner, Sahar Abdelnabi, and Maksym Andriushchenko. 2026. Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks. ArXiv abs/2602.20156 (2026)
[29] Erfan Shayegani, Md. Abdullah AL Mamun, Yu Fu, Pedram Zaree, Yue Dong, and Nael B. Abu-Ghazaleh. 2023. Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks. ArXiv abs/2310.10844 (2023). https://api.semanticscholar.org/CorpusID:264172191
[30] Tianneng Shi, Jingxuan He, Zhun Wang, Hongwei Li, Linyu Wu, Wenbo Guo, and Dawn Song. 2025. Progent: Programmable privilege control for llm agents. arXiv preprint arXiv:2504.11703 (2025)
[31] SkillsMP. 2026. SkillsMP: A Marketplace for Agent Skills. https://skillsmp.com/. Accessed: 2026-06-01
[32] Xiaoyu Sun, Xiao Chen, Li Li, Haipeng Cai, John Grundy, Jordan Samhi, Tegawendé Bissyandé, and Jacques Klein. 2023. Demystifying hidden sensitive operations in android apps. ACM Transactions on Software Engineering and Methodology 32, 2 (2023), 1–30
[33] Guiyao Tie, Jiawen Shi, Pan Zhou, and Lichao Sun. 2026. Badskill: Backdoor attacks on agent skills via model-in-skill poisoning. arXiv preprint arXiv:2604.09378 (2026)
[34] Peiran Wang, Xinfeng Li, Chong Xiang, Jinghuai Zhang, Ying Li, Lixia Zhang, Xiaofeng Wang, and Yuan Tian. 2026. The landscape of prompt injection threats in llm agents: From taxonomy to analysis. arXiv preprint arXiv:2602.10453 (2026)
[35] Qingtian Wang. 2025. The Comprehensive Review on Prompt Injection Attacks and Defense Mechanisms in Large Language Models. Science and Technology of Engineering, Chemistry and Environmental Protection (2025). https://api.semanticscholar.org/CorpusID:279511010
[36] Xiaomi MiMo Team. 2025. MiMo-V2.5-Pro. https://mimo.xiaomi.com/mimo-v2-5-pro
[37] Chenyu Zhou, Huacan Chai, Wenteng Chen, Zihan Guo, Rong Shan, Yuanyi Song, Tianyi Xu, Yingxuan Yang, Aofan Yu, Weiming Zhang, et al. 2026. Externalization in llm agents: A unified review of memory, skills, protocols and harness engineering. arXiv preprint arXiv:2604.08224 (2026)

[1] [1]

Agent Skills Marketplace

2026. Agent Skills Marketplace. https://skillsmp.com/

2026

[2] [2]

The Artifacts of SkillGuard

2026. The Artifacts of SkillGuard. https://github .com/Dianshu-Liao/SkilLGuard

2026

[3] [3]

Amirhossein Abaskohi, Amrutha Varshini Ramesh, Shailesh Nanisetty, Chirag Goel, David Vazquez, Christopher Pal, Spandana Gella, Giuseppe Carenini, and Issam H Laradji. 2025. AgentAda: Skill-Adaptive Data Analytics for Tailored Insight Discovery.arXiv preprint arXiv:2504.07421(2025)

arXiv 2025

[4] [4]

Anthropic. 2025. Claude Sonnet 4.6. https://www .anthropic.com/claude/sonnet

2025

[5] [5]

Anthropic. 2025. What is the Model Context Protocol (MCP)? https:// modelcontextprotocol.io/docs/getting-started/intro

2025

[6] [6]

Anthropic. 2026. Claude Code Docs. https://code .claude.com/docs/en/overview

2026

[7] [7]

Bhavyansh. 2026. MCP vs Agent Skills: Which AI Architecture Pattern to Use . https://bhavyansh001 .medium.com/mcp-vs-agent-skills-which-ai- architecture-pattern-to-use-mcp-deepdive-03-6a42185d9e7b

2026

[8] [8]

Christoph Bühler, Matteo Biagiola, Luca Di Grazia, and Guido Salvaneschi. 2026. AgentBound: Securing Execution Boundaries of AI Agents. InProceedings of the 34th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (FSE)(Montreal, Canada, 2026-07) (2026, Vol. 3). ACM, New York, NY, USA, Article FSE096...

doi:10.1145/3808103 2026

[9] [9]

Edoardo Debenedetti, Jie Zhang, Mislav Balunovi’c, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents.ArXivabs/2406.13352 (2024)

Pith/arXiv arXiv 2024

[10] [10]

Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. More than you’ve asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models.ArXivabs/2302.12173 (2023). https://api .semanticscholar.org/ CorpusID:257102404

Pith/arXiv arXiv 2023

[11] [11]

Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, and Mario Fritz. 2023. Not What You’ve Signed Up For: Compromising Real- World LLM-Integrated Applications with Indirect Prompt Injection.Proceedings of the 16th ACM Workshop on Artificial Intelligence and Security(2023). https: //api.semanticscholar.org/CorpusID:258546941

2023

[12] [12]

Tingxu Han, Yi Zhang, Wei Song, Chunrong Fang, Zhenyu Chen, Youcheng Sun, and Lijie Hu. 2026. SWE-Skills-Bench: Do Agent Skills Actually Help in Real-World Software Engineering?arXiv preprint arXiv:2603.15401(2026)

arXiv 2026

[13] [13]

Yinghan Hou and Zongyou Yang. 2026. Skillsieve: A hierarchical triage framework for detecting malicious ai agent skills.arXiv preprint arXiv:2604.06550(2026)

Pith/arXiv arXiv 2026

[14] [14]

Xiaojun Jia, Jie Liao, Simeng Qin, Jindong Gu, Wenqi Ren, Xiaochun Cao, Yang Liu, and Philip Torr. 2026. Skillject: Automating stealthy skill-based prompt injection for coding agents with trace-driven closed-loop refinement. InThe 6th Workshop of Adversarial Machine Learning on Computer Vision: Safety of Vision-Language Agents

2026

[15] [15]

Yanna Jiang, Delong Li, Haiyu Deng, Baihe Ma, Xu Wang, Qin Wang, and Guang- sheng Yu. 2026. SoK: Agentic Skills–Beyond Tool Use in LLM Agents.arXiv preprint arXiv:2602.20867(2026)

Pith/arXiv arXiv 2026

[16] [16]

J. Richard Landis and Gary G. Koch. 1977. An application of hierarchical kappa-type statistics in the assessment of majority agreement among multiple observers. Biometrics 33, 2 (1977), 363–374

1977

[17] [17]

Xiangyi Li, Wenbo Chen, Yimin Liu, Shenghan Zheng, Xiaokun Chen, Yifeng He, Yubo Li, Bingran You, Haotian Shen, Jiankai Sun, et al. 2026. SkillsBench: Benchmarking how well agent skills work across diverse tasks. arXiv preprint arXiv:2602.12670 (2026)

Pith/arXiv arXiv 2026

[18] [18]

George Ling, Shanshan Zhong, and Richard Huang. 2026. Agent skills: A data-driven analysis of Claude skills for extending large language model functionality. arXiv preprint arXiv:2602.08004 (2026)

arXiv 2026

[19] [19]

Pei Liu, Li Li, Yanjie Zhao, Xiaoyu Sun, and John Grundy. 2020. AndroZooOpen: Collecting large-scale open source Android apps for the research community. In Proceedings of the 17th International Conference on Mining Software Repositories. 548–552

2020

[20] [20]

Yi Liu, Zhihao Chen, Yanjun Zhang, Gelei Deng, Yuekang Li, Jianting Ning, Ying Zhang, and Leo Yu Zhang. 2026. Malicious agent skills in the wild: A large-scale security empirical study. arXiv preprint arXiv:2602.06547 (2026)

Pith/arXiv arXiv 2026

[21] [21]

Eugene Neelou, Ivan Novikov, Max Moroz, Om Narayan, Tiffany Saade, Mika Ayenson, Ilya Kabanov, Jen Ozmen, Edward Lee, Vineeth Sai Narajala, et al. 2025. A2AS: Agentic AI runtime security and self-defense. arXiv preprint arXiv:2510.13825 (2025)

arXiv 2025

[23] [23]

Linyue Pan, Lexiao Zou, Shuo Guo, Jingchen Ni, and Hai-Tao Zheng. 2026. Natural-language agent harnesses. arXiv preprint arXiv:2603.25723 (2026)

Pith/arXiv arXiv 2026

[24] [24]

Yubin Qu, Yi Liu, Tongcheng Geng, Gelei Deng, Yuekang Li, Leo Yu Zhang, Ying Zhang, and Lei Ma. 2026. Supply-Chain Poisoning Attacks Against LLM Coding Agent Skill Ecosystems

2026

[25] [25]

Brandon Radosevich and John Halloran. 2025. MCP safety audit: LLMs with the model context protocol allow major security exploits. arXiv preprint arXiv:2504.03767 (2025)

Pith/arXiv arXiv 2025

[26] [26]

Jerome H. Saltzer and Michael D. Schroeder. 1975. The protection of information in computer systems. Proc. IEEE 63, 9 (1975), 1278–1308

1975

[27] [28]

David Schmotz, Luca Beurer-Kellner, Sahar Abdelnabi, and Maksym Andriushchenko. 2026. Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks. ArXiv abs/2602.20156 (2026)

Pith/arXiv arXiv 2026

[28] [29]

Erfan Shayegani, Md. Abdullah AL Mamun, Yu Fu, Pedram Zaree, Yue Dong, and Nael B. Abu-Ghazaleh. 2023. Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks. ArXiv abs/2310.10844 (2023). https://api.semanticscholar.org/CorpusID:264172191

Pith/arXiv arXiv 2023

[29] [30]

Tianneng Shi, Jingxuan He, Zhun Wang, Hongwei Li, Linyu Wu, Wenbo Guo, and Dawn Song. 2025. Progent: Programmable privilege control for LLM agents. arXiv preprint arXiv:2504.11703 (2025)

Pith/arXiv arXiv 2025

[30] [31]

SkillsMP. 2026. SkillsMP: A Marketplace for Agent Skills. https://skillsmp.com/. Accessed: 2026-06-01

2026

[31] [32]

Xiaoyu Sun, Xiao Chen, Li Li, Haipeng Cai, John Grundy, Jordan Samhi, Tegawendé Bissyandé, and Jacques Klein. 2023. Demystifying hidden sensitive operations in Android apps. ACM Transactions on Software Engineering and Methodology 32, 2 (2023), 1–30

2023

[32] [33]

Guiyao Tie, Jiawen Shi, Pan Zhou, and Lichao Sun. 2026. Badskill: Backdoor attacks on agent skills via model-in-skill poisoning. arXiv preprint arXiv:2604.09378 (2026)

Pith/arXiv arXiv 2026

[33] [34]

Peiran Wang, Xinfeng Li, Chong Xiang, Jinghuai Zhang, Ying Li, Lixia Zhang, Xiaofeng Wang, and Yuan Tian. 2026. The landscape of prompt injection threats in LLM agents: From taxonomy to analysis. arXiv preprint arXiv:2602.10453 (2026)

arXiv 2026

[34] [35]

Qingtian Wang. 2025. The Comprehensive Review on Prompt Injection Attacks and Defense Mechanisms in Large Language Models. Science and Technology of Engineering, Chemistry and Environmental Protection (2025). https://api.semanticscholar.org/CorpusID:279511010

2025

[35] [36]

Xiaomi MiMo Team. 2025. MiMo-V2.5-Pro. https://mimo.xiaomi.com/mimo-v2-5-pro

2025

[36] [37]

Chenyu Zhou, Huacan Chai, Wenteng Chen, Zihan Guo, Rong Shan, Yuanyi Song, Tianyi Xu, Yingxuan Yang, Aofan Yu, Weiming Zhang, et al. 2026. Externalization in LLM agents: A unified review of memory, skills, protocols and harness engineering. arXiv preprint arXiv:2604.08224 (2026).

Pith/arXiv arXiv 2026