pith. machine review for the scientific record.

arxiv: 2605.05868 · v1 · submitted 2026-05-07 · 💻 cs.CR


SkillScope: Toward Fine-Grained Least-Privilege Enforcement for Agent Skills


Pith reviewed 2026-05-08 09:29 UTC · model grok-4.3

classification 💻 cs.CR
keywords Agent Skills · Least-Privilege Enforcement · Over-Privilege Detection · LLM Agents · Graph-Based Analysis · Security · Privilege Constraining

The pith

SkillScope models agent skills as graphs of instruction and code actions to detect and constrain those that exceed the needs of a specific user task.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces SkillScope as a way to enforce least-privilege on reusable LLM agent skills, which bundle instructions and code that may perform actions unnecessary for the current user request. Existing checks fail because over-privilege is conditional on the exact task, so the same skill action can be legitimate in one prompt and excessive in another. SkillScope builds fine-grained graphs linking natural-language steps to executable operations, identifies candidate violations, replays them against task-specific instantiations to confirm over-privilege, and then inserts control-flow restrictions to block only the excess actions. Evaluation shows it reaches 94.53 percent F1 on detection and cuts over-privileged action instances by 88.56 percent across thousands of real skills while still completing intended tasks.

Core claim

SkillScope achieves fine-grained least-privilege enforcement for Agent Skills by representing instruction-level procedures and code-level operations as fine-grained action nodes in a graph, extracting potential over-privilege candidates from this structure, validating candidates through replay-based analysis under graph-instantiated user tasks, and applying control-flow privilege constraining to validated violations.

What carries the argument

Graph-based modeling of instruction procedures and code operations as fine-grained action nodes, combined with replay validation on task-instantiated graphs and subsequent control-flow constraining.
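As a concrete illustration of this mechanism, the unified graph of instruction-level, code-level, and predicate nodes linked by execution-dependency edges can be sketched as a small data model. This is a hypothetical sketch based on the paper's description, not its implementation; all class and field names are invented.

```python
from dataclasses import dataclass, field
from enum import Enum

class NodeKind(Enum):
    INSTRUCTION = "instruction"  # natural-language step from the skill's instructions
    CODE = "code"                # executable operation, e.g. a file write or API call
    PREDICATE = "predicate"      # branch condition gating downstream actions

@dataclass
class ActionNode:
    node_id: str
    kind: NodeKind
    label: str  # e.g. "summarize repo" or "requests.post(...)"

@dataclass
class ExecutionGraph:
    """Directed graph over instruction-, code-, and predicate-level nodes,
    with edges recording execution dependencies (illustrative sketch)."""
    nodes: dict = field(default_factory=dict)   # node_id -> ActionNode
    edges: list = field(default_factory=list)   # (src_id, dst_id) pairs

    def add_node(self, node: ActionNode) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: str, dst: str) -> None:
        self.edges.append((src, dst))

    def successors(self, node_id: str) -> list:
        return [dst for src, dst in self.edges if src == node_id]

# A two-node fragment: an instruction step that depends on a code operation.
g = ExecutionGraph()
g.add_node(ActionNode("i1", NodeKind.INSTRUCTION, "summarize repo"))
g.add_node(ActionNode("c1", NodeKind.CODE, "open(path).read()"))
g.add_edge("i1", "c1")
```

Linking instruction nodes to the code nodes they trigger is what makes the analysis fine-grained: a candidate violation can be localized to a specific operation rather than to the skill as a whole.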

If this is right

  • Least-privilege violations appear widespread: 7,039 real-world skills exhibit over-privileged behaviors.
  • The 88.56 percent reduction in triggered over-privileged actions comes without degrading legitimate task completion.
  • The same graph and replay approach can be applied at scale to audit entire skill repositories before deployment.
  • Control-flow constraining offers a practical way to limit skill behavior without rewriting the underlying code.
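Control-flow constraining, as described, gates each action behind a task-conditioned check rather than editing the skill's code. A minimal sketch, assuming a per-task allow-list produced by the validation stage; the function name and allow-list representation are invented for illustration:

```python
def constrain(action_fn, action_id, allowed_actions):
    """Wrap a skill action so it only executes when the current task's
    allow-list (derived from replay validation) includes it.
    Hypothetical sketch, not the paper's API."""
    def guarded(*args, **kwargs):
        if action_id not in allowed_actions:
            raise PermissionError(
                f"action {action_id!r} exceeds the current task's scope")
        return action_fn(*args, **kwargs)
    return guarded

# For a "summarize this file" task, only read_file is on the allow-list;
# the skill's email-sending action is blocked without being removed.
allowed = {"read_file"}
read_file = constrain(lambda path: "contents", "read_file", allowed)
send_email = constrain(lambda to: f"sent to {to}", "send_email", allowed)
```

Because the guard consults a per-task allow-list at call time, the same wrapped action can run under one prompt and be refused under another, which matches the task-conditioned framing of over-privilege.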

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Skill marketplaces could adopt similar checks as a required gate before listing new skills.
  • Integration into agent runtimes might allow dynamic privilege adjustment as user prompts evolve during a session.
  • The technique could generalize to other reusable capability bundles beyond LLM agents, such as browser extensions or API toolkits.

Load-bearing premise

The graph model plus replay validation correctly distinguishes actions that are required for the current task from those that are over-privileged, without wrongly blocking legitimate task completions.
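Under that premise, the replay step reduces to a set comparison: actions the skill can reach, minus actions the task-specific replay actually required. A deliberately simplified sketch (the real system validates candidates by replaying graph-instantiated tasks; names here are hypothetical):

```python
def over_privileged_for_task(skill_actions, required_for_task):
    """Actions the skill can trigger that the given task does not need.

    skill_actions: every action reachable in the skill's execution graph.
    required_for_task: actions exercised when replaying the task-specific
    instantiation. Both are illustrative stand-ins for the paper's machinery.
    """
    return set(skill_actions) - set(required_for_task)

# The same action is legitimate or excessive depending on the task:
skill = {"read_file", "send_email", "delete_branch"}
summarize_excess = over_privileged_for_task(skill, {"read_file"})
notify_excess = over_privileged_for_task(skill, {"read_file", "send_email"})
```

A false positive here would constrain an action a legitimate task needs, which is exactly the failure mode the load-bearing premise rules out.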

What would settle it

A collection of agent skills containing known task-conditioned over-privileges where SkillScope either fails to flag genuine excesses or incorrectly constrains actions needed for legitimate task completion.

Figures

Figures reproduced from arXiv: 2605.05868 by Huaijin Wang, Jiangrong Wu, Shuai Wang, Yixi Lin, Yuhong Nan, Yuming Xiao, Zibin Zheng.

Figure 1: Progressive disclosure in Agent Skill execution.
Figure 2: A skill introduces instruction-level (left) and …
Figure 3: Overview of SkillScope. From the paper: given a skill σ and a user prompt p, the goal is to determine whether σ performs actions unnecessary for fulfilling the intended task, and to localize the behaviors responsible.
Figure 4: The process of unified execution graph construction. From the paper: given a skill bundle σ, SkillScope constructs a directed graph G = (V, E), where V = V_I ∪ V_C ∪ V_P denotes instruction-level action nodes, code-level action nodes, and predicate nodes, respectively, and E denotes execution dependency edges among these nodes.
Figure 5: The process of user task (user prompt) instantiation. From the paper: because the full natural-language task space of a skill is difficult to enumerate (user intents are open-ended and semantically equivalent prompts vary widely), SkillScope uses the unified execution graph as a structural approximation of the skill's task space.
Figure 7: Task-conditioned over-privilege control.
Original abstract

Agent Skills have become a practical way to extend LLM agents by packaging metadata, natural-language instructions, and executable resources into reusable capability bundles. However, this growing Skill ecosystem introduces a new compliance risk: a Skill may perform high-impact actions that exceed the minimum necessary scope of the user's current task, thereby violating least-privilege. Existing skill detection approaches are insufficient for this problem because it is inherently task-conditioned: the same action may be necessary under one user prompt but over-privileged under another. In this paper, we present SkillScope, a framework for fine-grained least-privilege enforcement in Agent Skills. SkillScope adopts a graph-based analysis approach that models instruction-level procedures and code-level operations as fine-grained action nodes. It extracts potential over-privilege candidates, validates them under graph-instantiated user tasks through replay-based analysis, and constrains validated over-privileged actions via control-flow privilege constraining. We evaluate SkillScope through effectiveness experiments and large-scale real-world measurement. SkillScope achieves 94.53% F1 for skill over-privilege detection. In the wild, SkillScope validates 7,039 Skills with over-privileged behaviors, showing that least-privilege violations are prevalent in current Skill ecosystems. In the privilege-constraining evaluation, SkillScope reduces triggered over-privileged action-in-task instances by 88.56% while preserving legitimate task completion.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces SkillScope, a framework for fine-grained least-privilege enforcement in Agent Skills for LLM agents. It models instruction-level procedures and code-level operations as fine-grained action nodes in a graph, extracts potential over-privilege candidates, validates them via replay-based analysis under graph-instantiated user tasks, and constrains validated over-privileged actions through control-flow mechanisms. The evaluation reports 94.53% F1 for over-privilege detection, validates 7,039 skills with over-privileged behaviors in the wild, and achieves an 88.56% reduction in triggered over-privileged action-in-task instances while preserving legitimate task completion.

Significance. If the results hold, the work addresses an important emerging security and compliance issue in the expanding ecosystem of reusable Agent Skills by providing a task-conditioned enforcement approach. The graph-based modeling combined with replay validation offers a practical way to handle the context-dependent nature of over-privilege, and the large-scale wild measurement provides empirical evidence of the problem's prevalence. Strengths include the focus on preserving task completion alongside privilege reduction.

major comments (2)
  1. [Abstract and Evaluation] Abstract and Evaluation section: The reported 94.53% F1 score and 88.56% reduction lack any description of baselines, error bars, dataset selection criteria, or the exact protocol for establishing ground-truth labels on task-conditioned over-privilege (i.e., how necessity under one prompt vs. over-privilege under another is determined). This directly undermines verifiability of the central effectiveness claims.
  2. [Methodology] Methodology section: The graph construction for instruction-level procedures and code-level operations, along with the replay-based validation process, is described only at a high level. Without specifics on node granularity, task instantiation mechanics, or controls for false positives that could incorrectly constrain legitimate actions, the soundness of the over-privilege identification cannot be fully assessed.
minor comments (1)
  1. [Abstract] The abstract would benefit from including the total number of skills or tasks analyzed in the effectiveness experiments to provide context for the 7,039 validated skills figure.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address the major comments point-by-point below, agreeing that additional details are needed for verifiability and soundness assessment. We will incorporate the suggested clarifications in the revised manuscript.

Point-by-point responses
  1. Referee: [Abstract and Evaluation] Abstract and Evaluation section: The reported 94.53% F1 score and 88.56% reduction lack any description of baselines, error bars, dataset selection criteria, or the exact protocol for establishing ground-truth labels on task-conditioned over-privilege (i.e., how necessity under one prompt vs. over-privilege under another is determined). This directly undermines verifiability of the central effectiveness claims.

    Authors: We agree that the abstract is high-level and that the Evaluation section would benefit from explicit discussion of these elements to strengthen verifiability. The manuscript describes the dataset and ground-truth process at a summary level, but we will revise the Evaluation section to add: (1) explicit baselines (e.g., rule-based and LLM-only detectors), (2) error bars from repeated runs with statistical significance, (3) dataset selection criteria with inclusion/exclusion details, and (4) a dedicated subsection on the ground-truth protocol, including how task-conditioned necessity is distinguished from over-privilege via prompt variation and expert validation. These changes will directly address the concern without altering the reported results. revision: yes

  2. Referee: [Methodology] Methodology section: The graph construction for instruction-level procedures and code-level operations, along with the replay-based validation process, is described only at a high level. Without specifics on node granularity, task instantiation mechanics, or controls for false positives that could incorrectly constrain legitimate actions, the soundness of the over-privilege identification cannot be fully assessed.

    Authors: We acknowledge that the Methodology section presents the graph modeling and replay validation conceptually to focus on the overall framework. To enable full assessment of soundness, we will expand Section 3 with: (1) precise node granularity definitions (e.g., atomic instruction steps and code API calls with examples), (2) detailed task instantiation mechanics (how user prompts are mapped to graph paths), and (3) explicit false-positive controls, including validation thresholds, replay consistency checks, and manual audit procedures for constrained actions. These additions will clarify how legitimate actions are preserved while identifying over-privilege. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

Full rationale

The provided abstract and description outline SkillScope as a graph-based framework that models procedures, extracts over-privilege candidates, validates via replay, and constrains actions, with results reported from separate effectiveness experiments (94.53% F1) and real-world measurements (7,039 skills, 88.56% reduction). No equations, parameter-fitting steps, self-citations as load-bearing premises, or derivations that reduce outputs to inputs by construction appear in the text. Claims rest on external experimental validation rather than self-referential definitions or renamed fits. The work is self-contained against the given description.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities can be extracted. The approach implicitly relies on choices in graph construction and replay simulation that may function as unstated modeling assumptions.

pith-pipeline@v0.9.0 · 5561 in / 1154 out tokens · 51325 ms · 2026-05-08T09:29:34.954918+00:00 · methodology


Reference graph

Works this paper leans on

66 extracted references · 24 canonical work pages · 6 internal anchors

  1. [1]

    Alice-Dot-Io. 2026. Caterpillar: Caterpillar is a security scanning library for AI agent skill files (e.g., Claude Code skills) for dangerous or malicious behavior. https://github.com/alice-dot-io/caterpillar

  2. [2]

    Anthropic. 2026. Claude Code. https://www.anthropic.com/product/claude-code. Accessed: 2026-04-19

  3. [3]

    Anysphere. 2026. Cursor: The Best Way to Code with AI. https://cursor.com/. Accessed: 2026-04-19

  4. [4]

    Shih-Han Chan. 2025. Encrypted Prompt: Securing LLM Applications Against Unauthorized Actions. arXiv:2503.23250 [cs.CR] https://arxiv.org/abs/2503.23250

  5. [5]

    Sivajeet Chand, Melih Kilic, Roland Würsching, Sushant Kumar Pandey, and Alexander Pretschner. 2025. Automated Extract Method Refactoring with Open-Source LLMs: A Comparative Study. In 2025 2nd IEEE/ACM International Conference on AI-powered Software (AIware). IEEE, 113–122

  6. [6]

    Sizhe Chen, Julien Piet, Chawin Sitawarin, and David Wagner. 2025. StruQ: Defending against prompt injection with structured queries. In 34th USENIX Security Symposium (USENIX Security 25). 2383–2400

  7. [7]

    Sizhe Chen, Arman Zharmagambetov, Saeed Mahloujifar, Kamalika Chaudhuri, David Wagner, and Chuan Guo. 2025. SecAlign: Defending against prompt injection with preference optimization. In Proceedings of the 2025 ACM SIGSAC Conference on Computer and Communications Security. 2833–2847

  8. [8]

    Cisco AI Defense. 2026. Skill Scanner. https://github.com/cisco-ai-defense/skill-scanner. Accessed: 2026-04-21

  9. [9]

    ClawHub. 2026. ClawHub. https://clawhub.ai/. Accessed: 2026-04-19

  10. [10]

    Edoardo Debenedetti, Jie Zhang, Mislav Balunović, Luca Beurer-Kellner, Marc Fischer, and Florian Tramèr. 2024. AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents. arXiv preprint arXiv:2406.13352 (2024)

  11. [11]

    Gelei Deng, Yi Liu, Yuekang Li, Kailong Wang, Ying Zhang, Zefeng Li, Haoyu Wang, Tianwei Zhang, and Yang Liu. 2023. Masterkey: Automated jailbreak across multiple large language model chatbots. arXiv preprint arXiv:2307.08715 (2023)

  12. [12]

    Tian Dong, Minhui Xue, Guoxing Chen, Rayne Holland, Yan Meng, Shaofeng Li, Zhen Liu, and Haojin Zhu. 2023. The philosopher's stone: Trojaning plugins of large language models. arXiv preprint arXiv:2312.00374 (2023)

  13. [13]

    Zenghao Duan, Yuxin Tian, Zhiyi Yin, Liang Pang, Jingcheng Deng, Zihao Wei, Shicheng Xu, Yuyao Ge, and Xueqi Cheng. 2026. SkillAttack: Automated Red Teaming of Agent Skills through Attack Path Refinement. arXiv preprint arXiv:2604.04989 (2026)

  14. [14]

    Manuel Egele, Christopher Kruegel, Engin Kirda, and Giovanni Vigna. 2011. PiOS: Detecting Privacy Leaks in iOS Applications. In Proceedings of the Network and Distributed System Security Symposium

  15. [15]

    William Enck, Peter Gilbert, Byung-Gon Chun, Landon P. Cox, Jaeyeon Jung, Patrick McDaniel, and Anmol N. Sheth. 2010. TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones. In Proceedings of the 9th USENIX Symposium on Operating Systems Design and Implementation

  16. [16]

    European Data Protection Supervisor. 2026. Data Minimisation. https://www.edps.europa.eu/data-protection/data-protection/glossary/d_en. Accessed: 2026-04-21

  17. [17]

    European Union. 2016. General Data Protection Regulation (GDPR), Article 5: Principles relating to processing of personal data. https://gdpr-info.eu/art-5-gdpr/. Accessed: 2026-04-21

  18. [18]

    Adrienne Porter Felt, Erika Chin, Steve Hanna, Dawn Song, and David Wagner. 2011. Android Permissions Demystified. In Proceedings of the 18th ACM Conference on Computer and Communications Security (CCS). ACM, 627–638. https://doi.org/10.1145/2046707.2046779

  20. [20]

    Zihan Guo, Zhiyu Chen, Xiaohang Nie, Jianghao Lin, Yuanjian Zhou, and Weinan Zhang. 2026. SkillProbe: Security Auditing for Emerging Agent Skill Marketplaces via Multi-Agent Collaboration. arXiv preprint arXiv:2603.21019 (2026)

  21. [21]

    Hamza Harkous, Kassem Fawaz, Rémi Lebret, Florian Schaub, Kang G. Shin, and Karl Aberer. 2018. Polisis: Automated Analysis and Presentation of Privacy Policies Using Deep Learning. In Proceedings of the 27th USENIX Security Symposium

  22. [22]

    Information Commissioner's Office. 2026. Principle (c): Data minimisation. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-principles/a-guide-to-the-data-protection-principles/data-minimisation/. Accessed: 2026-04-21

  23. [23]

    Xiaojun Jia, Jie Liao, Simeng Qin, Jindong Gu, Wenqi Ren, Xiaochun Cao, Yang Liu, and Philip Torr. 2026. Skillject: Automating stealthy skill-based prompt injection for coding agents with trace-driven closed-loop refinement. arXiv preprint arXiv:2602.14211 (2026)

  24. [24]

    Juhee Kim, Woohyuk Choi, and Byoungyoung Lee. 2025. Prompt flow integrity to prevent privilege escalation in LLM agents. arXiv preprint arXiv:2503.15547 (2025)

  25. [25]

    Hao Li, Xiaogeng Liu, Hung-Chun Chiu, Dianqi Li, Ning Zhang, and Chaowei Xiao. 2025. Drift: Dynamic rule-based defense with injection isolation for securing LLM agents. arXiv preprint arXiv:2506.12104 (2025)

  26. [26]

    Yixi Lin, Jiangrong Wu, Yuhong Nan, Xueqiang Wang, Xinyuan Zhang, and Zibin Zheng. 2026. AgentRaft: Automated Detection of Data Over-Exposure in LLM Agents. arXiv preprint arXiv:2603.07557 (2026)

  27. [27]

    Yen-Ting Lin and Yun-Nung Chen. 2023. LLM-Eval: Unified Multi-Dimensional Automatic Evaluation for Open-Domain Conversations with Large Language Models. In Proceedings of the 5th Workshop on NLP for Conversational AI

  28. [28]

    Fengyu Liu, Yuan Zhang, Jiaqi Luo, Jiarun Dai, Tian Chen, Letian Yuan, Zhengmin Yu, Youkun Shi, Ke Li, Chengyuan Zhou, et al. 2025. Make agent defeat agent: Automatic detection of taint-style vulnerabilities in LLM-based agents. In 34th USENIX Security Symposium (USENIX Security 25). 3767–3786

  29. [29]

    Tong Liu, Zizhuang Deng, Guozhu Meng, Yuekang Li, and Kai Chen. 2024. Demystifying RCE vulnerabilities in LLM-integrated apps. In Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security. 1716–1730

  30. [30]

    Yi Liu, Zhihao Chen, Yanjun Zhang, Gelei Deng, Yuekang Li, Jianting Ning, Ying Zhang, and Leo Yu Zhang. 2026. Malicious agent skills in the wild: A large-scale security empirical study. arXiv preprint arXiv:2602.06547 (2026)

  31. [31]

    Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, Zihao Wang, Xiaofeng Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, et al. 2023. Prompt injection attack against LLM-integrated applications. arXiv preprint arXiv:2306.05499 (2023)

  32. [32]

    Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, and Chenguang Zhu. 2023. G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

  33. [33]

    Yupei Liu, Yuqi Jia, Runpeng Geng, Jinyuan Jia, and Neil Zhenqiang Gong. 2024. Formalizing and benchmarking prompt injection attacks and defenses. In 33rd USENIX Security Symposium (USENIX Security 24). 1831–1847

  34. [34]

    Yi Liu, Weizhe Wang, Ruitao Feng, Yao Zhang, Guangquan Xu, Gelei Deng, Yuekang Li, and Leo Zhang. 2026. Agent Skills in the Wild: An Empirical Study of Security Vulnerabilities at Scale. arXiv preprint arXiv:2601.10338 (2026)

  35. [35]

    MockLoop. 2025. MockLoop MCP Documentation. https://docs.mockloop.com/. Accessed: 2026-04-29

  36. [36]

    Model Context Protocol. 2025. Filesystem MCP Server. https://github.com/modelcontextprotocol/servers/blob/main/src/filesystem/README.md. Accessed: 2026-04-29

  37. [37]

    Model Context Protocol. 2025. mcp-server-git. https://pypi.org/project/mcp-server-git/. Accessed: 2026-04-29

  38. [38]

    National Institute of Standards and Technology. 2020. Security and Privacy Controls for Information Systems and Organizations. Special Publication 800-53 Rev. 5. Control AC-6: Least Privilege

  39. [39]

    National Institute of Standards and Technology. 2025. Least Privilege. https://csrc.nist.gov/glossary/term/least_privilege. Accessed: 2026-04-29

  40. [40]

    Nova-Hunting. 2026. Nova-Proximity: an MCP and Agent Skills security scanner powered by NOVA. https://github.com/Nova-Hunting/nova-proximity

  41. [41]

    OpenAI. 2025. GPT-5.1. https://openai.com/index/gpt-5-1/. Accessed: 2026-04-19

  42. [42]

    OpenAI. 2026. Codex: AI Coding Partner from OpenAI. https://openai.com/codex/. Accessed: 2026-04-19

  43. [43]

    Rodrigo Pedro, Miguel E Coimbra, Daniel Castro, Paulo Carreira, and Nuno Santos. 2025. Prompt-to-SQL injections in LLM-integrated web applications: Risks and defenses. In Proceedings of the IEEE/ACM 47th International Conference on Software Engineering. 1768–1780

  44. [44]

    Dorin Pomian, Abhiram Bellur, Malinda Dilhara, Zarina Kurbatova, Egor Bogomolov, Andrey Sokolov, Timofey Bryksin, and Danny Dig. 2024. EM-Assist: Safe automated extract method refactoring with LLMs. In Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering. 582–586

  45. [45]

    Franziska Roesner, Tadayoshi Kohno, Alexander Moshchuk, Bryan Parno, Helen J. Wang, and Crispin Cowan. 2012. User-Driven Access Control: Rethinking Permission Granting in Modern Operating Systems. In Proceedings of the 2012 IEEE Symposium on Security and Privacy

  46. [46]

    Jerome H. Saltzer and Michael D. Schroeder. 1975. The Protection of Information in Computer Systems. Proc. IEEE 63, 9 (1975), 1278–1308. https://doi.org/10.1109/PROC.1975.9939

  47. [47]

    sam2332. 2025. MCP Server Placeholder Image Generator. https://github.com/sam2332/Mcp-Server-Placeholder-Image-Generator. Accessed: 2026-04-29

  48. [48]

    David Schmotz, Sahar Abdelnabi, and Maksym Andriushchenko. 2025. Agent Skills Enable a New Class of Realistic and Trivially Simple Prompt Injections. arXiv preprint arXiv:2510.26328 (2025)

  49. [49]

    David Schmotz, Luca Beurer-Kellner, Sahar Abdelnabi, and Maksym Andriushchenko. 2026. Skill-inject: Measuring agent vulnerability to skill file attacks. arXiv preprint arXiv:2602.20156 (2026)

  50. [50]

    Tianneng Shi, Jingxuan He, Zhun Wang, Hongwei Li, Linyu Wu, Wenbo Guo, and Dawn Song. 2025. Progent: Programmable privilege control for LLM agents. arXiv preprint arXiv:2504.11703 (2025)

  51. [51]

    SkillsMP. 2026. Agent Skills Marketplace. https://skillsmp.com/. Accessed: 2026-04-19

  52. [52]

    Xuchen Suo. 2024. Signed-Prompt: A new approach to prevent prompt injection attacks against LLM-integrated applications. In AIP Conference Proceedings, Vol. 3194. AIP Publishing LLC, 040013

  53. [53]

    Liran Tal. 2026. Snyk Finds Prompt Injection in 36%, 1467 Malicious or Vulnerable AI Agent Skills in ClawHub. https://snyk.io/blog/toxicskills-malicious-ai-agent-skills-clawhub/. Accessed: 2026-04-21

  54. [54]

    TRAE. 2026. TRAE: Collaborate with Intelligence. https://www.trae.ai/. Accessed: 2026-04-19

  55. [55]

    vivekVells. 2025. mcp-pandoc: MCP Server for Document Format Conversion Using Pandoc. https://github.com/vivekVells/mcp-pandoc. Accessed: 2026-04-29

  56. [56]

    Shenao Wang, Junjie He, Yanjie Zhao, Yayi Wang, Kan Yu, and Haoyu Wang. 2026. "Elementary, My Dear Watson." Detecting Malicious Skills via Neuro-Symbolic Reasoning across Heterogeneous Artifacts. arXiv preprint arXiv:2603.27204 (2026)

  57. [57]

    weidwonder. 2025. Terminal MCP Server. https://github.com/weidwonder/terminal-mcp-server. Accessed: 2026-04-29

  58. [58]

    WellAlly Technology. 2026. Skill-Security-Scanner. https://github.com/huifer/skill-security-scan. GitHub repository. Accessed: 2026-04-21

  59. [59]

    Jiangrong Wu, Zitong Yao, Yuhong Nan, and Zibin Zheng. 2026. ChainFuzzer: Greybox Fuzzing for Workflow-Level Multi-Tool Vulnerabilities in LLM Agents. arXiv preprint arXiv:2603.12614 (2026)

  60. [60]

    Yisen Xu, Feng Lin, Jinqiu Yang, Nikolaos Tsantalis, et al. 2025. Mantra: Enhancing automated method-level refactoring with contextual RAG and multi-agent LLM collaboration. arXiv preprint arXiv:2503.14340 (2025)

  61. [61]

    Hanrong Zhang, Jingyuan Huang, Kai Mei, Yifei Yao, Zhenting Wang, Chenlu Zhan, Hongwei Wang, and Yongfeng Zhang. 2024. Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents. arXiv preprint arXiv:2410.02644 (2024)

  62. [62]

    Jinchuan Zhang, Yan Zhou, Binyuan Hui, Yaxin Liu, Ziming Li, and Songlin Hu. 2023. TrojanSQL: SQL injection against natural language interface to database. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 4344–4359

  64. [64]

    Ruiyi Zhang, David Sullivan, Kyle Jackson, Pengtao Xie, and Mei Chen. Defense against Prompt Injection Attacks via Mixture of Encodings. arXiv:2504.07467 [cs.CL] https://arxiv.org/abs/2504.07467

  66. [66]

    Lianmin Zheng, Wei-Lin Chiang, Ying Sheng, Siyuan Zhuang, Zhanghao Wu, Yonghao Zhuang, Zi Lin, Zhuohan Li, Dacheng Li, Eric P. Xing, Hao Zhang, Joseph E. Gonzalez, and Ion Stoica. 2023. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. In Advances in Neural Information Processing Systems