{"total":20,"items":[{"citing_arxiv_id":"2606.11671","ref_index":1,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Runtime Skill Audit: Targeted Runtime Probing for Agent Skill Security","primary_cat":"cs.CR","submitted_at":"2026-06-10T05:29:34+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Runtime Skill Audit introduces targeted runtime probing to detect malicious LLM agent skills, reporting 90% accuracy and resilience to self-evolving attacks on 100 skills versus static baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00925","ref_index":32,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems","primary_cat":"cs.CR","submitted_at":"2026-05-30T23:19:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SkillVetBench is a two-stage benchmark combining natural-language semantic vetting and instrumented sandbox execution to detect and provide runtime evidence for malicious skills in open agent platforms, with experiments showing static methods miss up to 89% of threats.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00448","ref_index":3,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems","primary_cat":"cs.SE","submitted_at":"2026-05-30T00:38:07+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"About 18.2% of structurally flagged skill pairs represent genuine compositional safety risks in agent skill registries, with exploitation gated by host model 
behavior.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.20631","ref_index":18,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Harnessing Agent Skills: Architectural Patterns and a Reference Architecture for Skill-Mediated LLM Agents","primary_cat":"cs.AI","submitted_at":"2026-05-29T02:12:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Catalogs ten patterns and synthesizes a four-layer reference architecture for skill harnessing in LLM agents, evaluated via cross-instantiation on eight systems.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14460","ref_index":18,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Exploiting LLM Agent Supply Chains via Payload-less Skills","primary_cat":"cs.CR","submitted_at":"2026-05-14T06:55:47+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Semantic Compliance Hijacking lets attackers hijack LLM agents by disguising malicious instructions as compliance rules in skills, reaching up to 77.67% success on confidentiality breaches and 67.33% on RCE while evading all tested scanners.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.13044","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-05-13T05:57:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"MODERATE","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Sefz discovers specification violations in 29.9% of 402 real-world agent skills by translating guardrails into reachability goals and 
guiding LLM mutations with a multi-armed bandit.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12875","ref_index":15,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Do Skill Descriptions Tell the Truth? Detecting Undisclosed Security Behaviors in Code-Backed LLM Skills","primary_cat":"cs.CR","submitted_at":"2026-05-13T01:44:10+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SKILLSCOPE detects undisclosed security behaviors in LLM skill implementations via security property graphs and taxonomy-based consistency checking, identifying confirmed inconsistencies in 9.4% of 4,556 evaluated skills with 84.8% precision and 96.5% recall against human review.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.12015","ref_index":68,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"SkillSafetyBench: Evaluating Agent Safety under Skill-Facing Attack Surfaces","primary_cat":"cs.CR","submitted_at":"2026-05-12T12:03:54+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11891","ref_index":19,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Proteus: A Self-Evolving Red Team for Agent Skill Ecosystems","primary_cat":"cs.CR","submitted_at":"2026-05-12T10:05:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Proteus demonstrates that adaptive red-teaming achieves 40-90% attack success after five rounds and bypasses even strong auditors at up to 41% joint success, revealing that static skill vetting underestimates residual 
risk.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"classifier over fixed or pre-collected inputs, rather than as a component exposed to an adaptive skill author who revises the same artefact across rounds after observing the auditor's findings. 2 Reported skill attacks and adjacent injection vectors. Documented skill-side attacks include operational-narrative injection [17], credential-theft and instruction-exploitation patterns at the ∼100k-skill scale [19], indirect prompt injection and agent-injection benchmarks [5, 7, 35, 39], environmental injection in web agents [16], learnable execution triggers [25], and harm benchmarks [1, 22]. These works characterize the attack surface, but typically evaluate static, hand-crafted, or template-driven artefacts. They do not measure the residual risk left by an auditor once the same skill author can"},{"citing_arxiv_id":"2605.11770","ref_index":45,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Behavioral Integrity Verification for AI Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-05-12T08:41:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"BIV audits AI agent skills at scale, finding 80% deviate from declared behavior on 49,943 skills and achieving 0.946 F1 for malicious skill detection.","context_count":1,"top_context_role":"dataset","top_context_polarity":"use_dataset","context_text":"Φ(s) exposes the credential-exfiltration kill chain through structural taint analysis, and the LLM judge independently catches the instruction-override directives in the markdown. Benchmarks and baselines. The 906 skills mix three sources: MaliciousAgentSkillsBench [44] (44 real-world malware + 410 benign); Skill-Inject [27] (160 attacks + 42 clean controls); and SkillJect [45] (200 attacks + 50 clean controls). 
Real-world samples carry ecological validity; the synthetic sources cover adversarial diversity at higher count. We compare BIV against two baselines representing rule-based and LLM-based state of the art. The rule-based baseline is the behavioral-analysis component of the Cisco AI Defense skill scanner [46]. The LLM-only baseline"},{"citing_arxiv_id":"2605.11418","ref_index":6,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Under the Hood of SKILL.md: Semantic Supply-chain Attacks on AI Agent Skill Registry","primary_cat":"cs.AI","submitted_at":"2026-05-12T02:11:54+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Semantic manipulations of SKILL.md descriptions enable effective supply-chain attacks that bias AI agent skill registries toward adversarial skills in discovery, selection, and governance.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Agent Skills address this problem by packaging reusable domain knowledge, procedural instructions, scripts, references, and assets into lightweight filesystem modules that agents can load on demand [4, 5]. This design has quickly produced a large ecosystem of community skill registries, with recent work reporting more than 98,000 skills within the first three months [6]. This growth also introduces a new supply-chain risk. Like traditional packages, third-party skills may contain malicious code or installation steps; unlike traditional packages, they also contain natural-language instructions that agents may read, trust, and act upon. 
Real-world incidents such as ClawHavoc and malicious OpenClaw skills distributing Atomic macOS Stealer show that skill"},{"citing_arxiv_id":"2605.09594","ref_index":19,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Trust Me, Import This: Dependency Steering Attacks via Malicious Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-05-10T15:13:38+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Malicious Skills induce coding agents to hallucinate and import attacker-controlled packages at high rates while evading detection.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"shape the model's dependency selection behavior in advance. The attacker does not need to modify model weights, poison training data, compromise the package registry, or control the user's prompt. Instead, the attack operates through a Skill that the coding agent treats as trusted development guidance. Skills significantly expand the attack surface of agentic coding systems [16], [17], [18], [19], [20]. We use the term Skill broadly to refer to persistent instruction artifacts, including Claude Skills [16], Cursor Rules [21], Windsurf Rules [22], AutoGen system prompts [23], LangChain instruction templates [24], and project-specific markdown instruction files. 
These artifacts commonly encode coding conventions, preferred frameworks, architectural assumptions, workflow"},{"citing_arxiv_id":"2605.05868","ref_index":30,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"SkillScope: Toward Fine-Grained Least-Privilege Enforcement for Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-05-07T08:34:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SkillScope detects over-privileged LLM agent skills with 94.53% F1 score via graph analysis and replay validation, finding 7,039 problematic skills in the wild and reducing violations by 88.56% while preserving task completion.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"mystifying rce vulnerabilities in llm-integrated apps. In Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security. 1716-1730. [29] Yi Liu, Zhihao Chen, Yanjun Zhang, Gelei Deng, Yuekang Li, Jianting Ning, Ying Zhang, and Leo Yu Zhang. 2026. Malicious agent skills in the wild: A large-scale security empirical study. arXiv preprint arXiv:2602.06547 (2026). [30] Yi Liu, Gelei Deng, Yuekang Li, Kailong Wang, Zihao Wang, Xiaofeng Wang, Tianwei Zhang, Yepang Liu, Haoyu Wang, Yan Zheng, et al. 2023. Prompt injection attack against llm-integrated applications. arXiv preprint arXiv:2306.05499 (2023). [31] Yang Liu, Dan Iter, Yichong Xu, Shuohang Wang, Ruochen Xu, and Chenguang Zhu. 2023. 
G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment."},{"citing_arxiv_id":"2605.05274","ref_index":28,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Sealing the Audit-Runtime Gap for LLM Skills","primary_cat":"cs.CR","submitted_at":"2026-05-06T14:23:22+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"SIGIL cryptographically seals the audit-runtime gap for LLM skills via an on-chain registry with four publication types, DAO vetting, and a runtime verification loader that enforces integrity and permissions.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.22888","ref_index":7,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"RouteGuard: Internal-Signal Detection of Skill Poisoning in LLM Agents","primary_cat":"cs.CR","submitted_at":"2026-04-24T09:07:05+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"RouteGuard uses response-conditioned attention and hidden-state alignment to detect skill poisoning in LLM agents, achieving 0.8834 F1 on Skill-Inject benchmarks and recovering 90.51% of attacks missed by lexical screening.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.15415","ref_index":40,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"HarmfulSkillBench: How Do Harmful Skills Weaponize Your Agents?","primary_cat":"cs.CR","submitted_at":"2026-04-16T17:31:52+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Harmful skills in open agent ecosystems raise average harm scores from 0.27 to 0.76 across six LLMs by lowering refusal rates when tasks are presented via pre-installed 
skills.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"In this section, we introduce how we collect skills from online skill registries. Specifically, we focus on ClawHub and Skills.Rest, two of the largest public skill registries, which also differ substantially in platform structure and distribution mechanisms, making them well-suited for providing a comprehensive perspective on the diverse agent skill ecosystems [40, 24]. ClawHub. ClawHub [13] is a public, versioned registry for agent skills, where publishers can upload and manage multiple iterations of their skills. We first collect the complete corpus of skills from ClawHub's official GitHub repository, 3 where each skill is organized within a dedicated directory containing its SKILL.md file and corresponding scripts."},{"citing_arxiv_id":"2604.06550","ref_index":7,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-04-08T00:58:48+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.03070","ref_index":31,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"How Your Credentials Are Leaked by LLM Agent Skills: An Empirical Study","primary_cat":"cs.CR","submitted_at":"2026-04-03T14:50:16+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"3 Security Analysis of LLM Agent Ecosystems Prompt injection techniques can manipulate agent decision-making [13, 17, 49, 59], and unchecked autonomy enables unintended system damage [29, 47]. 
At the ecosystem level, Deng et al. [14] proposed a five-layer lifecycle framework for agent threats, Shen et al. [48] cataloged 221 vulnerabilities across 50 agent applications, Maloyan et al. [31] systematized 42 attack techniques against coding assistants, and Jiang et al. [23] surveyed the full skill lifecycle. Defensive work includes privilege control frameworks [50, 54], sandbox isolation [32, 55], reasoning-based guardrails [20, 38], capability-based formal analysis [7], and MCP-specific exploit benchmarks [21, 60]. None of the above work examines how the NL+PL skill architec-"},{"citing_arxiv_id":"2604.02837","ref_index":10,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Towards Secure Agent Skills: Architecture, Threat Taxonomy, and Security Analysis","primary_cat":"cs.CR","submitted_at":"2026-04-03T07:56:42+00:00","verdict":"ACCEPT","verdict_confidence":"MODERATE","novelty_score":8.0,"formal_verification":"none","one_line_summary":"Agent Skills has structural security weaknesses from missing data-instruction boundaries, single-approval persistent trust, and absent marketplace reviews that require fundamental redesign.","context_count":1,"top_context_role":"background","top_context_polarity":"support","context_text":"once a user approves a Skill, it silently inherits persistent permissions to read and write files, download code, and open network connections, all without further prompts [9]. In January 2026, a coordinated supply chain campaign systematically compromised over 1,184 Skills in a major community marketplace, approximately one in five available packages, delivering a credential-theft payload to unsuspecting users [10, 11]. A concurrent large-scale empirical study that scanned 42,447 Skills found that 26.1% contained at least one security vulnerability, spanning 14 distinct patterns across four categories: prompt injection, data exfiltration, privilege escalation, and supply chain risks [12]. 
Independent researchers further demonstrated that Skill-based prompt injection"},{"citing_arxiv_id":"2602.12430","ref_index":34,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Agent Skills for Large Language Models: Architecture, Acquisition, Security, and the Path Forward","primary_cat":"cs.MA","submitted_at":"2026-02-12T21:33:25+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"across four categories: prompt injection, data exfiltration (13.3%), privilege escalation (11.8%), and supply chain risks. Skills bundling executable scripts are 2.12× more likely to contain vulnerabilities than instruction-only skills (OR = 2.12, p < 0.001). 5.2% of skills exhibit high-severity patterns strongly suggesting malicious intent. 6.3 Confirmed Malicious Skills A subsequent study [34] constructed the first ground-truth dataset of confirmed malicious skills by behaviorally verifying 98,380 skills from two community registries. Among 157 confirmed malicious skills with 632 vulnerabilities, the authors identified two attack archetypes: Data Thieves that exfiltrate credentials through supply chain techniques, and Agent Hijackers that subvert agent decision-making through instruction manipulation."}],"limit":50,"offset":0}