{"total":13,"items":[{"citing_arxiv_id":"2606.31272","ref_index":20,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"The Decomposition Is the Fingerprint: Per-Component Identity for Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-06-30T07:45:33+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"A per-component SimHash fingerprint supplies structural identity for AI agent skills, recovering family membership under paraphrase and refactoring with AUC 0.974 while localizing changes.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.07131","ref_index":23,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"MalSkillBench: A Runtime-Verified Benchmark of Malicious Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-06-05T10:43:19+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":8.0,"formal_verification":"none","one_line_summary":"MalSkillBench supplies the first sandbox-verified dataset of malicious agent skills and shows that existing detectors achieve high recall on code injection but collapse on prompt injection and agent-control attacks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00925","ref_index":25,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems","primary_cat":"cs.CR","submitted_at":"2026-05-30T23:19:30+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SkillVetBench is a two-stage benchmark combining natural-language semantic vetting and instrumented sandbox execution to detect and provide runtime evidence for malicious skills in open agent platforms, with experiments showing static 
methods miss up to 89% of threats.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00448","ref_index":17,"ref_count":1,"confidence":0.98,"is_internal_anchor":true,"paper_title":"When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems","primary_cat":"cs.SE","submitted_at":"2026-05-30T00:38:07+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"About 18.2% of structurally flagged skill pairs represent genuine compositional safety risks in agent skill registries, with exploitation gated by host model behavior.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14460","ref_index":14,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Exploiting LLM Agent Supply Chains via Payload-less Skills","primary_cat":"cs.CR","submitted_at":"2026-05-14T06:55:47+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Semantic Compliance Hijacking lets attackers hijack LLM agents by disguising malicious instructions as compliance rules in skills, reaching up to 77.67% success on confidentiality breaches and 67.33% on RCE while evading all tested scanners.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.11770","ref_index":46,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Behavioral Integrity Verification for AI Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-05-12T08:41:09+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"BIV audits AI agent skills at scale, finding 80% deviate from declared behavior on 49,943 skills and achieving 0.946 F1 for malicious skill 
detection.","context_count":1,"top_context_role":"baseline","top_context_polarity":"baseline","context_text":"SkillJect [45] (200 attacks + 50 clean controls). Real-world samples carry ecological validity; the synthetic sources cover adversarial diversity at higher count. We compare BIV against two baselines representing rule-based and LLM-based state of the art. The rule-based baseline is the behavioral-analysis component of the Cisco AI Defense skill scanner [46]. The LLM-only baseline is the single-pass audit protocol from Liu et al. [44], which issues one security-audit call over the raw skill content; we instantiate it with Claude Sonnet 4.5 to match BIV's judge backbone, isolating the contribution of structural evidence. All 906 skills are held out from prompt engineering, threshold tuning, and judge-panel calibration; the relaxed-veto threshold is fixed by the structural-rule defini-"},{"citing_arxiv_id":"2605.11047","ref_index":8,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Red-Teaming Agent Execution Contexts: Open-World Security Evaluation on OpenClaw","primary_cat":"cs.CR","submitted_at":"2026-05-11T13:20:02+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"DeepTrap automates discovery of contextual vulnerabilities in OpenClaw agents via trajectory optimization, showing that unsafe behavior can be induced while preserving task completion and that final-response checks are insufficient.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.08442","ref_index":24,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Defense effectiveness across architectural layers: a mechanistic evaluation of persistent memory attacks on stateful LLM 
agents","primary_cat":"cs.CR","submitted_at":"2026-05-08T20:04:57+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Memory Sandbox at the memory layer reduces persistent memory attack success rate to 0% for eight of nine models with no utility cost, while input-level and retrieval-level defenses achieve near-baseline attack success rates of 88-89%.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.05868","ref_index":23,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"SkillScope: Toward Fine-Grained Least-Privilege Enforcement for Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-05-07T08:34:14+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SkillScope detects over-privileged LLM agent skills with 94.53% F1 score via graph analysis and replay validation, finding 7,039 problematic skills in the wild and reducing violations by 88.56% while preserving task completion.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"on-principles/a-guide-to-the-data-protection-principles/data-minimisation/. Accessed: 2026-04-21. [22] Xiaojun Jia, Jie Liao, Simeng Qin, Jindong Gu, Wenqi Ren, Xiaochun Cao, Yang Liu, and Philip Torr. 2026. Skillject: Automating stealthy skill-based prompt injection for coding agents with trace-driven closed-loop refinement. arXiv preprint arXiv:2602.14211 (2026). [23] Juhee Kim, Woohyuk Choi, and Byoungyoung Lee. 2025. Prompt flow integrity to prevent privilege escalation in llm agents. arXiv preprint arXiv:2503.15547 (2025). [24] Hao Li, Xiaogeng Liu, Hung-Chun Chiu, Dianqi Li, Ning Zhang, and Chaowei Xiao. 2025. 
Drift: Dynamic rule-based defense with injection isolation for securing llm agents. arXiv preprint arXiv:2506."},{"citing_arxiv_id":"2604.25109","ref_index":8,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Structured Security Auditing and Robustness Enhancement for Untrusted Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-04-28T01:32:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"SkillGuard-Robust formulates pre-load auditing of untrusted Agent Skills as a three-way classification task and achieves 97.30% exact match and 98.33% malicious-risk recall on held-out benchmarks.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.06550","ref_index":24,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills","primary_cat":"cs.CR","submitted_at":"2026-04-08T00:58:48+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.04759","ref_index":7,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Your Agent, Their Asset: A Real-World Safety Analysis of OpenClaw","primary_cat":"cs.CR","submitted_at":"2026-04-06T15:27:05+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Poisoning any single CIK dimension of an AI agent raises average attack success rate from 24.6% to 64-74% across models, and tested defenses leave substantial residual 
risk.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.02900","ref_index":158,"ref_count":1,"confidence":0.9,"is_internal_anchor":true,"paper_title":"Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses","primary_cat":"cs.CR","submitted_at":"2026-03-28T13:21:44+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Leakage (5):[378] [346] [445] [227] [471] Memory Defenses Defenses (2):[134] [14] Self-Evolving (§ 6.3) Emerging Risks Misalignment (1):[302] Capability Expansion (1):[16] Embodied Alignment Defenses (8):[135] [12] [169] [255] [434] [271] [391] [45] Cascading Risks (§ 6.4) Emerging Risks Cross-Layer (3):[354] [418] [456] Supply Chain (7):[40] [368] [290] [382] [162] [237] [158] Figure 4The roadmap of this survey. or agentic-level threats. [143] surveys security threats and defenses for LLM-controlled robotics, but scopes narrowly to LLM integration without addressing broader perception or interaction layers. [281] provides a policy-oriented risk taxonomy spanning physical, informational, and social dimensions, but does not analyze"}],"limit":50,"offset":0}