Formal analysis and supply chain security for agentic AI skills

Varun Pratap Bhardwaj · 2026 · arXiv 2603.00195

9 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline 1

citation-polarity summary

baseline 1

representative citing papers

MalSkillBench: A Runtime-Verified Benchmark of Malicious Agent Skills

cs.CR · 2026-06-05 · unverdicted · novelty 8.0

MalSkillBench supplies the first sandbox-verified dataset of malicious agent skills and shows that existing detectors achieve high recall on code injection but collapse on prompt injection and agent-control attacks.

Skills Are Not Islands: Measuring Dependency and Risk in Agent Skill Supply Chains

cs.SE · 2026-07-01 · unverdicted · novelty 7.0

The paper defines Agent Skill Supply Chains (ASSCs) and introduces SkillDepAnalyzer, which extracts and analyzes dependency graphs from over 1.43 million LLM agent skills, revealing structural patterns and security signals.

Sealing the Audit-Runtime Gap for LLM Skills

cs.CR · 2026-05-06 · unverdicted · novelty 7.0

SIGIL cryptographically seals the audit-runtime gap for LLM skills via an on-chain registry with four publication types, DAO vetting, and a runtime verification loader that enforces integrity and permissions.

Runtime Skill Audit: Targeted Runtime Probing for Agent Skill Security

cs.CR · 2026-06-10 · unverdicted · novelty 6.0

Runtime Skill Audit introduces targeted runtime probing to detect malicious LLM agent skills, reporting 90% accuracy on 100 skills, along with resilience to self-evolving attacks, in comparison with static baselines.

When Safe Skills Collide: Measuring Compositional Risk in Agent Skill Ecosystems

cs.SE · 2026-05-30 · unverdicted · novelty 6.0

About 18.2% of structurally flagged skill pairs represent genuine compositional safety risks in agent skill registries, with exploitation gated by host model behavior.

Behavioral Integrity Verification for AI Agent Skills

cs.CR · 2026-05-12 · unverdicted · novelty 6.0

BIV audits AI agent skills at scale, finding that 80% of 49,943 skills deviate from their declared behavior and achieving 0.946 F1 for malicious skill detection.

Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems

cs.CR · 2026-05-30 · unverdicted · novelty 5.0

SkillVetBench is a two-stage benchmark combining natural-language semantic vetting and instrumented sandbox execution to detect and provide runtime evidence for malicious skills in open agent platforms, with experiments showing static methods miss up to 89% of threats.

Qualixar OS: A Universal Operating System for AI Agent Orchestration

cs.AI · 2026-04-07 · unverdicted · novelty 4.0

Qualixar OS provides a runtime for multi-agent AI systems with support for 12 topologies, LLM-driven team design, dynamic routing, consensus judging, content attribution, and protocol bridging, achieving 100% accuracy on a custom 20-task suite at $0.000039 mean cost per task.

SkillSieve: A Hierarchical Triage Framework for Detecting Malicious AI Agent Skills

cs.CR · 2026-04-08

citing papers explorer

Showing 1 of 1 citing paper after filters.

Behavioral Integrity Verification for AI Agent Skills

cs.CR · 2026-05-12 · unverdicted · none · ref 22

BIV audits AI agent skills at scale, finding that 80% of 49,943 skills deviate from their declared behavior and achieving 0.946 F1 for malicious skill detection.
