MalSkillBench supplies the first sandbox-verified dataset of malicious agent skills and shows that existing detectors achieve high recall on code injection but collapse on prompt injection and agent-control attacks.
In: Proceedings of the 17th International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA)
9 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (9)
roles: background (3)
polarities: background (3)
representative citing papers
Large-scale scan found 2,289 malicious Go module versions and showed 99.4% remained retrievable via proxy after GitHub takedowns.
CodeQL detected 171 CVEs total, with 83 caught by a prior version before the fix; detections were often actionable within the vulnerable file but not stable across tool versions.
A taxonomy of GitHub abuse behaviors is proposed along with a detection framework achieving F1-scores exceeding 89% on a manually labeled dataset of 392 instances.
Large-scale analysis of 200K PyPI packages identifies 1,361 replicated popular packages, 256 replicated vulnerable packages, and 7 new replicated malicious packages, showing replication as a security threat vector.
Analysis of 67,453 OpenClaw skills shows three scanners overlap on at most 10.4% of combined positives, with 81.9% flagged by only one scanner and distinct profiles for malicious versus suspicious skills.
Proposes cryptographic registry identity, a dual-signature model, and authoritative namespace binding as three defense layers against dependency confusion.
Hidden dependencies and component variants in SBOMs cause inconsistent vulnerability reporting and VEX handling across scanners.
Systematic review of 97 studies on breaking changes in five software ecosystems, producing a four-dimensional taxonomy, reason/impact categories, 43 detection approaches, and 66 mitigation strategies.
citing papers explorer
Longitudinal Analyses of SAST Tools: A CodeQL Case Study
CodeQL detected 171 CVEs total, with 83 caught by a prior version before the fix; detections were often actionable within the vulnerable file but not stable across tool versions.