The FormAI Dataset: Generative AI in Software Security through the Lens of Formal Verification
3 Pith papers cite this work.
Fields: cs.SE (3)
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Citing papers:
- The Illusion of Safety: Multi-Tier Verification of AI vs. Human C++ Code
  Multi-tier verification on VULBENCH-CPP shows that AI-generated C++ code triggers confirmed runtime violations roughly twice as often as human code, while static analysis misleadingly indicates parity due to code length.
- Uncovering Similar but Different Packages in PyPI and Potential Security Threats
  Large-scale analysis of 200K PyPI packages identifies 1,361 replicated popular packages, 256 replicated vulnerable packages, and 7 new replicated malicious packages, showing replication as a security threat vector.
- Poking Around in the Dark: Why a Shared Understanding of Components Matters
  Empirical evaluation shows that popular SBOM tools miss CIMs across languages, so security-grade SBOMs are not achievable under current definitions.