A matched-pair protocol and Accurate Differentiation Rate metric reveal that conventional LLM accuracy on SAT problems is often inflated by over-predicting satisfiability, while cross-representation agreement exceeds 80 percent for most models.
A few billion lines of code later: Using static analysis to find bugs in the real world. Communications of the ACM, 53(2):66–75
8 Pith papers cite this work.
years: 2026 (8)
representative citing papers
A categorical framework characterizes robustness in program analysis as functors and gives recipes for lifting sound robust analyses from restricted models to general programs.
Steering LLM residual streams with random sparse vectors creates detectable self-recognition fingerprints that enable over 98% accurate attribution of generated text to specific models without degrading output quality.
SAGE uses sparse autoencoders to boost vulnerability signals in LLMs, raising internal SNR 12.7x and delivering up to 318% MCC gains on vulnerability detection benchmarks.
SAILOR combines static analysis and LLM-orchestrated synthesis to automatically generate symbolic execution harnesses, discovering 379 previously unknown memory-safety vulnerabilities across 10 large open-source C/C++ projects where the strongest baseline found only 12.
Analysis of 67,453 OpenClaw skills shows three scanners overlap on at most 10.4% of combined positives, with 81.9% flagged by only one scanner and distinct profiles for malicious versus suspicious skills.
STAF applies sentence embeddings from transformers to classify SCA findings, reaching 89% F1 and beating prior filters by 11% within projects and 6% across projects.
A benchmarking experiment finds low rediscovery rates for three models on six Mythos-linked bug tasks, with only six target matches across 54 attempts under controlled prompting.
citing papers explorer
- Towards Better Static Code Analysis Reports: Sentence Transformer-based Filtering of Non-Actionable Alerts