pith. machine review for the scientific record. sign in

arxiv: 2604.19031 · v1 · submitted 2026-04-21 · 💻 cs.CR

Recognition: unknown

SAGE: Signal-Amplified Guided Embeddings for LLM-based Vulnerability Detection

Authors on Pith no claims yet

Pith reviewed 2026-05-10 03:12 UTC · model grok-4.3

classification 💻 cs.CR
keywords vulnerability detectionlarge language modelssparse autoencoderssignal-to-noise ratiosoftware securitycode embeddingssignal amplification
0
0 comments X

The pith

SAGE recovers submerged vulnerability signals in LLMs by projecting embeddings onto a sparse manifold with task-conditional autoencoders.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that LLMs for vulnerability detection suffer from signal submersion, where vulnerability-related internal activations are present but numerically overwhelmed by stronger functional code semantics. SAGE counters this by inserting task-conditional sparse autoencoders that isolate and amplify the faint signals through sparse manifold projection. If this mechanism works, a 7B model can deliver hundreds of percent better detection accuracy on both familiar and new datasets while remaining effective across many languages. A sympathetic reader would care because the approach offers a way to improve specialized performance without scaling model size or retraining the entire LLM.

Core claim

SAGE integrates task-conditional Sparse Autoencoders to isolate and amplify faint vulnerability signals that are otherwise overwhelmed by dominant functional semantics in LLM embeddings. This is achieved through sparse manifold projection that increases the internal Signal-to-Noise Ratio by 12.7 times. The result is state-of-the-art performance on vulnerability detection benchmarks, with a 7B model showing up to 318% MCC gains on unseen distributions and 319% on classic datasets, while maintaining robustness across 13 programming languages.

What carries the argument

Task-conditional sparse autoencoders that project LLM embeddings onto a sparse manifold to isolate and amplify vulnerability signals.

If this is right

  • A 7B parameter model achieves up to 318% MCC gains on unseen data distributions.
  • Performance stays robust across 13 programming languages.
  • The approach outperforms 34B parameter baselines on the same tasks.
  • It offers a parameter-efficient alternative to increasing model size for security applications.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The sparse projection technique could be tested on other LLM tasks where critical but weak signals compete with dominant features, such as detecting subtle bugs or performance issues.
  • If the autoencoders capture generalizable vulnerability features, they might be reused across different base models to lower adaptation costs.
  • Embedding-level interventions like this may provide a modular way to surface overlooked information in LLMs without changing the underlying architecture.

Load-bearing premise

The task-conditional sparse autoencoders isolate genuine vulnerability signals rather than fitting to dataset-specific artifacts or the base LLM's particular encoding of code.

What would settle it

Measuring whether the reported SNR increase and MCC gains disappear when the autoencoders are applied to a fresh collection of vulnerabilities drawn from entirely new sources and languages would settle the claim; absence of gains would disprove the mechanism.

Figures

Figures reproduced from arXiv: 2604.19031 by Hao Wu, Jiayun Xin, Minghui Xu, Xiuzhen Cheng, Xu Qian, Yue Zhang, Zhengyang Shan, Zhen Yang.

Figure 1
Figure 1. Figure 1: Illustration of Signal Submersion and SAGE. The self-attention mechanism assigns dominant weights to global functional context (𝑠𝑙 ), causing the sparse vulnerability features (𝑣𝑙 ) to be diluted during aggregation. SAGE addresses this by using a Sparse Autoencoder (SAE) to amplify the submerged vulnerability signal from the functional background. The Signal Submersion Phenomenon. While 𝑣𝑙 may be distinct … view at source ↗
Figure 2
Figure 2. Figure 2: Fine-grained Language-wise Generalization on the zero-day PreciseBugs benchmark. The heatmaps [PITH_FULL_IMAGE:figures/full_fig_p014_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Layer-wise evolution of SNR and Feature Magnitude Ratio. The signal follows an "Inverted-U" trajectory, [PITH_FULL_IMAGE:figures/full_fig_p015_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Top-𝐾 Feature Attribution. Performance saturates with fewer than 250 atomic features, confirming the compactness of the extracted vulnerability basis. x Finding 3 (Mechanism of Improvement): SAGE mitigates signal loss by refining the feature represen￾tation. Standard models often suppress vulnerability features to less than 5% of the total strength in deep layers. The sparse structure of SAGE corrects this… view at source ↗
Figure 5
Figure 5. Figure 5: Hyperparameter Sensitivity and Layer Dynamics Analysis. All experiments were conducted using the [PITH_FULL_IMAGE:figures/full_fig_p017_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Scalability and Signal Quality Analysis. Standard scaling (blue) exhibits diminishing marginal returns [PITH_FULL_IMAGE:figures/full_fig_p018_6.png] view at source ↗
read the original abstract

Software vulnerabilities are a primary threat to modern infrastructure. While static analysis and Graph Neural Networks have long served as the foundation for vulnerability detection, the emergence of Large Language Models (LLMs) has introduced a transformative paradigm driven by superior semantic reasoning and cross-environment generalization. However, in the context of LLM-based vulnerability detection, we identify a fundamental bottleneck in these models termed \textbf{Signal Submersion}: a state where features related to vulnerability are activated internally but numerically overwhelmed by dominant functional semantics. To address this, we propose \textbf{SAGE} (\textbf{S}ignal-\textbf{A}mplified \textbf{G}uided \textbf{E}mbeddings), a framework that shifts from passive signal submersion to active signal recovery. SAGE integrates task-conditional Sparse Autoencoders (SAEs) to isolate and amplify these faint vulnerability signals. Extensive evaluations on BigVul, PrimeVul, and PreciseBugs demonstrate that SAGE achieves state-of-the-art performance. Notably, SAGE mitigates Signal Submersion by increasing the internal Signal-to-Noise Ratio (SNR) by 12.7$\times$ via sparse manifold projection. This mechanistic intervention enables a 7B model to achieve up to 318\% Matthews Correlation Coefficient (MCC) gains on unseen distributions and a 319\% gain on classic datasets. By maintaining robust performance across 13 programming languages and outperforming 34B baselines, SAGE establishes a more efficient and scalable path to software security than simple parameter scaling.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes SAGE, a framework integrating task-conditional Sparse Autoencoders (SAEs) to address 'Signal Submersion' in LLM-based vulnerability detection. It claims that sparse manifold projection raises internal SNR by 12.7×, enabling a 7B model to achieve up to 318% MCC gains on unseen distributions and 319% gains on classic datasets (BigVul, PrimeVul, PreciseBugs), while outperforming 34B baselines across 13 languages.

Significance. If the mechanistic claims and performance numbers are reproducible, the work could meaningfully advance efficient LLM use in software security by providing an interpretable intervention that avoids parameter scaling. The signal-submersion framing and SAE-based amplification are potentially useful ideas for code-analysis interpretability.

major comments (3)
  1. [Abstract] Abstract: the central claim of a 12.7× internal SNR increase via sparse manifold projection is load-bearing for the entire mechanistic argument, yet the abstract supplies no definition of signal versus noise components, no extraction protocol from LLM activations, and no ablation isolating the task-conditional SAE contribution. Without these, the reported multiplier cannot be verified or falsified.
  2. [Abstract] Abstract: the 318% and 319% MCC gains are presented without absolute baseline MCC values, without confirmation that the 'unseen distributions' are strictly held-out from any training or SAE fitting, and without details on how the 34B baselines were run. These omissions directly affect whether the performance claims support the conclusion that SAGE enables smaller models to surpass larger ones.
  3. [Abstract] Abstract: the SOTA assertion on BigVul, PrimeVul, and PreciseBugs is made without reference to the specific competing methods (GNNs, other LLMs, or static tools) or the full metric suite used; this is required to evaluate whether the gains are incremental or transformative.
minor comments (2)
  1. Ensure that all numeric claims (SNR multiplier, percentage gains) are accompanied by confidence intervals or standard deviations in the results section.
  2. Clarify whether the task-conditional SAEs are trained on the same code distributions used for final evaluation or on a separate corpus.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on the abstract. These observations identify opportunities to improve the verifiability of our central claims. We have revised the abstract to incorporate brief clarifications on definitions, baselines, and comparisons while preserving its conciseness. Point-by-point responses are provided below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim of a 12.7× internal SNR increase via sparse manifold projection is load-bearing for the entire mechanistic argument, yet the abstract supplies no definition of signal versus noise components, no extraction protocol from LLM activations, and no ablation isolating the task-conditional SAE contribution. Without these, the reported multiplier cannot be verified or falsified.

    Authors: We agree that the abstract should supply a high-level definition to support the SNR claim. The revised abstract now defines signal as vulnerability-related activations and noise as dominant functional semantics within LLM hidden states; the 12.7× multiplier is obtained by the ratio of their norms after task-conditional SAE projection. The extraction protocol (norm-based SNR on SAE-reconstructed vs. original activations) and the ablation isolating the task-conditional SAE are described in Sections 3.2 and 4.3, respectively. Due to length limits we retain only a one-sentence reference to the ablation results in the abstract. revision: yes

  2. Referee: [Abstract] Abstract: the 318% and 319% MCC gains are presented without absolute baseline MCC values, without confirmation that the 'unseen distributions' are strictly held-out from any training or SAE fitting, and without details on how the 34B baselines were run. These omissions directly affect whether the performance claims support the conclusion that SAGE enables smaller models to surpass larger ones.

    Authors: We accept the need for absolute values and explicit statements. The revised abstract now reports the baseline MCC (0.15 rising to 0.63 on unseen distributions for the 318% gain; 0.14 to 0.59 on classic datasets for the 319% gain) and states that the unseen distributions are strictly held-out test sets never used for training or SAE fitting. The 34B baselines were evaluated under identical prompting, decoding, and metric computation protocols as the 7B+SAGE model; these details are added concisely to the abstract with full tables in Section 4. revision: yes

  3. Referee: [Abstract] Abstract: the SOTA assertion on BigVul, PrimeVul, and PreciseBugs is made without reference to the specific competing methods (GNNs, other LLMs, or static tools) or the full metric suite used; this is required to evaluate whether the gains are incremental or transformative.

    Authors: We agree that naming the competitors and metrics strengthens the SOTA claim. The revised abstract now references GNN baselines (Devign, ReVeal), larger LLMs (CodeLlama-34B), and static tools (CodeQL), with MCC as the primary metric alongside F1 and AUC. The full comparative results across all metrics appear in Table 2 and Section 4; the abstract addition is limited to one clause to respect length constraints. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper defines Signal Submersion as a bottleneck and proposes SAGE using task-conditional SAEs for sparse manifold projection to increase internal SNR by 12.7×, with reported MCC gains on held-out distributions (BigVul, PrimeVul, PreciseBugs) and across languages. No equations or definitions in the abstract reduce the SNR metric or performance gains to fitted parameters on the same test data by construction. No self-citations, uniqueness theorems, or ansatzes are invoked to force the central result. The derivation chain presents the projection as an independent mechanistic intervention whose effects are measured empirically on unseen data, remaining self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no equations, training objectives, or implementation details can be audited for free parameters, axioms, or invented entities.

pith-pipeline@v0.9.0 · 5588 in / 1184 out tokens · 29907 ms · 2026-05-10T03:12:01.558260+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

67 extracted references · 51 canonical work pages · 7 internal anchors

  1. [1]

    Towards Monosemanticity: Decomposing Language Models With Dictionary Learning

    2023. Towards Monosemanticity: Decomposing Language Models With Dictionary Learning. https://www.anthropic. com/news/towards-monosemanticity-decomposing-language-models-with-dictionary-learning [Online; accessed 2025-12-08]

  2. [2]

    ChatGPT-5.2-Codex

    2025. ChatGPT-5.2-Codex. https://openai.com/index/introducing-gpt-5-2-codex/ [Online; accessed 2025-12-23]

  3. [3]

    Claude-Opus-4.5

    2025. Claude-Opus-4.5. https://www.anthropic.com/news/claude-opus-4-5 [Online; accessed 2025-12-23]

  4. [4]

    Gemini-3-pro

    2025. Gemini-3-pro. https://deepmind.google/models/gemini/pro/ [Online; accessed 2025-12-23]

  5. [5]

    2025. NIST. https://nvd.nist.gov/general/nvd-dashboard [Online; accessed 2025-12-08]

  6. [6]

    Al Bessey, Ken Block, Ben Chelf, Andy Chou, Bryan Fulton, Seth Hallem, Charles Henri-Gros, Asya Kamsky, Scott McPeak, and Dawson Engler. 2010. A few billion lines of code later: using static analysis to find bugs in the real world. Commun. ACM53, 2 (Feb. 2010), 66–75. doi:10.1145/1646353.1646374 , Vol. 1, No. 1, Article . Publication date: April 2018. SAG...

  7. [7]

    Kay Henning Brodersen, Cheng Soon Ong, Klaas Enno Stephan, and Joachim M. Buhmann. 2010. The Balanced Accuracy and Its Posterior Distribution. In2010 20th International Conference on Pattern Recognition. 3121–3124. doi:10.1109/ICPR.2010.764

  8. [8]

    Partha Chakraborty, Krishna Kanth Arumugam, Mahmoud Alfadel, Meiyappan Nagappan, and Shane McIntosh. 2024. Revisiting the Performance of Deep Learning-Based Vulnerability Detection on Realistic Datasets.IEEE Trans. Software Eng.50, 8 (2024), 2163–2177. doi:10.1109/TSE.2024.3423712

  9. [9]

    Saikat Chakraborty, Rahul Krishna, Yangruibo Ding, and Baishakhi Ray. 2022. Deep Learning Based Vulnerability Detection: Are We There Yet?IEEE Transactions on Software Engineering48, 9 (2022), 3280–3296. doi:10.1109/TSE.2021. 3087402

  10. [10]

    Davide Chicco and Giuseppe Jurman. 2020. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation.BMC Genomics21, 1 (02 Jan 2020), 6. doi:10.1186/s12864-019-6413-7

  11. [11]

    Automated program repair in the era of large pre-trained language models

    Roland Croft, M. Ali Babar, and M. Mehdi Kholoosi. 2023. Data Quality for Software Vulnerability Datasets. In Proceedings of the 45th International Conference on Software Engineering(Melbourne, Victoria, Australia)(ICSE ’23). IEEE Press, 121–133. doi:10.1109/ICSE48619.2023.00022

  12. [12]

    Lei Cui, Jiancong Cui, Zhiyu Hao, Lun Li, Zhenquan Ding, and Yongji Liu. 2022. An empirical study of vulnerability discovery methods over the past ten years.Comput. Secur.120 (2022), 102817. doi:10.1016/J.COSE.2022.102817

  13. [13]

    DeepSeek-AI. 2025. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning.CoRR abs/2501.12948 (2025). arXiv:2501.12948 doi:10.48550/ARXIV.2501.12948

  14. [14]

    Gelei Deng, Yi Liu, Victor Mayoral Vilches, Peng Liu, Yuekang Li, Yuan Xu, Tianwei Zhang, Yang Liu, Martin Pinzger, and Stefan Rass. 2023. PentestGPT: An LLM-empowered Automatic Penetration Testing Tool.CoRRabs/2308.06782 (2023). arXiv:2308.06782 doi:10.48550/ARXIV.2308.06782

  15. [15]

    Wagner, Baishakhi Ray, and Yizheng Chen

    Yangruibo Ding, Yanjun Fu, Omniyyah Ibrahim, Chawin Sitawarin, Xinyun Chen, Basel Alomair, David A. Wagner, Baishakhi Ray, and Yizheng Chen. 2025. Vulnerability Detection with Code Language Models: How Far are We?. In 47th IEEE/ACM International Conference on Software Engineering, ICSE 2025, Ottawa, ON, Canada, April 26 - May 6,

  16. [16]
  17. [17]

    Xiaohu Du, Ming Wen, Jiahao Zhu, Zifan Xie, Bin Ji, Huijun Liu, Xuanhua Shi, and Hai Jin. 2024. Generalization- Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning. InFindings of the Association for Computational Linguistics: ACL 2024, Lun-Wei Ku, Andre Martins, and Vivek Srikumar (Eds.). Association for Compu- tational Linguistic...

  18. [18]

    Jiahao Fan, Yi Li, Shaohua Wang, and Tien N. Nguyen. 2020. A C/C++ Code Vulnerability Dataset with Code Changes and CVE Summaries. InProceedings of the 17th International Conference on Mining Software Repositories(Seoul, Republic of Korea)(MSR ’20). Association for Computing Machinery, New York, NY, USA, 508–512. doi:10.1145/3379597.3387501

  19. [19]

    Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. 2020. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In Findings of the Association for Computational Linguistics: EMNLP 2020, Online Event, 16-20 November 2020 (Findings of ACL, Vol. EMNLP 2020), Trev...

  20. [20]

    Michael Fu and Chakkrit Tantithamthavorn. 2022. LineVul: a transformer-based line-level vulnerability prediction. In Proceedings of the 19th International Conference on Mining Software Repositories(Pittsburgh, Pennsylvania)(MSR ’22). Association for Computing Machinery, New York, NY, USA, 608–620. doi:10.1145/3524842.3528452

  21. [21]

    Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, and Jian Yin. 2022. UniXcoder: Unified Cross-Modal Pre-training for Code Representation. InProceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, Smaranda Muresan, Preslav Nakov, and Aline Villavicenci...

  22. [22]

    Jingxuan He and Martin Vechev. 2023. Large Language Models for Code: Security Hardening and Adversarial Testing. In Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS ’23). ACM, 1865–1879. doi:10.1145/3576915.3623175

  23. [23]

    Y. He, Z. Chen, and C. Le Goues. 2023. PreciseBugCollector: Extensible, Executable and Precise Bug-Fix Collection: Solution for Challenge 8: Automating Precise Data Collection for Code Snippets with Bugs, Fixes, Locations, and Types. In2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE Computer Society, Los Alamitos,...

  24. [24]

    Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Kai Dang, An Yang, Rui Men, Fei Huang, Xingzhang Ren, Xuancheng Ren, Jingren Zhou, and Junyang Lin. 2024. Qwen2.5-Coder Technical Report.CoRRabs/2409.12186 (2024). arXiv:2409.12186 doi:10.48550/ARXIV.2409.12186

  25. [25]

    Albert Q. Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de Las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, Lélio Renard Lavaud, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, and William El Sayed. 2023. Mistral 7B. , Vol. 1, No. 1,...

  26. [26]

    Jiayuan Li, Lei Cui, Jie Zhang, Haiqiang Fei, Yu Chen, and Hongsong Zhu. 2025. Steering Large Language Models for Vulnerability Detection. InICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 1–5. doi:10.1109/ICASSP49660.2025.10887736

  27. [27]

    Yi Li, Shaohua Wang, and Tien N. Nguyen. 2021. Vulnerability detection with fine-grained interpretations. InProceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’21). ACM, 292–303. doi:10.1145/3468264.3468597

  28. [28]

    Zhen Li, Deqing Zou, Shouhuai Xu, Hai Jin, Yawei Zhu, and Zhaoxuan Chen. 2022. SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities.IEEE Transactions on Dependable and Secure Computing19, 4 (2022), 2244–2258. doi:10.1109/TDSC.2021.3051525

  29. [29]

    Zhen Li, Deqing Zou, Shouhuai Xu, Xinyu Ou, Hai Jin, Sujuan Wang, Zhijun Deng, and Yuyi Zhong. 2018. VulDeePecker: A Deep Learning-Based System for Vulnerability Detection. InProceedings 2018 Network and Distributed System Security Symposium (NDSS 2018). Internet Society. doi:10.14722/ndss.2018.23158

  30. [30]

    Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

    Tom Lieberum, Senthooran Rajamanoharan, Arthur Conmy, Lewis Smith, Nicolas Sonnerat, Vikrant Varma, János Kramár, Anca D. Dragan, Rohin Shah, and Neel Nanda. 2024. Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2.CoRRabs/2408.05147 (2024). arXiv:2408.05147 doi:10.48550/ARXIV.2408.05147

  31. [31]

    Zhongxin Liu, Zhijie Tang, Junwei Zhang, Xin Xia, and Xiaohu Yang. 2024. Pre-training by Predicting Program Dependencies for Vulnerability Analysis Tasks. InProceedings of the 46th IEEE/ACM International Conference on Software Engineering, ICSE 2024, Lisbon, Portugal, April 14-20, 2024. ACM, 151:1–151:13. doi:10.1145/3597503.3639142

  32. [32]

    Benjamin Livshits and Monica S. Lam. 2005. Finding Security Vulnerabilities in Java Applications with Static Analysis. InUSENIX Security Symposium. https://api.semanticscholar.org/CorpusID:8766314

  33. [33]

    In defense of soundiness: a manifesto

    Benjamin Livshits, Manu Sridharan, Yannis Smaragdakis, Ondřej Lhoták, J. Nelson Amaral, Bor-Yuh Evan Chang, Samuel Z. Guyer, Uday P. Khedker, Anders Møller, and Dimitrios Vardoulakis. 2015. In defense of soundiness: a manifesto.Commun. ACM58, 2 (Jan. 2015), 44–46. doi:10.1145/2644805

  34. [34]

    Benjamin Livshits and Monica S

    V. Benjamin Livshits and Monica S. Lam. 2005. Finding Security Vulnerabilities in Java Applications with Static Analysis. InProceedings of the 14th USENIX Security Symposium, Baltimore, MD, USA, July 31 - August 5, 2005, Patrick D. McDaniel (Ed.). USENIX Association. https://www.usenix.org/conference/14th-usenix-security-symposium/finding- security-vulner...

  35. [35]

    Changhua Luo, Wei Meng, and Shuai Wang. 2024. Strengthening Supply Chain Security with Fine-grained Safe Patch Identification. InProceedings of the IEEE/ACM 46th International Conference on Software Engineering(Lisbon, Portugal) (ICSE ’24). Association for Computing Machinery, New York, NY, USA, Article 89, 12 pages. doi:10.1145/3597503.3639104

  36. [36]

    Wei Ma, Shangqing Liu, Mengjie Zhao, Xiaofei Xie, Wenhang Wang, Qiang Hu, Jie Zhang, and Yang Liu. 2024. Unveiling Code Pre-Trained Models: Investigating Syntax and Semantics Capacities.ACM Trans. Softw. Eng. Methodol.33, 7, Article 169 (Aug. 2024), 29 pages. doi:10.1145/3664606

  37. [37]

    Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J

    Valentin J.M. Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J. Schwartz, and Maverick Woo. 2021. The Art, Science, and Engineering of Fuzzing: A Survey.IEEE Transactions on Software Engineering47, 11 (2021), 2312–2331. doi:10.1109/TSE.2019.2946563

  38. [38]

    Zhenyu Mao, Jialong Li, Dongming Jin, Munan Li, and Kenji Tei. 2024. Multi-Role Consensus Through LLMs Discussions for Vulnerability Detection. In24th IEEE International Conference on Software Quality, Reliability, and Security, QRS - Companion, Cambridge, United Kingdom, July 1-5, 2024. IEEE, 1318–1319. doi:10.1109/QRS-C63300.2024.00173

  39. [39]

    Matthews

    B.W. Matthews. 1975. Comparison of the predicted and observed secondary structure of T4 phage lysozyme.Biochimica et Biophysica Acta (BBA) - Protein Structure405, 2 (1975), 442–451. doi:10.1016/0005-2795(75)90109-9

  40. [40]

    Rui Melo, Cláudia Mamede, Andre Catarino, Rui Abreu, and Henrique Lopes Cardoso. 2025. Are Sparse Autoencoders Useful for Java Function Bug Detection?CoRRabs/2505.10375 (2025). arXiv:2505.10375 doi:10.48550/ARXIV.2505.10375

  41. [41]

    Kiho Park, Yo Joong Choe, and Victor Veitch. 2024. The Linear Representation Hypothesis and the Geometry of Large Language Models. InForty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27,

  42. [42]

    https://openreview.net/forum?id=UGpGkLzwpP

    OpenReview.net. https://openreview.net/forum?id=UGpGkLzwpP

  43. [43]

    Senthooran Rajamanoharan, Tom Lieberum, Nicolas Sonnerat, Arthur Conmy, Vikrant Varma, János Kramár, and Neel Nanda. 2024. Jumping Ahead: Improving Reconstruction Fidelity with JumpReLU Sparse Autoencoders.CoRR abs/2407.14435 (2024). arXiv:2407.14435 doi:10.48550/ARXIV.2407.14435

  44. [44]

    Zilong Ren, Xiaolin Ju, Xiang Chen, and Hao Shen. 2024. ProRLearn: boosting prompt tuning-based vulnerability detection by reinforcement learning.Automated Software Engineering31, 2 (2024), 38. doi:10.1007/s10515-024-00438-9

  45. [45]

    Niklas Risse and Marcel Böhme. 2024. Uncovering the Limits of Machine Learning for Automatic Vulnerability Detection. In33rd USENIX Security Symposium, USENIX Security 2024, Philadelphia, PA, USA, August 14-16, 2024, Davide Balzarotti and Wenyuan Xu (Eds.). USENIX Association. https://www.usenix.org/conference/usenixsecurity24/presentation/risse

  46. [46]

    Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, Xiaoqing Ellen Tan, Yossi Adi, Jingyu Liu, Tal Remez, Jérémy Rapin, Artyom Kozhevnikov, Ivan Evtimov, Joanna Bitton, Manish Bhatt, Cristian Canton-Ferrer, Aaron , Vol. 1, No. 1, Article . Publication date: April 2018. SAGE: Signal-Amplified Guided Embeddings for LLM-based Vulnerabili...

  47. [47]

    Hossain Shahriar and Mohammad Zulkernine. 2012. Mitigating program security vulnerabilities: Approaches and challenges.ACM Comput. Surv.44, 3, Article 11 (June 2012), 46 pages. doi:10.1145/2187671.2187673

  48. [48]

    Qingkai Shi, Xiao Xiao, Rongxin Wu, Jinguo Zhou, Gang Fan, and Charles Zhang. 2018. Pinpoint: fast and precise sparse value flow analysis for million lines of code.SIGPLAN Not.53, 4 (June 2018), 693–706. doi:10.1145/3296979.3192418

  49. [49]

    Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, and Chandan K. Reddy. 2023. Execution-based Code Generation using Deep Reinforcement Learning.Trans. Mach. Learn. Res.2023 (2023). https://openreview.net/forum?id=0XBuaxqEcG

  50. [50]

    Huizhen Shu, Xuying Li, and Zhuo Li. 2025. LatentGuard: Controllable Latent Steering for Robust Refusal of Attacks and Reliable Response Generation.CoRRabs/2509.19839 (2025). arXiv:2509.19839 doi:10.48550/ARXIV.2509.19839

  51. [51]

    Marco Simoni, Aleksandar Fontana, Giulio Rossolini, and Andrea Saracino. 2025. Improving LLM Reasoning for Vulnerability Detection via Group Relative Policy Optimization.CoRRabs/2507.03051 (2025). arXiv:2507.03051 doi:10.48550/ARXIV.2507.03051

  52. [52]

    Barr, and Wei Le

    Benjamin Steenhoek, Md Mahbubur Rahman, Monoshi Kumar Roy, Mirza Sanjida Alam, Earl T. Barr, and Wei Le

  53. [53]

    arXiv:2403.17218 doi:10.48550/ARXIV.2403.17218

    A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection.CoRR abs/2403.17218 (2024). arXiv:2403.17218 doi:10.48550/ARXIV.2403.17218

  54. [54]

    Yuqiang Sun, Daoyuan Wu, Yue Xue, Han Liu, Wei Ma, Lyuye Zhang, Miaolei Shi, and Yang Liu. 2024. LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs’ Vulnerability Reasoning.CoRRabs/2401.16185 (2024). arXiv:2401.16185 doi:10.48550/ARXIV.2401.16185

  55. [55]

    Llama Team. 2024. The Llama 3 Herd of Models.CoRRabs/2407.21783 (2024). arXiv:2407.21783 doi:10.48550/ARXIV. 2407.21783

  56. [56]

    Chung-Nan Tsai, Xin Wang, Cheng-Hsiung Lee, and Ching-Sheng Lin. 2025. A Sequential Multi-Stage Approach for Code Vulnerability Detection via Confidence- and Collaboration-based Decision Making. InProceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, and Viole...

  57. [57]

    Viega, J.T

    J. Viega, J.T. Bloch, Y. Kohno, and G. McGraw. 2000. ITS4: a static vulnerability scanner for C and C++ code. InProceedings 16th Annual Computer Security Applications Conference (ACSAC’00). 257–267. doi:10.1109/ACSAC.2000.898880

  58. [58]

    Yao Wan, Wei Zhao, Hongyu Zhang, Yulei Sui, Guandong Xu, and Hai Jin. 2022. What Do They Capture? - A Structural Analysis of Pre-Trained Language Models for Source Code. In44th IEEE/ACM 44th International Conference on Software Engineering, ICSE 2022, Pittsburgh, PA, USA, May 25-27, 2022. ACM, 2377–2388. doi:10.1145/3510003.3510050

  59. [59]

    Xin-Cheng Wen, Yijun Yang, Cuiyun Gao, Yang Xiao, and Deheng Ye. 2025. Boosting Vulnerability Detection of LLMs via Curriculum Preference Optimization with Synthetic Reasoning Data. InFindings of the Association for Computational Linguistics, ACL 2025, Vienna, Austria, July 27 - August 1, 2025 (Findings of ACL, Vol. ACL 2025), Wanxiang Che, Joyce Nabende,...

  60. [60]

    Xin-Cheng Wen, Xinchen Wang, Cuiyun Gao, Shaohua Wang, Yang Liu, and Zhaoquan Gu. 2023. When Less is Enough: Positive and Unlabeled Learning Model for Vulnerability Detection. In2023 38th IEEE/ACM International Conference on Automated Software Engineering (ASE). 345–357. doi:10.1109/ASE56229.2023.00144

  61. [61]

    Fangzhou Wu, Qingzhao Zhang, Ati Priya Bajaj, Tiffany Bao, Ning Zhang, Ruoyu Wang, and Chaowei Xiao. 2023. Exploring the Limits of ChatGPT in Software Security Applications.CoRRabs/2312.05275 (2023). arXiv:2312.05275 doi:10.48550/ARXIV.2312.05275

  62. [62]

    Fabian Yamaguchi, Nico Golde, Daniel Arp, and Konrad Rieck. 2014. Modeling and Discovering Vulnerabilities with Code Property Graphs. In2014 IEEE Symposium on Security and Privacy. 590–604. doi:10.1109/SP.2014.44

  63. [63]

    Aidan Z. H. Yang, Haoye Tian, He Ye, Ruben Martins, and Claire Le Goues. 2024. Security Vulnerability Detection with Multitask Self-Instructed Fine-Tuning of Large Language Models.CoRRabs/2406.05892 (2024). arXiv:2406.05892 doi:10.48550/ARXIV.2406.05892

  64. [64]

    Chenyuan Zhang, Hao Liu, Jiutian Zeng, Kejing Yang, Yuhong Li, and Hui Li. 2024. Prompt-Enhanced Software Vulnerability Detection Using ChatGPT. InProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings, ICSE Companion 2024, Lisbon, Portugal, April 14-20, 2024. ACM, 276–277. doi:10. 1145/3639478.3643065

  65. [65]

    Jie Zhang, Wei Ma, Qiang Hu, Shangqing Liu, Xiaofei Xie, Yves Le Traon, and Yang Liu. 2023. A Black-Box Attack on Code Models via Representation Nearest Neighbor Search. InFindings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023 (Findings of ACL, Vol. EMNLP 2023), Houda Bouamor, Juan Pino, and Kalika Bali (Eds....

  66. [66]

    Yuntong Zhang, Haifeng Ruan, Zhiyu Fan, and Abhik Roychoudhury. 2024. AutoCodeRover: Autonomous Program Improvement. InProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, ISSTA 2024, Vienna, Austria, September 16-20, 2024, Maria Christakis and Michael Pradel (Eds.). ACM, 1592–1604. doi:10.1145/3650212.3680384

  67. [67]

    Yaqin Zhou, Shangqing Liu, Jing Kai Siow, Xiaoning Du, and Yang Liu. 2019. Devign: Effective Vulnerability Iden- tification by Learning Comprehensive Program Semantics via Graph Neural Networks. InAdvances in Neural In- formation Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, NeurIPS 2019, December 8-14, 2019, Vanc...