LineVul: A transformer-based line-level vulnerability prediction
9 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Measuring and Exploiting Contextual Bias in LLM-Assisted Security Code Review
  LLM-based security code review is vulnerable to framing bias, with a novel iterative refinement attack achieving 100% success in reintroducing vulnerabilities across real projects.
- Direction for Detection: A Survey of Automated Vulnerability Detection and all of its Pain Points
  ML4AVD research remains locked into binary function-level classification of C/C++ vulnerabilities because twelve pain points in the pipeline reinforce each other through feedback loops.
- AttackPathGNN: Cross-function vulnerability detection in smart contracts using state interference graphs and conjunction pooling
  AttackPathGNN introduces a State Interference Graph and conjunction pooling inside a GNN to detect cross-function vulnerabilities in Solidity contracts, reporting 92.3% F1 on SmartBugs Wild.
- PromptAudit: Auditing Prompt Sensitivity in LLM-Based Vulnerability Detection
  PromptAudit evaluates five prompting strategies across five LLMs on 1000 CVEs and finds chain-of-thought prompting yields the strongest overall performance while adaptive chain-of-thought and self-consistency reduce effective results.
- SAGE: Signal-Amplified Guided Embeddings for LLM-based Vulnerability Detection
  SAGE uses sparse autoencoders to boost vulnerability signals in LLMs, raising internal SNR 12.7x and delivering up to 318% MCC gains on vulnerability detection benchmarks.
- UntrustVul: An Automated Approach for Identifying Untrustworthy Alerts in Vulnerability Detection Models
  UntrustVul identifies untrustworthy vulnerability predictions by marking lines that neither match historical vulnerability patterns nor influence vulnerable lines through dependencies, reporting AUC 70-88% and F1 82-94% on 115K predictions.
- ConcernBERT: Learning Responsibilities Using Class Membership
  ConcernBERT is a BERT embedding model trained with triplet loss on class membership to encode concern-level semantics in Java entities, evaluated by recovering original classes from merged unlabeled groups on a new dataset of over 2M files, outperforming existing models.
- Evaluating LLMs for Real-World Web Vulnerability Detection
  Frontier LLMs detect up to 63% of web vulnerabilities in WordPress plugins with scoped prompts outperforming open-ended ones, but all show low consistency across runs and miss some baseline issues.
- HYDRA: A Hybrid Heuristic-Guided Deep Representation Architecture for Predicting Latent Zero-Day Vulnerabilities in Patched Functions
  HYDRA is a hybrid model that uses heuristics plus deep embeddings and a VAE to predict latent zero-day vulnerabilities in patched functions from Chrome, Android, and ImageMagick.