Frontier LLMs detect up to 63% of web vulnerabilities in WordPress plugins with scoped prompts outperforming open-ended ones, but all show low consistency across runs and miss some baseline issues.
Software Testing, Verification and Reliability30(7-8), e1716 (2020)
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CR 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Evaluating LLMs for Real-World Web Vulnerability Detection
Frontier LLMs detect up to 63% of web vulnerabilities in WordPress plugins with scoped prompts outperforming open-ended ones, but all show low consistency across runs and miss some baseline issues.