A Dual-Loop Agent Framework for Automated Vulnerability Reproduction. arXiv preprint arXiv:2602.05721

Bin Liu, Yanjie Zhao, Zhenpeng Chen, Guoai Xu, Haoyu Wang · arXiv 2602.05721

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

Representative citing papers

cs.CR · 2026-05-26 · unverdicted · novelty 7.0

The SEC-bench Pro benchmark, built from 183 real vulnerabilities, shows that frontier LLM coding agents achieve at most 38.8% success on SpiderMonkey and 32% on V8.


SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks? · cs.CR · 2026-05-26 · unverdicted · none · ref 7