Everything you wanted to know about LLM-based vulnerability detection but were afraid to ask
5 Pith papers cite this work.
years: 2026 (5)
roles: background (1)
polarities: background (1)
citing papers explorer
- Antaeus: Hunting Repository-Level Logic Vulnerabilities via Context-Grounded LLM Reasoning
  Antaeus detects 15 logic vulnerabilities across 28 repositories via a pipeline of function prioritization, repository-level LLM reasoning, and comparative validation, outperforming baselines at similar cost.
- FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction
  FuzzingBrain V2, a multi-agent LLM system with a novel Suspicious Point abstraction and dual-layer fuzzing, reports 90% detection on a C/C++ benchmark and 29 confirmed zero-day vulnerabilities in real open-source projects.
- Three Heads Are Better Than One: A Multi-perspective Reasoning Framework for Enhanced Vulnerability Detection
  ReasonVul deploys three LLM agents with independent analysis and structured debate to achieve 40% PairAcc and 72.52% F1 on PrimeVul, outperforming baselines by 81% in PairAcc.
- Teaching LLMs Program Semantics via Symbolic Execution Traces
  Training Qwen3-8B on symbolic execution traces from Soteria improves violation detection in C programs by over 17 points, transfers across five property types, and shows superadditive gains with chain-of-thought.
- Do Fine-Tuned LLMs Understand Vulnerabilities? An Investigation into the Semantic Trap
  Fine-tuned decoder-only LLMs fall into a Semantic Trap on vulnerability detection, achieving high scores on unpaired normal code but failing on paired vulnerable-patched code, semantic perturbations, and gap analysis, while reasoning supervision reduces symptoms at the cost of recall.