CodeQL detected 171 CVEs total, with 83 caught by a prior version before the fix; detections were often actionable within the vulnerable file but not stable across tool versions.
Semgrep*: Improving the limited performance of static application security testing (SAST) tools
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
representative citing papers
NESA presents a neuro-symbolic framework that decomposes static analyses into policy-defined sub-problems solved by parsers and LLMs to enable compilation-free customizable analysis with reduced hallucinations.
REALISTA generates semantically coherent adversarial prompts via latent-space optimization over input-dependent editing directions, achieving stronger hallucination elicitation than prior realistic attacks on open-source and reasoning LLMs.
PRISM detects and stops credential leakage during LLM generation in multi-agent pipelines using per-token risk scores from lexical, structural, and behavioral signals, achieving zero observed leaks and F1 of 0.832 on a 2000-task benchmark.
LLM approaches ExArch and ArTEMiS reach F1 scores of 0.86 and 0.81 for architecture entity recognition and traceability, matching or approaching baselines that require manual models.
The authors present a registered report outlining their planned large-scale empirical study of vulnerability communication in pull requests by different account types.
citing papers explorer
-
Longitudinal Analyses of SAST Tools: A CodeQL Case Study
CodeQL detected 171 CVEs total, with 83 caught by a prior version before the fix; detections were often actionable within the vulnerable file but not stable across tool versions.
-
REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations
REALISTA generates semantically coherent adversarial prompts via latent-space optimization over input-dependent editing directions, achieving stronger hallucination elicitation than prior realistic attacks on open-source and reasoning LLMs.
-
PRISM: Generation-Time Detection and Mitigation of Secret Leakage in Multi-Agent LLM Pipelines
PRISM detects and stops credential leakage during LLM generation in multi-agent pipelines using per-token risk scores from lexical, structural, and behavioral signals, achieving zero observed leaks and F1 of 0.832 on a 2000-task benchmark.
-
Who's Who? LLM-assisted Software Traceability with Architecture Entity Recognition
LLM approaches ExArch and ArTEMiS reach F1 scores of 0.86 and 0.81 for architecture entity recognition and traceability, matching or approaching baselines that require manual models.
-
How Humans, Bots, and Agents Communicate About Vulnerabilities in Pull Requests
The authors present a registered report outlining their planned large-scale empirical study of vulnerability communication in pull requests by different account types.