Title resolution pending

Yinlin Deng, Chunqiu Steven Xia, Haoran Peng, Chenyuan Yang, Lingming Zhang · 2023

7 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background: 1 · method: 1

citation-polarity summary

background: 1 · use method: 1

representative citing papers

Agentic Vulnerability Reasoning on Windows COM Binaries

cs.CR · 2026-05-06 · accept · novelty 7.0

SLYP agentic pipeline discovers race condition vulnerabilities in Windows COM binaries and generates debugger-verified PoCs, scoring 0.973 F1 on a 40-case benchmark and finding 28 new confirmed vulnerabilities in production services.

Empowering Autonomous Debugging Agents with Efficient Dynamic Analysis

cs.SE · 2026-04-27 · unverdicted · novelty 7.0

ADI equips AI debugging agents with function-level interaction via a new execution trace structure, raising SWE-bench Verified resolution to 63.8% at $1.28 per task and delivering 6-18% gains when added to existing agents.

From Exploration to Specification: LLM-Based Property Generation for Mobile App Testing

cs.SE · 2026-04-15 · unverdicted · novelty 7.0

PropGen automates property generation for Android app testing via LLM synthesis from guided exploration and feedback refinement, yielding 912 valid properties and 25 previously unknown bugs across 12 apps.

Hidden Reliability Risks in Large Language Models: Systematic Identification of Precision-Induced Output Disagreements

cs.AI · 2026-04-02 · unverdicted · novelty 7.0

PrecisionDiff is a differential testing framework that uncovers widespread precision-induced behavioral disagreements in aligned LLMs, including safety-critical jailbreak divergences across precision formats.

TEMPLATEFUZZ: Fine-Grained Chat Template Fuzzing for Jailbreaking and Red Teaming LLMs

cs.CR · 2026-04-14 · unverdicted · novelty 6.0

TEMPLATEFUZZ mutates chat templates with element-level rules and heuristic search to reach 98.2% average jailbreak success rate on twelve open-source LLMs while degrading accuracy by only 1.1%.

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

cs.SE · 2026-04-13 · conditional · novelty 6.0

AnyPoC introduces a multi-agent system for generating and validating PoC tests from LLM bug reports, producing 1.3x more valid PoCs, rejecting 9.8x more false positives, and discovering 122 new bugs across 12 major projects.

FunFuzz: An LLM-Powered Evolutionary Fuzzing Framework

cs.CR · 2026-05-04 · unverdicted · novelty 5.0

FunFuzz uses parallel LLM islands with candidate migration and adaptive prompting to achieve higher compiler coverage and more unique internal failures than prior LLM fuzzers on GCC and Clang over 24-hour runs.

citing papers explorer

Showing 7 of 7 citing papers.

Agentic Vulnerability Reasoning on Windows COM Binaries · cs.CR · 2026-05-06 · accept · none · ref 11
Empowering Autonomous Debugging Agents with Efficient Dynamic Analysis · cs.SE · 2026-04-27 · unverdicted · none · ref 20
From Exploration to Specification: LLM-Based Property Generation for Mobile App Testing · cs.SE · 2026-04-15 · unverdicted · none · ref 7
Hidden Reliability Risks in Large Language Models: Systematic Identification of Precision-Induced Output Disagreements · cs.AI · 2026-04-02 · unverdicted · none · ref 9
TEMPLATEFUZZ: Fine-Grained Chat Template Fuzzing for Jailbreaking and Red Teaming LLMs · cs.CR · 2026-04-14 · unverdicted · none · ref 10
AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection · cs.SE · 2026-04-13 · conditional · none · ref 13
FunFuzz: An LLM-Powered Evolutionary Fuzzing Framework · cs.CR · 2026-05-04 · unverdicted · none · ref 1
