No Attack Required: Semantic Fuzzing for Specification Violations in Agent Skills
Pith reviewed 2026-05-14 19:11 UTC · model grok-4.3
The pith
Semantic fuzzing detects specification violations in roughly 30 percent of real-world agent skills on ordinary, benign inputs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Sefz translates each guardrail into a reachability goal over an annotated execution trace, reducing violation checking to a deterministic graph query. An LLM mutator then produces benign inputs whose traces are steered toward the violation patterns by a multi-armed bandit that treats goal proximity as its reward signal. Evaluation across 402 real-world skills shows specification violations in 120 cases (29.9 percent), of which 26 are previously unknown exploitable guardrail violations in deployed skills.
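To make "reachability goal over an annotated execution trace" concrete: a minimal sketch, assuming a trace is an ordered list of annotated events and a violation goal is an ordered list of event predicates. All names here (TraceEvent, goal_reached, the workspace guardrail) are illustrative assumptions, not Sefz's actual schema.

```python
# Hypothetical sketch: deterministic check of a violation-reachability
# goal against an annotated execution trace. Schema is assumed, not
# taken from the paper.
from dataclasses import dataclass, field

@dataclass
class TraceEvent:
    action: str                       # e.g. "tool_call", "file_write"
    attrs: dict = field(default_factory=dict)

def goal_reached(trace, goal):
    """The goal is reached iff some subsequence of the trace satisfies
    the goal's predicates in order -- a plain scan, no LLM involved."""
    i = 0
    for event in trace:
        if i < len(goal) and goal[i](event):
            i += 1
    return i == len(goal)

# Guardrail "never write outside the workspace", inverted into a
# violation goal: reach any write whose path escapes the workspace.
violation_goal = [
    lambda e: e.action == "file_write"
    and not e.attrs.get("path", "").startswith("/workspace/"),
]

trace = [
    TraceEvent("tool_call", {"name": "search"}),
    TraceEvent("file_write", {"path": "/etc/crontab"}),
]
assert goal_reached(trace, violation_goal)  # violation is reachable
```

Because the check is a plain scan over the trace, the same trace always yields the same verdict; all stochasticity lives in the input-generation loop.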
What carries the argument
Sefz, the goal-directed semantic fuzzing framework that converts natural-language guardrails into deterministic reachability goals on execution traces and drives input generation with an LLM mutator plus multi-armed bandit reward based on goal proximity.
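The page names the mutator and bandit but not the bandit variant or reward mechanics. A sketch under assumed choices: UCB1 over mutation operators, with reward equal to the fraction of the goal's predicates matched so far. `mutate_with_llm` and `run_skill` are placeholder stubs for components the summary does not describe.

```python
# Hypothetical sketch of the bandit-guided fuzzing loop. UCB1 and this
# proximity reward are assumptions; the paper only says "multi-armed
# bandit" with "goal-proximity as its reward signal".
import math

def mutate_with_llm(seed, operator):
    # Placeholder for the LLM mutator: would prompt a model to apply
    # `operator` (e.g. "add ambiguity") to the benign input.
    return f"{seed} [{operator}]"

def run_skill(user_input):
    # Placeholder: execute the skill and return its annotated trace.
    return []

def proximity(trace, goal):
    # Fraction of the goal's ordered predicates matched by the trace.
    i = 0
    for event in trace:
        if i < len(goal) and goal[i](event):
            i += 1
    return i / len(goal)

def fuzz(seed, goal, operators, budget=200):
    counts = [0] * len(operators)
    total = [0.0] * len(operators)
    best, best_prox = seed, 0.0
    for t in range(1, budget + 1):
        # UCB1: try every operator once, then balance mean reward
        # against an exploration bonus that shrinks with use.
        arm = max(range(len(operators)), key=lambda a:
                  float("inf") if counts[a] == 0 else
                  total[a] / counts[a] + math.sqrt(2 * math.log(t) / counts[a]))
        candidate = mutate_with_llm(best, operators[arm])
        prox = proximity(run_skill(candidate), goal)
        counts[arm] += 1
        total[arm] += prox
        if prox > best_prox:
            best, best_prox = candidate, prox
        if best_prox == 1.0:
            return best  # trace matches the full violation pattern
    return None
```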
If this is right
- Specification violations occur on benign inputs, so attack-focused defenses leave a large class of failures unaddressed.
- 26 exploitable guardrail violations were found in already-deployed skills, showing immediate practical impact.
- Six recurring specification pitfalls account for the bulk of failures and directly suggest concrete safer design rules.
- Traditional static analyzers and prompt-injection tools miss these internal specification breaches.
- Skills can silently ignore their own documented constraints, breaking the contract users rely on when installing them.
Where Pith is reading between the lines
- Marketplaces could integrate semantic fuzzing as an automated pre-deployment check to filter out skills with weak guardrails.
- The same reachability-goal approach could be applied to other LLM-driven systems that publish natural-language safety constraints.
- Refining the guardrail-to-goal translation step would reduce the risk that some real violations are still missed.
- Users may start demanding verifiable evidence that a skill upholds its stated guardrails rather than only resisting external attacks.
Load-bearing premise
Translating natural-language guardrails into deterministic reachability goals over execution traces accurately captures the intended semantics without creating false violations or missing real ones.
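A toy illustration, under an assumed event schema, of how this premise can fail quietly: the guardrail "only send email after explicit user confirmation" translated as a per-event flag check, which a trace can satisfy even when the confirmation belonged to a different message.

```python
# Invented example: a literal guardrail-to-goal translation that
# produces a false negative. Event fields are assumptions.
naive_violation = lambda event: (
    event["action"] == "send_email" and not event.get("confirmed", False)
)

# Confirmation was given for message A, but the skill sends message B.
trace = [
    {"action": "user_confirm", "message_id": "A"},
    {"action": "send_email", "message_id": "B", "confirmed": True},
]
assert not any(naive_violation(e) for e in trace)  # breach goes uncounted
```

A translation that instead required a confirmation event bound to the same message_id would catch this case, which is exactly the kind of fidelity question the premise leaves open.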
What would settle it
Manual review of a random sample of the 120 reported violations to confirm each is a genuine breach of the skill's documented guardrail on the supplied benign input.
Original abstract
LLM-powered agents can silently delete documents, leak credentials, or transfer funds on a routine user request, not because the agent was attacked, but because the skill it invoked broke its own declared safety rules. We call these specification violations: benign inputs cause a skill to breach the natural-language guardrails in its own specification, typically because the guardrail's semantics are undefined for autonomous execution, or because the implementation silently ignores the documented constraint. These violations are invisible to static analyzers, traditional fuzzers, and prompt-injection defenses alike, yet they undermine the very contract a user trusts when installing a skill. We present Sefz, a goal-directed semantic fuzzing framework that automatically discovers specification violations in agent skills. Sefz translates each guardrail into a reachability goal over an annotated execution trace, reducing violation checking to a deterministic graph query. An LLM-based mutator generates benign inputs whose traces progressively approach the violation patterns, guided by a multi-armed bandit that uses goal-proximity as its reward signal. On 402 real-world skills from the largest public agent-skill marketplace, Sefz finds specification violations in 120 (29.9%), including 26 previously unknown exploitable guardrail violations in deployed skills. Six recurring specification pitfalls explain the bulk of the failures, suggesting concrete principles for safer skill design.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents Sefz, a goal-directed semantic fuzzing framework that translates natural-language guardrails from agent skills into deterministic reachability goals over annotated execution traces. An LLM-based mutator, guided by a multi-armed bandit using goal-proximity rewards, generates benign inputs to discover violations. On 402 real-world skills from a public marketplace, Sefz reports specification violations in 120 skills (29.9%), including 26 previously unknown exploitable guardrail violations, and identifies six recurring specification pitfalls for safer design.
Significance. If the guardrail-to-reachability translations are faithful, the work identifies a practically important class of safety failures in deployed agent skills that arise from underspecified natural-language contracts rather than attacks. The scale of the evaluation on 402 marketplace skills supplies a useful empirical baseline, and the reduction of checking to deterministic graph queries is a clean technical step that supports reproducibility. The recurring pitfalls provide concrete, actionable guidance for skill developers.
major comments (2)
- [Evaluation] Evaluation section: the headline result of 120 violations (29.9%) and 26 exploitable cases rests on LLM-generated reachability goals, yet no validation is reported (manual review of a sample, inter-rater agreement, or held-out guardrail comparison). This directly affects whether the measured rate reflects genuine specification breaches or translation artifacts.
- [§4] §4 (framework description): the claim that violation checking reduces to a deterministic graph query is load-bearing, but the paper provides no error analysis or accuracy metrics for the trace annotation step that produces the graphs; annotation errors would propagate into both the bandit guidance and the final violation counts.
minor comments (2)
- [Abstract] Abstract: the six recurring pitfalls are mentioned but not enumerated; a brief list would strengthen the takeaway without lengthening the abstract.
- [Evaluation] The paper should clarify whether the 26 exploitable violations were confirmed by manual reproduction or only by the automated reachability query.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and for recognizing the practical importance of identifying specification violations in agent skills. We agree that additional validation is needed for the LLM-generated reachability goals and for the trace annotation process. We will revise the manuscript accordingly by adding empirical validation studies and error analyses.
Point-by-point responses
Referee: [Evaluation] Evaluation section: the headline result of 120 violations (29.9%) and 26 exploitable cases rests on LLM-generated reachability goals, yet no validation is reported (manual review of a sample, inter-rater agreement, or held-out guardrail comparison). This directly affects whether the measured rate reflects genuine specification breaches or translation artifacts.
Authors: We agree that validation of the translations is essential to substantiate the reported rates. In the revised manuscript, we will add a dedicated validation subsection in the evaluation. This will include: (1) manual review by two independent reviewers of a random sample of 50 guardrail-to-reachability translations, with reported inter-rater agreement (Cohen's kappa); (2) a comparison against a held-out set of 20 guardrails where we manually craft reference reachability goals. These steps will quantify translation fidelity and support that the 29.9% figure reflects genuine violations rather than artifacts. revision: yes
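For context on item (1): Cohen's kappa discounts the raw agreement rate between the two reviewers by the agreement expected from chance alone. A minimal sketch with invented labels, not data from the paper:

```python
# Cohen's kappa for two binary raters (1 = translation judged faithful).
def cohens_kappa(a, b):
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n   # observed agreement
    p1a, p1b = sum(a) / n, sum(b) / n             # each rater's "yes" rate
    p_e = p1a * p1b + (1 - p1a) * (1 - p1b)       # chance agreement
    return (p_o - p_e) / (1 - p_e)

reviewer_1 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]
reviewer_2 = [1, 1, 0, 1, 1, 1, 0, 1, 0, 1]
print(round(cohens_kappa(reviewer_1, reviewer_2), 3))  # 0.524
```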
Referee: [§4] §4 (framework description): the claim that violation checking reduces to a deterministic graph query is load-bearing, but the paper provides no error analysis or accuracy metrics for the trace annotation step that produces the graphs; annotation errors would propagate into both the bandit guidance and the final violation counts.
Authors: We acknowledge that the absence of error analysis for trace annotation is a gap, as annotation inaccuracies could affect downstream results. In the revision, we will expand §4 with an error analysis subsection. We will manually annotate a random sample of 100 execution traces and report precision, recall, and F1-score for key annotation elements (e.g., state transitions, variable bindings). We will also analyze and discuss potential error propagation into the multi-armed bandit rewards and violation detection, including sensitivity experiments where we inject controlled annotation noise. revision: yes
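Again for context, a minimal sketch of the per-element scoring the response promises, assuming annotations can be flattened into comparable tuples; the sample data is invented:

```python
# Precision/recall/F1 of automatic trace annotations against a
# manually labeled gold sample (set-based, per element type).
def prf(predicted, gold):
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

auto = {("file_write", "/tmp/a"), ("file_write", "/etc/crontab")}
gold = {("file_write", "/etc/crontab"), ("file_write", "/workspace/b")}
print(prf(auto, gold))  # (0.5, 0.5, 0.5)
```

Injecting controlled noise would then mean flipping or perturbing a known fraction of these tuples before re-running the pipeline and observing how the violation counts move.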
Circularity Check
No circularity: framework is a new construction with independent empirical results
full rationale
The paper presents Sefz as a novel goal-directed fuzzing framework that translates guardrails to reachability queries and applies LLM mutation plus bandit search. No equations, fitted parameters, or self-citations appear in the provided text. The 120/402 violation count is produced by executing the framework on external marketplace skills rather than by algebraic reduction or renaming of prior fitted quantities. The translation step is an explicit modeling choice whose accuracy is an open correctness question, not a definitional tautology. Therefore the derivation chain is self-contained and does not reduce to its inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Natural-language guardrails in skill specifications can be translated into deterministic reachability goals over execution traces without loss of intended meaning.