SDLLMFuzz: Dynamic-static LLM-assisted greybox fuzzing for structured input programs

Futai Zou; Tianming Zheng; Yihao Zou; Yue Wu

arxiv: 2604.17750 · v1 · submitted 2026-04-20 · 💻 cs.CR · cs.PL

SDLLMFuzz: Dynamic-static LLM-assisted greybox fuzzing for structured input programs

Yihao Zou , Tianming Zheng , Futai Zou , Yue Wu This is my paper

Pith reviewed 2026-05-10 05:18 UTC · model grok-4.3

classification 💻 cs.CR cs.PL

keywords greybox fuzzingLLM-assisted fuzzingstructured input programsvulnerability discoveryfeedback-driven input generationcrash analysisdynamic-static loop

0 comments

The pith

SDLLMFuzz combines LLM-generated inputs with static crash analysis to discover bugs faster in structured-input programs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a fuzzing technique that uses large language models to create test inputs that respect the strict format requirements of certain programs. It adds a feedback mechanism where information from crashes is analyzed statically to inform the next set of LLM-generated inputs. This creates a cycle that helps the fuzzer reach deeper and more relevant parts of the code than mutation-only methods allow. Evaluation on the Magma benchmark with programs such as libxml2 shows improved rates of bug finding and shorter times to the first bug compared to prior approaches.

Core claim

The central claim is that a dynamic-static feedback loop, where LLMs produce syntactically valid and semantically diverse seeds while static analysis of core dumps and execution traces supplies semantic guidance, enables more efficient exploration of complex program behaviors in structured-input programs.

What carries the argument

The dynamic-static feedback loop that refines LLM inputs based on semantic information extracted from crash artifacts.

If this is right

Greater success in finding vulnerabilities within programs that process structured data like XML, PNG, and audio files.
Shorter intervals between starting the fuzzer and detecting the first bug.
Improved ability to generate inputs that satisfy syntactic constraints without relying solely on manual grammars or mutations.
More effective use of runtime crash information beyond simple coverage metrics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Such hybrid systems may reduce the need for program-specific customizations in fuzzing tools.
Advances in LLM capabilities could further amplify the effectiveness of this feedback approach.
Similar techniques might apply to testing other constrained systems, such as network protocols or configuration parsers.

Load-bearing premise

Large language models can reliably produce syntactically valid and semantically diverse inputs, and static analysis of crash artifacts provides rich semantic information that guides effective subsequent input generation.

What would settle it

Repeating the Magma benchmark experiments and observing no significant gains in the number of bugs discovered or the time to first bug over traditional greybox fuzzers and other LLM baselines.

Figures

Figures reproduced from arXiv: 2604.17750 by Futai Zou, Tianming Zheng, Yihao Zou, Yue Wu.

**Figure 1.** Figure 1: Overview of the SDLLMFuzz framework. 3.5 Crash Feedback Encoding To enable the LLM to utilize crash information, we transform low-level debugging outputs into structured semantic descriptions. This encoding bridges the gap between execution data and LLM reasoning. 3.6 Dynamic–Static Feedback Loop The key component of SDLLMFuzz is the integration of dynamic execution and static analysis into a unified feedb… view at source ↗

**Figure 2.** Figure 2: Bug discovery results within 24 hours [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗

**Figure 3.** Figure 3: Bug discovery results within 48 hours. 5.2 Time-to-Bug Analysis [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗

**Figure 4.** Figure 4: Time-to-Bug heatmap 6 Discussion 6.1 Why SDLLMFuzz Works The effectiveness of SDLLMFuzz stems from the combination of semantic-aware input generation and feedback-driven refinement. Traditional fuzzing relies on random or heuristic mutations, which are often ineffective for structured-input programs due to strict syntax and semantic constraints. By leveraging LLMs, SDLLMFuzz generates structure-aware inp… view at source ↗

**Figure 5.** Figure 5: Time-to-Bug curves [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗

**Figure 6.** Figure 6: Bug discovery in ablation study. testing environments. 6.3 Limitations Despite its effectiveness, SDLLMFuzz has several limitations. First, the performance of the framework depends on the quality of LLM-generated inputs. Poor prompt design or insufficient context may lead to suboptimal seed generation, affecting overall performance. Second, incorporating LLMs and static analysis introduces additional compu… view at source ↗

**Figure 7.** Figure 7: Time-to-Bug in ablation study. may still remain. External Validity. The evaluation is conducted on the Magma benchmark, which, although widely used, may not fully represent all real-world software systems. The generalizability of the results to other domains requires further validation. Construct Validity. We primarily use bug coverage and time-to-bug as evaluation metrics. While these are standard in fu… view at source ↗

read the original abstract

Fuzzing has become a widely adopted technique for vulnerability discovery, yet it remains ineffective for structured-input programs due to strict syntactic constraints and limited semantic awareness. Traditional greybox fuzzers rely on mutation-based strategies and coarse-grained coverage feedback, which often fail to generate valid inputs and explore deep execution paths. Recent advances in large language models (LLMs) have shown promise in improving input generation, but existing approaches primarily focus on seed generation and largely overlook the effective use of runtime feedback. In this paper, we propose SDLLMFuzz, a dynamic-static LLM-assisted greybox fuzzing framework for structured-input programs. Our approach integrates LLM-based structure-aware seed generation with static crash analysis, forming a unified feedback loop that iteratively refines test inputs. Specifically, we leverage LLMs to generate syntactically valid and semantically diverse inputs, while extracting rich semantic information from crash artifacts (e.g., core dumps and execution traces) to guide subsequent input generation. This dynamic-static feedback mechanism enables more efficient exploration of complex program behaviors. We evaluate SDLLMFuzz on the Magma benchmark across multiple structured-input programs, including libxml2, libpng, and libsndfile. Experimental results show that SDLLMFuzz significantly outperforms traditional greybox fuzzers and LLM-assisted baselines in terms of bug discovery and time-to-bug. These results demonstrate that combining semantic input generation with feedback-driven refinement is an effective direction for improving fuzzing performance on structured-input programs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

SDLLMFuzz adds a static crash-analysis loop to LLM seed generation for structured fuzzing, but the abstract supplies no numbers, ablations, or validity stats to back the performance claims.

read the letter

The paper's main contribution is a feedback loop that uses LLMs to produce syntactically valid seeds for formats like XML and PNG, then extracts semantic signals from core dumps and execution traces to steer the next generation round. This is positioned as an advance over earlier LLM fuzzers that mostly handled initial seed creation without closing the loop on runtime artifacts. The idea is straightforward and targets a real pain point in greybox fuzzing of structured-input libraries. It does a decent job naming the gap in prior work and sketching how dynamic coverage plus static crash info could work together on the Magma benchmark programs. That framing is clear enough to be useful as a starting point for someone already following LLM-assisted fuzzing. The soft spot is the complete absence of supporting evidence in the abstract. The claim of significant gains in bug discovery and time-to-bug is stated without any counts, baselines, error bars, or ablation that isolates the static-analysis component. There are also no reported rates for how often the LLM actually produces valid inputs, which is the load-bearing assumption. If those rates are low or the static signals largely overlap with ordinary coverage feedback, the reported improvements could be explained by implementation details or run length rather than the proposed mechanism. The full paper may contain the missing tables and methodology, but nothing in the provided text lets a reader judge whether the central result holds. This work is aimed at security researchers and tool builders who already fuzz structured programs and are experimenting with LLMs. A reader in that group could extract the framework description and try the loop themselves even if the evaluation section needs more rigor. The paper deserves peer review because the problem is practical, the benchmark is standard, and the proposed direction is a plausible next step; a referee can ask for the quantitative details and ablations that are currently missing.

Referee Report

3 major / 1 minor

Summary. The paper proposes SDLLMFuzz, a greybox fuzzing framework for structured-input programs that combines LLM-based generation of syntactically valid and semantically diverse seeds with static analysis of crash artifacts (core dumps and execution traces) to create a dynamic-static feedback loop. The approach is evaluated on the Magma benchmark for programs including libxml2, libpng, and libsndfile, with the abstract claiming significant outperformance over traditional greybox fuzzers and LLM-assisted baselines in bug discovery and time-to-bug.

Significance. If the empirical claims hold with proper quantification, the work could advance fuzzing for complex input formats by showing how LLM generation plus static crash-derived semantics can improve upon coverage-only feedback. The dynamic-static loop idea addresses a known limitation in greybox fuzzing, and the Magma evaluation target is appropriate for structured programs.

major comments (3)

[Abstract] Abstract: The headline claim that 'Experimental results show that SDLLMFuzz significantly outperforms traditional greybox fuzzers and LLM-assisted baselines in terms of bug discovery and time-to-bug' is unsupported by any quantitative metrics, statistical details, error bars, baseline configurations, run counts, or methodology specifics anywhere in the manuscript.
[Approach] Approach description: No validity-rate statistics, syntactic correctness measurements, or diversity metrics are reported for the LLM-generated inputs on formats such as XML, PNG, or sound files, leaving the core assumption that LLMs reliably produce usable seeds unverified and load-bearing for the claimed gains.
[Evaluation] Evaluation: The manuscript contains no ablation isolating the static crash-analysis component from standard dynamic coverage feedback, nor any comparison of time-to-bug or unique bugs found with and without the static signals; without this, it is impossible to attribute outperformance to the proposed dynamic-static loop rather than engineering or run-time differences.

minor comments (1)

[Abstract] The abstract and approach sections use terms such as 'rich semantic information' and 'unified feedback loop' without defining how the extracted crash data is encoded or fed back to the LLM.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify how to strengthen the presentation of our results. We address each major point below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract] Abstract: The headline claim that 'Experimental results show that SDLLMFuzz significantly outperforms traditional greybox fuzzers and LLM-assisted baselines in terms of bug discovery and time-to-bug' is unsupported by any quantitative metrics, statistical details, error bars, baseline configurations, run counts, or methodology specifics anywhere in the manuscript.

Authors: We agree that the abstract would be stronger with explicit quantitative support. In the revised version we will update the abstract to report concrete metrics from our Magma experiments, including the number of unique bugs found per target, mean time-to-bug with standard deviation, the number of independent runs (10 per fuzzer), and the statistical test used for significance. Baseline configurations and run-time settings will also be summarized briefly so the claim is self-contained. revision: yes
Referee: [Approach] Approach description: No validity-rate statistics, syntactic correctness measurements, or diversity metrics are reported for the LLM-generated inputs on formats such as XML, PNG, or sound files, leaving the core assumption that LLMs reliably produce usable seeds unverified and load-bearing for the claimed gains.

Authors: We acknowledge the omission. We will add a new table and accompanying text in the evaluation section that reports, for each target format, the syntactic validity rate of LLM-generated seeds (percentage that parse without error), the number of unique structural variants produced, and a simple diversity measure such as the count of distinct semantic categories observed across 1,000 samples. These measurements will be obtained from the same generation pipeline used in the main experiments. revision: yes
Referee: [Evaluation] Evaluation: The manuscript contains no ablation isolating the static crash-analysis component from standard dynamic coverage feedback, nor any comparison of time-to-bug or unique bugs found with and without the static signals; without this, it is impossible to attribute outperformance to the proposed dynamic-static loop rather than engineering or run-time differences.

Authors: We agree that an explicit ablation is necessary to attribute gains to the static component. We will add an ablation study that disables the static crash-analysis feedback while retaining all other components (LLM generation and dynamic coverage) and reports the resulting change in unique bugs discovered and time-to-bug on the same Magma targets and run configuration. The new results will appear in a dedicated subsection of the evaluation. revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation with no derivation chain

full rationale

The paper proposes an LLM-assisted fuzzing framework and supports its claims solely via experimental results on the external Magma benchmark (libxml2, libpng, libsndfile). No equations, fitted parameters, self-referential definitions, or load-bearing self-citations appear in the provided text. The central performance claims are falsifiable against independent baselines and do not reduce to quantities defined by the paper's own choices.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on two domain assumptions about LLM capabilities and the utility of crash analysis; no free parameters or invented entities are introduced in the abstract.

axioms (2)

domain assumption Large language models can be used to generate syntactically valid and semantically diverse inputs for structured data formats
This underpins the seed generation component of the framework.
domain assumption Static analysis of crash artifacts such as core dumps and execution traces provides rich semantic information that can guide effective subsequent input generation
This is required for the dynamic-static feedback loop to function as described.

pith-pipeline@v0.9.0 · 5567 in / 1490 out tokens · 71700 ms · 2026-05-10T05:18:24.189736+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

24 extracted references · 24 canonical work pages

[1]

Nautilus: Fishing for deep bugs with grammars

Cornelius Aschermann, Tommaso Frassetto, and Thorsten Holz. Nautilus: Fishing for deep bugs with grammars. In NDSS, 2019

work page 2019
[2]

Fuzzing: Challenges and reﬂec- tions

Marcel Böhme, Cristian Cadar, and Abhik Roychoudhury. Fuzzing: Challenges and reﬂec- tions. IEEE Software , 38(3):79–86, 2020

work page 2020
[3]

Directed greybox fuzzing

Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, et al. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 2329–2344, 2017

work page 2017
[4]

Cristian Cadar, Daniel Dunbar, and Dawson R. Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, volume 8, pages 209–224, 2008

work page 2008
[5]

Hawkeye: Towards a desired directed grey-box fuzzer

Hongxu Chen, Yinxing Xue, Yang Li, et al. Hawkeye: Towards a desired directed grey-box fuzzer. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Commu- nications Security, pages 2095–2108, 2018

work page 2018
[6]

Compiler fuzzing through deep learning

Chris Cummins, Pavlos Petoumenos, Alastair Murray, et al. Compiler fuzzing through deep learning. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis , pages 95–105, 2018

work page 2018
[7]

Large language models are zero-shot fuzzers: Fuzzing deep-learning libraries via large language models

Yanjun Deng, Chunqiu Steven Xia, Hao Peng, et al. Large language models are zero-shot fuzzers: Fuzzing deep-learning libraries via large language models. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis , pages 423–435, 2023

work page 2023
[8]

Large language models are edge-case fuzzers: Testing deep learning libraries via fuzzgpt

Yanjun Deng, Chunqiu Steven Xia, Cheng Yang, et al. Large language models are edge- case fuzzers: Testing deep learning libraries via fuzzgpt. arXiv preprint arXiv:2304.02014 , 2023

work page arXiv 2023
[9]

Machine learning for black-box fuzzing of network protocols

Rui Fan and Yu Chang. Machine learning for black-box fuzzing of network protocols. In Information and Communications Security: 19th International Conference, ICICS 2017 , pages 621–632. Springer International Publishing, 2018

work page 2017
[10]

Patrice Godefroid, Adam Kiezun, and Michael Y. Levin. Grammar-based whitebox fuzzing. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation , pages 206–215, 2008. 14

work page 2008
[11]

Levin, and David Molnar

Patrice Godefroid, Michael Y. Levin, and David Molnar. Automated whitebox fuzz testing. In NDSS, volume 8, pages 151–166, 2008

work page 2008
[12]

Learn&fuzz: Machine learning for input fuzzing

Patrice Godefroid, Hila Peleg, and Rishabh Singh. Learn&fuzz: Machine learning for input fuzzing. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE) , pages 50–59. IEEE, 2017

work page 2017
[13]

Ganfuzz: A gan-based industrial network protocol fuzzing framework

Zhifeng Hu, Jing Shi, Yu-Heng Huang, et al. Ganfuzz: A gan-based industrial network protocol fuzzing framework. In Proceedings of the 15th ACM International Conference on Computing Frontiers, pages 138–145, 2018

work page 2018
[14]

Fuzzing: A survey

Jian Li, Bo Zhao, and Chao Zhang. Fuzzing: A survey. Cybersecurity, 1(1):1–13, 2018

work page 2018
[15]

Deepfuzz: Automatic generation of syn- tax valid c programs for fuzz testing

Xuefeng Liu, Xiaoting Li, Rohan Prajapati, et al. Deepfuzz: Automatic generation of syn- tax valid c programs for fuzz testing. In Proceedings of the AAAI Conference on Artiﬁcial Intelligence, volume 33, pages 1044–1051, 2019

work page 2019
[16]

Miller, Lars Fredriksen, and Bryan So

Barton P. Miller, Lars Fredriksen, and Bryan So. An empirical study of the reliability of unix utilities. Communications of the ACM , 33(12):32–44, 1990

work page 1990
[17]

Jensen, and Chris W

Matthias Sablotny, Bjørn S. Jensen, and Chris W. Johnson. Recurrent neural networks for fuzz testing web browsers. In Information Security and Cryptology–ICISC 2018 , pages 354–370. Springer International Publishing, 2019

work page 2018
[18]

Fuzzing: Brute Force Vulnerability Discovery

Michael Sutton, Adam Greene, and Pedram Amini. Fuzzing: Brute Force Vulnerability Discovery. Pearson Education, 2007

work page 2007
[19]

Superion: Grammar-aware greybox fuzzing

Junjie Wang, Bihuan Chen, Lei Wei, et al. Superion: Grammar-aware greybox fuzzing. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE) , pages 724–735. IEEE, 2019

work page 2019
[20]

Universal fuzzing via large language models

Chunqiu Steven Xia, Michele Paltenghi, Jie Tian, et al. Fuzz4all: Universal fuzzing with large language models. arXiv preprint arXiv:2308.04748 , 2024

work page arXiv 2024
[21]

Format-aware learn&fuzz: Deep test data generation for eﬃcient fuzzing

Mohammad Zakeri Nasrabadi, Saeed Parsa, and Alireza Kalaee. Format-aware learn&fuzz: Deep test data generation for eﬃcient fuzzing. Neural Computing and Applications , 33:1497–1513, 2021

work page 2021
[22]

LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing

Hongyu Zhang, Yicheng Rong, Yuxuan He, et al. Llamafuzz: Large language model en- hanced greybox fuzzing. arXiv preprint arXiv:2406.07714 , 2024

work page arXiv 2024
[23]

Seqfuzzer: An industrial protocol fuzzing framework from a deep learning perspective

Hong Zhao, Zhen Li, Hao Wei, et al. Seqfuzzer: An industrial protocol fuzzing framework from a deep learning perspective. In 2019 12th IEEE Conference on Software Testing, Validation and Veriﬁcation (ICST) , pages 59–67. IEEE, 2019

work page 2019
[24]

Fuzzing: A survey for roadmap

Xiaogang Zhu, Sheng Wen, Seyit Camtepe, et al. Fuzzing: A survey for roadmap. ACM Computing Surveys , 54(11s):1–36, 2022. 15

work page 2022

[1] [1]

Nautilus: Fishing for deep bugs with grammars

Cornelius Aschermann, Tommaso Frassetto, and Thorsten Holz. Nautilus: Fishing for deep bugs with grammars. In NDSS, 2019

work page 2019

[2] [2]

Fuzzing: Challenges and reﬂec- tions

Marcel Böhme, Cristian Cadar, and Abhik Roychoudhury. Fuzzing: Challenges and reﬂec- tions. IEEE Software , 38(3):79–86, 2020

work page 2020

[3] [3]

Directed greybox fuzzing

Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, et al. Directed greybox fuzzing. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 2329–2344, 2017

work page 2017

[4] [4]

Cristian Cadar, Daniel Dunbar, and Dawson R. Engler. Klee: Unassisted and automatic generation of high-coverage tests for complex systems programs. In OSDI, volume 8, pages 209–224, 2008

work page 2008

[5] [5]

Hawkeye: Towards a desired directed grey-box fuzzer

Hongxu Chen, Yinxing Xue, Yang Li, et al. Hawkeye: Towards a desired directed grey-box fuzzer. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and Commu- nications Security, pages 2095–2108, 2018

work page 2018

[6] [6]

Compiler fuzzing through deep learning

Chris Cummins, Pavlos Petoumenos, Alastair Murray, et al. Compiler fuzzing through deep learning. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis , pages 95–105, 2018

work page 2018

[7] [7]

Large language models are zero-shot fuzzers: Fuzzing deep-learning libraries via large language models

Yanjun Deng, Chunqiu Steven Xia, Hao Peng, et al. Large language models are zero-shot fuzzers: Fuzzing deep-learning libraries via large language models. In Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis , pages 423–435, 2023

work page 2023

[8] [8]

Large language models are edge-case fuzzers: Testing deep learning libraries via fuzzgpt

Yanjun Deng, Chunqiu Steven Xia, Cheng Yang, et al. Large language models are edge- case fuzzers: Testing deep learning libraries via fuzzgpt. arXiv preprint arXiv:2304.02014 , 2023

work page arXiv 2023

[9] [9]

Machine learning for black-box fuzzing of network protocols

Rui Fan and Yu Chang. Machine learning for black-box fuzzing of network protocols. In Information and Communications Security: 19th International Conference, ICICS 2017 , pages 621–632. Springer International Publishing, 2018

work page 2017

[10] [10]

Patrice Godefroid, Adam Kiezun, and Michael Y. Levin. Grammar-based whitebox fuzzing. In Proceedings of the 29th ACM SIGPLAN Conference on Programming Language Design and Implementation , pages 206–215, 2008. 14

work page 2008

[11] [11]

Levin, and David Molnar

Patrice Godefroid, Michael Y. Levin, and David Molnar. Automated whitebox fuzz testing. In NDSS, volume 8, pages 151–166, 2008

work page 2008

[12] [12]

Learn&fuzz: Machine learning for input fuzzing

Patrice Godefroid, Hila Peleg, and Rishabh Singh. Learn&fuzz: Machine learning for input fuzzing. In 2017 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE) , pages 50–59. IEEE, 2017

work page 2017

[13] [13]

Ganfuzz: A gan-based industrial network protocol fuzzing framework

Zhifeng Hu, Jing Shi, Yu-Heng Huang, et al. Ganfuzz: A gan-based industrial network protocol fuzzing framework. In Proceedings of the 15th ACM International Conference on Computing Frontiers, pages 138–145, 2018

work page 2018

[14] [14]

Fuzzing: A survey

Jian Li, Bo Zhao, and Chao Zhang. Fuzzing: A survey. Cybersecurity, 1(1):1–13, 2018

work page 2018

[15] [15]

Deepfuzz: Automatic generation of syn- tax valid c programs for fuzz testing

Xuefeng Liu, Xiaoting Li, Rohan Prajapati, et al. Deepfuzz: Automatic generation of syn- tax valid c programs for fuzz testing. In Proceedings of the AAAI Conference on Artiﬁcial Intelligence, volume 33, pages 1044–1051, 2019

work page 2019

[16] [16]

Miller, Lars Fredriksen, and Bryan So

Barton P. Miller, Lars Fredriksen, and Bryan So. An empirical study of the reliability of unix utilities. Communications of the ACM , 33(12):32–44, 1990

work page 1990

[17] [17]

Jensen, and Chris W

Matthias Sablotny, Bjørn S. Jensen, and Chris W. Johnson. Recurrent neural networks for fuzz testing web browsers. In Information Security and Cryptology–ICISC 2018 , pages 354–370. Springer International Publishing, 2019

work page 2018

[18] [18]

Fuzzing: Brute Force Vulnerability Discovery

Michael Sutton, Adam Greene, and Pedram Amini. Fuzzing: Brute Force Vulnerability Discovery. Pearson Education, 2007

work page 2007

[19] [19]

Superion: Grammar-aware greybox fuzzing

Junjie Wang, Bihuan Chen, Lei Wei, et al. Superion: Grammar-aware greybox fuzzing. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE) , pages 724–735. IEEE, 2019

work page 2019

[20] [20]

Universal fuzzing via large language models

Chunqiu Steven Xia, Michele Paltenghi, Jie Tian, et al. Fuzz4all: Universal fuzzing with large language models. arXiv preprint arXiv:2308.04748 , 2024

work page arXiv 2024

[21] [21]

Format-aware learn&fuzz: Deep test data generation for eﬃcient fuzzing

Mohammad Zakeri Nasrabadi, Saeed Parsa, and Alireza Kalaee. Format-aware learn&fuzz: Deep test data generation for eﬃcient fuzzing. Neural Computing and Applications , 33:1497–1513, 2021

work page 2021

[22] [22]

LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing

Hongyu Zhang, Yicheng Rong, Yuxuan He, et al. Llamafuzz: Large language model en- hanced greybox fuzzing. arXiv preprint arXiv:2406.07714 , 2024

work page arXiv 2024

[23] [23]

Seqfuzzer: An industrial protocol fuzzing framework from a deep learning perspective

Hong Zhao, Zhen Li, Hao Wei, et al. Seqfuzzer: An industrial protocol fuzzing framework from a deep learning perspective. In 2019 12th IEEE Conference on Software Testing, Validation and Veriﬁcation (ICST) , pages 59–67. IEEE, 2019

work page 2019

[24] [24]

Fuzzing: A survey for roadmap

Xiaogang Zhu, Sheng Wen, Seyit Camtepe, et al. Fuzzing: A survey for roadmap. ACM Computing Surveys , 54(11s):1–36, 2022. 15

work page 2022