
arxiv: 2604.27000 · v1 · submitted 2026-04-29 · 💻 cs.SE · cs.CR · cs.PL

Adaptive and AI-Augmented Security Testing: A Systematic Survey of Program Analysis, Feedback-Driven Testing, and Hybrid Learning-Based Approaches

Pith reviewed 2026-05-07 13:27 UTC · model grok-4.3

classification 💻 cs.SE · cs.CR · cs.PL
keywords security testing · program analysis · feedback-driven testing · AI-augmented testing · structural-adaptive fragmentation · systematic survey · fuzzing · continuous integration security

The pith

Security testing research shows a persistent disconnect between structural program analysis and adaptive feedback mechanisms.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This survey examines fifty-five studies on adaptive and AI-augmented security testing drawn from five domains including structural analysis, feedback-driven fuzzing, and LLM-based generation. It establishes that structural representations such as ASTs, CFGs, and CPGs remain separated from mechanisms that refine tests using execution signals or learning. The authors name this separation structural-adaptive fragmentation and observe that human triage signals are never fed back to improve the structural models. A sympathetic reader would care because modern CI/CD pipelines still produce large numbers of warnings that require manual review, and closing the gap could reduce that burden while improving vulnerability detection.

Core claim

Analysis of the fifty-five studies reveals a persistent disconnect between structural program representations such as ASTs, CFGs, and CPGs and adaptive testing mechanisms that the authors term structural-adaptive fragmentation. Neither paradigm alone resolves the separation, and no existing system incorporates human triage signals as feedback for refining structural models. The survey identifies five open research challenges and outlines an agenda for unified, semantically grounded, feedback-driven, polyglot security testing frameworks.

What carries the argument

Structural-adaptive fragmentation: the systematic separation between structural program representations (ASTs, CFGs, CPGs) and adaptive testing mechanisms that neither paradigm individually addresses.
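
To make the two sides of this separation concrete, here is a minimal sketch, not taken from the paper. It assumes a toy program `handle`, a hypothetical sink named `run_query`, and a hand-instrumented copy called `target`; all of these names are illustrative. The structural pass walks a Python AST to locate the sink, while the adaptive loop keeps only the inputs that reach new branches. A deterministic append stands in for random mutation so the behaviour is reproducible.

```python
import ast

# Hypothetical program under test, kept as text for the structural pass.
SOURCE = '''
def handle(data):
    if data.startswith("A"):
        if len(data) > 1:
            return run_query(data)
    return None
'''


def structural_sinks(source: str, sink: str = "run_query") -> list[int]:
    """Structural view: line numbers of calls to `sink`, found by walking the AST."""
    return [
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == sink
    ]


def target(data: str, trace: set[str]) -> None:
    """Hand-instrumented copy of handle(): records which branches an input reaches."""
    if data.startswith("A"):
        trace.add("starts_with_A")
        if len(data) > 1:
            trace.add("reaches_sink")


def feedback_fuzz(seed: str, rounds: int, alphabet: str = "BA") -> set[str]:
    """Adaptive view: keep only mutated inputs that reach new branches."""
    corpus = [seed]
    covered: set[str] = set()
    for i in range(rounds):
        parent = corpus[i % len(corpus)]
        # Deterministic stand-in for random mutation: append one symbol.
        child = parent + alphabet[i % len(alphabet)]
        trace: set[str] = set()
        target(child, trace)
        if not trace <= covered:  # execution feedback: something new was reached
            covered |= trace
            corpus.append(child)
    return covered
```

Note that `feedback_fuzz` never consults the output of `structural_sinks`, and the structural pass never learns which branches were actually exercised. That missing link is one simple way to picture the fragmentation the survey describes; a hybrid design would presumably pass information in at least one direction.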

If this is right

  • Hybrid systems that combine program analysis with adaptive learning can reduce reliance on non-adaptive workflows in continuous security testing.
  • Incorporating execution feedback and human triage into structural models would lower the volume of manual warnings in CI/CD pipelines.
  • A unified framework supporting semantically grounded, feedback-driven testing would enable more effective vulnerability detection across multiple languages.
  • Progress on the five identified open challenges would move the field toward integrated rather than fragmented security testing approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Closing the identified gap could produce testing tools that automatically adjust structural models based on both runtime signals and human judgments, reducing false positives over time.
  • The fragmentation pattern may appear in other software engineering domains where static analysis outputs feed into dynamic or learning-based processes without feedback loops.
  • A practical next step would be to prototype a system that routes human triage decisions back into structural representations and measure changes in warning volume or detection accuracy.
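
The prototype suggested in the last bullet could start from something as small as the following sketch. It is an assumption-laden illustration, not a system described in the paper: `Finding`, `TriageFeedback`, the rule ids, and the 0.3 threshold are all hypothetical, and a rule id merely stands in for a structural pattern (for example a query over a CPG). Human verdicts update a smoothed per-rule precision, and rules whose precision falls below the threshold stop producing warnings.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule: str      # hypothetical id of the structural pattern that fired
    location: str  # e.g. "file.py:42"


class TriageFeedback:
    """Tracks per-rule precision from human triage verdicts."""

    def __init__(self, threshold: float = 0.3) -> None:
        self.threshold = threshold          # assumed cut-off, chosen for illustration
        self.confirmed = defaultdict(int)   # rule -> verdicts marked as real issues
        self.dismissed = defaultdict(int)   # rule -> verdicts marked as false positives

    def record(self, finding: Finding, is_real: bool) -> None:
        """Store one human triage decision for the rule behind `finding`."""
        if is_real:
            self.confirmed[finding.rule] += 1
        else:
            self.dismissed[finding.rule] += 1

    def precision(self, rule: str) -> float:
        """Laplace-smoothed precision, so unseen rules start at 0.5."""
        tp, fp = self.confirmed[rule], self.dismissed[rule]
        return (tp + 1) / (tp + fp + 2)

    def keep(self, findings: list[Finding]) -> list[Finding]:
        """Drop findings from rules whose estimated precision is below the threshold."""
        return [f for f in findings if self.precision(f.rule) >= self.threshold]
```

Comparing `len(findings)` with `len(feedback.keep(findings))` before and after a triage session would give a rough measure of the change in warning volume mentioned in the bullet. Measuring detection accuracy would additionally need ground-truth labels.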

Load-bearing premise

The fifty-five peer-reviewed studies selected from the systematic search of four databases represent the field without major bias in inclusion or categorization.

What would settle it

Identification of even one existing system that incorporates human triage signals as feedback to refine structural program models would falsify the claim that no such system exists.

Figures

Figures reproduced from arXiv: 2604.27000 by Michael Wienczkowski.

Figure 1: Study selection flowchart following PRISMA-style reporting guidelines [21]. Starting from 22,088 raw records across four databases, successive …
Figure 2: Conceptual map of the five surveyed research domains positioned along two axes: …

Original abstract

Modern software systems are increasingly developed within rapid continuous integration and deployment (CI/CD) pipelines, where ensuring security prior to release presents significant technical and organizational challenges. Traditional static and dynamic analysis tools provide valuable structural and behavioral insights, yet they often operate in non-adaptive workflows and produce large volumes of warnings requiring manual triage. Feedback-driven fuzzing and search-based testing approaches have demonstrated the power of iterative input refinement guided by execution signals, while large language models (LLMs) have shown promise in automated test generation but frequently lack semantic grounding in program structure. This paper presents a systematic survey of adaptive and AI-augmented security testing research across five domains: (1) structural program analysis for vulnerability detection, (2) DevSecOps and continuous security testing, (3) feedback-driven fuzzing and search-based testing, (4) LLM-based automated test generation, and (5) emerging hybrid systems integrating program analysis with adaptive learning. We analyze fifty-five peer-reviewed studies drawn from a systematic search of four major databases yielding 22,088 raw records. Our analysis reveals a persistent disconnect between structural program representations (ASTs, CFGs, and CPGs) and adaptive testing mechanisms. We characterize this as structural-adaptive fragmentation: a systematic separation that neither paradigm individually addresses. No existing system incorporates human triage signals as feedback for refining structural models. We conclude by identifying five open research challenges and outlining a unified agenda for semantically grounded, feedback-driven, polyglot security testing frameworks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. This paper presents a systematic survey of adaptive and AI-augmented security testing research across five domains: (1) structural program analysis for vulnerability detection, (2) DevSecOps and continuous security testing, (3) feedback-driven fuzzing and search-based testing, (4) LLM-based automated test generation, and (5) emerging hybrid systems. Drawing on 55 peer-reviewed studies identified from a search of four databases that returned 22,088 raw records, the authors identify a persistent 'structural-adaptive fragmentation' between structural representations (ASTs, CFGs, CPGs) and adaptive testing mechanisms, assert that no existing system incorporates human triage signals as feedback for refining structural models, and outline five open research challenges for unified, semantically grounded frameworks.

Significance. If the survey's selection and synthesis are reproducible and unbiased, the work is significant for mapping fragmentation in the field and highlighting the absence of human-in-the-loop refinement loops. It provides a clear research agenda that could guide development of hybrid tools integrating program structure with feedback-driven and LLM-based methods, addressing real challenges in CI/CD security testing.

major comments (1)
  1. [Systematic review methodology (search strategy, study selection, and synthesis sections)] The central claims of structural-adaptive fragmentation and that 'no existing system incorporates human triage signals as feedback for refining structural models' rest entirely on the qualitative synthesis of the 55 selected studies. However, the manuscript does not provide the explicit search strings, inclusion/exclusion criteria, data extraction protocol, quality assessment criteria, or inter-rater agreement metrics used to reduce 22,088 records to 55 studies (see the systematic search description). Without these details, it is impossible to assess whether relevant hybrid or feedback-refinement papers were under-retrieved or mis-categorized, making the fragmentation characterization unverifiable.
minor comments (1)
  1. [Abstract] The abstract lists five domains; the enumeration in the body text should be checked for exact alignment with the later analysis sections to avoid reader confusion.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thorough and constructive review. The feedback highlights an important opportunity to improve the transparency of our systematic review process, and we will revise the manuscript accordingly to strengthen reproducibility while preserving the core contributions.

Point-by-point responses
  1. Referee: [Systematic review methodology (search strategy, study selection, and synthesis sections)] The central claims of structural-adaptive fragmentation and that 'no existing system incorporates human triage signals as feedback for refining structural models' rest entirely on the qualitative synthesis of the 55 selected studies. However, the manuscript does not provide the explicit search strings, inclusion/exclusion criteria, data extraction protocol, quality assessment criteria, or inter-rater agreement metrics used to reduce 22,088 records to 55 studies (see the systematic search description). Without these details, it is impossible to assess whether relevant hybrid or feedback-refinement papers were under-retrieved or mis-categorized, making the fragmentation characterization unverifiable.

    Authors: We agree that the current manuscript provides only a high-level overview of the search process and lacks the granular protocol details needed for full reproducibility. In the revised version we will insert a dedicated 'Systematic Review Methodology' subsection (following PRISMA guidelines) that explicitly lists: (1) the complete search strings used in each of the four databases, (2) the full inclusion/exclusion criteria applied at each screening stage, (3) the data extraction protocol and form, (4) the quality assessment criteria (including scoring rubrics), and (5) inter-rater agreement statistics (Cohen's kappa) for both title/abstract and full-text screening. These details were recorded during the original review and will be reported without changing the set of 55 studies or the resulting synthesis. We believe this addition will allow independent verification of the fragmentation characterization and the claim regarding the absence of human-triage feedback loops.

    Revision: yes
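
For readers unfamiliar with the inter-rater statistic named in the response, Cohen's kappa is κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement between two raters and p_e is the agreement expected by chance from each rater's label frequencies. The sketch below is a generic illustration with made-up screening decisions; the function name and the sample labels are assumptions, not data from the paper.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items on which the raters gave the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a.keys() | freq_b.keys()) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used a single identical label for every item
    return (p_o - p_e) / (1 - p_e)


# Made-up title/abstract screening decisions for ten records.
screen_a = ["include"] * 4 + ["exclude"] * 6
screen_b = ["include"] * 3 + ["exclude", "include"] + ["exclude"] * 5
```

With these sample labels the raters agree on 8 of 10 records (p_o = 0.8), chance agreement is p_e = 0.52, and `cohens_kappa(screen_a, screen_b)` evaluates to 7/12, roughly 0.58. Under the commonly cited bands of Landis and Koch [27], that falls in the "moderate" range.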

Circularity Check

0 steps flagged

No significant circularity in survey synthesis

full rationale

The paper is a systematic literature survey whose central claims (structural-adaptive fragmentation and absence of human-triage feedback loops) are synthesized from qualitative review of 55 externally identified peer-reviewed studies. No mathematical derivations, parameter fittings, self-referential predictions, or load-bearing self-citations appear in the derivation chain. The survey process applies standard database search and inclusion criteria to independent external sources, making the analysis self-contained against literature benchmarks without reduction to internal definitions or author priors.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a survey paper the central claims rest on the literature selection process and qualitative synthesis. No free parameters, mathematical axioms, or new postulated entities are introduced.

pith-pipeline@v0.9.0 · 5574 in / 1117 out tokens · 48932 ms · 2026-05-07T13:27:46.853862+00:00 · methodology


Reference graph

Works this paper leans on

61 extracted references · 2 canonical work pages

  1. [1] M. Hilton, T. Tunnell, K. Huang, D. Marinov, and D. Dig, “Usage, costs, and benefits of continuous integration in open-source projects,” in Proceedings of the IEEE/ACM International Conference on Automated Software Engineering (ASE), 2016

  2. [2] M. Christakis and C. Bird, “What developers want and need from program analysis,” in Proceedings of the IEEE/ACM International Conference on Automated Software Engineering (ASE), 2016

  3. [3] B. Aloraini, M. Nagappan, D. German, A. Zerouali, and G. Robles, “An empirical study of security warnings from static application security testing tools,” Journal of Systems and Software, vol. 158, 2019

  4. [4] C. Sadowski, J. van Gogh, C. Jaspan, E. Söderberg, and C. Winter, “Tricorder: Building a program analysis ecosystem,” in Proceedings of the International Conference on Software Engineering (ICSE), 2015, pp. 598–608

  5. [5] C. Calcagno et al., “Moving fast with software verification,” in Proceedings of the NASA Formal Methods Symposium (NFM), ser. Lecture Notes in Computer Science, vol. 9058, 2015, pp. 3–11

  6. [6] F. Yamaguchi, N. Golde, D. Arp, and K. Rieck, “Modeling and discovering vulnerabilities with code property graphs,” in Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2014, pp. 590–604

  7. [7] J. Youn et al., “Declarative static analysis for multilingual programs using CodeQL,” Software: Practice and Experience, vol. 53, no. 2, 2023

  8. [8] V. J. M. Manès et al., “The art, science, and engineering of fuzzing: A survey,” IEEE Transactions on Software Engineering, vol. 47, no. 11, pp. 2312–2331, 2021

  9. [9] M. Böhme, V.-T. Pham, M.-D. Nguyen, and A. Roychoudhury, “Directed greybox fuzzing,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2017, pp. 2329–2344

  10. [10] N. Stephens et al., “Driller: Augmenting fuzzing through selective symbolic execution,” in Proceedings of the NDSS Symposium, 2016

  11. [11] M. Schäfer, S. Nadi, A. Eghbali, and F. Tip, “An empirical evaluation of using large language models for automated unit test generation,” IEEE Transactions on Software Engineering, vol. 50, no. 1, 2024

  12. [12] C. Yang et al., “White-box compiler fuzzing empowered by large language models,” Proceedings of the ACM on Programming Languages (OOPSLA), vol. 8, 2024

  13. [13] F. Yamaguchi, C. Wressnegger, H. Gascon, and K. Rieck, “Joern: Efficient mining of software vulnerabilities with interprocedural data-flow graphs,” in Proceedings of the DIMVA, 2014

  14. [14] P. Tsankov, A. Dan, D. Drachsler-Cohen, A. Gervais, F. Buenzli, and M. Vechev, “Securify: Practical security analysis of smart contracts,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2018

  15. [15] Z. Li et al., “VulDeePecker: A deep learning-based system for vulnerability detection,” in Proceedings of the NDSS Symposium, 2018

  16. [16] S. Chakraborty, R. Krishna, Y. Ding, and B. Ray, “Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks,” in Proceedings of the Conference on Neural Information Processing Systems (NeurIPS), 2019

  17. [17] R. Pan, M. Kim, R. Krishna, R. Pavuluri, and S. Sinha, “ASTER: Natural and multi-language unit test generation with LLMs,” in Proceedings of the IEEE/ACM International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), 2025

  18. [18] Z. Wang, K. Liu, G. Li, and Z. Jin, “HITS: High-coverage LLM-based unit test generation via method slicing,” in Proceedings of the IEEE/ACM International Conference on Automated Software Engineering (ASE), 2024

  19. [19] R. N. Rajapakse, M. Zahedi, M. A. Babar, and H. Shen, “Challenges and solutions when adopting DevSecOps: A systematic review,” Information and Software Technology, vol. 141, 2022

  20. [20] C. Feio et al., “An empirical study of DevSecOps focused on continuous security testing,” in Proceedings of the IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2024

  21. [21] B. Kitchenham and S. Charters, “Guidelines for performing systematic literature reviews in software engineering,” Keele University, Tech. Rep. EBSE-2007-01, 2007

  22. [22] Z. Li, S. Dutta, and M. Naik, “IRIS: LLM-assisted static analysis for detecting security vulnerabilities,” in Proceedings of the International Conference on Learning Representations (ICLR), 2025, arXiv:2405.17238

  23. [23] Z. Sheng, Z. Chen, S. Gu, H. Huang, G. Gu, and J. Huang, “LLMs in software security: A survey of vulnerability detection techniques and insights,” ACM Computing Surveys, vol. 1, no. 1, pp. 1–33, 2025

  24. [24] P. Cousot and R. Cousot, “Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints,” in Proceedings of the ACM Symposium on Principles of Programming Languages (POPL), 1977, pp. 238–252

  25. [25] J. C. King, “Symbolic execution and program testing,” Communications of the ACM, vol. 19, no. 7, pp. 385–394, 1976

  26. [26] M. Weiser, “Program slicing,” in Proceedings of the International Conference on Software Engineering (ICSE), 1981, pp. 439–449

  27. [27] J. R. Landis and G. G. Koch, “The measurement of observer agreement for categorical data,” Biometrics, vol. 33, no. 1, pp. 159–174, 1977

  28. [28] E. J. Schwartz, T. Avgerinos, and D. Brumley, “All you ever wanted to know about dynamic taint analysis and forward symbolic execution (but might have been afraid to ask),” in Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2010

  29. [29] C. Cadar, V. Ganesh, P. M. Pawlowski, D. L. Dill, and D. R. Engler, “EXE: Automatically generating inputs of death,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2006, pp. 322–335

  30. [30] V. B. Livshits and M. S. Lam, “Finding security vulnerabilities in Java applications with static analysis,” in Proceedings of the USENIX Security Symposium, 2005

  31. [31] I. Gotovchits, R. A. van Tonder, and C. Cadar, “Saluki: Finding taint-style vulnerabilities with static property checking,” in Proceedings of the NDSS Symposium, 2018

  32. [32] C. Vassallo et al., “How developers engage with static analysis tools in different contexts,” Empirical Software Engineering, vol. 25, 2020

  33. [33] F. Zampetti et al., “An empirical characterization of security checks in CI workflows,” in Proceedings of the IEEE/ACM International Conference on Mining Software Repositories (MSR), 2020

  34. [34] T. Wadhams et al., “Barriers to using static application security testing (SAST) tools,” in Proceedings of the International Conference on Evaluation and Assessment in Software Engineering (EASE), 2024

  35. [35] M. R. Rahman and L. Williams, “Security smells in Ansible and Chef scripts: A replication study,” ACM Transactions on Software Engineering and Methodology, vol. 28, no. 4, 2019

  36. [36] C. Bird et al., “The promise and peril of mining Git repositories,” in Proceedings of the IEEE/ACM International Conference on Mining Software Repositories (MSR), 2009

  37. [37] B. P. Miller, L. Fredriksen, and B. So, “An empirical study of the reliability of UNIX utilities,” Communications of the ACM, vol. 33, no. 12, pp. 32–44, 1990

  38. [38] M. Zalewski, “American fuzzy lop,” http://lcamtuf.coredump.cx/afl/, 2013

  39. [39] M. Böhme, V.-T. Pham, and A. Roychoudhury, “Coverage-based greybox fuzzing as Markov chain,” in Proceedings of the ACM Conference on Computer and Communications Security (CCS), 2016

  40. [40] C. Lemieux and K. Sen, “FairFuzz: A targeted mutation strategy for increasing greybox fuzz testing coverage,” in Proceedings of the IEEE/ACM International Conference on Automated Software Engineering (ASE), 2018

  41. [41] P. Godefroid, M. Y. Levin, and D. Molnar, “Automated whitebox fuzz testing,” in Proceedings of the NDSS Symposium, 2008

  42. [42] I. Yun et al., “QSYM: A practical concolic execution engine tailored for hybrid fuzzing,” in Proceedings of the USENIX Security Symposium, 2018

  43. [43] D. She, K. Pei, D. Epstein, J. Yang, B. Ray, and S. Jana, “NEUZZ: Efficient fuzzing with neural program smoothing,” in Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2019

  44. [44] J. Wang et al., “Superion: Grammar-aware greybox fuzzing,” in Proceedings of the International Conference on Software Engineering (ICSE), 2019

  45. [45] P. Chen and H. Chen, “Angora: Efficient fuzzing by principled search,” in Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2018

  46. [46] M. Harman and J. Clark, “Metrics are fitness functions too,” in Proceedings of the IEEE International Symposium on Software Metrics (METRICS), 2004

  47. [47] G. Fraser and A. Arcuri, “EvoSuite: Automatic test suite generation for object-oriented software,” in Proceedings of the ACM International Symposium on the Foundations of Software Engineering (FSE), 2011, pp. 416–419

  48. [48] S. Shamshiri et al., “Do automatically generated unit tests find real faults? An empirical study of developer-written tests and three state-of-the-art tools,” in Proceedings of the IEEE/ACM International Conference on Automated Software Engineering (ASE), 2015

  49. [49] C. Pacheco and M. D. Ernst, “Randoop: Feedback-directed random testing for Java,” in Proceedings of the OOPSLA Companion, 2007

  50. [50] M. Tufano, R. Watson, G. Bavota, M. D. Penta, M. White, and D. Poshyvanyk, “Learning to generate assert statements for unit tests,” in Proceedings of the International Conference on Software Engineering: New Ideas and Emerging Results (ICSE-NIER), 2020

  51. [51] A. Vikram, C. Murphy, and G. Kaiser, “Can large language models write good property-based tests?” in Proceedings of the ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), 2023

  52. [52] G. Antal, D. Bán, M. Isztin, R. Ferenc, and P. Hegedüs, “Leveraging GPT-4 for vulnerability-witnessing unit test generation,” in Proceedings of the International Conference on Evaluation and Assessment in Software Engineering (EASE), 2025, pp. 1056–1065

  53. [53] M. Harman, J. Ritchey, I. Harper, S. Sengupta, and K. Mao, “Mutation-guided LLM-based test generation at Meta,” in Companion Proceedings of the ACM International Conference on the Foundations of Software Engineering (FSE), 2025, pp. 180–191

  54. [54] X. Zhang et al., “Low-cost and comprehensive non-textual input fuzzing via LLM-synthesized generators,” in Proceedings of the USENIX Security Symposium, 2025

  55. [55] S. R. Chowdhury, G. Sridhara, A. K. Raghavan, and J. Bose, “Static program analysis guided LLM based unit test generation,” in Proceedings of the ACM IKDD CODS and COMAD, 2024, pp. 279–283

  56. [56] C. Cadar and K. Sen, “Symbolic execution for software testing: Three decades later,” Communications of the ACM, vol. 56, no. 2, pp. 82–90, 2013

  57. [57] C. S. Păsăreanu and W. Visser, “A survey of new trends in symbolic execution for software testing and analysis,” International Journal on Software Tools for Technology Transfer, vol. 11, no. 4, 2009

  58. [58] M. Pezzè and M. Young, Software Testing and Analysis: Process, Principles and Techniques. New York, NY: Wiley, 2008

  59. [59] A. G. S. Team, “YASA: Scalable multi-language taint analysis on the unified AST,” arXiv preprint arXiv:2601.17390, 2026

  60. [60] A. Mastropaolo, R. Kuhn, J. Voas, and B. Baudry, “LLM-powered security test generation: Oracles, vulnerability probes, and adversarial inputs,” IEEE Computer, vol. 59, no. 2, pp. 101–107, 2026

  61. [61] M. Kisielewicz, P. Kotzbach, and M. Kędziora, “Enhancing DevSecOps through large language model integration: A pipeline-centric approach,” in Proceedings of the International Conference on Computational Collective Intelligence (ICCI), 2026

Table V: Complete attribute extraction for all 55 primary studies (P01–P55). Columns: ID, First Author, Venue, Year, Rep., LLM, ML, Adapt...