
arxiv: 2605.27360 · v1 · pith:O2R73SARnew · submitted 2026-05-26 · 💻 cs.NI · cs.AI

GENESIS: Harnessing AI Agents for Autonomous 6G RAN Synthesis, Research, and Testing

Pith reviewed 2026-06-29 15:00 UTC · model grok-4.3

classification 💻 cs.NI cs.AI
keywords: AI agents, 6G RAN, autonomous synthesis, over-the-air testing, knowledge base, agentic framework, radio access network, LLM limitations

The pith

GENESIS converts RAN intents such as specification clauses or anomalies into solutions validated by over-the-air experiments and stored in a persistent knowledge base.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies six manual R&D processes that each take months in cellular networks and shows that standard LLMs fail on them by hallucinating APIs and producing non-transferable simulation results. GENESIS addresses this with an agentic framework that decomposes intents into composable actions executed against real hardware, with every output recorded in the SYNAPSE knowledge layer for reuse. The result is a closed loop in which capabilities improve across iterations rather than resetting. A sympathetic reader would care because this structure could compress the current multi-month cycle into automated runs while maintaining interoperability and hardware fidelity.

Core claim

GENESIS is an agentic AI framework that converts intents into solutions validated with over-the-air experiments, fed back into a persistent knowledge base. It is built on three composable primitives (agents, skills, hooks) and a knowledge layer (SYNAPSE) that doubles as the source of ground truth and the recipient of every artifact the framework produces, making capabilities compound across runs.
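
The closed loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the class names (`Artifact`, `KnowledgeBase`, `Agent`), the `draft_skill` function, and the `non_empty_hook` check are all hypothetical, and the hook merely stands in for the over-the-air validation step. The point it shows is that every artifact is written back to the knowledge layer, so a second run on the same intent starts from the first run's result.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Artifact:
    intent: str
    skill: str
    output: str
    validated: bool = False


@dataclass
class KnowledgeBase:
    """Hypothetical stand-in for SYNAPSE: read as ground truth, written on every run."""
    facts: Dict[str, str] = field(default_factory=dict)
    artifacts: List[Artifact] = field(default_factory=list)

    def lookup(self, intent: str) -> Optional[str]:
        return self.facts.get(intent)

    def ingest(self, artifact: Artifact) -> None:
        # Every artifact is recorded; only validated ones become reusable facts.
        self.artifacts.append(artifact)
        if artifact.validated:
            self.facts[artifact.intent] = artifact.output


Skill = Callable[[str, KnowledgeBase], str]
Hook = Callable[[Artifact], bool]


@dataclass
class Agent:
    name: str
    skills: Dict[str, Skill]
    hooks: List[Hook]

    def run(self, intent: str, skill_name: str, kb: KnowledgeBase) -> Artifact:
        output = self.skills[skill_name](intent, kb)
        artifact = Artifact(intent, skill_name, output)
        # Hooks gate the artifact; here they stand in for hardware validation.
        artifact.validated = all(hook(artifact) for hook in self.hooks)
        kb.ingest(artifact)
        return artifact


def draft_skill(intent: str, kb: KnowledgeBase) -> str:
    """Hypothetical skill: reuse a prior validated result if one exists."""
    prior = kb.lookup(intent)
    return f"refined({prior})" if prior is not None else f"draft({intent})"


def non_empty_hook(artifact: Artifact) -> bool:
    """Hypothetical validation hook: accept any non-empty output."""
    return bool(artifact.output)
```

Running the same intent twice through `Agent.run` yields a draft on the first pass and a refinement of that draft on the second, which is the compounding behavior the core claim describes.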

What carries the argument

Three composable primitives (agents, skills, hooks) plus the SYNAPSE knowledge layer that supplies ground truth and accumulates every produced artifact.

If this is right

  • Feature synthesis from standards or papers becomes an automated pipeline ending in OTA validation.
  • Conformance testing, anomaly hardening, and data-driven optimization run without manual intervention.
  • Novel waveform discovery and security hardening proceed from research hypotheses to hardware tests.
  • Every artifact enriches the shared knowledge base so later runs start from accumulated results.
  • The six structural bottlenecks that currently consume months per iteration are addressed in a single loop.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the loop holds, the dominant cost in 6G R&D shifts from engineering labor to defining clear intents.
  • The same structure could be tested on other hardware-in-the-loop domains where simulation-reality gaps are costly.
  • Over repeated runs the knowledge base may encode domain-specific patterns that general LLMs lack.
  • Failure to close the loop would appear first as repeated API hallucinations on successive intents.

Load-bearing premise

The three primitives and SYNAPSE layer can prevent LLM hallucinations of APIs and ensure solutions transfer from simulation to real RAN hardware.

What would settle it

A test in which GENESIS receives a standards clause, emits code, and the code fails interoperability checks on actual RAN equipment because of an incorrect API call.

Figures

Figures reproduced from arXiv: 2605.27360 by Ali Saeizadeh, Andrea Lacava, Angelo Feraudo, Gabriele Gemmi, Maxime Elkael, Michele Polese, Minh Dat Nguyen, Paolo Testolina, Pedram Johari, Reshma Prasad, Salvatore D'Oro, Soumendra Nanda, Tamerlan Aghayev, Tommaso Melodia.

Figure 1: GENESIS architecture, organized as four horizontal layers tied together by the SYNAPSE knowledge plane and a pluggable LLM backend. A single intent (a specification clause, a KPI anomaly, or a research hypothesis) enters at the top. The agentic framework routes it through one of six capability pipelines (SYNTHESIZE, TEST, HARDEN, OPTIMIZE, DISCOVER, SECURE), composed at run-time from a pool of agent specia…
Figure 2: Summary of the six GENESIS capabilities, including the input and output for each one, the stages in which they unfold (in the order indicated by the numbers, color-coded to the capability), and the ingestion of the outcome in the knowledge base, after a human review.
… focusing on a foundational model, GENESIS develops agentic components that can leverage different LLMs and test benches. Summary. To our know…
Figure 3: Tiered skill invocation chain. Each tier owns a disjoint set of inputs.
Figure 4: GENESIS agentic testbed and relationships between specialist agents and testbed components.
Three properties make the abstraction portable in practice. First, markdown "specialist" personas translate to the system prompt and tool list that every agentic runtime consumes; the only adapter required is a loader that parses the frontmatter into the framework's agent-selection mechanism. Second, SKILL.md files …
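
A loader of the kind mentioned in the Figure 4 text might look like the sketch below. It assumes a persona file that opens with a `---` fenced frontmatter block of simple `key: value` lines followed by the prompt body. The field names `name` and `tools`, the comma-separated tool list, and the `AgentSpec` type are illustrative assumptions; the excerpt does not give the actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AgentSpec:
    name: str
    tools: List[str]
    system_prompt: str


def load_persona(markdown: str) -> AgentSpec:
    """Split a persona file into frontmatter metadata and a system prompt."""
    lines = markdown.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("persona must start with a '---' frontmatter fence")

    # Find the closing fence of the frontmatter block.
    end = None
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            end = index
            break
    if end is None:
        raise ValueError("frontmatter block is not terminated")

    # Parse flat 'key: value' pairs (assumed format, no nesting).
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"malformed frontmatter line: {line!r}")
        meta[key.strip()] = value.strip()

    tools = [t.strip() for t in meta.get("tools", "").split(",") if t.strip()]
    body = "\n".join(lines[end + 1:]).strip()
    return AgentSpec(name=meta["name"], tools=tools, system_prompt=body)
```

Keeping the adapter this small is what makes the persona files portable: a different runtime only needs its own mapping from `AgentSpec` to its agent-selection mechanism.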
Figure 5: Per-stage SYNTHESIZE pipeline trace for Case Study 1 and the RRC.ConnMean-specific artifact each produced. Stages marked by an asterisk require human approval.
… protocol stack and exposure over the O-RAN E2 interface. It is the simplest of the three GENESIS use cases in this paper: two specifications (one 3GPP, one O-RAN), one OAI subsystem (the measurement collector), leading to an O-RAN control-plane ad…
Figure 6: Time series plot of RRC.ConnMean. The trace climbs 0-1-2-3 as …
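
For readers unfamiliar with the counter in Figure 6, the sketch below shows one way RRC.ConnMean can be computed, assuming the usual reading of 3GPP TS 28.552: the number of UEs in RRC connected mode is sampled at a fixed interval and averaged over each granularity period. The function names and the sample values are hypothetical; the samples are chosen only to mimic the 0-1-2-3 climb in the caption and are not the paper's data.

```python
from statistics import fmean
from typing import List, Sequence


def rrc_conn_mean(samples: Sequence[int]) -> float:
    """Arithmetic mean of connected-UE counts sampled within one period."""
    if not samples:
        raise ValueError("at least one sample is required")
    return fmean(samples)


def rrc_conn_mean_series(samples: Sequence[int], samples_per_period: int) -> List[float]:
    """Split a sample stream into granularity periods and average each one."""
    if samples_per_period <= 0:
        raise ValueError("samples_per_period must be positive")
    return [
        rrc_conn_mean(samples[start:start + samples_per_period])
        for start in range(0, len(samples), samples_per_period)
    ]


# Hypothetical trace: one UE attaches every two samples.
trace = [0, 0, 1, 1, 2, 2, 3, 3]
per_period = rrc_conn_mean_series(trace, samples_per_period=2)
```

With a longer granularity period the same trace yields fractional values, which is why a reported RRC.ConnMean need not be an integer.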
Figure 8: Logical architecture of the end-to-end CHO use case. Solid boxes (left) …
Figure 9: Per-stage SYNTHESIZE pipeline trace for Case Study 2 and the CHO-specific artifact each produced. Stages marked by an asterisk require human approval.
[Plot labels recovered from the extraction: A3 offset (dB); RSRP (dBm); series RSRP DU0, RSRP DU1, wobble peak, ∆ (first HO), ∆ (PP HO), A3 offset; serving DU (DU 0 / DU 1) over t_sim (s).]
Figure 10: xApp anti-ping-pong loop. UE wobbles between …
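
The Figure 10 caption is truncated, so the actual control loop is not recoverable from this page. As a rough illustration of the general idea only, the sketch below pairs a simplified A3 entry check (neighbor RSRP minus hysteresis exceeds serving RSRP plus offset, with cell-specific offsets taken as zero) with a controller that raises the A3 offset whenever a handover is reversed within a short window. The class name, the default values, and the step-up policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


def a3_triggered(serving_dbm: float, neighbor_dbm: float,
                 offset_db: float, hysteresis_db: float = 0.0) -> bool:
    """Simplified A3 entry condition with cell-specific offsets set to zero."""
    return neighbor_dbm - hysteresis_db > serving_dbm + offset_db


@dataclass
class AntiPingPong:
    """Hypothetical controller: widen the A3 offset after each ping-pong."""
    offset_db: float = 3.0
    step_db: float = 1.0
    max_offset_db: float = 10.0
    window_s: float = 5.0
    history: List[Tuple[float, str, str]] = field(default_factory=list)

    def on_handover(self, t: float, source: str, target: str) -> float:
        if self.history:
            last_t, last_source, last_target = self.history[-1]
            # A ping-pong is a handover that undoes the previous one quickly.
            reversed_quickly = (
                source == last_target
                and target == last_source
                and t - last_t <= self.window_s
            )
            if reversed_quickly:
                self.offset_db = min(self.offset_db + self.step_db,
                                     self.max_offset_db)
        self.history.append((t, source, target))
        return self.offset_db
```

A larger offset means the neighbor must be stronger by a wider margin before A3 fires, which trades handover promptness for stability when a UE sits near the cell boundary.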
Figure 11: Handover success rate as a function of UE speed. For each (mode, …
Figure 13: Over-the-air CHO result captured during one walk of the testbed in …
Figure 14: Mapping of ALLSTaR scheduling synthesis and discovery capabilities …
Figure 16: Median wall-clock time per SYNTHESIZE stage. Light dotted bars: RRC.ConnMean pipeline on Opus. Mid striped bars: same pipeline on Sonnet. Dark cross-hatched bars: CHO + E2SM-RC pipeline on Opus. Solo totals: 44 min (Opus KPM), 93 min (Sonnet KPM), 226 min (Opus Feature).
The operating point that follows: if wall-clock latency drives the deployment, Opus dominates; if the goal is to amortize a fixed bud…
Original abstract

Cellular research and development (R&D) is throttled by six structural processes that each consume months of manual engineering work per iteration: (i) synthesizing new features from standards or research papers into production code; (ii) conformance and interoperability testing; (iii) hardening against field anomalies and diverse deployment environments; (iv) data-driven optimization of network functionalities; (v) discovering and prototyping novel waveforms, functionalities, and capabilities for future standards; and (vi) securing the stack against vulnerabilities. Although Large Language Models (LLMs) have compressed comparable R&D work in general software engineering from days to minutes, their known pitfalls worsen on Radio Access Network (RAN) use cases: they hallucinate Application Programming Interfaces (APIs) and mis-read specifications, which kills interoperability of RAN components at the first mistake, and they heavily rely on simulations for designing algorithms, which is notorious for breaking when transferred to real hardware. To address these challenges, we present GENESIS, an agentic Artificial Intelligence (AI) framework that converts intents (e.g., a specification clause, a telemetry anomaly, or a research hypothesis) into solutions validated with over-the-air experiments, fed back into a persistent knowledge base. GENESIS is built on three composable primitives (agents, skills, hooks) and a knowledge layer (SYNAPSE) that doubles as the source of ground truth and the recipient of every artifact the framework produces, making capabilities compound across runs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper presents GENESIS, an agentic AI framework for automating six labor-intensive 6G RAN R&D processes (feature synthesis from standards, conformance testing, anomaly hardening, data-driven optimization, waveform discovery, and security). It claims that GENESIS converts high-level intents (specification clauses, telemetry anomalies, or research hypotheses) into solutions validated via over-the-air experiments, with all artifacts fed back into a persistent SYNAPSE knowledge base. The framework is built from three composable primitives (agents, skills, hooks) that compound capabilities across runs while mitigating LLM pitfalls such as API hallucination and sim-to-real transfer failures.

Significance. If the claimed end-to-end capability were demonstrated, the work would be highly significant for cellular R&D: it could compress multi-month manual cycles into automated, compounding iterations and directly address the documented LLM failure modes that currently block reliable use in standards-compliant RAN hardware. The emphasis on OTA validation and a persistent ground-truth knowledge layer distinguishes it from simulation-only agentic proposals.

major comments (2)
  1. [Abstract] Abstract: the central claim that GENESIS 'converts intents ... into solutions validated with over-the-air experiments' is unsupported by any experimental results, error metrics, failure cases, implementation details, or even high-level pseudocode; without these the claim cannot be evaluated and remains a system sketch rather than a substantiated contribution.
  2. [System Architecture / Primitives] The description of the three primitives (agents, skills, hooks) and SYNAPSE does not specify concrete mechanisms, guardrails, or verification steps that would prevent the documented LLM failure modes (API hallucination, non-transferable simulation results) when the framework is applied to real RAN hardware and 3GPP specifications.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments correctly identify that the current manuscript is primarily a system description and that stronger substantiation is needed for the central claims. We address each point below and commit to revisions that improve clarity and technical detail without overstating the current evidence.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that GENESIS 'converts intents ... into solutions validated with over-the-air experiments' is unsupported by any experimental results, error metrics, failure cases, implementation details, or even high-level pseudocode; without these the claim cannot be evaluated and remains a system sketch rather than a substantiated contribution.

    Authors: We agree that the abstract phrasing overstates what is demonstrated. The manuscript presents the framework architecture and intended workflow rather than completed end-to-end OTA experiments. In revision we will (i) rephrase the abstract to state that GENESIS is designed to produce OTA-validated solutions via the described primitives and knowledge layer, (ii) add high-level pseudocode for the core agent loop, and (iii) include a dedicated section on example workflows with explicit discussion of failure modes and mitigation strategies. Full quantitative OTA results with error metrics remain future work and will not be claimed in this paper. revision: partial

  2. Referee: [System Architecture / Primitives] The description of the three primitives (agents, skills, hooks) and SYNAPSE does not specify concrete mechanisms, guardrails, or verification steps that would prevent the documented LLM failure modes (API hallucination, non-transferable simulation results) when the framework is applied to real RAN hardware and 3GPP specifications.

    Authors: The referee is correct that the current text remains at the level of high-level primitives. We will expand the System Architecture and SYNAPSE sections to include: (a) explicit guardrails such as schema-based API validation against 3GPP reference implementations inside skills, (b) multi-agent cross-verification hooks that require consensus before any code or configuration is emitted, (c) simulation-to-real transfer checks that compare against a curated set of hardware benchmarks stored in SYNAPSE, and (d) provenance tracking that records every artifact's verification status. These concrete mechanisms will be described with pseudocode and will directly target the hallucination and transfer issues raised. revision: yes
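
    To make guardrail (a) concrete, here is one way an API check could reject hallucinated calls before generated code is emitted. It is a deliberately simplified sketch: it uses a plain allowlist rather than a full schema, it inspects Python source with the standard `ast` module as a stand-in for whatever language the RAN stack uses, and the function names in the usage note (`e2_subscribe`, `report_kpm`, `send_magic_handover`) are invented for illustration. Nothing here is taken from the authors' planned implementation.

```python
import ast
from typing import Iterable, List, Set


def called_names(source: str) -> Set[str]:
    """Collect the names of all functions and methods called in the source."""
    names: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                names.add(func.id)        # plain call: foo(...)
            elif isinstance(func, ast.Attribute):
                names.add(func.attr)      # method or module call: obj.foo(...)
    return names


def unknown_calls(source: str, allowed: Iterable[str]) -> List[str]:
    """Return called names that are absent from the allowlist, sorted."""
    return sorted(called_names(source) - set(allowed))


def api_guardrail(source: str, allowed: Iterable[str]) -> bool:
    """Hypothetical hook: pass only if every call is on the allowlist."""
    return not unknown_calls(source, allowed)
```

    Given an allowlist of `e2_subscribe` and `report_kpm`, a generated snippet that also calls `send_magic_handover` would be rejected, and `unknown_calls` would name the offending symbol so the agent can be re-prompted with it.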

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper presents a high-level architectural description of the GENESIS framework built on agents, skills, hooks, and the SYNAPSE knowledge layer. No equations, derivations, fitted parameters, or mathematical claims appear in the provided text. The central claims are descriptive and system-level rather than derived from self-referential loops, self-citations that bear the load of uniqueness, or renamings of known results. No instances of the enumerated circularity patterns are present, and the argument remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

Only the abstract is available; no free parameters, axioms, or invented entities beyond the named framework components can be extracted or verified.

invented entities (2)
  • GENESIS (no independent evidence)
    purpose: Agentic AI framework for 6G RAN tasks
    Introduced as the central contribution of the paper.
  • SYNAPSE (no independent evidence)
    purpose: Persistent knowledge layer serving as ground truth
    New component described as doubling as source of truth and recipient of artifacts.

pith-pipeline@v0.9.1-grok · 5850 in / 1293 out tokens · 37206 ms · 2026-06-29T15:00:58.949966+00:00 · methodology

