Advanced AI Service Provisioning in O-RAN through LLM Engine Integration

Bo Tang; Pranshav Gajjar; Seyed Bagher Hashemi Natanzi; Vijay K. Shah

arXiv: 2605.23809 · v2 · pith:6EKVVAACnew · submitted 2026-05-22 · 📡 eess.SY · cs.LG · cs.SY

Pith reviewed 2026-06-30 15:09 UTC · model grok-4.3

classification 📡 eess.SY · cs.LG · cs.SY

keywords O-RAN · LLM integration · AI service provisioning · Dual-Brain architecture · xApps rApps · intent-based orchestration · NeuralSmith · 5G testbed

The pith

An LLM orchestrator translates operator intents into O-RAN data policies and deployment code, paired with an on-demand ML engine for real-time classifiers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that O-RAN AI application creation can be accelerated by a Dual-Brain architecture. An LLM handles high-level translation of intents into policies and code, while a separate automated ML engine trains lightweight models via API. This addresses the current slow, manual process of building xApps and rApps for data collection and deployment. A sympathetic reader would care if this makes embedding AI in radio networks practical and scalable.

Core claim

The authors present a proof-of-concept Dual-Brain architecture that combines an LLM-based orchestrator, which translates operator intents into data-collection policies and deployment code, with an automated ML engine called NeuralSmith that trains lightweight classifiers on demand via an API.

What carries the argument

The Dual-Brain architecture, which uses the LLM for intent translation and code generation while delegating model training and inference to a dedicated ML engine.
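The split described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: `DataPolicy`, `translate_intent`, `train_threshold_classifier`, the `metric1,metric2;period_ms` reply format, and the `fake_llm` stub are all hypothetical names invented here, and the one-feature threshold model merely stands in for whatever NeuralSmith actually trains.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical data-collection policy produced by the orchestrator."""
    metrics: List[str]
    report_period_ms: int


def translate_intent(intent: str, llm: Callable[[str], str]) -> DataPolicy:
    """Slow path: ask an LLM (any text-in/text-out callable) for a policy.

    The 'metric1,metric2;period_ms' reply format is an assumption made for
    this sketch only.
    """
    reply = llm(f"Intent: {intent}\nReply as metric1,metric2;period_ms")
    metrics_part, period_part = reply.strip().split(";")
    metrics = [m.strip() for m in metrics_part.split(",") if m.strip()]
    return DataPolicy(metrics=metrics, report_period_ms=int(period_part))


def train_threshold_classifier(
    samples: Sequence[float], labels: Sequence[int]
) -> Callable[[float], int]:
    """Stand-in for the ML engine: fit a one-feature threshold classifier.

    Assumes both classes are present and that class 1 has the larger values.
    """
    pos = [x for x, y in zip(samples, labels) if y == 1]
    neg = [x for x, y in zip(samples, labels) if y == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
    # Fast path: the returned function is deterministic and involves no LLM.
    return lambda x: 1 if x >= threshold else 0


def fake_llm(prompt: str) -> str:
    """Canned reply so the sketch runs without any model server."""
    return "rsrp,cqi;100"


policy = translate_intent("detect interference on cell 1", fake_llm)
classifier = train_threshold_classifier([1.0, 2.0, 8.0, 9.0], [0, 0, 1, 1])
```

The point of the design is visible in the types: the LLM appears only in `translate_intent`, while the object that would sit in the control loop is a plain function with no model call inside it.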

Load-bearing premise

The LLM can reliably and safely generate correct data-collection policies and deployment code for real-time RAN control without introducing errors that require extensive human review.

What would settle it

Deploying code generated by the LLM in the O-RAN testbed and verifying whether the resulting applications perform accurate real-time control without errors or security problems.

Figures

Figures reproduced from arXiv: 2605.23809 by Bo Tang, Pranshav Gajjar, Seyed Bagher Hashemi Natanzi, Vijay K. Shah.

**Figure 1.** The Dual-Brain architecture. The ZTO-Agent (LLM orchestrator, Non-RT RIC rApp) parses intents, curates data, and synthesizes xApp code.

**Figure 2.** Four-phase provisioning workflow: (1) intent and telemetry subscription, (2) curated data to

**Figure 3.** ZTO-Agent (Llama-3.1-8B via Ollama) orchestration latency is

**Figure 4.** ZTO-Agent orchestration latency comparison across four foundation

Original abstract

The Open Radio Access Network (O-RAN) architecture allows AI to be embedded directly into the RAN through modular xApps and rApps, yet creating these applications (collecting data, training models, writing code, and deploying them safely) remains slow and largely manual. Large Language Models (LLMs) offer strong reasoning and code-generation capabilities but are unsuited for the fast, deterministic inference required in real-time RAN control. We present a proof-of-concept Dual-Brain architecture that combines both strengths: an LLM-based orchestrator translates operator intents into data-collection policies and deployment code, while an automated ML engine, NeuralSmith, trains lightweight classifiers on demand via an API. We describe the architecture and provisioning workflow, share practical insights from a containerized O-RAN 5G SA testbed, and discuss open research directions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper describes a PoC architecture for LLM-assisted O-RAN provisioning but supplies no quantitative validation of the generated policies or code.

Here's the quick read on this one. The core of the paper is a proof-of-concept for a Dual-Brain setup in O-RAN where an LLM turns high-level intents into policies and code for deploying AI services, while a separate engine called NeuralSmith handles the model training. They outline the workflow and mention running it on their testbed.

What stands out is the practical angle from the containerized 5G testbed and the recognition that LLMs aren't suitable for the real-time part, so they keep that separate. That's a sensible split.

The soft spot is exactly what the stress-test flags: no numbers at all on whether the LLM outputs are reliable. No error rates, no tests showing the generated xApps work without breaking timing or introducing issues. For something that claims to make provisioning safer and faster, the absence of any validation data makes the central claim unsupported. The architecture description alone doesn't address LLM failure modes like bad API calls or wrong parameters.

Nothing in the work is mathematically new or a fresh derivation; it's an application of current LLM tools to this telecom workflow. The citation pattern isn't an issue since it's early stage.

This paper is aimed at folks in the O-RAN and AI-for-networks community who want ideas on reducing manual work in xApp deployment. A reader looking for a starting point on LLM integration might find the workflow useful, but anyone needing evidence of performance will come away wanting more.

It deserves peer review. The idea is timely for the field, and the description is clear enough that referees could point to specific additions like benchmark runs or safety checks that would make it stronger. I'd recommend sending it out.

Referee Report

2 major / 2 minor

Summary. The manuscript describes a proof-of-concept Dual-Brain architecture for advanced AI service provisioning in O-RAN. An LLM-based orchestrator translates operator intents into data-collection policies and deployment code, while the NeuralSmith automated ML engine trains lightweight classifiers on demand. The architecture and provisioning workflow are illustrated with insights from a containerized O-RAN 5G SA testbed.

Significance. This work addresses the slow manual process of creating xApps and rApps in O-RAN by leveraging LLMs for orchestration and automated ML for model training. If the safety and correctness concerns can be resolved, it has the potential to significantly reduce development time for AI-driven RAN applications. The combination of LLM reasoning with deterministic ML inference is a promising direction, though currently the lack of empirical validation limits the assessed impact.

major comments (2)

[Abstract] The abstract describes a PoC and testbed workflow but supplies no quantitative performance data, error rates, or comparison against manual baselines, so the central claim that the architecture works safely remains unsupported by evidence in the provided text.
[PoC description] The LLM orchestrator's ability to generate correct data-collection policies and deployment code without introducing errors is assumed but not demonstrated; no verification steps or test results are reported to mitigate risks such as hallucinated API calls or incorrect parameters that could violate RAN latency and safety requirements.
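One inexpensive guard against the "incorrect parameters" risk raised above is to check every generated policy against an explicit schema before it reaches the RAN. The sketch below is a hedged illustration, not something the manuscript reports: the metric names, the period bounds, and the `validate_policy` helper are hypothetical, and real limits would have to come from the service model exposed by the target node.

```python
from typing import Any, Dict, List

# Hypothetical allow-list and bounds, chosen only for this sketch.
ALLOWED_METRICS = {"rsrp", "rsrq", "cqi", "throughput"}
MIN_PERIOD_MS = 10
MAX_PERIOD_MS = 10_000


def validate_policy(policy: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the policy passed."""
    problems: List[str] = []

    metrics = policy.get("metrics")
    if not isinstance(metrics, list) or not metrics:
        problems.append("metrics must be a non-empty list")
    else:
        for name in metrics:
            if not isinstance(name, str) or name not in ALLOWED_METRICS:
                problems.append(f"unknown metric: {name!r}")

    period = policy.get("report_period_ms")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(period, bool) or not isinstance(period, int):
        problems.append("report_period_ms must be an integer")
    elif not MIN_PERIOD_MS <= period <= MAX_PERIOD_MS:
        problems.append(f"report_period_ms out of range: {period}")

    return problems


good = validate_policy({"metrics": ["rsrp", "cqi"], "report_period_ms": 100})
bad = validate_policy({"metrics": ["rsrp", "beam_magic"], "report_period_ms": 1})
```

Here `good` is empty, while `bad` lists the hallucinated metric and the out-of-range period. Counting how often such a validator rejects LLM output over many intents would also give a first, rough error rate of the kind the referee asks for.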

minor comments (2)

Consider providing more details on the containerized testbed setup, including specific O-RAN components used.
The open research directions section could benefit from more concrete examples of potential issues to address.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback and for recognizing the potential of the Dual-Brain approach. We address the major comments point by point below.

Referee: [Abstract] The abstract describes a PoC and testbed workflow but supplies no quantitative performance data, error rates, or comparison against manual baselines, so the central claim that the architecture works safely remains unsupported by evidence in the provided text.

Authors: We agree that the work is a proof-of-concept focused on architecture and workflow rather than quantitative evaluation. The abstract does not advance a claim of proven safety or performance; it presents the PoC and notes open research directions. We will revise the abstract to explicitly state that no empirical benchmarks or error rates are provided and that safety is addressed at the architectural level through separation of LLM orchestration from deterministic ML inference.

Revision: yes

Referee: [PoC description] The LLM orchestrator's ability to generate correct data-collection policies and deployment code without introducing errors is assumed but not demonstrated; no verification steps or test results are reported to mitigate risks such as hallucinated API calls or incorrect parameters that could violate RAN latency and safety requirements.

Authors: The manuscript describes the provisioning workflow and testbed integration but does not include experiments measuring LLM output correctness or error mitigation. We acknowledge this as a limitation of the current PoC. We will add text in the discussion section outlining potential verification mechanisms (e.g., static analysis of generated policies and human review) as directions for future work, without claiming empirical validation.

Revision: partial
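To make the "static analysis" suggestion concrete, here is a minimal sketch using Python's standard `ast` module to flag calls in generated code that fall outside an allow-list, without ever executing that code. It assumes the generated xApp is Python, which the abstract does not state; the allow-list entries and the helper names `called_name` and `find_disallowed_calls` are hypothetical.

```python
import ast
from typing import List, Set

# Hypothetical allow-list of callables that a generated xApp may use.
ALLOWED_CALLS: Set[str] = {"subscribe", "predict", "send_control", "len", "print"}


def called_name(node: ast.Call) -> str:
    """Return the rightmost name of the callee, e.g. 'ric.subscribe' -> 'subscribe'."""
    func = node.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return "<dynamic>"  # e.g. calling the result of another call


def find_disallowed_calls(source: str) -> List[str]:
    """Parse source without executing it and list calls outside the allow-list."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]
    flagged = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            name = called_name(node)
            if name not in ALLOWED_CALLS:
                flagged.add(name)
    return sorted(flagged)


clean_code = "ric.subscribe('rsrp')\nlabel = model.predict(x)\n"
risky_code = "import os\nos.system('reboot')\nric.set_magic_power(9000)\n"

clean_report = find_disallowed_calls(clean_code)
risky_report = find_disallowed_calls(risky_code)
```

For the two snippets above, `clean_report` is empty and `risky_report` names `set_magic_power` and `system`. Matching only the rightmost attribute name is deliberately coarse, and imports are not inspected at all, so a check like this would complement human review rather than replace it.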

Standing simulated objections (not resolved)

Provision of quantitative error rates, safety validation experiments, or comparisons against manual baselines, as these were outside the scope of the described proof-of-concept.

Circularity Check

0 steps flagged

No circularity: architectural PoC with no derivations or equations

The paper describes a Dual-Brain architecture and provisioning workflow for O-RAN as a proof-of-concept. It contains no equations, no mathematical derivations, no fitted parameters presented as predictions, and no load-bearing self-citations that reduce claims to prior author work. The central content is an architectural proposal plus testbed insights, which is self-contained as a descriptive contribution without any reduction of results to inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

Only the abstract is available, so the ledger is populated from the high-level description; no free parameters, axioms, or invented entities with independent evidence are explicitly quantified.

axioms (1)

domain assumption: LLMs can translate natural-language operator intents into correct and safe O-RAN deployment artifacts.
Central to the Dual-Brain claim; stated implicitly in the abstract as the role of the LLM orchestrator.

invented entities (2)

Dual-Brain architecture (no independent evidence)
purpose: Combines LLM reasoning with automated ML training for O-RAN provisioning.
New system name introduced to describe the split between orchestrator and ML engine.

NeuralSmith (no independent evidence)
purpose: Automated ML engine that trains lightweight classifiers on demand via API.
Named tool presented as the fast inference component.

Reference graph

Works this paper leans on

15 extracted references · 4 canonical work pages · 1 internal anchor

[1] C.-X. Wang, M. D. Renzo, S. Stanczak, S. Wang, and E. G. Larsson, “Artificial intelligence enabled wireless networking for 5G and beyond: Recent advances and future challenges,” IEEE Wireless Communications, vol. 27, no. 1, pp. 16–23, 2020.

[2] N. D. Tripathi and V. K. Shah, Fundamentals of O-RAN. John Wiley & Sons, 2025.

[3] L. Bariah, Q. Zhao, H. Zou, Y. Tian, F. Bader, and M. Debbah, “Large generative AI models for telecom: The next big thing?” IEEE Communications Magazine, vol. 62, no. 11, pp. 84–90, Nov. 2024.

[4] D. Wu, X. Wang, Y. Qiao, Z. Wang, J. Jiang, S. Cui, and F. Wang, “NetLLM: Adapting large language models for networking,” in Proceedings of the ACM SIGCOMM 2024 Conference, ser. ACM SIGCOMM ’24. Association for Computing Machinery, 2024, pp. 661–678.

[5] P. Gajjar and V. K. Shah, “ORAN-Bench-13K: An open source benchmark for assessing LLMs in open radio access networks,” in 2025 IEEE 22nd Consumer Communications & Networking Conference (CCNC), 2025, pp. 1–4.

[6] P. Gajjar and V. K. Shah, “ORANSight-2.0: Foundational LLMs for O-RAN,” IEEE Transactions on Machine Learning in Communications and Networking, vol. 3, pp. 903–920, 2025.

[7] F. Ayed, A. Maatouk, N. Piovesan, A. D. Domenico, M. Debbah, and Z.-Q. Luo, “Hermes: A large language model framework on the journey to autonomous networks,” 2024. [Online]. Available: https://arxiv.org/abs/2411.06490

[8] X. Wu, J. Farooq, Y. Wang, and J. Chen, “LLM-xApp: A large language model empowered radio resource management xApp for 5G O-RAN,” in 2024 IEEE Global Communications Conference (GLOBECOM), 2024, pp. 1–6. [Online]. Available: https://ieeexplore.ieee.org/document/10825313

[9] P. Gajjar and V. K. Shah, “Agents should replace narrow predictive AI as the orchestrator in 6G AI-RAN,” 2026. [Online]. Available: https://arxiv.org/abs/2605.11516

[10] NeuralSmith, “Automated ML engineering platform,” 2024. [Online]. Available: https://neuralsmith.com

[11] M. Polese et al., “Understanding O-RAN: Architecture, interfaces, algorithms, security, and research challenges,” IEEE Communications Surveys & Tutorials, vol. 25, no. 2, pp. 1376–1411, 2023.

[12] OpenAirInterface Software Alliance, “OAI 5G RAN,” 2024. [Online]. Available: https://openairinterface.org

[13] R. Schmidt et al., “FlexRIC: An SDK for next-generation SD-RANs,” in Proceedings of ACM CoNEXT, 2021, pp. 411–425.

[14] B. Tang, V. K. Shah, V. Marojevic, and J. H. Reed, “AI testing framework for next-G O-RAN networks: Requirements, design, and research opportunities,” IEEE Wireless Communications, vol. 30, no. 1, pp. 70–77, 2023.

[15] A. Farzaneh, S. D’Oro, and O. Simeone, “Should I have expressed a different intent? Counterfactual generation for LLM-based autonomous control,” 2026. [Online]. Available: https://arxiv.org/abs/2601.20090
