Advanced AI Service Provisioning in O-RAN through LLM Engine Integration

Bo Tang; Pranshav Gajjar; Seyed Bagher Hashemi Natanzi; Vijay K. Shah

arxiv: 2605.23809 · v2 · pith:6EKVVAACnew · submitted 2026-05-22 · 📡 eess.SY · cs.LG · cs.SY

Seyed Bagher Hashemi Natanzi, Pranshav Gajjar, Bo Tang, Vijay K. Shah

Pith reviewed 2026-05-25 03:08 UTC · model grok-4.3

classification 📡 eess.SY · cs.LG · cs.SY

keywords: O-RAN, LLM, AI provisioning, Dual-Brain architecture, xApps, rApps, NeuralSmith, 5G SA testbed

The pith

A Dual-Brain architecture pairs an LLM orchestrator with an on-demand ML engine to automate O-RAN AI service creation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that LLMs can manage high-level translation of operator intents into data policies and deployment code for O-RAN xApps and rApps, while a separate engine called NeuralSmith handles training of lightweight classifiers for fast inference. This hybrid setup addresses the slow manual process of collecting data, writing code, and deploying AI in the modular O-RAN architecture. A reader would care because current approaches limit how quickly AI can be embedded in radio access networks despite the architecture's design for it. The work demonstrates the workflow in a containerized 5G standalone testbed.

Core claim

We present a proof-of-concept Dual-Brain architecture that combines both strengths: an LLM-based orchestrator translates operator intents into data-collection policies and deployment code, while an automated ML engine, NeuralSmith, trains lightweight classifiers on demand via an API. We describe the architecture and provisioning workflow, share practical insights from a containerized O-RAN 5G SA testbed, and discuss open research directions.

What carries the argument

Dual-Brain architecture with LLM-based orchestrator for intent translation and NeuralSmith engine for on-demand classifier training via API.
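
The split described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: `DataPolicy`, `orchestrate`, and `train_classifier` are hypothetical names, the orchestrator is replaced by simple keyword rules rather than an LLM, and the training engine is replaced by a one-feature threshold fit. The only point it makes is structural: intent translation and training happen on a slow path, and what reaches the fast path is a small deterministic function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class DataPolicy:
    """Hypothetical data-collection policy derived from an operator intent."""
    kpis: List[str]   # telemetry fields to collect
    label: str        # name of the condition the classifier should detect


def orchestrate(intent: str) -> DataPolicy:
    """Slow path. Stand-in for the LLM orchestrator (keyword rules, not an LLM)."""
    text = intent.lower()
    if "interference" in text:
        return DataPolicy(kpis=["sinr_db", "bler"], label="interfered")
    return DataPolicy(kpis=["prb_util", "throughput_mbps"], label="congested")


Sample = Tuple[Dict[str, float], bool]
Classifier = Callable[[Dict[str, float]], bool]


def train_classifier(policy: DataPolicy, samples: Sequence[Sample]) -> Classifier:
    """Stand-in for the on-demand training engine.

    Fits a threshold on the first KPI of the policy, halfway between the
    class means. Requires at least one positive and one negative sample.
    """
    feature = policy.kpis[0]
    pos = [x[feature] for x, y in samples if y]
    neg = [x[feature] for x, y in samples if not y]
    pos_mean = sum(pos) / len(pos)
    neg_mean = sum(neg) / len(neg)
    threshold = (pos_mean + neg_mean) / 2
    positive_is_low = pos_mean < neg_mean

    def predict(x: Dict[str, float]) -> bool:
        # Fast path: a single comparison, deterministic and cheap.
        return (x[feature] < threshold) == positive_is_low

    return predict
```

In this sketch the returned `predict` closure carries no reference to the orchestrator, which mirrors the design choice of keeping the LLM out of the inference loop.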

Load-bearing premise

An LLM can reliably generate correct, safe, and deterministic deployment code and policies for real-time RAN control.

What would settle it

Deploy the LLM-generated code and policies in the containerized O-RAN 5G SA testbed and verify whether they execute without errors, safety violations, or failures under real-time control conditions.
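
Before any testbed run, a cheap first gate is a static check of the generated code. The sketch below assumes the generated xApp code is Python and uses only the standard-library `ast` module; the deny-lists `FORBIDDEN_MODULES` and `FORBIDDEN_CALLS` are hypothetical examples, not a policy taken from the paper. Passing such a check says nothing about real-time behaviour, so it complements the testbed verification rather than replacing it.

```python
import ast
from typing import List

# Hypothetical deny-lists; a real deployment would define its own policy.
FORBIDDEN_MODULES = {"os", "subprocess", "socket"}
FORBIDDEN_CALLS = {"eval", "exec"}


def static_check(source: str) -> List[str]:
    """Return a list of findings; an empty list means the checks passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error at line {exc.lineno}: {exc.msg}"]

    findings: List[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # import a.b.c -> check the top-level package name
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    findings.append(f"forbidden import: {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            # from a.b import c -> node.module is None for relative imports
            if node.module and node.module.split(".")[0] in FORBIDDEN_MODULES:
                findings.append(f"forbidden import: {node.module}")
        elif isinstance(node, ast.Call):
            # Only direct calls by name are caught, e.g. eval(...)
            if isinstance(node.func, ast.Name) and node.func.id in FORBIDDEN_CALLS:
                findings.append(f"forbidden call: {node.func.id}")
    return findings
```

A deployment pipeline could refuse to ship any artifact for which `static_check` returns a non-empty list, and log the findings for the operator.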

Figures

Figures reproduced from arXiv: 2605.23809 by Bo Tang, Pranshav Gajjar, Seyed Bagher Hashemi Natanzi, Vijay K. Shah.

**Figure 1.** The Dual-Brain architecture. The ZTO-Agent (LLM orchestrator, Non-RT RIC rApp) parses intents, curates data, and synthesizes xApp code.

**Figure 2.** Four-phase provisioning workflow: (1) intent and telemetry subscription, (2) curated data to

**Figure 4.** ZTO-Agent orchestration latency comparison across four foundation

**Figure 3.** ZTO-Agent (Llama-3.1-8B via Ollama) orchestration latency is

Original abstract

The Open Radio Access Network (O-RAN) architecture allows AI to be embedded directly into the RAN through modular xApps and rApps, yet creating these applications (collecting data, training models, writing code, and deploying them safely) remains slow and largely manual. Large Language Models (LLMs) offer strong reasoning and code-generation capabilities but are unsuited for the fast, deterministic inference required in real-time RAN control. We present a proof-of-concept Dual-Brain architecture that combines both strengths: an LLM-based orchestrator translates operator intents into data-collection policies and deployment code, while an automated ML engine, NeuralSmith, trains lightweight classifiers on demand via an API. We describe the architecture and provisioning workflow, share practical insights from a containerized O-RAN 5G SA testbed, and discuss open research directions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper sketches an LLM-plus-AutoML architecture for O-RAN but provides no supporting measurements or validation.

The paper's core is a Dual-Brain architecture where an LLM turns high-level intents into data policies and code for O-RAN xApps/rApps, while NeuralSmith handles model training on demand. They ran it on a containerized 5G testbed and note open questions. It does a decent job mapping out the provisioning steps and explaining why LLMs are kept out of the fast path. That separation is a reasonable design choice given known LLM limitations. The workflow they outline covers the end-to-end process from intent to deployed app, which could help people thinking about similar systems.

On the downside, the paper stays at the level of description. There are no results from the testbed, no metrics on how often the LLM produces usable code, no timing data, and no validation that the generated policies work correctly in the RAN. The contribution boils down to showing that such an integration is possible in principle, but without evidence on reliability or efficiency, it's difficult to assess the real value.

The work is aimed at researchers and engineers in the O-RAN and telecom AI space. Someone looking for concrete examples of LLM use in network automation might find the architecture diagram and workflow useful as a reference.

I think it deserves peer review. The topic is relevant to current efforts in intelligent RAN, and even a preliminary architecture paper can spark useful discussion if the authors expand on their testbed experiences in revision.

Referee Report

1 major / 0 minor

Summary. The manuscript presents a proof-of-concept Dual-Brain architecture for AI service provisioning in O-RAN. An LLM-based orchestrator translates operator intents into data-collection policies and deployment code, while the NeuralSmith automated ML engine trains lightweight classifiers on demand via an API. The work describes the architecture and provisioning workflow, shares practical insights from a containerized O-RAN 5G SA testbed, and discusses open research directions, while explicitly noting that LLMs are unsuited for fast deterministic real-time RAN control.

Significance. If the described workflow and separation of roles hold, the approach could reduce manual effort in creating xApps and rApps by automating intent-to-code translation and on-demand model training. The explicit scoping of the LLM to orchestration (avoiding real-time inference) is a strength that aligns with known LLM limitations. The contribution is primarily a conceptual framework and workflow description rather than new algorithms or benchmarked performance gains.

major comments (1)

Abstract: the claim of sharing 'practical insights from a containerized O-RAN 5G SA testbed' is not accompanied by any quantitative results, error metrics, timing data, or specific observations on the provisioning workflow or model performance. This absence is load-bearing for assessing whether the Dual-Brain architecture delivers the promised reduction in manual effort.
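
The kind of evidence this comment asks for could be reported with very little machinery. The following sketch is an assumption about how provisioning attempts might be logged and summarized, not something described in the manuscript: `ProvisioningRun` and its fields are hypothetical, and the summary reports a code-usability rate plus median and 95th-percentile orchestration latency using the standard-library `statistics` module.

```python
from dataclasses import dataclass
from statistics import median, quantiles
from typing import Dict, List


@dataclass
class ProvisioningRun:
    """One hypothetical intent-to-deployment attempt."""
    latency_s: float    # wall-clock orchestration time in seconds
    code_usable: bool   # generated code deployed without manual edits


def summarize(runs: List[ProvisioningRun]) -> Dict[str, float]:
    """Aggregate logged runs into the metrics a referee would expect."""
    if len(runs) < 2:
        raise ValueError("need at least two runs to compute percentiles")
    latencies = [r.latency_s for r in runs]
    # n=20 yields 19 cut points at 5% steps; index 18 is the 95th percentile.
    cuts = quantiles(latencies, n=20, method="inclusive")
    return {
        "runs": len(runs),
        "usable_rate": sum(r.code_usable for r in runs) / len(runs),
        "median_latency_s": median(latencies),
        "p95_latency_s": cuts[18],
    }
```

Reporting the tail latency alongside the median matters here because orchestration by an LLM can have a long tail even when the typical case is fast.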

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our proof-of-concept manuscript. We address the single major comment below and agree that the abstract claim requires clarification given the descriptive nature of the work.

Referee: Abstract: the claim of sharing 'practical insights from a containerized O-RAN 5G SA testbed' is not accompanied by any quantitative results, error metrics, timing data, or specific observations on the provisioning workflow or model performance. This absence is load-bearing for assessing whether the Dual-Brain architecture delivers the promised reduction in manual effort.

Authors: We agree that the manuscript provides no quantitative results, error metrics, timing data, or performance benchmarks, as the contribution is explicitly a proof-of-concept architecture and workflow description rather than an empirical study. The 'practical insights' consist of qualitative observations on implementation challenges, containerized deployment steps, and workflow feasibility drawn from the testbed, which are elaborated in the body of the paper (e.g., architecture integration and open directions). The manuscript does not claim or promise a measured reduction in manual effort; any such inference is external to the stated scope. We will revise the abstract to more precisely characterize the contribution as conceptual and workflow-oriented, removing any implication of quantified benefits. We can also expand the description of specific workflow observations in the main text if the editor deems it helpful.

revision: yes

Circularity Check

0 steps flagged

No significant circularity

The paper is a descriptive systems/architectural contribution presenting a proof-of-concept Dual-Brain workflow for O-RAN service provisioning. It contains no equations, no fitted parameters, no derivations, no predictions of quantities, and no load-bearing self-citations that reduce any claim to its own inputs by construction. The central claim is scoped to describing the architecture, provisioning workflow, and testbed observations rather than proving a mathematical result or generalizing from fitted data, so no circularity patterns apply.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

No mathematical models, fitted parameters, or formal axioms are present in the abstract. The work introduces named components (Dual-Brain, NeuralSmith) as engineering constructs rather than new physical or mathematical entities.

invented entities (2)

Dual-Brain architecture (no independent evidence)
purpose: Split LLM reasoning from fast ML inference for O-RAN provisioning
Introduced in the abstract as the core proposed system; no independent evidence supplied.

NeuralSmith (no independent evidence)
purpose: Automated ML engine that trains lightweight classifiers on demand
Named component presented as part of the architecture; no external validation or falsifiable prediction given.

pith-pipeline@v0.9.0 · 5685 in / 1180 out tokens · 25057 ms · 2026-05-25T03:08:17.484253+00:00 · methodology
