arxiv: 2604.03282 · v1 · submitted 2026-03-24 · 📡 eess.SY · cs.AI · cs.SY

Customized User Plane Processing via Code Generating AI Agents for Next Generation Mobile Networks

Pith reviewed 2026-05-15 00:27 UTC · model grok-4.3

classification 📡 eess.SY · cs.AI · cs.SY
keywords AI agents · code generation · user plane · 6G networks · customized processing · protocol data units · generative AI · mobile networks

The pith

AI agents can generate on-demand code for customized user plane processing blocks in mobile networks when given suitable models, prompts, and templates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how generative AI agents use natural language requests to produce code that customizes user plane traffic handling in next-generation mobile networks. Agents receive text descriptions of desired behaviors, such as inspecting or acting on specific protocol data units, and output new function blocks to implement them. Evaluations focus on how model choice, prompt phrasing, and code templates influence whether the generated code matches the requested behavior. Results show successful generation is possible under appropriate conditions, pointing toward networks that can add capabilities dynamically without manual development. This approach aims to increase flexibility for vertical applications interacting with 6G systems.
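
To make the evaluated factors more concrete, the following sketch assembles a prompt from a text request and an optional code template. It is an illustration only: the prompt wording, the `build_prompt` helper, and the `process` template are assumptions, not the prompts or templates used in the paper, and no model is called.

```python
from textwrap import dedent
from typing import Optional

# Hypothetical template: fixes the entry point so the agent only fills in logic.
CODE_TEMPLATE = dedent('''\
    def process(pdu: bytes) -> list:
        """Decode one PDU and return the actions to perform."""
        # 1. decode header fields from `pdu`
        # 2. decide on the actions requested by the application
        return []
''')


def build_prompt(request: str, template: Optional[str] = None) -> str:
    """Combine a natural language service request with an optional template."""
    parts = [
        "You write user plane processing blocks in Python.",
        f"Service request: {request.strip()}",
    ]
    if template is not None:
        parts.append("Complete this template without changing its signature:")
        parts.append(template)
    else:
        parts.append("Write a complete function named `process`.")
    parts.append("Return only code.")
    return "\n\n".join(parts)
```

Calling `build_prompt(request)` with and without `CODE_TEMPLATE` yields the two prompt variants one would compare when studying the effect of providing a template.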

Core claim

Generative AI agents equipped with code generation capabilities can produce functional code for new user plane processing blocks that decode protocol data units and execute application-specified actions, with accuracy depending on model selection, prompt design, and the availability of a code template.

What carries the argument

Code-generating AI agents that convert text-based service requests into executable user plane function blocks for inspecting and acting on protocol data units.
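
As a rough picture of what such a function block might look like, here is a hand-written sketch for the Pub–Sub use case named in the Figure 3 caption. The wire format (one type byte, one topic-length byte, topic, payload), the class name `PubSubBlock`, and the helper `encode_pdu` are all hypothetical; the paper's actual protocols and block interfaces are not described on this page.

```python
from dataclasses import dataclass, field

# Hypothetical wire format, assumed for illustration only:
#   byte 0      message type (1 = SUBSCRIBE, 2 = PUBLISH)
#   byte 1      topic length N
#   bytes 2..   N bytes of UTF-8 topic, followed by the payload
SUBSCRIBE, PUBLISH = 1, 2


def decode_pdu(pdu: bytes) -> tuple[int, str, bytes]:
    """Split a raw PDU into (message type, topic, payload)."""
    if len(pdu) < 2:
        raise ValueError("PDU shorter than the 2-byte header")
    msg_type, topic_len = pdu[0], pdu[1]
    if len(pdu) < 2 + topic_len:
        raise ValueError("PDU truncated inside the topic field")
    topic = pdu[2:2 + topic_len].decode("utf-8")
    return msg_type, topic, pdu[2 + topic_len:]


def encode_pdu(msg_type: int, topic: str, payload: bytes = b"") -> bytes:
    """Build a PDU in the assumed format (topics up to 255 bytes)."""
    raw = topic.encode("utf-8")
    return bytes([msg_type, len(raw)]) + raw + payload


@dataclass
class PubSubBlock:
    """Toy processing block: tracks subscriptions and fans out publications."""
    subscribers: dict[str, set[str]] = field(default_factory=dict)

    def process(self, sender: str, pdu: bytes) -> list[tuple[str, bytes]]:
        """Return the (destination, payload) forwarding actions for one PDU."""
        msg_type, topic, payload = decode_pdu(pdu)
        if msg_type == SUBSCRIBE:
            self.subscribers.setdefault(topic, set()).add(sender)
            return []
        if msg_type == PUBLISH:
            return [(dst, payload) for dst in sorted(self.subscribers.get(topic, ()))]
        return []  # unknown message types are dropped in this sketch
```

The block returns forwarding actions instead of sending packets itself, which keeps the decode-and-decide logic testable without a network.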

If this is right

  • Networks can add new user plane functions rapidly in response to application requests instead of relying on pre-installed blocks.
  • Vertical applications gain direct influence over traffic handling through natural language specifications.
  • Development effort for custom connectivity services decreases because code is produced automatically.
  • 6G systems can adapt their data plane behavior on shorter timescales than manual reconfiguration allows.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar code-generation methods could extend to control-plane functions or orchestration tasks if safety constraints are met.
  • Live deployment would likely require automated verification steps to catch security or performance problems before activation.
  • The approach might reduce the need for extensive pre-standardized function libraries by allowing on-the-fly creation.
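
One possible form of the automated verification step mentioned above is a static gate that inspects generated source before anything is executed. The sketch below uses Python's standard `ast` module; the allow-list, the deny-list, and the function name `static_findings` are assumptions chosen for illustration, and a check like this is a first filter, not a sandbox.

```python
import ast

# Assumed policy for this sketch: only these modules may be imported,
# and these built-ins may not be called by name.
ALLOWED_IMPORTS = {"struct", "dataclasses"}
FORBIDDEN_CALLS = {"eval", "exec", "open", "compile", "__import__"}


def static_findings(source: str) -> list[str]:
    """Return a list of policy violations found in generated source."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    findings = []
    for node in ast.walk(tree):
        # Collect the top-level module names touched by import statements.
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            names = []
        findings += [
            f"import of '{name}' is not allowed"
            for name in names
            if name not in ALLOWED_IMPORTS
        ]
        # Flag direct calls such as eval(...); attribute calls are not covered.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in FORBIDDEN_CALLS
        ):
            findings.append(f"call to '{node.func.id}' is not allowed")
    return findings
```

An empty result means only that the source passed this particular policy; behavioral and performance checks would still be needed before activation.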

Load-bearing premise

The generated code will correctly and safely process real protocol data units when integrated into live network environments.

What would settle it

Running the generated blocks against live or emulated traffic streams and checking whether they perform the exact requested inspections and actions without runtime errors or incorrect outputs.
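
A minimal sketch of such a check, assuming each generated block exposes a `process(pdu)` function and that expected outputs are known for a set of synthetic PDUs. The request ("drop PDUs whose first byte is 0xFF, forward the rest"), the `GENERATED` source, and the test vectors are invented for illustration and do not come from the paper.

```python
from typing import Callable


def load_block(source: str, entry: str = "process") -> Callable:
    """Execute generated source in a fresh namespace and return its entry point.

    exec() provides no isolation, so this assumes the source was vetted first.
    """
    namespace: dict = {}
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace[entry]


def run_vectors(block: Callable, vectors: list) -> list:
    """Return one (pdu, reason) entry for every vector the block gets wrong."""
    failures = []
    for pdu, expected in vectors:
        try:
            actual = block(pdu)
        except Exception as exc:  # runtime errors count as failures
            failures.append((pdu, f"raised {type(exc).__name__}"))
            continue
        if actual != expected:
            failures.append((pdu, f"expected {expected!r}, got {actual!r}"))
    return failures


# Hypothetical generated block for the request described above.
GENERATED = '''
def process(pdu):
    return "drop" if pdu[0] == 0xFF else "forward"
'''

VECTORS = [
    (b"\xff\x00", "drop"),
    (b"\x01\x00", "forward"),
    (b"", "drop"),  # edge case: empty PDU
]

failures = run_vectors(load_block(GENERATED), VECTORS)
print(failures)  # [(b'', 'raised IndexError')]
```

The empty-PDU vector shows why runtime errors are reported separately from wrong outputs: the block handles both ordinary PDUs correctly but crashes on an input the request never mentioned.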

Figures

Figures reproduced from arXiv: 2604.03282 by Onur Ayan, Xiaowen Ma, Xueli An, Yunpu Ma.

Figure 1: System architecture of the proposed AI-agent-enabled framework for on-demand Customized Processing Block generation within the mobile core.
Figure 2: Processing flow of the proposed on-demand Customized Processing …
Figure 3: Example of the error type “Operation/Calculation Error – Incorrect Arithmetic Operation (IAO)” observed in a Pub–Sub protocol use case.
  • Adjacent text fragment: “… forward the published information on a specific topic to all of its subscribers.”
  • Table II, Error Types (truncated): Condition Error (CE): missing condition, incorrect condition · Constant Value Error (CVE): constant value error · Reference Error (RE): wrong method/variable, undefined…
Figure 4: Error composition across three protocols under different configurations.
Original abstract

Generative AI is envisioned to have a crucial impact on next generation mobile networking, making the sixth generation (6G) system considerably more autonomous, flexible, and adaptive than its predecessors. By leveraging their natural language processing and code generation capabilities, AI agents enable novel interactions and services between networks and vertical applications. A particularly promising and interesting use case is the customization of connectivity services for vertical applications by generating new customized processing blocks based on text-based service requests. More specifically, AI agents are able to generate code for a new function block that handles user plane traffic, allowing it to inspect and decode a protocol data unit (PDU) and perform specified actions as requested by the application. In this study, we investigate the code generation problem for generating such customized processing blocks on-demand. We evaluate various factors affecting the accuracy of the code generation process in this context, including model selection, prompt design, and the provision of a code template for the agent to utilize. Our findings indicate that AI agents are capable of generating such blocks with the desired behavior on-demand under suitable conditions. We believe that exploring the code generation for network-specific tasks is a very interesting problem for 6G and beyond, enabling networks to achieve a new level of customization by generating new capabilities on-demand.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper investigates the use of code-generating AI agents to produce customized user-plane processing blocks for 6G networks. Given a natural-language service request, an agent is prompted to emit code that inspects/decodes a protocol data unit (PDU) and performs application-specified actions. The study evaluates the effects of model selection, prompt design, and the provision of a code template on generation accuracy and reports that suitable combinations allow agents to produce blocks exhibiting the desired behavior on demand.

Significance. If the central claim were demonstrated, the work would open a path toward on-demand, text-driven customization of network functions, reducing manual development cycles for vertical-specific processing. The approach is timely for 6G autonomy goals. However, because the manuscript supplies only generation-success statistics and no runtime verification, the practical significance remains prospective rather than established.

major comments (1)
  1. [Evaluation / Results (abstract and main experimental sections)] The evaluation reports only whether generated code compiles or matches syntactic expectations across models, prompts, and templates. No execution tests against real or synthetic PDUs, no behavioral correctness checks, and no security or performance analysis are described. Consequently the claim that the blocks exhibit 'desired behavior on-demand' rests on an unverified assumption rather than demonstrated functionality.
minor comments (1)
  1. [Abstract] The abstract reports that agents can generate such blocks 'under suitable conditions' yet supplies no numerical values, error rates, or dataset sizes, so a reader cannot gauge effect sizes from the summary alone.
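
The gap raised in the major comment, between code that compiles and code that behaves as requested, can be shown with a small example. Both candidates below are hypothetical and assume a 2-byte big-endian length field at the start of the PDU; the faulty one contains an arithmetic mistake of the kind the Figure 3 caption labels Incorrect Arithmetic Operation.

```python
# Two hypothetical generated candidates for "return the 16-bit big-endian
# length stored in the first two bytes of the PDU".
CORRECT = "def payload_length(pdu):\n    return (pdu[0] << 8) | pdu[1]\n"
FAULTY = "def payload_length(pdu):\n    return pdu[0] + pdu[1]\n"  # wrong arithmetic


def compiles(source: str) -> bool:
    """Syntactic check: does the source parse and compile?"""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


def behaves(source: str, pdu: bytes, expected: int) -> bool:
    """Behavioral check: does the function return the expected length?"""
    namespace: dict = {}
    exec(source, namespace)
    return namespace["payload_length"](pdu) == expected


pdu = bytes([0x01, 0x02])  # length field 0x0102 = 258

print(compiles(CORRECT), behaves(CORRECT, pdu, 258))  # True True
print(compiles(FAULTY), behaves(FAULTY, pdu, 258))    # True False
```

Test vector choice matters here: for a PDU such as `bytes([0x00, 0x05])` both candidates return 5, so a behavioral check with only small lengths would miss the fault.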

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the thoughtful review and for highlighting the distinction between syntactic generation success and demonstrated runtime behavior. We agree that the current evaluation leaves the functional correctness of the generated blocks as an assumption rather than a verified result. We will strengthen the manuscript by adding execution-based validation while preserving the original focus on code-generation factors.

point-by-point responses (1)
  1. Referee: The evaluation reports only whether generated code compiles or matches syntactic expectations across models, prompts, and templates. No execution tests against real or synthetic PDUs, no behavioral correctness checks, and no security or performance analysis are described. Consequently the claim that the blocks exhibit 'desired behavior on-demand' rests on an unverified assumption rather than demonstrated functionality.

    Authors: We acknowledge that our reported metrics are limited to syntactic validity and successful compilation. The manuscript does not contain runtime tests that feed synthetic or captured PDUs into the generated blocks and verify that the intended inspection, decoding, and action logic execute correctly. We will add a new experimental subsection that (i) constructs minimal synthetic PDUs matching the vertical-service descriptions, (ii) executes the generated code in an isolated user-plane simulator, and (iii) reports pass/fail rates for behavioral correctness. We will also include a brief discussion of security considerations (e.g., input sanitization) and note performance overhead as future work. These additions will be placed after the existing generation-accuracy results so that the original analysis of model, prompt, and template effects remains intact.

    Revision: yes
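
Steps (i) to (iii) of the planned subsection could be approximated along the following lines. This is a sketch under strong assumptions: a child Python interpreter stands in for the "isolated user-plane simulator", the candidates and PDUs are invented, and process separation with a timeout limits crashes and hangs but is not a security sandbox.

```python
import json
import subprocess
import sys

# Child-side runner: reads candidate source from stdin and hex PDUs from argv.
RUNNER = """
import json, sys
namespace = {}
exec(sys.stdin.read(), namespace)
pdus = [bytes.fromhex(h) for h in json.loads(sys.argv[1])]
print(json.dumps([namespace["process"](p) for p in pdus]))
"""


def run_isolated(source, pdus, timeout=5.0):
    """Run one candidate in a child interpreter; None means crash or timeout."""
    args = [sys.executable, "-I", "-c", RUNNER, json.dumps([p.hex() for p in pdus])]
    try:
        proc = subprocess.run(args, input=source, capture_output=True,
                              text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None
    try:
        return json.loads(proc.stdout)
    except ValueError:  # candidate printed something that is not JSON
        return None


def pass_rate(candidates, pdus, expected, timeout=5.0):
    """Fraction of candidates whose outputs match `expected` on every PDU."""
    passed = sum(run_isolated(src, pdus, timeout) == expected for src in candidates)
    return passed / len(candidates)


# Hypothetical candidates for "forward PDUs of type 1, drop everything else".
GOOD = 'def process(pdu):\n    return "forward" if pdu[0] == 1 else "drop"\n'
WRONG = 'def process(pdu):\n    return "forward"\n'
CRASH = 'def process(pdu):\n    return pdu[99]\n'

PDUS = [b"\x01\xaa", b"\x02\xbb"]
EXPECTED = ["forward", "drop"]
```

With these inputs, `pass_rate([GOOD, WRONG, CRASH], PDUS, EXPECTED)` counts one passing candidate out of three; the wrong and the crashing candidate are both recorded as failures without affecting the parent process.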

Circularity Check

0 steps flagged

No circularity: empirical evaluation of external AI models

The paper reports experimental success rates for code generation by third-party AI models (model selection, prompt variants, templates) when asked to produce user-plane processing blocks. No equations, fitted parameters, or self-referential definitions appear; the central claim is an observed empirical outcome rather than a derivation that reduces to its own inputs by construction. No load-bearing self-citations or uniqueness theorems are invoked. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The claim rests primarily on the domain assumption that current generative models possess sufficient capability for protocol-aware code synthesis when prompted appropriately; no free parameters or new entities are introduced.

axioms (1)
  • Domain assumption: Generative AI models can produce correct, functional code for network protocol handling tasks when provided with suitable prompts and templates.
    Central to the reported findings on on-demand generation success under suitable conditions.

pith-pipeline@v0.9.0 · 5536 in / 1177 out tokens · 34888 ms · 2026-05-15T00:27:55.340236+00:00 · methodology

