Customized User Plane Processing via Code Generating AI Agents for Next Generation Mobile Networks
Pith reviewed 2026-05-15 00:27 UTC · model grok-4.3
The pith
AI agents can generate on-demand code for customized user plane processing blocks in mobile networks when given suitable models, prompts, and templates.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Generative AI agents equipped with code generation capabilities can produce functional code for new user plane processing blocks that decode protocol data units and execute application-specified actions, with accuracy depending on model selection, prompt design, and the availability of a code template.
What carries the argument
Code-generating AI agents that convert text-based service requests into executable user plane function blocks for inspecting and acting on protocol data units.
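To make the notion of a "user plane function block" concrete, the sketch below shows the kind of code such an agent might be asked to emit for a request like "drop UDP traffic addressed to port 5000". It is an illustration only, not code from the paper: the names `process_pdu`, `Verdict`, `build_ipv4_udp`, and the port number are assumptions, and the PDU is assumed to be a raw IPv4 packet.

```python
import struct
from dataclasses import dataclass

IPPROTO_UDP = 17  # IANA protocol number for UDP


@dataclass(frozen=True)
class Verdict:
    action: str  # "forward" or "drop"
    reason: str


def process_pdu(pdu: bytes, blocked_udp_port: int = 5000) -> Verdict:
    """Hypothetical block: drop UDP datagrams sent to blocked_udp_port."""
    if len(pdu) < 20:
        return Verdict("drop", "truncated IPv4 header")
    version = pdu[0] >> 4
    header_len = (pdu[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    if version != 4 or header_len < 20:
        return Verdict("drop", "not a valid IPv4 header")
    if pdu[9] != IPPROTO_UDP:  # protocol field sits at byte offset 9
        return Verdict("forward", "not UDP")
    if len(pdu) < header_len + 8:
        return Verdict("drop", "truncated UDP header")
    _src_port, dst_port = struct.unpack_from("!HH", pdu, header_len)
    if dst_port == blocked_udp_port:
        return Verdict("drop", "blocked destination port")
    return Verdict("forward", "allowed")


def build_ipv4_udp(dst_port: int, payload: bytes = b"") -> bytes:
    """Build a minimal synthetic IPv4/UDP PDU (checksums left at zero)."""
    udp = struct.pack("!HHHH", 40000, dst_port, 8 + len(payload), 0) + payload
    ip = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(udp), 0, 0, 64, IPPROTO_UDP, 0,
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    )
    return ip + udp
```

Calling `process_pdu(build_ipv4_udp(5000))` yields a drop verdict, while a datagram to any other port is forwarded. Even a block this small contains several length and offset checks that a generated version could get wrong.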
If this is right
- Networks can add new user plane functions rapidly in response to application requests instead of relying on pre-installed blocks.
- Vertical applications gain direct influence over traffic handling through natural language specifications.
- Development effort for custom connectivity services decreases because code is produced automatically.
- 6G systems can adapt their data plane behavior on shorter timescales than manual reconfiguration allows.
Where Pith is reading between the lines
- Similar code-generation methods could extend to control-plane functions or orchestration tasks if safety constraints are met.
- Live deployment would likely require automated verification steps to catch security or performance problems before activation.
- The approach might reduce the need for extensive pre-standardized function libraries by allowing on-the-fly creation.
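One form the automated verification mentioned above could take is a static gate that rejects generated source before it is ever loaded. The sketch below uses Python's standard `ast` module; the allowlist, the forbidden-call list, and the function name `static_violations` are assumptions chosen for illustration. A check like this is easy to bypass and would be only one layer of a real pre-activation pipeline, not a security boundary on its own.

```python
import ast
from typing import List

ALLOWED_IMPORTS = {"struct", "dataclasses", "typing"}  # hypothetical allowlist
FORBIDDEN_CALLS = {"eval", "exec", "open", "__import__", "compile"}


def static_violations(source: str) -> List[str]:
    """Return human-readable policy violations found in generated source."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error at line {exc.lineno}"]
    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root not in ALLOWED_IMPORTS:
                    violations.append(f"import of '{alias.name}' is not allowed")
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            if root not in ALLOWED_IMPORTS:
                violations.append(f"import from '{node.module}' is not allowed")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by name are caught; aliased calls are not.
            if node.func.id in FORBIDDEN_CALLS:
                violations.append(f"call to '{node.func.id}' is not allowed")
    return violations
```

An empty list means the source passed this particular gate; it says nothing about whether the block behaves as requested.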
Load-bearing premise
The generated code will correctly and safely process real protocol data units when integrated into live network environments.
What would settle it
Running the generated blocks against live or emulated traffic streams and checking whether they perform the exact requested inspections and actions without runtime errors or incorrect outputs.
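A minimal sketch of such a check, assuming each block is a Python callable that takes a raw IPv4 PDU and returns an action string. The request used here ("drop ICMP, forward everything else"), the helper names, and both candidate blocks are hypothetical; the harness treats runtime errors and wrong outputs as failures alike.

```python
from typing import Callable, List, Tuple

Case = Tuple[str, bytes, str]  # (name, pdu, expected action)


def ipv4_header(protocol: int) -> bytes:
    """Minimal 20-byte IPv4 header with the given protocol number."""
    header = bytearray(20)
    header[0] = 0x45      # version 4, IHL 5
    header[8] = 64        # TTL
    header[9] = protocol  # protocol field lives at byte offset 9
    return bytes(header)


CASES: List[Case] = [
    ("icmp is dropped", ipv4_header(1), "drop"),
    ("udp is forwarded", ipv4_header(17), "forward"),
    ("truncated pdu is dropped", b"\x45\x00", "drop"),
]


def run_cases(block: Callable[[bytes], str], cases: List[Case]) -> List[str]:
    """Return failure descriptions; an empty list means every case passed."""
    failures = []
    for name, pdu, expected in cases:
        try:
            got = block(pdu)
        except Exception as exc:  # runtime errors count as failures
            failures.append(f"{name}: raised {type(exc).__name__}")
            continue
        if got != expected:
            failures.append(f"{name}: expected {expected}, got {got}")
    return failures


def correct_block(pdu: bytes) -> str:
    if len(pdu) < 20:
        return "drop"
    return "drop" if pdu[9] == 1 else "forward"


def buggy_block(pdu: bytes) -> str:
    # Reads the TTL byte instead of the protocol byte and skips the length check.
    return "drop" if pdu[8] == 1 else "forward"
```

Here `run_cases(correct_block, CASES)` returns an empty list, whereas `buggy_block` fails two cases: one with a wrong action and one with an `IndexError`. Both candidates are syntactically valid, which is why a compile-only evaluation cannot tell them apart.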
Original abstract
Generative AI is envisioned to have a crucial impact on next generation mobile networking, making the sixth generation (6G) system considerably more autonomous, flexible, and adaptive than its predecessors. By leveraging their natural language processing and code generation capabilities, AI agents enable novel interactions and services between networks and vertical applications. A particularly promising and interesting use case is the customization of connectivity services for vertical applications by generating new customized processing blocks based on text-based service requests. More specifically, AI agents are able to generate code for a new function block that handles user plane traffic, allowing it to inspect and decode a protocol data unit (PDU) and perform specified actions as requested by the application. In this study, we investigate the code generation problem for generating such customized processing blocks on-demand. We evaluate various factors affecting the accuracy of the code generation process in this context, including model selection, prompt design, and the provision of a code template for the agent to utilize. Our findings indicate that AI agents are capable of generating such blocks with the desired behavior on-demand under suitable conditions. We believe that exploring the code generation for network-specific tasks is a very interesting problem for 6G and beyond, enabling networks to achieve a new level of customization by generating new capabilities on-demand.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper investigates the use of code-generating AI agents to produce customized user-plane processing blocks for 6G networks. Given a natural-language service request, an agent is prompted to emit code that inspects/decodes a protocol data unit (PDU) and performs application-specified actions. The study evaluates the effects of model selection, prompt design, and the provision of a code template on generation accuracy and reports that suitable combinations allow agents to produce blocks exhibiting the desired behavior on demand.
Significance. If the central claim were demonstrated, the work would open a path toward on-demand, text-driven customization of network functions, reducing manual development cycles for vertical-specific processing. The approach is timely for 6G autonomy goals. However, because the manuscript supplies only generation-success statistics and no runtime verification, the practical significance remains prospective rather than established.
major comments (1)
- [Evaluation / Results (abstract and main experimental sections)] The evaluation reports only whether generated code compiles or matches syntactic expectations across models, prompts, and templates. No execution tests against real or synthetic PDUs, no behavioral correctness checks, and no security or performance analysis are described. Consequently the claim that the blocks exhibit 'desired behavior on-demand' rests on an unverified assumption rather than demonstrated functionality.
minor comments (1)
- [Abstract] The abstract reports positive findings on the factors affecting accuracy yet supplies no numerical values, error rates, or dataset sizes, making it impossible for a reader to gauge effect sizes from the summary alone.
Simulated Author's Rebuttal
We thank the referee for the thoughtful review and for highlighting the distinction between syntactic generation success and demonstrated runtime behavior. We agree that the current evaluation leaves the functional correctness of the generated blocks as an assumption rather than a verified result. We will strengthen the manuscript by adding execution-based validation while preserving the original focus on code-generation factors.
Point-by-point responses
Referee: The evaluation reports only whether generated code compiles or matches syntactic expectations across models, prompts, and templates. No execution tests against real or synthetic PDUs, no behavioral correctness checks, and no security or performance analysis are described. Consequently the claim that the blocks exhibit 'desired behavior on-demand' rests on an unverified assumption rather than demonstrated functionality.
Authors: We acknowledge that our reported metrics are limited to syntactic validity and successful compilation. The manuscript does not contain runtime tests that feed synthetic or captured PDUs into the generated blocks and verify that the intended inspection, decoding, and action logic execute correctly. We will add a new experimental subsection that (i) constructs minimal synthetic PDUs matching the vertical-service descriptions, (ii) executes the generated code in an isolated user-plane simulator, and (iii) reports pass/fail rates for behavioral correctness. We will also include a brief discussion of security considerations (e.g., input sanitization) and note performance overhead as future work. These additions will be placed after the existing generation-accuracy results so that the original analysis of model, prompt, and template effects remains intact.
Revision: yes
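The gap between the existing metric and the proposed one can be sketched as a small evaluation loop that reports a syntactic rate and a behavioral rate side by side. Everything below is hypothetical: the three candidate sources stand in for agent outputs, `process_pdu` is an assumed entry-point name, and the request is "drop ICMP, forward everything else". In-process `exec` is used only to keep the sketch short and provides no isolation; the isolated simulator the authors describe would need a separate process or container.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Stand-ins for agent outputs (hypothetical, not taken from the paper).
CANDIDATES: Dict[str, str] = {
    "syntax_error": "def process_pdu(pdu):\n    return 'drop' if\n",
    "wrong_offset": (
        "def process_pdu(pdu):\n"
        "    if len(pdu) < 20:\n"
        "        return 'drop'\n"
        "    return 'drop' if pdu[8] == 1 else 'forward'\n"
    ),
    "correct": (
        "def process_pdu(pdu):\n"
        "    if len(pdu) < 20:\n"
        "        return 'drop'\n"
        "    return 'drop' if pdu[9] == 1 else 'forward'\n"
    ),
}


def make_pdu(protocol: int) -> bytes:
    """Minimal synthetic IPv4 header; protocol field is at byte offset 9."""
    header = bytearray(20)
    header[0] = 0x45
    header[8] = 64
    header[9] = protocol
    return bytes(header)


CASES: List[Tuple[bytes, str]] = [
    (make_pdu(1), "drop"),
    (make_pdu(6), "forward"),
    (b"\x45", "drop"),
]


def load_block(source: str) -> Optional[Callable[[bytes], str]]:
    """Compile and load process_pdu; return None when the source is unusable."""
    try:
        code = compile(source, "<generated>", "exec")
    except SyntaxError:
        return None
    namespace: dict = {}
    try:
        exec(code, namespace)  # NOT a sandbox; illustration only
    except Exception:
        return None
    block = namespace.get("process_pdu")
    return block if callable(block) else None


def behaves_correctly(block: Callable[[bytes], str],
                      cases: List[Tuple[bytes, str]]) -> bool:
    for pdu, expected in cases:
        try:
            if block(pdu) != expected:
                return False
        except Exception:
            return False
    return True


def evaluate(candidates: Dict[str, str],
             cases: List[Tuple[bytes, str]]) -> Dict[str, float]:
    loaded = [load_block(src) for src in candidates.values()]
    usable = [b for b in loaded if b is not None]
    passed = [b for b in usable if behaves_correctly(b, cases)]
    total = len(candidates)
    return {"syntactic_rate": len(usable) / total,
            "behavior_rate": len(passed) / total}
```

With these three stand-ins, two candidates load successfully but only one passes every case, so the syntactic rate overstates the behavioral rate. That difference is the quantity the referee asks the authors to report.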
Circularity Check
No circularity: empirical evaluation of external AI models
The paper reports experimental success rates for code generation by third-party AI models (model selection, prompt variants, templates) when asked to produce user-plane processing blocks. No equations, fitted parameters, or self-referential definitions appear; the central claim is an observed empirical outcome rather than a derivation that reduces to its own inputs by construction. No load-bearing self-citations or uniqueness theorems are invoked. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Generative AI models can produce correct, functional code for network protocol handling tasks when provided with suitable prompts and templates.