Towards Agentic Test-Driven Quality Assurance for 6G Networks
Pith reviewed 2026-05-08 07:20 UTC · model grok-4.3
The pith
An agentic framework uses TM Forum models to convert user intent into auditable 6G service specifications with derived tests.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The architecture enables deterministic graph traversal from high-level Product Offerings down to granular Service/Resource and Test specifications using TM Forum information models, allowing autonomous agents to perform intent co-creation and derive tests upfront in an end-to-end orchestration framework for 6G networks.
What carries the argument
TM Forum information models and catalogs that support deterministic graph traversal from high-level offerings to service, resource, and test specifications.
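The traversal idea can be made concrete with a minimal sketch. It assumes the catalog is available as a plain adjacency map; the specification IDs and the `traverse` and `reachable_test_specs` helpers are hypothetical and are not taken from the paper, OpenSlice, or any TM Forum API. Visiting neighbours in sorted order makes the result depend only on the catalog contents, which is one reasonable reading of "deterministic" here.

```python
from collections import deque

# Hypothetical catalog graph: each key is a specification ID and each value
# lists the specifications it references. All IDs are illustrative only.
CATALOG = {
    "ProductOffering:urllc-slice": ["ServiceSpec:cfs-urllc"],
    "ServiceSpec:cfs-urllc": [
        "ServiceSpec:rfs-ran",
        "ServiceSpec:rfs-core",
        "TestSpec:latency",
    ],
    "ServiceSpec:rfs-ran": ["ResourceSpec:gnb"],
    "ServiceSpec:rfs-core": ["ResourceSpec:upf", "TestSpec:throughput"],
    "ResourceSpec:gnb": [],
    "ResourceSpec:upf": [],
    "TestSpec:latency": [],
    "TestSpec:throughput": [],
}


def traverse(catalog, root):
    """Breadth-first walk that visits neighbours in sorted order, so the
    visit order is fully determined by the catalog contents."""
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in sorted(catalog.get(node, [])):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return order


def reachable_test_specs(catalog, root):
    """Collect the test specifications reachable from a product offering."""
    return [n for n in traverse(catalog, root) if n.startswith("TestSpec:")]


if __name__ == "__main__":
    print(reachable_test_specs(CATALOG, "ProductOffering:urllc-slice"))
```

No LLM is involved in this step: given the same catalog and the same starting offering, the walk always returns the same specifications in the same order.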
If this is right
- Agents can automatically derive validation tests from refined intents before any network provisioning occurs.
- The system produces auditable specifications that directly support proactive SLA compliance in 6G services.
- Message-driven multi-agent patterns combined with knowledge retrieval allow iterative intent refinement grounded in standardized models.
- Evaluation of multiple LLM backends with the TMF knowledge base reveals variability in tool-use reliability that must be addressed for full orchestration.
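A rough sketch of test-first derivation, under stated assumptions: the confirmed intent is taken to reduce to a list of SLA targets, and every name here (`SlaTarget`, `DerivedTest`, `derive_tests`, `ready_to_accept`) as well as the metric names and thresholds is hypothetical rather than drawn from the paper or from TMF653. The point it illustrates is ordering: the checks exist before any measurement does, as in Test-Driven Development.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlaTarget:
    metric: str       # hypothetical metric name, e.g. "latency_ms"
    comparator: str   # "<=" or ">="
    threshold: float


@dataclass(frozen=True)
class DerivedTest:
    name: str
    metric: str
    comparator: str
    threshold: float

    def passes(self, measured: float) -> bool:
        if self.comparator == "<=":
            return measured <= self.threshold
        if self.comparator == ">=":
            return measured >= self.threshold
        raise ValueError(f"unsupported comparator: {self.comparator}")


def derive_tests(targets):
    """Turn each SLA target of a confirmed intent into one executable check."""
    return [
        DerivedTest(f"check_{t.metric}", t.metric, t.comparator, t.threshold)
        for t in targets
    ]


def ready_to_accept(tests, measurements):
    """True only if every derived test has a measurement and passes it."""
    return all(
        t.metric in measurements and t.passes(measurements[t.metric])
        for t in tests
    )


# Illustrative intent with placeholder thresholds, defined before provisioning.
intent_targets = [
    SlaTarget("latency_ms", "<=", 10.0),
    SlaTarget("throughput_mbps", ">=", 100.0),
]
tests = derive_tests(intent_targets)
```

A missing measurement counts as a failure in this sketch, so a service cannot be accepted merely because a check was never run.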
Where Pith is reading between the lines
- If the traversal works as described, network operators could reduce post-deployment issues by enforcing test coverage at the intent stage rather than after rollout.
- The observed LLM variability points to a need for hybrid approaches that combine agents with stricter rule-based checks on model outputs.
- This intent-to-test mapping could be adapted to other complex systems beyond 6G, such as cloud service orchestration, where standards-based graphs already exist.
Load-bearing premise
LLM-based agents can reliably handle intent co-creation and tool use without major hallucinations or inconsistent outputs.
What would settle it
A controlled test run of the prototype where agents produce correct test derivations and auditable specifications for multiple intents without observed hallucinations or traversal errors would support the claim; repeated failures in tool access or inconsistent mappings would refute it.
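One way such a controlled run could be tallied is sketched below. The record format, the backend labels, and the `failure_rates` helper are assumptions made for illustration; the outcomes listed are synthetic placeholders, not results from the paper.

```python
from collections import Counter


def failure_rates(runs):
    """Compute the fraction of failed runs per backend.

    Each run is a (backend, ok) pair, where ok is True when the agent
    produced a correct, fully grounded derivation for that intent.
    """
    totals, failures = Counter(), Counter()
    for backend, ok in runs:
        totals[backend] += 1
        if not ok:
            failures[backend] += 1
    return {backend: failures[backend] / totals[backend] for backend in totals}


# Synthetic placeholder records, NOT measurements from the paper.
example_runs = [
    ("backend-a", True),
    ("backend-a", False),
    ("backend-b", True),
    ("backend-b", True),
]
rates = failure_rates(example_runs)
```

Reporting the rate per backend, rather than one pooled figure, keeps the variability between LLM backends visible.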
Original abstract
This work proposes an agentic, intent-driven end-to-end (E2E) orchestration framework that integrates intent co-creation with a Test-Driven Quality Assurance paradigm. In this framework, autonomous agents iteratively refine a user's initial intent into a confirmed, auditable specification. Furthermore, the system automatically derives validation tests from these intents before provisioning, directly mirroring the Test-Driven Development workflow in software engineering to ensure proactive Service Level Agreement (SLA) compliance. The architecture is grounded in a standards-aligned knowledge representation using TM Forum (TMF) information models and catalogs. This enables deterministic graph traversal from high-level Product Offerings down to granular Service/Resource and Test specifications. We prototyped this architecture by extending OpenSlice with a message-driven, multi-agent pattern and integrating MCP-enabled (Model Context Protocol) tool access for real-time knowledge retrieval. Currently, our evaluation of the agents targets the intent co-creation phase as a baseline toward full-scale orchestration. Building on experiments with multiple open-source Large Language Model (LLM) backends integrated with the TMF-based knowledge base, we observe substantial variability in tool-use reliability and hallucination patterns, underscoring the critical importance of robust knowledge integration in agentic 6G systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This paper proposes an agentic, intent-driven end-to-end orchestration framework for 6G networks that integrates LLM-based intent co-creation with a Test-Driven Quality Assurance paradigm. It grounds the architecture in TM Forum information models and catalogs to claim deterministic graph traversal from high-level Product Offerings to granular Service/Resource and Test specifications. A prototype extending OpenSlice with message-driven multi-agent patterns and MCP-enabled tool access is presented, with initial evaluation limited to the intent co-creation phase that reports substantial variability in LLM tool-use reliability and hallucination patterns.
Significance. If the framework's determinism and reliability can be established, it would advance intent-based 6G networking by adapting TDD principles to ensure proactive SLA compliance via automated test derivation from standards-aligned models. The explicit grounding in TMF catalogs and the prototype observations on LLM variability provide concrete starting points for agentic systems research; however, the conceptual nature and limited validation currently constrain its immediate applicability.
major comments (2)
- [Abstract] The central claim that TMF information models 'enable deterministic graph traversal from high-level Product Offerings down to granular Service/Resource and Test specifications' is load-bearing, yet the same paragraph reports 'substantial variability in tool-use reliability and hallucination patterns' for the LLM agents that perform intent co-creation and mediate access to the knowledge base via MCP tools. This non-determinism in the traversal mechanism directly conflicts with the determinism asserted for the overall framework and requires explicit resolution (e.g., via deterministic wrappers, verification steps, or revised claims).
- [Prototype evaluation] The evaluation is described only as targeting 'the intent co-creation phase as a baseline toward full-scale orchestration,' with no quantitative results, datasets, or metrics provided for the downstream steps of test derivation, provisioning, or SLA compliance checking. This leaves the end-to-end effectiveness of the Test-Driven QA paradigm unsupported, which is essential to the paper's primary contribution.
minor comments (1)
- [Abstract] The acronym 'MCP' is expanded only in a parenthetical placed after the compound 'MCP-enabled'; defining it before first use, with a brief reference, would improve accessibility for readers unfamiliar with the tool-access mechanism.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below with point-by-point responses and indicate the planned revisions.
- Referee: The central claim that TMF information models 'enable deterministic graph traversal from high-level Product Offerings down to granular Service/Resource and Test specifications' is load-bearing, yet the same paragraph reports 'substantial variability in tool-use reliability and hallucination patterns' for the LLM agents that perform intent co-creation and mediate access to the knowledge base via MCP tools. This non-determinism in the traversal mechanism directly conflicts with the determinism asserted for the overall framework and requires explicit resolution (e.g., via deterministic wrappers, verification steps, or revised claims).
  Authors: We appreciate the referee highlighting this tension in the abstract. The determinism claim specifically concerns the TMF information models and catalogs, which define fixed, standards-based relationships enabling precise traversal between entities once correctly identified. The observed variability and hallucination patterns relate exclusively to the LLM agents' tool invocation, intent refinement, and result interpretation steps, not to the underlying graph traversal logic. We will revise the abstract to explicitly distinguish these aspects and incorporate a brief description of mitigation strategies, such as output verification against the knowledge graph. This change will be made in the revised version.
  Revision: yes
- Referee: The evaluation is described only as targeting 'the intent co-creation phase as a baseline toward full-scale orchestration,' with no quantitative results, datasets, or metrics provided for the downstream steps of test derivation, provisioning, or SLA compliance checking. This leaves the end-to-end effectiveness of the Test-Driven QA paradigm unsupported, which is essential to the paper's primary contribution.
  Authors: We agree that the presented evaluation is limited to the intent co-creation phase, as explicitly stated in the manuscript, and serves as a baseline. The architecture for downstream steps (test derivation, provisioning, and SLA checking) is described in detail but has not received quantitative prototype evaluation at this stage. In revision, we will expand the evaluation section to include additional architectural details, any preliminary simulation results for test specification generation, and explicit discussion of current scope limitations with full end-to-end metrics positioned as future work. This will better contextualize the contribution without overstating current results.
  Revision: partial
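The mitigation named in the first response, output verification against the knowledge graph, might look roughly like the following. This is a sketch under assumptions: the catalog is a plain adjacency map, the agent returns a path of specification IDs, and both the IDs and the `verify_agent_output` helper are hypothetical rather than part of the authors' prototype.

```python
# Hypothetical catalog; IDs are illustrative and not taken from the paper.
KNOWN_SPECS = {
    "ProductOffering:example": ["ServiceSpec:example"],
    "ServiceSpec:example": ["ResourceSpec:example", "TestSpec:example"],
    "ResourceSpec:example": [],
    "TestSpec:example": [],
}


def verify_agent_output(catalog, proposed_path):
    """Return a list of problems; an empty list means the path is grounded.

    Two rule-based checks are applied to the agent's proposed path:
    every node must exist in the catalog, and every consecutive pair
    must correspond to a relation the catalog actually defines.
    """
    if not proposed_path:
        return ["empty path"]
    problems = []
    for node in proposed_path:
        if node not in catalog:
            problems.append(f"unknown specification: {node}")
    for parent, child in zip(proposed_path, proposed_path[1:]):
        # Only check relations between known nodes; unknown ones are
        # already reported above.
        if parent in catalog and child in catalog and child not in catalog[parent]:
            problems.append(f"no catalog relation: {parent} -> {child}")
    return problems
```

A wrapper of this kind leaves the agent free to propose, while acceptance of any proposal is decided by a deterministic check against the catalog.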
Circularity Check
No circularity: framework proposal with no derivations or fitted parameters
The paper presents an architectural framework proposal for agentic orchestration in 6G networks, grounded in TM Forum information models for graph traversal and LLM-based agents for intent refinement. No equations, parameters, or mathematical derivations are present in the provided text. The central claim of deterministic traversal is asserted as a property of the TMF standards alignment rather than derived from any self-referential construction or fitted input. Evaluation observations on LLM variability are reported as empirical findings, not used to retroactively define the determinism. This is a self-contained conceptual design with external grounding in standards, yielding no circular steps.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: TM Forum information models and catalogs provide sufficient structure for deterministic graph traversal from Product Offerings to Test specifications.
invented entities (1)
- Agentic intent co-creation and test derivation system: no independent evidence