arxiv: 2606.29742 · v1 · pith:FRCYX3ZJnew · submitted 2026-06-29 · 💻 cs.SE

MicroAgent: Context-Augmented Multi-Agent Framework for Automatic Microservice Decomposition

Pith reviewed 2026-06-30 05:43 UTC · model grok-4.3

classification 💻 cs.SE
keywords microservice decomposition · multi-agent framework · monolithic applications · context augmentation · software architecture migration · Java web applications · design principles · large language models

The pith

MicroAgent divides microservice decomposition into five subtasks handled by specialized agents with multi-granularity context and analytical tools to reach 89.2% average accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes MicroAgent as a multi-agent framework that splits the task of breaking monolithic applications into microservices into five subtasks, each assigned to a dedicated agent. Tailored multi-granularity context keeps each agent focused while analytical tools enforce design principles and reduce information overload. Evaluations across 10 Java Web applications show the framework delivers 89.2% average decomposition accuracy, exceeding the prior best method by 24.6%.

Core claim

MicroAgent divides the decomposition process into five distinct subtasks and assigns each to a specialized agent. Each agent receives tailored multi-granularity context to stay focused and integrates analytical tools to guide decisions according to established design principles. This produces an average decomposition accuracy of 89.2% on 10 Java Web applications, 24.6% above the state-of-the-art baseline.
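The text above does not say whether "24.6% above" is a relative gain or a gap in percentage points. As a hedged illustration only, the sketch below computes the baseline accuracy each reading would imply; the function name `implied_baseline` is hypothetical and not taken from the paper.

```python
def implied_baseline(score: float, gain: float, relative: bool) -> float:
    """Baseline accuracy (in percent) implied by a reported score and gain.

    relative=True  reads the gain as a relative improvement: score = baseline * (1 + gain/100).
    relative=False reads the gain as percentage points:      score = baseline + gain.
    """
    if relative:
        return score / (1 + gain / 100)
    return score - gain


# Reported figures from the summary: 89.2% accuracy, 24.6% improvement.
print(round(implied_baseline(89.2, 24.6, relative=True), 1))   # 71.6
print(round(implied_baseline(89.2, 24.6, relative=False), 1))  # 64.6
```

The two readings imply baselines about seven points apart, so the reading intended by the paper matters when comparing against other work.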

What carries the argument

The five-subtask division, in which each specialized agent is supplied with multi-granularity context and with analytical tools that enforce design principles.

If this is right

  • Developers gain an automated route to partition legacy monoliths that captures semantic relationships more reliably than prior automated techniques.
  • The subtask structure and tool integration produce decompositions that better satisfy cohesion and design principles.
  • The reported 24.6% accuracy lift holds across the 10 evaluated Java Web applications.
  • A case study confirms the framework yields decompositions with measurable practical benefits.
  • Information overload is mitigated for each agent through context tailoring.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same agent-plus-context pattern could be tested on codebases written in languages other than Java to check cross-language portability.
  • Embedding the framework inside continuous-integration pipelines might allow incremental decomposition as code evolves.
  • The accuracy numbers rest on the specific choice of five subtasks; altering that number on new applications would test whether the count itself is load-bearing.
  • Combining agent outputs with targeted human review at key decision points could further raise accuracy beyond the fully automated results.

Load-bearing premise

The five-subtask division together with multi-granularity context and analytical tools is sufficient to keep agents focused and enforce design principles without systematic biases or missed semantic relationships.

What would settle it

Re-running the evaluations on a fresh set of 10 Java applications and obtaining accuracy below 80% or no gain over the baseline method would challenge the reported superiority.

Figures

Figures reproduced from arXiv: 2606.29742 by Hui Zeng, Junjie Huang, Michael R. Lyu, Shiwen Shan, Xingyan Chen, Yanlin Wang, Yuxin Su, Zishan Su.

Figure 1: Motivating examples demonstrating the challenges

Figure 2: Overview of the multi-agent workflow of MicroAgent. Specifically, in the first subtask, Domain Identification, the Domain Agent is responsible for analyzing business logic and identifying candidate domains. Then, in Domain Clustering, the Clustering Agents are dynamically instantiated for each domain, where each agent collects and clusters domain-specific classes associated with its respective domain. To a…

Figure 3: Per-application WS across methods. Answer to RQ1: MicroAgent achieves satisfying results in both architectural metrics and similarity metrics. Our tool demonstrates a 24.6% improvement in the weighted similarity, and a substantial 409.6% improvement in $c2c_{cvg}$ ($th_{cvg} = 90\%$). This implies MicroAgent produces decomposition results that are closer to the ground truth decomposition, which are of much higher qua…

Figure 4: Demonstration of case study in gulimall applica
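The Figure 2 caption names only the first two subtasks (Domain Identification, then Domain Clustering with one Clustering Agent instantiated per domain); the remaining three are cut off in the caption and are not reconstructed here. The following is a minimal sketch of that two-stage control flow only. It is not the authors' implementation: the names `ClusteringAgent`, `identify_domains`, `decompose`, and the keyword `vocabulary` are all hypothetical, and plain keyword matching stands in for the LLM reasoning the real agents would perform.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class ClusteringAgent:
    """Hypothetical stand-in for one per-domain Clustering Agent."""
    domain: str
    keywords: Sequence[str]
    classes: List[str] = field(default_factory=list)

    def collect(self, class_names: Sequence[str]) -> List[str]:
        # Keyword matching is a placeholder for the LLM-driven clustering step.
        self.classes = [
            name for name in class_names
            if any(k in name.lower() for k in self.keywords)
        ]
        return self.classes


def identify_domains(class_names: Sequence[str],
                     vocabulary: Dict[str, Sequence[str]]) -> List[str]:
    """Hypothetical stand-in for the Domain Agent: keep domains with at least one matching class."""
    return [
        domain for domain, keywords in vocabulary.items()
        if any(k in name.lower() for name in class_names for k in keywords)
    ]


def decompose(class_names: Sequence[str],
              vocabulary: Dict[str, Sequence[str]]) -> Dict[str, List[str]]:
    domains = identify_domains(class_names, vocabulary)            # subtask 1: Domain Identification
    agents = [ClusteringAgent(d, vocabulary[d]) for d in domains]  # one agent created per domain
    return {a.domain: a.collect(class_names) for a in agents}      # subtask 2: Domain Clustering


result = decompose(
    ["OrderService", "OrderRepository", "PetController"],
    {"order": ["order"], "pet": ["pet"], "billing": ["invoice"]},
)
print(result)
```

The point of the sketch is the dynamic fan-out: the number of clustering agents is decided by the output of the first stage rather than fixed in advance, so a candidate domain with no matching classes (here, the made-up "billing") never gets an agent.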
Original abstract

The adoption of Microservice Architecture (MSA) has revolutionized software engineering by enhancing scalability, agility, and maintainability over traditional monolithic applications. As more developers transition their legacy systems to microservice-based architectures, effective microservice decomposition (partitioning monolithic applications into highly cohesive services) becomes vital. However, this decomposition task presents significant challenges. Manual approaches are time-consuming and labor-intensive. Existing automated methods often fail to capture the necessary semantic insights from complex applications, while naive applications of Large Language Models tend to overlook crucial contextual information and design principles, leading to suboptimal results. To address these challenges, we propose MicroAgent, a Context-Augmented Multi-Agent Framework for Microservice Decomposition. Our framework divides the decomposition process into five distinct subtasks and assigns each to a specialized agent. To enhance the effectiveness of each agent, we provide tailored, multi-granularity context that keeps its analysis focused and mitigates information overload. Furthermore, to ensure the decomposition adheres to established design principles, we integrate analytical tools that guide the agents' decision-making. Experimental evaluations on 10 Java Web applications demonstrate that MicroAgent achieves an average decomposition accuracy of 89.2%, outperforming the state-of-the-art method by 24.6%. We also conduct a case study to highlight the practical benefits of our design.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes MicroAgent, a multi-agent LLM framework that splits microservice decomposition into five subtasks (each handled by a specialized agent), supplies each agent with tailored multi-granularity context, and integrates analytical tools to enforce design principles. On 10 Java Web applications the framework is reported to reach 89.2% average decomposition accuracy, 24.6% above the prior state-of-the-art method; a case study is also presented.

Significance. If the accuracy metric and ground-truth construction are reproducible and non-circular, the result would demonstrate that structured multi-agent prompting plus external analysis tools can materially improve automated architectural refactoring. The explicit five-subtask division and tool integration constitute a concrete, testable design choice that could be adopted or extended by other refactoring tools.

major comments (1)
  1. [Evaluation section, and abstract claim] The manuscript reports an average accuracy of 89.2% and a 24.6% improvement but supplies no explicit definition of the accuracy metric, no description of how reference decompositions were obtained (expert judgment, automated proxy, or inter-rater protocol), and no statistical significance or reliability assessment. Without these details the headline quantitative result cannot be interpreted or reproduced, rendering the central performance claim load-bearing yet unverifiable.
minor comments (2)
  1. [Abstract] The abstract and introduction should cite the exact prior SOTA method being compared (name, reference, and year) rather than the generic phrase “state-of-the-art method.”
  2. [§3] Notation for the five subtasks and the multi-granularity context levels should be introduced once with a table or diagram so that later sections can refer to them unambiguously.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the evaluation methodology. We agree that the current presentation of results lacks necessary details for reproducibility and interpretability, and we will revise the manuscript to address this.

Point-by-point responses
  1. Referee: [Evaluation section, and abstract claim] The manuscript reports an average accuracy of 89.2% and a 24.6% improvement but supplies no explicit definition of the accuracy metric, no description of how reference decompositions were obtained (expert judgment, automated proxy, or inter-rater protocol), and no statistical significance or reliability assessment. Without these details the headline quantitative result cannot be interpreted or reproduced, rendering the central performance claim load-bearing yet unverifiable.

    Authors: We agree with the referee that these details are essential. In the revised manuscript we will: (1) provide an explicit, formal definition of the accuracy metric (including the matching criteria between proposed and reference decompositions); (2) describe the construction of the reference decompositions, including the expert judgment process, number of experts, and any inter-rater protocol or agreement statistics used; and (3) add statistical significance testing (e.g., paired comparisons with p-values) together with reliability measures. These additions will appear in a dedicated subsection of the Evaluation section and will be cross-referenced from the abstract and results tables. We will also make the ground-truth data and evaluation scripts available to support reproducibility.

    Revision: yes
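One way the promised "paired comparisons with p-values" could be run on per-application scores is an exact paired sign-flip permutation test. The sketch below is an assumption about a reasonable procedure, not the test the authors commit to; the function name `paired_sign_flip_p_value` is hypothetical, and no per-application scores are reproduced on this page, so none are supplied here.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_p_value(ours: Sequence[float], baseline: Sequence[float]) -> float:
    """Two-sided exact sign-flip permutation test on paired differences.

    Under the null hypothesis that the two methods are exchangeable on each
    application, the sign of every paired difference is equally likely to be
    positive or negative, so all 2**n sign assignments are enumerated.
    """
    if len(ours) != len(baseline) or len(ours) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [a - b for a, b in zip(ours, baseline)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point ties.
        if flipped >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

With 10 applications the enumeration covers 2^10 = 1024 sign assignments, so the smallest attainable two-sided p-value is 2/1024, roughly 0.002. An exact test of this kind avoids distributional assumptions, which is a consideration with only 10 paired observations.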

Circularity Check

0 steps flagged

No circularity; empirical claim on external applications is self-contained

The paper presents a multi-agent framework design and reports an experimental accuracy of 89.2% on 10 independent Java Web applications. No derivation chain, equations, fitted parameters renamed as predictions, or self-citation load-bearing steps appear in the abstract or described structure. The performance claim rests on external evaluation rather than reducing by construction to the framework inputs or prior self-citations. This is the standard non-circular outcome for an empirical systems paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

Abstract-only review; the framework itself constitutes the main invented contribution. No explicit free parameters, mathematical axioms, or independently evidenced entities are described. Design choices such as the exact five subtasks and context granularity levels are presented as engineering decisions without external validation.

invented entities (1)
  • MicroAgent framework (five specialized agents plus multi-granularity context and analytical tools): no independent evidence
    purpose: To perform automatic microservice decomposition while respecting design principles
    The central proposed artifact introduced to solve the stated problem; no independent evidence supplied in abstract.

pith-pipeline@v0.9.1-grok · 5785 in / 1205 out tokens · 35184 ms · 2026-06-30T05:43:59.347435+00:00 · methodology
