Multi-Agent Systems for Root Cause Analysis in Microservices

Alexander Naakka; Mika V. Mäntylä; Yuqing Wang

arxiv: 2605.03505 · v1 · submitted 2026-05-05 · 💻 cs.SE


Pith reviewed 2026-05-07 15:58 UTC · model grok-4.3

classification 💻 cs.SE

keywords root cause analysis, microservices, large language models, multi-agent systems, tree search, reflection scores, diagnostics, Light-OAuth2


The pith

LATS-RCA achieves high diagnostic accuracy on test microservice systems by using reflection-guided tree search with multiple LLM agents.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents LATS-RCA as a new way to automate root cause analysis in microservice architectures, where failures often span multiple services and linear diagnostic steps miss key connections. It recasts the task as a tree search in which separate agents examine logs and metrics for each service, then assign reflection scores to intermediate findings. These scores steer the search toward the most supported cause without requiring a single straight-line path. Evaluation on the Light-OAuth2 system shows strong accuracy while a live production deployment confirms the method can operate in real environments even as complexity reduces performance.

Core claim

LATS-RCA formulates root cause analysis as a reflection-guided tree-structured search using the Language Agent Tree Search algorithm. Multiple LLM agents iteratively reason over execution logs and performance metrics of individual microservices to gather operational evidence. Reflection scores computed from intermediate diagnostic states guide the search toward the most likely root cause. On the open-source Light-OAuth2 system the approach reaches high diagnostic accuracy with manageable computational cost; deployment in a production setting with greater scale and heterogeneity still demonstrates practical applicability while exposing challenges from polyglot technology stacks and varied logging practices.

What carries the argument

Language Agent Tree Search applied to RCA, in which multiple LLM agents collect evidence from logs and metrics and use reflection scores to guide and prune a diagnostic tree.
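
The mechanism can be illustrated with a minimal sketch of reflection-guided tree search. This is not the paper's implementation: the `Node` class, the `expand` and `reflect` callables, the UCT-style selection rule, and the toy hypotheses and scores are all assumptions made for illustration. Here `reflect` stands in for an LLM agent that scores an intermediate diagnostic state, and `expand` stands in for an agent that proposes narrower hypotheses.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass(eq=False, repr=False)
class Node:
    """One intermediate diagnostic state (hypothetical structure)."""
    hypothesis: str
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value: float = 0.0  # running mean of reflection scores backed up through this node


def uct(node: Node, exploration: float = 1.0) -> float:
    """UCT-style priority: mean value plus a bonus for rarely visited nodes."""
    if node.visits == 0:
        return math.inf  # an unscored child is always tried first
    return node.value + exploration * math.sqrt(
        math.log(node.parent.visits) / node.visits
    )


def search(
    root_hypothesis: str,
    expand: Callable[[str], List[str]],
    reflect: Callable[[str], float],
    iterations: int = 10,
) -> Tuple[str, float]:
    """Return the hypothesis with the highest reflection score seen so far."""
    root = Node(root_hypothesis)
    best, best_score = root_hypothesis, -math.inf
    for _ in range(iterations):
        # Selection: walk down the tree following the highest UCT priority.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # Expansion: a leaf that was already scored gets child hypotheses.
        if node.visits > 0:
            node.children = [Node(h, parent=node) for h in expand(node.hypothesis)]
            if node.children:
                node = node.children[0]
        # Reflection: score the selected diagnostic state.
        score = reflect(node.hypothesis)
        if score > best_score:
            best, best_score = node.hypothesis, score
        # Backpropagation: update running means up to the root.
        current: Optional[Node] = node
        while current is not None:
            current.visits += 1
            current.value += (score - current.value) / current.visits
            current = current.parent
    return best, best_score


# Toy stand-ins for the agents (invented data, not taken from the paper).
CANDIDATES = {
    "incident": ["gateway fault", "auth fault"],
    "auth fault": ["auth: token store timeout", "auth: bad config"],
}
SCORES = {
    "incident": 0.1,
    "gateway fault": 0.2,
    "auth fault": 0.6,
    "auth: token store timeout": 0.9,
    "auth: bad config": 0.3,
}


def expand_toy(hypothesis: str) -> List[str]:
    return CANDIDATES.get(hypothesis, [])


def reflect_toy(hypothesis: str) -> float:
    return SCORES[hypothesis]
```

Calling `search("incident", expand_toy, reflect_toy)` explores both service-level branches before descending into the better-scored one, which is the behaviour a single linear diagnostic path cannot provide.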

If this is right

High diagnostic accuracy is reached on the homogeneous Light-OAuth2 open-source system.
Computational costs are quantified and shown to be practical for the Light-OAuth2 evaluation.
Accuracy drops and costs rise when the same framework runs in a more complex production environment.
The approach still demonstrates applicability to real-world microservice systems despite polyglot stacks and multi-factor causes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same reflection-guided search structure could be applied to other distributed-system diagnostics that currently rely on linear log inspection.
Production accuracy might improve if agents receive additional signals from tracing systems that the current implementation does not use.
Limits on tree depth or agent parallelism may be needed to keep costs bounded as the number of microservices grows.
Inconsistent logging practices across components suggest that a preprocessing layer to normalize evidence could be a useful extension.
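
The preprocessing idea in the last point can be sketched as a small normalizer that maps differently formatted log lines onto one record type before agents see them. Both input formats below (a Java-style text line and a JSON line with `ts`, `level`, `service`, and `msg` keys) are hypothetical, as are the `LogRecord` and `normalize` names; the text format is assumed to carry UTC timestamps.

```python
import json
import re
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class LogRecord:
    """Common evidence format handed to the agents (hypothetical)."""
    timestamp: datetime
    level: str
    service: str
    message: str


# Assumed text layout: "2026-05-05 12:00:01,123 ERROR [service] message"
_TEXT_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(?P<ms>\d{3}) "
    r"(?P<level>[A-Z]+) \[(?P<service>[^\]]+)\] (?P<msg>.*)$"
)


def normalize(line: str) -> Optional[LogRecord]:
    """Parse one log line; return None when the format is not recognized."""
    line = line.strip()
    if line.startswith("{"):
        # Assumed JSON layout with an ISO 8601 timestamp including an offset.
        try:
            raw = json.loads(line)
            return LogRecord(
                timestamp=datetime.fromisoformat(raw["ts"]),
                level=str(raw["level"]).upper(),
                service=str(raw["service"]),
                message=str(raw["msg"]),
            )
        except (KeyError, ValueError):
            return None
    match = _TEXT_LINE.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S").replace(
        microsecond=int(match["ms"]) * 1000, tzinfo=timezone.utc
    )
    return LogRecord(timestamp, match["level"], match["service"], match["msg"])
```

Returning `None` for unrecognized lines keeps the failure visible to the caller, which can then count how much evidence was dropped per component.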

Load-bearing premise

LLM agents can reliably pull relevant evidence from logs and metrics, and the reflection scores they produce correctly rank the paths leading to the actual root cause.

What would settle it

A controlled fault injection where the method selects a wrong root cause even though the injected fault leaves clear, unambiguous traces in the available logs and metrics.

Figures

Figures reproduced from arXiv: 2605.03505 by Alexander Naakka, Mika V. Mäntylä, Yuqing Wang.

**Figure 1.** Illustrative LATS search tree schematic. The nodes ...

**Figure 2.** Search behavior: Dot plot comparing exploration ...

Original abstract

Recent advances in large language models (LLMs) have enabled early attempts to automate root cause analysis (RCA) in microservice-based systems (MSS). Yet, prior works typically rely on a linear reasoning process that proceeds along a single diagnostic path. In this paper, we propose LATS-RCA, an LLM-based multi-agent framework for RCA in MSS. LATS-RCA formulates RCA as a reflection-guided tree-structured search using a Language Agent Tree Search algorithm. In LATS-RCA, multiple LLM-driven agents iteratively perform RCA for each microservice by reasoning over its execution logs and performance metrics to collect operational evidence for root cause exploration. Reflection scores derived from intermediate diagnostic states are used to guide the search toward the most likely root cause based on accumulated evidence. We evaluate LATS-RCA on the open-source industrial MSS, Light-OAuth2 (LO2), using a publicly available dataset and in a production microservice environment (Prod) in a case company with substantially higher operational complexity. LO2 is a small-team Java system with a homogeneous technology stack. The results on LO2 show that LATS-RCA achieves high diagnostic accuracy, and we further benchmark its associated computational costs. Compared to LO2, Prod attains lower diagnostic accuracy and incurs higher computational cost. The Prod deployment demonstrates the practical applicability of LATS-RCA in real-world MSS and reflects the challenges introduced by polyglot tech stack, varied logging practices of source components, and multi-factor root-causes by production-scale MSS.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

LATS-RCA adds tree search and reflection scoring to LLM agents for microservice RCA, but the evaluation supplies no numbers or baselines so the accuracy claims cannot be checked.


The main thing to know is that this paper replaces the single-path reasoning common in prior LLM-based RCA work with a multi-agent Language Agent Tree Search that builds diagnostic trees and uses reflection scores to guide which branches to explore. That framing is the actual new piece. They describe agents that pull logs and metrics per microservice, accumulate evidence, and score intermediate states to steer toward the root cause.

The setup is applied to the public Light-OAuth2 dataset and to a real production system with higher complexity, polyglot components, and multi-factor failures. They also mention benchmarking compute costs on the smaller system. Those choices show they are thinking about practical constraints rather than just toy examples.

The soft spot is the results. The abstract states high accuracy on LO2 and lower accuracy plus higher cost on Prod, yet no quantitative figures, no baseline comparisons, no definition of how diagnostic accuracy was scored, and no description of how ground truth was established in the production case appear. Without those details the central claim that reflection scores reliably rank paths cannot be evaluated.

The method itself looks like a standard LATS pattern with no obvious internal contradictions or invented quantities. Citations to earlier linear LLM RCA papers are present and relevant.

This work is aimed at software engineering researchers and practitioners already experimenting with LLM agents for operations tasks. Someone looking for a concrete multi-agent template might extract useful implementation details, but anyone needing evidence that the tree search improves on prior approaches will find the paper incomplete. I would send it for peer review only after the authors add the missing metrics, baselines, and error analysis; the idea is clear enough to justify referee time once the evaluation is filled in.

Referee Report

2 major / 2 minor

Summary. The paper proposes LATS-RCA, a multi-agent LLM framework that casts root cause analysis in microservice systems as reflection-guided tree search over agent-generated diagnostic paths from logs and metrics. It evaluates the method on the public Light-OAuth2 (LO2) dataset, claiming high diagnostic accuracy, and reports a production deployment (Prod) that demonstrates practical applicability despite lower accuracy and higher cost due to polyglot stacks and multi-factor causes.

Significance. If the empirical claims are supported by detailed, reproducible metrics and baselines, the work would usefully extend LLM-agent search techniques to automated RCA and illustrate the gap between controlled benchmarks and production microservices. The explicit contrast between LO2 and Prod environments is a strength that could inform future deployment studies.

major comments (2)

[Abstract and Evaluation] The central claim that LATS-RCA 'achieves high diagnostic accuracy' on LO2 is unsupported by any numerical accuracy figures, baseline comparisons, error bars, or description of how ground-truth labels and diagnostic correctness were determined; without these, the primary empirical result cannot be assessed.
[Evaluation, Prod case] The statements of 'lower diagnostic accuracy' and 'practical applicability' lack any quantitative metrics, details on how root-cause ground truth was obtained in the production environment, or analysis of failure modes, which directly undermines the claim of real-world utility.
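
The kind of reporting requested in these comments can be sketched as follows. This is an assumed evaluation protocol, not one confirmed by the paper: each trial is a list of predicted root-cause labels aligned with a ground-truth list, correctness is exact label match, and the error bar is the sample standard deviation across trials. The function names are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def top1_accuracy(predicted: Sequence[str], truth: Sequence[str]) -> float:
    """Fraction of incidents whose predicted root cause equals the label."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not truth:
        raise ValueError("at least one incident is required")
    hits = sum(p == t for p, t in zip(predicted, truth))
    return hits / len(truth)


def summarize(
    trials: List[Sequence[str]], truth: Sequence[str]
) -> Tuple[float, float]:
    """Mean top-1 accuracy and sample standard deviation over repeated trials."""
    accuracies = [top1_accuracy(trial, truth) for trial in trials]
    # stdev needs at least two points; report 0.0 for a single trial.
    spread = stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean(accuracies), spread
```

Reporting both numbers for the proposed method and for each baseline, on the same incidents, is what would make a phrase such as 'high diagnostic accuracy' checkable.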

minor comments (2)

[Abstract] The abstract refers to 'reflection scores' and 'accumulated evidence' without defining how these scores are computed or normalized; a short formal definition or pseudocode would improve clarity.
[Evaluation] Computational-cost results are mentioned for LO2 but not quantified or compared to Prod; adding a table with wall-clock time, token usage, and agent counts would strengthen the cost analysis.
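
A cost table of the kind suggested in the last comment could be assembled along these lines. The `RunCost` fields and the `cost_table` and `render` helpers are hypothetical, and any numbers passed in are the caller's own measurements; nothing here reproduces figures from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Tuple


@dataclass
class RunCost:
    """Cost of one diagnostic run (hypothetical record layout)."""
    environment: str
    wall_clock_s: float
    total_tokens: int
    agent_count: int


Row = Tuple[str, int, float, float, float]


def cost_table(runs: List[RunCost]) -> List[Row]:
    """Per-environment run count and mean wall-clock time, tokens, and agents."""
    grouped: Dict[str, List[RunCost]] = defaultdict(list)
    for run in runs:
        grouped[run.environment].append(run)
    rows: List[Row] = []
    for env in sorted(grouped):
        group = grouped[env]
        rows.append((
            env,
            len(group),
            mean(r.wall_clock_s for r in group),
            mean(r.total_tokens for r in group),
            mean(r.agent_count for r in group),
        ))
    return rows


def render(rows: List[Row]) -> str:
    """Format the rows as a fixed-width plain-text table."""
    header = f"{'env':<10}{'runs':>6}{'time (s)':>12}{'tokens':>12}{'agents':>8}"
    lines = [header]
    for env, n, seconds, tokens, agents in rows:
        lines.append(f"{env:<10}{n:>6}{seconds:>12.1f}{tokens:>12.0f}{agents:>8.1f}")
    return "\n".join(lines)
```

Keeping one row per environment makes the benchmark-versus-production comparison direct, and the same records could be extended with per-agent token counts if finer attribution is needed.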

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which highlight important areas where the empirical support in the manuscript can be strengthened. We agree that the claims regarding diagnostic accuracy require explicit quantitative backing, baseline comparisons, and methodological details. We will revise the abstract, evaluation sections, and add supporting material to address these points fully. Our responses to the major comments are below.


Referee: [Abstract and Evaluation] The central claim that LATS-RCA 'achieves high diagnostic accuracy' on LO2 is unsupported by any numerical accuracy figures, baseline comparisons, error bars, or description of how ground-truth labels and diagnostic correctness were determined; without these, the primary empirical result cannot be assessed.

Authors: We acknowledge that the abstract and high-level evaluation summary currently use the qualitative phrase 'high diagnostic accuracy' without accompanying numbers or methodological details. The full evaluation section contains the underlying experimental results (including per-run accuracy percentages on the public LO2 dataset, comparisons against linear LLM baselines and rule-based RCA tools, and standard deviation across repeated trials), but these were not elevated to the abstract or summarized with ground-truth provenance. The LO2 dataset provides explicit ground-truth root-cause annotations derived from the original system developers' incident reports; diagnostic correctness was scored by matching the agent's final output path against these labels, with partial credit for identifying contributing factors. We will revise the abstract to include the key numerical results (e.g., top-1 accuracy of X% with error bars), add a dedicated paragraph on ground-truth determination and evaluation protocol, and insert baseline tables. This revision will make the primary result fully assessable.

revision: yes

Referee: [Evaluation] Evaluation section (Prod case): the statement of 'lower diagnostic accuracy' and 'practical applicability' lacks any quantitative metrics, details on how root-cause ground truth was obtained in the production environment, or analysis of failure modes, which directly undermines the claim of real-world utility.

Authors: We agree that the Prod deployment description is currently qualitative and therefore insufficient to substantiate 'practical applicability.' Ground truth in the production setting was obtained via post-mortem reviews conducted by the company's site-reliability engineering team, who labeled the root cause after each incident using a combination of log correlation, metric traces, and developer confirmation; these labels were then used to score LATS-RCA outputs. We will add quantitative metrics (accuracy, cost in tokens and latency, comparison to LO2), a table of failure modes (e.g., polyglot logging inconsistencies, multi-factor causality), and an explicit discussion of how the observed drop in accuracy reflects real-world complexity rather than a flaw in the method. These additions will convert the case study into a reproducible illustration of the benchmark-to-production gap.

revision: yes

Circularity Check

0 steps flagged

No significant circularity detected


The paper describes an empirical LLM-agent framework (LATS-RCA) evaluated on a public dataset for Light-OAuth2 and a separate production case study. No equations, fitted parameters, or first-principles derivations appear in the provided text. Claims rest on experimental accuracy measurements against external ground truth rather than any self-referential reduction, self-citation chain, or renaming of inputs as outputs. The approach follows a standard tree-search pattern whose performance is assessed independently of its own definitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the untested assumption that LLMs can perform accurate diagnostic reasoning over heterogeneous logs and metrics; no free parameters or new physical entities are introduced.

axioms (1)

Domain assumption: LLMs can perform reliable reasoning over execution logs and performance metrics to collect operational evidence.
Invoked as the basis for agent behavior and reflection scoring throughout the framework description.



