pith. sign in

arxiv: 2605.00936 · v1 · submitted 2026-05-01 · 💻 cs.LG · cs.AI

EventADL: Open-Box Anomaly Detection and Localization Framework for Events in Cloud-Based Service Systems

Pith reviewed 2026-05-09 19:30 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords anomaly detectionroot cause localizationcloud systemsevent datapattern learningintervention graphservice reliability
0
0 comments X

The pith

Event data from cloud systems reveals anomalies and their root causes by learning normal interaction patterns and their frequencies.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to establish that event streams contain enough structure to support both detection of anomalies and automatic localization of their causes in cloud service systems. It grounds the claim in a study of hundreds of real incidents showing how problems appear as departures from usual entity interactions and occurrence rates. If the approach works, operators could use the abundant but previously underused event logs for reliable, interpretable monitoring without labels or opaque models. The method proceeds by training on historical events, flagging online deviations, and tracing causes through a graph of recent interactions.

Core claim

Normal behavior is captured first as Event Semantic Patterns that record expected interactions between system entities and then as Event Frequency Patterns that record how often those interactions occur. Significant departures from either pattern in a live event stream mark anomalies. Root cause localization follows by constructing an Intervention Graph that links recent interactions to the anomaly and identifies the most likely origin.

What carries the argument

Event Semantic Patterns and Event Frequency Patterns, which encode normal entity interactions and their rates, together with the Intervention Graph that relates recent interactions to flagged anomalies for cause identification.

If this is right

  • Event streams alone become sufficient for real-time anomaly detection in cloud environments.
  • Root causes become traceable automatically through the relationships modeled in the intervention graph.
  • The process works with unlabeled historical data and yields interpretable results for operators.
  • The same learned patterns support ongoing monitoring across multiple production cloud services.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The pattern-based approach might extend to other distributed systems where event logs are the primary observable data.
  • Hybrid systems could combine these event patterns with metric or log signals to catch anomalies that appear in only one data type.
  • One could test whether the learned patterns remain effective after software updates or hardware changes within the same service.

Load-bearing premise

Deviations from the learned semantic and frequency patterns will mark genuine anomalies and the intervention graph will recover the correct root cause from the surrounding events.

What would settle it

A real cloud incident in which the event stream matches the learned patterns yet a failure occurs, or in which the intervention graph selects an incorrect component as the origin.

Figures

Figures reproduced from arXiv: 2605.00936 by Daniel Kroening, Hui Guan, Joey Dodds, Luan Pham, Victor Nicolet.

Figure 1
Figure 1. Figure 1: An event in the OCSF schema [38]. Limitations of Existing Work. While ADL has been ac￾tively studied for metrics [9, 17, 18, 28, 41, 42] and un￾structured logs [1, 13, 22, 23, 29, 33], event-based ADL re￾mains underexplored. Metric-based methods [28, 41, 42] capture frequency-based anomalies but fail to detect point￾wise anomalies (i.e., an individual anomalous event). Log￾based methods such as DeepLog [13… view at source ↗
Figure 2
Figure 2. Figure 2: Insights from real-world incidents. (a) Distribution of anomaly types. (b) Number of events analyzed. [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Overview of EventADL. Three phases: offline training (upper left), online anomaly detection (lower left), and RCL (right). Offline, EventADL learns ESPs and EFPs from historical events. Online, events are evaluated against these patterns: ESP identifies pointwise anomalies, while EFP detects frequency-based anomalies. Upon detection, EventADL constructs an Intervention Graph encoding causal links between i… view at source ↗
Figure 4
Figure 4. Figure 4: An ESP in the jsonLogic [56] schema. Event Semantic Pattern (ESP) is a model of the normal event types and values observed in historical event data. They serve two purposes: a set of ESPs is used to detect pointwise anomalies (i.e., single events that are anomalous on their own), and they allow labeling normal events with a specific ESP, which is then used to compute EFPs (Section 4.3). Event Type and Even… view at source ↗
Figure 5
Figure 5. Figure 5: Detecting anomalies with EFP. Each subsequence 𝑤 (𝑖) 𝑢 is a point in Euclidean space and linked to its nearest non-trivial neighbor 𝑤 (𝑖) 𝑣 . The set of distances {𝑑 (𝑖) 𝑢 } forms the Event Frequency Pattern (EFP) 𝑓𝑖 ∈ F. The abnormal subsequence lies far from the cluster of normal subsequences, as it has a statistically large distance to its near￾est neighbor, indicating a potential anomaly. The set of al… view at source ↗
Figure 6
Figure 6. Figure 6: Scalability of EventADL. 5.7.3 Scalability of EventADL. In this section, we evaluate the scalability of three components in EventADL when handling large event streams. We deploy EventADL on a machine with 12 vCPUs and 36GB RAM, and measure its runtime when pro￾cessing event streams from deployed systems with different scales [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Magnitude-based vs. shape-based EFP. 5.8.2 Shape-based vs Magnitude-based EFP. In this ab￾lation, we compare our magnitude-based EFP with the shape-based variant [26, 32] to assess their impact on the OUT dataset. As shown in [PITH_FULL_IMAGE:figures/full_fig_p017_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Robustness analysis of EventADL w.r.t. different parameters and noise levels on the Falcon dataset. 5.9 RQ5: Robustness of EventADL 5.9.1 Robustness to Parameter Settings. In this section, we conduct a robustness analysis to un￾derstand how the performance of EventADL varies under different parameter settings. We refer readers to [15] for information about parameters for ESP. EFP has two parameters: (1) th… view at source ↗
read the original abstract

Anomaly detection and localization (ADL) is critical for maintaining reliability and availability in cloud systems. Recent ADL developments focus on metric and log data, leaving event data unexplored. To address this gap, we propose EventADL, the first open-box event-based ADL framework for cloud-based service systems. To motivate the design of our framework, we conduct a systematic analysis on 520 real-world incidents, and provide insights into how anomalies and their root causes manifest through event data. EventADL has three phases: offline training, online anomaly detection, and root cause localization. During the training phase, EventADL first learns Event Semantic Patterns (ESPs), which capture normal interactions between system entities using historical event data, and then learns Event Frequency Patterns (EFPs), which capture the normal frequency of known ESPs. In the online anomaly detection phase, any data in the event stream that deviates significantly from either pattern is identified as anomalous. For localization, EventADL constructs an Intervention Graph that models the relationships between recent system interactions and the detected anomalies for automatic root cause localization. The framework is designed to operate efficiently with unlabeled data and to produce interpretable anomalies with their corresponding root causes. Our evaluation on three real cloud service systems and two real-world incidents demonstrates that EventADL outperforms existing methods, achieving F1-scores of at least 90% for anomaly detection and 100% top-3 accuracy in root cause localization.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript presents EventADL, an open-box framework for anomaly detection and localization (ADL) in cloud-based service systems using event data. Motivated by a systematic analysis of 520 real-world incidents, the framework learns Event Semantic Patterns (ESPs) capturing normal entity interactions and Event Frequency Patterns (EFPs) for normal frequencies during offline training. Online, it detects significant deviations from these patterns as anomalies. For localization, it constructs an Intervention Graph modeling relationships between recent interactions and detected anomalies. Evaluation on three real cloud systems and two incidents reports F1-scores of at least 90% for detection and 100% top-3 accuracy for root cause localization, outperforming existing methods.

Significance. If the claims hold, the work is significant for filling the gap in event-based ADL for cloud systems, offering an interpretable, open-box alternative to black-box approaches. The use of real incident data for motivation and evaluation on actual systems strengthens its practical relevance. The framework's design for unlabeled data and efficiency is a positive aspect. However, the limited scope of the localization evaluation tempers the overall impact.

major comments (3)
  1. [Evaluation section (and abstract)] Evaluation section (and abstract): The root cause localization claim of 100% top-3 accuracy is supported by results from only two real-world incidents. With no reported statistical significance, variance, cross-validation, incident selection criteria, or failure mode analysis, this small sample size does not provide load-bearing evidence for the effectiveness or generalizability of the Intervention Graph approach.
  2. [§3 (Framework description)] §3 (Framework description): The paper lacks sufficient details on the algorithms, parameters, or methods used to learn ESPs and EFPs from historical data, as well as the specific deviation thresholds or statistical tests for identifying anomalies in the online phase. This makes the training procedure and the reported F1-scores difficult to reproduce or assess for robustness.
  3. [Abstract and evaluation] Abstract and evaluation: The claim that EventADL 'outperforms existing methods' does not specify the baseline methods, how they were adapted to event data, or include quantitative comparison details (e.g., tables with metrics). This weakens the ability to evaluate the superiority asserted for both detection and localization.
minor comments (2)
  1. [Abstract] The abstract could briefly name the compared methods or key technical innovations to better frame the performance numbers.
  2. [Throughout] Ensure consistent definition of acronyms (e.g., ESP, EFP) on first use in the main body and consider adding a limitations section discussing scalability with high-volume event streams.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment point by point below, with proposed revisions to improve reproducibility, clarity, and transparency of our evaluation.

read point-by-point responses
  1. Referee: Evaluation section (and abstract): The root cause localization claim of 100% top-3 accuracy is supported by results from only two real-world incidents. With no reported statistical significance, variance, cross-validation, incident selection criteria, or failure mode analysis, this small sample size does not provide load-bearing evidence for the effectiveness or generalizability of the Intervention Graph approach.

    Authors: We acknowledge that the root cause localization results are based on only two real-world incidents, which limits the strength of generalizability claims. These incidents were selected as representative examples from the 520 analyzed incidents to illustrate the Intervention Graph in practice. In the revised manuscript, we will add the incident selection criteria, describe the key characteristics of each incident, include a failure mode discussion, and explicitly note the limitations of the small sample size in both the evaluation section and abstract. We will also discuss why statistical significance testing is not applicable with n=2. revision: partial

  2. Referee: §3 (Framework description): The paper lacks sufficient details on the algorithms, parameters, or methods used to learn ESPs and EFPs from historical data, as well as the specific deviation thresholds or statistical tests for identifying anomalies in the online phase. This makes the training procedure and the reported F1-scores difficult to reproduce or assess for robustness.

    Authors: We thank the referee for highlighting the need for greater reproducibility. In the revised manuscript, we will expand §3 with pseudocode for the ESP and EFP learning procedures, list all hyperparameters and thresholds (including deviation criteria and any statistical tests used for anomaly detection), and provide implementation-level details on the online phase. These additions will make the training and detection processes fully reproducible and allow better assessment of result robustness. revision: yes

  3. Referee: Abstract and evaluation: The claim that EventADL 'outperforms existing methods' does not specify the baseline methods, how they were adapted to event data, or include quantitative comparison details (e.g., tables with metrics). This weakens the ability to evaluate the superiority asserted for both detection and localization.

    Authors: We agree that the comparison claims require more explicit support. We will revise the abstract to name the baseline methods and update the evaluation section to include a detailed table with all metrics for both anomaly detection and localization. The revised text will also describe how each baseline was adapted to event data and provide the quantitative results that support the outperformance claims. revision: yes

standing simulated objections not resolved
  • The limited number of available real-world incidents (only two) for root cause localization prevents expanding the sample size, performing cross-validation, or reporting variance/statistical significance for that component of the evaluation.

Circularity Check

0 steps flagged

No circularity in EventADL derivation chain

full rationale

The paper's core chain consists of offline learning of Event Semantic Patterns (ESPs) and Event Frequency Patterns (EFPs) from historical unlabeled event data, followed by online deviation detection and construction of an Intervention Graph from recent interactions plus detected anomalies. These are standard pattern-learning and graph-construction steps with no equations shown that equate outputs to inputs by definition. The 520-incident analysis is used only for design motivation, not as a fitted input renamed as prediction. Evaluation F1 scores and top-3 accuracy are measured on separate real systems and two incidents, not forced by the training procedure itself. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing elements.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; all modeling choices (pattern definitions, deviation thresholds, graph edges) are implicit and unstated.

pith-pipeline@v0.9.0 · 5566 in / 1084 out tokens · 19293 ms · 2026-05-09T19:30:47.470096+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

71 extracted references · 71 canonical work pages

  1. [1]

    Shan Ali, Chaima Boufaied, Domenico Bianculli, Paula Branco, and Lionel Briand. 2025. A Comprehensive Study of Machine Learning Techniques for Log-Based Anomaly Detection.arXiv/2307.16714(2025)

  2. [2]

    2025.Alibaba Cloud ActionTrail

    Alibaba Cloud. 2025.Alibaba Cloud ActionTrail. https://www.alibabacloud.com/help/en/actiontrail/ ActionTrail monitors and records your Alibaba Cloud account activities; latest documentation updated February 12, 2025

  3. [3]

    Amazon Web Services. 2024. Understanding CloudTrail Events. https://docs.aws.amazon.com/awscloudtrail/latest/ userguide/cloudtrail-events.html. Accessed: 2025-06-06

  4. [4]

    Mohammad Ruhul Amin, Pranav Garg, and Baris Coskun. 2019. Cadence: Conditional anomaly detection for events using noise-contrastive estimation. InProceedings of the 12th ACM Workshop on Artificial Intelligence and Security

  5. [5]

    Devansh Arpit, Matthew Fernandez, Itai Feigenbaum, Weiran Yao, Chenghao Liu, Wenzhuo Yang, Paul Josel, Shelby Heinecke, Eric Hu, Huan Wang, Stephen Hoi, Caiming Xiong, Kun Zhang, and Juan Carlos Niebles. 2023. Salesforce CausalAI Library: A Fast and Scalable Framework for Causal Analysis of Time Series and Tabular Data. (2023)

  6. [6]

    Sergul Aydore, Baris Coskun, and Luca Melis. 2022. Detecting anomalous events from categorical data using autoen- coders. US Patent 11,537,902

  7. [7]

    Ting Chen, Lu-An Tang, Yizhou Sun, Zhengzhang Chen, and Kai Zhang. 2016. Entity Embedding-Based Anomaly Detection for Heterogeneous Categorical Events. InProceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence. 1396–1403

  8. [8]

    Yuncong Chen. 2023. On injected anomalies. https://github.com/CloudWise-OpenSource/GAIA-DataSet/issues/11

  9. [9]

    Zhuangbin Chen, Jinyang Liu, Yuxin Su, Hongyu Zhang, Xiao Ling, Yongqiang Yang, and Michael R Lyu. 2022. Adaptive Performance Anomaly Detection For Online Service Systems Via Pattern Sketching. InProceedings of the 44th International Conference on Software Engineering. 61–72

  10. [10]

    Qian Cheng, Doyen Sahoo, Amrita Saha, Wenzhuo Yang, Chenghao Liu, Gerald Woo, Manpreet Singh, Silvio Saverese, and Steven CH Hoi. 2023. AI for IT Operations (AIOps) on Cloud Platforms: Reviews, Opportunities and Challenges. arXiv preprint arXiv:2304.04661(2023)

  11. [11]

    CloudWise-OpenSource. 2025. GAIA: Generic AIOps Atlas. https://github.com/CloudWise-OpenSource/GAIA-DataSet. CloudWise GAIA Dataset for AIOps

  12. [12]

    Baris Coskun, Wei Ding, and Luca Melis. 2022. Detecting anomalous events using autoencoders. US Patent 11,374,952

  13. [13]

    Min Du, Feifei Li, Guineng Zheng, and Vivek Srikumar. 2017. DeepLog: Anomaly detection and diagnosis from system logs through deep learning. InProceedings of ACM SIGSAC conference on computer and communications security

  14. [14]

    Aoyang Fang, Songhan Zhang, Yifan Yang, Haotong Wu, Junjielong Xu, Xuyang Wang, Rui Wang, Manyi Wang, Qisheng Lu, and Pinjia He. 2025. Rethinking the Evaluation of Microservice RCA with a Fault Propagation-Aware Benchmark. arXiv:2510.04711 [cs.SE] https://arxiv.org/abs/2510.04711

  15. [15]

    Margarida Ferreira, Victor Nicolet, Luan Pham, Joey Dodds, Daniel Kroening, Ines Lynce, and Ruben Martins. 2025. Hypergraph-Guided Regex Filter Synthesis for Event-Based Anomaly Detection. arXiv:2509.06911 [cs.SE]

  16. [16]

    Google Cloud. 2024. Cloud Audit Logs Overview. https://cloud.google.com/logging/docs/audit. Accessed: 2025-06-06

  17. [17]

    Wenwei Gu, Jiazhen Gu, Jinyang Liu, Zhuangbin Chen, Jianping Zhang, Jinxi Kuang, Cong Feng, Yongqiang Yang, and Michael R Lyu. 2025. ADAMAS: Adaptive Domain-Aware Performance Anomaly Detection in Cloud Service Systems. InProceedings of the IEEE/ACM 47th International Conference on Software Engineering. 911–923

  18. [18]

    Wenwei Gu, Xinying Sun, Jinyang Liu, Yintong Huo, Zhuangbin Chen, Jianping Zhang, Jiazhen Gu, Yongqiang Yang, and Michael R Lyu. 2024. Kpiroot: Efficient monitoring metric-based root cause localization in large-scale cloud systems. In2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE). IEEE, 403–414

  19. [19]

    Hongcheng Guo, Jian Yang, Jiaheng Liu, Jiaqi Bai, Boyang Wang, Zhoujun Li, Tieqiao Zheng, Bo Zhang, Junran Peng, and Qi Tian. 2024. Logformer: A pre-train and tuning pipeline for log anomaly detection. InProceedings of the AAAI conference on artificial intelligence, Vol. 38. 135–143

  20. [20]

    Pinjia He, Jieming Zhu, Zibin Zheng, and Michael R Lyu. 2017. Drain: An online log parsing approach with fixed depth tree. In2017 IEEE international conference on web services (ICWS). IEEE, 33–40

  21. [21]

    Azam Ikram, Sarthak Chakraborty, Subrata Mitra, Shiv Saini, Saurabh Bagchi, and Murat Kocaoglu. 2022. Root cause analysis of failures in microservices through causal discovery.Advances in Neural Information Processing Systems35 (2022), 31158–31170

  22. [22]

    Max Landauer, Florian Skopik, and Markus Wurzenberger. 2024. A critical review of common log data sets used for evaluation of sequence-based anomaly detection techniques.Proceedings of the ACM on Software Engineering1, FSE (2024), 1354–1375

  23. [23]

    Van-Hoang Le and Hongyu Zhang. 2021. Log-based anomaly detection without log parsing. In2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 492–504

  24. [24]

    Cheryl Lee, Tianyi Yang, Zhuangbin Chen, Yuxin Su, and Michael R Lyu. 2023. Eadro: An end-to-end troubleshooting framework for microservices on multi-source data. In2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). IEEE, 1750–1762. Proc. ACM Softw. Eng., Vol. 3, No. FSE, Article FSE179. Publication date: July 2026. FSE179:22 Luan...

  25. [25]

    Cheryl Lee, Tianyi Yang, Zhuangbin Chen, Yuxin Su, Yongqiang Yang, and Michael R Lyu. 2023. Heterogeneous anomaly detection for software systems via semi-supervised cross-modal attention. In2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE). IEEE, 1724–1736

  26. [26]

    Daesoo Lee, Sara Malacarne, and Erlend Aune. 2024. Explainable time series anomaly detection using masked latent generative modeling.Pattern Recognition156 (2024), 110826

  27. [27]

    Liqun Li, Xu Zhang, Shilin He, Yu Kang, Hongyu Zhang, Minghua Ma, Yingnong Dang, Zhangwei Xu, Saravan Rajmohan, Qingwei Lin, et al . 2023. Conan: Diagnosing batch failures for cloud systems. In2023 IEEE/ACM 45th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP). IEEE, 138–149

  28. [28]

    Mingjie Li, Zeyan Li, Kanglin Yin, Xiaohui Nie, Wenchi Zhang, Kaixin Sui, and Dan Pei. 2022. Causal Inference-Based Root Cause Analysis for Online Service Systems with Intervention Recognition. InProceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’22, Vol. 1). Association for Computing Machinery, New York, N...

  29. [29]

    Xiaoyun Li, Pengfei Chen, Linxiao Jing, Zilong He, and Guangba Yu. 2020. Swisslog: Robust and unified deep learning based log anomaly detection for diverse faults. In2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE). IEEE, 92–103

  30. [30]

    Boyang Liu, Ding Wang, Kaixiang Lin, Pang-Ning Tan, and Jiayu Zhou. 2021. RCA: A Deep Collaborative Autoencoder Approach for Anomaly Detection. InIJCAI. ijcai.org, 1505–1511

  31. [31]

    Ping Liu, Haowen Xu, Qianyu Ouyang, Rui Jiao, Zhekang Chen, Shenglin Zhang, Jiahai Yang, Linlin Mo, Jice Zeng, Wenman Xue, et al. 2020. Unsupervised detection of microservice trace anomalies through service-level deep bayesian networks. In2020 IEEE 31st International Symposium on Software Reliability Engineering (ISSRE). IEEE, 48–58

  32. [32]

    Yue Lu, Renjie Wu, Abdullah Mueen, Maria A Zuluaga, and Eamonn Keogh. 2022. Matrix profile XXIV: scaling time series anomaly detection to trillions of datapoints and ultra-fast arriving data streams. InProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 1173–1182

  33. [33]

    Weibin Meng, Ying Liu, Yichen Zhu, Shenglin Zhang, Dan Pei, Yuqing Liu, Yihao Chen, Ruizhi Zhang, Shimin Tao, Pei Sun, et al. 2019. LogAnomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs. InIJCAI, Vol. 19. 4739–4745

  34. [34]

    Microsoft Corporation. 2024. Ingest Events from Azure Event Hubs into Azure Monitor Logs. https://learn.microsoft. com/en-us/azure/azure-monitor/logs/ingest-logs-event-hub. Accessed: 2025-06-06

  35. [35]

    AIOps Nankai. 2021. AIOps 2021 Challenge Dataset. https://www.aiops.cn/gitlab/aiops-nankai/data/trace/aiops2021/

  36. [36]

    Sasho Nedelkoski, Jorge Cardoso, and Odej Kao. 2019. Anomaly detection from system tracing data using multimodal deep learning. In2019 IEEE 12th International Conference on Cloud Computing (CLOUD). IEEE, 179–186

  37. [37]

    Hiep Nguyen, Yongmin Tan, and Xiaohui Gu. 2011. PAL: Propagation-aware Anomaly Localization for Cloud Hosted Distributed Applications. InManaging Large-scale Systems via the Analysis of System Logs and the Application of Machine Learning Techniques. 1–8

  38. [38]

    Open Cybersecurity Schema Framework. 2022. Open Cybersecurity Schema Framework (OCSF). https://github.com/ ocsf Accessed: 2025-08-04

  39. [39]

    OpenTelemetry. 2025. Semantic Conventions for Events. https://opentelemetry.io/docs/specs/semconv/general/events/. Accessed: 2025-06-06

  40. [40]

    EventADL: Open-Box Anomaly Detection and Localization Framework for Events in Cloud-Based Service Systems

    Luan Pham. 2026.Artifacts of "EventADL: Open-Box Anomaly Detection and Localization Framework for Events in Cloud-Based Service Systems". doi:10.5281/zenodo.19433493

  41. [41]

    Luan Pham. 2026. Graph-Free Root Cause Analysis.arXiv preprint arXiv:2601.21359(2026)

  42. [42]

    Luan Pham, Huong Ha, and Hongyu Zhang. 2024. BARO: Robust root cause analysis for microservices via multivariate bayesian online change point detection.Proceedings of the ACM on Software Engineering1, FSE (2024), 2214–2237

  43. [43]

    Luan Pham, Huong Ha, and Hongyu Zhang. 2024. Root Cause Analysis for Microservice System based on Causal Inference: How Far Are We?. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. 706–715

  44. [44]

    Luan Pham, Huong Ha, Xiuzhen Zhang, and Hongyu Zhang. 2026. TORAI: Multi-Source Root Cause Analysis for Blind Spots in Microservice Service Call Graph.Proceedings of the ACM on Software Engineering3, FSE (2026)

  45. [45]

    Luan Pham, Hongyu Zhang, Huong Ha, Flora Salim, and Xiuzhen Zhang. 2025. RCAEval: A Benchmark for Root Cause Analysis of Microservice Systems with Telemetry Data. InCompanion Proceedings of the ACM on Web Conference 2025

  46. [46]

    Chen Qiu, Timo Pfrommer, Marius Kloft, Stephan Mandt, and Maja Rudolph. 2021. Neural Transformation Learning for Deep Anomaly Detection Beyond Images. InProceedings of Machine Learning Research, Vol. 139. 8703–8714

  47. [47]

    Rui Ren, Jingbang Yang, Linxiao Yang, Xinyue Gu, and Liang Sun. 2024. SLIM: a Scalable Light-weight Root Cause Analysis for Imbalanced Data in Microservice. InProceedings of the 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings. 328–330

  48. [48]

    Devjeet Roy, Xuchao Zhang, Rashi Bhave, Chetan Bansal, Pedro Las-Casas, Rodrigo Fonseca, and Saravan Rajmohan

  49. [49]

    Exploring LLM-based Agents for Root Cause Analysis.arXiv preprint arXiv:2403.04123(2024). Proc. ACM Softw. Eng., Vol. 3, No. FSE, Article FSE179. Publication date: July 2026. EventADL: Open-Box Anomaly Detection and Localization Framework for Events in Cloud-Based Service Systems FSE179:23

  50. [50]

    Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Müller, and Marius Kloft. 2018. Deep one-class classification. InInternational conference on machine learning. PMLR

  51. [51]

    Huasong Shan, Yunpeng Zhang, Yuan Chen, Xiao Xiao, Haifeng Liu, Xiaofeng He, Min Li, and Wei Ding. 2019. 𝜖-Diagnosis: Unsupervised and real-time diagnosis of small-window long-tail latency in large-scale microservice platforms.Proceedings of the World Wide Web Conference, WWW 2019(2019), 3215–3222

  52. [52]

    Tom Shenkar and Lior Wolf. 2022. Anomaly detection for tabular data with internal contrastive learning. InInternational Conference on Learning Representations

  53. [53]

    Jacopo Soldani and Antonio Brogi. 2022. Anomaly detection and failure root cause analysis in (micro) service-based cloud applications: A survey.ACM Computing Surveys (CSUR)55, 3 (2022), 1–39

  54. [54]

    Yongqian Sun, Zihan Lin, Binpeng Shi, Shenglin Zhang, Shiyu Ma, Pengxiang Jin, Zhenyu Zhong, Lemeng Pan, Yicheng Guo, and Dan Pei. 2025. Interpretable failure localization for microservice systems based on graph autoencoder.ACM Transactions on Software Engineering and Methodology34, 2 (2025), 1–28

  55. [55]

    Yongqian Sun, Binpeng Shi, Mingyu Mao, Minghua Ma, Sibo Xia, Shenglin Zhang, and Dan Pei. 2024. Art: A unified unsupervised framework for incident management in microservice systems. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. 1183–1194

  56. [56]

    Wil Van der Aalst, Ton Weijters, and Laura Maruster. 2004. Workflow Mining: Discovering Process Models from Event Logs.IEEE Transactions on Knowledge and Data Engineering16, 9 (2004), 1128–1142

  57. [57]

    Jeremy Wadhams. n.d.. JSONLogic: A lightweight, safe way to share logic between systems. https://jsonlogic.com/

  58. [58]

    Dongjie Wang, Zhengzhang Chen, Yanjie Fu, Yanchi Liu, and Haifeng Chen. 2023. Incremental causal graph learning for online root cause analysis. InProceedings of the 29th ACM SIGKDD conference on knowledge discovery and data mining. 2269–2278

  59. [59]

    Hu Wang, Guansong Pang, Chunhua Shen, and Congbo Ma. 2020. Unsupervised Representation Learning by Predicting Random Distances. InIJCAI. ijcai.org, 2950–2956

  60. [60]

    Hanzhang Wang, Zhengkai Wu, Huai Jiang, Yichao Huang, Jiamu Wang, Selcuk Kopru, and Tao Xie. 2021. Groot: An Event-graph-based Approach for Root Cause Analysis in Industrial Settings.Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering(2021), 419–429

  61. [61]

    Yidan Wang, Zhouruixing Zhu, Qiuai Fu, Yuchi Ma, and Pinjia He. 2024. MRCA: Metric-level root cause analysis for microservices via multi-modal data. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. 1057–1068

  62. [62]

    2024.Experimentation in Software Engineering

    Claes Wohlin, Per Runeson, Martin Höst, Magnus C Ohlsson, Björn Regnell, Anders Wesslén, et al. 2024.Experimentation in Software Engineering. Springer

  63. [63]

    Shuaiyu Xie, Jian Wang, Hanbin He, Zhihao Wang, Yuqi Zhao, Neng Zhang, and Bing Li. 2026. TVDiag: A Task- oriented and View-invariant Failure Diagnosis Framework for Microservice-based Systems with Multimodal Data. ACM Transactions on Software Engineering and Methodology35, 2 (2026), 1–39

  64. [64]

    Ruyue Xin, Peng Chen, and Zhiming Zhao. 2023. CausalRCA: Causal inference based precise fine-grained root cause localization for microservice applications.Journal of Systems and Software203 (2023), 111724

  65. [65]

    Hongzuo Xu, Guansong Pang, Yijie Wang, and Yongjun Wang. 2023. Deep Isolation Forest for Anomaly Detection. IEEE Trans. Knowl. Data Eng.35, 12 (2023), 12591–12604

  66. [66]

    Lin Yang, Junjie Chen, Zan Wang, Weijing Wang, Jiajun Jiang, Xuyuan Dong, and Wenbin Zhang. 2021. Semi-supervised log-based anomaly detection via probabilistic label estimation. In2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). IEEE, 1448–1460

  67. [67]

    Guangba Yu, Pengfei Chen, Yufeng Li, Hongyang Chen, Xiaoyun Li, and Zibin Zheng. 2023. Nezha: Interpretable fine-grained root causes analysis for microservices on multi-modal observability data. InProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 553–565

  68. [68]

    Jun Zengy, Xiang Wang, Jiahao Liu, Yinfang Chen, Zhenkai Liang, Tat-Seng Chua, and Zheng Leong Chua. 2022. Shadewatcher: Recommendation-guided cyber threat analysis using system audit records. In2022 IEEE symposium on security and privacy (SP). IEEE, 489–506

  69. [69]

    Chenxi Zhang, Zhen Dong, Xin Peng, Bicheng Zhang, and Miao Chen. 2024. Trace-based multi-dimensional root cause localization of performance issues in microservice systems. InProceedings of the IEEE/ACM 46th International Conference on Software Engineering. 1–12

  70. [70]

    Xu Zhang, Yong Xu, Qingwei Lin, Bo Qiao, Hongyu Zhang, Yingnong Dang, Chunyu Xie, Xinsheng Yang, Qian Cheng, Ze Li, et al. 2019. Robust log-based anomaly detection on unstable log data. InProceedings of the ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering. 807–817

  71. [71]

    Lecheng Zheng, Zhengzhang Chen, Jingrui He, and Haifeng Chen. 2024. MULAN: multi-modal causal structure learning and root cause analysis for microservice systems. InProceedings of the ACM Web Conference 2024. 4107–4116. Received 2025-09-11; accepted 2026-03-24 Proc. ACM Softw. Eng., Vol. 3, No. FSE, Article FSE179. Publication date: July 2026