
arxiv: 2603.23890 · v2 · pith:KCOKFEEEnew · submitted 2026-03-25 · 💻 cs.SE · cs.LG

Praxium: Diagnosing Cloud Anomalies with AI-based Telemetry and Dependency Analysis

Pith reviewed 2026-05-21 09:58 UTC · model grok-4.3

classification 💻 cs.SE cs.LG
keywords anomaly detection · root cause analysis · microservices · cloud computing · dependency analysis · telemetry · causal inference · software installations

The pith

Praxium detects microservice anomalies in cloud systems and infers their root causes from recent software installations by combining telemetry monitoring with dependency analysis.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Praxium as a framework that continuously watches telemetry metrics to flag anomalies in complex microservice architectures. It then applies causal impact analysis to dependency installation records to identify which recent software change most likely triggered the issue. This matters because frequent rollouts in CI/CD pipelines make it hard for engineers to manually trace problems back to specific installs or configurations. If the approach holds, it would give site reliability teams actionable details to resolve anomalies more quickly without relying on expert input each time. The work shows the method maintains strong detection results and reliable root-cause inferences across repeated tests with different anomaly types and installation frequencies.

Core claim

Praxium identifies anomalies by monitoring target metrics in telemetry data and then performs root cause analysis by measuring the causal impact of recent software installations, providing relevant diagnostic information to administrators even when package installations occur at shorter intervals.
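
As a rough illustration of the detection half of this claim, the Python sketch below flags points in a metric series that deviate sharply from a trailing window. It is not Praxium's detector: this page does not describe the paper's model, so a rolling z-score is swapped in as a stand-in. The names `detect_anomalies`, `window`, `stride`, and `threshold`, and the synthetic latency values, are assumptions made for the example.

```python
from statistics import mean, stdev


def detect_anomalies(series, window=20, stride=1, threshold=3.0):
    """Flag indices whose value is more than `threshold` standard
    deviations away from the mean of the preceding `window` points.

    Stand-in technique (rolling z-score), not the paper's model.
    """
    flagged = []
    for i in range(window, len(series), stride):
        history = series[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: treat any change at all as a deviation.
            if series[i] != mu:
                flagged.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical latency series (ms): steady around 100, then a jump at index 30.
latency_ms = [100.0 + (i % 2) for i in range(30)] + [150.0 + (i % 2) for i in range(10)]
flagged = detect_anomalies(latency_ms, window=20)
print(flagged)
```

In this sketch a larger `stride` checks fewer points, which lowers cost but can delay or miss a detection. Because the jump quickly contaminates the trailing window, only the first points after the shift are flagged, which is one reason window length matters as a hyperparameter.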

What carries the argument

Causal impact analysis on dependency installation information, paired with anomaly detection on telemetry data.
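
The idea of scoring installations by their effect on a target metric can be sketched as follows. This is a deliberately simplified stand-in, not the paper's causal impact procedure: it fits an ordinary least-squares line from a control series to the target metric before each installation, then treats the mean gap between observed and predicted values afterward as the estimated effect. The function names, the control series (request rate), the package names, and all numbers are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def estimated_effect(target, control, install_idx, horizon):
    """Mean gap between the observed target and a counterfactual
    predicted from the control series, over `horizon` points after
    the installation."""
    slope, intercept = fit_line(control[:install_idx], target[:install_idx])
    post = range(install_idx, min(install_idx + horizon, len(target)))
    return mean(target[t] - (slope * control[t] + intercept) for t in post)


def rank_installs(target, control, installs, horizon=5):
    """Sort installations by the magnitude of their estimated effect."""
    scores = {name: estimated_effect(target, control, idx, horizon)
              for name, idx in installs.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical data: latency tracks request rate until index 20, where a
# 30 ms shift appears at the same time as the second installation.
request_rate = [100.0 + (t % 5) * 4 for t in range(30)]
latency_ms = [0.5 * r + 20 + (30 if t >= 20 else 0)
              for t, r in enumerate(request_rate)]
installs = {"libfoo-1.2": 10, "libbar-2.0": 20}  # package name -> install index

ranking = rank_installs(latency_ms, request_rate, installs)
print(ranking)
```

With a short `horizon` the two post-installation windows do not overlap. If installations landed closer together than `horizon`, this naive version would blend their effects, which is the regime the paper's shortening-interval experiments probe.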

If this is right

  • Anomaly detection remains effective across varied synthetic test conditions and hyperparameter choices.
  • Root cause inference continues to point to the correct installation even as update intervals shorten.
  • The combination of telemetry and dependency data supplies administrators with targeted information for anomaly resolution.
  • The framework reduces reliance on manual expert diagnosis in continuous deployment environments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same telemetry-plus-dependency pattern could apply to diagnosing issues in other distributed systems that use frequent updates.
  • It opens the possibility of integrating such analysis directly into existing monitoring stacks to automate more of the troubleshooting loop.
  • Testing against real production anomalies rather than synthetics would clarify how well the causal signals hold up outside controlled settings.

Load-bearing premise

The method assumes synthetic anomalies accurately represent real microservice behaviors and that dependency data supplies complete causal signals without unmeasured confounders.

What would settle it

Running Praxium on a live production microservice deployment and checking whether it correctly identifies the root cause when a genuine anomaly occurs after a specific installation.

Figures

Figures reproduced from arXiv: 2603.23890 by Ayse K. Coskun, Gianluca Stringhini, Jason Li, Rohan Kumar, Syed Mohammad Qasim, Zongshun Zhang.

Figure 1: An overview of Praxium. The system monitors a microservice cluster, collecting telemetry, trace, and installation data via Prometheus, Jaeger, and [...]
Figure 2: An example of the software dependency logging system. A background [...]
Figure 3: An example of the hyperparameters of experiment 1. Stride [...]
Figure 4: An example of experiment 2, in two cases. First, the case 1 is trivial: [...]
Figure 5: An example of the Social Network service dependency graph during the [...]
Original abstract

As the modern microservice architecture for cloud applications grows in popularity, cloud services are becoming increasingly complex and more vulnerable to misconfiguration and software bugs. Traditional approaches rely on expert input to diagnose and fix microservice anomalies, which lacks scalability in the face of the continuous integration and continuous deployment (CI/CD) paradigm. Microservice rollouts, containing new software installations, have complex interactions with the components of an application. Consequently, this added difficulty in attributing anomalous behavior to any specific installation or rollout results in potentially slower resolution times. To address the gaps in current diagnostic methods, this paper introduces Praxium, a framework for anomaly detection and root cause inference. Praxium aids administrators in evaluating target metric performance in the context of dependency installation information provided by a software discovery tool, PraxiPaaS. Praxium continuously monitors telemetry data to identify anomalies, then conducts root cause analysis via causal impact on recent software installations, in order to provide site reliability engineers (SRE) relevant information about an observed anomaly. In this paper, we demonstrate that Praxium is capable of effective anomaly detection and root cause inference, and we provide an analysis on effective anomaly detection hyperparameter tuning as needed in a practical setting. Across 75 total trials using four synthetic anomalies, anomaly detection consistently performs at >0.97 macro-F1. In addition, we show that causal impact analysis reliably infers the correct root cause of anomalies, even as package installations occur at increasingly shorter intervals.
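
For readers unfamiliar with the metric quoted in the abstract, macro-F1 is the unweighted mean of the per-class F1 scores, so the rare anomaly class counts as much as the common normal class. The sketch below computes it from scratch. The labels and predictions are made up for illustration and are not the paper's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical windows: 8 normal, 2 anomalous; one normal window is misflagged.
y_true = ["normal"] * 8 + ["anomaly"] * 2
y_pred = ["normal"] * 7 + ["anomaly"] * 3

score = macro_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(round(score, 4), accuracy)
```

Here a single false alarm leaves accuracy at 0.9 but pulls macro-F1 lower, because the anomaly class has only two true members. That sensitivity is why macro-F1 is a stricter summary than accuracy on imbalanced anomaly data.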

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces Praxium, a framework for anomaly detection and root cause inference in microservice cloud applications. It combines continuous monitoring of telemetry data to identify anomalies with causal impact analysis that leverages dependency installation information from the PraxiPaaS software discovery tool. The central empirical claims are that anomaly detection achieves macro-F1 scores above 0.97 across 75 trials on four synthetic anomalies and that causal impact analysis reliably identifies the correct root cause even as package installation intervals shorten.

Significance. If the synthetic results generalize, the integration of telemetry-based detection with installation-tied causal inference could offer a scalable, automated alternative to expert-driven diagnosis in CI/CD environments, potentially shortening resolution times for SRE teams. The approach addresses a practical gap in attributing anomalies amid frequent rollouts, but its significance is currently constrained by the absence of real production validation.

major comments (3)
  1. [§5] §5 (Evaluation): The reported >0.97 macro-F1 is obtained exclusively on four synthetic anomalies across 75 trials; the manuscript must show that these anomalies reproduce the telemetry distributions, correlation structures, and failure cascades observed in real microservice production data, or the generalization claim for root-cause inference cannot be supported.
  2. [§4.2] §4.2 (Causal Impact Analysis): The root-cause claims rest on the assumption that PraxiPaaS dependency data supplies all relevant causal signals without unmeasured confounders (network jitter, resource contention, external load); no sensitivity analysis or confounder discussion is provided, which is load-bearing for the reliability statement under shortening installation intervals.
  3. [Abstract and §3] Abstract and §3 (Method): No description of the anomaly detection algorithm, choice of AI model, or baseline methods is supplied, preventing assessment of whether the macro-F1 result constitutes an advance or is reproducible.
minor comments (2)
  1. [Figures] Figure captions and axis labels in the evaluation plots should explicitly state the synthetic anomaly types and trial counts to improve clarity.
  2. [§5] The hyperparameter tuning analysis mentioned in the abstract would benefit from a dedicated table summarizing the selected values and their effect on F1.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and indicate the revisions we will make to strengthen the manuscript.

  1. Referee: [§5] §5 (Evaluation): The reported >0.97 macro-F1 is obtained exclusively on four synthetic anomalies across 75 trials; the manuscript must show that these anomalies reproduce the telemetry distributions, correlation structures, and failure cascades observed in real microservice production data, or the generalization claim for root-cause inference cannot be supported.

    Authors: We agree that stronger evidence of fidelity between synthetic and real telemetry would better support generalization. Our evaluation is deliberately scoped to synthetic data with known ground truth to enable repeatable, controlled experiments. In the revision we will expand §5 with an explicit description of the synthetic anomaly generation procedure, detailing how the four anomaly types were constructed to match reported distributions, correlations, and cascade patterns from public microservice traces and the literature. We will also add a dedicated limitations subsection that states the current results do not claim direct generalization to production environments and identifies real-world validation as necessary future work. revision: partial

  2. Referee: [§4.2] §4.2 (Causal Impact Analysis): The root-cause claims rest on the assumption that PraxiPaaS dependency data supplies all relevant causal signals without unmeasured confounders (network jitter, resource contention, external load); no sensitivity analysis or confounder discussion is provided, which is load-bearing for the reliability statement under shortening installation intervals.

    Authors: The concern about unmeasured confounders is well taken. We will revise §4.2 to include an explicit discussion of modeling assumptions and potential confounders. In addition, we will add a sensitivity analysis that injects controlled levels of network jitter, resource contention, and external load into the simulation while varying installation intervals, reporting how these factors affect causal-impact accuracy. This will directly test the robustness of the reliability claims. revision: yes

  3. Referee: [Abstract and §3] Abstract and §3 (Method): No description of the anomaly detection algorithm, choice of AI model, or baseline methods is supplied, preventing assessment of whether the macro-F1 result constitutes an advance or is reproducible.

    Authors: We apologize for the omission of these details. In the revised manuscript we will expand both the Abstract and §3 to describe the anomaly detection algorithm in full, specify the AI model and its hyperparameters, explain the rationale for the chosen approach, and present comparisons against standard baselines (statistical thresholding and alternative ML detectors). These additions will allow readers to evaluate the advance and reproduce the reported macro-F1 scores. revision: yes
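
The confounder concern in the second exchange above can be made concrete with a toy simulation. The sketch below is not the sensitivity analysis the authors promise. It assumes an installation with zero true effect and an unmeasured disturbance (for example, external load) that starts at the same moment, then shows how a naive before/after comparison attributes the disturbance to the installation. All names and magnitudes are hypothetical.

```python
from statistics import mean


def naive_effect(series, install_idx, horizon):
    """Difference between post- and pre-installation means."""
    pre = series[max(0, install_idx - horizon):install_idx]
    post = series[install_idx:install_idx + horizon]
    return mean(post) - mean(pre)


def simulate(true_effect, confounder_level, n=40, install_idx=20):
    """Flat 100 ms metric; after `install_idx` it shifts by the true
    installation effect plus an unmeasured confounder."""
    series = []
    for t in range(n):
        value = 100.0
        if t >= install_idx:
            value += true_effect + confounder_level
        series.append(value)
    return series


# Sweep the confounder while the installation itself does nothing.
bias_by_level = {}
for level in (0.0, 5.0, 10.0, 20.0):
    series = simulate(true_effect=0.0, confounder_level=level)
    bias_by_level[level] = naive_effect(series, install_idx=20, horizon=10)

print(bias_by_level)
```

In this setup the spurious effect grows one-for-one with the confounder, so a harmless installation would be blamed in proportion to the coincident load. A real sensitivity study would also vary timing and noise, which this sketch leaves out.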

Circularity Check

0 steps flagged

No circularity in derivation chain


The paper introduces Praxium as an empirical framework for anomaly detection via telemetry monitoring and root cause inference via causal impact analysis on installation events from PraxiPaaS. All reported results (>0.97 macro-F1 across 75 synthetic-anomaly trials and reliable root-cause inference) are presented as experimental outcomes from direct evaluation rather than as outputs of any mathematical derivation, equation, or first-principles reduction. No self-definitional loops, fitted parameters renamed as predictions, or load-bearing self-citations that collapse the central claims appear in the provided text. The approach therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract supplies no explicit free parameters, axioms, or invented entities; all technical details of the anomaly detection model and causal method are omitted.

pith-pipeline@v0.9.0 · 5820 in / 1181 out tokens · 40826 ms · 2026-05-21T09:58:03.590023+00:00 · methodology


