Gala: Can graph-augmented large language model agentic workflows elevate root cause analysis?
2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background 1
Years: 2026 (2)
Verdicts: unverdicted (2)
Roles: background (1)
Polarities: background (1)
citing papers explorer
- SREGym: A Live Benchmark for AI SRE Agents with High-Fidelity Failure Scenarios
  SREGym is a modular, open-source live benchmark with 90 high-fidelity SRE failure scenarios built on real cloud stacks for evaluating AI agents on diagnosis and mitigation tasks.
- Towards In-Depth Root Cause Localization for Microservices with Multi-Agent Recursion-of-Thought
  RCLAgent uses multi-agent recursion-of-thought with parallel reasoning on trace graphs to outperform prior methods in root cause localization accuracy and efficiency for microservice systems.