GraphMind: From Operational Traces to Self-Evolving Workflow Automation

Anna Pavlenko; Divya Vermareddy; Hannah Lerner; Hemkesh Vijaya Kumar; Joyce Cahoon; Katherine Lin; Mathieu Demarne; Meina Wang; Miso Cilimdzic; Nima Shahbazi

arxiv: 2605.17617 · v1 · pith:2JN7WILBnew · submitted 2026-05-17 · 💻 cs.AI

Yiwen Zhu, Joyce Cahoon, Anna Pavlenko, Qiushi Bai, Nima Shahbazi, Divya Vermareddy, Meina Wang, Mathieu Demarne, Swati Bararia, Wenjing Wang, Hemkesh Vijaya Kumar, Hannah Lerner, Katherine Lin, Steve Toscano, Miso Cilimdzic, Subru Krishnan

Pith reviewed 2026-05-20 12:02 UTC · model grok-4.3

classification 💻 cs.AI

keywords: workflow graphs, self-evolving automation, operational traces, multi-agent systems, incident investigation, adaptive reinforcement, cloud operations

The pith

GraphMind builds action-centric workflow graphs from past resolution traces and lets them evolve through execution feedback to automate incident investigation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents GraphMind as an end-to-end system that turns large volumes of human problem-solving records into structured graphs showing problems, actions, and causal links. An online engine then uses these graphs to guide multi-agent reasoning when handling new incidents, combining retrieval along graph paths with step-by-step decision making. A reinforcement component strengthens paths that succeed in real executions and weakens those that do not, creating a closed loop where the graph updates itself. This matters for operations that still depend on repeated human coordination because the approach removes the need for ongoing manual redesign of workflows as conditions change. The system has been applied to incident investigation in production cloud database services.
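To make the graph structure concrete, here is a minimal sketch of an action-centric workflow graph. The node types (domains, problems, actions) and edge types (CAUSES, RESOLVES, LEADS_TO) are the ones quoted from the paper further down this page; the class names, the per-edge weight, and the rule that repeated observations add weight are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple


class NodeType(Enum):
    DOMAIN = "domain"
    PROBLEM = "problem"
    ACTION = "action"


class EdgeType(Enum):
    CAUSES = "CAUSES"
    RESOLVES = "RESOLVES"
    LEADS_TO = "LEADS_TO"


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: NodeType
    text: str


@dataclass
class WorkflowGraph:
    """Hypothetical container: typed nodes plus weighted, typed edges."""
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: Dict[Tuple[str, str, EdgeType], float] = field(default_factory=dict)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: str, dst: str, edge_type: EdgeType,
                 weight: float = 1.0) -> None:
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before the edge")
        key = (src, dst, edge_type)
        # Assumption: seeing the same link in another trace adds to its weight.
        self.edges[key] = self.edges.get(key, 0.0) + weight

    def successors(self, src: str,
                   edge_type: Optional[EdgeType] = None) -> List[Tuple[str, float]]:
        """Return (destination, weight) pairs for edges leaving src."""
        return [(d, w) for (s, d, t), w in self.edges.items()
                if s == src and (edge_type is None or t == edge_type)]
```

Keeping the edge type in the key lets the same pair of nodes carry, for example, both a LEADS_TO and a RESOLVES link with separate weights.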

Core claim

GraphMind constructs, executes, and evolves action-centric workflow graphs without human effort through an offline extraction pipeline that builds graphs from resolution traces, an online multi-agent traversal engine that navigates and executes them, and an Adaptive Traversal Reinforcement layer that reinforces successful paths while decaying stale elements.
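The reinforce-and-decay idea can be sketched with a pheromone-style update in the spirit of Ant Colony Optimization, which the paper names as its inspiration. This page does not give the actual ATR update rule, so the formula below (multiplicative evaporation on every edge, an additive reward on edges of a successful path, pruning below a floor) and all parameter names are assumptions for illustration only.

```python
from typing import Dict, Hashable, Iterable


def atr_update(weights: Dict[Hashable, float],
               successful_path: Iterable[Hashable],
               decay: float = 0.1,
               reward: float = 1.0,
               floor: float = 0.01) -> Dict[Hashable, float]:
    """Return new edge weights after one feedback step (hypothetical rule).

    weights: edge -> current weight.
    successful_path: edges used by a traversal that ended in a successful outcome.
    """
    reinforced = set(successful_path)
    updated = {}
    for edge, w in weights.items():
        w = (1.0 - decay) * w        # every edge evaporates a little
        if edge in reinforced:
            w += reward              # edges on the successful path are reinforced
        if w >= floor:               # assumption: very stale edges are pruned
            updated[edge] = w
    return updated
```

With `decay=0.5` and `reward=1.0`, two edges that both start at weight 1.0 end at 1.5 (on the successful path) and 0.5 (off it), so repeated feedback steadily separates used paths from unused ones.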

What carries the argument

Action-centric workflow graphs, built offline from traces and traversed online by a multi-agent engine, with Adaptive Traversal Reinforcement updating the graph from execution results.

If this is right

Operational workflows can run with far less ongoing human input once the initial graph is built from traces.
Diagnostic performance improves over simple retrieval of similar past traces in reach, accuracy, and speed.
The graph adapts to shifting conditions through direct feedback from its own executions rather than external updates.
Production deployment across multiple services is feasible at the scale of real cloud operations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar trace-to-graph pipelines could be tried in other high-volume operational domains such as IT support tickets or supply-chain adjustments.
Over time the reinforced graph might capture patterns that human-written procedures miss because it draws directly from observed outcomes.
Testing the system on entirely novel problems outside the initial trace distribution would reveal how far the evolution mechanism extends coverage.

Load-bearing premise

The offline pipeline accurately extracts causal relationships and structured workflow graphs from human resolution traces without introducing significant noise or missing key context.

What would settle it

A clear drop in mitigation success or a rise in required human corrections when the system encounters new incident types absent from the original traces would show the extraction and evolution steps are not sufficient.

Figures

Figures reproduced from arXiv: 2605.17617 by Anna Pavlenko, Divya Vermareddy, Hannah Lerner, Hemkesh Vijaya Kumar, Joyce Cahoon, Katherine Lin, Mathieu Demarne, Meina Wang, Miso Cilimdzic, Nima Shahbazi, Qiushi Bai, Steve Toscano, Subru Krishnan, Swati Bararia, Wenjing Wang, Yiwen Zhu.

**Figure 2.** Incremental graph construction. As new opera…

**Figure 3.** Graph size under varying clustering thresholds for…

**Figure 5.** Impact of retrieval parameters (k_p, k_a) on online troubleshooting metrics. Best cell marked with ∗. [Plot panels: (a) cumulative edge synthesis by epoch (total edges, total synthesized); (b) Gini coefficient by epoch (edge, node).]

**Figure 6.** Reinforcement evolution over six epochs. (a) 289…

**Figure 7.** Reinforced subgraph evolution across six epochs. Node size and edge thickness are proportional to reinforcement…

**Figure 8.** Engagement and responsiveness across 62 produc…
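The figure captions report a Gini coefficient for edges and nodes across epochs. As a reference point, the standard Gini coefficient over non-negative values can be computed as below; which quantity the paper feeds into it (for example reinforcement weights or traversal counts) is not stated on this page, so the input here is an assumption.

```python
from typing import Iterable


def gini(values: Iterable[float]) -> float:
    """Gini coefficient of non-negative values: 0 is perfectly even,
    values near 1 mean a few items hold almost all of the mass."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n over sorted x
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n
```

A rising value over epochs would indicate that reinforcement is concentrating on a smaller share of edges or nodes.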

Original abstract

Complex operational workflows coordinating personnel, tools, and information are central to enterprise operations, yet end-to-end automation remains challenging due to extensive requirements for human inputs and the inability to adapt over time. We present GraphMind, an end-to-end system that constructs, executes, and evolves action-centric workflow graphs without human effort. The system operates in three phases. First, a scalable offline pipeline extracts structured workflow graphs from large volumes of human resolution traces, capturing problems, actions, and their causal relationships. Second, an online multi-agent traversal engine navigates the graph to dynamically construct and execute workflows, combining graph-guided retrieval with LLM-driven reasoning at each step. Third, Adaptive Traversal Reinforcement (ATR) reinforces successful traversal paths and decays stale elements. This closed-loop mechanism enables the graph to self-optimize and adapt to shifting operational conditions. GraphMind has been deployed across four production cloud database services for incident investigation. Evaluated on production data, the system substantially outperforms a Trace-RAG baseline in mitigation reach, groundedness, and diagnostic throughput, scoring 4.95/5 in blind expert review. The ATR layer provides further gains across all metrics, demonstrating that workflow graphs can learn and improve from execution-derived feedback.
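The graph-guided retrieval step can be illustrated with a two-stage lookup. The sketch assumes that the retrieval parameters k_p and k_a from Figure 5 mean "number of problem nodes retrieved" and "number of actions kept per problem"; that reading, the cosine-similarity ranking, and every name below are hypothetical and not taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_candidates(query_vec: Sequence[float],
                        problem_vecs: Dict[str, Sequence[float]],
                        resolves: Dict[str, Dict[str, float]],
                        k_p: int = 3,
                        k_a: int = 2) -> Dict[str, List[str]]:
    """Pick the k_p problems most similar to the query, then the k_a
    highest-weight actions attached to each of them.

    problem_vecs: problem id -> embedding vector.
    resolves: problem id -> {action id -> edge weight}.
    """
    ranked = sorted(problem_vecs,
                    key=lambda p: cosine(query_vec, problem_vecs[p]),
                    reverse=True)[:k_p]
    plan = {}
    for problem in ranked:
        actions = resolves.get(problem, {})
        plan[problem] = sorted(actions, key=actions.get, reverse=True)[:k_a]
    return plan
```

In a full system the returned candidates would be handed to the LLM-driven reasoning step, which decides what to execute next; that part is outside this sketch.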

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

GraphMind integrates trace extraction, multi-agent traversal, and ATR reinforcement into a deployed system for ops workflows, but the unverified extraction step underpins all the performance claims.

The main thing here is a practical system that pulls structured workflow graphs out of human resolution traces, then runs multi-agent LLM traversal on them for incident handling, with an Adaptive Traversal Reinforcement loop that updates the graph from execution feedback. It reports deployment across four production cloud database services and clear gains over a Trace-RAG baseline in mitigation reach, groundedness, and throughput, plus a 4.95/5 blind expert score.

Referee Report

2 major / 1 minor

Summary. The paper presents GraphMind, an end-to-end system that constructs, executes, and evolves action-centric workflow graphs from human resolution traces for automating complex operational workflows. It operates via three phases: an offline pipeline extracting structured graphs capturing problems, actions, and causal relationships; an online multi-agent traversal engine combining graph-guided retrieval with LLM reasoning; and Adaptive Traversal Reinforcement (ATR) for reinforcing successful paths and enabling self-optimization. The system is reported to be deployed across four production cloud database services for incident investigation, substantially outperforming a Trace-RAG baseline in mitigation reach, groundedness, and diagnostic throughput, with a 4.95/5 blind expert review score and further gains from ATR.

Significance. If the extraction accuracy and evaluation results hold, the work could meaningfully advance self-evolving automation in enterprise IT operations by minimizing human inputs and enabling adaptation to changing conditions. The production deployment across multiple services and the closed-loop ATR mechanism represent practical strengths with potential for broader impact in AI-driven workflow systems. However, the absence of methodological details currently limits the assessed significance.

major comments (2)

[Abstract] The abstract reports strong production results, outperformance over Trace-RAG, and a 4.95/5 expert score but provides no details on evaluation methodology, baselines, statistical significance, trace processing, or dataset scale. This is load-bearing for the central claims, as it prevents verification of whether the gains are robust or influenced by post-hoc choices.
[Offline pipeline] The system's foundation is the offline extraction of causal relationships and structured workflow graphs from human resolution traces. No quantitative metrics on extraction accuracy, error analysis, ablation studies on graph quality, or handling of noise/missing context are provided, which directly affects confidence in the online traversal, mitigation reach, and ATR-based evolution claims.

minor comments (1)

[Abstract] A high-level figure illustrating the three-phase flow (offline extraction to online traversal to ATR) would improve clarity of the system architecture.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the practical value of GraphMind's production deployment and closed-loop ATR mechanism. We address each major comment below and describe the revisions planned for the next version of the manuscript.

Referee: [Abstract] The abstract reports strong production results, outperformance over Trace-RAG, and a 4.95/5 expert score but provides no details on evaluation methodology, baselines, statistical significance, trace processing, or dataset scale. This is load-bearing for the central claims, as it prevents verification of whether the gains are robust or influenced by post-hoc choices.

Authors: We agree that the abstract, due to length constraints, presents results at a high level. The full manuscript contains the requested details on evaluation methodology, dataset scale, Trace-RAG baseline construction, and expert review protocol in the Experiments section. To improve accessibility, we will revise the abstract to incorporate a concise statement on evaluation setup, dataset size, and statistical reporting while preserving the high-level focus.

Revision: yes

Referee: [Offline pipeline] The system's foundation is the offline extraction of causal relationships and structured workflow graphs from human resolution traces. No quantitative metrics on extraction accuracy, error analysis, ablation studies on graph quality, or handling of noise/missing context are provided, which directly affects confidence in the online traversal, mitigation reach, and ATR-based evolution claims.

Authors: The referee is correct that the current description of the offline pipeline lacks quantitative validation. While the pipeline architecture and causal extraction process are detailed in the manuscript, we did not report accuracy metrics or ablations. In the revised manuscript we will add a dedicated evaluation subsection reporting precision/recall on a held-out annotated trace set, an error analysis for noise and missing context, and an ablation on graph quality's impact on downstream traversal performance.

Revision: yes
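The precision/recall evaluation promised in the rebuttal could, for edge extraction, take roughly the following shape. This is a sketch under the assumption that both the extracted graph and the annotated held-out set are represented as sets of (source, relation, target) triples with exact-match comparison; the paper's actual matching criteria are not described on this page.

```python
from typing import Iterable, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)


def edge_precision_recall(extracted: Iterable[Edge],
                          annotated: Iterable[Edge]) -> Tuple[float, float]:
    """Exact-match precision and recall of extracted edges against annotations."""
    extracted_set, annotated_set = set(extracted), set(annotated)
    true_positives = len(extracted_set & annotated_set)
    precision = true_positives / len(extracted_set) if extracted_set else 0.0
    recall = true_positives / len(annotated_set) if annotated_set else 0.0
    return precision, recall
```

Exact matching is strict for LLM-extracted text; a real evaluation would likely need normalization or clustering of node labels before comparing edges.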

Circularity Check

0 steps flagged

No significant circularity; derivation remains self-contained

The paper describes a three-phase pipeline: (1) offline extraction of structured workflow graphs from human resolution traces, (2) online multi-agent traversal combining graph-guided retrieval with LLM reasoning, and (3) ATR that reinforces successful paths and decays stale elements using execution-derived feedback. Evaluation metrics (mitigation reach, groundedness, diagnostic throughput, 4.95/5 expert review) are reported against a Trace-RAG baseline on production data from four deployed services. No equations, fitted parameters, or self-citations are presented that reduce any claimed prediction or result to the input traces by construction. The reinforcement loop explicitly draws from new execution outcomes rather than re-using the original trace data for both construction and scoring. The central claims therefore rest on externally observable deployment performance and comparative evaluation rather than tautological re-labeling of inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based on the abstract only, the central claim rests on two assumptions: that human resolution traces contain enough structured causal information for reliable graph extraction, and that LLM-driven reasoning at each traversal step produces grounded actions. No explicit free parameters or invented entities are named.

axioms (1)

Domain assumption: Human resolution traces contain extractable problems, actions, and causal relationships that form accurate workflow graphs.
Invoked in the first-phase description of the offline pipeline.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "Adaptive Traversal Reinforcement (ATR) reinforces successful traversal paths and decays stale elements... inspired by Ant Colony Optimization"

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: "workflow graph G=(V,E) with typed nodes (domains, problems, actions) and edges (CAUSES, RESOLVES, LEADS_TO)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

Anthropic. 2024. Model Context Protocol (MCP). https://modelcontextprotocol. io/. Open standard for connecting AI applications to external systems; en- ables structured, secure interaction between LLMs and data sources/tools via a universal protocol

work page 2024

[2] [2]

Anthropic. 2025. Claude Code. https://docs.anthropic.com/en/docs/claude-code. Agentic coding tool with built-in MCP client support

work page 2025

[3] [3]

Akari Asai, Zeqiu Wu, Yizhong Wang, Avirup Sil, and Hannaneh Hajishirzi. 2024. Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection. InProceedings of the International Conference on Learning Representations (ICLR)

work page 2024

[4] [4]

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Ok- sana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data. InAdvances in Neural Information Processing Systems (NeurIPS)

work page 2013

[5] [5]

Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al . 2020. Language Models are Few-Shot Learners. InAdvances in Neural Information Processing Systems (NeurIPS)

work page 2020

[6] [6]

Yinfang Chen, Huaibing Xie, Minghua Ma, Yu Kang, Xin Gao, Liu Shi, Yunjie Cao, Xuedong Gao, Hao Fan, Ming Wen, Jun Zeng, Supriyo Ghosh, Xuchao Zhang, Chaoyun Zhang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, and Tianyin Xu. 2024. Automatic Root Cause Analysis via Large Language Models for Cloud Incidents. InProceedings of the 19th European Conference o...

work page doi:10.1145/3627703.3629553 2024

[7] [7]

DeLong, Ramon Fernandez Mir, and Jacques D

Lara N. DeLong, Ramon Fernandez Mir, and Jacques D. Fleuriot. 2024. Neurosym- bolic AI for Reasoning Over Knowledge Graphs: A Survey.IEEE Transactions on Neural Networks and Learning Systems(2024). https://doi.org/10.1109/TNNLS. 2024.3420218

work page doi:10.1109/tnnls 2024

[8] [8]

Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. Convolutional 2D Knowledge Graph Embeddings. InProceedings of the AAAI Conference on Artificial Intelligence

work page 2018

[9] [9]

Patrizia d’Ettorre and Alain Lenoir. 2010. Nestmate Recognition. InAnt Ecology, Lori Lach, Catherine L. Parr, and Kirsti L. Abbott (Eds.). Oxford University Press, 194–209

work page 2010

[10] [10]

Gianni Di Caro and Marco Dorigo. 1998. AntNet: distributed stigmergetic control for communications networks.Journal of Artificial Intelligence Research9 (1998), 317–365

work page 1998

[11] [11]

Marco Dorigo and Luca Maria Gambardella. 1997. Ant colony system: a coopera- tive learning approach to the traveling salesman problem. InIEEE Transactions on Evolutionary Computation, Vol. 1. 53–66. https://doi.org/10.1109/4235.585892

work page doi:10.1109/4235.585892 1997

[12] [12]

Marco Dorigo, Vittorio Maniezzo, and Alberto Colorni. 1996. Ant system: optimization by a colony of cooperating agents.IEEE Transactions on Sys- tems, Man, and Cybernetics, Part B (Cybernetics)26, 1 (1996), 29–41. https: //doi.org/10.1109/3477.484436

work page doi:10.1109/3477.484436 1996

[13] [13]

2004.Ant Colony Optimization

Marco Dorigo and Thomas Stützle. 2004.Ant Colony Optimization. MIT Press

work page 2004

[14] [14]

Darren Edge, Ha Trinh, Newman Cheng, Joshua Bradley, Alex Chao, Apurva Mody, Steven Truitt, Dasha Metropolitansky, Robert Osazuwa Ness, and Jonathan Larson. 2025. From Local to Global: A Graph RAG Approach to Query-Focused Summarization. arXiv:2404.16130 [cs.CL] https://arxiv.org/abs/2404.16130

work page internal anchor Pith review Pith/arXiv arXiv 2025

[15] [15]

Corrado Gini. 1921. Measurement of inequality of incomes.The economic journal 31, 121 (1921), 124–125

work page 1921

[16] [16]

GitHub. 2026. GitHub Copilot CLI. https://github.blog/changelog/2026-02-25- github-copilot-cli-is-now-generally-available/. MCP-compatible agentic coding assistant for the command line

work page 2026

[17] [17]

Retrieval-Augmented Generation with Graphs (GraphRAG)

Haoyu Han, Yu Wang, Harry Shomer, Kai Guo, Jiayuan Ding, Yongjia Lei, Mahantesh Halappanavar, Ryan A. Rossi, Subhabrata Mukherjee, et al . 2025. Retrieval-Augmented Generation with Graphs (GraphRAG).arXiv preprint arXiv:2501.00309(2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[18]

Stephen C. Johnson. 1967. Hierarchical Clustering Schemes. Psychometrika 32, 3 (Sept. 1967), 241–254. https://doi.org/10.1007/BF02289588

[19]

Liubov Kovriguina, Irina Toma, et al. 2024. LLM-based SPARQL Query Generation from Natural Language over Federated Knowledge Graphs. In Proceedings of the International Semantic Web Conference (ISWC). https://arxiv.org/abs/2410.06062

[20]

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. 2020. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. In Advances in Neural Information Processing Systems (NeurIPS)

[22]

Daniel Merkle, Martin Middendorf, and Hartmut Schmeck. 2002. Ant colony optimization for resource-constrained project scheduling. In IEEE Transactions on Evolutionary Computation, Vol. 6. 333–346

[23]

OpenAI. 2023. Function Calling. https://platform.openai.com/docs/guides/function-calling. Accessed: 2025-05-15

[24]

OpenAI. 2023. GPT-4 Technical Report. https://cdn.openai.com/papers/gpt-4.pdf

[25]

Shirui Pan, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, and Xindong Wu. 2024. Unifying Large Language Models and Knowledge Graphs: A Roadmap. IEEE Transactions on Knowledge and Data Engineering 36, 7 (2024), 3580–3601. https://doi.org/10.1109/TKDE.2024.3352100

[27]

Kartik Ravichandran, Namrata Gurumurthy Kumar, Prateek Mishra, and Rahul Agrawal. 2025. OG-RAG: Ontology-Grounded Retrieval-Augmented Generation for Large Language Models. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP). https://doi.org/10.18653/v1/2025.emnlp-main.1674

[28]

Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Eric Hambro, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. 2023. Toolformer: Language Models Can Teach Themselves to Use Tools. In Advances in Neural Information Processing Systems (NeurIPS)

[29]

Mili Shah, Joyce Cahoon, Mirco Milletari, Jing Tian, Fotis Psallidas, Andreas Mueller, and Nick Litombe. 2024. Improving LLM-based KGQA for multi-hop Question Answering with implicit reasoning in few-shot examples. In Proceedings of the 1st Workshop on Knowledge Graphs and Large Language Models (KaLLM 2024). Association for Computational Linguistics, Bangk... https://doi.org/10.18653/v1/2024.kallm-1.13

[30]

Aditi Singh, Abul Ehtesham, Saket Kumar, and Tala Talaei Khoei. 2025. Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG. arXiv preprint arXiv:2501.09136 (2025)

[31]

Vikramank Singh, Kapil Eknath Vaidya, Vinayshekhar Bannihatti Kumar, Sopan Khosla, Murali Narayanaswamy, Rashmi Gangadharaiah, and Tim Kraska. 2024. Panda: Performance Debugging for Databases using LLM Agents. In Proceedings of the Conference on Innovative Data Systems Research (CIDR)

[32]

Thomas Stützle and Holger H. Hoos. 2000. MAX-MIN Ant System. Future Generation Computer Systems 16, 8 (2000), 889–914. https://doi.org/10.1016/S0167-739X(00)00043-1

[33]

Sina Tabakhi, Parham Moradi, and Fardin Akhlaghian. 2014. An unsupervised feature selection algorithm based on ant colony optimization. Engineering Applications of Artificial Intelligence 32 (2014), 112–123

[34]

Wil M. P. van der Aalst. 2011. Process Mining: Discovery, Conformance and Enhancement of Business Processes. Springer. https://doi.org/10.1007/978-3-642-19345-3

[35]

Wil M. P. van der Aalst, Ton Weijters, and Laura Maruster. 2004. Workflow Mining: Discovering Process Models from Event Logs. IEEE Transactions on Knowledge and Data Engineering 16, 9 (2004), 1128–1142. https://doi.org/10.1109/TKDE.2004.47

[36]

Jelle S. van Zweden and Patrizia d’Ettorre. 2010. Nestmate Recognition in Social Insects and the Role of Hydrocarbons. In Insect Hydrocarbons: Biology, Biochemistry, and Chemical Ecology, Gary J. Blomquist and Anne-Geneviève Bagnères (Eds.). Cambridge University Press, 222–243

[37]

Lei Wang, Chen Ma, Xueyang Feng, Zeyu Zhang, Hao Yang, Jingsen Zhang, Zhiyuan Chen, Jiakai Tang, Xu Chen, Yankai Lin, et al. 2024. A survey on large language model based autonomous agents. Frontiers of Computer Science 18, 6 (2024), 186345. https://doi.org/10.1007/s11704-024-40231-1

[38]

Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge Graph Embedding by Translating on Hyperplanes. In Proceedings of the AAAI Conference on Artificial Intelligence

[39]

Xinyi Xia, Yijun Zhu, Jianan Guo, et al. 2025. Knowledge Graph Finetuning Enhances Knowledge Manipulation in Large Language Models. In Proceedings of the International Conference on Learning Representations (ICLR). https://openreview.net/forum?id=oMFOKjwaRS

[40]

Yuqing Xue, Zhuoran Wang, Wei Sun, Fanxin Meng, Wenchao Zhang, Zhenyu Li, et al. 2025. TRIANGLE: A Benchmark and Framework for Automated Incident Triage in Large-Scale Cloud Systems. In Proceedings of the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). https://netman.aiops.org/wp-conte...

[41]

Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2023. ReAct: Synergizing Reasoning and Acting in Language Models. In Proceedings of the International Conference on Learning Representations (ICLR)

[42]

William Zhang, Yiwen Zhu, Yunlei Lu, Mathieu Demarne, Wenjing Wang, Kai Deng, Nutan Sahoo, Katherine Lin, Miso Cilimdzic, and Subru Krishnan. 2025. FLAIR: Feedback Learning for Adaptive Information Retrieval. In Proceedings of the 34th ACM International Conference on Information and Knowledge Management (CIKM ’25). Association for Computing Machinery, Seou... https://doi.org/10.1145/3746252.3761553

[43]

Xuanhe Zhou, Guoliang Li, Zhaoyan Liu, et al. 2024. D-Bot: Database Diagnosis System using Large Language Models. Proceedings of the VLDB Endowment 17, 10 (2024), 2514–2527. https://doi.org/10.14778/3675034.3675043

[44]

Yiwen Zhu, Mathieu Demarne, Kai Deng, Wenjing Wang, Nutan Sahoo, Divya Vermareddy, Hannah Lerner, Yunlei Lu, Swati Bararia, Anjali Bhavan, William Zhang, Xia Li, Katherine Lin, Miso Cilimdzic, and Subru Krishnan. 2025. DECO: Life-Cycle Management of Enterprise-Grade Copilots. https://doi.org/10.1145/3770854.3783949 arXiv:2412.06099 [cs.SE]