Agent Behavior Mining: Generative AI Agent Governance in Business Processes
Pith reviewed 2026-06-27 05:12 UTC · model grok-4.3
The pith
Generative AI agents in business processes become auditable when their reasoning traces are recorded as standard event logs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
An event data model that translates granular agent activities including reasoning traces, tool usage, and token costs into standardized process logs enables the direct application of process mining techniques, thereby making generative AI agent decision-making observable and traceable and addressing the invisible autonomy risk in AI-driven business processes.
What carries the argument
The event data model for agent activities that records reasoning traces, tool usage, and token costs as process events suitable for mining.
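As a rough illustration of what such a model might look like, the sketch below flattens agent steps into case/activity/timestamp rows of the kind process mining tools expect. The `AgentEvent` schema, its field names, and the sample order-to-cash steps are assumptions made for this example; the paper's actual model is not reproduced on this page.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class AgentEvent:
    """One agent step shaped like a process-mining event (hypothetical schema)."""
    case_id: str          # process instance, e.g. one order
    activity: str         # label used as the activity name in the log
    timestamp: datetime   # when the step happened
    agent: str            # which agent acted (the "resource")
    reasoning: str = ""   # reasoning trace text, kept verbatim
    tool: str = ""        # tool invoked, if any
    tokens: int = 0       # token cost of the step


def to_event_log(events):
    """Order events per case by time and flatten them to plain dict rows."""
    ordered = sorted(events, key=lambda e: (e.case_id, e.timestamp))
    rows = []
    for e in ordered:
        row = asdict(e)
        row["timestamp"] = e.timestamp.isoformat()
        rows.append(row)
    return rows


def at_minute(minute):
    """Helper for the illustrative timestamps below."""
    return datetime(2026, 1, 1, 9, minute, tzinfo=timezone.utc)


# Illustrative steps, deliberately out of order to show the sorting.
events = [
    AgentEvent("order-1", "Check credit", at_minute(5), "credit_agent",
               reasoning="Customer is new, so a credit check is required.",
               tool="credit_api", tokens=310),
    AgentEvent("order-1", "Receive order", at_minute(0), "intake_agent", tokens=120),
    AgentEvent("order-1", "Create invoice", at_minute(9), "billing_agent",
               tool="erp_invoice", tokens=205),
]

log = to_event_log(events)
```

Keeping the reasoning text and token count as ordinary event attributes means the rows can be exported to a standard log format without a separate store for agent internals.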
If this is right
- Managers can detect when generative AI agents deviate from company policies using standard process mining tools.
- The amount of operational variability introduced by the agents can be measured and compared across runs.
- Behavioral transparency through log inspection is treated as a necessary condition for establishing trust in the agents.
- The capacity to examine agent reasoning steps is positioned as a required governance feature for future AI-driven processes.
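The first two implications can be sketched in a few lines, assuming the log is a list of rows with `case_id`, `activity`, and `timestamp` keys. The precedence rule, the activity names, and the use of distinct trace variants as a variability measure are illustrative assumptions, not the paper's method.

```python
from collections import Counter, defaultdict

# Hypothetical policy: the first activity must occur before the second.
POLICY_PRECEDENCE = [("Check credit", "Approve order")]


def traces_by_case(log):
    """Group activities per case, ordered by timestamp."""
    traces = defaultdict(list)
    for row in sorted(log, key=lambda r: (r["case_id"], r["timestamp"])):
        traces[row["case_id"]].append(row["activity"])
    return dict(traces)


def policy_deviations(trace, rules=POLICY_PRECEDENCE):
    """Return the (before, after) rules that a single trace violates."""
    violations = []
    for before, after in rules:
        if after in trace and before not in trace[:trace.index(after)]:
            violations.append((before, after))
    return violations


def variant_counts(traces):
    """Count how often each distinct activity sequence occurs."""
    return Counter(tuple(t) for t in traces.values())


# Illustrative log; integer timestamps keep the example short.
log = [
    {"case_id": "o1", "activity": "Receive order", "timestamp": 1},
    {"case_id": "o1", "activity": "Check credit", "timestamp": 2},
    {"case_id": "o1", "activity": "Approve order", "timestamp": 3},
    {"case_id": "o2", "activity": "Receive order", "timestamp": 1},
    {"case_id": "o2", "activity": "Approve order", "timestamp": 2},
    {"case_id": "o3", "activity": "Receive order", "timestamp": 1},
    {"case_id": "o3", "activity": "Check credit", "timestamp": 2},
    {"case_id": "o3", "activity": "Approve order", "timestamp": 3},
]

traces = traces_by_case(log)
deviating = {c: policy_deviations(t) for c, t in traces.items() if policy_deviations(t)}
variants = variant_counts(traces)
```

Here case `o2` is flagged because the order was approved without a preceding credit check, and the number of distinct variants gives a crude variability figure that can be compared across runs.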
Where Pith is reading between the lines
- The same logging structure could be adapted to monitor autonomous agents in domains outside business processes, such as customer service or software development pipelines.
- Linking the logs to cost and outcome data might let organizations tune agent prompts or tools for better compliance without separate oversight systems.
- Auditors or regulators could one day treat the presence of such reasoning traces as evidence of due diligence in AI deployments.
Load-bearing premise
The event data model that records reasoning traces, tool usage, and token costs can be implemented inside real multi-agent business processes without losing essential decision information or creating excessive costs.
What would settle it
A live multi-agent business process deployment in which the captured logs omit key decision factors or require so much extra effort that teams stop using the model.
Original abstract
As organizations increasingly deploy generative AI agents to automate business processes, they face a governance dilemma: although these agents can increase operational flexibility, their non-deterministic nature challenges the control and standardization that Business Process Management seeks to enforce. This paper addresses this invisible autonomy risk by introducing Agent Behavior Mining, a governance capability that enables the application of process mining techniques to render generative AI agent decision-making observable and traceable. We (1) improve the understanding of generative AI agent behavior through an event data model that translates granular agent activities, including reasoning traces, tool usage, and token costs, into standardized process logs; (2) instantiate the data model in a multi-agent order-to-cash implementation, demonstrating how process managers can leverage agent logs to detect policy deviations and quantify operational variability; and (3) evaluate the perceived practical utility of the approach in an exploratory study with 18 industry practitioners. The results indicate that practitioners view behavioral transparency as a prerequisite for trust and consider the ability to examine agent reasoning as an important governance requirement for the next generation of AI-driven business processes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Agent Behavior Mining as a governance approach for generative AI agents in business processes. It defines an event data model that maps agent reasoning traces, tool usage, and token costs to standardized process logs; instantiates the model in a multi-agent order-to-cash system to enable deviation detection and variability quantification; and reports results from an exploratory study with 18 industry practitioners indicating that behavioral transparency is viewed as essential for trust and governance.
Significance. If the central claims hold, the work would offer a concrete mechanism for applying established process mining techniques to non-deterministic AI agents, addressing a practical gap in BPM governance. The practitioner evaluation, if methodologically sound, would provide initial evidence that transparency features align with industry needs. Strengths include the explicit linkage of agent internals to process logs and the focus on a real business process instantiation.
major comments (2)
- [Abstract] Abstract, points (1) and (2): the claim that the event data model enables detection of policy deviations and quantification of operational variability rests on the assumption that the mapping from raw reasoning traces and tool calls to logs preserves all decision-relevant information without truncation or prohibitive overhead. No fidelity metrics, information-loss analysis, or cost measurements are provided to substantiate this; if the model summarizes or filters traces, downstream mining may miss the very deviations the approach claims to surface.
- [Abstract] Abstract, point (3) and practitioner study description: the reported results on practitioner views of transparency as a prerequisite for trust are presented without any methods details, survey instrument, sampling procedure, response analysis, or quantitative breakdown. This absence prevents verification of the exploratory study's soundness and weakens its role as supporting evidence for the governance utility claim.
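The fidelity concern in the first comment could be probed with a simple comparison between raw agent steps and what ends up in the log. The sketch below is one possible way to do that; the `step_id` key, the three measures, and the sample data are assumptions for illustration and are not metrics reported by the paper.

```python
def trace_fidelity(raw_steps, logged_events):
    """Compare raw agent steps with logged events, matched by step id."""
    if not raw_steps:
        raise ValueError("raw_steps must not be empty")
    logged = {e["step_id"]: e for e in logged_events}
    # Raw steps that have a counterpart in the log at all.
    present = [s for s in raw_steps if s["step_id"] in logged]
    # Raw steps whose reasoning text was logged unchanged.
    intact = [s for s in present
              if logged[s["step_id"]]["reasoning"] == s["reasoning"]]
    raw_chars = sum(len(s["reasoning"]) for s in raw_steps)
    kept_chars = sum(len(logged[s["step_id"]]["reasoning"]) for s in present)
    return {
        "step_coverage": len(present) / len(raw_steps),
        "intact_share": len(intact) / len(raw_steps),
        "char_retention": kept_chars / raw_chars if raw_chars else 1.0,
    }


# Illustrative data: step 2 is truncated in the log and step 4 is missing.
raw = [
    {"step_id": 1, "reasoning": "a" * 10},
    {"step_id": 2, "reasoning": "b" * 10},
    {"step_id": 3, "reasoning": "c" * 10},
    {"step_id": 4, "reasoning": "d" * 10},
]
logged = [
    {"step_id": 1, "reasoning": "a" * 10},
    {"step_id": 2, "reasoning": "b" * 5},
    {"step_id": 3, "reasoning": "c" * 10},
]

fidelity = trace_fidelity(raw, logged)
```

Values below 1.0 on any of the three measures would indicate that steps were dropped or reasoning text was shortened before mining, which is the failure mode the comment describes.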
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on the abstract claims and the practitioner evaluation. We address each major comment below and will revise the manuscript accordingly to improve substantiation and transparency.
- Referee: [Abstract] Abstract, points (1) and (2): the claim that the event data model enables detection of policy deviations and quantification of operational variability rests on the assumption that the mapping from raw reasoning traces and tool calls to logs preserves all decision-relevant information without truncation or prohibitive overhead. No fidelity metrics, information-loss analysis, or cost measurements are provided to substantiate this; if the model summarizes or filters traces, downstream mining may miss the very deviations the approach claims to surface.
  Authors: We agree that the abstract does not include explicit fidelity metrics or overhead measurements. The full manuscript (Section 3) defines a direct mapping that retains all reasoning traces, tool calls, and costs without summarization or filtering, and Section 4 demonstrates deviation detection on the order-to-cash process. To strengthen the claim, we will add a dedicated analysis subsection quantifying information preservation (e.g., trace completeness) and token overhead in the revised version.
  Revision: yes
- Referee: [Abstract] Abstract, point (3) and practitioner study description: the reported results on practitioner views of transparency as a prerequisite for trust are presented without any methods details, survey instrument, sampling procedure, response analysis, or quantitative breakdown. This absence prevents verification of the exploratory study's soundness and weakens its role as supporting evidence for the governance utility claim.
  Authors: We acknowledge that the abstract and study description lack sufficient methodological detail. The exploratory study with 18 practitioners is reported in Section 5, but we agree more transparency is needed. In revision we will expand both the abstract and Section 5 to include the survey instrument, sampling approach, response rate, and quantitative breakdown of findings.
  Revision: yes
Circularity Check
No circularity: empirical application of process mining to AI agents with no derivations or self-referential reductions
The paper introduces Agent Behavior Mining as the application of existing process mining techniques to generative AI agent logs via an event data model. No equations, fitted parameters, predictions derived from inputs, or load-bearing self-citations appear in the abstract or described contributions. The three steps—defining the data model, instantiating it in an order-to-cash system, and running a practitioner study—are presented as independent empirical work without any reduction of outputs to inputs by construction. This matches the default expectation of a non-circular paper.