
arxiv: 2604.14687 · v1 · submitted 2026-04-16 · 💻 cs.AI

M2-PALE: A Framework for Explaining Multi-Agent MCTS--Minimax Hybrids via Process Mining and LLMs

Pith reviewed 2026-05-10 11:39 UTC · model grok-4.3

classification 💻 cs.AI
keywords MCTS · Minimax · process mining · large language models · explainable AI · multi-agent systems · checkers · hybrid search

The pith

M2-PALE extracts process models from MCTS-Minimax hybrid traces and uses LLMs to generate causal explanations of their decisions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces M2-PALE as a way to interpret the behavior of multi-agent MCTS agents that have been augmented with shallow full-width Minimax search in the rollout phase. Process mining algorithms extract behavioral workflows directly from the agents' execution traces, after which LLMs convert those models into human-readable causal and distal accounts of why particular moves are chosen. A sympathetic reader would care because standard MCTS can miss key moves and fall into traps, while the hybrid version adds depth but makes the resulting trees even harder to follow. The demonstration occurs in a small-scale checkers environment and is positioned as a foundation that can extend to more complex domains.
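As a rough illustration of what "shallow full-width Minimax search in the rollout phase" can mean, the sketch below replaces purely random rollout moves with a depth-limited minimax choice. It is a minimal sketch under stated assumptions, not the authors' implementation: the game is a toy single-heap Nim rather than checkers, only the rollout step of MCTS is shown, and the names `legal_moves`, `minimax`, `rollout_move`, and `rollout` are hypothetical.

```python
import random

# Toy game (an assumption for illustration, not the paper's checkers setup):
# one heap of stones, a move removes 1-3 stones, whoever takes the last stone wins.

def legal_moves(heap):
    """All moves available from a heap of the given size."""
    return [m for m in (1, 2, 3) if m <= heap]

def minimax(heap, depth):
    """Value for the player to move: +1 proven win, -1 proven loss, 0 unknown at the cutoff."""
    if heap == 0:
        return -1  # the previous player took the last stone, so the mover has lost
    if depth == 0:
        return 0   # search horizon reached without a proven result
    # Full-width: every legal move is examined, none are sampled away.
    return max(-minimax(heap - m, depth - 1) for m in legal_moves(heap))

def rollout_move(heap, depth, rng):
    """Choose a rollout move by shallow minimax; break ties among best moves at random."""
    scored = [(-minimax(heap - m, depth - 1), m) for m in legal_moves(heap)]
    best = max(value for value, _ in scored)
    return rng.choice([m for value, m in scored if value == best])

def rollout(heap, depth, rng):
    """Play to the end; return +1 if the player to move at the start wins, else -1."""
    player = 0
    while True:
        heap -= rollout_move(heap, depth, rng)
        if heap == 0:
            return 1 if player == 0 else -1
        player = 1 - player

if __name__ == "__main__":
    rng = random.Random(0)
    print(rollout(5, depth=2, rng=rng))
```

With `depth=2` the rollout player sees both its own immediate wins and moves that hand the opponent an immediate win, which is the kind of tactical trap a purely random rollout tends to miss.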

Core claim

The authors establish that process models discovered by Alpha Miner, iDHM, and Inductive Miner from hybrid agent traces can be synthesized by LLMs into accurate, human-readable explanations that capture both the immediate causes and longer-term strategic intent behind the agent's choices.

What carries the argument

M2-PALE framework, which applies Alpha Miner, iDHM, and Inductive Miner to agent execution traces and then uses LLMs to synthesize the resulting process models into causal and distal explanations.
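To make the mining step more concrete, the sketch below computes the directly-follows relation and the footprint ordering relations from which the Alpha Miner starts. It covers only that first step, not Petri-net construction, iDHM, or the Inductive Miner, and the activity labels in the example log are hypothetical rather than taken from the paper's traces.

```python
from collections import Counter

def directly_follows(log):
    """Count how often activity a is immediately followed by activity b in any trace."""
    counts = Counter()
    for trace in log:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts

def footprint(log):
    """Classify every ordered activity pair as causal (->), reverse causal (<-),
    parallel (||), or unrelated (#), based on the directly-follows relation."""
    df = set(directly_follows(log))
    activities = sorted({a for trace in log for a in trace})
    relation = {}
    for a in activities:
        for b in activities:
            forward, backward = (a, b) in df, (b, a) in df
            if forward and backward:
                relation[(a, b)] = "||"
            elif forward:
                relation[(a, b)] = "->"
            elif backward:
                relation[(a, b)] = "<-"
            else:
                relation[(a, b)] = "#"
    return relation

if __name__ == "__main__":
    # Hypothetical traces: each inner list is one episode's sequence of moves.
    example_log = [
        ["red:p1:left-up", "white:p2:right-up", "red:p3:left-down", "white:p1:capture"],
        ["red:p1:left-up", "red:p3:left-down", "white:p2:right-up", "white:p1:capture"],
    ]
    for pair, symbol in footprint(example_log).items():
        if symbol == "->":
            print(pair[0], "->", pair[1])
```

A causal pair here only says that one move was observed directly before another and never the reverse; whether that ordering reflects the agent's decision logic is exactly the premise the page flags as load-bearing.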

If this is right

  • Developers can identify tactical weaknesses such as omitted critical moves that standard MCTS would miss.
  • The hybrid agents become more interpretable in domains where strategic depth matters.
  • Explanations scale with domain complexity because the mining and synthesis steps operate on traces rather than on the full search tree.
  • Users receive concrete causal reasons for agent behavior instead of opaque tree statistics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same pipeline could be tested on larger checkers boards or other perfect-information games to measure how explanation quality changes with state-space size.
  • If the explanations prove reliable, they could be fed back into agent design to automatically adjust Minimax depth or MCTS selection policies.
  • Real-time versions of the framework might support human-AI team play by surfacing explanations during ongoing games.

Load-bearing premise

The mined process models, once interpreted by LLMs, faithfully describe the agent's actual decision logic rather than producing plausible but inaccurate summaries.

What would settle it

A direct comparison on held-out game positions: if the moves that the LLM-generated explanations predict, with the process model consulted, contradict the choices the hybrid agent actually makes, the explanations do not describe the agent's decision logic.
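One simple way to run such a comparison is an agreement rate between explained and actual moves. The sketch below is an assumption about how the check could be scored, not a metric reported in the paper; `fidelity_report` and the move labels are hypothetical.

```python
def fidelity_report(explained_moves, actual_moves):
    """Compare the move each explanation implies with the move the agent actually chose.

    Returns (agreement_rate, mismatch_indices) over aligned held-out positions.
    """
    if len(explained_moves) != len(actual_moves):
        raise ValueError("both lists must cover the same held-out positions")
    if not actual_moves:
        raise ValueError("at least one position is required")
    mismatches = [
        index
        for index, (explained, actual) in enumerate(zip(explained_moves, actual_moves))
        if explained != actual
    ]
    agreement = 1 - len(mismatches) / len(actual_moves)
    return agreement, mismatches

if __name__ == "__main__":
    # Hypothetical labels; real inputs would come from logged games.
    explained = ["p1:left-up", "p2:left-up", "p3:left-down", "p2:right-up"]
    actual = ["p1:left-up", "p2:left-up", "p1:right-up", "p2:right-up"]
    rate, where = fidelity_report(explained, actual)
    print(f"agreement={rate:.2f}, mismatches at positions {where}")
```

The mismatch indices matter as much as the rate, since they point to the positions where an error analysis would start.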

Figures

Figures reproduced from arXiv: 2604.14687 by Liyuan Zhao, Tim Miller, Yiyu Qian.

Figure 1: Four quality dimensions in process mining [10]
Figure 2: One iteration of the general MCTS approach [9]
Figure 3: Overview of methodology. In this study, we apply this approach to the domain of checkers, a non-cooperative board game where two players compete to capture opponent pieces or exhaust their available moves [22]. The implementation details are as follows
Figure 4: Visual plots showing fitness metrics across all experimental models (5–…
Figure 5: Pruning operation
Figure 6: Simplified red agent Petri-net (10 episodes) generated by inductive miner…
Figure 7: Simplified white agent Petri-net (10 episodes) generated by inductive…
Figure 8: Red agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 9: Red agent Petri-net generated by inductive miner algorithm (Fixed Sim…
Figure 10: White agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 11: White agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 12: Red agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 13: Red agent Petri-net generated by inductive miner algorithm (Fixed Sim…
Figure 14: White agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 15: White agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 16: Red agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 17: Red agent Petri-net generated by inductive miner algorithm (Fixed Sim…
Figure 18: White Agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 19: White Agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 20: Red Agent C-net generated by iDHM (Fixed Iteration Times, Fixed…
Figure 21: Red Agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 22: White Agent C-net generated by iDHM (Fixed Iteration Times, Fixed…
Figure 23: White Agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 24: Red Agent C-net generated by iDHM (Fixed Iteration Times, Fixed…
Figure 25: Red Agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 26: White Agent C-net generated by iDHM (Fixed Iteration Times, Fixed…
Figure 27: White Agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 28: Red agent C-net generated by iDHM (Fixed Iteration Times, Fixed…
Figure 29: Red agent Petri-net generated by inductive miner algorithm (Fixed It…
Figure 30: White agent C-net generated by iDHM (Fixed Iteration Times, Fixed…
Figure 31: White agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 32: Red agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 33: Red agent Petri-net generated by inductive miner algorithm (Fixed Sim…
Figure 34: White agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 35: White agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 36: Red agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 37: Red agent Petri-net generated by inductive miner algorithm (Fixed Sim…
Figure 38: White agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 39: White agent Petri-net generated by inductive miner algorithm (Fixed…
Figure 40: Red agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 41: Red agent Petri-net generated by inductive miner algorithm (Fixed Sim…
Figure 42: White agent C-net generated by iDHM (Fixed Simulation Depth, Fixed…
Figure 43: White agent Petri-net generated by inductive miner algorithm (Fixed…
Original abstract

Monte-Carlo Tree Search (MCTS) is a fundamental sampling-based search algorithm widely used for online planning in sequential decision-making domains. Despite its success in driving recent advances in artificial intelligence, understanding the behavior of MCTS agents remains a challenge for both developers and users. This difficulty stems from the complex search trees produced through the simulation of numerous future states and their intricate relationships. A known weakness of standard MCTS is its reliance on highly selective tree construction, which may lead to the omission of crucial moves and a vulnerability to tactical traps. To resolve this, we incorporate shallow, full-width Minimax search into the rollout phase of multi-agent MCTS to enhance strategic depth. Furthermore, to demystify the resulting decision-making logic, we introduce M2-PALE (MCTS--Minimax Process-Aided Linguistic Explanations). This framework employs process mining techniques, specifically the Alpha Miner, iDHM, and Inductive Miner algorithms, to extract underlying behavioral workflows from agent execution traces. These process models are then synthesized by LLMs to generate human-readable causal and distal explanations. We demonstrate the efficacy of our approach in a small-scale checkers environment, establishing a scalable foundation for interpreting hybrid agents in increasingly complex strategic domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes M2-PALE, a framework for explaining the behavior of multi-agent MCTS-Minimax hybrid agents. It extracts process models from agent execution traces using the Alpha Miner, iDHM, and Inductive Miner algorithms, then synthesizes these models with LLMs to produce human-readable causal and distal explanations. The approach is demonstrated via a small-scale checkers environment to address limitations in standard MCTS such as selective tree construction and vulnerability to tactical traps.

Significance. If the generated explanations prove faithful to the underlying agent's search biases and value estimates, the framework could provide a useful bridge between process mining and LLM-based interpretability for complex strategic agents. The hybrid MCTS-Minimax design itself is a straightforward engineering response to known MCTS weaknesses, but the paper's contribution rests on an unverified demonstration rather than quantitative validation or theoretical guarantees.

major comments (2)
  1. [Abstract] Abstract and demonstration section: the claim that M2-PALE 'demonstrate[s] the efficacy of our approach' in checkers supplies no quantitative metrics for explanation fidelity, no comparison against ground-truth minimax/MCTS value estimates or move preferences, no human or automated faithfulness evaluation, and no error analysis. This leaves the central claim that the LLM-synthesized process models yield accurate causal and distal explanations unsupported by evidence.
  2. [Framework description] The weakest assumption—that Alpha Miner/iDHM/Inductive Miner models, when fed to an LLM, recover the agent's true decision logic rather than plausible but unfaithful summaries—is stated but never tested against the hybrid agent's internal state (e.g., rollout values or tree statistics). Without such a test the framework cannot be distinguished from post-hoc rationalization.
minor comments (2)
  1. [§3] The expansion of the acronym M2-PALE is given clearly, but the manuscript would benefit from an explicit statement of the input/output format of the process-mining step (event logs, activity labels, etc.) to allow replication.
  2. [§2] Notation for the hybrid agent (MCTS rollout phase augmented by shallow Minimax) is introduced without a diagram or pseudocode; a small figure would clarify the integration point.
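On the first minor comment, process-mining tools generally expect an event log in which each event carries at least a case identifier, an activity label, and a timestamp. The sketch below shows one plausible way to flatten per-episode move lists into that shape. The column names, the activity label format, and the synthetic one-second timestamps are assumptions for illustration, not the format the manuscript uses.

```python
import csv
import io
from datetime import datetime, timedelta

def episodes_to_event_log(episodes, start=datetime(2026, 1, 1)):
    """Flatten episodes into event rows of (case_id, activity, timestamp).

    Each episode is a list of (agent, piece, direction) tuples; one episode is one case.
    Timestamps are synthetic and only preserve move order within a case.
    """
    rows = []
    for case_number, moves in enumerate(episodes, start=1):
        for step, (agent, piece, direction) in enumerate(moves):
            rows.append({
                "case_id": f"episode_{case_number}",
                "activity": f"{agent}:piece{piece}:{direction}",
                "timestamp": (start + timedelta(seconds=step)).isoformat(),
            })
    return rows

def to_csv(rows):
    """Serialize event rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["case_id", "activity", "timestamp"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

if __name__ == "__main__":
    # Hypothetical episode with two moves.
    episodes = [[("red", 1, "left-up"), ("white", 2, "right-up")]]
    print(to_csv(episodes_to_event_log(episodes)))
```

Stating the equivalent mapping explicitly in the manuscript (what counts as a case, what counts as an activity) would be enough for a reader to rebuild the logs that feed the miners.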

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below and describe the revisions we will undertake to strengthen the work.

  1. Referee: [Abstract] Abstract and demonstration section: the claim that M2-PALE 'demonstrate[s] the efficacy of our approach' in checkers supplies no quantitative metrics for explanation fidelity, no comparison against ground-truth minimax/MCTS value estimates or move preferences, no human or automated faithfulness evaluation, and no error analysis. This leaves the central claim that the LLM-synthesized process models yield accurate causal and distal explanations unsupported by evidence.

    Authors: We agree that the current demonstration is illustrative rather than supported by quantitative evidence. The manuscript shows example explanations generated by M2-PALE in the small-scale checkers setting but does not report fidelity metrics, comparisons to ground-truth agent values, or formal evaluations. In the revised manuscript we will add a dedicated evaluation subsection that includes (i) quantitative alignment scores between LLM-generated causal explanations and the hybrid agent's rollout values and move preferences, (ii) a basic error analysis of cases where explanations diverge from internal search statistics, and (iii) a small-scale automated faithfulness check. We will also moderate the abstract claim from 'demonstrate the efficacy' to 'illustrate the framework and provide initial evidence of its utility' pending these additions. revision: yes

  2. Referee: [Framework description] The weakest assumption—that Alpha Miner/iDHM/Inductive Miner models, when fed to an LLM, recover the agent's true decision logic rather than plausible but unfaithful summaries—is stated but never tested against the hybrid agent's internal state (e.g., rollout values or tree statistics). Without such a test the framework cannot be distinguished from post-hoc rationalization.

    Authors: We accept this critique. The manuscript presents the process-mining-plus-LLM pipeline as recovering decision logic but does not validate the resulting models or explanations against the agent's internal MCTS tree statistics or Minimax value estimates. In revision we will insert an explicit validation step within the checkers case study: we will extract the process models, generate LLM explanations, and then measure their consistency with logged rollout values, visit counts, and final move selections produced by the hybrid agent. This comparison will be reported quantitatively and will help differentiate the approach from post-hoc rationalization. revision: yes

Circularity Check

0 steps flagged

No circularity: methodological proposal without derivations or self-referential predictions


The paper introduces M2-PALE as a framework combining process mining (Alpha Miner, iDHM, Inductive Miner) with LLMs to generate explanations from agent traces in a hybrid MCTS-Minimax setting. No equations, fitted parameters, or predictive claims appear in the provided text. The demonstration is limited to a small-scale checkers environment with no internal reduction of results to inputs by construction. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claim rests on external validation of explanation fidelity rather than any self-definitional or fitted-input structure.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on the domain assumption that process-mining algorithms applied to agent traces will surface decision-relevant workflows and that LLMs can translate those models into faithful explanations; no free parameters or new entities are introduced in the abstract.

axioms (2)
  • domain assumption: Process mining algorithms (Alpha Miner, iDHM, Inductive Miner) can extract meaningful behavioral workflows from MCTS-Minimax execution traces.
    Invoked when the authors state that these algorithms are used to extract underlying behavioral workflows.
  • domain assumption: LLMs can synthesize the extracted process models into accurate causal and distal explanations.
    Invoked in the description of how explanations are generated.

pith-pipeline@v0.9.0 · 5532 in / 1410 out tokens · 30381 ms · 2026-05-10T11:39:24.752826+00:00 · methodology


Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

  [1] Van der Aalst, W., Adriansyah, A., van Dongen, B.: Replaying history on process models for conformance checking and performance analysis. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2(2), 182–192 (2012)
  [2] van der Aalst, W.M.: Mediating between modeled and observed behavior: The quest for the “right” process: keynote. In: IEEE 7th International Conference on Research Challenges in Information Science (RCIS). pp. 1–12. IEEE (2013)
  [3] An, Z., Baier, H., Dubey, A., Mukhopadhyay, A., Ma, M.: Enabling mcts explainability for sequential planning through computation tree logic. arXiv preprint arXiv:2407.10820 (2024)
  [4] Arrieta, A.B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., García, S., Gil-López, S., Molina, D., Benjamins, R., et al.: Explainable artificial intelligence (xai): Concepts, taxonomies, opportunities and challenges toward responsible ai. Information fusion 58, 82–115 (2020)
  [5] Baier, H., Winands, M.H.: Mcts-minimax hybrids. IEEE Transactions on Computational Intelligence and AI in Games 7(2), 167–179 (2014)
  [6] Beazley, D.: Understanding the python gil. In: PyCON Python Conference. Atlanta, Georgia (2010)
  [7] Bhatt, U., Xiang, A., Sharma, S., Weller, A., Taly, A., Jia, Y., Ghosh, J., Puri, R., Moura, J.M., Eckersley, P.: Explainable machine learning in deployment. In: Proceedings of the 2020 conference on fairness, accountability, and transparency. pp. 648–657 (2020)
  [8] Bilal, A., Ebert, D., Lin, B.: Llms for explainable ai: A comprehensive survey. arXiv preprint arXiv:2504.00125 (2025)
  [9] Browne, C.B., Powley, E., Whitehouse, D., Lucas, S.M., Cowling, P.I., Rohlfshagen, P., Tavener, S., Perez, D., Samothrakis, S., Colton, S.: A survey of monte carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in games 4(1), 1–43 (2012)
  [10] Buijs, J.C., van Dongen, B.F., van der Aalst, W.M.: Quality dimensions in process discovery: The importance of fitness, precision, generalization and simplicity. International Journal of Cooperative Information Systems 23(01), 1440001 (2014)
  [11] Bustin, R., Goldman, C.V.: Structure and reduction of mcts for explainable-ai. arXiv preprint arXiv:2408.05488 (2024)
  [12] Chaslot, G., Bakkes, S., Szita, I., Spronck, P.: Monte-carlo tree search: A new framework for game ai. In: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment. vol. 4, pp. 216–217 (2008)
  [13] Cheng, Z., Yu, J., Xing, X.: A survey on explainable deep reinforcement learning. arXiv preprint arXiv:2502.06869 (2025)
  [14] Gao, Z., Niu, B., He, X., Xu, H., Liu, H., Liu, A., Hu, X., Wen, L.: Interpretable contrastive monte carlo tree search reasoning (2024). URL https://arxiv.org/abs/2410.01707
  [15] Gerlach, Y., Seeliger, A., Nolle, T., Mühlhäuser, M.: Inferring a multi-perspective likelihood graph from black-box next event predictors. In: International Conference on Advanced Information Systems Engineering. pp. 19–35. Springer (2022)
  [16] Ghawi, R.: Process discovery using inductive miner and decomposition. arXiv preprint arXiv:1610.07989 (2016)
  [17] Khan, N., Shahid, M.A., Rasool, S.: Leveraging ai in accounting and finance: Transforming business operations and enhancing healthcare decision-making through brain-inspired analytics. International Journal of Advanced Engineering Technologies and Innovations 10(2), 603931 (2024)
  [18] Kocsis, L., Szepesvári, C.: Bandit based monte-carlo planning. In: Machine Learning: ECML 2006: 17th European Conference on Machine Learning Berlin, Germany, September 18-22, 2006 Proceedings 17. pp. 282–293. Springer (2006)
  [19] Madumal, P., Miller, T., Sonenberg, L., Vetere, F.: Distal explanations for model-free explainable reinforcement learning. arXiv preprint arXiv:2001.10284 (2020)
  [20] Miller, T.: Explanation in artificial intelligence: Insights from the social sciences. Artificial intelligence 267, 1–38 (2019)
  [21] Miller, T.: Contrastive explanation: A structural-model approach. The Knowledge Engineering Review 36 (2021)
  [22] Samuel, A.L.: Some studies in machine learning using the game of checkers. IBM Journal of research and development 3(3), 210–229 (1959)
  [23] Strong, G.: The minimax algorithm. Trinity College Dublin (2011)
  [24] Van Dongen, B.F., de Medeiros, A.K.A., Verbeek, H., Weijters, A., van Der Aalst, W.M.: The prom framework: A new era in process mining tool support. In: International conference on application and theory of petri nets. pp. 444–454. Springer (2005)
  [25] Verenich, I., Dumas, M., La Rosa, M., Nguyen, H.: Predicting process performance: A white-box approach based on process models. Journal of Software: Evolution and Process 31(6), e2170 (2019)
  [26] Wegner, P.: Concepts and paradigms of object-oriented programming. ACM Sigplan Oops Messenger 1(1), 7–87 (1990)

  [27] Ziyan, A., Wang, X., Baier, H., Chen, Z., Dubey, A., Johnson, T.T., Sprinkle, J., Mukhopadhyay, A., Ma, M.: Combining llms with a logic-based framework to explain mcts (2025)
    A Evaluate Process Models Based on Trial 1, Trial 2, Trial 3. Trial 1: Variable Iteration Times – Iteration Times = 1000: • Red Agent: Figure...
  [28] Red Agent Strategic Analysis (Ref: Figure 6). The following insights are derived from the hierarchical transition layers of the Red agent’s Petri-net: Causal Selection (Q1): The recommendation to select Piece 1 (left, up) or Piece 2 (left, up) in the second layer is driven by immediate reward optimization. These transitions yield 7 reward points, correlating ...
  [29] White Agent Strategic Analysis (Ref: Figure 7). The interpretation of the White agent’s procedural patterns is summarized as follows: Causal Selection (Q1): Upon the Red agent moving Piece 3 (left, down), the system recommends White Piece 2 (right, up). This selection is justified by its potential to trigger a capture or crowning event, identified as a 7...