SAT-RTS: A systematic framework for tactical knowledge extraction and visualization-based analysis in real-time strategy games

Changhe Li; Chunhui Bai; Lei Liu; Shoufei Han; Yuqiang Li

arxiv: 2606.30090 · v1 · pith:A2PEXR55new · submitted 2026-06-29 · 💻 cs.AI


Chunhui Bai, Changhe Li, Yuqiang Li, Lei Liu, Shoufei Han

Pith reviewed 2026-06-30 06:00 UTC · model grok-4.3

classification 💻 cs.AI

keywords tactical knowledge extraction · real-time strategy games · visualization-based analysis · BK-tree clustering · state-action sequences · micromanagement analysis · interpretable decision patterns


The pith

SAT-RTS turns high-dimensional RTS game sequences into hierarchical tactical labels and visualizations that reveal decision drivers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the SAT-RTS framework to extract tactical knowledge from real-time strategy game data where state-action sequences are high-dimensional and coupled. It combines a cluster-centric BK-tree algorithm using multi-aspect distance metrics to abstract state streams with a rule-based method that assigns discrete tactical labels to those sequences. These elements feed into a visualization pipeline that produces fitness landscapes and attribution views. If the framework works as described, analysts gain a concrete way to inspect the latent patterns that shape micromanagement choices instead of treating the decision process as opaque.

Core claim

The central claim is that adapting a cluster-centric BK-tree algorithm with specialized multi-aspect distance metrics for state-stream abstraction, paired with a rule-based multi-label extraction step, converts unstructured high-dimensional sequences into discrete interpretable tactical labels; when these steps are integrated into a hierarchical visualization pipeline, the result supplies fitness landscape views that expose the deep-seated drivers of critical decisions in RTS environments.

What carries the argument

The state-action-tactic analysis pipeline (SAT-RTS), which applies a cluster-centric BK-tree with multi-aspect distance metrics to abstract state streams and a rule-based extractor to produce discrete tactical labels from raw sequences.
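To make the BK-tree idea concrete, here is a minimal sketch of a classic Burkhard-Keller tree used for threshold-based cluster assignment. It is not the paper's cluster-centric variant, whose details are not given in the text above. The names `BKTree`, `hamming`, and `assign_cluster`, the tuple state encoding, and the threshold value are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]  # hypothetical grid-quantized state vector


def hamming(a: State, b: State) -> int:
    """Number of differing positions; a true metric on equal-length tuples."""
    return sum(x != y for x, y in zip(a, b))


class _Node:
    def __init__(self, item: State) -> None:
        self.item = item
        self.children = {}  # edge distance -> child node


class BKTree:
    """Classic BK-tree over an integer-valued metric (assumed, not the paper's variant)."""

    def __init__(self, distance: Callable[[State, State], int]) -> None:
        self.distance = distance
        self.root = None

    def insert(self, item: State) -> None:
        if self.root is None:
            self.root = _Node(item)
            return
        node = self.root
        while True:
            d = self.distance(item, node.item)
            if d == 0:
                return  # identical under the metric; nothing to add
            child = node.children.get(d)
            if child is None:
                node.children[d] = _Node(item)
                return
            node = child

    def query(self, item: State, radius: int) -> List[Tuple[int, State]]:
        """Return (distance, stored item) pairs within `radius`, nearest first."""
        hits = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node = stack.pop()
            d = self.distance(item, node.item)
            if d <= radius:
                hits.append((d, node.item))
            # Triangle inequality: only edges in [d - radius, d + radius] can hold matches.
            for edge, child in node.children.items():
                if d - radius <= edge <= d + radius:
                    stack.append(child)
        return sorted(hits)


def assign_cluster(tree: BKTree, state: State, threshold: int) -> State:
    """Map a state to the nearest stored center, or make it a new center."""
    hits = tree.query(state, threshold)
    if hits:
        return hits[0][1]
    tree.insert(state)
    return state


tree = BKTree(hamming)
states = [(0, 0, 1, 1), (0, 0, 1, 2), (3, 3, 0, 0), (3, 3, 0, 1), (0, 0, 1, 1)]
centers = [assign_cluster(tree, s, threshold=1) for s in states]
print(centers)  # two distinct centers: (0, 0, 1, 1) and (3, 3, 0, 0)
```

Only cluster centers are stored in the tree in this sketch, so memory grows with the number of abstract states rather than with the length of the stream. The pruning step is valid only when the distance satisfies the triangle inequality.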

If this is right

Analysts obtain concrete fitness-landscape visualizations that map how tactical choices evolve across game states.
Raw behavioral traces are systematically converted into discrete, queryable tactical categories.
Processing of continuous real-time data streams becomes feasible while retaining attribution links back to individual decisions.
The same pipeline supplies a route to compare latent patterns across different learning agents or human players.
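The conversion of raw traces into discrete, queryable categories can be sketched as a table of named predicates applied to an action sequence, with every matching rule contributing one label. The action names, the three tactic labels, and the rule thresholds below are hypothetical; the paper's actual rule set is not described in the text above.

```python
from typing import Callable, Dict, List, Set

Action = str
Rule = Callable[[List[Action]], bool]


def has_run(seq: List[Action], action: Action, length: int) -> bool:
    """True if `action` occurs at least `length` times consecutively."""
    run = 0
    for a in seq:
        run = run + 1 if a == action else 0
        if run >= length:
            return True
    return False


def alternates(seq: List[Action], first: Action, second: Action, cycles: int) -> bool:
    """True if the pair (first, second) repeats back to back `cycles` times."""
    pattern = [first, second] * cycles
    n = len(pattern)
    return any(seq[i:i + n] == pattern for i in range(len(seq) - n + 1))


# Hypothetical rule table: label -> predicate over the action sequence.
RULES: Dict[str, Rule] = {
    "focus_fire": lambda s: has_run(s, "attack", 3),
    "kiting": lambda s: alternates(s, "attack", "move_away", 2),
    "retreat": lambda s: has_run(s, "move_away", 3),
}


def extract_labels(seq: List[Action], rules: Dict[str, Rule] = RULES) -> Set[str]:
    """Multi-label extraction: a sequence receives every label whose rule fires."""
    return {label for label, rule in rules.items() if rule(seq)}


trace = ["attack", "move_away", "attack", "move_away", "move_away", "move_away"]
print(sorted(extract_labels(trace)))  # ['kiting', 'retreat']
```

Because each rule is evaluated independently, one sequence can carry several labels at once, which is what makes the output multi-label rather than a single classification.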

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The abstraction step could be tested on non-RTS sequential domains such as robotic task planning where state-action traces are similarly high-dimensional.
If the tactical labels prove stable across map variants, they might serve as building blocks for curriculum design in training new agents.
The visualization layer might be extended to highlight mismatches between learned policies and expert play without requiring new labeled data.

Load-bearing premise

The chosen multi-aspect distance metrics and BK-tree clustering produce state abstractions that keep the actual drivers of decisions intact rather than creating artifacts from the coupled high-dimensional data.
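One way to probe this premise is to check that a combined distance still behaves as a metric, since BK-tree pruning relies on the triangle inequality. The sketch below assumes a hypothetical state layout (a grid cell plus a health bucket) and combines two per-aspect metrics as a non-negative weighted sum. The aspects, weights, and function names are illustrative and not taken from the paper.

```python
from itertools import product
from typing import Callable, Sequence, Tuple

# Hypothetical state layout: ((grid_x, grid_y), health_bucket)
State = Tuple[Tuple[int, int], int]


def position_distance(a: State, b: State) -> int:
    """Manhattan distance between grid cells."""
    (ax, ay), (bx, by) = a[0], b[0]
    return abs(ax - bx) + abs(ay - by)


def health_distance(a: State, b: State) -> int:
    """Absolute difference between health buckets."""
    return abs(a[1] - b[1])


def multi_aspect_distance(a: State, b: State, weights: Tuple[int, int] = (2, 1)) -> int:
    """Weighted sum of per-aspect metrics; stays a metric for non-negative weights."""
    w_pos, w_hp = weights
    return w_pos * position_distance(a, b) + w_hp * health_distance(a, b)


def violates_triangle(distance: Callable[[State, State], int],
                      samples: Sequence[State]) -> bool:
    """Brute-force check of d(x, z) <= d(x, y) + d(y, z) over all sample triples."""
    return any(
        distance(x, z) > distance(x, y) + distance(y, z)
        for x, y, z in product(samples, repeat=3)
    )


samples = [((0, 0), 5), ((1, 2), 3), ((2, 0), 0), ((4, 4), 9)]
print(multi_aspect_distance(samples[0], samples[1]))        # 2*3 + 1*2 = 8
print(violates_triangle(multi_aspect_distance, samples))    # False


def squared(a: State, b: State) -> int:
    """Squared position distance: a common choice that is NOT a metric."""
    return position_distance(a, b) ** 2


line = [((0, 0), 0), ((1, 0), 0), ((2, 0), 0)]
print(violates_triangle(squared, line))  # True: 4 > 1 + 1
```

A sample-based check like this can only find violations, not prove their absence, so it is a sanity test rather than a guarantee.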

What would settle it

A controlled comparison in which human experts rate the tactical labels and visualizations as no more informative than direct inspection of the original state-action logs on the same set of game replays.
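As a sketch of how such a comparison might be scored, the following runs an exact paired sign-flip permutation test on expert ratings of the same replays under the two conditions. The ratings are invented placeholder numbers, not results from the paper, and the function name and one-sided framing are assumptions.

```python
from itertools import product
from typing import Sequence


def paired_permutation_p_value(with_tool: Sequence[float],
                               raw_logs: Sequence[float]) -> float:
    """One-sided exact p-value for 'with_tool is rated higher than raw_logs'.

    Under the null hypothesis each paired difference is equally likely to have
    either sign, so every sign assignment is enumerated (feasible for small n).
    """
    diffs = [a - b for a, b in zip(with_tool, raw_logs)]
    observed = sum(diffs)
    at_least_as_large = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed:
            at_least_as_large += 1
    return at_least_as_large / total


# Placeholder ratings for six replays (made up for illustration only).
ratings_with_tool = [4, 5, 4, 5, 3, 4]
ratings_raw_logs = [3, 3, 4, 2, 3, 2]
print(paired_permutation_p_value(ratings_with_tool, ratings_raw_logs))  # 0.0625
```

A p-value near 1 would match the falsifying outcome described above, in which the labels and visualizations are rated no more informative than the raw logs.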

Figures

Figures reproduced from arXiv: 2606.30090 by Changhe Li, Chunhui Bai, Lei Liu, Shoufei Han, Yuqiang Li.

**Figure 2.** The overall architecture of the SAT-RTS pipeline

**Figure 3.** Schematic diagram of the cluster-centric BK-tree with insertion and query

**Figure 4.** State-transition sequence distance metric and fitness landscape visualization

**Figure 5.** Sankey diagram of action sequence patterns and tactic extraction

**Figure 6.** Designed experimental scenarios with unit distributions of both sides swapped in mirror maps

**Figure 7.** Illustration of synthetic data and clustering results for three true cluster labels with six samples per cluster

**Figure 8.** Threshold sensitivity analysis of clustering results under different grid granularities and combat unit scales

**Figure 9.** Comparison of clustering performance metrics of the cluster-centric BK-tree and DenStream with various...

**Figure 10.** State value landscape visualization in different scenarios

**Figure 11.** Fitness landscape visualization in different scenarios

**Figure 12.** Comprehensive Sankey diagram of action sequence and tactic patterns with Agglomerative algorithm for...

**Figure 13.** Comparison of different analytical forms

**Figure 14.** State-transition networks in state value landscapes

**Figure 15.** Detailed process of the two sample solutions (solution 109 and 179)

**Figure 16.** Treemap of state-tactic payoff correlation

read the original abstract

Efficient tactical knowledge extraction and analysis in real-time strategy (RTS) games micromanagement are constrained by the high-dimensional coupled state-action sequential data and the black-box decision-making process. Current research rarely provides a hierarchical visualization-based attribution analysis from the perspective of data decoupling and abstraction. To facilitate interpretable tactical knowledge extraction and visualization-based analysis in RTS games, a systematic framework named state-action-tactic analysis pipeline (SAT-RTS) is proposed. To decipher the deep-seated drivers of critical decisions in RTS learning systems, this work integrates interpretable visualization with the automated extraction of latent tactical patterns from high-dimensional sequence data. By adapting a cluster-centric BK-tree algorithm and incorporating specialized distance metrics designed to quantify multi-aspect similarities, the proposed framework facilitates robust state-stream abstraction. Furthermore, a rule-based multi-label extraction method is developed to transform unstructured state-action sequences into discrete and interpretable tactical labels, effectively bridging the gap between raw behavioral data and high-level tactical insights. By holistically integrating these computational methods into a hierarchical visualization-based pipeline, the proposed framework effectively addresses the challenges of processing massive real-time data streams while providing fitness landscape visualizations and analytical insights to decipher deep-seated tactical drivers. Comprehensive experiments demonstrate that the proposed SAT-RTS significantly enhances the interpretability and efficiency of tactical analysis in complex RTS environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SAT-RTS combines BK-tree clustering with rule-based multi-label extraction for RTS state abstraction and visualization, but the central claim of significant enhancement rests on zero reported experiments or metrics.


The paper's main contribution is a pipeline that adapts a cluster-centric BK-tree with custom multi-aspect distance metrics to abstract high-dimensional state streams from RTS micromanagement, then uses rule-based methods to turn those abstractions into discrete tactical labels, feeding into a hierarchical visualization setup.

This integration looks new as a specific package for turning raw sequential data into interpretable tactical insights in game AI. The description of how the components fit together to handle massive real-time streams and produce fitness landscape views is clear and practical.

The soft spot is straightforward and load-bearing: the abstract states that comprehensive experiments show the framework significantly enhances interpretability and efficiency, yet no quantitative results, baselines, test environments, metrics for interpretability, or efficiency numbers appear anywhere. That leaves the key assumption untested: that the chosen clustering and distances preserve decision-critical features without introducing artifacts.

The work targets researchers building analysis tools for RTS agents. Someone in that niche could extract useful algorithmic details from the method sections, but the missing validation makes it hard to judge real value or compare against existing approaches.

I would not send this for peer review until the experiments are added with actual numbers and comparisons.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes the SAT-RTS framework for tactical knowledge extraction and visualization-based analysis in RTS games. It adapts a cluster-centric BK-tree algorithm with multi-aspect distance metrics to abstract high-dimensional state streams, develops a rule-based multi-label extraction method to convert state-action sequences into discrete tactical labels, and integrates these into a hierarchical visualization pipeline for fitness landscapes and decision-driver analysis. The central claim is that comprehensive experiments show the framework significantly enhances interpretability and efficiency of tactical analysis in complex RTS environments.

Significance. If the empirical claims are substantiated with quantitative evidence, the work could provide a systematic pipeline for decoupling and abstracting coupled sequential data in RTS micromanagement, offering a concrete bridge between raw behavioral traces and interpretable high-level tactics that prior black-box approaches lack.

major comments (1)

[Abstract] The assertion that 'Comprehensive experiments demonstrate that the proposed SAT-RTS significantly enhances the interpretability and efficiency of tactical analysis in complex RTS environments' is unsupported by any reported metrics, baselines, error bars, test environments, RTS scenarios, human-study scores, attribution fidelity measures, runtime numbers, or data-reduction statistics. This is load-bearing because the entire contribution is framed as an empirical advance over existing methods.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the abstract. We address the major comment below and will revise the manuscript accordingly.


Referee: [Abstract] The assertion that 'Comprehensive experiments demonstrate that the proposed SAT-RTS significantly enhances the interpretability and efficiency of tactical analysis in complex RTS environments' is unsupported by any reported metrics, baselines, error bars, test environments, RTS scenarios, human-study scores, attribution fidelity measures, runtime numbers, or data-reduction statistics. This is load-bearing because the entire contribution is framed as an empirical advance over existing methods.

Authors: We agree that the abstract's phrasing overstates the empirical support without referencing specific evidence. The manuscript body describes the RTS test environments and scenarios used for the visualizations and abstraction pipeline, along with qualitative demonstrations of improved interpretability via the hierarchical visualizations. However, the abstract does not include quantitative metrics such as runtime numbers or data-reduction statistics. We will revise the abstract to qualify the claim, explicitly reference the experimental setups from the results section, and remove the unsupported assertion of 'significantly enhances' unless backed by direct comparisons in the text.

Revision: yes

Circularity Check

0 steps flagged

No circularity: framework proposal with no self-referential derivations or fitted predictions


The paper describes a methodological pipeline (BK-tree clustering with custom distances plus rule-based multi-label extraction) for RTS tactical analysis. No equations, parameters fitted to subsets then re-predicted, or self-citations appear in the provided text. The central claim of 'significant enhancement' is asserted via unspecified experiments rather than being definitionally equivalent to any input. This is the common case of an independent methodological contribution; no load-bearing step reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review yields no concrete free parameters, axioms, or invented entities; the framework description does not enumerate fitted constants or new postulated objects.

pith-pipeline@v0.9.1-grok · 5777 in / 949 out tokens · 15722 ms · 2026-06-30T06:00:54.914530+00:00 · methodology



[1] [1]

Ontañón, G

S. Ontañón, G. Synnaeve, A. Uriarte, F. Richoux, D. Churchill, M. Preuss, A survey of real-time strategy game AI research and competition in StarCraft, IEEE Trans. Comput. Intell. AI Games 5 (4) (2013) 293–311. doi:10.1109/TCIAIG.2013.2286295

work page doi:10.1109/tciaig.2013.2286295 2013

[2] [2]

StarCraft II: A New Challenge for Reinforcement Learning

O. Vinyals, T. Ewalds, S. Bartunov, P. Georgiev, A. S. Vezhnevets, M. Yeo, A. Makhzani, H. Küttler, J. Agapiou, J. Schrittwieser, J. Quan, S. Gaffney, S. Petersen, K. Simonyan, T. Schaul, H. van Hasselt, D. Silver, T. Lillicrap, K. Calderone, P. Keet, A. Brunasso, D. Lawrence, A. Ekermo, J. Repp, R. Tsing, StarCraft II: A new challenge for reinforcement l...

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1708.04782 2017

[3] [3]

X. Wang, S. Wang, X. Liang, D. Zhao, J. Huang, X. Xu, B. Dai, Q. Miao, Deep reinforcement learning: A survey, IEEE Trans. Neural Netw. Learn. Syst. 35 (4) (2024) 5064–5078.doi:10.1109/TNNLS.2022.3207346

work page doi:10.1109/tnnls.2022.3207346 2024

[4] [4]

S. Qi, S. Zhang, Q. Wang, J. Zhang, X. Wang, Distributed scalable multi-agent reinforcement learning with intrinsic-episodic dual exploration, Future Gener. Comput. Syst. 175 (2026) 108040. doi:10.1016/j.future. 2025.108040

work page doi:10.1016/j.future 2026

[5] [5]

Puiutta, E

E. Puiutta, E. M. S. P. Veith, Explainable reinforcement learning: A survey, in: Mach. Learn. Knowl. Extr., 2020, pp. 77–95.doi:10.1007/978-3-030-57321-8_5

work page doi:10.1007/978-3-030-57321-8_5 2020

[6] [6]

Y . Cao, Z. Tian, Z. Liu, N. Jia, X. Liu, Reducing overestimation with attentional multi-agent twin delayed deep deterministic policy gradient, Eng. Appl. Artif. Intell. 146 (2025) 110352. doi:10.1016/j.engappai.2025. 110352

work page doi:10.1016/j.engappai.2025 2025

[7] [7]

L. Xu, D. Perez-Liebana, A. Dockhorn, Towards applicable state abstractions: A preview in strategy games, in: Multi-Discip. Conf. Reinforc. Learn. Decis. Mak. (RLDM), 2022, pp. 1–7

2022

[8] [8]

Perkins, Terrain analysis in real-time strategy games: An integrated approach to choke point detection and region decomposition, in: Proc

L. Perkins, Terrain analysis in real-time strategy games: An integrated approach to choke point detection and region decomposition, in: Proc. AAAI Conf. Artif. Intell. Interact. Digit. Entertain., V ol. 6(1), 2010, pp. 168–173. doi:10.1609/aiide.v6i1.12405

work page doi:10.1609/aiide.v6i1.12405 2010

[9] [9]

Y . Jo, S. Lee, J. Yeom, S. Han, FoX: Formation-aware exploration in multi-agent reinforcement learning, in: Proc. AAAI Conf. Artif. Intell., V ol. 38(12), 2024, pp. 12985–12994.doi:10.1609/aaai.v38i12.29196

work page doi:10.1609/aaai.v38i12.29196 2024

[10] [10]

Kuan, Y .-S

Y .-T. Kuan, Y .-S. Wang, J.-H. Chuang, Visualizing real-time strategy games: The example of StarCraft II, in: Proc. 2017 IEEE Conf. Vis. Anal. Sci. Technol. (V AST), 2017, pp. 71–80.doi:10.1109/VAST.2017.8585594

work page doi:10.1109/vast.2017.8585594 2017

[11] [11]

Haley, A

J. Haley, A. Wearne, C. Copland, E. Ortiz, A. Bond, M. van Lent, R. Smith, Cluster analysis of deep embeddings in real-time strategy games, in: Artif. Intell. Mach. Learn. Multi-Dom. Oper. Appl. II, V ol. 11413, 2020, pp. 507–516.doi:10.1117/12.2558105

work page doi:10.1117/12.2558105 2020

[12] [12]

Kozik, T

A. Kozik, T. Machalewski, M. Marek, A. Ochmann, Mimicking playstyle by adapting parameterized behavior trees in RTS games, arXiv preprint arXiv:2111.12144 (2021).doi:10.48550/arXiv.2111.12144

work page doi:10.48550/arxiv.2111.12144 2021

[13] A. Heuillet, F. Couthouis, N. Díaz-Rodríguez, Explainability in deep reinforcement learning, Knowl.-Based Syst. 214 (2021) 106685. doi:10.1016/j.knosys.2020.106685

[14] O. Vinyals, I. Babuschkin, W. M. Czarnecki, M. Mathieu, A. Dudzik, J. Chung, D. H. Choi, R. Powell, T. Ewalds, P. Georgiev, J. Oh, D. Horgan, M. Kroiss, I. Danihelka, A. Huang, L. Sifre, T. Cai, J. P. Agapiou, M. Jaderberg, A. S. Vezhnevets, R. Leblond, T. Pohlen, V. Dalibard, D. Budden, Y. Sulsky, J. Molloy, T. L. Paine, C. Gulcehre, Z. Wang, T. Pfaff,... doi:10.1038/s41586-019-1724-z, 2019

[15] C. Westphal, S. Hailes, M. Musolesi, Information-theoretic state variable selection for reinforcement learning, arXiv preprint arXiv:2401.11512 (2024). doi:10.48550/arXiv.2401.11512

[16] Z. Wang, C. Wang, X. Xiao, Y. Zhu, P. Stone, Building minimal and reusable causal state abstractions for reinforcement learning, in: Proc. AAAI Conf. Artif. Intell., Vol. 38(14), 2024, pp. 15778–15786. doi:10.1609/aaai.v38i14.29507

[17] X. Zeng, H. Peng, A. Li, C. Liu, L. He, P. S. Yu, Hierarchical state abstraction based on structural information principles, in: Proc. 32nd Int. Jt. Conf. Artif. Intell., 2023, pp. 4549–4557. doi:10.24963/ijcai.2023/506

[18] R. K. Nayyar, Learning generalizable and composable abstractions for transfer in reinforcement learning, in: Proc. AAAI Conf. Artif. Intell., Vol. 38(21), 2024, pp. 23403–23404. doi:10.1609/aaai.v38i21.30402

[19] A. Dockhorn, R. Kruse, State and action abstraction for search and reinforcement learning algorithms, in: Y. P. Kondratenko, V. Kreinovich, W. Pedrycz, A. Chikrii, A. M. Gil-Lafuente (Eds.), Artif. Intell. Control Decis.-Mak. Syst., Vol. 1087, Springer Nature Switzerland, Cham, 2023, pp. 181–198. doi:10.1007/978-3-031-25759-9_9

[20] A. Micić, D. Arnarsson, V. Jónsson, Developing game AI for the real-time strategy game StarCraft, Ph.D. thesis, Reykjavik University (2011)

[21] G. Synnaeve, P. Bessière, A dataset for StarCraft AI and an example of armies clustering, in: Proc. AAAI Conf. Artif. Intell. Interact. Digit. Entertain., Vol. 8(3), 2012, pp. 25–30. doi:10.1609/aiide.v8i3.12546

[22] M.-J. Kim, D. Lee, J. S. Kim, C. W. Ahn, Surrogate-assisted Monte Carlo Tree Search for real-time video games, Eng. Appl. Artif. Intell. 133 (2024) 108152. doi:10.1016/j.engappai.2024.108152

[23] Y. Li, Y. Fang, Z. Akhtar, Accelerating deep reinforcement learning model for game strategy, Neurocomputing 408 (2020) 157–168. doi:10.1016/j.neucom.2019.06.110

[24] J. Lee, B. Koo, K. Oh, State space optimization using plan recognition and reinforcement learning on RTS game, in: Proc. 7th WSEAS Int. Conf. Artif. Intell. Knowl. Eng. Data Bases, 2008, pp. 165–169

[25] J. R. Mariño, C. F. Toledo, Evolving interpretable strategies for zero-sum games, Appl. Soft Comput. 122 (2022) 108860. doi:10.1016/j.asoc.2022.108860

[26] A. Uriarte, S. Ontañón, Game-tree search over high-level game states in RTS games, in: Proc. AAAI Conf. Artif. Intell. Interact. Digit. Entertain., Vol. 10(1), 2014, pp. 73–79. doi:10.1609/aiide.v10i1.12706

[27] A. Uriarte, S. Ontañón, High-level representations for game-tree search in RTS games, in: Proc. AAAI Conf. Artif. Intell. Interact. Digit. Entertain., Vol. 10(2), 2014, pp. 14–18. doi:10.1609/aiide.v10i2.12734

[28] A. Uriarte, S. Ontañón, Combat models for RTS games, IEEE Trans. Games 10 (1) (2018) 29–41. doi:10.1109/TCIAIG.2017.2669895

[29] H. Park, K.-J. Kim, MCTS with influence map for general video game playing, in: 2015 IEEE Conf. Comput. Intell. Games (CIG), 2015, pp. 534–535. doi:10.1109/CIG.2015.7317896

[30] L. Zhang, J. Lieffers, A. Pyarelal, Enhancing interpretability in deep reinforcement learning through semantic clustering, arXiv preprint arXiv:2409.17411 (2025). doi:10.48550/arXiv.2409.17411

[31] L. Xu, A. Dockhorn, D. Perez-Liebana, Elastic Monte Carlo tree search, IEEE Trans. Games 15 (4) (2023) 527–537. doi:10.1109/TG.2023.3282351

[32] L. Xu, D. Perez-Liebana, A. Dockhorn, Strategy game-playing with size-constrained state abstraction, in: Proc. 2024 IEEE Conf. Games (CoG), 2024, pp. 1–8. doi:10.1109/CoG60054.2024.10645643

[33] G. Wallner, A brief overview of data mining and analytics in games, in: Data Analytics Applications in Gaming and Entertainment, Auerbach Publications, 2019, pp. 1–14

[34] R. Metoyer, S. Stumpf, C. Neumann, J. Dodge, J. Cao, A. Schnabel, Explaining how to play real-time strategy games, Knowl.-Based Syst. 23 (4) (2010) 295–301. doi:10.1016/j.knosys.2009.11.006

[35] E. Kleinman, J. Villareale, M. Shergadwala, Z. Teng, A. Bryant, J. Zhu, M. S. El-Nasr, Towards an understanding of how players make meaning from post-play process visualizations, in: Entertain. Comput. – ICEC 2022, 2022, pp. 47–58. doi:10.1007/978-3-031-20212-4_4

[36] T. K. Mathes, J. Inman, A. Colón, S. Khan, CODEX: A cluster-based method for explainable reinforcement learning, arXiv preprint arXiv:2312.04216 (2023). doi:10.48550/arXiv.2312.04216

[37] B. Ingram, C. van Alten, R. Klein, B. Rosman, Generating interpretable play-style descriptions through deep unsupervised clustering of trajectories, IEEE Trans. Games 15 (4) (2023) 507–516. doi:10.1109/TG.2023.3299074

[38] C. Izumigawa, C. Lucero, L. Nans, K. Frederiksen, O. Hui, I. Enriquez, S. Rothman, R. Iden, Building human-autonomy teaming aids for real-time strategy games, in: Int. Conf. HCI Games, 2020, pp. 117–127. doi:10.1007/978-3-030-50164-8_8

[39] D. Keaveney, C. O’Riordan, Analysing the fitness landscape of an abstract real-time strategy game, in: Proc. 9th Int. Conf. Intell. Games Simul., 2008, pp. 51–55

[40] G. Wallner, L. Wang, C. Dormann, Visualizing the spatio-temporal evolution of gameplay using storyline visualization: A study with League of Legends, in: Proc. ACM Hum.-Comput. Interact., Vol. 7, 2023, pp. 1002–1024. doi:10.1145/3611058

[41] A. P. Afonso, M. B. Carmo, R. Afonso, VisuaLeague: Visual analytics of multiple games, in: Proc. 2021 25th Int. Conf. Inf. Vis. (IV), 2021, pp. 54–62. doi:10.1109/IV53921.2021.00019

[42] A. Šufliarsky, G. Wallner, S. Kriglstein, Through space and time: Spatio-temporal visualization of MOBA matches, in: Hum.-Comput. Interact. – INTERACT 2023, 2023, pp. 167–189. doi:10.1007/978-3-031-42283-6_9

[43] H. W. Kuhn, The Hungarian method for the assignment problem, Nav. Res. Logist. Q. 2 (1-2) (1955) 83–97. doi:10.1002/nav.3800020109

[44] W. A. Burkhard, R. M. Keller, Some approaches to best-match file searching, Commun. ACM 16 (4) (1973) 230–236. doi:10.1145/362003.362025

[45] H. Sakoe, S. Chiba, Dynamic programming algorithm optimization for spoken word recognition, IEEE Trans. Acoust. Speech Signal Process. 26 (1) (1978) 43–49. doi:10.1109/TASSP.1978.1163055

[46] R. Agrawal, R. Srikant, Fast algorithms for mining association rules in large databases, in: Proc. 20th Int. Conf. Very Large Data Bases, Vol. 13, 1994, pp. 487–499

[47] J. Pei, J. Han, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, M.-C. Hsu, PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth, in: Proc. 17th Int. Conf. Data Eng., 2001, pp. 215–224. doi:10.1109/icde.2001.914830

[48] F. Cao, M. Ester, W. Qian, A. Zhou, Density-based clustering over an evolving data stream with noise, in: Proc. 2006 SIAM Int. Conf. Data Min., 2006, pp. 328–339. doi:10.1137/1.9781611972764.29

[49] J. H. Ward Jr., Hierarchical grouping to optimize an objective function, J. Am. Stat. Assoc. 58 (301) (1963) 236–244. doi:10.1080/01621459.1963.10500845

[50] Y. Diao, C. Li, S. Zeng, S. Yang, C. A. Coello Coello, Nearest-better network for fitness landscape analysis of continuous optimization problems, IEEE Trans. Evol. Comput. 29 (5) (2025) 2089–2103. doi:10.1109/TEVC.2024.3478825

A Supplementary Material for SAT-RTS

This document provides supplementary technical evidence for the paper "SAT-RTS: A syste...