Promoting Simple Agents: Ensemble Methods for Event-Log Prediction

Benedikt Bollig; Matthias Függer; Paul Zeinaty; Thomas Nowak

arXiv: 2604.21629 · v1 · submitted 2026-04-23 · 💻 cs.LG · cs.AI · cs.DC · cs.FL

Pith reviewed 2026-05-09 22:55 UTC · model grok-4.3

keywords: n-grams · event logs · next-activity prediction · ensemble methods · process mining · promotion algorithm · neural networks

The pith

Lightweight n-gram models combined with a promotion ensemble achieve accuracy comparable to neural networks for event-log prediction at lower computational cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to show that simple n-gram based agents can perform as well as complex neural models in predicting next activities from streaming event logs, but with significant savings in resources. Through experiments on synthetic and real process mining data, it demonstrates that n-grams offer stable accuracy while neural models with windows can be unstable. The key innovation is the promotion algorithm, which selects dynamically between two models to avoid the high cost of full voting ensembles. This matters for applications where computational efficiency is crucial, such as real-time monitoring of business processes. The results indicate that these promoted ensembles can even surpass some neural approaches on practical datasets.

Core claim

Experiments on synthetic patterns and five real-world process mining datasets show that n-grams with appropriate context windows achieve accuracy comparable to neural models while requiring substantially fewer resources. Unlike windowed neural architectures, which show unstable performance patterns, n-grams provide stable and consistent accuracy. Classical ensemble methods such as voting improve n-gram performance, but they require running many agents in parallel during inference, which increases memory consumption and latency. The proposed promotion algorithm dynamically selects between two active models during inference, reducing overhead compared to classical voting schemes. On real-world data, these ensembles match or exceed the accuracy of non-windowed neural models at lower computational cost.

What carries the argument

The promotion algorithm, which dynamically selects between two active n-gram models during inference to reduce overhead.
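
To make the idea of "two active models" concrete, here is a minimal Python sketch. It is not the paper's algorithm: the promotion and demotion rule is not spelled out in this summary, so the sketch assumes a simple rolling-accuracy comparison. The names `NGram`, `PromotionSelector`, `horizon`, and `margin` are hypothetical.

```python
from collections import Counter, defaultdict, deque


class NGram:
    """Next-activity predictor keyed on the last `window` activities."""

    def __init__(self, window):
        self.window = window
        self.counts = defaultdict(Counter)

    def _key(self, history):
        return tuple(history[-self.window:]) if self.window else ()

    def predict(self, history):
        seen = self.counts.get(self._key(history))
        return seen.most_common(1)[0][0] if seen else None

    def update(self, history, actual):
        self.counts[self._key(history)][actual] += 1


class PromotionSelector:
    """Assumed rule (illustrative only): answer with the primary model and
    swap roles once the challenger leads by `margin` hits over the last
    `horizon` events. Only two models are ever evaluated per event."""

    def __init__(self, primary, challenger, horizon=20, margin=2):
        self.primary, self.challenger = primary, challenger
        self.hits_p = deque(maxlen=horizon)
        self.hits_c = deque(maxlen=horizon)
        self.margin = margin
        self.promotions = 0

    def predict(self, history):
        return self.primary.predict(history)

    def update(self, history, actual):
        # Score both active models on this event, then let both learn from it.
        self.hits_p.append(self.primary.predict(history) == actual)
        self.hits_c.append(self.challenger.predict(history) == actual)
        self.primary.update(history, actual)
        self.challenger.update(history, actual)
        if sum(self.hits_c) - sum(self.hits_p) >= self.margin:
            self.primary, self.challenger = self.challenger, self.primary
            self.hits_p, self.hits_c = self.hits_c, self.hits_p
            self.promotions += 1


# Toy stream: what follows "a" depends on the event before it,
# so a window of 2 is needed to predict it reliably.
stream = list("abac" * 25)
selector = PromotionSelector(NGram(1), NGram(2))
history = []
for actual in stream:
    selector.update(history, actual)
    history.append(actual)
```

On this toy stream the window-2 model ends up as the primary, because the window-1 model cannot tell whether "b" or "c" follows "a". The per-event cost stays at two predictions regardless of how many candidate models exist, which is the overhead argument the summary attributes to promotion.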

If this is right

N-grams with suitable context windows achieve comparable accuracy to neural models but require substantially fewer resources.
N-grams deliver stable and consistent accuracy, unlike windowed neural architectures, whose performance fluctuates.
Classical voting improves n-gram performance but raises memory and latency costs; promotion reduces this overhead.
On real-world datasets the resulting ensembles match or exceed non-windowed neural models with lower cost.
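
As an illustration of why the context window matters, the following Python sketch measures predict-then-update accuracy of a plain n-gram counter for several window sizes. The function name `prequential_accuracy` and the toy stream are assumptions made for illustration; they are not taken from the paper or its library.

```python
from collections import Counter, defaultdict


def prequential_accuracy(stream, window):
    """Predict each event from the previous `window` events, then learn from it."""
    counts = defaultdict(Counter)
    history, correct = [], 0
    for actual in stream:
        key = tuple(history[-window:]) if window else ()
        seen = counts[key]
        guess = seen.most_common(1)[0][0] if seen else None
        correct += guess == actual
        seen[actual] += 1
        history.append(actual)
    return correct / len(stream)


# Toy log: the activity after "a" alternates between "b" and "c".
stream = list("abac" * 50)
accuracy = {w: prequential_accuracy(stream, w) for w in (0, 1, 2, 3)}
```

On this stream a window of 2 captures the pattern, shorter windows cannot, and a window of 3 gains nothing while needing slightly longer to warm up. Real event logs are noisier, so the best window would have to be found empirically.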

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Dynamic selection like promotion could extend to other streaming prediction domains where full ensembles are too expensive at inference time.
Process mining systems running on limited hardware might adopt n-grams to enable real-time monitoring without neural-scale compute.
Controlled experiments varying log complexity could clarify exactly when the stability of n-grams outweighs neural capacity.

Load-bearing premise

The five real-world process mining datasets and the chosen context windows are representative enough for the claimed general superiority in the resource-accuracy trade-off, with no hidden data leakage in window selection.

What would settle it

A new, independent event-log dataset on which the promotion ensembles, while still cheaper to run, fail to match the accuracy of non-windowed neural models would undercut the central result.

Figures

Figures reproduced from arXiv: 2604.21629 by Benedikt Bollig, Matthias Függer, Paul Zeinaty, Thomas Nowak.

**Figure 2.** Window-size impact on next-activity prediction accuracies for randomized ...

Original abstract

We compare lightweight automata-based models (n-grams) with neural architectures (LSTM, Transformer) for next-activity prediction in streaming event logs. Experiments on synthetic patterns and five real-world process mining datasets show that n-grams with appropriate context windows achieve comparable accuracy to neural models while requiring substantially fewer resources. Unlike windowed neural architectures, which show unstable performance patterns, n-grams provide stable and consistent accuracy. While we demonstrate that classical ensemble methods like voting improve n-gram performance, they require running many agents in parallel during inference, increasing memory consumption and latency. We propose an ensemble method, the promotion algorithm, that dynamically selects between two active models during inference, reducing overhead compared to classical voting schemes. On real-world datasets, these ensembles match or exceed the accuracy of non-windowed neural models with lower computational cost.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The promotion algorithm is a new low-overhead way to ensemble n-grams for next-activity prediction and the experiments show it can match non-windowed neural accuracy on the five logs at lower cost, but the window selection and lack of stats leave the general claim shaky.

The paper's main takeaway is that a dynamic promotion rule for picking between two n-gram models during inference gives most of the accuracy lift of voting ensembles without running every model in parallel. On the tested real-world logs, these setups reach or beat the accuracy of non-windowed LSTMs and Transformers while using fewer resources.

Referee Report

3 major / 2 minor

Summary. The paper compares lightweight n-gram automata models against neural architectures (LSTM, Transformer) for next-activity prediction on streaming event logs. It shows that n-grams with suitable context windows achieve comparable accuracy to non-windowed neural models at lower computational cost, while being more stable than windowed neural variants. The authors introduce a 'promotion algorithm' ensemble that dynamically switches between two active n-gram models during inference to reduce the overhead of classical voting ensembles, and validate the approach on synthetic patterns plus five real-world process mining datasets.

Significance. If the accuracy and resource claims hold after addressing experimental gaps, the work would provide a practical, low-overhead alternative for event-log prediction in resource-constrained process mining settings. The promotion algorithm offers a targeted ensemble technique that improves on voting by limiting active models at inference time. The multi-dataset empirical comparison is a strength, though it remains entirely empirical without parameter-free derivations or machine-checked proofs.
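
For contrast with promotion, the classical voting baseline can be sketched as follows. This is a generic majority vote over n-gram models with different windows, written for illustration; the class and function names are hypothetical and the paper's exact voting scheme may differ.

```python
from collections import Counter, defaultdict


class NGram:
    """Next-activity predictor keyed on the last `window` activities."""

    def __init__(self, window):
        self.window = window
        self.counts = defaultdict(Counter)

    def _key(self, history):
        return tuple(history[-self.window:]) if self.window else ()

    def predict(self, history):
        seen = self.counts.get(self._key(history))
        return seen.most_common(1)[0][0] if seen else None

    def update(self, history, actual):
        self.counts[self._key(history)][actual] += 1


def vote(models, history):
    """Majority vote; every model in the ensemble is queried on every event."""
    guesses = [m.predict(history) for m in models]
    ballots = Counter(g for g in guesses if g is not None)
    return ballots.most_common(1)[0][0] if ballots else None


models = [NGram(w) for w in (1, 2, 3)]
stream = list("abac" * 50)  # toy log, not data from the paper
history, correct = [], 0
for actual in stream:
    correct += vote(models, history) == actual
    for m in models:
        m.update(history, actual)
    history.append(actual)
voting_accuracy = correct / len(stream)
```

The vote outperforms the weakest member here, but each event costs `len(models)` predictions and updates. That per-event cost, growing with ensemble size, is the overhead the report says promotion is meant to limit.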

major comments (3)

[Experiments] Experiments section (real-world results): the central claim that n-gram ensembles 'match or exceed the accuracy of non-windowed neural models with lower computational cost' is presented without error bars, standard deviations across runs, or statistical significance tests on the five datasets. This directly affects whether the reported stability and trade-off can be considered reliable.
[Model Description and Experiments] Context window selection procedure (described in the n-gram model setup and experimental protocol): insufficient detail is given on how 'appropriate context windows' were chosen for each dataset. If any test-set information was used in this selection, it would constitute leakage and undermine the general superiority claim in the abstract.
[Results] Comparison to baselines (results tables): the non-windowed LSTM/Transformer baselines must be confirmed to use identical train/test splits, preprocessing, and metrics as the n-gram ensembles. Any mismatch in implementation would invalidate the accuracy-cost conclusion.

minor comments (2)

[Algorithm] The promotion algorithm pseudocode could benefit from explicit notation for the promotion/demotion thresholds and state transitions to improve reproducibility.
[Tables] Some result tables would be clearer with explicit column headers indicating whether accuracy or resource metrics are reported, and with consistent ordering of methods across datasets.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below, providing clarifications and committing to revisions that strengthen the experimental reporting without altering the core findings.

Referee: [Experiments] Experiments section (real-world results): the central claim that n-gram ensembles 'match or exceed the accuracy of non-windowed neural models with lower computational cost' is presented without error bars, standard deviations across runs, or statistical significance tests on the five datasets. This directly affects whether the reported stability and trade-off can be considered reliable.

Authors: We agree that reporting variability and statistical tests would make the accuracy and stability claims more reliable. In the revised manuscript, we will add error bars showing standard deviations across multiple independent runs and include paired statistical significance tests (e.g., Wilcoxon signed-rank) for the key comparisons on the five real-world datasets. revision: yes
Referee: [Model Description and Experiments] Context window selection procedure (described in the n-gram model setup and experimental protocol): insufficient detail is given on how 'appropriate context windows' were chosen for each dataset. If any test-set information was used in this selection, it would constitute leakage and undermine the general superiority claim in the abstract.

Authors: Context windows were selected solely from training data using cross-validation on the training portions of each dataset, with no test-set information involved at any stage. We will revise the n-gram model setup and experimental protocol sections to provide a step-by-step description of this leakage-free procedure, including the validation strategy employed. revision: yes
Referee: [Results] Comparison to baselines (results tables): the non-windowed LSTM/Transformer baselines must be confirmed to use identical train/test splits, preprocessing, and metrics as the n-gram ensembles. Any mismatch in implementation would invalidate the accuracy-cost conclusion.

Authors: The non-windowed LSTM and Transformer baselines were trained and evaluated using precisely the same train/test splits, preprocessing pipeline, and evaluation metrics as the n-gram models and ensembles. We will add explicit confirmation of this equivalence, along with implementation details, to the experimental setup and results sections in the revision. revision: yes
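
The paired test named in the first response can be sketched in pure Python as below. This is an exact two-sided Wilcoxon signed-rank test written for illustration; the accuracy figures are made up and are not results from the paper.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Zero differences are dropped and tied |differences| get average ranks.
    Returns (W_plus, p_value). Enumerates 2**n sign patterns, so keep n small.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank, 1-based
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = sum(ranks) / 2
    observed = abs(w_plus - expected)
    # Count sign patterns at least as far from the null expectation as observed.
    extreme = sum(
        abs(sum(r for r, s in zip(ranks, signs) if s) - expected) >= observed
        for signs in product((0, 1), repeat=len(ranks))
    )
    return w_plus, extreme / 2 ** len(ranks)


# Hypothetical per-dataset accuracies (percent), one pair per event log.
ensemble = [81.2, 74.5, 90.1, 66.3, 78.8]
neural = [80.0, 73.9, 89.0, 66.0, 77.0]
w_plus, p_value = wilcoxon_signed_rank(ensemble, neural)
```

Even when the ensemble wins on all five logs, as in this made-up data, the exact two-sided p-value is 2/32 = 0.0625. Pairing over datasets alone therefore cannot reach the 0.05 level with five logs, so the promised tests would need to pair over repeated runs or more datasets.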

Circularity Check

0 steps flagged

No circularity; claims rest on direct empirical comparisons

The paper is an empirical comparison of n-gram ensembles against LSTM/Transformer models on synthetic patterns and five real-world process-mining logs. No mathematical derivation chain, equations, or proofs are present that could reduce by construction to fitted parameters, self-definitions, or self-citations. Performance claims (accuracy, stability, resource cost) are supported by reported experimental measurements rather than any tautological reduction. Self-citations, if any, are not load-bearing for the central results.

Axiom & Free-Parameter Ledger

1 free parameter · 0 axioms · 0 invented entities

The central claims rest on empirical performance rather than derivation from axioms; the only notable free parameter is the context-window size for n-grams, which is described as 'appropriate' without a selection procedure.

free parameters (1)

context window size
Chosen per dataset to achieve reported accuracy; no automatic or cross-validated procedure described in abstract.
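
One way a leakage-free choice of this parameter could look is sketched below: the window is picked on a held-out tail of the training data only, and the test portion is scored once afterwards. The split sizes, the function names, and the toy log are assumptions; the paper's actual procedure is not described in the abstract.

```python
from collections import Counter, defaultdict


def fit(events, window):
    """Count which activity follows each context of up to `window` events."""
    counts = defaultdict(Counter)
    for i, actual in enumerate(events):
        counts[tuple(events[max(0, i - window):i])][actual] += 1
    return counts


def accuracy(counts, events, window):
    """Share of events whose most frequent successor in `counts` is correct."""
    hits = 0
    for i, actual in enumerate(events):
        seen = counts.get(tuple(events[max(0, i - window):i]))
        hits += bool(seen) and seen.most_common(1)[0][0] == actual
    return hits / len(events)


def select_window(train, candidates, holdout=0.25):
    """Pick the window on a chronological validation tail of the training data."""
    cut = int(len(train) * (1 - holdout))
    fit_part, val_part = train[:cut], train[cut:]
    scores = {w: accuracy(fit(fit_part, w), val_part, w) for w in candidates}
    return max(scores, key=scores.get), scores


log = list("abac" * 100)            # toy log, not data from the paper
train, test = log[:300], log[300:]  # chronological split
best, scores = select_window(train, candidates=(0, 1, 2, 3, 4))
test_accuracy = accuracy(fit(train, best), test, best)  # test data used once
```

On this toy log the procedure selects a window of 2. The point of the sketch is the data flow: `test` never influences `best`, which is the property the referee's leakage concern asks the authors to document.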

[1] [1]

Springer (2016)

van der Aalst, W.M.P.: Process Mining - Data Science in Action, Second Edition. Springer (2016). https://doi.org/10.1007/978-3-662-49851-4, https://doi.org/10. 1007/978-3-662-49851-4

work page doi:10.1007/978-3-662-49851-4 2016

[2] [2]

van der Aalst, W.M.P., Schonenberg, M.H., Song, M.: Time prediction based on process mining. Inf. Syst.36(2), 450–475 (2011). https://doi.org/10.1016/J.IS. 2010.09.001, https://doi.org/10.1016/j.is.2010.09.001

work page doi:10.1016/j.is 2011

[3] [3]

Balle, B., Castro, J., Gavaldà, R.: Adaptively learning probabilistic deterministic automata from data streams. Mach. Learn.96(1-2), 99–127 (2014). https://doi. org/10.1007/S10994-013-5408-X, https://doi.org/10.1007/s10994-013-5408-x Promoting Simple Agents: Ensemble Methods for Event-Log Prediction 15

work page doi:10.1007/s10994-013-5408-x 2014

[4] [4]

In: Coste, F., Ouardi, F., Rabusseau, G

Baumgartner,R.,Verwer,S.:Learningstatemachinesfromdatastreams:Ageneric strategy and an improved heuristic. In: Coste, F., Ouardi, F., Rabusseau, G. (eds.) International Conference on Grammatical Inference, ICGI 2023, 10-13 July 2023, Rabat, Morocco. Proceedings of Machine Learning Research, vol. 217, pp. 117–141. PMLR (2023), https://proceedings.mlr.press...

2023

[5] [5]

Bollig,B.,Függer,M.,Nowak,T.,Zeinaty,P.:logicsponge-processmining:Alibrary for process-mining tasks and next activity prediction in business processes., https: //github.com/innatelogic/logicsponge-processmining.git, accessed: 2026-02-13

2026

[6] [6]

In: Touili, T., Cook, B., Jackson, P.B

Bollig, B., Katoen, J., Kern, C., Leucker, M., Neider, D., Piegdon, D.R.: libalf: The automata learning framework. In: Touili, T., Cook, B., Jackson, P.B. (eds.) Com- puter Aided Verification, 22nd International Conference, CAV 2010, Edinburgh, UK, July 15-19, 2010. Proceedings. Lecture Notes in Computer Science, vol. 6174, pp. 360–364. Springer (2010). h...

work page doi:10.1007/978-3-642-14295-6_32 2010

[7] [7]

MIS Q.40(4), 1009–1034 (2016)

Breuker, D., Matzner, M., Delfmann, P., Becker, J.: Comprehensible predictive models for business processes. MIS Q.40(4), 1009–1034 (2016). https://doi.org/ 10.25300/MISQ/2016/40.4.10, https://doi.org/10.25300/misq/2016/40.4.10

work page doi:10.25300/misq/2016/40.4.10 2016

[8] [8]

https://doi.org/10.48550/ arXiv.2104.00721, https://arxiv.org/abs/2104.00721

Bukhsh, Z.A., Saeed, A., Dijkman, R.M.: Processtransformer: Predictive business process monitoring with transformer network (2021). https://doi.org/10.48550/ arXiv.2104.00721, https://arxiv.org/abs/2104.00721

work page arXiv 2021

[9] [9]

In: van der Aalst, W.M.P., Carmona, J

Burattin, A.: Streaming process mining. In: van der Aalst, W.M.P., Carmona, J. (eds.) Process Mining Handbook, Lecture Notes in Business Information Processing, vol. 448, pp. 349–372. Springer (2022). https://doi.org/10.1007/ 978-3-031-08848-3_11, https://doi.org/10.1007/978-3-031-08848-3_11

work page doi:10.1007/978-3-031-08848-3_11 2022

[10] [10]

In: Proceedings of the IEEE Congress on Evolution- ary Computation, CEC 2014, Beijing, China, July 6-11, 2014

Burattin, A., Sperduti, A., van der Aalst, W.M.P.: Control-flow discovery from event streams. In: Proceedings of the IEEE Congress on Evolution- ary Computation, CEC 2014, Beijing, China, July 6-11, 2014. pp. 2420–

2014

[11] [11]

https://doi.org/10.1109/CEC.2014.6900341, https://doi.org/ 10.1109/CEC.2014.6900341

IEEE (2014). https://doi.org/10.1109/CEC.2014.6900341, https://doi.org/ 10.1109/CEC.2014.6900341

work page doi:10.1109/cec.2014.6900341 2014

[12] [12]

In: International Colloquium on Grammatical Inference

Carrasco, R.C., Oncina, J.: Learning stochastic regular grammars by means of a state merging method. In: International Colloquium on Grammatical Inference. pp. 139–152. Springer (1994)

1994

[13] [13]

In: Dzeroski, S., Panov, P., Kocev, D., Todorovski, L

Ceci, M., Lanotte, P.F., Fumarola, F., Cavallo, D.P., Malerba, D.: Completion time and next activity prediction of processes using sequential pattern min- ing. In: Dzeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds.) Discovery Sci- ence - 17th International Conference, DS 2014, Bled, Slovenia, October 8-10,

2014

[14] [14]

Lecture Notes in Computer Science, vol

Proceedings. Lecture Notes in Computer Science, vol. 8777, pp. 49–61. Springer (2014). https://doi.org/10.1007/978-3-319-11812-3_5, https://doi.org/ 10.1007/978-3-319-11812-3_5

work page doi:10.1007/978-3-319-11812-3_5 2014

[15] [15]

Journal of algorithms3(1), 14–30 (1982)

Dolev, D.: The byzantine generals strike again. Journal of algorithms3(1), 14–30 (1982)

1982

[16] [16]

Information and Control 52(3), 257–274 (1982)

Dolev, D., Fischer, M.J., Fowler, R., Lynch, N.A., Strong, H.R.: An efficient al- gorithm for byzantine agreement without authentication. Information and Control 52(3), 257–274 (1982)

1982

[17] [19]

https://doi.org/10

van Dongen, B., Borchert, F.: BPI Challenge 2018. https://doi.org/10. 4121/uuid:3301445f-95e8-4ff0-98a4-901f1f204972 (2018). https://doi.org/10.4121/ UUID:3301445F-95E8-4FF0-98A4-901F1F204972

2018

[18] [20]

Dupont, P.: Incremental regular inference. In: Miclet, L., de la Higuera, C. (eds.) Grammatical Inference: Learning Syntax from Sentences, 3rd International Colloquium, ICGI-96, Montpellier, France, September 25-27, 1996, Proceedings. Lecture Notes in Computer Science, vol. 1147, pp. 222–237. Springer (1996). https://doi.org/10.1007/BFB0033357, https...

[19] [21]

Elyasi, K.A., van der Aa, H., Stuckenschmidt, H.: PGTNet: A process graph transformer network for remaining time prediction of business process instances (2024). https://doi.org/10.48550/arXiv.2404.06267, https://arxiv.org/abs/2404.06267

[20] [22]

de la Higuera, C.: Grammatical Inference: Learning Automata and Grammars. Cambridge University Press, USA (2010)

[21] [23]

Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Computation 9(8), 1735–1780 (1997)

[22] [24]

Isberner, M., Howar, F., Steffen, B.: The open-source learnlib - A framework for active automata learning. In: Kroening, D., Pasareanu, C.S. (eds.) Computer Aided Verification - 27th International Conference, CAV 2015, San Francisco, CA, USA, July 18-24, 2015, Proceedings, Part I. Lecture Notes in Computer Science, vol. 9206, pp. 487–495. Springer (...

[23] [25]

Krawczyk, B., Cano, A.: Online ensemble learning with abstaining classifiers for drifting and noisy data streams. Appl. Soft Comput. 68, 677–692 (2018). https://doi.org/10.1016/j.asoc.2017.12.008

[24] [26]

Lamport, L., Shostak, R., Pease, M.: The byzantine generals problem. ACM Trans. Program. Lang. Syst. 4(3), 382–401 (1982)

[25] [27]

Leucker, M.: Learning meets verification. In: de Boer, F.S., Bonsangue, M.M., Graf, S., de Roever, W.P. (eds.) Formal Methods for Components and Objects, 5th International Symposium, FMCO 2006, Amsterdam, The Netherlands, November 7-10, 2006, Revised Lectures. Lecture Notes in Computer Science, vol. 4709, pp. 127–151. Springer (2006). https://doi.org/10.1...

[26] [28]

Lischka, A., Rauch, S., Stritzel, O.: Directly follows graphs go predictive process monitoring with graph neural networks (2025). https://doi.org/10.48550/arXiv.2503.03197, https://arxiv.org/abs/2503.03197

[27] [30]

Mao, H., Chen, Y., Jaeger, M., Nielsen, T.D., Larsen, K.G., Nielsen, B.: Learning deterministic probabilistic automata from a model checking perspective. Mach. Learn. 105(2), 255–299 (2016). https://doi.org/10.1007/s10994-016-5565-9

[28] [31]

Mayr, F., Yovine, S., Carrasco, M., Pan, F., Vilensky, F.: A congruence-based approach to active automata learning from neural language models. In: Coste, F., Ouardi, F., Rabusseau, G. (eds.) International Conference on Grammatical Inference, ICGI 2023, 10-13 July 2023, Rabat, Morocco. Proceedings of Machine Learning Research, vol. 217, pp. 250–264. PMLR ...

[29] [32]

Muskardin, E., Aichernig, B.K., Pill, I., Pferscher, A., Tappler, M.: AALpy: an active automata learning library. Innov. Syst. Softw. Eng. 18(3), 417–426 (2022). https://doi.org/10.1007/s11334-022-00449-3

[30] [33]

Pegoraro, M., Uysal, M.S., Georgi, D.B., van der Aalst, W.M.P.: Text-aware predictive monitoring of business processes. In: Abramowicz, W., Auer, S., Lewanska, E. (eds.) 24th International Conference on Business Information Systems, BIS 2021, Hannover, Germany, June 15-17, 2021. pp. 221–232 (2021). https://doi.org/10.52825/BIS.V1I.62, https://doi.org...

[31] [34]

Polato, M., Sperduti, A., Burattin, A., de Leoni, M.: Time and activity sequence prediction of business process instances. Computing 100(9), 1005–1031 (2018). https://doi.org/10.1007/s00607-018-0593-x

[32] [35]

Rama-Maneiro, E., Vidal, J.C., Lama, M.: Embedding graph convolutional networks in recurrent neural networks for predictive monitoring (2021). https://doi.org/10.48550/arXiv.2112.09641, https://arxiv.org/abs/2112.09641

[33] [36]

Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., Hambro, E., Zettlemoyer, L., Cancedda, N., Scialom, T.: Toolformer: Language models can teach themselves to use tools. In: Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., Levine, S. (eds.) Advances in Neural Information Processing Systems 36: Annual Conference on Neural Informat...

[34] [37]

Schmidt, J., Kramer, S.: Online induction of probabilistic real-time automata. J. Comput. Sci. Technol. 29(3), 345–360 (2014). https://doi.org/10.1007/s11390-014-1435-8

[35] [38]

Shannon, C.E.: A mathematical theory of communication. The Bell System Technical Journal 27(3), 379–423 (1948)

[36] [39]

Steeman, W.: BPI Challenge 2013, incidents. https://doi.org/10.4121/uuid:500573e6-accc-4b0c-9576-aa5468b10cee (2013)

[37] [40]

Vaandrager, F.W.: Model learning. Commun. ACM 60(2), 86–95 (2017). https://doi.org/10.1145/2967606

[38] [41]

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. pp. 6000–6010. NIPS'17, Curran Associates Inc., Red Hook, NY, USA (2017)

[39] [42]

Verwer, S., Hammerschmidt, C.A.: flexfringe: A passive automaton learning package. In: 2017 IEEE International Conference on Software Maintenance and Evolution, ICSME 2017, Shanghai, China, September 17-22, 2017. pp. 638–642. IEEE Computer Society (2017). https://doi.org/10.1109/ICSME.2017.58

[40] [43]

Wang, F., Damiani, E.: Time-aware and transition-semantic graph neural networks for interpretable predictive business process monitoring (2025). https://doi.org/10.48550/arXiv.2508.09527, https://arxiv.org/abs/2508.09527

[41] [44]

Weiss, G., Goldberg, Y., Yahav, E.: On the practical computational power of finite precision RNNs for language recognition. In: Gurevych, I., Miyao, Y. (eds.) Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 2: Short Papers. pp. 740–745. Association for Comp...

[42] [45]

Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K.R., Cao, Y.: ReAct: Synergizing reasoning and acting in language models. In: The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net (2023)

[43] [46]

van Zelst, S.J., van Dongen, B.F., van der Aalst, W.M.P.: Event stream-based process discovery using abstract representations. Knowl. Inf. Syst. 54(2), 407–435 (2018). https://doi.org/10.1007/s10115-017-1060-2

[44] [47]

Zhou, Z.H.: Ensemble Methods: Foundations and Algorithms. Chapman & Hall/CRC, 1st edn. (2012)