Discovering Process Models With Long-Term Dependencies While Providing Guarantees and Filtering Infrequent Behavior Patterns
Pith reviewed 2026-05-24 10:36 UTC · model grok-4.3
The pith
Adaptations to the eST-Miner select place subsets so the resulting Petri net guarantees a minimum fitness on the event log while preserving high precision.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By adapting place selection inside the eST-Miner, a subset of places can be chosen so the assembled Petri net meets a user-specified minimum fitness on the event log; a refined place fitness metric is defined and tested that reduces blocking of infrequent activity labels.
What carries the argument
Place selection strategies that enforce a combined fitness guarantee together with a refined token-replay metric for isolated places.
If this is right
- The produced nets can capture long-term dependencies while still meeting the stated fitness bound.
- Precision stays high because only places that fit the chosen threshold are retained.
- Infrequent labels are less likely to be excluded by the place evaluation step.
- The method can be run on both real and artificial logs to produce comparable results across selection variants.
Where Pith is reading between the lines
- The same selection logic could be transferred to other place-based miners that currently evaluate candidates independently.
- Scaling the approach to very large logs may require additional pruning before the fitness checks are performed.
- If the refined metric proves stable, it could replace the original metric in any replay-based discovery tool.
Load-bearing premise
Places that each replay enough traces in isolation can be assembled without creating deadlocks or dropping below the target fitness for the whole net.
What would settle it
An event log and noise threshold where the adapted algorithm returns a net that replays strictly less than the required fraction of traces.
Figures
read the original abstract
In process discovery, the goal is to find, for a given event log, the model describing the underlying process. While process models can be represented in a variety of ways, Petri nets form a theoretically well-explored description language and are therefore often used. In this paper, we extend the eST-Miner process discovery algorithm. The eST-Miner computes a set of Petri net places which are considered to be fitting with respect to a certain fraction of the behavior described by the given event log as indicated by a given noise threshold. It evaluates all possible candidate places using token-based replay. The set of replayable traces is determined for each place in isolation, i.e., these sets do not need to be consistent. This allows the algorithm to abstract from infrequent behavioral patterns occurring only in some traces. However, when combining places into a Petri net by connecting them to the corresponding uniquely labeled transitions, the resulting net can replay exactly those traces from the event log that are allowed by the combination of all inserted places. Thus, inserting places one-by-one without considering their combined effect may result in deadlocks and low fitness of the Petri net. In this paper, we explore adaptions of the eST-Miner, that aim to select a subset of places such that the resulting Petri net guarantees a definable minimal fitness while maintaining high precision with respect to the input event log. Furthermore, current place evaluation techniques tend to block the execution of infrequent activity labels. Thus, a refined place fitness metric is introduced and thoroughly investigated. In our experiments we use real and artificial event logs to evaluate and compare the impact of the various place selection strategies and place fitness evaluation metrics on the returned Petri net.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper extends the eST-Miner algorithm, which discovers Petri net places via isolated token-based replay against a noise threshold. It introduces place-selection strategies to ensure the combined net achieves a user-specified minimum fitness while retaining high precision, and proposes a refined place fitness metric intended to avoid blocking infrequent activity labels. The adaptations and metric are evaluated on real and artificial event logs.
Significance. The work directly targets a known limitation of place-based miners: isolated fitness does not guarantee joint replayability. If the selection procedures deliver the claimed fitness bounds without excessive precision loss, the result would strengthen the practical utility of eST-Miner-style algorithms in noisy logs. The experimental comparison of multiple selection heuristics and the refined metric supplies concrete evidence on trade-offs.
major comments (2)
- [§4.2] §4.2 (place-selection procedure): the proof sketch that the greedy or exhaustive subset selection preserves the target fitness relies on the assumption that adding a place never decreases the replayable trace set; this needs an explicit invariant or counter-example check, because the token-replay semantics of the combined net can introduce deadlocks not visible in isolated places.
- [Table 2] Table 2 (fitness results on real logs): the reported fitness values for the refined metric are only 0.02–0.05 higher than the baseline, yet the paper claims the metric “thoroughly” prevents blocking of infrequent labels; the effect size appears too small to support the central claim without additional statistical tests or larger logs.
minor comments (2)
- [Eq. (7)] Notation for the refined fitness function (Eq. 7) re-uses the symbol f without re-defining it after the original token-replay fitness; a short clarifying sentence would avoid ambiguity.
- [§5] The experimental section lists log characteristics but omits the exact parameter settings (noise threshold, minimum place support) used for each run; these should be tabulated for reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major comment below and note the revisions that will be incorporated.
read point-by-point responses
-
Referee: [§4.2] §4.2 (place-selection procedure): the proof sketch that the greedy or exhaustive subset selection preserves the target fitness relies on the assumption that adding a place never decreases the replayable trace set; this needs an explicit invariant or counter-example check, because the token-replay semantics of the combined net can introduce deadlocks not visible in isolated places.
Authors: We agree that the proof sketch in §4.2 is insufficiently rigorous. Adding a place can only produce a non-increasing replayable trace set due to additional constraints, and deadlocks may arise from interactions not visible in isolation. The selection strategies aim to ensure the combined fitness bound by construction (via exhaustive enumeration or greedy addition with combined checks), but we will revise the section to state an explicit invariant: the selected subset is guaranteed to meet the target fitness because candidate places are only retained after verifying the joint replayability on the log. A formal argument or counter-example verification will be added. revision: yes
-
Referee: [Table 2] Table 2 (fitness results on real logs): the reported fitness values for the refined metric are only 0.02–0.05 higher than the baseline, yet the paper claims the metric “thoroughly” prevents blocking of infrequent labels; the effect size appears too small to support the central claim without additional statistical tests or larger logs.
Authors: The manuscript uses 'thoroughly investigated' to describe the evaluation of the metric, not a claim that it thoroughly prevents blocking. The modest aggregate fitness gains mask per-trace improvements where the refined metric avoids blocking infrequent labels that the baseline blocks. We will add statistical tests on the differences and results from additional larger synthetic logs to better substantiate the practical impact. revision: partial
Circularity Check
No significant circularity; claims rest on standard Petri net semantics and token replay
full rationale
The paper extends eST-Miner by selecting place subsets that guarantee definable minimal fitness in the combined net and introduces a refined place fitness metric. No equations, definitions, or self-citations reduce the fitness guarantees or metric to fitted parameters or prior results by construction. The approach uses independent place evaluation via token-based replay, with the combination step treated as a separate algorithmic adaptation rather than a self-referential fit. This is the expected non-finding for a paper whose central contribution is an engineering adaptation of established process-mining primitives.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Petri nets are a suitable formal language for process models and token-based replay can evaluate individual place fitness against an event log
Reference graph
Works this paper leans on
-
[1]
Understanding Petri Nets: Modeling Techniques, Analysis Methods, Case Studies
Reisig W. Understanding Petri Nets: Modeling Techniques, Analysis Methods, Case Studies. Springer Berlin Heidelberg, 2013. ISBN:9783642332784
work page 2013
-
[2]
Petri nets and business process management
Desel J, Oberweis A, Reisig W, Rozenberg G. Petri nets and business process management. Saarbr¨ucken: Gesch¨aftsstelle Schloss Dagstuhl, 1998
work page 1998
-
[3]
Desel J, Esparza J. Free-Choice Petri Nets. Cambridge Tracts in Theoretical Computer Science. Cambridge University Press, 1995. ISBN:9780521465199
work page 1995
-
[4]
Transformations and Decompositions of Nets
Berthelot G. Transformations and Decompositions of Nets. In: Petri Nets: Central Models and Their Properties. Springer, Berlin, Heidelberg, 1987 pp. 359–376
work page 1987
-
[5]
Process Mining: Data Science in Action
van der Aalst W. Process Mining: Data Science in Action. Springer, Heidelberg, 2 edition, 2016. doi:10.1007/978-3-662-49851-4
-
[6]
Mining process models with non-free-choice constructs
Wen L, van der Aalst WMP, Wang J, Sun J. Mining process models with non-free-choice constructs. Data Mining and Knowledge Discovery, 2007. 15(2):145–180. doi:10.1007/s10618-007-0065-y. L.L. Mannel and W. M.P . van der Aalst/ Discovering Process Models while Providing Guarantees... 157
-
[7]
Discovering Block-Structured Process Models from Event Logs - A Constructive Approach
Leemans S, Fahland D, van der Aalst W. Discovering Block-Structured Process Models from Event Logs - A Constructive Approach. Application and Theory of Petri Nets and Concurrency, 2013. Lecture Notes in Computer Science, vol 7927. doi:10.1007/978-3-642-38697-8 17
-
[8]
Flexible Heuristics Miner (FHM)
Weijters AJMM, Ribeiro JTS. Flexible Heuristics Miner (FHM). In: Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2011, part of the IEEE Symposium Series on Computational Intelligence 2011, April 11-15, 2011, Paris, France. IEEE, 2011 pp. 310–317. doi:10.1109/CIDM.2011.5949453
-
[9]
Badouel E, Bernardinello L, Darondeau P. Petri Net Synthesis. Text in Theoretical Computer Science, EATCS Series. Springer, 2015. doi:10.1007/978-3-662-47967-4
-
[10]
How to Synthesize Nets from Languages: A Survey
Lorenz R, Mauser S, Juh ´as G. How to Synthesize Nets from Languages: A Survey. In: Proceedings of the 39th Conference on Winter Simulation: 40 Years! The Best is Yet to Come, WSC ’07. IEEE Press, Piscataway, NJ, USA, 2007 pp. 637–647. doi:10.1109/WSC.2007.4419657
-
[12]
Process Discovery Using Integer Linear Programming
van der Werf JM, van Dongen B, Hurkens C, Serebrenik A. Process Discovery Using Integer Linear Programming. In: Applications and Theory of Petri Nets. Springer, Berlin, Heidelberg, 2008. doi:10.1007/978-3-540-68746-7 24
-
[13]
Avoiding Over-Fitting in ILP-Based Process Discovery
van Zelst S, van Dongen B, van der Aalst W. Avoiding Over-Fitting in ILP-Based Process Discovery. In: Business Process Management. Springer International Publishing, Cham, 2015 pp. 163–171. doi:10.1007/978-3-319-23063-4 10
-
[14]
ILP-Based Process Discovery Using Hybrid Regions
van Zelst S, van Dongen B, van der Aalst W. ILP-Based Process Discovery Using Hybrid Regions. In: ATAED@Petri Nets/ACSD. 2015
work page 2015
-
[15]
A region-based algorithm for discovering Petri nets from event logs
Carmona J, Cortadella J, Kishinevsky M. A region-based algorithm for discovering Petri nets from event logs. In: Business Process Management. Springer, 2008 p. 358–373
work page 2008
-
[16]
Deriving unbounded Petri nets from formal languages
Darondeau P. Deriving unbounded Petri nets from formal languages. In: CONCUR’98 Concurrency Theory. Springer, Berlin, Heidelberg. 1998 pp. 533–548. ISBN:978-3-540-68455-8
work page 1998
-
[17]
Synthesis of Petri Nets from Finite Partial Languages
Bergenthum R, Desel J, Lorenz R, Mauser S. Synthesis of Petri Nets from Finite Partial Languages. Fundam. Informaticae, 2008. 88(4):437–468
work page 2008
-
[18]
Ehrenfeucht A, Rozenberg G. Partial (set) 2-structures. Acta Informatica, 1990. 27(4):343–368
work page 1990
-
[19]
A Symbolic Algorithm for the Synthesis of Bounded Petri Nets
Carmona J, Cortadella J, Kishinevsky M, Kondratyev A, Lavagno L, Yakovlev A. A Symbolic Algorithm for the Synthesis of Bounded Petri Nets. In: van Hee KM, Valk R (eds.), Applications and Theory of Petri Nets. Springer Berlin Heidelberg, Berlin, Heidelberg. 2008 pp. 92–111. ISBN:978-3-540-68746-7
work page 2008
-
[20]
Automated Repair of Process Models Using Non-Local Constraints
Kalenkova A, Carmona J, Polyvyanyy A, La Rosa M. Automated Repair of Process Models Using Non-Local Constraints. In: Application and Theory of Petri Nets and Concurrency. Springer-Verlag, Berlin, Heidelberg. 2020 p. 280–300. ISBN:978-3-030-51830-1
work page 2020
-
[21]
Finding Complex Process-Structures by Exploiting the Token-Game
Mannel LL, van der Aalst WMP. Finding Complex Process-Structures by Exploiting the Token-Game. In: Donatelli S, Haar S (eds.), Application and Theory of Petri Nets and Concurrency - 40th International Conference, PETRI NETS 2019, Aachen, Germany, June 23-28, 2019, Proceedings, volume 11522 of Lecture Notes in Computer Science. Springer, 2019 pp. 258–278. ...
-
[22]
Mannel LL, van der Aalst WMP. Discovering Process Models with Long-Term Dependencies While Providing Guarantees and Handling Infrequent Behavior. In: Bernardinello L, Petrucci L (eds.), Application and Theory of Petri Nets and Concurrency - 43rd International Conference, PETRI NETS 2022, Bergen, Norway, June 19-24, 2022, Proceedings, volume 13288 of Lectu...
-
[23]
Implicit places in net systems
Garcia-Valles F, Colom J. Implicit places in net systems. Proceedings 8th International Workshop on Petri Nets and Performance Models, 1999. pp. 104–113. doi:10.1109/PNPM.1999.796557
-
[24]
Petri Net Reductions for Counting Markings
Berthomieu B, Botlan DL, Dal-Zilio S. Petri Net Reductions for Counting Markings. In: Gallardo M, Merino P (eds.), Model Checking Software - 25th International Symposium, SPIN 2018, Malaga, Spain, June 20-22, 2018, Proceedings, volume 10869 of LNCS. Springer, 2018 pp. 65–84
work page 2018
-
[25]
Improving the linearly based characterization of P/T nets
Colom J, Silva M. Improving the linearly based characterization of P/T nets. In: Advances in Petri Nets
-
[26]
Springer, Berlin, Heidelberg, 1991 pp. 113–145. doi:10.1007/3-540-53863-1 23
-
[27]
Removing Implicit Places Using Regions for Process Discovery
Mannel LL, Bergenthum R, van der Aalst WMP. Removing Implicit Places Using Regions for Process Discovery. In: Proceedings of the International Workshop on Algorithms & Theories for the Analysis of Event Data (ATAED) 2020, volume 2625. CEUR-WS.org pp. 20–32
work page 2020
-
[28]
Discovering the ”Glue” Connecting Activities - Exploiting Monotonicity to Learn Places Faster
van der Aalst WMP. Discovering the ”Glue” Connecting Activities - Exploiting Monotonicity to Learn Places Faster. In: It’s All About Coordination - Essays to Celebrate the Lifelong Scientific Achievements of Farhad Arbab. 2018 pp. 1–20. doi:10.1007/978-3-319-90089-6 1
-
[29]
Conformance Checking - Relating Processes and Models
Carmona J, van Dongen B, Solti A, Weidlich M. Conformance Checking - Relating Processes and Models. Springer, Cham, 2018. doi:10.1007/978-3-319-99414-7
-
[30]
Sepsis Cases - Event Log, 2016
Mannhardt, F. Sepsis Cases - Event Log, 2016. doi:10.4121/UUID:915D2BFB-7E84-49AD-A286-DC35F063A460
work page doi:10.4121/uuid:915d2bfb-7e84-49ad-a286-dc35f063a460 2016
-
[31]
Road Traffic Fine Management Process, 2015
De Leoni, M, Mannhardt, F. Road Traffic Fine Management Process, 2015
work page 2015
-
[32]
van der Aalst WMP. Spreadsheets for BPM. Business Process Management Journal, 2010. 24:105–127
work page 2010
-
[33]
Aligning observed and modeled behavior
Adriansyah A. Aligning observed and modeled behavior. Ph.D. thesis, Mathematics and Computer Science, 2014. doi:10.6100/IR770080
-
[34]
A Fresh Look at Precision in Process Conformance
Munoz-Gama J, Carmona J. A Fresh Look at Precision in Process Conformance. In: BPM, volume 6336. 2010 pp. 211–226. ISBN:978-3-642-15617-5
work page 2010
-
[35]
Soundness of Resource-Constrained Workflow Nets
van Dongen B, de Medeiros A, Verbeek H, Weijters A, van der Aalst W. The ProM Framework: A New Era in Process Mining Tool Support. In: Applications and Theory of Petri Nets 2005. Springer, Berlin, Heidelberg, 2005 pp. 444–454. doi:10.1007/11494744 25
-
[36]
Mendling J, Reijers HA, Cardoso J. What Makes Process Models Understandable? In: Alonso G, Dadam P, Rosemann M (eds.), Business Process Management. Springer Berlin Heidelberg, 2007 pp. 48–63. doi:10.1007/978-3-540-75183-0 4
-
[37]
Discovering more precise process models from event logs by filtering out chaotic activities
Tax N, Sidorova N, van der Aalst WMP. Discovering more precise process models from event logs by filtering out chaotic activities. J. Intell. Inf. Syst., 2019. 52(1):107–139. doi:10.1007/s10844-018-0507-6
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.