pith. sign in

arxiv: 2509.15346 · v2 · submitted 2025-09-18 · 💻 cs.DB

Revealing Inherent Concurrency in Event Data: A Partial Order Approach to Process Discovery

Pith reviewed 2026-05-18 15:31 UTC · model grok-4.3

classification 💻 cs.DB
keywords process discoverypartial ordersevent logsconcurrencyprocess miningprocess modelsscalability
0
0 comments X

The pith

Deriving partially ordered traces from event data enables process models that preserve inherent concurrency and fit the data perfectly.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Traditional process discovery algorithms force events into a linear order, which obscures the true concurrency present in many real processes. This paper presents an algorithm that instead extracts partial orders directly from event logs to represent possible concurrent executions. These partial orders are then aggregated using a hierarchical method that abstracts exclusive choices and loops, producing a process model that is guaranteed to be sound and to fit the observed behavior exactly. A sympathetic reader would care because more accurate models of concurrency lead to better understanding, simulation, and improvement of business processes that involve parallel activities.

Core claim

The paper claims that by deriving partially ordered traces from event data and aggregating them hierarchically, it is possible to construct process models that are sound by construction, perfectly fit the event data, and preserve the inherent concurrency without the loss caused by linearization.

What carries the argument

The hierarchical aggregation of partially ordered traces, which systematically preserves concurrency while abstracting choices and loop patterns.

If this is right

  • The process models remain sound regardless of the input log structure.
  • Models fit the data with no deviations or missing behaviors from the partial orders.
  • Concurrency is maintained in the model representation rather than being linearized.
  • The hierarchical abstraction reduces model size while increasing precision for concurrent logs.
  • The method applies successfully to large and complex real-life event logs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Integrating this partial-order approach with existing process mining tools could enhance their handling of parallel process flows.
  • Applying the method to streaming event data might enable real-time discovery of concurrent behaviors.
  • Comparing the compactness of these models against traditional ones on the same logs could quantify the benefits for visualization and analysis.
  • Extending the derivation of partial orders to handle noisy or incomplete event data would broaden applicability to practical scenarios.

Load-bearing premise

Partially ordered traces can be accurately derived from standard event data to capture inherent concurrency without introducing artificial orderings or losing information.

What would settle it

Running the algorithm on an event log from a process with known concurrent activities and finding that the output model imposes a fixed order on those activities would indicate the partial orders failed to capture the concurrency.

Figures

Figures reproduced from arXiv: 2509.15346 by Gyunam Park, Humam Kourani, Wil M.P. van der Aalst.

Figure 1
Figure 1. Figure 1: Trace variants for the hospital process, represented as four partial orders. [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The POWL model discovered from the BPI Challenge 2012 event log with [PITH_FULL_IMAGE:figures/full_fig_p011_2.png] view at source ↗
read the original abstract

Process discovery algorithms traditionally linearize events, failing to capture the inherent concurrency of real-world processes. While some techniques can handle partially ordered data, they often struggle with scalability on large event logs. We introduce a novel, scalable algorithm that directly leverages partial orders in process discovery. Our approach derives partially ordered traces from event data and aggregates them into a sound-by-construction, perfectly fitting process model. Our hierarchical algorithm preserves inherent concurrency while systematically abstracting exclusive choices and loop patterns, enhancing model compactness and precision. We have implemented our technique and demonstrated its applicability on complex real-life event logs. Our work contributes a scalable solution for a more faithful representation of process behavior, especially when concurrency is prevalent in event data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to introduce a novel scalable algorithm for process discovery that derives partially ordered traces directly from event data, aggregates them into a sound-by-construction and perfectly fitting process model, and employs a hierarchical abstraction mechanism to handle exclusive choices and loops while preserving inherent concurrency and improving model compactness and precision. The work is demonstrated on complex real-life event logs.

Significance. If the derivation of partial orders from linear logs is shown to be information-preserving and the aggregation step rigorously guarantees soundness and perfect fit without artifacts, the approach could meaningfully advance process mining by better capturing concurrency than traditional linearization methods, with potential benefits for scalability and model quality in domains where concurrency is prevalent.

major comments (2)
  1. [Abstract and derivation description] The central claim that partially ordered traces can be derived from standard sequential event logs while accurately capturing inherent concurrency without introducing spurious orderings or information loss is load-bearing but lacks formal justification or error-handling details; standard logs record total orders, so any inference step (via timestamps or heuristics) must be proven to avoid artifacts that would undermine the subsequent aggregation's soundness guarantee.
  2. [Aggregation and hierarchical algorithm] The assertion of a 'sound-by-construction, perfectly fitting' model via aggregation of partial orders requires explicit derivation or proof; without it, it is unclear how hierarchical abstraction of exclusive choices and loops maintains these properties or affects precision, especially given the absence of evidence on how abstractions impact results.
minor comments (2)
  1. Clarify notation for partial orders and traces to avoid ambiguity in how concurrency is represented.
  2. The demonstration on real-life logs would benefit from quantitative comparison to baseline discovery algorithms on metrics like fitness, precision, and scalability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which identifies key areas where additional formal detail would strengthen the presentation. We respond to each major comment below and will revise the manuscript accordingly to address the concerns raised.

read point-by-point responses
  1. Referee: [Abstract and derivation description] The central claim that partially ordered traces can be derived from standard sequential event logs while accurately capturing inherent concurrency without introducing spurious orderings or information loss is load-bearing but lacks formal justification or error-handling details; standard logs record total orders, so any inference step (via timestamps or heuristics) must be proven to avoid artifacts that would undermine the subsequent aggregation's soundness guarantee.

    Authors: We agree that the derivation of partial orders from sequential event logs is a load-bearing step that requires explicit formal justification to rule out spurious orderings and information loss. The manuscript currently describes this derivation at a conceptual level. In the revised version we will add a dedicated subsection that formally defines the derivation function, states the assumptions on the input log (including timestamp availability), specifies error-handling for ambiguous cases, and includes a proof sketch showing that the inferred partial orders preserve the information in the original total orders without introducing artifacts. This will directly support the soundness claims made for the aggregation step. revision: yes

  2. Referee: [Aggregation and hierarchical algorithm] The assertion of a 'sound-by-construction, perfectly fitting' model via aggregation of partial orders requires explicit derivation or proof; without it, it is unclear how hierarchical abstraction of exclusive choices and loops maintains these properties or affects precision, especially given the absence of evidence on how abstractions impact results.

    Authors: We concur that an explicit derivation or proof of the sound-by-construction and perfect-fit properties is needed, along with clarification of how hierarchical abstraction affects precision. The manuscript argues these properties from the direct use of partial orders in aggregation. In the revision we will insert a formal theorem together with its proof (in the main text or an appendix) establishing that the aggregation produces a sound model that perfectly fits the input partial-order traces. We will also add an analysis of the hierarchical abstraction step, including its effect on precision, supported by quantitative results from the real-life logs that compare precision and other quality metrics across different abstraction levels. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation presented as direct construction from partial orders.

full rationale

The provided abstract and context describe deriving partially ordered traces from event data then aggregating them into a sound-by-construction model. No equations, self-citations, or fitted parameters are quoted that reduce any prediction or central result back to its own inputs by definition. The approach is framed as a hierarchical algorithm preserving concurrency, with no load-bearing self-referential steps or uniqueness theorems imported from prior author work visible in the given text. This is the expected non-finding for a paper whose core steps remain independent of the target claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that event data inherently supports partial order representation of concurrency; no free parameters or invented entities are mentioned in the abstract.

axioms (1)
  • domain assumption Event data can be represented as partial orders that capture inherent concurrency without forcing artificial linear orders.
    This is the core premise enabling derivation of partially ordered traces as stated in the abstract.

pith-pipeline@v0.9.0 · 5654 in / 1273 out tokens · 56437 ms · 2026-05-18T15:31:01.218604+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages

  1. [1]

    Arthur H. M. ter Hofstede et al.: Process-data quality: The true frontier of process mining. ACM J. Data Inf. Qual.15(3), 29:1–29:21 (2023)

  2. [2]

    In: ICPM 2020 Workshops

    Augusto, A., Dumas, M., Rosa, M.L.: Automated discovery of process models with true concurrency and inclusive choices. In: ICPM 2020 Workshops. pp. 43–56 (2020)

  3. [3]

    In: ICPM 2019

    Bergenthum, R.: Prime miner - process discovery using prime event structures. In: ICPM 2019. pp. 41–48 (2019)

  4. [4]

    In: Applications of Region Theory 2011

    Bergenthum, R., Mauser, S.: Folding partially ordered runs. In: Applications of Region Theory 2011. pp. 52–62 (2011)

  5. [5]

    van Dongen, B.F., Desel, J., van der Aalst, W.M.P.: Aggregating causal runs into workflow nets. Trans. Petri Nets Other Model. Concurr.6, 334–363 (2012)

  6. [6]

    In: PETRI NETS 2015

    Dumas, M., García-Bañuelos, L.: Process mining reloaded: Event structures as a unified representation of process models and event logs. In: PETRI NETS 2015. pp. 33–48 (2015)

  7. [7]

    In: PETRI NETS 2023

    Folz-Weinstein, S., Bergenthum, R., Desel, J., Kovár, J.: ILP2 miner - process discovery for partially ordered event logs using integer linear programming. In: PETRI NETS 2023. pp. 59–76 (2023)

  8. [8]

    In: ICPM 2023

    Kourani, H., Schuster, D., van der Aalst, W.M.P.: Scalable discovery of partially ordered workflow models with formal guarantees. In: ICPM 2023. pp. 89–96 (2023)

  9. [9]

    Kourani, H., van Zelst, S.J.: POWL: partially ordered workflow language. In: BPM

  10. [10]

    Kourani, H., van Zelst, S.J., Schuster, D., van der Aalst, W.M.P.: Discovering partially ordered workflow models. Inf. Syst.128, 102493 (2025)

  11. [11]

    In: AWPN 2024 (2024)

    Kovář, J., Bergenthum, R.: Exploratory process discovery for petri nets. In: AWPN 2024 (2024)

  12. [12]

    In: BPM 2015 Workshops

    Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Using life cycle information in process discovery. In: BPM 2015 Workshops. pp. 204–217 (2015) Revealing Inherent Concurrency in Event Data 13

  13. [13]

    Leemans, S.J.J., van Zelst, S.J., Lu, X.: Partial-order-based process mining: a sur- vey and outlook. Knowl. Inf. Syst.65(1), 1–29 (2023)

  14. [14]

    In: ATVA 2015

    de León, H.P., Rodríguez, C., Carmona, J., Heljanko, K., Haar, S.: Unfolding-based process discovery. In: ATVA 2015. pp. 31–47 (2015)

  15. [15]

    In: Fernandez-Llatas, C

    Martin, N.: Data quality in process mining. In: Fernandez-Llatas, C. (ed.) Inter- active Process Mining in Healthcare. pp. 53–79. Springer (2020)

  16. [16]

    arXiv preprint arXiv:2504.08372 (2025)

    Sabine Folz-Weinstein et al.: eST2 miner–process discovery based on firing partial orders. arXiv preprint arXiv:2504.08372 (2025)

  17. [17]

    In: PNCWB 2005

    Van Dongen, B.F., Van der Aalst, W.M.: Multi-phase process mining: Aggregating instance graphs into EPCs and Petri nets. In: PNCWB 2005. pp. 35–58 (2005)