pith. sign in

arxiv: 2605.26938 · v1 · pith:74R7NIXEnew · submitted 2026-05-26 · 💻 cs.AI · math.OC

Developing a Totally Unimodular Linear Program for Optimal Conformance Checking: When and Why It Complements A*

Pith reviewed 2026-06-29 17:01 UTC · model grok-4.3

classification 💻 cs.AI math.OC
keywords conformance checkingalignmenttotally unimodularlinear programmingA* searchprocess miningnetwork flowreachability graph
0
0 comments X

The pith

Reformulating conformance checking as a totally unimodular linear program guarantees integral optimal solutions via standard LP solvers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that alignment-based conformance checking can be recast as a linear program whose constraint matrix is totally unimodular due to the network-flow structure of the synchronous product's reachability graph. This property ensures the LP relaxation yields an integral optimum, removing the need for integer variables or branch-and-bound search. The resulting LP is faster than A* on long traces that contain deviations, while A* remains faster on short, well-conforming traces. Simple selection rules that switch between the two methods produce an average 38.6 percent runtime reduction across more than 2.1 million instances while preserving 96 percent selection accuracy.

Core claim

By modeling alignment as a minimum-cost flow on the reachability graph of the synchronous product and exploiting its network-flow structure, the LP constraint matrix is totally unimodular; therefore the optimal extreme-point solution of the relaxed LP is guaranteed to be integral and can be obtained without combinatorial overhead.

What carries the argument

The totally unimodular linear program on the reachability graph of the synchronous product, whose network-flow structure supplies the total unimodularity that converts the relaxed LP into an exact integer solution.

If this is right

  • The LP formulation supplies substantial speedups precisely on longer traces that contain deviations.
  • A* remains preferable on short traces that conform well to the model.
  • A hybrid selector that chooses between the two methods based on trace characteristics achieves 38.6 percent average runtime savings at 96 percent accuracy.
  • The approach eliminates the exponential worst-case behavior of A* when substantial deviations are present.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same network-flow structure may appear in other alignment variants, allowing analogous LP reformulations without integer variables.
  • Process-mining platforms could embed both solvers and apply the selection rule to handle mixed workloads efficiently in practice.
  • Sensitivity information from the LP optimum could be used to rank which deviations matter most, an analysis not directly available from A*.

Load-bearing premise

The reachability graph of the synchronous product has an underlying network-flow structure that makes the LP constraint matrix totally unimodular.

What would settle it

An instance in which the optimal solution to the LP relaxation is fractional rather than integral.

Figures

Figures reproduced from arXiv: 2605.26938 by Izack Cohen.

Figure 1
Figure 1. Figure 1: Two Petri nets, without and with a loop. [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Three models for traces (a) ⟨abcde⟩ (b) ⟨abe⟩ (c) ⟨acbdbe⟩. 8 [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: A synchronous product of process and trace models. [PITH_FULL_IMAGE:figures/full_fig_p013_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The complete reachability graph of the synchronous product in Figure 3. Arcs [PITH_FULL_IMAGE:figures/full_fig_p015_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: (a) A process of handling insurance claims by an insurance company, and (b) [PITH_FULL_IMAGE:figures/full_fig_p022_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Mean execution time by alignment cost. Values in parentheses indicate mean [PITH_FULL_IMAGE:figures/full_fig_p030_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Mean execution time vs. trace length for [PITH_FULL_IMAGE:figures/full_fig_p032_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: RG construction time vs. graph size across all datasets. The fitted line shows [PITH_FULL_IMAGE:figures/full_fig_p037_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: LP win rate by expected deviations across all datasets. The dashed line marks [PITH_FULL_IMAGE:figures/full_fig_p039_9.png] view at source ↗
read the original abstract

Alignment-based conformance checking is the state-of-the-art approach for comparing observed process executions with normative process models. The standard exact solution relies on an A*-based heuristic search, which can exhibit exponential runtime in the presence of long traces or substantial deviations. This paper introduces a reformulation of alignment-based conformance checking as a totally unimodular linear program (LP) defined on the reachability graph of the synchronous product. By exploiting the underlying network-flow structure, the proposed formulation guarantees the existence of an integral optimal extreme-point solution through LP relaxation, thereby avoiding the combinatorial overhead associated with integer variables and branch-and-bound search. We conduct an extensive empirical evaluation on more than 2.1 million conformance checking instances derived from real-world and synthetic benchmark datasets. The results show that A* and the LP approach exhibit complementary performance characteristics: the former performs best on short, well-conforming traces, while the LP formulation provides substantial speedups for longer traces with deviations, precisely where conformance checking is most informative. Based on these findings, we derive simple algorithm-selection guidelines that combine both approaches, achieving average runtime savings of 38.6% with 96% selection accuracy compared to always using A*.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that alignment-based conformance checking can be reformulated as a totally unimodular linear program on the reachability graph of the synchronous product. By exploiting an underlying network-flow structure, the LP relaxation is asserted to yield integral optimal solutions without branch-and-bound. Large-scale experiments on >2.1 million instances from real and synthetic data show complementary runtime behavior with A* (LP faster on long traces with deviations), leading to a selection heuristic with 38.6% average savings at 96% accuracy.

Significance. If the total-unimodularity claim is correct, the work supplies a useful complementary exact method for conformance checking that scales better than A* precisely on the harder instances. The scale of the empirical evaluation (millions of instances) is a clear strength and supports the practical value of the derived selection guidelines.

major comments (2)
  1. [Abstract] Abstract (reformulation paragraph): the central guarantee that 'the proposed formulation guarantees the existence of an integral optimal extreme-point solution through LP relaxation' rests on the constraint matrix being totally unimodular because of a network-flow structure, yet no explicit LP (variables for log/model/synchronous moves, flow-conservation equalities, objective) or proof that the matrix satisfies the network-matrix TU characterization (at most two nonzero entries per row of opposite sign) is supplied.
  2. [Empirical evaluation] Empirical evaluation (results paragraph): the reported 38.6% average runtime savings and 96% selection accuracy are presented without any table, per-dataset breakdown, or raw timing data, making it impossible to verify the complementarity claim or the conditions under which the LP is selected.
minor comments (1)
  1. The abstract states 'more than 2.1 million' instances but does not give the exact count or the split between real-world and synthetic benchmarks; this should be stated precisely in the experimental section.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of the work's significance and the scale of the empirical evaluation. We address each major comment below and will revise the manuscript to improve clarity and verifiability of the claims.

read point-by-point responses
  1. Referee: [Abstract] Abstract (reformulation paragraph): the central guarantee that 'the proposed formulation guarantees the existence of an integral optimal extreme-point solution through LP relaxation' rests on the constraint matrix being totally unimodular because of a network-flow structure, yet no explicit LP (variables for log/model/synchronous moves, flow-conservation equalities, objective) or proof that the matrix satisfies the network-matrix TU characterization (at most two nonzero entries per row of opposite sign) is supplied.

    Authors: The abstract is intentionally concise and does not contain the explicit LP or TU proof. The full manuscript (Section 3) defines the LP with variables for log moves, model moves, and synchronous moves on the reachability graph of the synchronous product, along with flow-conservation equalities and the standard alignment cost objective. The TU property is established by showing the constraint matrix is a network matrix (each row has at most two nonzero entries of opposite sign) due to the flow structure. We will revise the abstract to include a one-sentence pointer to this structure and add an explicit statement of the LP and a short proof sketch (or reference to the network-matrix characterization) in the main text or an appendix. revision: yes

  2. Referee: [Empirical evaluation] Empirical evaluation (results paragraph): the reported 38.6% average runtime savings and 96% selection accuracy are presented without any table, per-dataset breakdown, or raw timing data, making it impossible to verify the complementarity claim or the conditions under which the LP is selected.

    Authors: We agree that summary statistics alone limit verifiability. The manuscript will be revised to include a table (or set of tables) providing per-dataset breakdowns of runtime savings, selection accuracy, and the simple selection rules (e.g., based on trace length and number of deviations). We will also add a link to a public repository containing the raw timing data for the >2.1 million instances to allow independent verification of the complementarity between A* and the LP approach. revision: yes

Circularity Check

0 steps flagged

No circularity: TU property asserted from graph encoding, not reduced to self-fit or self-citation

full rationale

The paper's central claim is that the reformulation of conformance checking as an LP on the synchronous-product reachability graph yields a totally unimodular constraint matrix due to its network-flow structure, guaranteeing integral optima via LP relaxation. No equations, fitted parameters, or self-citations are exhibited that would make the TU guarantee equivalent to its inputs by construction (e.g., no redefinition of the matrix via the result itself, no renaming of known results, and no load-bearing self-citation chain). The derivation is presented as a direct structural property of the encoding, making the result self-contained against external verification rather than internally circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based solely on abstract; the central claim rests on one domain assumption about network-flow structure.

axioms (1)
  • domain assumption The reachability graph of the synchronous product has a network-flow structure that makes the LP constraint matrix totally unimodular.
    Invoked to guarantee integral optimal solutions from LP relaxation without integer programming.

pith-pipeline@v0.9.1-grok · 5742 in / 1278 out tokens · 66801 ms · 2026-06-29T17:01:07.247408+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

7 extracted references · 6 canonical work pages

  1. [1]

    PhD The- sis.TechnischeUniversiteitEindhoven

    Aligning observed and modeled behavior. PhD The- sis.TechnischeUniversiteitEindhoven. URL:https://doi.org/10.6100/ IR770080, doi:10.6100/IR770080. Adriansyah, A., van Dongen, B.F., van der Aalst, W.M.,

  2. [2]

    Efficient approximate conformance checking using trie data structures, in: 2021 3rd International Confer- ence on Process Mining (ICPM), pp. 1–8. doi:10.1109/ICPM53251.2021. 9576845. Bauer, M., Van der Aa, H., Weidlich, M.,

  3. [3]

    Symbolicallyaligning observed and modelled behaviour, in: 18th International Conference on Application of Concurrency to System Design (ACSD), pp. 50–59. doi:10. 1109/ACSD.2018.00008. Bogdanov, E., Cohen, I., Gal, A.,

  4. [4]

    arXiv preprint arXiv:2406.05439

    A scalable and near-optimal confor- mance checking approach for long traces. arXiv preprint arXiv:2406.05439 . Boltenhagen, M., Chatain, T., Carmona, J.,

  5. [5]

    Aligning event logs and process models for multi-perspective conformance checking: An approach based on integer linear programming, in: Business Process Management: 11th International Conference, BPM 2013, Beijing, China, August 26-30,

  6. [7]

    Efficiently computing alignments: Using the extended marking equation, in: Business Process Management (BPM), Springer. pp. 197–214. doi:10.1007/978-3-319-98648-7_12. Fani Sani, M., van Zelst, S.J., van der Aalst, W.M.,

  7. [10]

    4TU.ResearchData

    BPI Challenge 2013, incidents. 4TU.ResearchData. URL:https://doi.org/10.4121/uuid: 500573e6-accc-4b0c-9576-aa5468b10cee, doi:10.4121/uuid: 500573e6-accc-4b0c-9576-aa5468b10cee. Tardos, É.,