
arxiv: 2505.05880 · v2 · pith:2XAYFRGYnew · submitted 2025-05-09 · 💻 cs.AI · cs.LG

Combining Abstract Argumentation and Machine Learning for Efficiently Analyzing Low-Level Process Event Streams

Pith reviewed 2026-05-22 16:40 UTC · model grok-4.3

classification 💻 cs.AI cs.LG
keywords neuro-symbolic approach · abstract argumentation framework · sequence tagging · process event streams · data-efficient learning · event interpretation · process monitoring

The pith

A neuro-symbolic approach uses machine learning to suggest event interpretations and argumentation to refine them with prior knowledge.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a method for interpreting low-level events in process traces when the mapping to business activities is uncertain. It trains a sequence-tagging model on limited examples to propose candidate interpretations and then applies an Abstract Argumentation Framework to refine those candidates using existing process knowledge. The goal is to make the analysis more data-efficient and to provide explanations for conflicting interpretations. A reader would care because many real-world monitoring tasks lack enough labeled data for pure machine learning solutions, and pure reasoning can be too slow or uninformative in uncertain cases.

Core claim

The paper claims that a hybrid system, in which a sequence tagger generates context-aware candidate interpretations and an AAF-based reasoner refines them, leverages prior knowledge to compensate for scarce training data in event stream analysis.

What carries the argument

The neuro-symbolic pipeline consisting of an example-driven sequence tagger followed by an Abstract Argumentation Framework (AAF) reasoner that refines interpretations according to process constraints.
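
The two-stage shape of this pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's method: the tagger is assumed to emit per-event scores over candidate activities, and the prior knowledge is reduced to a set of forbidden consecutive activity pairs. The paper's reasoner is AAF-based; this sketch swaps in a plain greedy constraint filter purely to show how candidates flow from one stage to the next. The names `refine` and `FORBIDDEN_NEXT` and all scores are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical prior knowledge: activity pairs that must not occur consecutively.
FORBIDDEN_NEXT: Set[Tuple[str, str]] = {("Ship", "Pay")}


def refine(candidates: List[Dict[str, float]],
           forbidden: Set[Tuple[str, str]]) -> List[str]:
    """Greedily pick, per event, the best-scoring activity allowed after the previous pick."""
    chosen: List[str] = []
    for scores in candidates:
        # Candidate activities ordered from highest to lowest tagger score.
        ranked = sorted(scores, key=scores.get, reverse=True)
        allowed = [a for a in ranked
                   if not chosen or (chosen[-1], a) not in forbidden]
        # Fall back to the tagger's top choice if every candidate is ruled out.
        chosen.append(allowed[0] if allowed else ranked[0])
    return chosen


# Toy tagger output for a three-event trace (scores are made up for illustration).
tagger_output = [
    {"Order": 0.9, "Pay": 0.1},
    {"Ship": 0.6, "Pay": 0.4},
    {"Pay": 0.7, "Close": 0.3},
]
refined = refine(tagger_output, FORBIDDEN_NEXT)
print(refined)
```

In this toy trace the tagger alone would label the third event `Pay`; the constraint rules that out after `Ship`, so the refined sequence ends in `Close`. A greedy filter cannot revise earlier choices, which is one reason a full reasoner is the more natural fit for the task the paper describes.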

If this is right

  • The hybrid method yields more informative results than reasoning alone in uncertain mapping scenarios.
  • It requires fewer manually annotated traces than a standalone sequence-tagging model.
  • Conflicting interpretations can be explained using the arguments in the framework.
  • Computation remains efficient even for ongoing traces by focusing refinement on candidates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This combination could apply to other sequence interpretation tasks where domain rules can correct neural predictions.
  • Future work might explore online learning versions that update the tagger based on the reasoner's outputs.
  • Scalability to very long traces or many concurrent processes would be a natural next test.

Load-bearing premise

The sequence-tagging model produces sufficiently accurate candidate interpretations from limited training data so that the AAF reasoner can improve them without adding inconsistencies.

What would settle it

Running the system on a dataset with very few training traces and finding that the hybrid performs worse than the tagger alone or introduces new errors would falsify the claim that the approach compensates for data scarcity.
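
One way to set up that comparison, sketched under assumptions: given gold labels and, for each training-set size, the predictions of the tagger alone and of the hybrid pipeline, compute per-event accuracy and inspect the sign of the difference. The function names, the data layout, and the toy labels below are hypothetical and are not results from the paper.

```python
from typing import Dict, List, Tuple


def accuracy(gold: List[str], pred: List[str]) -> float:
    """Fraction of events whose predicted activity matches the gold label."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        return 0.0
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def hybrid_gain(gold: List[str],
                runs: Dict[int, Tuple[List[str], List[str]]]) -> Dict[int, float]:
    """Map each training-set size to (hybrid accuracy - tagger-only accuracy)."""
    return {n: accuracy(gold, hybrid) - accuracy(gold, tagger)
            for n, (tagger, hybrid) in runs.items()}


# Toy data only: keys are hypothetical numbers of annotated training traces.
gold = ["Order", "Ship", "Close"]
runs = {
    5: (["Order", "Ship", "Pay"], ["Order", "Ship", "Close"]),
    50: (["Order", "Ship", "Close"], ["Order", "Ship", "Close"]),
}
gains = hybrid_gain(gold, runs)
print(gains)
```

A negative gain at small training sizes, observed consistently on real data, would be the kind of evidence that counts against the claim.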

Original abstract

Monitoring and analyzing process traces is a critical task for modern companies and organizations. In scenarios where there is a gap between trace events and reference business activities, this entails an interpretation problem, amounting to translating each event of any ongoing trace into the corresponding step of the activity instance. Building on a recent approach that frames the interpretation problem as an acceptance problem within an Abstract Argumentation Framework (AAF), one can elegantly analyze plausible event interpretations (possibly in an aggregated form), as well as offer explanations for those that conflict with prior process knowledge. Since, in settings where event-to-activity mapping is highly uncertain (or simply under-specified) this reasoning-based approach may yield lowly-informative results and heavy computation, one can think of discovering a sequence-tagging model, trained to suggest highly-probable candidate event interpretations in a context-aware way. However, training such a model optimally may require using a large amount of manually-annotated example traces. We then propose a data-efficient neuro-symbolic approach to the problem, where the candidate interpretations returned by the example-driven sequence tagger is refined by the AAF-based reasoner. This allows us to also leverage prior knowledge to compensate for the scarcity of example data, as confirmed by experimenftal results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces a data-efficient neuro-symbolic approach for interpreting low-level process event streams as business activities. It combines an example-driven sequence-tagging model, which generates candidate event-to-activity mappings, with an AAF-based reasoner that refines those candidates using prior process knowledge. The authors assert that this compensates for scarce training data and that experimental results confirm it.

Significance. If the central claim holds, the work could advance neuro-symbolic methods in process mining by showing how symbolic AAF reasoning can improve ML outputs under data scarcity and mapping uncertainty. The design avoids circularity by treating the ML component as independent and subject to post-hoc AAF refinement, which is a methodological strength. However, the absence of concrete validation limits assessment of real-world utility.

major comments (2)
  1. [Abstract] Abstract: The claim that the neuro-symbolic approach 'allows us to also leverage prior knowledge to compensate for the scarcity of example data, as confirmed by experimental results' is unsupported because the manuscript provides no details on training-set sizes, datasets, metrics (e.g., accuracy or F1), baselines (ML-only vs. combined), or statistical tests. This directly undermines verification of the load-bearing assertion that AAF refinement yields net gains over the sequence tagger alone.
  2. [Proposed Approach] Proposed Approach section: The assumption that the AAF reasoner can reliably improve candidate interpretations in highly uncertain event-to-activity mappings without introducing new inconsistencies is not demonstrated; no formal argument or empirical check is given showing that the refinement step preserves consistency or produces measurable accuracy lifts when the tagger's candidates are noisy due to limited data.
minor comments (1)
  1. [Abstract] Abstract: Typo 'experimenftal' should be 'experimental'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. The comments highlight important areas for improving clarity and substantiation of our claims. We address each major comment point by point below, indicating the revisions we plan to make in the next version of the manuscript.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The claim that the neuro-symbolic approach 'allows us to also leverage prior knowledge to compensate for the scarcity of example data, as confirmed by experimental results' is unsupported because the manuscript provides no details on training-set sizes, datasets, metrics (e.g., accuracy or F1), baselines (ML-only vs. combined), or statistical tests. This directly undermines verification of the load-bearing assertion that AAF refinement yields net gains over the sequence tagger alone.

    Authors: We agree that the abstract's claim would benefit from more explicit supporting details to allow immediate verification. In the revised manuscript, we will expand the abstract to include references to the specific datasets, training set sizes used to illustrate data efficiency, evaluation metrics such as F1-score, direct comparisons against the ML-only baseline, and mention of statistical significance. Corresponding details and results will also be highlighted in the experiments section. revision: yes

  2. Referee: [Proposed Approach] Proposed Approach section: The assumption that the AAF reasoner can reliably improve candidate interpretations in highly uncertain event-to-activity mappings without introducing new inconsistencies is not demonstrated; no formal argument or empirical check is given showing that the refinement step preserves consistency or produces measurable accuracy lifts when the tagger's candidates are noisy due to limited data.

    Authors: We acknowledge the need for explicit demonstration of the refinement step's properties. The AAF reasoner operates by computing acceptable arguments under Dung's semantics with respect to the prior knowledge encoded as attacks, which by definition only retains interpretations consistent with that knowledge and cannot introduce new inconsistencies. In the revision, we will add a formal argument in the Proposed Approach section explaining this consistency preservation. We will also augment the experiments with targeted evaluations on limited-data scenarios, reporting accuracy/F1 lifts and verifying absence of introduced inconsistencies. revision: yes
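
The consistency property invoked in this response can be illustrated with a small sketch of Dung-style reasoning. The code computes the grounded extension of a toy framework (the least fixed point of the characteristic function) and checks that the result is conflict-free. The material above does not say which semantics the paper uses, so grounded semantics is an assumption here, as are the argument names and the attack relation.

```python
from typing import FrozenSet, Set, Tuple

Attack = Tuple[str, str]  # (attacker, target)


def grounded_extension(args: Set[str], attacks: Set[Attack]) -> FrozenSet[str]:
    """Iterate the characteristic function from the empty set until it stabilises."""
    attackers = {a: {x for (x, y) in attacks if y == a} for a in args}
    ext: Set[str] = set()
    while True:
        # An argument is defended by ext if each of its attackers is attacked by ext.
        defended = {
            a for a in args
            if all(any((d, b) in attacks for d in ext) for b in attackers[a])
        }
        if defended == ext:
            return frozenset(ext)
        ext = defended


def is_conflict_free(ext: FrozenSet[str], attacks: Set[Attack]) -> bool:
    """True if no accepted argument attacks another accepted argument."""
    return not any((a, b) in attacks for a in ext for b in ext)


# Hypothetical arguments: one piece of prior knowledge and two candidate interpretations.
args = {"rule", "e2_is_Pay", "e2_is_Ship"}
attacks = {
    ("rule", "e2_is_Pay"),
    ("e2_is_Pay", "e2_is_Ship"),
    ("e2_is_Ship", "e2_is_Pay"),
}
ext = grounded_extension(args, attacks)
print(sorted(ext))
print(is_conflict_free(ext, attacks))
```

Here the unattacked `rule` argument defeats `e2_is_Pay`, which in turn reinstates `e2_is_Ship`. Conflict-freeness guarantees that no two accepted arguments attack each other; it does not by itself show that the accepted interpretation is the correct one, which is why the referee's request for empirical accuracy checks still stands.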

Circularity Check

0 steps flagged

No circularity; derivation is self-contained

full rationale

The paper describes a neuro-symbolic pipeline in which a sequence-tagging model is trained on annotated traces to generate candidate event interpretations, after which an independently defined AAF reasoner (drawn from prior work) refines those candidates by injecting process knowledge. No equation or claim equates a fitted parameter to a subsequent prediction, nor does any load-bearing step reduce to a self-citation that itself assumes the target result. The experimental confirmation is presented as external evidence rather than a tautological restatement of the inputs. The central claim therefore rests on the interaction of two distinct components rather than on any definitional or fitting circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the prior AAF modeling of event interpretations and the assumption that ML suggestions can be meaningfully refined by it; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption: Abstract Argumentation Frameworks can model plausible event interpretations and conflicts with prior process knowledge
    Invoked when describing the baseline approach that the new method builds upon.

pith-pipeline@v0.9.0 · 5763 in / 1141 out tokens · 84204 ms · 2026-05-22T16:40:24.388138+00:00 · methodology


