
arxiv: 2605.21758 · v1 · pith:A2VLS634new · submitted 2026-05-20 · 💻 cs.AI

A Causal Argumentation Method for Explainability of Machine Learning Models

Pith reviewed 2026-05-22 08:53 UTC · model grok-4.3

classification 💻 cs.AI
keywords explainable AI · causal discovery · bipolar argumentation framework · machine learning interpretability · semi-stable semantics

The pith

A method that discovers causal relationships among features and maps them into a bipolar argumentation framework can explain why a machine learning model selects particular outcomes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to move explainable AI past simply listing important features toward showing the reasons a model reaches a decision. It starts by applying causal discovery algorithms to the input data to recover directed relationships among variables. Those relationships are then converted into a Bipolar Argumentation Framework in which features appear as arguments that support or attack one another. Semi-stable semantics are used to compute sets of features that together justify the observed prediction. A reader would care because current post-hoc methods often leave the causal or argumentative structure of a decision opaque.
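
The translation step described above can be sketched as follows. This is a minimal illustration under a loud assumption: this page does not state the paper's translation rules, so the example assumes each discovered edge comes with a signed effect estimate, reading positive effects as support and negative effects as attack. The function name `edges_to_baf` and the feature names are hypothetical.

```python
def edges_to_baf(signed_edges):
    """Turn (cause, effect, weight) triples into a bipolar argumentation framework.

    ASSUMPTION (not from the paper): weight > 0 means the cause supports the
    effect, weight < 0 means the cause attacks it, weight == 0 is ignored.
    """
    arguments, attacks, supports = set(), set(), set()
    for cause, effect, weight in signed_edges:
        arguments.update((cause, effect))
        if weight > 0:
            supports.add((cause, effect))
        elif weight < 0:
            attacks.add((cause, effect))
    return arguments, attacks, supports


# Hypothetical edges; the weights are made up for illustration only.
edges = [
    ("high_bmi", "high_glucose", 0.6),
    ("regular_exercise", "high_bmi", -0.4),
    ("high_glucose", "low_risk", -0.8),
]
arguments, attacks, supports = edges_to_baf(edges)
```

Constraint-based discovery typically returns edge directions without signs, so a real pipeline would need a separate estimation step before a rule like this could apply.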

Core claim

The method explains why a model may be making its predictions in three steps. It identifies causal relationships among variables using causal discovery methods, translates them into a Bipolar Argumentation Framework (BAF) that represents supportive and opposing interactions among features, and uses semi-stable semantics to find extensions of features that explain why certain outcomes may have been chosen.

What carries the argument

Bipolar Argumentation Framework (BAF) built from causal discovery output, with semi-stable semantics that select consistent sets of features as explanatory extensions.
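
To make the semantics concrete, here is a brute-force sketch on a toy framework. It assumes one common way of flattening a BAF: whenever a supports b and b attacks c, a derived attack from a to c is added. Whether the paper uses this reduction is not stated on this page. Semi-stable extensions are then computed as the admissible sets whose range (the set plus everything it attacks) is maximal. All argument names and function names are hypothetical.

```python
from itertools import combinations


def flatten_supports(attacks, supports):
    """Add derived attacks: if a supports b and b attacks c, then a attacks c."""
    derived = set(attacks)
    changed = True
    while changed:  # repeat until no new derived attack appears
        changed = False
        for supporter, supported in supports:
            for attacker, target in list(derived):
                if attacker == supported and (supporter, target) not in derived:
                    derived.add((supporter, target))
                    changed = True
    return derived


def semi_stable_extensions(arguments, attacks):
    """Enumerate admissible sets and keep those with maximal range."""
    arguments = sorted(arguments)

    def attacked_by(members):
        return {t for (a, t) in attacks if a in members}

    def is_admissible(members):
        counterattacked = attacked_by(members)
        if counterattacked & members:  # not conflict-free
            return False
        # every attacker of a member must be attacked by the set
        return all(a in counterattacked for (a, t) in attacks if t in members)

    admissible = [
        frozenset(c)
        for size in range(len(arguments) + 1)
        for c in combinations(arguments, size)
        if is_admissible(set(c))
    ]
    reach = {s: s | attacked_by(s) for s in admissible}
    return [s for s in admissible if not any(reach[s] < reach[o] for o in admissible)]


# Toy framework with hypothetical feature arguments.
arguments = {"high_glucose", "high_bmi", "regular_exercise", "low_risk"}
attacks = {("high_glucose", "low_risk"), ("regular_exercise", "high_bmi")}
supports = {("high_bmi", "high_glucose")}

extensions = semi_stable_extensions(arguments, flatten_supports(attacks, supports))
```

The enumeration is exponential in the number of arguments, so it is only suitable for small illustrative frameworks like this one.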

If this is right

  • The approach yields explanations that incorporate both positive and negative interactions among features rather than treating features in isolation.
  • Results can be produced for standard tabular benchmarks and directly compared with existing post-hoc attribution methods.
  • Explanations remain tied to the causal structure present in the data instead of purely correlational importance scores.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the causal step is sound, the method could flag cases where a model exploits non-causal shortcuts that standard feature-importance tools overlook.
  • The same pipeline could be reused on new domains by swapping the causal discovery algorithm for one tuned to the data type.
  • User-supplied constraints could be added as additional attack or support relations to steer the extensions toward domain-specific requirements.

Load-bearing premise

The causal relationships recovered by standard discovery algorithms from the training data are accurate and complete enough that the resulting argumentation extensions meaningfully reflect the model's actual decision process.

What would settle it

If the feature sets returned by the method on a benchmark dataset fail to match the features that actually change the model's output when perturbed, or if they include variables known to be non-causal in the data-generating process, the claim that the extensions explain the model's decisions would be refuted.
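
The perturbation check described above could look roughly like this. It is a sketch, not the paper's evaluation protocol: the stand-in `model`, the feature names, the synthetic rows, and the candidate `explanation` set are all hypothetical. Each feature is shuffled across rows, and the fraction of predictions that flip is recorded.

```python
import random


def model(row):
    # Hypothetical stand-in predictor; it ignores "bmi" by construction.
    return 1 if 2.0 * row["glucose"] + row["age"] > 1.0 else 0


def perturbation_sensitivity(predict, rows, feature, rng):
    """Fraction of rows whose prediction changes when `feature` is shuffled."""
    shuffled = [row[feature] for row in rows]
    rng.shuffle(shuffled)
    changed = 0
    for row, value in zip(rows, shuffled):
        perturbed = dict(row, **{feature: value})
        changed += predict(perturbed) != predict(row)
    return changed / len(rows)


rng = random.Random(0)
rows = [
    {"glucose": rng.random(), "age": rng.random(), "bmi": rng.random()}
    for _ in range(500)
]
sensitivity = {
    name: perturbation_sensitivity(model, rows, name, rng)
    for name in ("glucose", "age", "bmi")
}

# Hypothetical explanation set to be checked against the model's behaviour.
explanation = {"glucose", "bmi"}
unsupported = sorted(f for f in explanation if sensitivity[f] == 0.0)
```

Here `unsupported` lists explanation features whose perturbation never changes the prediction, which is the kind of mismatch the test above would count against the method.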

Figures

Figures reproduced from arXiv: 2605.21758 by Henry Salgado, Martine Ceberio, Meagan R. Kendall.

Figure 1: Skeleton after removing the A–C edge based on A ⊥⊥ C | B. The edge-orientation phase then uses distinctive statistical patterns to determine causal direction where possible. A critical orientation pattern is the unshielded collider (or v-structure): X → Y ← Z, where two non-adjacent variables X and Z both influence a common effect Y. Colliders are uniquely identifiable because they create a distinctive s…
Figure 2: Oriented collider structure A → B ← C identified from the pattern where A and C are marginally independent but both adjacent to B
Figure 3: Example bipolar argumentation framework showing attack (dashed red) …
Figure 4: Comparison of causal graphs generated under two encoding strategies for …
Figure 5: Pruned decision tree trained on the dataset. The root split on …
Figure 6: SHAP summary plot illustrating global feature importance. The strongest …
Figure 7: Unified causal graph for the Pima Indians Diabetes dataset, merged from …
Figure 8: Pruned decision tree trained on the Pima Diabetes dataset. The root split …
Figure 9: SHAP summary plot for the Pima Diabetes dataset. The strongest contrib…
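
The independence patterns named in the Figure 1 and Figure 2 captions can be illustrated on simulated data. This sketch assumes linear Gaussian relationships and uses partial correlation as the conditional-independence measure. It is not the paper's implementation, and the helper names are hypothetical.

```python
import math
import random


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation of x and y given z (first-order partial correlation)."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


rng = random.Random(0)
n = 5000

# Chain A -> B -> C: A and C are dependent, but independent given B,
# so a skeleton search would remove the A-C edge.
a = [rng.gauss(0, 1) for _ in range(n)]
b = [v + rng.gauss(0, 1) for v in a]
c = [v + rng.gauss(0, 1) for v in b]
chain_marginal = pearson(a, c)
chain_given_b = partial_corr(a, c, b)

# Collider A -> B <- C: A and C are marginally independent,
# but become dependent once B is conditioned on.
a2 = [rng.gauss(0, 1) for _ in range(n)]
c2 = [rng.gauss(0, 1) for _ in range(n)]
b2 = [u + v + rng.gauss(0, 1) for u, v in zip(a2, c2)]
collider_marginal = pearson(a2, c2)
collider_given_b = partial_corr(a2, c2, b2)
```

In the chain, conditioning on B removes the dependence between A and C. In the collider, conditioning on B creates one, and that asymmetry is what lets the orientation phase direct both edges into B.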
Original abstract

Explainable AI (XAI) methods identify which features are relevant to a model's predictions but often fail to clarify why certain decisions are made. In this work, we present a novel method that integrates causality with argument-based reasoning to explain why models may be making predictions. Our approach first identifies causal relationships among variables using causal discovery methods and then translates these into a Bipolar Argumentation Framework (BAF) to represent supportive and opposing interactions among features. By using semi-stable semantics, we find extensions of features that explain why certain outcomes may have been chosen. We demonstrate our method on two benchmark datasets and compare its results against standard post-hoc explainability approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to introduce a novel XAI method that integrates causal discovery with Bipolar Argumentation Frameworks (BAF). Causal relationships are identified from data, mapped to support/attack relations in BAF, and semi-stable extensions are used to explain model predictions. It is demonstrated on two benchmark datasets with comparisons to standard post-hoc explainability approaches.

Significance. If the method's explanations are shown to faithfully represent the model's reasoning, it could provide a more structured and causal basis for interpretability compared to feature attribution methods. The use of argumentation semantics adds a layer of reasoning that may help in understanding 'why' decisions are made. However, without detailed results or validation, the significance remains potential rather than demonstrated.

major comments (2)
  1. The abstract outlines the intended workflow but supplies no derivation details, no quantitative results, no error analysis, and no description of how the BAF extensions are validated against the model's actual behavior, making it impossible to assess whether the data or method supports the central claim.
  2. The pipeline runs causal discovery on the input dataset and constructs the BAF without inspecting the trained model (weights, decision surface, or feature attributions internal to the predictor). The central claim therefore requires that the model's predictions are driven by the recovered causal structure; when the model instead exploits non-causal correlations or approximations, the BAF extensions will not correspond to the features the model actually used.
minor comments (2)
  1. Clarify the specific causal discovery algorithms used and the exact translation rules from causal graph to BAF relations.
  2. Provide more details on the benchmark datasets, the ML models trained, and the quantitative comparison metrics used against standard XAI methods.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive feedback on our paper. We address each of the major comments in detail below and indicate the revisions we plan to make to strengthen the manuscript.

Point-by-point responses
  1. Referee: The abstract outlines the intended workflow but supplies no derivation details, no quantitative results, no error analysis, and no description of how the BAF extensions are validated against the model's actual behavior, making it impossible to assess whether the data or method supports the central claim.

    Authors: We agree that the abstract, as currently written, is high-level and does not include specific details on derivations, results, or validation. The body of the manuscript provides these elements: derivation of the BAF construction is detailed in Section 3, quantitative results and comparisons to other XAI methods are presented in Section 5 with error analysis, and validation of the semi-stable extensions against model predictions is described in Section 4. To address the referee's concern and improve the standalone readability of the abstract, we will revise it to include a concise summary of the key results and validation approach.

    Revision: yes

  2. Referee: The pipeline runs causal discovery on the input dataset and constructs the BAF without inspecting the trained model (weights, decision surface, or feature attributions internal to the predictor). The central claim therefore requires that the model's predictions are driven by the recovered causal structure; when the model instead exploits non-causal correlations or approximations, the BAF extensions will not correspond to the features the model actually used.

    Authors: The referee correctly identifies a key characteristic of our proposed method. By design, the approach is model-agnostic and derives explanations from the causal structure in the data rather than from internal model parameters. This allows it to provide causal argumentation-based explanations applicable to any ML model. We do not claim that the explanations always perfectly mirror the model's internal reasoning if the model has learned non-causal patterns. In the revised version, we will clarify this scope in the introduction and add a limitations section discussing scenarios where the model may rely on spurious correlations. Additionally, we will include new experiments that compare our causal explanations with SHAP values or other model-specific attributions on the benchmark datasets to empirically assess their correspondence.

    Revision: partial
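
One simple way to quantify the correspondence discussed in this exchange is a set-overlap score between an explanatory extension and the top-ranked attribution features. The sketch below uses Jaccard similarity. The attribution values are made-up placeholders, not SHAP outputs or results from the paper, and the function names are hypothetical.

```python
def top_k_features(attributions, k):
    """Return the k features with the largest absolute attribution."""
    ranked = sorted(attributions, key=lambda f: abs(attributions[f]), reverse=True)
    return set(ranked[:k])


def jaccard(first, second):
    union = first | second
    return len(first & second) / len(union) if union else 1.0


# Placeholder numbers for illustration only.
attributions = {"glucose": 0.42, "bmi": 0.17, "age": -0.09, "pregnancies": 0.03}
extension = {"glucose", "age"}  # hypothetical semi-stable extension

overlap = jaccard(extension, top_k_features(attributions, k=len(extension)))
```

A low overlap would not by itself show which side is wrong, since attribution scores and causal extensions answer different questions, but it would flag cases worth inspecting.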

Circularity Check

0 steps flagged

No circularity in the causal argumentation explainability pipeline

full rationale

The paper proposes a constructive pipeline that applies standard causal discovery algorithms to the input dataset, maps the resulting graph into a Bipolar Argumentation Framework to encode support and attack relations among features, and extracts semi-stable extensions as explanatory sets. No equations, fitted parameters, or self-referential definitions appear in the described method; the output extensions are generated by applying established argumentation semantics to the discovered causal structure rather than being defined in terms of themselves or statistically forced by a fit on the same data. No load-bearing self-citations, uniqueness theorems imported from prior author work, or ansatzes smuggled via citation are invoked. The central claim is therefore a methodological construction whose validity rests on external assumptions about causal discovery accuracy and correspondence to model behavior, not on any internal reduction of the claimed explanations to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach rests on the assumption that off-the-shelf causal discovery algorithms produce reliable input for the BAF construction and that semi-stable semantics yield human-interpretable explanations; no free parameters or invented entities are explicitly named in the abstract.

axioms (2)
  • domain assumption: Causal discovery methods applied to the dataset will recover the relevant causal relationships among features.
    Invoked when the method begins by identifying causal relationships using causal discovery methods.
  • domain assumption: The translation of causal relationships into a Bipolar Argumentation Framework preserves the explanatory power needed for model decisions.
    Required for the step that represents supportive and opposing interactions among features.

pith-pipeline@v0.9.0 · 5634 in / 1401 out tokens · 25932 ms · 2026-05-22T08:53:11.419270+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear

    Relation between the paper passage and the cited Recognition theorem.

    Our approach first identifies causal relationships among variables using causal discovery methods and then translates these into a Bipolar Argumentation Framework (BAF) to represent supportive and opposing interactions among features. By using semi-stable semantics, we find extensions of features that explain why certain outcomes may have been chosen.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages · 1 internal anchor

  1. [1]

    Beckers, S.: Causal Explanations and XAI (Feb 2022). https://doi.org/10.48550/arXiv.2201.13169, http://arxiv.org/abs/2201.13169, arXiv:2201.13169 [cs]

  2. [2]

    Bistarelli, S., Mancinelli, A., Santini, F., Taticchi, C.: Arg-XAI: a Tool for Explaining Machine Learning Results. In: 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI). pp. 205–212 (Oct 2022). https://doi.org/10.1109/ICTAI56018.2022.00037, https://ieeexplore.ieee.org/document/10097957

  3. [3]

    Bystrova, D., Assaad, C.K., Arbel, J., Devijver, E., Gaussier, E., Thuiller, W.: Causal Discovery from Time Series with Hybrids of Constraint-Based and Noise-Based Algorithms. Transactions on Machine Learning Research (Jan 2024), https://openreview.net/forum?id=PGLbZpVk2n

  4. [4]

    Caminada, M.: Semi-Stable Semantics. In: Proceedings of the 2006 conference on Computational Models of Argument: Proceedings of COMMA 2006. pp. 121–130. IOS Press, NLD (May 2006)

  5. [5]

    Cayrol, C., Lagasquie-Schiex, M.C.: On the Acceptability of Arguments in Bipolar Argumentation Frameworks. In: Godo, L. (ed.) Symbolic and Quantitative Approaches to Reasoning with Uncertainty. pp. 378–389. Springer, Berlin, Heidelberg (2005). https://doi.org/10.1007/11518655_33

  6. [6]

    Christian, B.: The Alignment Problem: Machine Learning and Human Values. National Geographic Books (Oct 2020), google-Books-ID: KGCNEAAAQBAJ

  7. [7]

    Dung, P.M.: On the acceptability of arguments and its fundamental role in nonmonotonic reasoning, logic programming and n-person games. Artificial Intelligence 77(2), 321–357 (Sep 1995). https://doi.org/10.1016/0004-3702(94)00041-X, https://www.sciencedirect.com/science/article/pii/000437029400041X

  8. [8]

    Friedman, J.H.: Greedy function approximation: A gradient boosting machine. The Annals of Statistics 29(5), 1189–1232 (Oct 2001). https://doi.org/10.1214/aos/1013203451, https://projecteuclid.org/journals/annals-of-statistics/volume-29/issue-5/Greedy-function-approximation-A-gradient-boosting-machine/10.1214/aos/1013203451.full

  9. [9]

    Glymour, C., Zhang, K., Spirtes, P.: Review of Causal Discovery Methods Based on Graphical Models. Frontiers in Genetics 10 (Jun 2019). https://doi.org/10.3389/fgene.2019.00524, https://www.frontiersin.org/journals/genetics/articles/10.3389/fgene.2019.00524/full

  10. [10]

    Josephson, J.R., Josephson, S.G.: Abductive Inference: Computation, Philosophy, Technology. Cambridge University Press (Aug 1996), google-Books-ID: uu6zXrogwWAC

  11. [11]

    Miller, T.: Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267, 1–38 (Feb 2019). https://doi.org/10.1016/j.artint.2018.07.007, https://www.sciencedirect.com/science/article/pii/S0004370218305988

  12. [12]

    Molak, A.: Causal Inference and Discovery in Python: Unlock the secrets of modern causal machine learning with DoWhy, EconML, PyTorch and more. Packt Publishing Ltd (May 2023), google-Books-ID: 4V3DEAAAQBAJ

  13. [13]

    Molnar, C.: Interpretable Machine Learning: A Guide for Making Black Box Models Explainable, vol. 510. Christoph Molnar, 1 edn. (2019)

  14. [14]

    Molnar, C., König, G., Herbinger, J., Freiesleben, T., Dandl, S., Scholbeck, C.A., Casalicchio, G., Grosse-Wentrup, M., Bischl, B.: General Pitfalls of Model-Agnostic Interpretation Methods for Machine Learning Models. In: Holzinger, A., Goebel, R., Fong, R., Moon, T., Müller, K.R., Samek, W. (eds.) xxAI - Beyond Explainable AI: International Workshop, He...

  15. [15]

    Mothilal, R.K., Sharma, A., Tan, C.: Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations. In: Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency. pp. 607–617 (Jan 2020). https://doi.org/10.1145/3351095.3372850, http://arxiv.org/abs/1905.07697, arXiv:1905.07697 [cs]

  16. [16]

    Pearl, J., Mackenzie, D.: The Book of Why: The New Science of Cause and Effect. Basic Books (May 2018), google-Books-ID: 9H0dDQAAQBAJ

  17. [17]

    Reimers, C., Runge, J., Denzler, J.: Determining the Relevance of Features for Deep Neural Networks. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020, vol. 12371, pp. 330–346. Springer International Publishing, Cham (2020). https://doi.org/10.1007/978-3-030-58574-7_20, https://link.springer.com/10.1007/978-3-030-5...

  18. [18]

    Ribeiro, M.T., Singh, S., Guestrin, C.: "Why Should I Trust You?": Explaining the Predictions of Any Classifier (Aug 2016), http://arxiv.org/abs/1602.04938, arXiv:1602.04938 [cs, stat]

  19. [19]

    Salgado, H., Kendall, M.R., Ceberio, M.: Causal Discovery for Explainable AI: A Dual-Encoding Approach. In: The 17th International Conference on Ambient Systems, Networks and Technologies (ANT 2023) / The 3rd International Workshop on Causality, Agents and Large Models (CALM-26). Procedia Computer Science, Springer, Istanbul, Turkey (Apr 2026). https://doi...

  20. [20]

    Spirtes, P., Glymour, C., Scheines, R.: Causation, Prediction, and Search. MIT Press (Jan 2001), google-Books-ID: OZ0vEAAAQBAJ

  21. [21]

    Zheng, Y., Huang, B., Chen, W., Ramsey, J., Gong, M., Cai, R., Shimizu, S., Spirtes, P., Zhang, K.: Causal-learn: Causal Discovery in Python (Jul 2023). https://doi.org/10.48550/arXiv.2307.16405, http://arxiv.org/abs/2307.16405, arXiv:2307.16405 [cs]