pith. machine review for the scientific record.

arxiv: 2604.09628 · v2 · submitted 2026-03-19 · 💻 cs.CY · cs.AI

Recognition: no theorem link

Assessing Model-Agnostic XAI Methods against EU AI Act Explainability Requirements

Authors on Pith: no claims yet

Pith reviewed 2026-05-15 08:52 UTC · model grok-4.3

classification 💻 cs.CY cs.AI
keywords XAI · EU AI Act · explainability · compliance scoring · model-agnostic methods · interpretability · regulatory requirements

The pith

A scoring framework converts expert judgments on XAI features into compliance scores for the EU AI Act.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates a system that takes qualitative expert reviews of model-agnostic XAI methods and turns them into quantitative scores tied to the explainability rules in the EU AI Act. This step addresses the current mismatch between what technical XAI tools can provide and what the regulation demands for AI systems sold or used in Europe. A sympathetic reader would see this as practical help for companies that need to pick or adapt explanation methods to avoid legal risks. The work also flags specific gaps in existing methods that call for more technical development and clearer rules from regulators.

Core claim

The authors propose a qualitative-to-quantitative scoring framework in which expert assessments of XAI properties are aggregated into a regulation-specific compliance score that relates model-agnostic explanation methods directly to the requirements of the EU AI Act.

What carries the argument

The qualitative-to-quantitative scoring framework that aggregates expert assessments of interpretability features into compliance scores aligned with EU AI Act rules.
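
The abstract does not spell out the aggregation formula. As a rough illustration only, the Python sketch below shows one way qualitative ratings could be folded into a weighted [0, 1] score; the rating scale, feature names, and weights are invented for the example and do not come from the paper.

    # Hypothetical sketch of a qualitative-to-quantitative aggregation.
    # The rating scale, feature names, and weights are illustrative;
    # the paper's actual scheme is not specified in the abstract.

    RATING_TO_SCORE = {"absent": 0.0, "partial": 0.5, "full": 1.0}

    def compliance_score(expert_ratings: dict[str, str],
                         weights: dict[str, float]) -> float:
        """Aggregate qualitative expert ratings into a [0, 1] compliance score.

        expert_ratings maps an interpretability feature (tied to an AI Act
        requirement) to a qualitative judgment; weights reflects how strongly
        each feature bears on the regulation.
        """
        total = sum(weights.values())
        weighted = sum(RATING_TO_SCORE[expert_ratings[f]] * w
                       for f, w in weights.items())
        return weighted / total  # normalized to [0, 1]

    # Scoring a hypothetical XAI method against three invented criteria.
    ratings = {"local_fidelity": "full",
               "global_coverage": "partial",
               "user_comprehensibility": "absent"}
    weights = {"local_fidelity": 2.0,
               "global_coverage": 1.0,
               "user_comprehensibility": 1.0}
    print(f"compliance score: {compliance_score(ratings, weights):.2f}")  # 0.62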

If this is right

  • Companies can use the scores to select or adjust XAI methods that better meet EU legal explanation duties.
  • The framework reveals concrete technical shortcomings in current model-agnostic XAI tools that need further research.
  • Practitioners receive guidance on closing the gap between technical capabilities and regulatory demands in the EU market.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The scoring approach could be applied to other emerging AI regulations outside the EU once similar requirements are defined.
  • Automating parts of the expert assessment step would allow faster and more repeatable use of the framework at scale.
  • Validation against actual enforcement outcomes would strengthen the link between the numerical scores and legal risk.

Load-bearing premise

Expert judgments about XAI interpretability features can be turned into reliable numerical scores that match the legal requirements of the EU AI Act.
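
The word "reliable" is doing heavy lifting here, and the referee report below notes that no inter-rater statistics are given. A minimal sketch of the kind of check that would speak to it, using Cohen's kappa on invented ratings from two hypothetical experts (nothing here comes from the paper):

    from sklearn.metrics import cohen_kappa_score

    # Two hypothetical experts rating the same ten XAI features on an
    # invented absent/partial/full scale; the paper reports no such data.
    rater_a = ["full", "partial", "full", "absent", "partial",
               "full", "partial", "absent", "full", "partial"]
    rater_b = ["full", "partial", "partial", "absent", "partial",
               "full", "full", "absent", "full", "absent"]

    # Chance-corrected agreement; values above ~0.6 are often read as
    # substantial. Here: (0.70 - 0.34) / (1 - 0.34) ≈ 0.55.
    print(f"Cohen's kappa: {cohen_kappa_score(rater_a, rater_b):.2f}")  # 0.55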

What would settle it

Empirical tests in which the framework's compliance scores fail to predict whether a given XAI method actually satisfies EU regulators during real compliance reviews or audits.

Figures

Figures reproduced from arXiv: 2604.09628 by Francesco Sovrano, Giulia Vilone, Michael Lognoul.

Figure 1. Methodology overview: starting from a set of XAI methods, the proposed …
Figure 2. Results of the sensitivity analysis over the compliance scores related to …
read the original abstract

Explainable AI (XAI) has evolved in response to expectations and regulations, such as the EU AI Act, which introduces regulatory requirements on AI-powered systems. However, a persistent gap remains between existing XAI methods and society's legal requirements, leaving practitioners without clear guidance on how to approach compliance in the EU market. To bridge this gap, we study model-agnostic XAI methods and relate their interpretability features to the requirements of the AI Act. We then propose a qualitative-to-quantitative scoring framework: qualitative expert assessments of XAI properties are aggregated into a regulation-specific compliance score. This helps practitioners identify when XAI solutions may support legal explanation requirements while highlighting technical issues that require further research and regulatory clarification.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript studies model-agnostic XAI methods and maps their interpretability features to the explainability requirements of the EU AI Act. It proposes a qualitative-to-quantitative scoring framework in which expert assessments of XAI properties are aggregated into regulation-specific compliance scores intended to guide practitioners on legal compliance.

Significance. If the aggregation procedure can be shown to be reliable and legally aligned, the framework would supply a concrete tool for selecting XAI methods under the EU AI Act and would usefully flag technical gaps that still require regulatory clarification. The interdisciplinary linkage between XAI properties and specific legal criteria is a timely contribution.

major comments (1)
  1. [Framework description (following the abstract)] The central aggregation step that converts qualitative expert ratings into numeric compliance scores is described only at a high level; no inter-rater reliability statistics, weighting scheme, normalization procedure, or calibration against actual regulatory decisions or case law is supplied. This absence directly undermines the claim that the resulting scores meaningfully indicate support for EU AI Act requirements.
minor comments (1)
  1. The abstract refers to “model-agnostic XAI methods” without enumerating the concrete methods examined or the criteria used to select them.

Simulated Authors' Rebuttal

1 response · 1 unresolved

We thank the referee for the constructive feedback on the framework's aggregation procedure. We address the major comment below and will revise the manuscript to provide greater transparency.

read point-by-point responses
  1. Referee: The central aggregation step that converts qualitative expert ratings into numeric compliance scores is described only at a high level; no inter-rater reliability statistics, weighting scheme, normalization procedure, or calibration against actual regulatory decisions or case law is supplied. This absence directly undermines the claim that the resulting scores meaningfully indicate support for EU AI Act requirements.

    Authors: We agree that the aggregation step is described at a high level and that additional detail is needed to support the framework's claims. In the revised manuscript we will expand the methods section to explicitly describe the aggregation procedure, including the weighting scheme (derived from mapping XAI properties to specific AI Act articles), the normalization steps to produce [0,1] compliance scores, and the rationale for treating the authors' expert assessments as the initial input. We will also add a dedicated limitations subsection that reports the absence of formal inter-rater reliability statistics and discusses the implications. Regarding calibration against regulatory decisions or case law, we will clarify that such calibration is not feasible at present because the AI Act has only recently been adopted and relevant precedents remain limited; this will be framed as an important direction for future work rather than a current capability of the framework. revision: yes
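
If the revision takes the shape the authors describe, the article-derived weighting might look something like the sketch below; the property names, article mappings, and the count-based weighting rule are all assumptions for illustration, not the paper's scheme.

    # Hypothetical article-driven weighting in the spirit of the rebuttal:
    # weight each XAI property by how many AI Act provisions it is mapped
    # to, then normalize so weights sum to 1 (keeping scores in [0, 1]).
    # The mappings below are illustrative, not taken from the paper.

    PROPERTY_TO_ARTICLES = {
        "local_fidelity": ["Art. 13(1)", "Art. 14(4)"],  # transparency, oversight
        "global_coverage": ["Art. 13(1)"],
        "user_comprehensibility": ["Art. 13(3)"],
    }

    def article_weights(mapping: dict[str, list[str]]) -> dict[str, float]:
        counts = {prop: float(len(arts)) for prop, arts in mapping.items()}
        total = sum(counts.values())
        return {prop: c / total for prop, c in counts.items()}

    print(article_weights(PROPERTY_TO_ARTICLES))
    # {'local_fidelity': 0.5, 'global_coverage': 0.25, 'user_comprehensibility': 0.25}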

standing simulated objections (unresolved)
  • Calibration of the compliance scores against actual regulatory decisions or case law, given the recent adoption of the EU AI Act and the current scarcity of relevant precedents.

Circularity Check

0 steps flagged

No circularity: the framework is a proposed aggregation method without reduction to fitted inputs or self-citations.

full rationale

The paper proposes a qualitative-to-quantitative scoring framework that aggregates expert assessments of XAI properties into a compliance score aligned with the EU AI Act. No equations, derivations, or load-bearing steps are presented that reduce by construction to prior fitted parameters, self-defined quantities, or unverified self-citations. The central claim is methodological and independent of any quantitative inputs from the same work; it does not rename known results or smuggle ansatzes via citation chains. This is a standard non-circular proposal of a new assessment approach.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

This analysis is based on the abstract only; it describes no explicit free parameters, axioms, or independent evidence. The scoring framework itself is treated as the main invented entity.

invented entities (1)
  • qualitative-to-quantitative scoring framework (no independent evidence)
    purpose: Aggregate expert assessments of XAI properties into regulation-specific compliance scores
    Introduced in the abstract as the primary contribution to bridge XAI methods and EU AI Act requirements

pith-pipeline@v0.9.0 · 5419 in / 1047 out tokens · 36018 ms · 2026-05-15T08:52:39.372697+00:00 · methodology

