arxiv: 2510.12957 · v3 · submitted 2025-10-14 · 💻 cs.LG · cs.AI

Reveal-to-Revise: Explainable Bias-Aware Generative Modeling with Multimodal Attention

Pith reviewed 2026-05-18 07:07 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords bias-aware generative modeling · explainable AI · multimodal attention · Grad-CAM++ · WGAN-GP · feedback loop · fairness in AI · image generation

The pith

The Reveal-to-Revise framework fuses multimodal attention, Grad-CAM++ attributions, and an iterative feedback loop inside a bias-regularized conditional WGAN-GP to raise accuracy and fairness in generative modeling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a generative model that weaves cross-modal attention fusion together with Grad-CAM++ explanations and a Reveal-to-Revise feedback loop so that bias signals detected during generation can revise the model on the fly. It tests the approach on Multimodal MNIST, Fashion MNIST, and a toxic-text benchmark using standard 80/20 splits, early stopping, and three random seeds. A sympathetic reader would care because the results claim both higher task performance and measurable gains in subgroup fairness and image coherence, offering one concrete route toward trustworthy generative systems in high-stakes settings. Ablation checks indicate that the attention fusion, the attribution step, and the bias feedback each add distinct value.

Core claim

The Reveal-to-Revise architecture couples a conditional attention WGAN-GP with bias regularization and an iterative local-explanation feedback loop that feeds Grad-CAM++ attributions back into training. On the multimodal benchmark the model records 93.2 percent accuracy, 91.6 percent F1-score, and 78.1 percent IoU-XAI. Explanations raise structural coherence to an SSIM of 88.8 percent and an NMI of 84.9 percent, and adversarial training recovers 73 to 77 percent robustness on Fashion MNIST.
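The IoU-XAI figure is quoted here without a definition. A common reading is the intersection-over-union between a thresholded attribution map and a reference region, and the sketch below follows that reading. The function name iou_xai, the 0.5 threshold, and the toy 3×3 arrays are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence

Grid = Sequence[Sequence[float]]


def iou_xai(heatmap: Grid, reference: Grid, threshold: float = 0.5) -> float:
    """Intersection-over-union between a binarized heatmap and a binary mask.

    Assumption: 'IoU-XAI' means the IoU of the thresholded attribution map
    against a reference region; the paper may define it differently.
    """
    intersection = 0
    union = 0
    for heat_row, ref_row in zip(heatmap, reference):
        for heat, ref in zip(heat_row, ref_row):
            salient = heat >= threshold   # pixel highlighted by the explanation
            relevant = ref > 0            # pixel inside the reference region
            intersection += int(salient and relevant)
            union += int(salient or relevant)
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


# Toy example; all values are invented for illustration.
heatmap = [
    [0.9, 0.8, 0.1],
    [0.7, 0.2, 0.0],
    [0.1, 0.0, 0.0],
]
reference = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
score = iou_xai(heatmap, reference)
print(f"IoU = {score:.2f}")
```

In the toy case two pixels are both salient and relevant while four are salient or relevant, so the score is one half. The choice of threshold changes the result, which is one reason a reported IoU-XAI needs its binarization rule stated.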

What carries the argument

The Reveal-to-Revise feedback loop, which uses Grad-CAM++ attributions to supply iterative local explanation signals that regularize bias and revise the generative parameters inside a conditional attention WGAN-GP.
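One way to write this mechanism down, as a sketch rather than the authors' formulation: the critic loss is the standard WGAN-GP objective, extended with a condition c. The bias regularizer R_bias, the explanation term L_xai, the attribution operator A, and the weights lambda_bias and lambda_xai are assumed names, since no equation is given on this page.

```latex
% Critic loss: standard WGAN-GP with condition c.
\mathcal{L}_D =
    \mathbb{E}_{\tilde{x} \sim P_g}\!\left[ D(\tilde{x}, c) \right]
  - \mathbb{E}_{x \sim P_r}\!\left[ D(x, c) \right]
  + \lambda_{\text{gp}}\,
    \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\!\left[
      \left( \lVert \nabla_{\hat{x}} D(\hat{x}, c) \rVert_2 - 1 \right)^2
    \right]

% Generator loss: adversarial term plus two assumed regularizers.
\mathcal{L}_G =
  - \mathbb{E}_{z \sim p(z)}\!\left[ D(G(z, c), c) \right]
  + \lambda_{\text{bias}}\, \mathcal{R}_{\text{bias}}(G)
  + \lambda_{\text{xai}}\, \mathcal{L}_{\text{xai}}\!\left( A(G(z, c)) \right)
```

Here P_r and P_g are the real and generated distributions, the interpolates \hat{x} are drawn uniformly along straight lines between real and generated samples, and A stands for the Grad-CAM++ attribution map computed on a generated sample. Under this reading, each pass of the loop recomputes A and updates G against the last two terms.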

If this is right

  • Fusion, Grad-CAM++, and bias feedback each contribute independently to final accuracy and IoU-XAI.
  • The iterative explanations raise structural coherence, reaching SSIM of 88.8 percent and NMI of 84.9 percent.
  • Adversarial training inside the framework restores 73 to 77 percent robustness on Fashion MNIST.
  • The same pipeline supports subgroup auditing and fairness measurement across protected attributes.
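Subgroup auditing of the kind named in the last bullet usually starts from per-group metrics. The following is a minimal sketch, assuming per-group accuracy and the largest pairwise gap as the fairness measure. The paper's actual fairness metric is not stated here, and the labels, predictions, and group names are made up.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def subgroup_accuracy(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    groups: Sequence[Hashable],
) -> Dict[Hashable, float]:
    """Accuracy computed separately for each protected-attribute value."""
    correct: Dict[Hashable, int] = defaultdict(int)
    total: Dict[Hashable, int] = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group: Dict[Hashable, float]) -> float:
    """Largest difference in accuracy between any two subgroups."""
    return max(per_group.values()) - min(per_group.values())


# Invented toy data with two subgroups, "a" and "b".
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = subgroup_accuracy(y_true, y_pred, groups)
gap = accuracy_gap(per_group)
print(per_group, gap)
```

A gap of zero means every subgroup is served equally well by this measure. Other criteria, such as equalized odds, would need per-group error rates rather than accuracy alone.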

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The feedback-loop pattern could be tested on diffusion or transformer-based generators to check whether the same attribution-driven revision improves coherence in those architectures.
  • If the loop truly avoids injecting new biases, the method offers a ready template for fairness constraints in medical-image synthesis or other domains where both generation quality and subgroup parity matter.
  • Extending the cross-modal fusion to include audio or video alongside text and images would provide a direct test of whether the reported gains scale beyond the current MNIST-style benchmarks.

Load-bearing premise

That feeding Grad-CAM++ attributions through the Reveal-to-Revise loop improves fairness and coherence without the explanations themselves creating fresh biases or the iteration causing overfitting on the chosen validation splits.

What would settle it

On the same multimodal benchmark, an ablation that disables the Reveal-to-Revise loop or the bias-regularization term and still records equal or higher IoU-XAI and subgroup-fairness scores would falsify the claim that the integrated feedback mechanism is responsible for the observed gains.
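The falsification condition above can be stated as a simple check. This is only a sketch: the function name, the metric keys, and every score below are hypothetical placeholders, and higher is assumed to be better for each metric.

```python
from typing import Mapping, Sequence


def ablation_refutes_claim(
    full: Mapping[str, float],
    ablated: Mapping[str, float],
    metrics: Sequence[str],
    tolerance: float = 0.0,
) -> bool:
    """True if the ablated model matches or beats the full model on every metric.

    Assumes higher is better for each listed metric.
    """
    return all(ablated[name] >= full[name] - tolerance for name in metrics)


# Placeholder scores, invented for illustration only.
full_model = {"iou_xai": 0.78, "subgroup_fairness": 0.90}
loop_disabled = {"iou_xai": 0.71, "subgroup_fairness": 0.92}

refuted = ablation_refutes_claim(
    full_model, loop_disabled, metrics=["iou_xai", "subgroup_fairness"]
)
print(refuted)
```

With these placeholder numbers the ablated run loses on IoU-XAI, so the claim survives. A real check would compare means across seeds and allow for their spread, which is what the tolerance argument stands in for.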

Figures

Figures reproduced from arXiv: 2510.12957 by Md Muntaqim Meherab, Noor Islam S. Mohammad.

Figure 1: Comparison of true versus predicted values for the black-box model. The red dashed line …
Figure 2: The unified explainable, bias-aware generative framework. The attention-augmented …
Figure 3: CNNs for Bias Detection Grad-CAM Heatmap
Figure 4: Integration of saliency maps with deep autoencoders enables interpretable bias detection in …
Figure 5: Model training performance: Training and validation accuracy across epochs, showing …
Figure 6: Comparison of black-box and explainer model performance across epochs, showing …
Figure 7: Correctly predicted classes by the proposed model.
Figure 8: Model predictions highlighting the detection of targeted digit classes 1, 3, and 7.
Figure 9: Attention-based detection performance for targeted classes 3 and 5.
Figure 10: Model prediction uncertainty and bias observed under adversarial attack conditions.
Original abstract

We present an explainable, bias-aware generative framework that unifies cross-modal attention fusion, Grad-CAM++ attribution, and a Reveal-to-Revise feedback loop within a single training paradigm. The architecture couples a conditional attention WGAN GP with bias regularization and iterative local explanation feedback and is evaluated on Multimodal MNIST and Fashion MNIST for image generation and subgroup auditing, as well as a toxic/non-toxic text classification benchmark. All experiments use stratified 80/20 splits, validation-based early stopping, and AdamW with cosine annealing, and results are averaged over three random seeds. The proposed model achieves 93.2% accuracy, a 91.6% F1-score, and a 78.1% IoU-XAI on the multimodal benchmark, outperforming all baselines across every metric, while adversarial training restores 73 to 77% robustness on Fashion MNIST. Ablation studies confirm that fusion, Grad-CAM++, and bias feedback each contribute independently to final performance, with explanations improving structural coherence (SSIM = 88.8%, NMI = 84.9%) and fairness across protected subgroups. These results establish attribution and guided generative learning as a practical and trustworthy approach for high-stakes AI applications.
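The training protocol in the abstract names two standard ingredients, cosine annealing and validation-based early stopping. A minimal sketch of both follows, with the optimizer itself left out. The step count, peak learning rate, patience, and loss values are hypothetical, since the page gives no hyperparameters.

```python
import math
from typing import Iterable, Optional


def cosine_annealing_lr(
    step: int, total_steps: int, lr_max: float, lr_min: float = 0.0
) -> float:
    """Learning rate at `step` on a single cosine cycle from lr_max down to lr_min."""
    progress = min(max(step, 0), total_steps) / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


def early_stopping_epoch(val_losses: Iterable[float], patience: int) -> Optional[int]:
    """Index of the epoch at which training stops, or None if it never triggers.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = math.inf
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical settings: 100 steps, peak rate 1e-3, patience of 2 epochs.
lr_start = cosine_annealing_lr(0, 100, 1e-3)
lr_half = cosine_annealing_lr(50, 100, 1e-3)
lr_end = cosine_annealing_lr(100, 100, 1e-3)
stop_epoch = early_stopping_epoch([0.9, 0.7, 0.65, 0.66, 0.68, 0.6], patience=2)
print(lr_start, lr_half, lr_end, stop_epoch)
```

The schedule starts at the peak rate, passes through half of it at the midpoint, and ends at the floor. The early-stopping rule is the piece that touches the validation split, which matters for the leakage concern raised in the referee report.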

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents Reveal-to-Revise, a bias-aware generative framework that integrates cross-modal attention fusion within a conditional attention WGAN-GP, Grad-CAM++ attributions, and an iterative Reveal-to-Revise feedback loop with bias regularization. It evaluates the approach on Multimodal MNIST and Fashion MNIST for image generation and subgroup auditing, plus a toxic/non-toxic text classification task, reporting 93.2% accuracy, 91.6% F1-score, and 78.1% IoU-XAI on the multimodal benchmark while claiming independent contributions from fusion, explanations, and bias feedback via ablations, plus restored robustness under adversarial training.

Significance. If the empirical results can be verified with proper controls, the integration of explanation feedback directly into generative training could offer a practical route to improving structural coherence and subgroup fairness in multimodal models, with relevance to high-stakes applications where both performance and explainability matter.

major comments (2)
  1. [Experiments] Experimental setup: the Reveal-to-Revise loop computes Grad-CAM++ attributions on the same validation portion of the 80/20 stratified splits used for early stopping and generator revision. Without a separate attribution hold-out set or an explicit anti-overfitting term on the explanation loss, the reported gains in IoU-XAI (78.1%) and fairness metrics risk being inflated by validation leakage rather than true generalization.
  2. [Results] Results and ablations: headline metrics (93.2% accuracy, 91.6% F1, 78.1% IoU-XAI) and the claim of independent contributions from each component are presented without error bars, standard deviations across the three random seeds, or statistical significance tests against baselines. This makes it impossible to assess whether outperformance is robust or attributable to hyperparameter effects.
minor comments (2)
  1. Dataset descriptions for Multimodal MNIST and the text benchmark are incomplete; full details on class balance, protected attributes, and preprocessing steps should be added.
  2. The notation for the bias regularization weight and the combined loss in the feedback loop would benefit from an explicit equation to improve reproducibility.
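The leakage concern in the first major comment comes down to which samples each step is allowed to see. The sketch below shows one way to keep attribution data apart from early-stopping data with a three-way stratified split of the training portion. The 10% attribution share mirrors the figure proposed in the simulated rebuttal, while the validation share, the seed, and the toy labels are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def stratified_three_way_split(
    labels: Sequence[int],
    val_fraction: float = 0.10,          # assumed; not specified on this page
    attribution_fraction: float = 0.10,  # hold-out share named in the rebuttal
    seed: int = 0,
) -> Dict[str, List[int]]:
    """Split sample indices per class into train / val / attribution sets."""
    rng = random.Random(seed)
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    split: Dict[str, List[int]] = {"train": [], "val": [], "attribution": []}
    for indices in by_class.values():
        rng.shuffle(indices)
        n_attr = round(len(indices) * attribution_fraction)
        n_val = round(len(indices) * val_fraction)
        # Each class contributes the same share to every subset.
        split["attribution"] += indices[:n_attr]
        split["val"] += indices[n_attr:n_attr + n_val]
        split["train"] += indices[n_attr + n_val:]
    return split


# Toy labels: 100 samples, 70 of class 0 and 30 of class 1.
labels = [0] * 70 + [1] * 30
split = stratified_three_way_split(labels)
print({name: len(indices) for name, indices in split.items()})
```

Grad-CAM++ attributions and bias feedback would then read only from the attribution indices, and early stopping only from the validation indices, so neither signal is tuned on the data used to judge the other.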

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and have incorporated revisions to strengthen the experimental design and results presentation.

  1. Referee: [Experiments] Experimental setup: the Reveal-to-Revise loop computes Grad-CAM++ attributions on the same validation portion of the 80/20 stratified splits used for early stopping and generator revision. Without a separate attribution hold-out set or an explicit anti-overfitting term on the explanation loss, the reported gains in IoU-XAI (78.1%) and fairness metrics risk being inflated by validation leakage rather than true generalization.

    Authors: We agree that the current protocol risks validation leakage because Grad-CAM++ attributions and bias feedback are computed on the same validation portion used for early stopping. In the revised manuscript we will reserve a distinct 10% stratified hold-out set (drawn from the original training data) used exclusively for attribution computation and the Reveal-to-Revise feedback loop. Early stopping will be performed on a separate validation subset, and we will introduce an explicit L2 penalty on the explanation loss to discourage overfitting to the attribution signals. Updated tables and figures will report results under this corrected protocol.

    revision: yes

  2. Referee: [Results] Results and ablations: headline metrics (93.2% accuracy, 91.6% F1, 78.1% IoU-XAI) and the claim of independent contributions from each component are presented without error bars, standard deviations across the three random seeds, or statistical significance tests against baselines. This makes it impossible to assess whether outperformance is robust or attributable to hyperparameter effects.

    Authors: We acknowledge that the absence of error bars and statistical tests limits interpretability. Although results were averaged over three random seeds, standard deviations were not reported. In the revision we will add error bars (mean ± std) to all tables and figures for accuracy, F1, IoU-XAI, SSIM, NMI, fairness metrics, and every ablation entry. We will also include paired t-tests (or Wilcoxon signed-rank tests where normality assumptions fail) against each baseline and report p-values, thereby allowing readers to evaluate whether the observed gains are statistically reliable.

    revision: yes
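The reporting promised in the second response can be sketched with the standard library alone: mean ± std across seeds, plus a paired t statistic against a baseline. The per-seed accuracies below are invented for illustration and are not values from the paper. The critical value is the two-sided 5% point of Student's t with 2 degrees of freedom.

```python
import math
import statistics
from typing import Sequence, Tuple


def mean_std(values: Sequence[float]) -> Tuple[float, float]:
    """Sample mean and sample standard deviation (n - 1 in the denominator)."""
    return statistics.mean(values), statistics.stdev(values)


def paired_t_statistic(model: Sequence[float], baseline: Sequence[float]) -> float:
    """t statistic of the per-seed differences, with len(model) - 1 degrees of freedom."""
    diffs = [m - b for m, b in zip(model, baseline)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


# Invented per-seed accuracies for three seeds; NOT values from the paper.
model_acc = [0.930, 0.934, 0.932]
baseline_acc = [0.915, 0.917, 0.919]

mean, std = mean_std(model_acc)
t_stat = paired_t_statistic(model_acc, baseline_acc)

# Two-sided critical value of Student's t for alpha = 0.05 and 2 degrees of freedom.
T_CRITICAL_DF2 = 4.303
significant = abs(t_stat) > T_CRITICAL_DF2
print(f"accuracy = {mean:.3f} ± {std:.3f}, t = {t_stat:.2f}, significant = {significant}")
```

With only three seeds the test has two degrees of freedom, so the bar for significance is high. The Wilcoxon signed-rank alternative cannot reach p < 0.05 at all with three pairs, since its smallest two-sided p-value is 0.25, which suggests more seeds would be needed for that fallback to be informative.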

Circularity Check

0 steps flagged

No circularity: empirical results on benchmarks with no derivation chain

full rationale

The paper presents an empirical ML framework evaluated on Multimodal MNIST, Fashion MNIST, and text benchmarks using stratified 80/20 splits, validation-based early stopping, and AdamW optimization. All claims (93.2% accuracy, 91.6% F1, 78.1% IoU-XAI, ablation contributions) rest on reported experimental outcomes averaged over three seeds. No mathematical derivation, equations, or first-principles chain is offered that could reduce to its inputs by construction. Ablations are described as confirming independent contributions from fusion, Grad-CAM++, and bias feedback, but these are standard experimental controls without self-referential fitting or renaming of results. The work is self-contained against external benchmarks and does not invoke self-citations or uniqueness theorems as load-bearing premises.

Axiom & Free-Parameter Ledger

1 free parameter · 0 axioms · 0 invented entities

Only the abstract is available, so specific free parameters, axioms, and invented entities cannot be extracted in detail; the approach implicitly relies on standard assumptions of GAN convergence and attribution faithfulness.

free parameters (1)
  • bias regularization weight
    Mentioned as part of the architecture but no value or fitting procedure is stated in the abstract.

pith-pipeline@v0.9.0 · 5757 in / 1295 out tokens · 44126 ms · 2026-05-18T07:07:04.935364+00:00 · methodology


