
arxiv: 2605.06404 · v1 · submitted 2026-05-07 · 💻 cs.LG

FRInGe: Distribution-Space Integrated Gradients with Fisher–Rao Geometry

Pith reviewed 2026-05-08 12:52 UTC · model grok-4.3

classification 💻 cs.LG
keywords integrated gradients, Fisher-Rao geometry, attribution methods, explainable AI, calibration, deep learning, geodesics, predictive distributions

The pith

FRInGe defines Integrated Gradients paths in predictive distribution space using Fisher-Rao geodesics instead of straight lines in input space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard Integrated Gradients can produce brittle explanations because they rely on arbitrary input baselines, straight-line interpolation, and discretization choices. FRInGe moves the reference point and the interpolation schedule into the space of the model's predictive distributions, selecting the maximum-entropy distribution as the reference and following the Fisher-Rao geodesic on the probability simplex. The resulting trajectory is realized in input space through the pullback of the Fisher metric and stabilized by KL and Euclidean trust regions before gradients are integrated along it. On six ImageNet architectures the approach improves calibration-oriented attribution metrics, particularly MAS scores, while remaining competitive on perturbation AUC and infidelity. A reader would care because more stable and calibration-aware attributions could make explanations of deep models more reliable for diagnosis and trust.

Core claim

FRInGe defines both the reference and the interpolation schedule in predictive distribution space, using a maximum-entropy predictive distribution as reference and the Fisher-Rao geodesic on the simplex as the path, then pulls this trajectory back to input space via the pullback Fisher metric to obtain attributions by integrating input gradients along the stabilized path.
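To make the distribution-space path concrete, here is a minimal sketch of a Fisher-Rao geodesic on the simplex, running from a predictive distribution to the uniform (maximum-entropy) distribution. It relies on the standard fact that the square-root map sends the simplex onto part of the unit sphere, where geodesics are great-circle arcs. The function names and the example distribution are illustrative assumptions, not taken from the paper, and the paper's own parameterization of the schedule may differ.

```python
import math

def fisher_rao_geodesic(p, q, t):
    """Point at fraction t in [0, 1] on the Fisher-Rao geodesic from p to q."""
    # Square-root map: the simplex becomes part of the unit sphere.
    a = [math.sqrt(pi) for pi in p]
    b = [math.sqrt(qi) for qi in q]
    # Angle between the two unit vectors (clamped for numerical safety).
    cos_theta = min(1.0, max(-1.0, sum(ai * bi for ai, bi in zip(a, b))))
    theta = math.acos(cos_theta)
    if theta < 1e-12:
        return list(p)
    # Great-circle (slerp) interpolation on the sphere.
    wa = math.sin((1.0 - t) * theta) / math.sin(theta)
    wb = math.sin(t * theta) / math.sin(theta)
    # Squaring each coordinate maps the point back to the simplex.
    return [(wa * ai + wb * bi) ** 2 for ai, bi in zip(a, b)]

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

# Hypothetical 3-class prediction and its maximum-entropy reference.
p = [0.90, 0.07, 0.03]
uniform = [1.0 / len(p)] * len(p)
path = [fisher_rao_geodesic(p, uniform, k / 4) for k in range(5)]
```

Each element of `path` is a target distribution; in the method described above, the input-space trajectory would then be chosen so that the model's predictions track these targets.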

What carries the argument

The Fisher-Rao geodesic on the probability simplex, pulled back to input space via the pullback Fisher metric to serve as the integration trajectory for gradients.
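For a softmax classifier with logits F(x), the pullback of the Fisher metric to input space is commonly written G(x) = J_F(x)^T S(p(x)) J_F(x), where S(p) = diag(p) - p p^T is the softmax Fisher matrix with respect to the logits. The appendix fragments quoted further down describe applying this operator matrix-free as a JVP, a multiplication by S(p(x)), and a VJP. The sketch below mirrors those three steps on a hypothetical linear toy model (so the Jacobian is just the weight matrix); the model, names, and numbers are assumptions for illustration only.

```python
import math

# Hypothetical toy classifier: logits F(x) = W x (3 classes, 2 inputs),
# so the Jacobian J_F(x) equals W everywhere.
W = [[2.0, -1.0],
     [-1.0, 1.5],
     [0.5, 0.5]]

def logits(x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(zi - m) for zi in z]
    s = sum(e)
    return [ei / s for ei in e]

def jvp(x, v):
    # J_F(x) v; for the linear toy model this is W v.
    return logits(v)

def vjp(x, u):
    # J_F(x)^T u; for the linear toy model this is W^T u.
    return [sum(W[c][j] * u[c] for c in range(len(W))) for j in range(len(W[0]))]

def softmax_fisher_times(p, u):
    # S(p) u with S(p) = diag(p) - p p^T, without forming S.
    pu = sum(pi * ui for pi, ui in zip(p, u))
    return [pi * (ui - pu) for pi, ui in zip(p, u)]

def pullback_fisher_times(x, v):
    # G(x) v = J^T S(p(x)) J v, computed as JVP -> S multiply -> VJP.
    p = softmax(logits(x))
    return vjp(x, softmax_fisher_times(p, jvp(x, v)))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x = [0.3, -0.2]
v = [1.0, 0.5]
w = [-0.4, 2.0]
Gv = pullback_fisher_times(x, v)
```

In an autodiff framework, `jvp` and `vjp` would be replaced by the framework's own forward-mode and reverse-mode products, so the full input-by-input matrix is never materialized.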

If this is right

  • Attributions become less sensitive to the choice of input-space baseline.
  • Calibration-sensitive attribution metrics improve across multiple ImageNet architectures.
  • Performance stays competitive on perturbation-based and infidelity measures.
  • The geometric construction supplies a principled, non-heuristic schedule for the integration path.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This construction could be applied to other gradient-based attribution methods that currently use straight-line paths.
  • Models whose predictive distributions are already close to maximum entropy might exhibit smaller gains than poorly calibrated models.
  • The pullback mechanism suggests a general route for importing information-geometric objects into input-space explanation techniques.
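The second extension above can be made quantitative with a small sketch: the Fisher-Rao distance from a prediction to the uniform distribution measures how long the geodesic to the maximum-entropy reference is. The closed form 2 arccos(Σ sqrt(p_i q_i)) is the standard expression under the usual normalization of the Fisher metric (some texts drop the factor 2); the example distributions are hypothetical.

```python
import math

def fisher_rao_distance(p, q):
    # Bhattacharyya coefficient, clamped to the valid range of acos.
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return 2.0 * math.acos(min(1.0, max(0.0, bc)))

def normalized_entropy(p):
    # Entropy divided by its maximum, log(number of classes).
    h = -sum(pi * math.log(pi) for pi in p if pi > 0.0)
    return h / math.log(len(p))

confident = [0.97, 0.01, 0.01, 0.01]   # far from maximum entropy
diffuse = [0.30, 0.25, 0.25, 0.20]     # already close to maximum entropy
uniform = [0.25] * 4

d_conf = fisher_rao_distance(confident, uniform)
d_diff = fisher_rao_distance(diffuse, uniform)
```

A short geodesic (small `d_diff`) leaves little path to integrate over, which is one way to read the conjecture that near-uniform predictions might benefit less.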

Load-bearing premise

That choosing the maximum-entropy predictive distribution as reference and interpolating along the Fisher-Rao geodesic in distribution space, then pulling the path back to input space, produces attributions that are meaningfully better and more faithful than those from standard Integrated Gradients.

What would settle it

If FRInGe shows no improvement or a clear drop in MAS scores relative to standard Integrated Gradients when tested on additional model families or datasets, while also failing to match on AUC or infidelity, the advantage of the distribution-space construction would be falsified.

Figures

Figures reproduced from arXiv: 2605.06404 by Gabriele Martino, Sebastian Tschiatschek.

Figure 1: Visualization of the Fisher–Rao pullback metric to project the gradient toward the maximum…
Figure 2: Top: normalized predictive entropy along the path. Middle: intermediate inputs show controlled attenuation of informative features. Bottom: cumulative attribution maps over equal integration eras. FRInGe concentrates attribution mass in the early, high-curvature part of the Fisher–Rao path, reducing late-stage saturation.
Figure 3: Qualitative comparison across architectures. We show a strong case (Inception-v3, top),…
Figure 4: Geometric profiles of the integration trajectory.
Figure 5: Example trajectory of Euclidean Distribution Tracking on ResNet-18.
Figure 6: Example trajectory of Unregularized FRInGe on ResNet-18 model.
Figure 7: Example trajectory of full FRInGe on ResNet-18 model.
Figure 8: Trajectory induced by FRInGe with γstep > 0 and γprior = 0 on ResNet-18. The predictive entropy evolves smoothly, indicating that regularizing the direction field stabilizes the geometry-aware path. However, the blur-change maps remain sparse and speckled, and the final attribution map is still noisy, showing that direction regularization alone is insufficient to produce meaningful input-space transformati…
Figure 9: Trajectory induced by FRInGe with γstep = 0 and γprior > 0 on ResNet-18. In this case, the blur-change maps become much more spatially coherent and concentrate on semantically meaningful regions, showing that γprior regularizes the transformed input effectively. At the same time, the entropy increases only weakly, indicating that this term alone does not provide the same predictive-space progress as the fu…
Figure 10: Trajectory induced by the non-geometric variant…
Figure 11: Sensitivity analysis of FRInGe hyperparameters across all evaluated architectures. Each…
Figure 12: Qualitative comparison across all attribution methods on ResNet-18. Images are selected…
Figure 13: Qualitative comparison across all attribution methods on ResNet-50. Images are selected…
Figure 14: Qualitative comparison across all attribution methods on ResNet-101. Images are selected…
Figure 15: Qualitative comparison across all attribution methods on ResNet-152. Images are selected…
Figure 16: Qualitative comparison across all attribution methods on Inception-V3. Images are…
Figure 17: Qualitative comparison across all attribution methods on VGG-19. Images are selected…
Original abstract

Gradient-based attribution methods are model-faithful and scalable, but Integrated Gradients (IG) can be brittle because explanations depend on heuristic baselines, straight-line paths, discretization, and saturation. We propose Fisher–Rao Integrated Gradients (FRInGe), which defines both the reference and interpolation schedule in predictive distribution space. FRInGe replaces input baselines with a maximum-entropy predictive reference and follows a Fisher–Rao geodesic on the probability simplex. The corresponding input-space trajectory is realized through the pullback Fisher metric and stabilized by KL and Euclidean trust regions; attributions are obtained by integrating input gradients along this trajectory. Across six ImageNet architectures, FRInGe most clearly improves calibration-oriented attribution metrics, especially MAS scores, while remaining competitive on perturbation AUC and infidelity.
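The last step of the abstract, integrating input gradients along a trajectory, can be sketched independently of how the trajectory is built. The snippet below accumulates a midpoint-rule line integral of the gradient along a list of waypoints and checks the completeness property (attributions sum to the change in output). The toy function, its analytic gradient, and the curved path are hypothetical stand-ins; in FRInGe the waypoints would come from the geometry-aware path, not from this formula.

```python
import math

def f(x):
    # Toy smooth "model output": a hypothetical stand-in for a target score.
    return math.tanh(x[0] + 2.0 * x[1]) + 0.5 * x[0] * x[1]

def grad_f(x):
    # Analytic gradient of f.
    s = 1.0 - math.tanh(x[0] + 2.0 * x[1]) ** 2
    return [s + 0.5 * x[1], 2.0 * s + 0.5 * x[0]]

def path_integrated_gradients(grad, waypoints):
    """Per-feature attributions: sum over segments of grad(midpoint) * delta."""
    attributions = [0.0] * len(waypoints[0])
    for a, b in zip(waypoints[:-1], waypoints[1:]):
        mid = [(ai + bi) / 2.0 for ai, bi in zip(a, b)]
        g = grad(mid)
        for i in range(len(attributions)):
            attributions[i] += g[i] * (b[i] - a[i])
    return attributions

# A deliberately curved path from a reference point to the input.
n = 200
ref, x = [0.0, 0.0], [1.0, 0.5]
waypoints = [
    [ref[0] + (x[0] - ref[0]) * (k / n), ref[1] + (x[1] - ref[1]) * (k / n) ** 2]
    for k in range(n + 1)
]
attr = path_integrated_gradients(grad_f, waypoints)
```

Because the integrand is a gradient, the total `sum(attr)` depends only on the endpoints, while the split across features depends on the path; that split is what a different choice of trajectory changes.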

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript proposes Fisher-Rao Integrated Gradients (FRInGe), which relocates the baseline and interpolation path of Integrated Gradients from input space to the space of predictive distributions. The reference is the maximum-entropy distribution on the simplex, the path follows the Fisher-Rao geodesic, and the trajectory is realized in input space via the pullback Fisher metric regularized by KL and Euclidean trust regions. Attributions are obtained by integrating input gradients along this path. Experiments across six ImageNet architectures report the clearest gains on calibration-oriented metrics (especially MAS) while remaining competitive on perturbation AUC and infidelity.

Significance. If the reported improvements hold under fuller verification, the work supplies a geometrically principled alternative to heuristic baselines and straight-line paths in attribution methods. By grounding the construction in the Fisher-Rao metric and maximum-entropy reference, it offers a reproducible, parameter-light route to more stable explanations, particularly for calibration-sensitive downstream tasks. The multi-architecture empirical comparison provides a concrete benchmark that future distribution-space methods can be measured against.

minor comments (3)
  1. [Abstract] The phrase 'stabilized by KL and Euclidean trust regions' is used without indicating the specific radius or weighting parameters; these should be stated explicitly or cross-referenced to the method section so readers can reproduce the exact path.
  2. The manuscript should clarify whether the reported MAS improvements are accompanied by statistical significance tests across the six architectures or merely point estimates; this affects the strength of the 'most clearly improves' claim.
  3. Notation: ensure the pullback metric is denoted consistently (e.g., g_FR or similar) and that the distinction between the distribution-space geodesic and its input-space image is made explicit in every equation that uses the integrated gradient formula.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their careful reading and positive assessment of our work on Fisher-Rao Integrated Gradients (FRInGe). The referee summary accurately captures the method's core contributions, including the use of the maximum-entropy reference, Fisher-Rao geodesics, and the pullback construction with trust-region stabilization. We are pleased that the empirical results on calibration metrics across six ImageNet models were viewed as a useful benchmark. No specific major comments were provided in the report.

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper defines FRInGe by replacing IG baselines with a maximum-entropy predictive distribution on the simplex, interpolating along the Fisher-Rao geodesic, realizing the input trajectory via pullback of the Fisher metric, and integrating gradients along that path (stabilized by KL/Euclidean trust regions). All components draw from established external concepts (Integrated Gradients, Fisher-Rao geometry, KL divergence) without any self-definitional reduction, fitted parameter renamed as prediction, or load-bearing self-citation chain. The central construction is independent of the reported empirical outcomes on MAS/AUC/infidelity, which are presented as observed results rather than deductive necessities. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on the abstract alone, no specific free parameters, axioms, or invented entities can be identified in detail. The method introduces concepts like the pullback Fisher metric and trust regions, but their exact parameterization is not specified here.
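Since the parameterization is not given here, the following is only a hypothetical illustration of what a combined KL and Euclidean trust region can look like: a proposed step is first capped in Euclidean norm, then halved until the KL divergence between the current and the new prediction fits a budget. The names `tau` and `eta_max` echo the "KL budget τ" and "step cap ηmax" mentioned in the truncated pseudocode header below; `r_max`, the backtracking rule, the KL direction, and the toy model are assumptions and should not be read as the paper's algorithm.

```python
import math

# Hypothetical linear-softmax toy model (3 classes, 2 inputs).
W = [[2.0, -1.0],
     [-1.0, 1.5],
     [0.5, 0.5]]

def predict(x):
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    m = max(z)
    e = [math.exp(zi - m) for zi in z]
    s = sum(e)
    return [ei / s for ei in e]

def kl(p, q):
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def trust_region_step(x, v, tau, eta_max, r_max, shrink=0.5, max_halvings=30):
    """Move along v, keeping ||step|| <= r_max and KL(p(x) || p(x_new)) <= tau."""
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if norm_v == 0.0:
        return list(x), 0.0
    eta = min(eta_max, r_max / norm_v)      # Euclidean cap on the step
    p = predict(x)
    for _ in range(max_halvings):
        x_new = [xi + eta * vi for xi, vi in zip(x, v)]
        if kl(p, predict(x_new)) <= tau:    # KL budget satisfied
            return x_new, eta
        eta *= shrink                       # backtrack
    return list(x), 0.0                     # no acceptable step found

x0 = [1.0, 0.0]
v = [-1.0, 0.5]
tau, eta_max, r_max = 0.01, 1.0, 0.5
x_new, eta = trust_region_step(x0, v, tau, eta_max, r_max)
```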

pith-pipeline@v0.9.0 · 5426 in / 1190 out tokens · 36162 ms · 2026-05-08T12:52:20.359349+00:00 · methodology


Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

  1. [1]

    Jai Bardhan, Cyrin Neeraj, Mihir Rawat, and Subhadip Mitra. Constructing sensible baselines for integrated gradients. arXiv preprint arXiv:2412.13864, 2024

  2. [2]

    Ann-Kathrin Dombrowski, Maximillian Alber, Christopher Anders, Marcel Ackermann, Klaus-Robert Müller, and Pan Kessel. Explanations can be manipulated and geometry is to blame. Advances in Neural Information Processing Systems (NeurIPS), 32, 2019

  3. [3]

    Gabriel Erion, Joseph D Janizek, Pascal Sturmfels, Scott M Lundberg, and Su-In Lee. Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nature Machine Intelligence, 3(7):620–631, 2021

  4. [4]

    Andrei Kapishnikov, Subhashini Venugopalan, Besim Avci, Ben Wedin, Michael Terry, and Tolga Bolukbasi. Guided integrated gradients: An adaptive path method for removing noise. In Conference on Computer Vision and Pattern Recognition (CVPR), pages 5050–5058. IEEE, 2021

  5. [5]

    Shuyang Liu, Zixuan Chen, Ge Shi, Ji Wang, Changjie Fan, Yu Xiong, Runze Wu, Yujing Hu, Ze Ji, and Yang Gao. A new baseline assumption of integated gradients based on shaply value. arXiv preprint arXiv:2310.04821, 2023

  6. [6]

    Deng Pan, Xin Li, and Dongxiao Zhu. Explaining deep neural network models with adversarial gradient integration. In International Joint Conference on Artificial Intelligence (IJCAI), 2021

  7. [7]

    Md Mahfuzur Rahman, Noah Lewis, and Sergey Plis. Geometrically guided saliency maps. In ICLR 2022 Workshop on PAIR2Struct: Privacy, Accountability, Interpretability, Robustness, Reasoning on Structured Data, 2022

  8. [8]

    Yuchen Ren, Zhengyu Zhao, Chenhao Lin, Bo Yang, Lu Zhou, Zhe Liu, and Chao Shen. Improving integrated gradient-based transferable adversarial examples by refining the integration path. In Conference on Artificial Intelligence (AAAI), volume 39, pages 6731–6739, 2025

  9. [9]

    Sina Salek and Joseph Enguehard. Using the path of least resistance to explain deep networks. arXiv preprint arXiv:2502.12108, 2025

  10. [10]

    Pascal Sturmfels, Scott Lundberg, and Su-In Lee. Visualizing the impact of feature attribution baselines. Distill, 2020. https://distill.pub/2020/attribution-baselines

  11. [11]

    Mukund Sundararajan and Amir Najmi. The many shapley values for model explanation. In International Conference on Machine Learning (ICML), pages 9269–9278, 2020

  12. [12]

    Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In International Conference on Machine Learning (ICML), pages 3319–3328, 2017

  13. [13]

    Hanxiao Tan. Maximum entropy baseline for integrated gradients. In 2023 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE, 2023

  14. [14]

    Kien Tran Duc Tuan, Tam Nguyen Trong, Son Nguyen Hoang, Khoat Than, and Anh Nguyen Duc. Weighted integrated gradients for feature attribution. arXiv preprint arXiv:2505.03201, 2025

  15. [15]

    Chase Walker, Sumit Jha, Kenny Chen, and Rickard Ewetz. Integrated decision gradients: Compute your attributions where the model makes its decision. In Conference on Artificial Intelligence (AAAI), volume 38, pages 5289–5297, 2024

  16. [16]

    Shawn Xu, Subhashini Venugopalan, and Mukund Sundararajan. Attribution in scale and space. In Conference on Computer Vision and Pattern Recognition (CVPR), pages 9680–9689. IEEE, 2020

  17. [17]

    Eslam Zaher, Maciej Trzaskowski, Quan Nguyen, and Fred Roosta. Manifold integrated gradients: Riemannian geometry for feature attribution. In International Conference on Machine Learning (ICML), pages 58090–58104, 2024

  18. [18]

    Yue Zhuo and Zhiqiang Ge. IG2: Integrated gradient on iterative gradient path for feature attribution. Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 46(11):7173–7190, 2024. A FRInGe Implementation Details. A.1 FRInGe Pseudocode. Algorithm 1 FRInGe. Require: Input x, target t, logits F; damping λ, KL budget τ, step cap ηmax, Euclidean...

  19. [19]

    a Jacobian-vector product (JVP) to compute $J_F(x)\,v$,

  20. [20]

    multiplication by the analytic softmax Fisher matrix $S(p(x))$,

  21. [21]

    a vector-Jacobian product (VJP) to map the result back to input space. This matrix-free implementation is the key step that makes the method computationally feasible for image-sized inputs. A.4 Regularized Linear System, Decoupled Smoothing, and Sobolev Preconditioning. At each waypoint k, FRInGe computes an update direction $v_k$ by solving a regularized lin...

  22. [22]

    one forward pass to evaluate $p(x_k)$,

  23. [23]

    one backward pass to compute the waypoint-tracking gradient $g_k = \nabla_x L_k(x_k)$,

  24. [24]

    one backward pass for attribution accumulation along the path, 4. K matrix-vector products with the pullback Fisher operator, where each product requires one JVP and one VJP. Thus, an approximate total cost for FRInGe is $C_{\mathrm{FRInGe}} \approx T\,[(1+K)\,C_{\mathrm{fwd}} + (2+K)\,C_{\mathrm{bwd}}]$. Assuming $C_{\mathrm{fwd}} \approx C_{\mathrm{bwd}}$, the relative overhead scales as $C_{\mathrm{FRInGe}} / C_{\mathrm{IG}} \approx (T/N)\,(K + 1.5)$. Worst-Case Estim...

  25. [25]

    Although the blur-change maps already focus on informative image regions, the final attribution map remains largely noisy. This control experiment shows that the benefits of FRInGe do not arise from γprior alone, but from its interaction with the Fisher-aware geometry and the regularized solve. C Experimental Protocol and Reproducibility. C.1 Dataset, Mode...
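The cost estimate quoted in entry 24 can be checked with a few lines of arithmetic. The sketch below encodes the stated total cost and compares it with a plain IG cost of one forward and one backward pass per step; that IG cost model is an assumption, chosen because it reproduces the quoted ratio (T/N)(K + 1.5) when forward and backward costs are equal. Reading T as the number of waypoints, K as the number of operator products per waypoint, and N as the number of IG steps is also an inference from the surrounding fragments.

```python
def fringe_cost(T, K, c_fwd=1.0, c_bwd=1.0):
    # Quoted estimate: T * [(1 + K) * C_fwd + (2 + K) * C_bwd].
    return T * ((1 + K) * c_fwd + (2 + K) * c_bwd)

def ig_cost(N, c_fwd=1.0, c_bwd=1.0):
    # Assumption: standard IG with N steps, one forward and one backward each.
    return N * (c_fwd + c_bwd)

def relative_overhead(T, N, K):
    # With c_fwd == c_bwd this reduces to (T / N) * (K + 1.5).
    return fringe_cost(T, K) / ig_cost(N)

# Hypothetical setting: equal step counts and 5 operator products per waypoint.
ratio = relative_overhead(T=50, N=50, K=5)
```

With equal step counts the overhead is governed by K alone, so the number of matrix-vector products per waypoint is the main cost knob in this model.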