Time series saliency maps: explaining models across multiple domains

Christodoulos Kechris; David Atienza; Jonathan Dan

arxiv: 2505.13100 · v3 · submitted 2025-05-19 · 💻 cs.LG

Pith reviewed 2026-05-22 14:40 UTC · model grok-4.3

classification 💻 cs.LG

keywords time series · saliency maps · integrated gradients · explainable AI · frequency domain · cross-domain transforms · model interpretability

The pith

Cross-domain Integrated Gradients enables interpretable saliency maps for time series in any invertible transformed domain.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Cross-domain Integrated Gradients, generalizing the standard Integrated Gradients method to compute feature attributions in domains other than the raw time domain. This is achieved by applying the attribution technique after an invertible and differentiable transformation of the input, with a specific extension to the complex domain for frequency-based analysis. A sympathetic reader would care because time series data often contain semantically important information in frequency spectra, independent components, or trend decompositions that are invisible when only examining time points. The authors provide theoretical guarantees of path independence and completeness for these attributions and validate them through controlled experiments and real-world applications in heart rate monitoring, seizure detection, and time series forecasting.

Core claim

By formulating attributions as a path integral in the space of an invertible differentiable transform of the time series input, including complex transforms, the method produces attributions that directly indicate the contribution of features in that transformed domain while satisfying the completeness axiom that the sum of attributions equals the output difference from baseline.
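
For orientation, the claim can be written out as follows. The first line is the standard Integrated Gradients definition and its completeness property. The second is one natural reading of the cross-domain version for a real-valued invertible transform T, using assumed notation (z = T(x), G = F ∘ T⁻¹) that may differ from the paper's own.

```latex
% Standard Integrated Gradients for F : R^N -> R, input x, baseline x'
\mathrm{IG}_i(x) = (x_i - x'_i) \int_0^1
  \frac{\partial F\big(x' + \alpha (x - x')\big)}{\partial x_i}\, d\alpha,
\qquad
\sum_i \mathrm{IG}_i(x) = F(x) - F(x')

% Cross-domain reading (assumed notation): z = T(x), z' = T(x'), G = F \circ T^{-1}
\mathrm{IG}^{T}_k(z) = (z_k - z'_k) \int_0^1
  \frac{\partial G\big(z' + \alpha (z - z')\big)}{\partial z_k}\, d\alpha,
\qquad
\sum_k \mathrm{IG}^{T}_k(z) = G(z) - G(z') = F(x) - F(x')
```

The last equality holds because G(T(x)) = F(x) by construction, which is where invertibility of T is used.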

What carries the argument

Cross-domain Integrated Gradients, which computes the integrated gradient along a path in the transformed domain obtained from the time series via an invertible differentiable map.
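
A minimal NumPy sketch of this idea, under stated assumptions: the transform is a random orthogonal matrix `Q` (a hypothetical stand-in for any real invertible differentiable map), the model is a toy `tanh` readout with a hand-written gradient, and the integral is approximated with a midpoint Riemann sum. None of the names below come from the paper's library.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 8

# Hypothetical stand-in for an invertible transform: a random orthogonal
# matrix, so z = Q @ x and x = Q.T @ z.
Q, _ = np.linalg.qr(rng.normal(size=(D, D)))
w = rng.normal(size=D)

def model(x):
    # Toy scalar model F(x); an assumption for illustration only.
    return np.tanh(w @ x)

def model_grad(x):
    # Analytic gradient of the toy model with respect to x.
    return (1.0 - np.tanh(w @ x) ** 2) * w

def cross_domain_ig(x, baseline, steps=512):
    # Integrate along the straight line between T(baseline) and T(x).
    z, z0 = Q @ x, Q @ baseline
    alphas = (np.arange(steps) + 0.5) / steps  # midpoint rule
    avg_grad = np.zeros(D)
    for a in alphas:
        x_a = Q.T @ (z0 + a * (z - z0))        # map the path point back to time
        # Chain rule: x = Q.T @ z, so dG/dz = Q @ dF/dx.
        avg_grad += Q @ model_grad(x_a)
    avg_grad /= steps
    return (z - z0) * avg_grad                 # one attribution per z-coordinate

x = 0.3 * rng.normal(size=D)
baseline = np.zeros(D)
attr = cross_domain_ig(x, baseline)

# Completeness: attributions should sum to F(x) - F(baseline),
# up to the error of the numerical integration.
gap = abs(attr.sum() - (model(x) - model(baseline)))
print("completeness gap:", gap)
```

Each entry of `attr` is attached to a coordinate of the transformed signal rather than to a time sample, which is the point of the construction.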

If this is right

Frequency attributions identify spectral features driving wearable heart rate regression predictions.
Independent component attributions reveal relevant sources in EEG-based seizure classification.
Seasonal-trend attributions explain decisions in zero-shot forecasting foundation models.
The method applies across different model types and tasks while maintaining theoretical properties.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Users could apply the same framework to wavelet transforms or other signal representations for domain-specific insights.
The library release makes it straightforward to test whether models attend to expected domain features in new applications.
Comparing attributions from multiple transforms on the same model might reveal robustness or inconsistencies in learned representations.

Load-bearing premise

The transformations from the time domain must be invertible and differentiable, and extending Integrated Gradients to complex numbers must not require extra constraints to keep path independence and completeness.

What would settle it

Finding a case where the attributions computed in the frequency domain do not add up to the model's prediction difference or fail to match the known frequency components responsible for the output in a controlled synthetic dataset.
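
A check of this kind can be sketched on a toy problem. The example below assumes a hand-built "detector" whose output is the power of one known frequency bin (`K0`), feeds it a two-tone signal, and computes frequency attributions by treating the real and imaginary parts of each FFT coefficient as separate real variables. The model, the signal, and all names are illustrative assumptions, not the paper's experiments or API.

```python
import numpy as np

N, K0, K1 = 64, 5, 12
n = np.arange(N)
c = np.cos(2 * np.pi * K0 * n / N)
s = np.sin(2 * np.pi * K0 * n / N)

def model(x):
    # Toy model: responds only to the K0 frequency component.
    return (x @ c) ** 2 + (x @ s) ** 2

def model_grad(x):
    return 2 * (x @ c) * c + 2 * (x @ s) * s

def frequency_ig(x, baseline, steps=64):
    z, z0 = np.fft.fft(x), np.fft.fft(baseline)
    dz = z - z0
    alphas = (np.arange(steps) + 0.5) / steps      # midpoint rule
    avg = np.zeros(N, dtype=complex)
    for a in alphas:
        x_a = np.fft.ifft(z0 + a * dz).real        # path point back in time
        # With x = Re(ifft(z)): real part of fft(g)/N is dG/dRe(z_k),
        # imaginary part is dG/dIm(z_k).
        avg += np.fft.fft(model_grad(x_a)) / N
    avg /= steps
    # Combine the real-part and imaginary-part contributions per bin.
    return dz.real * avg.real + dz.imag * avg.imag

x = np.sin(2 * np.pi * K0 * n / N) + 0.5 * np.sin(2 * np.pi * K1 * n / N)
baseline = np.zeros(N)
attr = frequency_ig(x, baseline)

# Completeness in the frequency domain.
gap = abs(attr.sum() - (model(x) - model(baseline)))
# Share of the total attribution carried by the known bins K0 and N - K0.
share = (attr[K0] + attr[N - K0]) / attr.sum()
print("completeness gap:", gap, "share on +/-K0:", share)
```

In this toy setting the attributions sum to the output difference and land on the bins the model was built to use; a counterexample of the kind described above would show a nonzero gap or mass on the distractor bin `K1`.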

Original abstract

Traditional saliency map methods, popularized in computer vision, highlight individual points (pixels) of the input that contribute the most to the model's output. However, in time series, they offer limited insights, as semantically meaningful features are often found in other domains. We introduce Cross-domain Integrated Gradients, a generalization of Integrated Gradients. Our method enables feature attributions in any domain that can be formulated as an invertible, differentiable transformation of the time domain. Crucially, our derivation extends the original Integrated Gradients into the complex domain, enabling frequency-based attributions. We provide the necessary theoretical guarantees, namely, path independence and completeness. We validate our method via controlled experiments with mechanistic analysis, quantitative faithfulness tests, and real-world case studies. Our approach reveals interpretable, problem-specific attributions that time-domain methods cannot capture in three real-world tasks across a variety of model architectures, machine-learning tasks, and cross-domain transforms: frequency-based attribution for a regression task in wearable heart rate extraction, independent component analysis in a classification task for electroencephalography-based seizure detection, and seasonal-trend decomposition for a forecasting problem with a zero-shot time-series foundation model. We release an open-source TensorFlow/PyTorch library to enable plug-and-play cross-domain explainability for time-series models. These results demonstrate the ability of Cross-Domain Integrated Gradients to provide semantically meaningful insights into time-series models that are impossible to achieve with traditional saliency in the time domain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Cross-domain Integrated Gradients (CDIG), a generalization of Integrated Gradients that computes saliency attributions in any domain reachable by an invertible differentiable transform of the time series input. The central technical contribution is an extension of the IG line integral to complex-valued domains (e.g., FFT), together with proofs of path independence and completeness. The method is validated on three tasks: frequency attributions for wearable heart-rate regression, ICA-based attributions for EEG seizure classification, and STL decomposition for zero-shot forecasting with a foundation model. An open-source TensorFlow/PyTorch library is released.

Significance. If the complex-domain extension is rigorously justified, the work supplies a principled route to semantically meaningful explanations for time-series models that standard time-domain IG cannot provide. The release of reusable code and the breadth of evaluated transforms and architectures increase the potential impact on explainable AI for healthcare and forecasting applications.

major comments (2)

[§3] §3 (Theoretical derivation): The claim that path independence and completeness are preserved when extending IG to the complex domain rests on the line integral along a straight path in ℂ. For arbitrary (non-holomorphic) compositions of neural networks with transforms such as FFT, the integral is not guaranteed to be path-independent without additional constraints or an explicit deformation-invariant path. Please provide the explicit proof or state the precise analyticity assumptions under which the fundamental theorem of calculus extends to this setting; otherwise completeness may fail for frequency attributions.
[§4.2] §4.2 (Quantitative faithfulness tests): The reported faithfulness metrics for cross-domain attributions versus time-domain baselines are central to the empirical claim. The manuscript should report the exact definition of the faithfulness score, the choice of baseline, and the statistical significance of the improvement across the three tasks; without these details it is difficult to assess whether the observed gains are robust or task-specific.

minor comments (2)

[§3] Notation: the complex-valued gradient operator and the definition of the integration path in the complex plane should be written explicitly (e.g., as an equation) rather than described only in prose.
[Figure 2] Figure 2: the caption should state the exact transform (FFT, ICA, or STL) and model architecture used for each panel so that readers can map the visualizations to the quantitative results.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and insightful comments, which have helped us improve the clarity and rigor of the manuscript. We address each major comment point by point below.

Referee: [§3] §3 (Theoretical derivation): The claim that path independence and completeness are preserved when extending IG to the complex domain rests on the line integral along a straight path in ℂ. For arbitrary (non-holomorphic) compositions of neural networks with transforms such as FFT, the integral is not guaranteed to be path-independent without additional constraints or an explicit deformation-invariant path. Please provide the explicit proof or state the precise analyticity assumptions under which the fundamental theorem of calculus extends to this setting; otherwise completeness may fail for frequency attributions.

Authors: We thank the referee for this important observation. The original Integrated Gradients method defines attributions via a straight-line path in input space and invokes the fundamental theorem of calculus along that specific parameterized path; it does not require path independence over arbitrary paths. Our cross-domain extension follows the same construction: the invertible differentiable transform is applied to both the baseline and the input, after which we integrate along the straight line in the transformed (possibly complex) domain. Because the path is fixed and the composition is continuously differentiable along the path, the line integral remains well-defined and completeness holds by telescoping, without requiring holomorphicity of the overall mapping. We will add an explicit appendix derivation that spells out this parameterization (treating real and imaginary parts jointly as a real vector field) and states the minimal assumption of continuous differentiability along the chosen path. This clarifies that no stronger analyticity conditions are needed.
Revision: yes
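
The parameterization described in this response can be written out as below. This is a sketch in assumed notation (z = u + iv, G real-valued and continuously differentiable in (u, v), A_k the attribution of coefficient k), not the appendix the authors promise.

```latex
% Straight-line path between the transformed baseline z' and input z
\gamma(\alpha) = z' + \alpha (z - z'), \qquad \alpha \in [0, 1]

% Chain rule in the real variables (u, v), with z = u + i v
\frac{d}{d\alpha} G\big(\gamma(\alpha)\big)
  = \sum_k \left[
      \frac{\partial G}{\partial u_k}\big(\gamma(\alpha)\big)\,(u_k - u'_k)
    + \frac{\partial G}{\partial v_k}\big(\gamma(\alpha)\big)\,(v_k - v'_k)
    \right]

% Attribution of the k-th complex coefficient
A_k = (u_k - u'_k) \int_0^1 \frac{\partial G}{\partial u_k}\big(\gamma(\alpha)\big)\, d\alpha
    + (v_k - v'_k) \int_0^1 \frac{\partial G}{\partial v_k}\big(\gamma(\alpha)\big)\, d\alpha

% Completeness via the fundamental theorem of calculus along the path
\sum_k A_k = \int_0^1 \frac{d}{d\alpha} G\big(\gamma(\alpha)\big)\, d\alpha = G(z) - G(z')
```

Only real differentiability of G along the path is used; no Cauchy-Riemann condition appears in any step.
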
Referee: [§4.2] §4.2 (Quantitative faithfulness tests): The reported faithfulness metrics for cross-domain attributions versus time-domain baselines are central to the empirical claim. The manuscript should report the exact definition of the faithfulness score, the choice of baseline, and the statistical significance of the improvement across the three tasks; without these details it is difficult to assess whether the observed gains are robust or task-specific.

Authors: We agree that these implementation details are necessary for reproducibility and proper evaluation of the empirical results. In the revised manuscript we will explicitly define the faithfulness score as the area under the perturbation curve (model output change when features are removed in decreasing order of attribution magnitude), state the baseline used for each domain and task (zero for frequency-domain experiments, mean for time-domain and STL), and report statistical significance via paired t-tests (or Wilcoxon signed-rank tests where normality assumptions are violated) with p-values across repeated runs or cross-validation folds for all three tasks. These additions will appear in Section 4.2 together with updated tables.
Revision: yes
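
One possible reading of that faithfulness score, as a sketch: features are replaced by their baseline value in decreasing order of attribution magnitude, the absolute change in model output is recorded after each removal, and the curve is integrated with the trapezoid rule over a unit interval. The toy linear model, the normalization, and the function names are assumptions; the response does not fix these details.

```python
import numpy as np

def perturbation_curve(model, x, baseline, attributions):
    # Remove features from most to least important (by |attribution|).
    order = np.argsort(-np.abs(attributions))
    x_pert = np.array(x, dtype=float)
    f0 = model(x_pert)
    changes = [0.0]
    for i in order:
        x_pert[i] = baseline[i]                 # "remove" feature i
        changes.append(abs(model(x_pert) - f0))
    return np.array(changes)

def area_under_curve(curve):
    # Trapezoid rule with the removal fraction running from 0 to 1.
    dx = 1.0 / (len(curve) - 1)
    return float(np.sum((curve[1:] + curve[:-1]) / 2.0) * dx)

# Toy linear model, where Integrated Gradients has a closed form.
w = np.array([4.0, 3.0, 2.0, 1.0])

def model(x):
    return float(w @ x)

x = np.ones(4)
baseline = np.zeros(4)

ig = w * (x - baseline)                         # exact IG for a linear model
mis_ranked = np.array([1.0, 2.0, 3.0, 4.0])     # deliberately wrong ranking

auc_ig = area_under_curve(perturbation_curve(model, x, baseline, ig))
auc_bad = area_under_curve(perturbation_curve(model, x, baseline, mis_ranked))
print(auc_ig, auc_bad)
```

Under this definition a better ranking yields a larger area, so `auc_ig` exceeds `auc_bad`. For cross-domain attributions the removal step would presumably be applied to the transformed coefficients and mapped back through the inverse transform before calling the model.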

Circularity Check

0 steps flagged

No significant circularity; derivation extends IG with independent theoretical claims

The paper's core contribution is a generalization of Integrated Gradients to invertible differentiable transforms (including complex domain for frequency attributions) together with stated guarantees of path independence and completeness. No quoted equation or section reduces the claimed guarantees to a fitted parameter, self-referential definition, or prior self-citation chain. The derivation is presented as building on the standard IG formulation with an explicit extension whose axioms are asserted to hold under the stated conditions (invertibility and differentiability). This is the most common honest finding for papers whose central result is a mathematical generalization rather than a re-labeling of inputs. Any concerns about whether the complex-domain proof actually holds belong under correctness or completeness risk, not circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard mathematical assumptions about transformations rather than new fitted parameters or invented entities.

axioms (1)

Domain assumption: Transformations of the time domain are invertible and differentiable.
Why it is needed: to map attributions from the transformed domain back to the original input while preserving completeness.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: "We derive a generalization of the Integrated Gradients for real-valued functions with a complex domain, enabling the generation of frequency-domain saliency maps... path independence and completeness"

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Paper passage: "Cross-domain Integrated Gradients... invertible, differentiable transformation of the time domain"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.