
arxiv: 2605.06834 · v1 · submitted 2026-05-07 · 💻 cs.LG

Attribution-Based Neuron Utility for Plasticity Restoration in Deep Networks

Pith reviewed 2026-05-11 00:47 UTC · model grok-4.3

classification 💻 cs.LG
keywords continual learning · plasticity restoration · neuron utility · gradient attribution · adaptive reset · loss of plasticity · deep networks

The pith

A reference-based gradient attribution measure estimates the functional cost of replacing neurons to guide more reliable resets that restore plasticity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Deep networks can lose plasticity during continual learning: they become progressively harder to update on new data because of neuron saturation and related effects. Adaptive reset interventions try to fix this by selectively reinitializing low-utility units, but common utility proxies such as activation magnitude or raw gradient activity can become misaligned with the actual effect of the reset. The paper defines GXD (gradient times difference from reference) as the product of a unit's gradient and its difference from a chosen reference, which supplies a first-order estimate of the functional cost of replacement. When this cost-aligned signal selects units for reset, the interventions are reported to remain effective in regimes where prior criteria degrade. Readers should care because the approach reframes plasticity restoration as a cost-estimation task rather than a heuristic search.

Core claim

GXD, computed via reference-based gradient attribution, estimates the first-order functional cost of replacing a given unit; utility measures that align with this cost yield more reliable adaptive resets than existing proxy signals such as activation magnitude or gradient activity, particularly once prior reset criteria have begun to degrade.

What carries the argument

GXD (gradient times difference from reference), the utility signal derived from reference-based gradient attribution that quantifies the first-order cost of neuron replacement.
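
Written out, the quantity described above might look as follows. This is a sketch based on the verbal description here, not the paper's own notation: the symbols $h_i$ (the unit's activation), $r_i$ (its reference value), and $\mathcal{L}$ (the loss or target output being attributed) are assumed names, and the sign convention and any batch averaging or absolute value are assumptions as well.

```latex
% Assumed notation: h_i = activation of unit i, r_i = reference value,
% \mathcal{L} = loss (or target output) whose change is being estimated.
\mathrm{GXD}_i \;=\; \frac{\partial \mathcal{L}}{\partial h_i}\,\bigl(h_i - r_i\bigr),
\qquad
\mathcal{L}\bigl(h \text{ with } h_i \to r_i\bigr) - \mathcal{L}(h)
\;\approx\; \frac{\partial \mathcal{L}}{\partial h_i}\,\bigl(r_i - h_i\bigr)
\;=\; -\,\mathrm{GXD}_i .
```

Under this reading, a unit whose GXD magnitude is near zero is one whose replacement by the reference is predicted, to first order, to leave the function almost unchanged.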

If this is right

  • Adaptive reset policies become more stable once utility is tied directly to measured intervention cost.
  • Continual learning agents can sustain longer sequences of distribution shifts without manual intervention.
  • The problem of lost plasticity is recast as an explicit cost-estimation task rather than a search over heuristics.
  • Reset decisions can be computed from a single forward-backward pass, since GXD needs only each unit's gradient and its difference from the reference.
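
As a rough illustration of the adaptive reset loop these points refer to, the sketch below picks the lowest-utility hidden units under a fixed replacement budget and reinitializes them. Everything here is an assumption made for illustration: the function names, the list-of-lists weight layout, the uniform reinitialization, and the choice to zero outgoing weights (a common convention in reset methods, not confirmed as the paper's mechanism). The utility values could come from GXD or from any competing criterion.

```python
import random


def select_units_for_reset(utilities, budget_fraction):
    """Return indices of the lowest-utility units, up to the replacement budget."""
    n_reset = int(len(utilities) * budget_fraction)
    ranked = sorted(range(len(utilities)), key=lambda i: utilities[i])
    return sorted(ranked[:n_reset])


def reset_units(in_weights, out_weights, unit_ids, rng, init_scale=0.1):
    """Reinitialize the chosen hidden units in place.

    in_weights[j] holds the incoming weights of hidden unit j.
    out_weights[k][j] is the weight from hidden unit j to output k.
    Zeroing outgoing weights is an assumed convention so that a freshly
    reset unit does not immediately perturb the output.
    """
    for j in unit_ids:
        in_weights[j] = [rng.uniform(-init_scale, init_scale) for _ in in_weights[j]]
        for row in out_weights:
            row[j] = 0.0


# Hypothetical utilities for five hidden units (illustrative values only).
rng = random.Random(0)
utilities = [0.9, 0.01, 0.4, 0.02, 0.7]
in_w = [[1.0, -1.0] for _ in utilities]
out_w = [[0.5] * len(utilities)]

chosen = select_units_for_reset(utilities, budget_fraction=0.4)
reset_units(in_w, out_w, chosen, rng)
print("reset units:", chosen)
```

Keeping selection separate from the reset mechanism mirrors the comparison described in this document, where competing utilities share the same budget and reset procedure and differ only in which units they choose.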

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same attribution logic could be applied to decide when to prune or freeze units instead of resetting them.
  • Extending the reference choice to multiple historical states might further improve cost estimates in long task sequences.
  • The method suggests a general template for any plasticity intervention that can be expressed as a parameter replacement operation.

Load-bearing premise

Reference-based gradient attribution supplies an accurate first-order estimate of the actual performance change that would result from replacing a unit, and this estimate correctly identifies which units' resets will restore trainability.
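
How closely a first-order estimate tracks the actual change can be checked numerically on a toy model. The sketch below is not the paper's setup: the one-hidden-layer tanh network, the squared-error loss, the zero reference, and all names (`forward`, `first_order_vs_actual`, `W`, `v`) are assumptions chosen so the comparison fits in a few lines.

```python
import math


def forward(W, v, x):
    """Toy network: h = tanh(W x), y = v . h."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]
    y = sum(vj * hj for vj, hj in zip(v, h))
    return h, y


def first_order_vs_actual(W, v, x, target, reference=0.0):
    """For each hidden unit, compare the linear estimate of the loss change
    from replacing its activation with `reference` against the true change."""
    h, y = forward(W, v, x)
    base_loss = 0.5 * (y - target) ** 2
    rows = []
    for j in range(len(h)):
        grad_j = (y - target) * v[j]            # dL/dh_j for squared error
        estimate = grad_j * (reference - h[j])  # first-order Taylor term
        h_new = list(h)
        h_new[j] = reference                    # replace the unit's activation
        y_new = sum(vj * hj for vj, hj in zip(v, h_new))
        actual = 0.5 * (y_new - target) ** 2 - base_loss
        rows.append((j, estimate, actual))
    return rows


# Hypothetical weights and input, chosen only for illustration.
W = [[0.5, -0.2], [1.5, 0.3], [-0.1, 0.05]]
v = [0.8, -0.4, 2.0]
x = [1.0, 2.0]
rows = first_order_vs_actual(W, v, x, target=1.0)
for j, estimate, actual in rows:
    print(f"unit {j}: first-order {estimate:+.4f}, actual {actual:+.4f}")
```

For this particular loss the discrepancy for unit j works out to exactly `0.5 * (v[j] * (reference - h[j])) ** 2`, so the linear term is accurate only when the unit's weighted activation shift is small. In deeper networks with saturation and interacting units no such closed form is available, which is why the premise needs empirical support.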

What would settle it

A controlled experiment in which the units GXD selects for reset (those it scores as cheapest to replace) are reset, and the observed recovery in learning speed or task performance shows no statistically significant improvement over resets chosen by existing proxy measures or by random selection.
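
One way such a comparison could be scored is a paired sign-flip permutation test over per-seed differences. This is only a sketch: the source does not say which statistical test would be used, and the function name and the numbers below are hypothetical.

```python
from itertools import product


def paired_sign_flip_p_value(differences):
    """One-sided exact sign-flip test on paired differences.

    Returns the fraction of sign assignments whose summed difference is at
    least as large as the observed sum. Enumerates all 2**n assignments, so
    it is only practical for a small number of pairs (for example, seeds).
    """
    observed = sum(differences)
    magnitudes = [abs(d) for d in differences]
    n_total = 0
    n_extreme = 0
    for signs in product((1, -1), repeat=len(magnitudes)):
        n_total += 1
        flipped = sum(s * m for s, m in zip(signs, magnitudes))
        if flipped >= observed - 1e-12:  # small tolerance for float ties
            n_extreme += 1
    return n_extreme / n_total


# Hypothetical per-seed differences in recovered accuracy
# (GXD-selected reset minus baseline reset); not results from the paper.
differences = [0.02, 0.03, 0.01, 0.04]
p_value = paired_sign_flip_p_value(differences)
print("one-sided p-value:", p_value)
```

With n pairs the smallest attainable one-sided p-value is 1 / 2**n, so a handful of seeds cannot establish significance at strict thresholds regardless of effect size.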

Figures

Figures reproduced from arXiv: 2605.06834 by Dawer Jamshed, Lucas Beauchemin, Patrick Elisii.

Figure 1: Failure cases of existing utility measures on Online Permuted MNIST.
Figure 2: Shock@5%: mean output perturbation from individually setting each neuron of the bottom …
Figure 3: Test accuracy on Permuted MNIST across activation functions. GXD maintains plasticity …
Figure 4: Left: Test accuracy on Permuted MNIST with ReLU and LayerNorm. GXD is the only tested utility that mitigates plasticity loss. Right: Continuous CIFAR-100 with a ResNet. Test accuracy relative to backprop baseline (50-task MA). GXD consistently outperforms Contribution and MC Adaptable Contribution in this feature stability test. methods use the same nonzero replacement budget and reset mechanism, but choos…
Figure 5: Shock@5% for GXD variants: GXI (target), GXD (target), and GXD (full-L1).
Figure 6: Test accuracy on Permuted MNIST comparing GXI (target), GXD (target), and Loss …
Figure 7: Absolute test accuracy on Continuous CIFAR-100 with ResNet.
Original abstract

Continual learning research attempts to conserve two fundamental capabilities: new knowledge acquisition and the preservation of previously acquired knowledge. While knowledge in this case can be measured through performance over an implicit or explicit task space, model plasticity generally concerns adaptability as data distributions evolve. Though much of the literature has focused on catastrophic forgetting, deep networks can also suffer from loss of plasticity, becoming progressively harder to update under continued training. Recent research has identified multiple mechanisms underlying this phenomenon, including neuron saturation, parameter norm growth, and loss of useful curvature directions. Adaptive reset-based interventions, which selectively reinitialize low-utility network parameters, have emerged as practical solutions to restore trainability. Existing utility measures used to guide resets, such as activation magnitude, contribution utility, or gradient-based activity, rely on proxy signals that can become misaligned with the intervention they are meant to guide. In this paper, we introduce gradient times difference from reference (GXD), a theoretically motivated utility measure based on reference-based gradient attribution that estimates the first-order functional cost of replacing a unit. Our results show that utility measures aligned with the functional cost of the reset can make interventions more reliable in settings where existing reset criteria degrade. GXD reframes adaptive resetting as an intervention cost estimation problem, providing a practical path toward more robust continual learning systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces GXD (gradient times difference from reference), a utility measure for neurons based on reference-based gradient attribution. It estimates the first-order functional cost of replacing a unit to guide adaptive resets that restore plasticity in continual learning. The central claim is that cost-aligned utilities outperform existing proxies (activation magnitude, contribution utility, gradient-based activity) in settings where those degrade, reframing resets as an intervention cost estimation problem.

Significance. If the alignment between GXD and actual reset cost holds empirically and the first-order approximation is validated, the work could improve reliability of plasticity-restoration interventions. It offers a principled alternative to proxy signals and may support more robust continual learning systems.

major comments (2)
  1. [GXD definition and theoretical motivation] The claim that GXD supplies an accurate first-order estimate of the functional cost of a discrete, finite reset is load-bearing for the central result. The manuscript provides no derivation or bound showing that the linear term in the Taylor expansion around the reference point remains close to the true delta-loss after reset, despite known high curvature, neuron interactions, and saturation in deep networks.
  2. [Experimental results and evaluation] Experimental validation of the core assumption is missing: there is no direct comparison (e.g., correlation or ranking agreement) between GXD scores and the measured change in loss or plasticity metric after performing the actual reset on the selected neurons. Without this, it is impossible to confirm that GXD reliably identifies neurons whose reset restores plasticity better than baselines.
minor comments (2)
  1. [Abstract] The abstract refers to 'settings where existing reset criteria degrade' but does not specify the continual learning benchmarks, task sequences, or degradation metrics used.
  2. [Method] Notation for the reference point and gradient computation should be formalized with an equation to allow reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed report. The comments identify important gaps in the theoretical grounding and direct empirical validation of GXD. We address each point below and commit to revisions that strengthen the manuscript without altering its core claims.

Point-by-point responses
  1. Referee: [GXD definition and theoretical motivation] The claim that GXD supplies an accurate first-order estimate of the functional cost of a discrete, finite reset is load-bearing for the central result. The manuscript provides no derivation or bound showing that the linear term in the Taylor expansion around the reference point remains close to the true delta-loss after reset, despite known high curvature, neuron interactions, and saturation in deep networks.

    Authors: We appreciate this observation. GXD is constructed precisely as the first-order term of the Taylor expansion of the loss with respect to a neuron’s activation: the product of the gradient of the loss w.r.t. the activation and the difference between the current activation and a chosen reference value. This follows directly from the definition of gradient-based attribution methods. We acknowledge that the manuscript does not contain an explicit derivation section or any analytic bound on the remainder term. In the revision we will add a dedicated subsection that (i) derives GXD from the first-order Taylor expansion, (ii) states the assumptions under which the linear approximation is expected to be useful, and (iii) discusses known limitations arising from curvature and inter-neuron dependencies, citing relevant work on attribution reliability. We will not claim a universal bound, as none is available in the literature for the general case. revision: yes

  2. Referee: [Experimental results and evaluation] Experimental validation of the core assumption is missing: there is no direct comparison (e.g., correlation or ranking agreement) between GXD scores and the measured change in loss or plasticity metric after performing the actual reset on the selected neurons. Without this, it is impossible to confirm that GXD reliably identifies neurons whose reset restores plasticity better than baselines.

    Authors: We agree that indirect evidence via downstream task performance is insufficient to validate the central modeling assumption. Our current experiments demonstrate that GXD-guided resets outperform proxy-based baselines on plasticity metrics, but they do not report the direct relationship between GXD values and the observed loss change after reset. In the revised manuscript we will add a new experimental subsection that (i) selects neurons according to GXD and the competing criteria, (ii) performs the actual resets, (iii) records the immediate change in loss and in a plasticity probe metric, and (iv) reports Spearman rank correlations and top-k agreement between the utility scores and the measured deltas. These results will be presented alongside the existing continual-learning benchmarks. revision: yes
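
The rank-agreement measures named in the second response can be sketched in plain Python. The function names, the tie handling by average ranks, and the example numbers are assumptions for illustration; which end of the ranking the top-k comparison should use is also an assumption, since the text does not specify it.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def top_k_agreement(scores, measured, k):
    """Fraction of the k highest-scored units that are also among the
    k units with the largest measured change."""
    top_scores = set(sorted(range(len(scores)), key=lambda i: -scores[i])[:k])
    top_measured = set(sorted(range(len(measured)), key=lambda i: -measured[i])[:k])
    return len(top_scores & top_measured) / k


# Hypothetical utility scores and measured post-reset loss changes
# for four units; not data from the paper.
scores = [0.1, 0.5, 0.3, 0.9]
measured = [0.2, 0.4, 0.35, 1.2]
rho = spearman(scores, measured)
agreement = top_k_agreement(scores, measured, k=2)
print("Spearman:", rho, "top-2 agreement:", agreement)
```

Reporting both numbers is useful because they answer different questions: Spearman correlation summarizes agreement across the whole ranking, while top-k agreement looks only at the units a reset policy would actually act on.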

Circularity Check

0 steps flagged

No circularity detected in derivation chain

Full rationale

The paper defines GXD as a reference-based gradient attribution measure and states that it estimates the first-order functional cost of unit replacement. The abstract presents this as theoretically motivated but supplies no equations. Because the full derivation (including any Taylor expansion or attribution formula) is not available here, no load-bearing step can be shown to reduce by construction to its own inputs, whether as a tautology or through self-citation. Empirical results on intervention reliability are presented as independent validation rather than as a fitted prediction. No load-bearing self-citation, smuggled ansatz, or renaming of known results is identifiable from the provided text. The derivation remains self-contained against external benchmarks such as standard first-order attribution methods.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only abstract available; the central claim rests on the unverified assumption that gradient attribution estimates replacement cost and that this cost predicts reset utility. No free parameters, additional axioms, or invented entities are described.

axioms (1)
  • domain assumption: Reference-based gradient attribution estimates the first-order functional cost of replacing a unit.
    Invoked to motivate GXD in the abstract.


