Shapley Neuron Values for Continual Learning: Which Neurons Matter Most?

Abhisek Ray; Mohammad Ali Vahedifar; Qi Zhang

arxiv: 2605.15877 · v1 · pith:DXL46QJ7new · submitted 2026-05-15 · 💻 cs.LG · cs.AI


Pith reviewed 2026-05-20 20:24 UTC · model grok-4.3


keywords: continual learning, catastrophic forgetting, Shapley values, neuron importance, buffer-free learning, class incremental learning, task incremental learning


The pith

Shapley values can identify which neurons to freeze so a network learns new tasks without forgetting old ones or storing data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Shapley Neuron Valuation to score how much each neuron contributes to performance on tasks already learned. High-scoring neurons are frozen while others stay free to change, letting the same fixed network handle a sequence of tasks. This removes the need for replay buffers or growing the model size, two common costs in continual learning. A reader cares because many real systems must keep improving on new data without losing what they already know, yet most current fixes either use extra memory or change the architecture. The approach grounds the choice of which parts to protect in cooperative game theory rather than ad-hoc rules.

Core claim

We present Shapley Neuron Valuation (SNV), a framework that treats neurons as players in a cooperative game and computes their marginal contribution to the network's output on prior tasks. SNV then freezes neurons with the highest values to protect earlier knowledge while leaving lower-value neurons plastic for new learning. On ImageNet-1k this yields +2.88 percent accuracy in class-incremental learning and +6.46 percent in task-incremental learning over the strongest buffer-free baseline.

What carries the argument

The argument rests on Shapley Neuron Valuation (SNV), which assigns each neuron a value equal to its average marginal contribution across all possible coalitions of neurons and uses that value to decide which neurons must remain unchanged during subsequent training.
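For reference, the "average marginal contribution across all possible coalitions" is the standard Shapley value from cooperative game theory. The notation below is generic and assumed here, not taken from the paper: N is the set of n neurons, v(S) is a value function scoring the network when only the neurons in coalition S are active, and φ_i is the value assigned to neuron i.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n - |S| - 1)!}{n!}\,
  \bigl[\, v(S \cup \{i\}) - v(S) \,\bigr]
```

The sum ranges over 2^(n-1) coalitions for each neuron, which is why exact evaluation is impractical for networks of realistic size and some form of approximation is needed.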

If this is right

SNV enables continual learning on large image datasets without storing past examples.
Accuracy gains appear in both class-incremental and task-incremental protocols.
The network size stays constant because only selected neurons are frozen rather than new layers added.
Neuron importance is computed once per task and then used to set a binary freeze mask.
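The last point can be illustrated with a minimal sketch of turning per-neuron scores into a binary freeze mask and carrying it across tasks. The function names `top_r_mask` and `accumulate`, the toy scores, and the rule that masks are combined by union are assumptions made for illustration; the paper's actual thresholding and accumulation may differ.

```python
def top_r_mask(scores, r):
    """Return a 0/1 list marking the top fraction r of neurons by score."""
    n = len(scores)
    k = max(1, round(r * n)) if r > 0 else 0
    # Sort indices by descending score; ties go to the lower index.
    order = sorted(range(n), key=lambda i: (-scores[i], i))
    frozen = set(order[:k])
    return [1 if i in frozen else 0 for i in range(n)]


def accumulate(cumulative, new_mask):
    """Union of freeze masks (assumed rule): once frozen, a neuron stays frozen."""
    return [a | b for a, b in zip(cumulative, new_mask)]


# Hypothetical importance scores for five neurons on two successive tasks.
scores_t1 = [0.9, 0.1, 0.4, 0.05, 0.7]
scores_t2 = [0.2, 0.8, 0.3, 0.6, 0.1]

mask_t1 = top_r_mask(scores_t1, r=0.4)
mask_t2 = top_r_mask(scores_t2, r=0.4)
cumulative = accumulate(mask_t1, mask_t2)
```

In a training loop, a mask like `cumulative` would typically be used to zero the gradient updates of frozen neurons, leaving the remaining ones plastic.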

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same scoring could be applied to other layer types if an equivalent notion of 'neuron' is defined.
Periodic re-calculation of Shapley values after several tasks might further reduce forgetting under distribution shift.
Combining SNV with lightweight regularization on the plastic neurons could produce additive gains.

Load-bearing premise

That the importance ranking of neurons computed on past tasks will still mark the right neurons to protect when entirely new tasks arrive.

What would settle it

Run the same continual-learning schedule on ImageNet-1k but freeze neurons chosen at random instead of by SNV scores; if the accuracy gap over baselines vanishes, the method's advantage is not explained by the Shapley ranking.
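For this control to be fair, the random mask should freeze exactly as many neurons as the SNV mask, so that only the selection rule differs. A minimal sketch follows; `random_mask_like` and the example mask are hypothetical names and data, not part of the paper.

```python
import random


def random_mask_like(mask, seed=0):
    """Random 0/1 mask freezing the same number of neurons as `mask`."""
    n, k = len(mask), sum(mask)
    rng = random.Random(seed)  # seeded so the control is reproducible
    frozen = set(rng.sample(range(n), k))
    return [1 if i in frozen else 0 for i in range(n)]


# Hypothetical SNV freeze mask over eight neurons (three frozen).
snv_mask = [1, 0, 0, 1, 1, 0, 0, 0]
control = random_mask_like(snv_mask, seed=42)
```

Repeating the control over several seeds would give a distribution of random-freeze accuracies against which to compare the SNV result.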

Figures

Figures reproduced from arXiv: 2605.15877 by Abhisek Ray, Mohammad Ali Vahedifar, Qi Zhang.

**Figure 1.** An illustration of Shapley Neuron Values: for each of task t − 1, task t, and task t + 1, we identify and freeze the Neurons whose Shapley Values fall within the top r%. Neurons marked with specific colors indicate that the same Neuron appears within the top r% for t specific tasks. τi is each task's top r% threshold.

**Figure 2.** ACC evaluation for CIL across 10 tasks on each dataset. Each point represents the average classification accuracy evaluated after learning a given task. For example, the value at task 5 corresponds to the average accuracy of the model on the test sets of tasks 1 through 5 after completing training on task 5.

**Figure 3.** Network parameter usage efficiency across datasets in the CIL scenario for 10 tasks. The dashed vertical lines indicate the critical pruning percentage where each method experiences significant accuracy degradation. A sharper decline at lower pruning percentages indicates more efficient usage of network capacity, as it suggests the method utilizes essential parameters with minimal redundancy.

**Figure 4.** Capacity analysis on CIFAR-100 and TinyImageNet for SNV. (a) Accuracy as a function of model capacity across incremental tasks. (b) Capacity versus Accuracy.

**Figure 5.** Computational cost of all compared methods across CIFAR-100, Tiny-ImageNet, and ImageNet-1k (Class-IL, 10 tasks). Left: total training FLOPs on a log scale. Right: peak GPU memory.

**Figure 6.** Performance evaluation metrics of continual learning methods. RAC is the Random model ACcuracy.

**Figure 7.** ACC evaluation comparison for the CIL for 20 tasks for each dataset. Each point represents the average classification accuracy evaluated after learning a given task, averaged over all tasks learned up to that point. For example, the value at task 10 corresponds to the average accuracy of the model on the test sets of tasks 1 through 10 after completing training on task 10.

**Figure 8.** ACC matrix for SNV for the CIL for 10 tasks for each dataset.

**Figure 9.** ACC matrix for SNV for the CIL for 20 tasks for each dataset.

**Figure 10.** Layer-wise Shapley Neuron Importance Across Datasets.

**Figure 11.** Mask overlap between tasks across datasets for 10 tasks in the CIL scenario.

Original abstract

Continual learning enables neural networks to learn tasks sequentially without forgetting previously acquired knowledge. However, neural networks suffer from catastrophic forgetting, where learning new tasks degrades performance on earlier ones. We address this problem with Shapley Neuron Valuation (SNV), a principled framework that quantifies Neuron importance in continual learning, grounded in cooperative game theory. SNV selectively freezes important Neurons while keeping others plastic, enabling buffer-free continual learning without expanding architecture. Experiments on ImageNet-1k show that SNV consistently outperforms existing buffer-free methods. In particular, SNV improves accuracy by +2.88% in the class incremental learning and +6.46% in the task incremental learning scenarios compared to the second baseline.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

SNV applies Shapley values to rank neurons for freezing in buffer-free continual learning and reports modest gains on ImageNet-1k, but the cross-task stability of those rankings is not clearly shown.


In short, this paper uses Shapley values to pick which neurons to freeze for continual learning without a replay buffer, and it reports accuracy gains on ImageNet-1k splits. The authors do a good job of grounding the selection in cooperative game theory rather than heuristic scores. Neurons are valued by their marginal contribution to performance on the current task; the high-value ones are protected while the rest adapt. That produces the reported improvements of +2.88% and +6.46% over the second-best buffer-free baseline in the class-incremental and task-incremental settings, respectively. The experiments use a standard large-scale dataset and relevant comparisons, which is positive.

The paper is weaker on cross-task validity. Computing importance on task t does not automatically mean those neurons are the ones whose freezing will best preserve earlier knowledge after training on task t+1. The paper does not appear to include direct checks on ranking stability or on whether the value function accounts for feature reuse across distributions. That leaves the central claim resting on the empirical wins without a clear mechanistic explanation. The authors also need to detail the Shapley approximation scheme, since exact values are not feasible, and to show how sensitive the results are to the freezing threshold.

The work is aimed at the continual-learning subfield, especially architecture- or parameter-based methods for avoiding forgetting. A reader working on similar problems would find the game-theoretic framing worth considering. It deserves peer review: the application is novel enough and the numbers are positive, so referees can dig into the assumptions and the implementation details.
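The letter's remark that exact Shapley values are not feasible can be made concrete. One common approximation, which may or may not be the scheme the paper uses, is permutation sampling: draw random orderings of the players and average each player's marginal contribution as it joins the coalition. The sketch below runs on a toy game; `mc_shapley`, `toy_value`, and the four "neurons" are hypothetical and stand in for a real value function such as accuracy with a subset of neurons active.

```python
import random


def mc_shapley(players, value_fn, num_permutations=2000, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled player orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        prev = value_fn(coalition)
        for p in order:
            coalition.add(p)
            cur = value_fn(coalition)
            totals[p] += cur - prev  # marginal contribution of p
            prev = cur
    return {p: t / num_permutations for p, t in totals.items()}


def toy_value(coalition):
    """Toy game: neurons 0 and 1 are useful only together,
    neuron 2 contributes on its own, neuron 3 contributes nothing."""
    v = 0.0
    if 0 in coalition and 1 in coalition:
        v += 1.0
    if 2 in coalition:
        v += 0.5
    return v


estimates = mc_shapley([0, 1, 2, 3], toy_value)
```

For this toy game the exact Shapley values are 0.5, 0.5, 0.5, and 0, so the estimates for neurons 0 and 1 should land near 0.5. Each sampled permutation costs one value-function evaluation per player, which is the overhead a real implementation would have to budget for.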

Referee Report

2 major / 1 minor

Summary. The manuscript proposes Shapley Neuron Valuation (SNV), a framework grounded in cooperative game theory to quantify the importance of individual neurons for continual learning. SNV computes Shapley values using a value function on the current task's data to rank neurons, then selectively freezes the most important ones while keeping others plastic. This approach aims to mitigate catastrophic forgetting in a buffer-free manner without expanding the network architecture. Experiments on ImageNet-1k report accuracy gains of +2.88% in class-incremental learning and +6.46% in task-incremental learning over the second-best baseline.

Significance. If the central claim holds, SNV would provide a principled, game-theoretic method for identifying neurons to protect across tasks, advancing buffer-free continual learning. The grounding in cooperative game theory and the scale of the ImageNet-1k experiments are strengths that could influence future work on neuron-level regularization. However, the significance hinges on whether task-local Shapley rankings generalize to prevent forgetting, which requires further substantiation.

major comments (2)

[Abstract and §4 (Experiments)] The headline accuracy gains (+2.88% class-incremental, +6.46% task-incremental) are stated without derivation details for the Shapley approximation, baseline descriptions, statistical tests, error bars, or ablation results on the importance threshold or value function. This prevents assessment of whether the central claim is supported or influenced by post-hoc choices.
[§3 (SNV framework)] The load-bearing assumption that Shapley values computed via value function v on the current task's data identify neurons whose freezing prevents forgetting on future tasks lacks supporting derivation or experiment. No analysis demonstrates that marginal contributions under v on task t correlate with cross-task preservation or stability under distribution shift; the selected neurons could simply be those most active on the current distribution.

minor comments (1)

[§3] The notation for the characteristic function v(S) and how it is defined for neuron coalitions in the continual-learning setting could be clarified with an explicit equation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thorough review and constructive suggestions. We address the major comments point by point below, providing clarifications and outlining the revisions we will make to strengthen the manuscript. We believe these changes will better support the central claims of our work on Shapley Neuron Valuation for continual learning.


Referee: [Abstract and §4 (Experiments)] The headline accuracy gains (+2.88% class-incremental, +6.46% task-incremental) are stated without derivation details for the Shapley approximation, baseline descriptions, statistical tests, error bars, or ablation results on the importance threshold or value function. This prevents assessment of whether the central claim is supported or influenced by post-hoc choices.

Authors: We appreciate this observation and agree that more detailed reporting is necessary for reproducibility and to substantiate the claims. In the revised version, we will include: (1) the specific method for approximating Shapley values, such as the number of samples or the algorithm used (e.g., Monte Carlo sampling); (2) comprehensive descriptions of the baselines, including their implementation details and hyperparameters; (3) results presented with error bars from at least 3 independent runs and statistical tests (e.g., Wilcoxon signed-rank test) to confirm significance of the gains; (4) ablation studies varying the importance threshold (e.g., freezing top 10%, 20%, 30% neurons) and different value functions v (e.g., accuracy vs. loss-based). These additions will be placed in an expanded Section 4 and supplementary material. revision: yes

Referee: [§3 (SNV framework)] The load-bearing assumption that Shapley values computed via value function v on the current task's data identify neurons whose freezing prevents forgetting on future tasks lacks supporting derivation or experiment. No analysis demonstrates that marginal contributions under v on task t correlate with cross-task preservation or stability under distribution shift; the selected neurons could simply be those most active on the current distribution.

Authors: This is a valid concern regarding the theoretical grounding. The SNV framework posits that neurons with high Shapley values on the current task's data are those that contribute most to the model's performance on that task, and by freezing them, we preserve the representations learned so far. While we do not provide a formal proof that these marginal contributions directly correlate with future task stability, the empirical results demonstrate that this selection leads to better retention of previous knowledge compared to baselines that do not use such principled selection. To address this, in the revision we will add a subsection in §3 providing a more detailed motivation based on the cooperative game theory interpretation, arguing that high-value neurons are critical for the function approximation on the seen data distribution. We will also include an experiment analyzing the overlap of important neurons across tasks or the forgetting rate when using SNV vs. random freezing. However, we maintain that the primary validation comes from the end-to-end performance improvements on ImageNet-1k, which show reduced catastrophic forgetting. revision: partial
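The overlap analysis promised in this response could be sketched as a pairwise comparison of per-task freeze masks. The Jaccard index used below is an assumed choice of overlap measure, and `jaccard`, `overlap_matrix`, and the three example masks are hypothetical; the paper may define mask overlap differently.

```python
def jaccard(mask_a, mask_b):
    """Jaccard overlap between two 0/1 freeze masks of equal length."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    union = sum(a | b for a, b in zip(mask_a, mask_b))
    return inter / union if union else 1.0


def overlap_matrix(masks):
    """Pairwise overlap between every pair of task masks."""
    return [[jaccard(a, b) for b in masks] for a in masks]


# Hypothetical top-r% masks for three tasks over six neurons.
masks = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
]
M = overlap_matrix(masks)
```

High off-diagonal values would indicate that the same neurons are ranked important across tasks; values near zero would indicate that each task claims a largely separate set.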

Circularity Check

0 steps flagged

SNV applies standard Shapley valuation to neurons with no reduction of the continual-learning claim to fitted inputs or self-citations

full rationale

The paper grounds SNV directly in cooperative game theory by defining neuron importance via the Shapley value of a value function v computed on the current task's data distribution. This construction is independent of the target continual-learning outcome (future-task accuracy after freezing); the link between current-task marginal contributions and cross-task stability is presented as an empirical hypothesis rather than a definitional identity. No equations equate the protection mask to a fit on forgetting metrics, no self-citation supplies a uniqueness theorem that forces the method, and the reported accuracy gains are measured on held-out future tasks rather than being recovered by construction from the valuation step itself. The derivation therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

Abstract-only review limits visibility into parameters and assumptions; the method appears to rest on treating neurons as cooperative-game players and on the existence of a computable importance threshold for freezing.

free parameters (1)

importance threshold for freezing
A cutoff value must exist to decide which neurons to freeze; its selection is not described in the abstract.

axioms (1)

Domain assumption: Shapley values computed over neuron coalitions accurately reflect contribution to task performance in a neural network.
The framework is explicitly grounded in cooperative game theory as stated in the abstract.



[1] [1]

ICML , year =

Linear Mode Connectivity and the Lottery Ticket Hypothesis , author =. ICML , year =

work page

[2] [2]

NeurIPS , year=

Learning both weights and connections for efficient neural network , author=. NeurIPS , year=

work page

[3] [3]

Sokar, Ghada and Mocanu, Decebal Constantin and Pechenizkiy, Mykola , journal =

work page

[4] [4]

Gurbuz, Mustafa B and Dovrolis, Constantine , booktitle =

work page

[5] [5]

IJCAI , year =

On the Discrimination and Consistency for Exemplar-Free Class Incremental Learning , author =. IJCAI , year =

work page

[6] [6]

CVPR , year =

Learning to Prompt for Continual Learning , author =. CVPR , year =

work page

[7] [7]

International Journal of Game Theory , year=

Monotonic solutions of cooperative games , author=. International Journal of Game Theory , year=

work page

[8] [8]

Contributions to the Theory of Games II , publisher =

A Value for n-Person Games , author =. Contributions to the Theory of Games II , publisher =

work page

[9] [9]

ICML , year =

Data Shapley: Equitable Valuation of Data for Machine Learning , author =. ICML , year =

work page

[10] [10]

Neuron Shapley: Discovering the Responsible Neurons , year =

Ghorbani, Amirata and Zou, James Y , booktitle =. Neuron Shapley: Discovering the Responsible Neurons , year =

work page

[11] [11]

2026 , url=

No Forgetting Learning: Buffer-free Continual Learning Classification , author=. 2026 , url=

work page 2026

[12] [12]

ICML , year =

Forget-free Continual Learning with Winning Subnetworks , author =. ICML , year =

work page

[13] [13]

TMLR , year =

Hyperparameters in Continual Learning: A Reality Check , author =. TMLR , year =

work page

[14] [14]

2024 , url =

Hyperparameter Selection in Continual Learning , author =. 2024 , url =

work page 2024

[15] [15]

and Ba, Jimmy , booktitle =

Kingma, Diederik P. and Ba, Jimmy , booktitle =. Adam: A Method for Stochastic Optimization. , year =

work page

[16] [16]

ICLR , year =

Data Shapley in One Training Run , author =. ICLR , year =

work page

[17] [17]

CVPR , year=

ImageNet: A Large-Scale Hierarchical Image Database , author=. CVPR , year=

work page

[18] [18]

, year =

Le, Ya and Yang, Xuan S. , year =. Tiny

work page

[19] [19]

2009 , institution =

Learning Multiple Layers of Features from Tiny Images , author =. 2009 , institution =

work page 2009

[20] Non-stochastic Best Arm Identification and Hyperparameter Optimization. International Conference on Artificial Intelligence and Statistics.

[21] Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization. JMLR.

[22] Titans: Learning to Memorize at Test Time. 2024.

[23] The EU General Data Protection Regulation (GDPR). A Practical Guide, 1st Ed., Cham: Springer International Publishing.

[24] Prediction Error-based Classification for Class-Incremental Learning. ICLR.

[25] Zenke, Friedemann and Poole, Ben and Ganguli, Surya.

[26] Douillard, Arthur and Cord, Matthieu and Ollion, Charles and Robert, Thomas and Valle, Eduardo. ECCV.

[27] Distilling the Knowledge in a Neural Network. 2015.

[28] Overcoming Catastrophic Forgetting in Neural Networks. Proceedings of the National Academy of Sciences.

[29] Li, Zhizhong and Hoiem, Derek. Learning without Forgetting.

[30] Wu, Yue and Chen, Yinpeng and Wang, Lijuan and Ye, Yuancheng and Liu, Zicheng and Guo, Yandong and Fu, Yun. CVPR.

[31] Continual lifelong learning with neural networks: A review. Neural Networks.

[32] Gradient Projection Memory for Continual Learning. ICLR.

[33] Guanxiong Zeng and Yang Chen and Bo Cui and Shan Yu. Nature Machine Intelligence.

[34] Understanding the difficulty of training deep feedforward neural networks. AISTATS.

[35] He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian. ICCV.

[36] Krizhevsky, Alex and Sutskever, Ilya and Hinton, Geoffrey E. ImageNet Classification with Deep Convolutional Neural Networks.

[37] B. Pfülb and A. Gepperth. A comprehensive, application-oriented study of catastrophic forgetting in

[38] Continual Learning: A Review of Techniques, Challenges and Future Directions. TAI.

[39] Wang, Liyuan and Zhang, Xingxing and Su, Hang and Zhu, Jun.

[40] A Comprehensive Survey of Forgetting in Deep Learning Beyond Continual Learning. 2025.

[41] Continual Learning: Applications and the Road Forward. 2024.

[42] iCaRL: Incremental classifier and representation learning. CVPR.

[43] Knowledge distillation: A survey. IJCV.

[44] Encoder-based lifelong learning. ICCV.

[45] Learning without memorizing. CVPR.

[46] Memory-efficient incremental learning through feature adaptation. ECCV.

[47] End-to-end incremental learning. ECCV.

[48] Podnet: Pooled outputs distillation for small-tasks incremental learning. ECCV.

[49] Learning a unified classifier incrementally via rebalancing. CVPR.

[50] Tiny ImageNet Visual Recognition Challenge.

[51] Overcoming catastrophic forgetting with unlabeled data in the wild. ICCV.

[52] Lifelong GAN: Continual learning for conditional image generation. ICCV.

[53] Continual learning through retrieval and imagination. AAAI.

[54] Dark experience for general continual learning: a strong, simple baseline. NeurIPS.

[55] Functional Regularisation for Continual Learning with Gaussian Processes. ICLR.

[56] Continual deep learning by functional regularisation of memorable past. NeurIPS.

[57] Continual learning via sequential function-space variational inference. ICML.

[58] Memory replay GANs: Learning to generate new categories without forgetting. NeurIPS.

[59] Learning multiple layers of features from tiny images. 2009.

[60] He, Kaiming and Zhang, Xiangyu and Ren, Shaoqing and Sun, Jian. CVPR.

[61] Deep learning. Nature.

[62] Lopez-Paz, David and Ranzato, Marc Aurelio. Gradient Episodic Memory for Continual Learning.

[63] Don't forget, there is more than forgetting: new metrics for Continual Learning. 2018.

[64] Aljundi, Rahaf and Chakravarty, Punarjay and Tuytelaars, Tinne. CVPR.

[65] Zhou, Da-Wei and Wang, Qi-Wei and Qi, Zhi-Hong and Ye, Han-Jia and Zhan, De-Chuan and Liu, Ziwei. TPAMI.

[66] Deep Learning.

[67] Vahedifar, Mohammad Ali and Zhang, Qi and Iosifidis, Alexandros. doi:10.5281/zenodo.18774099.

[68] Liu, Xialei and Masana, Marc and Herranz, Luis and Van de Weijer, Joost and López, Antonio M. and Bagdanov, Andrew D.

[69] Aljundi, Rahaf and Babiloni, Francesca and Elhoseiny, Mohamed and Rohrbach, Marcus and Tuytelaars, Tinne. ECCV.

[70] Andrei A. Rusu and Neil C. Rabinowitz and Guillaume Desjardins and Hubert Soyer and James Kirkpatrick and Koray Kavukcuoglu and Razvan Pascanu and Raia Hadsell.

[71] Jaehong Yoon and Eunho Yang and Jeongtae Lee and Sung Ju Hwang.

[72] Douillard, Arthur and Ramé, Alexandre and Couairon, Guillaume and Cord, Matthieu. CVPR.

[73] Liu, Yaoyao and Schiele, Bernt and Sun, Qianru. 2024.

[74] Sun, Qing and Lyu, Fan and Shang, Fanhua and Feng, Wei and Wan, Liang.

[75] Da-Wei Zhou and Hai-Long Sun and Jingyi Ning and Han-Jia Ye and De-Chuan Zhan.

[76] Haoxuan Qu and Hossein Rahmani and Li Xu and Bryan Williams and Jun Liu.

[77] Khetarpal, Khimya and Riemer, Matthew and Rish, Irina and Precup, Doina.

[78] Zhou, Da-Wei and Ye, Han-Jia and Zhan, De-Chuan. 2021.

[79] A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning. ICLR.

[80] Chaudhry, Arslan and Dokania, Puneet K. and Ajanthan, Thalaiyasingam and Torr, Philip H. S. ECCV.