Maintaining plasticity in deep continual learning

Shibhansh Dohare, J · 2023 · arXiv 2306.13812

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Predicting Plasticity in Deep Continual Learning: A Theoretical Perspective

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

Optimization readiness, defined from gradient strength and reliability, lower-bounds one-step optimization gain and outperforms rank-based diagnostics in predicting neural network trainability across continual learning settings.

When is Warmstarting Effective for Scaling Language Models?

cs.LG · 2026-05-13 · unverdicted · novelty 6.0

A 2x growth factor in model warmstarting yields reliable training speedups for language models under 20 tokens/parameter budgets, with an empirical upper bound on effective growth factors.

Adapt or Forget: Provable Tradeoffs Between Adam and SGD in Nonstationary Optimization

stat.ML · 2026-05-05 · unverdicted · novelty 6.0

Adam's adaptive preconditioning and first-moment averaging improve high-probability tracking error in noise-dominated nonstationary regimes but can increase it under strong drift, where SGD achieves a smaller floor, with explicit beta-dependent bounds.

FlashSAC: Fast and Stable Off-Policy Reinforcement Learning for High-Dimensional Robot Control

cs.LG · 2026-04-06 · unverdicted · novelty 6.0 · 2 refs

FlashSAC improves training speed and final performance of off-policy RL on high-dimensional robot tasks by reducing update frequency, increasing model scale, and bounding norms to limit critic error accumulation.

Attribution-Based Neuron Utility for Plasticity Restoration in Deep Networks

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

GXD estimates the first-order functional cost of replacing a neuron via gradient attribution to make adaptive resets more reliable for preserving plasticity in continual learning.

Plasticity Loss in Deep Reinforcement Learning: A Survey

cs.AI · 2024-11-07 · unverdicted · novelty 4.0

Survey unifies the definition of plasticity loss in DRL, taxonomizes over 50 mitigations, identifies evaluation gaps, and finds general regularization often outperforms domain-specific methods.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Adapt or Forget: Provable Tradeoffs Between Adam and SGD in Nonstationary Optimization

stat.ML · 2026-05-05 · unverdicted · none · ref 113

Adam's adaptive preconditioning and first-moment averaging improve high-probability tracking error in noise-dominated nonstationary regimes but can increase it under strong drift, where SGD achieves a smaller floor, with explicit beta-dependent bounds.
