Reinitializing weights vs units for maintaining plasticity in neural networks
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Learning to Forget: Continual Learning with Adaptive Weight Decay
FADE adapts per-parameter weight decay rates online via approximate meta-gradient descent to improve controlled forgetting over fixed decay in online tracking and streaming classification.
- Attribution-Based Neuron Utility for Plasticity Restoration in Deep Networks
GXD estimates the first-order functional cost of replacing a neuron via gradient attribution to make adaptive resets more reliable for preserving plasticity in continual learning.