Gradient modifications before Adam inflate old-direction learning rates via the second-moment term, but routing modifications solely to the first moment with adaptive strength prevents collapse and yields 3.8-4.8 unit gains over baselines in 8- and 16-domain continual learning.
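The mechanism in the summary above can be illustrated with a toy NumPy sketch. It is not the citing paper's implementation: the two-coordinate setup, the fixed damping vector `scale`, and the helper name `adam_update` are all assumptions made for illustration, and the adaptive strength mentioned in the summary is not modeled. Coordinate 0 stands in for an "old" direction whose gradient is damped by a factor of 0.1.

```python
import numpy as np

def adam_update(raw_grads, scale, modify_second_moment,
                beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the final bias-corrected Adam update direction (learning rate excluded).

    scale: per-coordinate damping applied to each gradient (hypothetical stand-in
           for a projection or other gradient modification).
    modify_second_moment: if True, the damped gradient feeds both moments
           (modification applied before Adam); if False, only the first moment
           sees the damped gradient and the second moment uses the raw gradient.
    """
    m = np.zeros_like(raw_grads[0])
    v = np.zeros_like(raw_grads[0])
    update = np.zeros_like(raw_grads[0])
    for t, g in enumerate(raw_grads, start=1):
        g_mod = g * scale
        g_for_v = g_mod if modify_second_moment else g
        m = beta1 * m + (1 - beta1) * g_mod
        v = beta2 * v + (1 - beta2) * g_for_v ** 2
        m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
        v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
        update = m_hat / (np.sqrt(v_hat) + eps)
    return update

# Constant raw gradient; coordinate 0 is the "old" direction, damped by 0.1.
raw_grads = [np.array([1.0, 1.0])] * 100
scale = np.array([0.1, 1.0])

pre_adam = adam_update(raw_grads, scale, modify_second_moment=True)
first_moment_only = adam_update(raw_grads, scale, modify_second_moment=False)
print("modified before Adam:   ", pre_adam)
print("first-moment-only route:", first_moment_only)
```

In this toy setting, damping the gradient before Adam also shrinks the second-moment estimate in the old direction, so the normalization cancels the damping and the old-direction update is about as large as the new-direction one. Routing the damped gradient to the first moment only leaves the second moment at its raw scale, so the old-direction update stays roughly ten times smaller.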
SplitLoRA: Balancing Stability and Plasticity in Continual Learning Through Gradient Space Splitting
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
iGSP uses implicit gradient subspace projection in two phases to enable efficient continual adaptation of vision-language models, claiming SOTA accuracy with 42.7% fewer trainable parameters and 86.9% less total parameter growth.
citing papers explorer
iGSP: Implicit Gradient Subspace Projection for Efficient Continual Learning of Vision-Language Models