Progressive prompts: Continual learning for language models
4 Pith papers cite this work.
citing papers explorer
- FOREVER: Forgetting Curve-Inspired Memory Replay for Language Model Continual Learning
  FOREVER aligns replay intervals in LLM continual learning with a model-centric time based on optimizer update magnitudes and an Ebbinghaus-inspired forgetting curve to reduce catastrophic forgetting.
- Muon-OGD: Muon-based Spectral Orthogonal Gradient Projection for LLM Continual Learning
  Muon-OGD introduces a spectral-norm constrained orthogonal projection method, solved via dual iterations and Newton-Schulz approximations, to improve the stability-plasticity trade-off in sequential LLM adaptation.
- Little by Little: Continual Learning via Incremental Mixture of Rank-1 Associative Memory Experts
  MoRAM frames continual learning as the incremental addition of rank-1 adapters, viewed as self-activating key-value associative memory units in a mixture-of-experts setup.
- A Comprehensive Overview of Large Language Models
  A survey paper providing an overview of Large Language Models, their background, and recent advances in the field.