Variational Continual Learning
3 Pith papers cite this work.
abstract
This paper develops variational continual learning (VCL), a simple but general framework for continual learning that fuses online variational inference (VI) and recent advances in Monte Carlo VI for neural networks. The framework can successfully train both deep discriminative models and deep generative models in complex continual learning settings where existing tasks evolve over time and entirely new tasks emerge. Experimental results show that VCL outperforms state-of-the-art continual learning methods on a variety of tasks, avoiding catastrophic forgetting in a fully automatic way.
fields
cs.LG 3
verdicts
UNVERDICTED 3
citing papers
- Preconditioned DeltaNet: Curvature-aware Sequence Modeling for Linear Recurrences
  Preconditioned delta-rule models with a diagonal curvature approximation improve upon standard DeltaNet, GDN, and KDA by better approximating the test-time regression objective.
- Online Bayesian Calibration under Gradual and Abrupt System Changes
  BRPC is an online Bayesian calibration framework that decouples parameter tracking from discrepancy modeling for gradual nonstationarity and adds restart mechanisms to handle abrupt regime shifts.
- Pushing the Limits of Distillation-Based Continual Learning via Classifier-Proximal Lightweight Plugins
  DLC inserts lightweight classifier-proximal plugins into distillation-based continual learning to achieve 8% accuracy gains on large benchmarks with only 4% extra backbone parameters.