Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference

Gerald Tesauro; Ignacio Cases; Irina Rish; Matthew Riemer; Miao Liu; Robert Ajemian; Yuhai Tu

arxiv: 1810.11910 · v3 · pith:XUYN2KRVnew · submitted 2018-10-29 · 💻 cs.LG · cs.AI · stat.ML

Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro

classification 💻 cs.LG · cs.AI · stat.ML

keywords: learning, continual, interference, non-stationary, transfer, across, experiments, future

Lack of performance when it comes to continual learning over non-stationary distributions of data remains a major challenge in scaling neural network learning to more human-realistic settings. In this work we propose a new conceptualization of the continual learning problem in terms of a temporally symmetric trade-off between transfer and interference that can be optimized by enforcing gradient alignment across examples. We then propose a new algorithm, Meta-Experience Replay (MER), that directly exploits this view by combining experience replay with optimization-based meta-learning. This method learns parameters that make interference based on future gradients less likely and transfer based on future gradients more likely. We conduct experiments across continual lifelong supervised learning benchmarks and non-stationary reinforcement learning environments, demonstrating that our approach consistently outperforms recently proposed baselines for continual learning. Our experiments show that the gap between the performance of MER and baseline algorithms grows both as the environment gets more non-stationary and as the fraction of the total experiences stored gets smaller.
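
As a rough illustration of the gradient-alignment view described in the abstract, the sketch below computes the dot product between the loss gradients of two examples. It assumes the common reading in which a positive dot product means a gradient step on one example also reduces the loss on the other (transfer), and a negative one means it increases it (interference). The linear model, the squared-error loss, and every function name here are hypothetical choices made for illustration. This is not the MER algorithm itself, and it is not the authors' code.

```python
# Illustrative sketch only: a linear model with squared-error loss,
# chosen for simplicity. All names are hypothetical.

def grad_squared_error(theta, x, y):
    """Gradient of 0.5 * (theta . x - y)^2 with respect to theta."""
    pred = sum(t * xi for t, xi in zip(theta, x))
    err = pred - y
    return [err * xi for xi in x]


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def alignment(theta, example_i, example_j):
    """Dot product of the two per-example gradients evaluated at theta."""
    g_i = grad_squared_error(theta, *example_i)
    g_j = grad_squared_error(theta, *example_j)
    return dot(g_i, g_j)


def describe(value):
    """Label the sign of a gradient dot product."""
    if value > 0:
        return "transfer"
    if value < 0:
        return "interference"
    return "neutral"


theta = [0.0, 0.0]
a = ([1.0, 0.0], 1.0)
b = ([2.0, 0.0], 2.0)    # consistent with a: gradients point the same way
c = ([1.0, 0.0], -1.0)   # same input as a, conflicting target
d = ([0.0, 1.0], 1.0)    # orthogonal input: gradients do not interact

for name, other in (("b", b), ("c", c), ("d", d)):
    print("a vs", name, "->", describe(alignment(theta, a, other)))
```

In this toy setting, examples with consistent targets produce aligned gradients, while the pair with conflicting targets produces opposing ones. A method that encourages alignment across stored and incoming examples would favor parameters where the first case is more common than the second.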

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Analytic Drift Resister for Non-Exemplar Continual Graph Learning
cs.LG 2026-04 unverdicted novelty 6.0

ADR achieves theoretically zero-forgetting class-incremental graph learning by combining backpropagation adaptation with ridge-regression-based layer-wise merging of GNN linear transformations.

Continual Reinforcement Learning with Diversity Exploration and Adversarial Self-Correction
cs.LG 2019-06 unverdicted novelty 6.0

CDAN framework uses diversity exploration and adversarial self-correction for continual RL in continuous control, evaluated on new CAM environment with NSD metric showing 18.35% NSD improvement over baseline.

MANGO: Meta-Adaptive Network Gradient Optimization for Online Continual Learning
cs.LG 2026-05 unverdicted novelty 5.0

MANGO combines gradient-gating and meta-learned regularization to balance stability and plasticity in single-pass online continual learning, reporting state-of-the-art accuracy on CLEAR-10, CIFAR-100, and Tiny-ImageNet.

Scalable Multi-Task Learning through Spiking Neural Networks with Adaptive Task-Switching Policy for Intelligent Autonomous Agents
cs.NE 2025-04 unverdicted novelty 5.0

SwitchMT uses adaptive task-switching in deep spiking Q-networks with active dendrites to reduce task interference in multi-task RL, achieving competitive Atari scores without added network complexity.