Progressive Neural Networks

Canonical reference. 77% of citing Pith papers cite this work as background.

80 Pith papers citing it
Background 77% of classified citations
abstract

Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivity measure, we demonstrate that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
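The two properties named in the abstract, immunity to forgetting and reuse of earlier features through lateral connections, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a two-layer column, lateral connections only into the output layer, and random weights in place of training. The names `Column`, `ProgressiveNet`, and `perturb` are hypothetical and chosen only for illustration.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

class Column:
    """One column (hypothetical): input -> hidden -> output."""
    def __init__(self, n_in, n_hidden, n_out, n_prev, rng):
        self.W1 = rand_matrix(n_hidden, n_in, rng)
        self.W2 = rand_matrix(n_out, n_hidden, rng)
        # One lateral matrix per earlier column, mapping that column's
        # hidden features into this column's output layer.
        self.U2 = [rand_matrix(n_out, n_hidden, rng) for _ in range(n_prev)]

    def forward(self, x, prev_hiddens):
        h = relu(matvec(self.W1, x))
        out = matvec(self.W2, h)
        for U, h_prev in zip(self.U2, prev_hiddens):
            out = add(out, matvec(U, h_prev))
        return h, out

class ProgressiveNet:
    """Grows by one column per task; earlier columns are never modified."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        self.n_in, self.n_hidden, self.n_out = n_in, n_hidden, n_out
        self.rng = random.Random(seed)
        self.columns = []

    def add_column(self):
        n_prev = len(self.columns)
        self.columns.append(
            Column(self.n_in, self.n_hidden, self.n_out, n_prev, self.rng))

    def forward(self, x, task):
        # Task k uses columns 0..k; columns added later play no part.
        hiddens = []
        for col in self.columns[:task + 1]:
            h, out = col.forward(x, hiddens)
            hiddens.append(h)
        return out

def perturb(column, rng, scale=0.1):
    """Stand-in for a training update (assumption: only the newest column changes)."""
    for M in [column.W1, column.W2] + column.U2:
        for row in M:
            for j in range(len(row)):
                row[j] += rng.uniform(-scale, scale)

net = ProgressiveNet(n_in=3, n_hidden=4, n_out=2)
x = [0.5, -1.0, 2.0]

net.add_column()                      # column for task 0
before = net.forward(x, task=0)

net.add_column()                      # column for task 1, with laterals from column 0
perturb(net.columns[1], net.rng)      # "train" only the new column
after = net.forward(x, task=0)
out1 = net.forward(x, task=1)         # uses column 0's hidden features laterally

print(before == after)
```

In this sketch the task-0 output is unchanged after the second column is added and updated, because the update touches only the new column's parameters and the task-0 forward pass never reads them. The cost of this design is that parameters grow with every added task.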

citation-role summary

background: 12 · baseline: 1

representative citing papers

ReConText3D: Replay-based Continual Text-to-3D Generation

cs.CV · 2026-04-15 · conditional · novelty 8.0

ReConText3D is the first replay-memory framework for continual text-to-3D generation that prevents catastrophic forgetting on new textual categories while preserving quality on previously seen classes.

Continuous-time Optimal Stopping through Deep Reinforcement Learning

cs.LG · 2026-06-16 · unverdicted · novelty 7.0

CARLOS employs an aggregate deep neural network trained on progressively finer time grids with adaptive sampling to learn continuous-time exercise boundaries for optimal stopping, delivering higher values than discrete Bermudan methods.

Continual Learning of Domain-Invariant Representations

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Introduces replay-based continual learning with sequential invariance alignment to learn domain-invariant representations, outperforming baselines on generalization to unseen domains across six datasets in vision, medicine, manufacturing, and ecology.

A Generalist Agent

cs.AI · 2022-05-12 · accept · novelty 7.0

Gato is a multi-modal, multi-task, multi-embodiment generalist policy using one transformer network to handle text, vision, games, and robotics tasks.

NetTailor: Tuning the Architecture, Not Just the Weights

cs.CV · 2019-06-29 · unverdicted · novelty 7.0

NetTailor adapts CNN architecture for new tasks by assembling pre-trained universal blocks with task-specific layers, trained via activation mimicry and complexity penalties to match accuracy while reducing size for simpler tasks.

The Long-Term Effects of Data Selection in LLM Fine-Tuning

cs.LG · 2026-05-28 · unverdicted · novelty 6.0

Short-term data selectors in multi-stage LLM fine-tuning can slow future learning and increase forgetting, formalized as myopic selection with a proposed LHAS objective to address it.

Janus-LoRA: A Balanced Low-Rank Adaptation for Continual Learning

cs.CV · 2026-05-27 · unverdicted · novelty 6.0

Janus-LoRA uses gradient rectification via online subspace estimation and a decoupled margin loss to enforce parameter orthogonality and feature separation in LoRA-based continual learning, reporting new SOTA results.

Understanding Goal Generalisation in Sequential Reinforcement Learning

cs.LG · 2026-05-22 · unverdicted · novelty 6.0

Empirical analysis of over 100 sequential RL training pipelines across 250+ OOD environments finds salient features drive generalization and early goals persist, with latent policy gradients simulating latent variable evolution to predict OOD behavior from training history.
