{"work":{"id":"0700d73f-b94d-4cd3-be40-086e4c4544c4","openalex_id":null,"doi":null,"arxiv_id":"1606.04671","raw_key":null,"title":"Progressive Neural Networks","authors":null,"authors_text":"Andrei A. Rusu, Neil C. Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu","year":2016,"venue":"cs.LG","abstract":"Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivity measure, we demonstrate that transfer occurs at both low-level sensory and high-level control layers of the learned policy.","external_url":"https://arxiv.org/abs/1606.04671","cited_by_count":null,"metadata_source":"pith","metadata_fetched_at":"2026-06-29T16:43:39.970658+00:00","pith_arxiv_id":"1606.04671","created_at":"2026-05-09T06:00:36.957724+00:00","updated_at":"2026-06-29T16:43:39.970658+00:00","title_quality_ok":false,"display_title":"Progressive Neural Networks","render_title":"Progressive Neural Networks"},"hub":{"state":{"work_id":"0700d73f-b94d-4cd3-be40-086e4c4544c4","tier":"hub","tier_reason":"10+ Pith inbound or 1,000+ external citations","pith_inbound_count":81,"external_cited_by_count":null,"distinct_field_count":8,"first_pith_cited_at":"2019-06-21T15:44:41+00:00","last_pith_cited_at":"2026-06-16T05:49:43+00:00","author_build_status":"not_needed","summary_status":"needed","contexts_status":"needed","graph_status":"needed","ask_index_status":"not_needed","reader_status":"not_needed","recognition_status":"not_needed","updated_at":"2026-06-29T17:09:04.074040+00:00","tier_text":"hub"},"tier":"hub","role_counts":[{"context_role":"background","n":12},{"context_role":"baseline","n":1}],"polarity_counts":[{"context_polarity":"background","n":10},{"context_polarity":"unclear","n":2},{"context_polarity":"baseline","n":1}],"runs":{"context_extract":{"job_type":"context_extract","status":"succeeded","result":{"enqueued_papers":25},"error":null,"updated_at":"2026-05-14T18:29:29.344996+00:00"},"graph_features":{"job_type":"graph_features","status":"succeeded","result":{"co_cited":[{"title":"An empirical investigation of catastrophic forgetting in gradient-based neural networks.arXiv preprint arXiv:1312.6211","work_id":"2d7055f4-b7d2-4ddb-a9c7-481bd728d360","shared_citers":5},{"title":"Chaudhry et al","work_id":"001da84d-4329-4699-85e9-5431d143af9b","shared_citers":5},{"title":"Adam: A Method for Stochastic Optimization","work_id":"1910796d-9b52-4683-bf5c-de9632c1028b","shared_citers":4},{"title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","work_id":"ed240a10-5b19-406c-baa5-30803f465785","shared_citers":4},{"title":"Proximal Policy Optimization Algorithms","work_id":"240c67fe-d14d-4520-91c1-38a4e272ca19","shared_citers":4},{"title":"van de Ven and Andreas S","work_id":"e2f0ec8a-2a8e-497a-b839-48a0b4e62c27","shared_citers":4},{"title":"and Milan, Kieran and Quan, John and Ramalho, Tiago and Grabska-Barwinska, Agnieszka and Hassabis, Demis and Clopath, Claudia and Kumaran, Dharshan and Hadsell, Raia , year=","work_id":"6c82f0ba-a185-45cb-a5eb-efcbacd90a09","shared_citers":3},{"title":"Efficient lifelong learning with A-GEM.CoRR, abs/1812.00420","work_id":"c1e3aa81-39b5-44a3-a1b5-6fa343e0b956","shared_citers":3},{"title":"Lifelong learning with dynamically expandable networks","work_id":"e6bf3c15-5317-49f7-b85c-e533d387a9cc","shared_citers":3},{"title":"On the Opportunities and Risks of Foundation Models","work_id":"a18039e9-928d-47c9-a836-32656a71bf71","shared_citers":3},{"title":"Playing Atari with Deep Reinforcement Learning","work_id":"736a8ddf-e365-4940-ad58-4699fddedb86","shared_citers":3},{"title":"","work_id":"67145523-6c66-41a2-a3b0-ceb37c0dc852","shared_citers":2},{"title":"Distilling the Knowledge in a Neural Network","work_id":"d927ab1f-17b8-4002-9d09-c3d55764fbad","shared_citers":2},{"title":"Exact solutions to the nonlinear dynamics of learning in deep linear neural networks","work_id":"adbbf9c7-c3a4-4cb7-9a00-c98b12f8a315","shared_citers":2},{"title":"GPT-4o System Card","work_id":"f37bf1c7-4964-4e56-9762-d20da8d9009f","shared_citers":2},{"title":"Investigating continual pretraining in large language models: Insights and implications","work_id":"2bd925f2-1e27-4cf2-aa2a-c916449c32ad","shared_citers":2},{"title":"One big net for everything.Preprint arXiv:1802.08864","work_id":"79101b93-beac-4b27-86b5-e893024a0b77","shared_citers":2},{"title":"On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima","work_id":"01efb355-7c12-4b89-bc42-91ee46ee276b","shared_citers":2},{"title":"OpenAI Gym","work_id":"6af98f3f-f074-41ae-a689-7dd7b4b8efde","shared_citers":2},{"title":"robosuite: A Modular Simulation Framework and Benchmark for Robot Learning","work_id":"d616d4ba-7713-4e3e-8c9e-dfebbb8f1abf","shared_citers":2},{"title":"Rotate: Knowledge graph embedding by relational rotation in complex space","work_id":"0377b575-80ef-446c-a3de-69519d1e8b1e","shared_citers":2},{"title":null,"work_id":"14099ea4-1093-4e83-9092-c2b968641fe6","shared_citers":2},{"title":"11 Preprint","work_id":"7b77683d-809f-47fb-86ca-7176ee4195ee","shared_citers":1},{"title":"13 Subhash Kantamneni, Ziming Liu, and Max Tegmark","work_id":"f390a2e4-5e3a-488e-9634-28eedd3729f0","shared_citers":1}],"time_series":[{"n":1,"year":2019},{"n":1,"year":2022},{"n":1,"year":2023},{"n":29,"year":2026}],"dependency_candidates":[]},"error":null,"updated_at":"2026-05-14T18:30:16.226818+00:00"},"identity_refresh":{"job_type":"identity_refresh","status":"succeeded","result":{"items":[{"title":"Qwen3 Technical Report","outcome":"unchanged","work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e","resolver":"local_arxiv","confidence":0.98,"old_work_id":"25a4e30c-1232-48e7-9925-02fa12ba7c9e"}],"counts":{"fixed":0,"merged":0,"unchanged":1,"quarantined":0,"needs_external_resolution":0},"errors":[],"attempted":1},"error":null,"updated_at":"2026-05-14T18:29:25.466427+00:00"},"summary_claims":{"job_type":"summary_claims","status":"succeeded","result":{"title":"Progressive Neural Networks","claims":[{"claim_text":"Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivi","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Progressive Neural Networks because it crossed a citation-hub threshold.","role_counts":[]},"error":null,"updated_at":"2026-05-14T18:30:16.236202+00:00"}},"summary":{"title":"Progressive Neural Networks","claims":[{"claim_text":"Learning to solve complex sequences of tasks--while both leveraging transfer and avoiding catastrophic forgetting--remains a key obstacle to achieving human-level intelligence. The progressive networks approach represents a step forward in this direction: they are immune to forgetting and can leverage prior knowledge via lateral connections to previously learned features. We evaluate this architecture extensively on a wide variety of reinforcement learning tasks (Atari and 3D maze games), and show that it outperforms common baselines based on pretraining and finetuning. Using a novel sensitivi","claim_type":"abstract","evidence_strength":"source_metadata"}],"why_cited":"Pith tracks Progressive Neural Networks because it crossed a citation-hub threshold.","role_counts":[]},"graph":{"co_cited":[{"title":"An empirical investigation of catastrophic forgetting in gradient-based neural networks.arXiv preprint arXiv:1312.6211","work_id":"2d7055f4-b7d2-4ddb-a9c7-481bd728d360","shared_citers":5},{"title":"Chaudhry et al","work_id":"001da84d-4329-4699-85e9-5431d143af9b","shared_citers":5},{"title":"Adam: A Method for Stochastic Optimization","work_id":"1910796d-9b52-4683-bf5c-de9632c1028b","shared_citers":4},{"title":"BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding","work_id":"ed240a10-5b19-406c-baa5-30803f465785","shared_citers":4},{"title":"Proximal Policy Optimization Algorithms","work_id":"240c67fe-d14d-4520-91c1-38a4e272ca19","shared_citers":4},{"title":"van de Ven and Andreas S","work_id":"e2f0ec8a-2a8e-497a-b839-48a0b4e62c27","shared_citers":4},{"title":"and Milan, Kieran and Quan, John and Ramalho, Tiago and Grabska-Barwinska, Agnieszka and Hassabis, Demis and Clopath, Claudia and Kumaran, Dharshan and Hadsell, Raia , year=","work_id":"6c82f0ba-a185-45cb-a5eb-efcbacd90a09","shared_citers":3},{"title":"Efficient lifelong learning with A-GEM.CoRR, abs/1812.00420","work_id":"c1e3aa81-39b5-44a3-a1b5-6fa343e0b956","shared_citers":3},{"title":"Lifelong learning with dynamically expandable networks","work_id":"e6bf3c15-5317-49f7-b85c-e533d387a9cc","shared_citers":3},{"title":"On the Opportunities and Risks of Foundation Models","work_id":"a18039e9-928d-47c9-a836-32656a71bf71","shared_citers":3},{"title":"Playing Atari with Deep Reinforcement Learning","work_id":"736a8ddf-e365-4940-ad58-4699fddedb86","shared_citers":3},{"title":"","work_id":"67145523-6c66-41a2-a3b0-ceb37c0dc852","shared_citers":2},{"title":"Distilling the Knowledge in a Neural Network","work_id":"d927ab1f-17b8-4002-9d09-c3d55764fbad","shared_citers":2},{"title":"Exact solutions to the nonlinear dynamics of learning in deep linear neural networks","work_id":"adbbf9c7-c3a4-4cb7-9a00-c98b12f8a315","shared_citers":2},{"title":"GPT-4o System Card","work_id":"f37bf1c7-4964-4e56-9762-d20da8d9009f","shared_citers":2},{"title":"Investigating continual pretraining in large language models: Insights and implications","work_id":"2bd925f2-1e27-4cf2-aa2a-c916449c32ad","shared_citers":2},{"title":"One big net for everything.Preprint arXiv:1802.08864","work_id":"79101b93-beac-4b27-86b5-e893024a0b77","shared_citers":2},{"title":"On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima","work_id":"01efb355-7c12-4b89-bc42-91ee46ee276b","shared_citers":2},{"title":"OpenAI Gym","work_id":"6af98f3f-f074-41ae-a689-7dd7b4b8efde","shared_citers":2},{"title":"robosuite: A Modular Simulation Framework and Benchmark for Robot Learning","work_id":"d616d4ba-7713-4e3e-8c9e-dfebbb8f1abf","shared_citers":2},{"title":"Rotate: Knowledge graph embedding by relational rotation in complex space","work_id":"0377b575-80ef-446c-a3de-69519d1e8b1e","shared_citers":2},{"title":null,"work_id":"14099ea4-1093-4e83-9092-c2b968641fe6","shared_citers":2},{"title":"11 Preprint","work_id":"7b77683d-809f-47fb-86ca-7176ee4195ee","shared_citers":1},{"title":"13 Subhash Kantamneni, Ziming Liu, and Max Tegmark","work_id":"f390a2e4-5e3a-488e-9634-28eedd3729f0","shared_citers":1}],"time_series":[{"n":1,"year":2019},{"n":1,"year":2022},{"n":1,"year":2023},{"n":29,"year":2026}],"dependency_candidates":[]},"authors":[]}}