Lifelong Learning Starting From Zero

Claes Strannegård; Filip Slottner Seholm; Fredrik Mäkeläinen; Herman Carlström; Morteza Haghir Chehreghani; Niklas Engsner

arxiv: 1906.09852 · v1 · pith:WUXMHHQ7new · submitted 2019-06-24 · 💻 cs.LG · stat.ML

Pith reviewed 2026-05-25 17:24 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords: lifelong learning, continual learning, neural networks, neuroplasticity, dynamic architectures, blank slate, adaptive networks, node expansion

The pith

A neural network that starts with zero nodes learns continually by applying four rules: expansion, generalization, forgetting, and backpropagation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a neural network model that begins as a blank slate with no nodes and grows continuously in response to environmental signals. It applies expansion to add nodes for new input combinations, generalization to create nodes that cover broader patterns, forgetting to remove low-use nodes, and backpropagation to adjust parameters. The model is evaluated on accuracy, energy efficiency, and versatility, with claims of better performance than other network models in several cases. A sympathetic reader would care because the approach addresses continual adaptation without fixed initial structures or the need for retraining from scratch on new data.

Core claim

The central claim is that a deep neural-network model inspired by neuroplasticity, beginning as a blank slate with no nodes, can develop continuously according to four rules—expansion, generalization, forgetting, and backpropagation—and thereby achieve competitive or superior performance in accuracy, energy efficiency, and versatility compared to other network models.

What carries the argument

The four rules—expansion (adding nodes to memorize new input combinations), generalization (adding nodes that generalize from existing ones), forgetting (removing nodes of relatively little use), and backpropagation (fine-tuning parameters)—that together drive network development from an initial state of zero nodes.
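To make the interplay of the four rules concrete, here is a minimal Python sketch. It is not the paper's LL0 algorithm. It assumes a nearest-prototype classifier in which each node stores a center vector and a class label. The class name `GrowingPrototypeNet`, the parameters `radius`, `lr`, and `min_uses`, and all trigger conditions are hypothetical. Backpropagation is replaced by a simple move-toward-the-input update, and generalization by adding a midpoint node between two nearby nodes that share a label.

```python
import numpy as np


class GrowingPrototypeNet:
    """Toy model that starts with zero nodes. Illustrative only, not LL0."""

    def __init__(self, radius=1.0, lr=0.1, min_uses=2):
        self.centers, self.labels, self.uses = [], [], []
        self.radius = radius      # hypothetical match distance
        self.lr = lr              # hypothetical step size for fine-tuning
        self.min_uses = min_uses  # hypothetical survival threshold

    def _nearest(self, x):
        """Return (index, distance) of the closest node, or (None, inf)."""
        if not self.centers:
            return None, np.inf
        dists = [float(np.linalg.norm(x - c)) for c in self.centers]
        i = int(np.argmin(dists))
        return i, dists[i]

    def _add(self, center, label, uses):
        self.centers.append(center)
        self.labels.append(label)
        self.uses.append(uses)

    def predict(self, x):
        i, _ = self._nearest(np.asarray(x, dtype=float))
        return None if i is None else self.labels[i]

    def observe(self, x, y):
        x = np.asarray(x, dtype=float)
        i, dist = self._nearest(x)
        if i is not None and self.labels[i] == y and dist <= self.radius:
            # Fine-tuning stand-in (not backpropagation): move node toward x.
            self.centers[i] = self.centers[i] + self.lr * (x - self.centers[i])
            self.uses[i] += 1
        else:
            # Expansion: memorize the unmatched input as a new node.
            self._add(x.copy(), y, 1)

    def generalize(self):
        """Add a midpoint node between the two closest same-label nodes."""
        best = None
        n = len(self.centers)
        for a in range(n):
            for b in range(a + 1, n):
                if self.labels[a] != self.labels[b]:
                    continue
                d = float(np.linalg.norm(self.centers[a] - self.centers[b]))
                if d <= 2 * self.radius and (best is None or d < best[0]):
                    best = (d, a, b)
        if best is None:
            return False
        _, a, b = best
        mid = (self.centers[a] + self.centers[b]) / 2.0
        if any(np.allclose(mid, c) for c in self.centers):
            return False  # an equivalent node already exists
        # Assumption: a generalized node starts at the survival threshold.
        self._add(mid, self.labels[a], self.min_uses)
        return True

    def forget(self):
        """Forgetting: drop nodes used fewer than min_uses times."""
        keep = [k for k, u in enumerate(self.uses) if u >= self.min_uses]
        self.centers = [self.centers[k] for k in keep]
        self.labels = [self.labels[k] for k in keep]
        self.uses = [self.uses[k] for k in keep]
```

In this sketch the network size is driven entirely by the data stream: calling `observe` on a stream of `(x, y)` pairs grows the node list, and periodic calls to `generalize` and `forget` reshape it. How LL0 actually schedules and triggers its rules is not specified in the text above.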

If this is right

The network can adapt to new data indefinitely without requiring a predefined initial size or structure.
Energy use stays low because only useful nodes are retained after forgetting.
Versatility grows as the network builds both specific and generalized representations over time.
In several evaluated cases the model shows higher accuracy than comparison networks while using fewer resources.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the rules function as stated, networks could in principle scale their capacity exactly to the complexity of experienced data rather than over- or under-provisioning in advance.
The forgetting rule might reduce interference between old and new tasks, but this would need direct measurement on long task sequences.
Energy-efficiency gains could be quantified by tracking total node count and forward-pass cost across an extended sequence of tasks.
The same developmental rules might be combined with other plasticity mechanisms to handle even more abrupt distribution shifts.
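The bookkeeping suggested above for energy efficiency could look like the following sketch. It assumes that forward-pass cost is approximated by one multiply-accumulate per incoming connection, which is a proxy chosen for illustration and not necessarily the energy measure used in the paper. The `CostLog` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CostLog:
    """Records network size and an estimated forward-pass cost per task."""

    node_counts: list = field(default_factory=list)
    mac_counts: list = field(default_factory=list)

    def record(self, fan_in):
        """Snapshot the network after a task.

        fan_in maps each node id to the list of node ids feeding into it.
        Cost proxy (assumption): one multiply-accumulate per connection.
        """
        self.node_counts.append(len(fan_in))
        self.mac_counts.append(sum(len(src) for src in fan_in.values()))

    def cumulative_macs(self, samples_per_task):
        """Total estimated cost if task k processes samples_per_task[k] inputs."""
        return sum(m * n for m, n in zip(self.mac_counts, samples_per_task))


log = CostLog()
# After task 1: one hidden node with two inputs, one output node.
log.record({"h1": ["x1", "x2"], "out": ["h1"]})
# After task 2: the network grew by one hidden node.
log.record({"h1": ["x1", "x2"], "h2": ["x1"], "out": ["h1", "h2"]})
total = log.cumulative_macs([100, 50])
```

Plotting `node_counts` and `mac_counts` against the task index would show whether forgetting keeps growth bounded over a long sequence.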

Load-bearing premise

That the four rules of expansion, generalization, forgetting, and backpropagation can be implemented together to deliver the claimed gains in accuracy, energy efficiency, and versatility.

What would settle it

Implement the model and test it on sequential lifelong learning benchmarks such as permuted MNIST or split CIFAR-10; if accuracy or resource metrics do not exceed those of fixed-architecture networks or other continual-learning baselines over multiple tasks, the performance claim does not hold.
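A test of this kind needs a task sequence and a retention measure. The sketch below shows one way to build a permuted-feature sequence and to compute an average forgetting score from an accuracy matrix. It assumes flattened inputs of shape `(n_samples, n_features)`; the function names are hypothetical, and the forgetting score (best earlier accuracy minus final accuracy, averaged over earlier tasks) is one commonly used definition rather than anything prescribed by the paper.

```python
import numpy as np


def make_permuted_tasks(X, n_tasks, seed=0):
    """Return a list of n_tasks copies of X, each with its own fixed
    feature permutation. Task 0 keeps the original feature order."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    tasks = [X.copy()]
    for _ in range(n_tasks - 1):
        perm = rng.permutation(n_features)
        tasks.append(X[:, perm])
    return tasks


def average_forgetting(acc):
    """acc[i, j] is the accuracy on task j after training on tasks 0..i.

    For each task except the last, take the best accuracy reached before
    the final training stage and subtract the final accuracy."""
    acc = np.asarray(acc, dtype=float)
    n = acc.shape[0]
    drops = [acc[j:n - 1, j].max() - acc[n - 1, j] for j in range(n - 1)]
    return float(np.mean(drops))


# Small synthetic stand-in for flattened images (not real MNIST data).
X = np.arange(12, dtype=float).reshape(3, 4)
tasks = make_permuted_tasks(X, n_tasks=3)

# Hypothetical accuracy matrix for three tasks, for illustration only.
acc = [[0.90, 0.00, 0.00],
       [0.80, 0.90, 0.00],
       [0.70, 0.85, 0.90]]
score = average_forgetting(acc)
```

Running the growing model and each baseline over the same `tasks` list, and filling `acc` from their measured accuracies, would give directly comparable retention numbers.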

Figures

Figures reproduced from arXiv: 1906.09852 by Claes Strannegård, Filip Slottner Seholm, Fredrik Mäkeläinen, Herman Carlström, Morteza Haghir Chehreghani, Niklas Engsner.

**Figure 2.** The network shown is created following receipt of the first data point. The …

**Figure 3.** Illustration of the generalization rule. Presuppose the network to the left.

**Figure 4.** Left: The network produced by LL0 on the spirals data set, with the two output nodes and their connections omitted for the sake of readability. The architecture converged after less than one epoch with about 160 nodes, depth six, and max fan-in five. The yellow node was created by the generalization rule. Right: The spirals data set with the generated decision boundary. Input points that triggered the extensio…

**Figure 5.** Results on the spirals data set. Left: LL0 reaches 100% accuracy on the test set after less than one epoch. By contrast, the best baseline model FC10*3 reaches 80% accuracy after about 350 epochs. Right: FC10*3 consumes over 1000 times more energy than LL0 to reach 80% accuracy.

**Figure 6.** Results on the digits data set. Left: All models eventually reach approximately the same accuracy. LL0 learns relatively fast. Right: The energy curves converge.

**Figure 7.** Results on the radiology data set. Left: LL0 learns about ten times faster than the baselines. Right: LL0 consumes about 10% as much energy.

**Figure 8.** Results on the wine data set. Left: LL0 learns much more quickly, but peaks at an accuracy level slightly below the best baseline. Right: Energy consumption.

4 Conclusion. This paper has presented a model for lifelong learning inspired by four types of neuroplasticity. The LL0 model can be used for constructing networks automatically instead of manually. It starts from a blank slate and develops its deep …

Original abstract

We present a deep neural-network model for lifelong learning inspired by several forms of neuroplasticity. The neural network develops continuously in response to signals from the environment. In the beginning, the network is a blank slate with no nodes at all. It develops according to four rules: (i) expansion, which adds new nodes to memorize new input combinations; (ii) generalization, which adds new nodes that generalize from existing ones; (iii) forgetting, which removes nodes that are of relatively little use; and (iv) backpropagation, which fine-tunes the network parameters. We analyze the model from the perspective of accuracy, energy efficiency, and versatility and compare it to other network models, finding better performance in several cases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper sketches a zero-start neural net grown by four plasticity rules, but the abstract gives no results or details to back the performance claims.

The paper's main contribution is a lifelong learning model that starts with an empty network and adds nodes according to four rules drawn from neuroplasticity: expansion for new input combinations, generalization for broader patterns, forgetting for pruning, and backpropagation for tuning. This zero-start approach is the novel framing. It does a reasonable job outlining how these rules could allow continuous development without initial structure, and the emphasis on energy efficiency and versatility is relevant for real-world adaptive systems.

The soft spot is the complete absence of any implementation details or experimental results in the text. The claims of better performance in several cases are stated but not supported by numbers or comparisons, so it's difficult to judge whether the rules actually lead to those outcomes or whether they can be implemented without additional parameters. Zero free parameters are listed, which is promising if demonstrated. If the full paper has the algorithms and tests, that would address this. As presented, the weakest point is the unverified assumption that these rules can be put into practice to yield the stated improvements.

This paper is for people working on continual learning who are open to biologically inspired architectures. A reader might get some ideas from the rule set, but the text doesn't provide enough to build on directly or to cite without further validation. I would recommend engaging with it only after seeing a version with concrete experiments and code. For peer review, it probably needs that evidence first to be worth a referee's time, though the idea itself is coherent and could be developed further.

Referee Report

2 major / 0 minor

Summary. The manuscript proposes a deep neural-network model for lifelong learning that starts with zero nodes and evolves continuously via four rules: (i) expansion to add nodes for new input combinations, (ii) generalization to add nodes that generalize from existing ones, (iii) forgetting to remove low-utility nodes, and (iv) backpropagation to fine-tune parameters. The model is claimed to outperform existing networks in accuracy, energy efficiency, and versatility in several cases.

Significance. A working implementation of the four rules that demonstrably improves the three metrics while starting from a blank slate would be a notable contribution to lifelong learning, especially given the reported zero free parameters. However, the manuscript supplies no implementation, experiments, data, or quantitative results, so the significance cannot be evaluated from the given text.

major comments (2)

Abstract: performance claims ('better performance in several cases') are stated without any experimental setup, datasets, baselines, metrics, or numerical results, rendering the central claims unverifiable.
Abstract: the four rules are described at a high level but no pseudocode, algorithmic specification, or reduction to concrete operations is provided, so the weakest assumption (implementability yielding the claimed gains) cannot be checked.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the comments. We respond point by point to the major comments.

Referee: Abstract: performance claims ('better performance in several cases') are stated without any experimental setup, datasets, baselines, metrics, or numerical results, rendering the central claims unverifiable.

Authors: The manuscript is a conceptual proposal whose analysis of accuracy, energy efficiency, and versatility rests on reasoning from the model's structural properties rather than on empirical runs. The abstract phrasing therefore overstates what is shown. We will revise the abstract to remove the unverifiable performance claim and to state explicitly that the comparison is theoretical.

Revision: yes

Referee: Abstract: the four rules are described at a high level but no pseudocode, algorithmic specification, or reduction to concrete operations is provided, so the weakest assumption (implementability yielding the claimed gains) cannot be checked.

Authors: The current text introduces the rules at a conceptual level. We agree that pseudocode and a reduction to concrete operations are required before implementability can be assessed. We will add an algorithmic section containing pseudocode for each rule in the revised manuscript.

Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

The paper introduces a neural network model that begins with zero nodes and evolves via four explicitly stated rules (expansion, generalization, forgetting, backpropagation). Performance claims rest on empirical comparisons rather than any reduction of outputs to fitted parameters, self-definitions, or self-citation chains. No equations or steps in the provided material equate a claimed result to its inputs by construction, and the model description does not invoke uniqueness theorems or ansatzes from prior self-work as load-bearing justification. The central claims remain independently falsifiable through implementation and benchmarking.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the unelaborated premise that the four rules can be combined into a working system; no free parameters or invented entities are named in the abstract.

axioms (1)

Domain assumption: The four rules (expansion, generalization, forgetting, backpropagation) suffice to produce effective lifelong learning.
This premise is invoked by the abstract's description of the model and its performance claims.

invented entities (1)

Dynamic node addition/removal mechanism (no independent evidence)
Purpose: To enable lifelong learning from a blank slate.
New mechanism introduced to realize the four rules.
