Evolvability ES: Scalable and Direct Optimization of Evolvability

Alexander Gajewski; Jeff Clune; Joel Lehman; Kenneth O. Stanley

arxiv: 1907.06077 · v1 · pith:FECPZ3STnew · submitted 2019-07-13 · 💻 cs.NE

Pith reviewed 2026-05-24 21:54 UTC · model grok-4.3

classification 💻 cs.NE

keywords: evolvability, evolutionary strategies, meta-learning, neural network adaptation, locomotion tasks, behavioral diversity, MAML comparison

The pith

Evolvability ES derives an objective that directly maximizes behavioral diversity under random mutations to optimize for evolvability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents evolvability ES, a method for optimizing neural network representations specifically for their capacity to adapt further. Starting from natural evolution strategies, it constructs an objective that rewards an individual when its random mutations produce a wide range of behaviors. Experiments on 2-D and 3-D locomotion tasks yield networks with tens of thousands of parameters that adapt quickly to new tasks and can seed further evolution. The same approach performs competitively with MAML while yielding solutions with different characteristics. This matters because explicit optimization for evolvability could accelerate evolutionary search and support adaptation in changing conditions.

Core claim

Evolvability ES derives a novel objective in the spirit of natural evolution strategies that maximizes the diversity of behaviors exhibited when an individual is subject to random mutations, and that efficiently scales with computation. Experiments in 2-D and 3-D locomotion tasks highlight the potential of evolvability ES to generate solutions with tens of thousands of parameters that can quickly be adapted to solve different tasks and that can productively seed further evolution. Results also show that evolvability ES can perform competitively with MAML while discovering solutions with distinct properties.
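One way to write down an objective of this kind, as a sketch only: assume mutations are isotropic Gaussian with a fixed standard deviation, and that each mutated offspring is summarized by a behavior vector. The symbols $\theta$, $\sigma$, $\pi$, $B$, and $\mu$ are notation introduced here for illustration; this summary does not fix them.

```latex
% Assumed notation: \theta = parent parameters,
% z \sim \pi(\cdot;\theta) = \mathcal{N}(\theta, \sigma^2 I) = a mutated offspring,
% B(z) \in \mathbb{R}^d = behavior of offspring z, \mu = mean offspring behavior.
\begin{align}
J(\theta) &= \sum_{j=1}^{d} \mathbb{E}_{z \sim \pi(\cdot;\theta)}\big[(B_j(z) - \mu_j)^2\big],
  \qquad \mu_j = \mathbb{E}_{z \sim \pi(\cdot;\theta)}\big[B_j(z)\big], \\
\nabla_\theta J(\theta) &= \sum_{j=1}^{d} \mathbb{E}_{z \sim \pi(\cdot;\theta)}\big[(B_j(z) - \mu_j)^2 \, \nabla_\theta \log \pi(z;\theta)\big] \\
&\approx \frac{1}{n\sigma} \sum_{i=1}^{n} \lVert B(\theta + \sigma\epsilon_i) - \hat{\mu} \rVert^2 \, \epsilon_i,
  \qquad \epsilon_i \sim \mathcal{N}(0, I).
\end{align}
```

The term involving $\nabla_\theta \mu_j$ drops out because $\mathbb{E}[B_j(z) - \mu_j] = 0$, so the gradient keeps the score-function form used in natural evolution strategies, with squared behavioral deviation playing the role of fitness. In the Monte Carlo line, $\hat{\mu}$ is the sample mean of the offspring behaviors, which introduces a small bias for finite $n$.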

What carries the argument

The evolvability objective, derived from natural evolution strategies, that rewards diversity of behaviors produced by random mutations of an individual.
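A minimal sketch of how such an objective could be optimized, assuming isotropic Gaussian mutations and a score-function gradient estimate in the style of natural evolution strategies. The toy `behavior` function, the function names, and all hyperparameters are hypothetical choices for illustration, not the paper's setup.

```python
import numpy as np


def behavior(z):
    """Hypothetical toy behavior characterization (an assumption, not from the paper).

    B(z) = z[0] ** 2, so offspring of a parent far from 0 spread out more in
    behavior space than offspring of a parent near 0.
    """
    return np.array([z[0] ** 2])


def maxvar_gradient(theta, sigma, n_offspring, rng):
    """Score-function estimate of the gradient of behavioral variance."""
    eps = rng.standard_normal((n_offspring, theta.size))
    behaviors = np.stack([behavior(theta + sigma * e) for e in eps])
    # Squared distance of each offspring's behavior from the mean behavior.
    sq_dev = np.sum((behaviors - behaviors.mean(axis=0)) ** 2, axis=1)
    # For z = theta + sigma * eps, grad_theta log N(z; theta, sigma^2 I) = eps / sigma.
    return (sq_dev[:, None] * eps).mean(axis=0) / sigma


def evolvability_es(theta0, sigma=0.5, lr=0.05, steps=20, n_offspring=4000, seed=0):
    """Plain gradient ascent on the estimated behavioral-variance objective."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        theta = theta + lr * maxvar_gradient(theta, sigma, n_offspring, rng)
    return theta


def behavioral_variance(theta, sigma, n_offspring, rng):
    """Empirical variance of offspring behaviors, summed over behavior dimensions."""
    eps = rng.standard_normal((n_offspring, theta.size))
    behaviors = np.stack([behavior(theta + sigma * e) for e in eps])
    return float(behaviors.var(axis=0).sum())


if __name__ == "__main__":
    start = np.array([0.5])
    final = evolvability_es(start)
    rng = np.random.default_rng(1)
    print("variance before:", behavioral_variance(start, 0.5, 4000, rng))
    print("variance after: ", behavioral_variance(final, 0.5, 4000, rng))
```

Note that no task fitness appears anywhere in the update: the only signal is how far each offspring's behavior lands from the mean offspring behavior. Each offspring evaluation is independent of the others, so the inner loop is the natural place to parallelize.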

If this is right

Solutions with tens of thousands of parameters adapt quickly to different locomotion tasks.
Optimized individuals can productively initialize further evolutionary runs.
The method scales computationally to deep networks.
Performance is competitive with gradient-based meta-learning (MAML), while the solutions found have different properties.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same mutation-diversity objective could be tested on control problems outside locomotion to check whether the adaptation benefit generalizes.
Hybrid algorithms that combine the evolutionary objective with gradient updates might improve sample efficiency in meta-learning settings.
Representations found this way may reduce the need for task-specific retraining when environments change.

Load-bearing premise

Maximizing behavioral diversity under random mutations serves as a valid and sufficient proxy for evolvability that enables fast adaptation rather than merely varied but non-adaptive behaviors.

What would settle it

An experiment in which solutions produced by evolvability ES adapt no faster than those from standard evolutionary strategies when transferred to new locomotion tasks.
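The shape of such a comparison can be sketched with a toy harness: give each candidate parent one generation of mutated offspring and record the best fitness reached on a new task. Everything here is an assumption made for illustration: the `behavior` function, the target-reaching fitness, the hyperparameters, and the two hand-picked parents, which stand in for solutions from the two algorithms and are not outputs of either.

```python
import numpy as np


def behavior(z):
    """Hypothetical toy behavior: final position = z[0] * z[1] (an assumption)."""
    return z[0] * z[1]


def best_offspring_fitness(theta, target, sigma, n_offspring, rng):
    """Best fitness on a new task among one generation of mutated offspring.

    Fitness is the negative distance between an offspring's behavior and the
    task's target position (higher is better).
    """
    eps = rng.standard_normal((n_offspring, theta.size))
    return max(-abs(behavior(theta + sigma * e) - target) for e in eps)


def mean_adaptation(theta, targets, sigma=0.5, n_offspring=500, seed=0):
    """Average one-generation adaptation score over several target tasks."""
    rng = np.random.default_rng(seed)
    scores = [best_offspring_fitness(theta, t, sigma, n_offspring, rng) for t in targets]
    return float(np.mean(scores))


# Two hand-picked parents with the same unmutated behavior (0.0) but different
# behavioral spread under mutation; stand-ins, not outputs of either algorithm.
spread_parent = np.array([3.0, 0.0])
narrow_parent = np.array([0.1, 0.0])

if __name__ == "__main__":
    targets = [-3.0, 3.0]
    print("high-spread parent:", mean_adaptation(spread_parent, targets))
    print("low-spread parent: ", mean_adaptation(narrow_parent, targets))
```

In a real test the two parents would come from evolvability ES and standard ES runs with matched compute, and the score would be tracked over several generations of adaptation rather than one.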

Figures

Figures reproduced from arXiv: 1907.06077 by Alexander Gajewski, Jeff Clune, Joel Lehman, Kenneth O. Stanley.

**Figure 1.** Interference Pattern Results. In the interference pattern task, a genome consisted of a single floating point parameter ...

**Figure 2.** Distribution of behaviors across evolution in the 2-D locomotion domain. Heat-maps of the final horizontal positions ...

**Figure 3.** Distribution of behaviors compared to MAML in the 2-D locomotion domain. Histograms of the final ...

**Figure 4.** Distribution of behaviors in the final population in the 3-D locomotion domain. Shown are heat-maps (taken from ...

**Figure 5.** Adaptation performance. The plot compares the ...

**Figure 6.** Distribution of behaviors during adaptation in the 3-D locomotion domain. Heat-maps are shown of the final posi...

**Figure 7.** Distribution of behaviors for uni- and multi-modal MaxEnt-EES variants. Heat-maps are shown of the final positions ...

Original abstract

Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge; such evolvability is important because it accelerates evolution and enables fast adaptation to changing circumstances. This paper introduces evolvability ES, an evolutionary algorithm designed to explicitly and efficiently optimize for evolvability, i.e. the ability to further adapt. The insight is that it is possible to derive a novel objective in the spirit of natural evolution strategies that maximizes the diversity of behaviors exhibited when an individual is subject to random mutations, and that efficiently scales with computation. Experiments in 2-D and 3-D locomotion tasks highlight the potential of evolvability ES to generate solutions with tens of thousands of parameters that can quickly be adapted to solve different tasks and that can productively seed further evolution. We further highlight a connection between evolvability and a recent and popular gradient-based meta-learning algorithm called MAML; results show that evolvability ES can perform competitively with MAML and that it discovers solutions with distinct properties. The conclusion is that evolvability ES opens up novel research directions for studying and exploiting the potential of evolvable representations for deep neural networks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Evolvability ES adds a clean new objective for maximizing post-mutation behavioral spread inside natural ES, but the experiments leave open whether that spread actually produces task-relevant adaptability rather than unstructured variation.

The paper's central move is deriving an objective inside the natural evolution strategies framework that rewards networks whose random mutations produce a wide range of behaviors. This is new relative to standard performance-only ES and to earlier evolvability measures that were harder to scale. The authors show that the method runs on controllers with tens of thousands of parameters in 2-D and 3-D locomotion, that the resulting solutions adapt faster to new tasks than plain ES, and that they remain competitive with MAML while yielding solutions with different properties. Those scaling and comparison results are the concrete contribution worth noting.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Evolvability ES, an evolutionary algorithm that derives a novel objective in the style of natural evolution strategies to directly maximize the behavioral diversity (variance) exhibited by an individual under random mutations. Experiments on 2-D and 3-D locomotion tasks with neural networks of tens of thousands of parameters show that the resulting solutions adapt rapidly to new tasks, productively seed further evolution, and perform competitively with MAML while exhibiting distinct properties.

Significance. If the central claim holds, the work supplies a computationally scalable, direct method for optimizing evolvability in high-dimensional neural network parameter spaces—an open challenge in evolutionary computation. The explicit link to MAML and the empirical results on complex locomotion domains are strengths that could influence representation design for adaptability in RL and evolutionary algorithms.

major comments (2)

[Experiments (locomotion tasks)] Locomotion experiments (results section): the reported post-mutation adaptation performance is shown, yet the manuscript contains no ablation that removes or disables the behavioral-diversity term while retaining the base ES dynamics and initialization; without this control it remains unclear whether the observed adaptation gains are attributable to the evolvability objective rather than other algorithmic factors.
[Method (objective derivation)] Objective derivation (method section): the claim that maximizing behavioral variance under mutations yields representations whose variation is useful for task adaptation (rather than unstructured or orthogonal noise) is load-bearing for the central thesis, but the paper provides no formal analysis, gradient-alignment test, or control experiment demonstrating that the induced diversity lies along task-relevant directions.

minor comments (2)

[Experiments] The precise network architectures, mutation variances, and population sizes used in the 3-D experiments could be stated explicitly in the main text rather than deferred entirely to supplementary material.
[Method] Notation for the evolvability objective (e.g., the precise definition of behavioral variance) is introduced clearly but could be cross-referenced more explicitly when results are discussed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We appreciate the referee's detailed review and the opportunity to clarify and strengthen our work on Evolvability ES. Below we respond to each major comment, agreeing where revisions are needed to address valid concerns about controls and analysis.

Referee: [Experiments (locomotion tasks)] Locomotion experiments (results section): the reported post-mutation adaptation performance is shown, yet the manuscript contains no ablation that removes or disables the behavioral-diversity term while retaining the base ES dynamics and initialization; without this control it remains unclear whether the observed adaptation gains are attributable to the evolvability objective rather than other algorithmic factors.

Authors: We agree that the absence of this ablation leaves open the possibility that other factors contribute to the observed adaptation. To address this, the revised manuscript will include an ablation study comparing Evolvability ES to a baseline that retains the ES dynamics and initialization but removes the behavioral diversity objective. This will isolate the contribution of the evolvability term.

revision: yes

Referee: [Method (objective derivation)] Objective derivation (method section): the claim that maximizing behavioral variance under mutations yields representations whose variation is useful for task adaptation (rather than unstructured or orthogonal noise) is load-bearing for the central thesis, but the paper provides no formal analysis, gradient-alignment test, or control experiment demonstrating that the induced diversity lies along task-relevant directions.

Authors: While the empirical results on task adaptation and the comparison to MAML provide evidence that the induced variations are useful, we acknowledge the lack of formal analysis or specific controls for task-relevance of the diversity. In the revision, we will incorporate a gradient alignment test or additional control experiment to demonstrate that the behavioral diversity aligns with directions that improve task performance.

revision: yes

Circularity Check

0 steps flagged

Derivation of evolvability objective is self-contained with no reductions to inputs

The paper presents the evolvability ES objective as a novel derivation in the style of natural evolution strategies, explicitly maximizing behavioral diversity under random mutations. This is introduced as an independent insight without equations or definitions that reduce to fitted parameters or prior self-citations by construction. Comparisons to MAML are external benchmarks rather than load-bearing foundations. Experiments on locomotion tasks provide independent validation. No self-definitional, fitted-prediction, or uniqueness-imported patterns appear in the provided abstract or description.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient detail to identify any free parameters, axioms, or invented entities; full paper required for complete audit. No explicit new entities or fitted parameters are mentioned.

Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · 8 internal anchors

[1] Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray, ... (2015)
[2] Lee Altenberg et al. The evolution of evolvability in genetic programming. Advances in genetic programming, 3:47–74, 1994.
[3] Hans-Georg Beyer and Hans-Paul Schwefel. Evolution strategies: A comprehensive introduction. Natural Computing, 1:3–52, 2002.
[4] J.F.Y Brookfield. Evolution: The evolvability enigma. Current Biology, 11(3):R106–R108, 2001. ISSN 0960-9822. doi: 10.1016/S0960-9822(01)00041-0.
[5] Patryk Chrabaszcz, Ilya Loshchilov, and Frank Hutter. Back to basics: Benchmarking canonical evolution strategies for playing Atari. arXiv preprint arXiv:1802.08842, 2018.
[6] Jeff Clune, Jean-Baptiste Mouret, and Hod Lipson. The evolutionary origins of modularity. Proc. R. Soc. B, 280(1755):20122863, 2013.
[7] Edoardo Conti, Vashisht Madhavan, Felipe Petroski Such, Joel Lehman, Kenneth Stanley, and Jeff Clune. Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents. In Advances in Neural Information Processing Systems 31, 2018.
[8] E Coumans and Y Bai. Pybullet, a python module for physics simulation for games, robotics and machine learning. GitHub repository, 2016.
[9] Giuseppe Cuccu and Faustino Gomez. When novelty is not enough. In European Conference on the Applications of Evolutionary Computation, pages 234–243. Springer, 2011.
[10] M. Ebner, M. Shackleton, and R. Shipman. How neutral networks influence evolvability. Complexity, 7(2):19–33, 2001. ISSN 1099-0526.
[11] Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400, 2017.
[12] Dario Floreano and Francesco Mondada. Evolution of plastic neurocontrollers for situated agents. In Proc. of The Fourth International Conference on Simulation of Adaptive Behavior (SAB), From Animals to Animats. ETH Zürich, 1996.
[13] Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, and Shane Legg. Noisy networks for exploration. CoRR, abs/1706.10295, 2017. URL http://arxiv.org/abs/1706.10295
[14] Michael C. Fu. Chapter 19 gradient estimation. Simulation Handbooks in Operations Research and Management Science, 13:575, 2006. doi: 10.1016/s0927-0507(06)13019-4.
[15] J.J. Grefenstette. Evolvability in dynamic fitness landscapes: A genetic algorithm approach. In Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on, volume 3. IEEE, 2002. ISBN 0780355369.
[16] Sepp Hochreiter, A Steven Younger, and Peter R Conwell. Learning to learn using gradient descent. In International Conference on Artificial Neural Networks, pages 87–94. Springer, 2001.
[17] Nadav Kashtan, Elad Noor, and Uri Alon. Varying environments can speed up evolution. Proceedings of the National Academy of Sciences, 104(34):13711–13716, 2007.
[18] Marc Kirschner and John Gerhart. Evolvability. Proceedings of the National Academy of Sciences, 95(15):8420–8427, 1998.
[19] Loizos Kounios, Jeff Clune, Kostas Kouvaris, Günter P Wagner, Mihaela Pavlicev, Daniel M Weinreich, and Richard A Watson. Resolving the paradox of evolvability with learning theory: How evolution learns to improve evolvability on rugged fitness landscapes. arXiv preprint arXiv:1612.05955, 2016.
[20] Joel Lehman and Kenneth O Stanley. Abandoning objectives: Evolution through the search for novelty alone. Evolutionary computation, 19(2):189–223, 2011.
[21] Joel Lehman and Kenneth O Stanley. Improving evolvability through novelty search and self-adaptation. In IEEE Congress on Evolutionary Computation, pages 2693–2700, 2011.
[22] Joel Lehman and Kenneth O Stanley. Evolving a diversity of virtual creatures through novelty search and local competition. In Proceedings of the 13th annual conference on Genetic and evolutionary computation, pages 211–218. ACM, 2011.
[23] Joel Lehman and Kenneth O Stanley. Evolvability is inevitable: Increasing evolvability without the pressure to adapt. PloS one, 8(4):e62186, 2013.
[24] Joel Lehman and Kenneth O Stanley. On the potential benefits of knowing everything. In Artificial Life Conference Proceedings, pages 558–565. MIT Press, 2018.
[25] Geoffrey J. McLachlan, Sharon X. Lee, and Suren I. Rathnayake. Finite mixture models. Annual Review of Statistics and Its Application, 6(1):355–378, 2019. doi: 10.1146/annurev-statistics-031017-100325. URL https://doi.org/10.1146/annurev-statistics-031017-100325
[26] Henok Mengistu, Joel Lehman, and Jeff Clune. Evolvability search. Proceedings of the 2016 on Genetic and Evolutionary Computation Conference - GECCO 16, 2016. doi: 10.1145/2908812.2908838.
[27] Thomas Miconi, Jeff Clune, and Kenneth O Stanley. Differentiable plasticity: training plastic neural networks with backpropagation. arXiv preprint arXiv:1804.02464, 2018.
[28] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529, 2015.
[29] Anh Mai Nguyen, Jason Yosinski, and Jeff Clune. Innovation engines: Automated creativity and improved stochastic optimization via deep learning. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, pages 959–966. ACM, 2015.
[30] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NIPS-W, 2017.
[31] M. Pigliucci. Is evolvability evolvable? Nature Reviews Genetics, 9(1):75–82, 2008. ISSN 1471-0056.
[32] Justin K Pugh, Lisa B Soros, and Kenneth O Stanley. Quality diversity: A new frontier for evolutionary computation. Frontiers in Robotics and AI, 3:40, 2016.
[33] Joseph Reisinger, Kenneth O. Stanley, and Risto Miikkulainen. Towards an empirical measure of evolvability. In Genetic and Evolutionary Computation Conference (GECCO2005) Workshop Program, pages 257–264, Washington, D.C., ACM Press. URL http://nn.cs.utexas.edu/?reisinger:gecco05
[35] Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864, 2017.
[36] John Schulman, Nicolas Heess, Theophane Weber, and Pieter Abbeel. Gradient estimation using stochastic computation graphs. CoRR, abs/1506.05254, 2015. URL http://arxiv.org/abs/1506.05254
[37] Hans-Paul Schwefel. Evolution and optimum seeking: the sixth generation. John Wiley & Sons, Inc., 1993.
[38] Andrea Soltoggio, John A Bullinaria, Claudio Mattiussi, Peter Dürr, and Dario Floreano. Evolutionary advantages of neuromodulated plasticity in dynamic, reward-based scenarios. In Proceedings of the 11th international conference on artificial life (Alife XI), number LIS-CONF-2008-012, pages 569–576. MIT Press, 2008.
[39] Kenneth O. Stanley and Risto Miikkulainen. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10:99–127, 2002.
[40] Kenneth O Stanley, Bobby D Bryant, and Risto Miikkulainen. Evolving adaptive neural networks with and without adaptive synapses. In Evolutionary Computation, 2003. CEC’03. The 2003 Congress on, volume 4, pages 2557–2564. IEEE, 2003.
[41] Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O Stanley, and Jeff Clune. Deep neuroevolution: genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. arXiv preprint arXiv:1712.06567, 2017.
[42] E. Todorov, T. Erez, and Y. Tassa. Mujoco: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 5026–5033, Oct 2012. doi: 10.1109/IROS.2012.6386109.
[43] Ricardo Vilalta and Youssef Drissi. A perspective view and survey of meta-learning. Artificial Intelligence Review, 18(2):77–95, 2002.
[44] Günter P Wagner and Lee Altenberg. Perspective: complex adaptations and the evolution of evolvability. Evolution, 50(3):967–976, 1996.
[45] Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, and Matt Botvinick. Learning to reinforcement learn. arXiv preprint arXiv:1611.05763, 2016.
[46] Daan Wierstra, Tom Schaul, Tobias Glasmachers, Yi Sun, and Jürgen Schmidhuber. Natural evolution strategies, 2011.
[47] Bryan Wilder and Kenneth Stanley. Reconciling explanations for the evolution of evolvability. Adaptive Behavior, 23(3):171–179, 2015.
[Supplementary material, GECCO ’19, July 13–17, 2019, Prague, Czech Republic. Figure S1: 2-D l... Plot of Max Final Distance against Generation; series labeled Standard, MaxVar, MaxEnt.]
[48] [Not a bibliographic entry: supplementary text extracted as a reference. It contains the tail of Equation 18, the sentence "While this may not seem like very much of an improvement at first, it is insightful to note how similar the forms of Equations 10 ...", and the captions "Figure S9: Nested stochastic computation graph representing Natural Evolution Strategies." and "Figure S10: Nested stochastic computation graph repres..."]

[1] [1]

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, San- jay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Leven- berg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray,...

work page 2015

[2] [2]

The evolution of evolvability in genetic programming

Lee Altenberg et al. The evolution of evolvability in genetic programming. Advances in genetic programming , 3:47–74, 1994

work page 1994

[3] [3]

Evolution strategies: A comprehen- sive introduction

Hans-Georg Beyer and Hans-Paul Schwefel. Evolution strategies: A comprehen- sive introduction. Natural Computing, 1:3–52, 2002

work page 2002

[4] [4]

Evolution: The evolvability enigma

J.F.Y Brookfield. Evolution: The evolvability enigma. Current Biology, 11(3):R106 – R108, 2001. ISSN 0960-9822. doi: DOI:10.1016/S0960-9822(01)00041-0

work page doi:10.1016/s0960-9822(01)00041-0 2001

[5] [5]

Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari

Patryk Chrabaszcz, Ilya Loshchilov, and Frank Hutter. Back to basics: Bench- marking canonical evolution strategies for playing atari. arXiv preprint arXiv:1802.08842, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[6] [6]

The evolutionary origins of modularity

Jeff Clune, Jean-Baptiste Mouret, and Hod Lipson. The evolutionary origins of modularity. Proc. R. Soc. B , 280(1755):20122863, 2013

work page 2013

[7] [7]

Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents

Edoardo Conti, Vashisht Madhavan, Felipe Petroski Such, Joel Lehman, Kenneth Stanley, and Jeff Clune. Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents. In Advances in Neural Information Processing Systems 31 . 2018

work page 2018

[8] [8]

Pybullet, a python module for physics simulation for games, robotics and machine learning

E Coumans and Y Bai. Pybullet, a python module for physics simulation for games, robotics and machine learning. GitHub repository, 2016

work page 2016

[9] [9]

When novelty is not enough

Giuseppe Cuccu and Faustino Gomez. When novelty is not enough. In Euro- pean Conference on the Applications of Evolutionary Computation , pages 234–243. Springer, 2011

work page 2011

[10] [10]

Ebner, M

M. Ebner, M. Shackleton, and R. Shipman. How neutral networks influence evolvability. Complexity, 7(2):19–33, 2001. ISSN 1099-0526

work page 2001

[11] [11]

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[12] [12]

Evolution of plastic neurocontrollers for situated agents

Dario Floreano and Francesco Mondada. Evolution of plastic neurocontrollers for situated agents. In Proc. of The Fourth International Conference on Simulation of Adaptive Behavior (SAB), From Animals to Animats . ETH Zürich, 1996

work page 1996

[13] [13]

Noisy networks for exploration

Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, and Shane Legg. Noisy networks for exploration. CoRR, abs/1706.10295, 2017. URL http://arxiv.org/abs/1706.10295

work page arXiv 2017

[14] [14]

Michael C. Fu. Chapter 19 gradient estimation. Simulation Handbooks in Opera- tions Research and Management Science , 13:575, 2006. doi: 10.1016/s0927-0507(06) 13019-4

work page doi:10.1016/s0927-0507(06 2006

[15] [15]

Grefenstette

J.J. Grefenstette. Evolvability in dynamic fitness landscapes: A genetic algorithm approach. In Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on, volume 3. IEEE, 2002. ISBN 0780355369

work page 1999

[16] [16]

Learning to learn using gradient descent

Sepp Hochreiter, A Steven Younger, and Peter R Conwell. Learning to learn using gradient descent. In International Conference on Artificial Neural Networks , pages 87–94. Springer, 2001

work page 2001

[17] [17]

Varying environments can speed up evolution

Nadav Kashtan, Elad Noor, and Uri Alon. Varying environments can speed up evolution. Proceedings of the National Academy of Sciences , 104(34):13711–13716, 2007

work page 2007

[18] [18]

Evolvability

Marc Kirschner and John Gerhart. Evolvability. Proceedings of the National Academy of Sciences, 95(15):8420–8427, 1998

work page 1998

[19] [19]

Resolving the paradox of evolvability with learning theory: How evolution learns to improve evolvability on rugged fitness landscapes

Loizos Kounios, Jeff Clune, Kostas Kouvaris, Günter P Wagner, Mihaela Pavlicev, Daniel M Weinreich, and Richard A Watson. Resolving the paradox of evolvability with learning theory: How evolution learns to improve evolvability on rugged fitness landscapes. arXiv preprint arXiv:1612.05955, 2016

[20] Joel Lehman and Kenneth O Stanley. Abandoning objectives: Evolution through the search for novelty alone. Evolutionary Computation, 19(2):189–223, 2011

[21] Joel Lehman and Kenneth O Stanley. Improving evolvability through novelty search and self-adaptation. In IEEE Congress on Evolutionary Computation, pages 2693–2700, 2011

[22] Joel Lehman and Kenneth O Stanley. Evolving a diversity of virtual creatures through novelty search and local competition. In Proceedings of the 13th annual conference on Genetic and evolutionary computation, pages 211–218. ACM, 2011

[23] Joel Lehman and Kenneth O Stanley. Evolvability is inevitable: Increasing evolvability without the pressure to adapt. PloS one, 8(4):e62186, 2013

[24] Joel Lehman and Kenneth O Stanley. On the potential benefits of knowing everything. In Artificial Life Conference Proceedings, pages 558–565. MIT Press, 2018

[25] Geoffrey J. McLachlan, Sharon X. Lee, and Suren I. Rathnayake. Finite mixture models. Annual Review of Statistics and Its Application, 6(1):355–378, 2019. doi: 10.1146/annurev-statistics-031017-100325. URL https://doi.org/10.1146/annurev-statistics-031017-100325

[26] Henok Mengistu, Joel Lehman, and Jeff Clune. Evolvability search. Proceedings of the 2016 on Genetic and Evolutionary Computation Conference - GECCO 16, 2016. doi: 10.1145/2908812.2908838

[27] Thomas Miconi, Jeff Clune, and Kenneth O Stanley. Differentiable plasticity: training plastic neural networks with backpropagation. arXiv preprint arXiv:1804.02464, 2018

[28] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529, 2015

[29] Anh Mai Nguyen, Jason Yosinski, and Jeff Clune. Innovation engines: Automated creativity and improved stochastic optimization via deep learning. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, pages 959–966. ACM, 2015

[30] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NIPS-W, 2017

[31] M. Pigliucci. Is evolvability evolvable? Nature Reviews Genetics, 9(1):75–82, 2008. ISSN 1471-0056

[32] Justin K Pugh, Lisa B Soros, and Kenneth O Stanley. Quality diversity: A new frontier for evolutionary computation. Frontiers in Robotics and AI, 3:40, 2016

[33] Joseph Reisinger, Kenneth O. Stanley, and Risto Miikkulainen. Towards an empirical measure of evolvability. In Genetic and Evolutionary Computation Conference (GECCO2005) Workshop Program, pages 257–264, Washington, D.C., ACM Press. URL http://nn.cs.utexas.edu/?reisinger:gecco05

[35] Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864, 2017

[36] John Schulman, Nicolas Heess, Theophane Weber, and Pieter Abbeel. Gradient estimation using stochastic computation graphs. CoRR, abs/1506.05254, 2015. URL http://arxiv.org/abs/1506.05254

[37] Hans-Paul Schwefel. Evolution and optimum seeking: the sixth generation. John Wiley & Sons, Inc., 1993

[38] Andrea Soltoggio, John A Bullinaria, Claudio Mattiussi, Peter Dürr, and Dario Floreano. Evolutionary advantages of neuromodulated plasticity in dynamic, reward-based scenarios. In Proceedings of the 11th international conference on artificial life (Alife XI), number LIS-CONF-2008-012, pages 569–576. MIT Press, 2008

[39] Kenneth O. Stanley and Risto Miikkulainen. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10:99–127, 2002

[40] Kenneth O Stanley, Bobby D Bryant, and Risto Miikkulainen. Evolving adaptive neural networks with and without adaptive synapses. In Evolutionary Computation, 2003. CEC’03. The 2003 Congress on, volume 4, pages 2557–2564. IEEE, 2003

[41] Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O Stanley, and Jeff Clune. Deep neuroevolution: genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. arXiv preprint arXiv:1712.06567, 2017

[42] E. Todorov, T. Erez, and Y. Tassa. Mujoco: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 5026–5033, Oct 2012. doi: 10.1109/IROS.2012.6386109

[43] Ricardo Vilalta and Youssef Drissi. A perspective view and survey of meta-learning. Artificial Intelligence Review, 18(2):77–95, 2002

[44] Günter P Wagner and Lee Altenberg. Perspective: complex adaptations and the evolution of evolvability. Evolution, 50(3):967–976, 1996

[45] Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, and Matt Botvinick. Learning to reinforcement learn. arXiv preprint arXiv:1611.05763, 2016

[46] Daan Wierstra, Tom Schaul, Tobias Glasmachers, Yi Sun, and Jürgen Schmidhuber. Natural evolution strategies, 2011

[47] Bryan Wilder and Kenneth Stanley. Reconciling explanations for the evolution of evolvability. Adaptive Behavior, 23(3):171–179, 2015.

GECCO ’19, July 13–17, 2019, Prague, Czech Republic. Alexander Gajewski, Jeff Clune, Kenneth O. Stanley, and Joel Lehman

[Plot: Max Final Distance versus Generation for Standard, MaxVar, and MaxEnt.] Figure S1: 2-D l...

+ ∑_j f2(x^i_2) L(x^i_2; θ) L(x^i_1; θ, x0). (18)

While this may not seem like very much of an improvement at first, it is insightful to note how similar the forms of Equations 10

Figure S9: Nested stochastic computation graph representing Natural Evolution Strategies.

Figure S10: Nested stochastic computation graph repres...
