pith. sign in

arxiv: 1907.06077 · v1 · pith:FECPZ3STnew · submitted 2019-07-13 · 💻 cs.NE

Evolvability ES: Scalable and Direct Optimization of Evolvability

Pith reviewed 2026-05-24 21:54 UTC · model grok-4.3

classification 💻 cs.NE
keywords evolvabilityevolutionary strategiesmeta-learningneural network adaptationlocomotion tasksbehavioral diversityMAML comparison
0
0 comments X

The pith

Evolvability ES derives an objective that directly maximizes behavioral diversity under random mutations to optimize for evolvability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents evolvability ES as a method to optimize neural network representations specifically for their capacity to adapt further. It starts from natural evolution strategies and constructs a new objective that rewards an individual when its random mutations produce a wide range of behaviors. Experiments apply this to 2-D and 3-D locomotion tasks, producing networks with tens of thousands of parameters that adapt rapidly to new tasks and can initialize further evolution. The same approach is shown to match the performance of MAML while yielding solutions with different characteristics. A reader would care because explicit optimization for evolvability could accelerate evolutionary search and support adaptation in changing conditions.

Core claim

Evolvability ES derives a novel objective in the spirit of natural evolution strategies that maximizes the diversity of behaviors exhibited when an individual is subject to random mutations, and that efficiently scales with computation. Experiments in 2-D and 3-D locomotion tasks highlight the potential of evolvability ES to generate solutions with tens of thousands of parameters that can quickly be adapted to solve different tasks and that can productively seed further evolution. Results also show that evolvability ES can perform competitively with MAML while discovering solutions with distinct properties.

What carries the argument

The evolvability objective, derived from natural evolution strategies, that rewards diversity of behaviors produced by random mutations of an individual.

If this is right

  • Solutions with tens of thousands of parameters adapt quickly to different locomotion tasks.
  • Optimized individuals can productively initialize further evolutionary runs.
  • The method scales computationally to deep networks.
  • Performance matches gradient-based meta-learning while producing solutions with different properties.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same mutation-diversity objective could be tested on control problems outside locomotion to check whether the adaptation benefit generalizes.
  • Hybrid algorithms that combine the evolutionary objective with gradient updates might improve sample efficiency in meta-learning settings.
  • Representations found this way may reduce the need for task-specific retraining when environments change.

Load-bearing premise

Maximizing behavioral diversity under random mutations serves as a valid and sufficient proxy for evolvability that enables fast adaptation rather than merely varied but non-adaptive behaviors.

What would settle it

An experiment in which solutions produced by evolvability ES adapt no faster than those from standard evolutionary strategies when transferred to new locomotion tasks.

Figures

Figures reproduced from arXiv: 1907.06077 by Alexander Gajewski, Jeff Clune, Joel Lehman, Kenneth O. Stanley.

Figure 1
Figure 1. Figure 1: Interference Pattern Results. In the interference pattern task, a genome consisted of a single floating point parameter [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Distribution of behaviors across evolution in the 2-D locomotion domain. Heat-maps of the final horizontal positions [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Distribution of behaviors compared to MAML in the 2-D locomotion domain. Histograms of the final [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Distribution of behaviors in the final population in the 3-D locomotion domain. Shown are heat-maps (taken from [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Adaptation performance. The plot compares the [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Distribution of behaviors during adaptation in the 3-D locomotion domain. Heat-maps are shown of the final posi [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Distribution of behaviors for uni- and multi-modal MaxEnt-EES variants. Heat-maps are shown of the final positions [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗
read the original abstract

Designing evolutionary algorithms capable of uncovering highly evolvable representations is an open challenge; such evolvability is important because it accelerates evolution and enables fast adaptation to changing circumstances. This paper introduces evolvability ES, an evolutionary algorithm designed to explicitly and efficiently optimize for evolvability, i.e. the ability to further adapt. The insight is that it is possible to derive a novel objective in the spirit of natural evolution strategies that maximizes the diversity of behaviors exhibited when an individual is subject to random mutations, and that efficiently scales with computation. Experiments in 2-D and 3-D locomotion tasks highlight the potential of evolvability ES to generate solutions with tens of thousands of parameters that can quickly be adapted to solve different tasks and that can productively seed further evolution. We further highlight a connection between evolvability and a recent and popular gradient-based meta-learning algorithm called MAML; results show that evolvability ES can perform competitively with MAML and that it discovers solutions with distinct properties. The conclusion is that evolvability ES opens up novel research directions for studying and exploiting the potential of evolvable representations for deep neural networks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces Evolvability ES, an evolutionary algorithm that derives a novel objective in the style of natural evolution strategies to directly maximize the behavioral diversity (variance) exhibited by an individual under random mutations. Experiments on 2-D and 3-D locomotion tasks with neural networks of tens of thousands of parameters show that the resulting solutions adapt rapidly to new tasks, productively seed further evolution, and perform competitively with MAML while exhibiting distinct properties.

Significance. If the central claim holds, the work supplies a computationally scalable, direct method for optimizing evolvability in high-dimensional neural network parameter spaces—an open challenge in evolutionary computation. The explicit link to MAML and the empirical results on complex locomotion domains are strengths that could influence representation design for adaptability in RL and evolutionary algorithms.

major comments (2)
  1. [Experiments (locomotion tasks)] Locomotion experiments (results section): the reported post-mutation adaptation performance is shown, yet the manuscript contains no ablation that removes or disables the behavioral-diversity term while retaining the base ES dynamics and initialization; without this control it remains unclear whether the observed adaptation gains are attributable to the evolvability objective rather than other algorithmic factors.
  2. [Method (objective derivation)] Objective derivation (method section): the claim that maximizing behavioral variance under mutations yields representations whose variation is useful for task adaptation (rather than unstructured or orthogonal noise) is load-bearing for the central thesis, but the paper provides no formal analysis, gradient-alignment test, or control experiment demonstrating that the induced diversity lies along task-relevant directions.
minor comments (2)
  1. [Experiments] The precise network architectures, mutation variances, and population sizes used in the 3-D experiments could be stated explicitly in the main text rather than deferred entirely to supplementary material.
  2. [Method] Notation for the evolvability objective (e.g., the precise definition of behavioral variance) is introduced clearly but could be cross-referenced more explicitly when results are discussed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We appreciate the referee's detailed review and the opportunity to clarify and strengthen our work on Evolvability ES. Below we respond to each major comment, agreeing where revisions are needed to address valid concerns about controls and analysis.

read point-by-point responses
  1. Referee: [Experiments (locomotion tasks)] Locomotion experiments (results section): the reported post-mutation adaptation performance is shown, yet the manuscript contains no ablation that removes or disables the behavioral-diversity term while retaining the base ES dynamics and initialization; without this control it remains unclear whether the observed adaptation gains are attributable to the evolvability objective rather than other algorithmic factors.

    Authors: We agree that the absence of this ablation leaves open the possibility that other factors contribute to the observed adaptation. To address this, the revised manuscript will include an ablation study comparing Evolvability ES to a baseline that retains the ES dynamics and initialization but removes the behavioral diversity objective. This will isolate the contribution of the evolvability term. revision: yes

  2. Referee: [Method (objective derivation)] Objective derivation (method section): the claim that maximizing behavioral variance under mutations yields representations whose variation is useful for task adaptation (rather than unstructured or orthogonal noise) is load-bearing for the central thesis, but the paper provides no formal analysis, gradient-alignment test, or control experiment demonstrating that the induced diversity lies along task-relevant directions.

    Authors: While the empirical results on task adaptation and the comparison to MAML provide evidence that the induced variations are useful, we acknowledge the lack of formal analysis or specific controls for task-relevance of the diversity. In the revision, we will incorporate a gradient alignment test or additional control experiment to demonstrate that the behavioral diversity aligns with directions that improve task performance. revision: yes

Circularity Check

0 steps flagged

Derivation of evolvability objective is self-contained with no reductions to inputs

full rationale

The paper presents the evolvability ES objective as a novel derivation in the style of natural evolution strategies, explicitly maximizing behavioral diversity under random mutations. This is introduced as an independent insight without equations or definitions that reduce to fitted parameters or prior self-citations by construction. Comparisons to MAML are external benchmarks rather than load-bearing foundations. Experiments on locomotion tasks provide independent validation. No self-definitional, fitted-prediction, or uniqueness-imported patterns appear in the provided abstract or description.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient detail to identify any free parameters, axioms, or invented entities; full paper required for complete audit. No explicit new entities or fitted parameters are mentioned.

pith-pipeline@v0.9.0 · 5733 in / 1058 out tokens · 22672 ms · 2026-05-24T21:54:01.092005+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

48 extracted references · 48 canonical work pages · 8 internal anchors

  1. [1]

    Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, San- jay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Leven- berg, Dan Mané, Rajat Monga, Sherry Moore, Derek Murray,...

  2. [2]

    The evolution of evolvability in genetic programming

    Lee Altenberg et al. The evolution of evolvability in genetic programming. Advances in genetic programming , 3:47–74, 1994

  3. [3]

    Evolution strategies: A comprehen- sive introduction

    Hans-Georg Beyer and Hans-Paul Schwefel. Evolution strategies: A comprehen- sive introduction. Natural Computing, 1:3–52, 2002

  4. [4]

    Evolution: The evolvability enigma

    J.F.Y Brookfield. Evolution: The evolvability enigma. Current Biology, 11(3):R106 – R108, 2001. ISSN 0960-9822. doi: DOI:10.1016/S0960-9822(01)00041-0

  5. [5]

    Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari

    Patryk Chrabaszcz, Ilya Loshchilov, and Frank Hutter. Back to basics: Bench- marking canonical evolution strategies for playing atari. arXiv preprint arXiv:1802.08842, 2018

  6. [6]

    The evolutionary origins of modularity

    Jeff Clune, Jean-Baptiste Mouret, and Hod Lipson. The evolutionary origins of modularity. Proc. R. Soc. B , 280(1755):20122863, 2013

  7. [7]

    Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents

    Edoardo Conti, Vashisht Madhavan, Felipe Petroski Such, Joel Lehman, Kenneth Stanley, and Jeff Clune. Improving exploration in evolution strategies for deep reinforcement learning via a population of novelty-seeking agents. In Advances in Neural Information Processing Systems 31 . 2018

  8. [8]

    Pybullet, a python module for physics simulation for games, robotics and machine learning

    E Coumans and Y Bai. Pybullet, a python module for physics simulation for games, robotics and machine learning. GitHub repository, 2016

  9. [9]

    When novelty is not enough

    Giuseppe Cuccu and Faustino Gomez. When novelty is not enough. In Euro- pean Conference on the Applications of Evolutionary Computation , pages 234–243. Springer, 2011

  10. [10]

    Ebner, M

    M. Ebner, M. Shackleton, and R. Shipman. How neutral networks influence evolvability. Complexity, 7(2):19–33, 2001. ISSN 1099-0526

  11. [11]

    Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks

    Chelsea Finn, Pieter Abbeel, and Sergey Levine. Model-agnostic meta-learning for fast adaptation of deep networks. arXiv preprint arXiv:1703.03400, 2017

  12. [12]

    Evolution of plastic neurocontrollers for situated agents

    Dario Floreano and Francesco Mondada. Evolution of plastic neurocontrollers for situated agents. In Proc. of The Fourth International Conference on Simulation of Adaptive Behavior (SAB), From Animals to Animats . ETH Zürich, 1996

  13. [13]

    Noisy networks for exploration

    Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Rémi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, and Shane Legg. Noisy networks for exploration. CoRR, abs/1706.10295, 2017. URL http://arxiv.org/abs/1706.10295

  14. [14]

    Michael C. Fu. Chapter 19 gradient estimation. Simulation Handbooks in Opera- tions Research and Management Science , 13:575, 2006. doi: 10.1016/s0927-0507(06) 13019-4

  15. [15]

    Grefenstette

    J.J. Grefenstette. Evolvability in dynamic fitness landscapes: A genetic algorithm approach. In Evolutionary Computation, 1999. CEC 99. Proceedings of the 1999 Congress on, volume 3. IEEE, 2002. ISBN 0780355369

  16. [16]

    Learning to learn using gradient descent

    Sepp Hochreiter, A Steven Younger, and Peter R Conwell. Learning to learn using gradient descent. In International Conference on Artificial Neural Networks , pages 87–94. Springer, 2001

  17. [17]

    Varying environments can speed up evolution

    Nadav Kashtan, Elad Noor, and Uri Alon. Varying environments can speed up evolution. Proceedings of the National Academy of Sciences , 104(34):13711–13716, 2007

  18. [18]

    Evolvability

    Marc Kirschner and John Gerhart. Evolvability. Proceedings of the National Academy of Sciences, 95(15):8420–8427, 1998

  19. [19]

    Resolving the paradox of evolvability with learning theory: How evolution learns to improve evolvability on rugged fitness landscapes

    Loizos Kounios, Jeff Clune, Kostas Kouvaris, Günter P Wagner, Mihaela Pavlicev, Daniel M Weinreich, and Richard A Watson. Resolving the paradox of evolvability with learning theory: How evolution learns to improve evolvability on rugged fitness landscapes. arXiv preprint arXiv:1612.05955, 2016

  20. [20]

    Abandoning objectives: Evolution through the search for novelty alone

    Joel Lehman and Kenneth O Stanley. Abandoning objectives: Evolution through the search for novelty alone. Evolutionary computation, 19(2):189–223, 2011

  21. [21]

    Improving evolvability through novelty search and self-adaptation

    Joel Lehman and Kenneth O Stanley. Improving evolvability through novelty search and self-adaptation. In IEEE Congress on Evolutionary Computation , pages 2693–2700, 2011

  22. [22]

    Evolving a diversity of virtual creatures through novelty search and local competition

    Joel Lehman and Kenneth O Stanley. Evolving a diversity of virtual creatures through novelty search and local competition. In Proceedings of the 13th annual conference on Genetic and evolutionary computation , pages 211–218. ACM, 2011

  23. [23]

    Evolvability is inevitable: Increasing evolv- ability without the pressure to adapt

    Joel Lehman and Kenneth O Stanley. Evolvability is inevitable: Increasing evolv- ability without the pressure to adapt. PloS one, 8(4):e62186, 2013

  24. [24]

    On the potential benefits of knowing everything

    Joel Lehman and Kenneth O Stanley. On the potential benefits of knowing everything. In Artificial Life Conference Proceedings, pages 558–565. MIT Press, 2018

  25. [25]

    McLachlan, Sharon X

    Geoffrey J. McLachlan, Sharon X. Lee, and Suren I. Rathnayake. Finite mix- ture models. Annual Review of Statistics and Its Application , 6(1):355–378, 2019. doi: 10.1146/annurev-statistics-031017-100325. URL https://doi.org/10.1146/ annurev-statistics-031017-100325

  26. [26]

    Evolvability search

    Henok Mengistu, Joel Lehman, and Jeff Clune. Evolvability search. Proceedings of the 2016 on Genetic and Evolutionary Computation Conference - GECCO 16 , 2016. doi: 10.1145/2908812.2908838

  27. [27]

    Differentiable plasticity: training plastic neural networks with backpropagation

    Thomas Miconi, Jeff Clune, and Kenneth O Stanley. Differentiable plastic- ity: training plastic neural networks with backpropagation. arXiv preprint arXiv:1804.02464, 2018

  28. [28]

    Human-level control through deep reinforcement learning

    Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529, 2015

  29. [29]

    Innovation engines: Automated creativity and improved stochastic optimization via deep learning

    Anh Mai Nguyen, Jason Yosinski, and Jeff Clune. Innovation engines: Automated creativity and improved stochastic optimization via deep learning. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation , pages 959–966. ACM, 2015

  30. [30]

    Automatic differentiation in PyTorch

    Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. In NIPS-W, 2017

  31. [31]

    Pigliucci

    M. Pigliucci. Is evolvability evolvable? Nature Reviews Genetics, 9(1):75–82, 2008. ISSN 1471-0056

  32. [32]

    Quality diversity: A new frontier for evolutionary computation

    Justin K Pugh, Lisa B Soros, and Kenneth O Stanley. Quality diversity: A new frontier for evolutionary computation. Frontiers in Robotics and AI , 3:40, 2016

  33. [33]

    Stanley, and Risto Miikkulainen

    Joseph Reisinger, Kenneth O. Stanley, and Risto Miikkulainen. Towards an empirical measure of evolvability. In Genetic and Evolutionary Computation Conference (GECCO2005) Workshop Program, pages 257–264, Washington, D.C.,

  34. [34]

    URL http://nn.cs.utexas.edu/?reisinger:gecco05

    ACM Press. URL http://nn.cs.utexas.edu/?reisinger:gecco05

  35. [35]

    Evolution Strategies as a Scalable Alternative to Reinforcement Learning

    Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, and Ilya Sutskever. Evolution strategies as a scalable alternative to reinforcement learning. arXiv preprint arXiv:1703.03864, 2017

  36. [36]

    Gradient Estimation Using Stochastic Computation Graphs

    John Schulman, Nicolas Heess, Theophane Weber, and Pieter Abbeel. Gradient estimation using stochastic computation graphs. CoRR, abs/1506.05254, 2015. URL http://arxiv.org/abs/1506.05254

  37. [37]

    Evolution and optimum seeking: the sixth generation

    Hans-Paul Paul Schwefel. Evolution and optimum seeking: the sixth generation . John Wiley & Sons, Inc., 1993

  38. [38]

    Evolutionary advantages of neuromodulated plasticity in dynamic, reward-based scenarios

    Andrea Soltoggio, John A Bullinaria, Claudio Mattiussi, Peter Dürr, and Dario Floreano. Evolutionary advantages of neuromodulated plasticity in dynamic, reward-based scenarios. In Proceedings of the 11th international conference on artificial life (Alife XI), number LIS-CONF-2008-012, pages 569–576. MIT Press, 2008

  39. [39]

    Stanley and Risto Miikkulainen

    Kenneth O. Stanley and Risto Miikkulainen. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10:99–127, 2002

  40. [40]

    Evolving adaptive neural networks with and without adaptive synapses

    Kenneth O Stanley, Bobby D Bryant, and Risto Miikkulainen. Evolving adaptive neural networks with and without adaptive synapses. In Evolutionary Compu- tation, 2003. CEC’03. The 2003 Congress on , volume 4, pages 2557–2564. IEEE, 2003

  41. [41]

    Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

    Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Ken- neth O Stanley, and Jeff Clune. Deep neuroevolution: genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. arXiv preprint arXiv:1712.06567, 2017

  42. [42]

    Mujoco: A physics en- gine for model-based control, in: 2012 IEEE/RSJ International Con- ference on Intelligent Robots and Systems, IEEE

    E. Todorov, T. Erez, and Y. Tassa. Mujoco: A physics engine for model-based control. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 5026–5033, Oct 2012. doi: 10.1109/IROS.2012.6386109

  43. [43]

    A perspective view and survey of meta- learning

    Ricardo Vilalta and Youssef Drissi. A perspective view and survey of meta- learning. Artificial Intelligence Review, 18(2):77–95, 2002

  44. [44]

    Perspective: complex adaptations and the evolution of evolvability

    Günter P Wagner and Lee Altenberg. Perspective: complex adaptations and the evolution of evolvability. Evolution, 50(3):967–976, 1996

  45. [45]

    Learning to reinforcement learn

    Jane X Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, and Matt Botvinick. Learning to reinforcement learn. arXiv preprint arXiv:1611.05763, 2016

  46. [46]

    Natural evolution strategies, 2011

    Daan Wierstra, Tom Schaul, Tobias Glasmachers, Yi Sun, and JÃijrgen Schmid- huber. Natural evolution strategies, 2011

  47. [47]

    nested stochastic computation graphs,

    Bryan Wilder and Kenneth Stanley. Reconciling explanations for the evolution of evolvability. Adaptive Behavior, 23(3):171–179, 2015. GECCO ’19, July 13–17, 2019, Prague, Czech Republic Alexander Gajewski, Jeff Clune, Kenneth O. Stanley, and Joel Lehman 0 20 40 60 80 100 Generation 0 10 20 30 40 50Max Final Distance Standard MaxVar MaxEnt Figure S1: 2-D l...

  48. [48]

    expectation

    + Õ j f2(xi 2)L(xi 2;θ) L(xi 1;θ, x0). (18) While this may not seem like very much of an improvement at first, it is insightful to note how similar the forms of Equations 10 z f θ Ez [f ] Figure S9: Nested stochastic computation graph represent- ing Natural Evolution Strategies. z B (·) 2 θ Ez [(·) 2] Figure S10: Nested stochastic computation graph repres...