Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning

Edoardo Conti; Felipe Petroski Such; Jeff Clune; Joel Lehman; Kenneth O. Stanley; Vashisht Madhavan

arxiv: 1712.06567 · v3 · pith:WSQ4ZOQRnew · submitted 2017-12-18 · 💻 cs.NE · cs.LG

Felipe Petroski Such, Vashisht Madhavan, Edoardo Conti, Joel Lehman, Kenneth O. Stanley, Jeff Clune

keywords: deep, algorithms, networks, algorithm, gradient, learning, neural, atari

Deep artificial neural networks (DNNs) are typically trained via gradient-based learning algorithms, namely backpropagation. Evolution strategies (ES) can rival backprop-based algorithms such as Q-learning and policy gradients on challenging deep reinforcement learning (RL) problems. However, ES can be considered a gradient-based algorithm because it performs stochastic gradient descent via an operation similar to a finite-difference approximation of the gradient. That raises the question of whether non-gradient-based evolutionary algorithms can work at DNN scales. Here we demonstrate they can: we evolve the weights of a DNN with a simple, gradient-free, population-based genetic algorithm (GA) and it performs well on hard deep RL problems, including Atari and humanoid locomotion. The Deep GA successfully evolves networks with over four million free parameters, the largest neural networks ever evolved with a traditional evolutionary algorithm. These results (1) expand our sense of the scale at which GAs can operate, (2) suggest intriguingly that in some cases following the gradient is not the best choice for optimizing performance, and (3) make immediately available the multitude of neuroevolution techniques that improve performance. We demonstrate the latter by showing that combining DNNs with novelty search, which encourages exploration on tasks with deceptive or sparse reward functions, can solve a high-dimensional problem on which reward-maximizing algorithms (e.g., DQN, A3C, ES, and the GA) fail. Additionally, the Deep GA is faster than ES, A3C, and DQN (it can train Atari in ~4 hours on one desktop or ~1 hour distributed on 720 cores), and enables a state-of-the-art, up to 10,000-fold compact encoding technique.
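
To make the abstract's "simple, gradient-free, population-based" GA and its compact encoding more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The specific choices (truncation selection, one unchanged elite, Gaussian mutation without crossover, and storing each individual as the list of random seeds that produced it) are assumptions made for illustration, and the names `decode`, `fitness`, and `evolve` are hypothetical. The toy `fitness` function stands in for an RL episode return.

```python
import random


def decode(seeds, n_params, sigma=0.1):
    """Rebuild a parameter vector from its seed list (assumed compact encoding)."""
    rng = random.Random(seeds[0])
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_params)]  # first seed: initialisation
    for s in seeds[1:]:  # every later seed replays one Gaussian mutation
        rng = random.Random(s)
        theta = [w + sigma * rng.gauss(0.0, 1.0) for w in theta]
    return theta


def fitness(theta):
    """Toy stand-in for an episode return: higher is better, maximum 0 at all-ones."""
    return -sum((w - 1.0) ** 2 for w in theta)


def evolve(n_params=10, pop_size=30, n_parents=5, generations=40, sigma=0.1, seed=0):
    """Gradient-free GA: rank by fitness, keep the top parents, mutate copies."""
    master = random.Random(seed)
    population = [[master.randrange(2**31)] for _ in range(pop_size)]
    history = []
    best = population[0]
    for _ in range(generations):
        # Evaluate every genome and rank from best to worst.
        scored = sorted(
            population,
            key=lambda g: fitness(decode(g, n_params, sigma)),
            reverse=True,
        )
        best = scored[0]
        history.append(fitness(decode(best, n_params, sigma)))
        parents = scored[:n_parents]  # truncation selection (assumption)
        # Elitism: carry the best genome over unchanged, then fill the rest
        # with parents extended by one fresh mutation seed.
        population = [best] + [
            master.choice(parents) + [master.randrange(2**31)]
            for _ in range(pop_size - 1)
        ]
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("seeds stored:", len(best))
    print("fitness, first vs. last generation:", history[0], history[-1])
```

In this sketch a genome grows by one integer per generation regardless of how many parameters the network has, which is the intuition behind a seed-based encoding being far smaller than the weight vector itself. The cost is that decoding replays every mutation, so reconstruction time grows with the number of generations.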

Forward citations

Cited by 8 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Parameter-Efficient Neuroevolution for Diverse LLM Generation: Quality-Diversity Optimization via Prompt Embedding Evolution
cs.NE 2026-05 unverdicted novelty 7.0

QD-LLM evolves prompt embeddings via neuroevolution in a quality-diversity framework, delivering 46% higher coverage and 41% higher QD-score than prior methods on coding and writing benchmarks.

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning
cs.NE 2026-05 unverdicted novelty 6.0

Evolutionary merging with a 14-dimensional genome and MRI-Trust Fusion produces models that outperform their trained parents on reasoning benchmarks without any gradient updates.

Differentiable Evolutionary Reinforcement Learning
cs.AI 2025-12 unverdicted novelty 6.0

DERL is a differentiable bi-level method that evolves optimal reward structures for RL policies by composing atomic primitives and using meta-gradients from validation performance.

Learning Evolution via Optimization Knowledge Adaptation
cs.NE 2025-01 unverdicted novelty 6.0

OKAEM is a unified learnable evolutionary framework that uses attention-based operators for pre-training on prior knowledge and real-time self-tuning adaptation.

Evolvability ES: Scalable and Direct Optimization of Evolvability
cs.NE 2019-07 unverdicted novelty 6.0

Evolvability ES is an evolutionary strategy variant that directly optimizes for evolvability by maximizing behavioral diversity under mutations, tested on 2D/3D locomotion tasks and shown competitive with MAML.

An Evolutionary Algorithm of Linear complexity: Application to Training of Deep Neural Networks
cs.NE 2019-07 unverdicted novelty 5.0

Introduces an O(n) evolutionary algorithm claimed to deliver competitive performance for training RBMs with over one million parameters versus CMA-ES and contrastive divergence.

NeuroTrajectory: A Neuroevolutionary Approach to Local State Trajectory Learning for Autonomous Vehicles
cs.RO 2019-06 unverdicted novelty 5.0

NeuroTrajectory is a neuroevolutionary method that trains deep neural networks via genetic algorithms to estimate multi-objective optimal state trajectories over a finite horizon for autonomous vehicle motion planning.

Lark: Biologically Inspired Neuroevolution for Multi-Stakeholder LLM Agents
cs.MA 2025-10 unverdicted novelty 4.0

Lark is a biologically inspired neuroevolution framework for multi-stakeholder LLM agents that iteratively generates, refines, and selects strategies using plasticity, duplication/maturation, influence-weighted Borda ...