Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning

Amal Seddas; Amin Mansouri; Anja Surina; Caglar Gulcehre; Emmanuel Abbe; Lars Quaedvlieg; Maryna Viazovska

arxiv: 2504.05108 · v4 · pith:PJWOFGZQnew · submitted 2025-04-07 · 💻 cs.AI · cs.LG · cs.NE

Anja Surina, Amin Mansouri, Lars Quaedvlieg, Amal Seddas, Maryna Viazovska, Emmanuel Abbe, Caglar Gulcehre

classification 💻 cs.AI cs.LG cs.NE

keywords evolutionary, search, algorithms, discovery, algorithm, exploration, learning, llms

Discovering efficient algorithms for solving complex problems has been an outstanding challenge in mathematics and computer science, requiring substantial human expertise over the years. Recent advancements in evolutionary search with large language models (LLMs) have shown promise in accelerating the discovery of algorithms across various domains, particularly in mathematics and optimization. However, existing approaches treat the LLM as a static generator, missing the opportunity to update the model with the signal obtained from evolutionary exploration. In this work, we propose to augment LLM-based evolutionary search by continuously refining the search operator - the LLM - through reinforcement learning (RL) fine-tuning. Our method leverages evolutionary search as an exploration strategy to discover improved algorithms, while RL optimizes the LLM policy based on these discoveries. Our experiments on combinatorial optimization tasks demonstrate that integrating RL with evolutionary search accelerates the discovery of superior algorithms, showcasing the potential of RL-enhanced evolutionary strategies for algorithm design.
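The loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the RL algorithm, the tasks, or the prompting scheme. The sketch assumes a small categorical policy over three mutation step sizes stands in for the LLM, a hypothetical quadratic objective stands in for the algorithm score, and a REINFORCE-style update with a running baseline stands in for RL fine-tuning. All names (`score`, `STEP_SIZES`, `evolve_with_policy_updates`) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score(x):
    # Hypothetical objective (assumption): higher is better, maximum 0 at (3, 3, 3).
    return -sum((xi - 3.0) ** 2 for xi in x)


# Hypothetical search operators (assumption): Gaussian mutations of different sizes.
# In the paper's setting the operator is an LLM that proposes new programs.
STEP_SIZES = [0.01, 0.3, 5.0]


def evolve_with_policy_updates(iterations=300, pop_size=8, lr=0.1, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(pop_size)]
    initial_best = max(score(x) for x in population)
    logits = [0.0] * len(STEP_SIZES)  # policy parameters, updated during search
    baseline = 0.0                    # running mean reward, reduces variance

    for _ in range(iterations):
        # Exploration: the evolutionary step picks a parent and an operator.
        probs = softmax(logits)
        parent = rng.choice(population)
        action = rng.choices(range(len(STEP_SIZES)), weights=probs)[0]
        child = [xi + rng.gauss(0.0, STEP_SIZES[action]) for xi in parent]

        # Reward: clipped improvement of the child over its parent.
        reward = max(-1.0, min(1.0, score(child) - score(parent)))
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)

        # Policy update: REINFORCE gradient of log pi(action) w.r.t. the logits.
        for j in range(len(logits)):
            grad = (1.0 if j == action else 0.0) - probs[j]
            logits[j] += lr * advantage * grad

        # Selection: keep only the best pop_size candidates (elitist truncation).
        population.append(child)
        population.sort(key=score, reverse=True)
        population = population[:pop_size]

    return initial_best, score(population[0]), softmax(logits)


if __name__ == "__main__":
    start, end, policy = evolve_with_policy_updates()
    print("best score before:", start)
    print("best score after:", end)
    print("operator probabilities:", policy)
```

Because selection is elitist, the best score never decreases. The operator probabilities shift toward whichever step sizes produced improvements, which mirrors, in miniature, the idea of updating the generator from the signal found during search rather than keeping it static.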

Forward citations

Cited by 10 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Budget-Efficient Automatic Algorithm Design via Code Graph
cs.AI 2026-05 unverdicted novelty 7.0

A code-graph and correction-based LLM search framework outperforms full-algorithm generation at equal token budgets on three combinatorial optimization problems.

AHD Agent: Agentic Reinforcement Learning for Automatic Heuristic Design
cs.AI 2026-05 unverdicted novelty 7.0

AHD Agent trains a 4B-parameter LLM via agentic RL to actively use tools for automatic heuristic design, matching or exceeding larger baselines across eight domains with fewer evaluations.

Back to the Beginning of Heuristic Design: Bridging Code and Knowledge with LLMs
cs.AI 2026-05 unverdicted novelty 7.0

A knowledge-first approach to LLM-driven automatic heuristic design in combinatorial optimization yields better discovery efficiency, transfer, and generalization than code-centric baselines by formalizing a distortio...

$k$-server-bench: Automating Potential Discovery for the $k$-Server Conjecture
cs.MS 2026-04 accept novelty 7.0

k-server-bench formulates potential-function discovery for the k-server conjecture as a code-based inequality-satisfaction task; current agents fully solve the resolved k=3 case and reduce violations on the open k=4 case.

Learning to Discover at Test Time
cs.LG 2026-01 unverdicted novelty 7.0

TTT-Discover applies test-time RL to set new state-of-the-art results on math inequalities, GPU kernels, algorithm contests, and single-cell denoising using an open model and public code.

AlphaEvolve: A coding agent for scientific and algorithmic discovery
cs.AI 2025-06 unverdicted novelty 7.0

AlphaEvolve is an LLM-orchestrated evolutionary coding agent that discovered a 4x4 complex matrix multiplication algorithm using 48 scalar multiplications, the first improvement over Strassen's algorithm in 56 years, ...

MLS-Bench: A Holistic and Rigorous Assessment of AI Systems on Building Better AI
cs.LG 2026-05 unverdicted novelty 6.0

MLS-Bench shows that current AI agents fall short of reliably inventing generalizable ML methods, with engineering tuning easier than genuine invention.

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning
cs.AI 2026-04 unverdicted novelty 6.0

GrandCode is the first AI system to consistently beat all human participants and place first in live Codeforces competitive programming contests.

RL4RLA: Teaching ML to Discover Randomized Linear Algebra Algorithms Through Curriculum Design and Graph-Based Search
cs.LG 2026-05 unverdicted novelty 5.0

RL4RLA is a reinforcement learning framework that discovers interpretable symbolic randomized linear algebra algorithms by combining curriculum learning and graph-based search to overcome sparse rewards and large sear...

Lark: Biologically Inspired Neuroevolution for Multi-Stakeholder LLM Agents
cs.MA 2025-10 unverdicted novelty 4.0

Lark is a biologically inspired neuroevolution framework for multi-stakeholder LLM agents that iteratively generates, refines, and selects strategies using plasticity, duplication/maturation, influence-weighted Borda ...