super hub Canonical reference

AlphaEvolve: A coding agent for scientific and algorithmic discovery

Canonical reference. 74% of citing Pith papers cite this work as background.

179 Pith papers citing it
Background 74% of classified citations
abstract

In this white paper, we present AlphaEvolve, an evolutionary coding agent that substantially enhances capabilities of state-of-the-art LLMs on highly challenging tasks such as tackling open scientific problems or optimizing critical pieces of computational infrastructure. AlphaEvolve orchestrates an autonomous pipeline of LLMs, whose task is to improve an algorithm by making direct changes to the code. Using an evolutionary approach, continuously receiving feedback from one or more evaluators, AlphaEvolve iteratively improves the algorithm, potentially leading to new scientific and practical discoveries. We demonstrate the broad applicability of this approach by applying it to a number of important computational problems. When applied to optimizing critical components of large-scale computational stacks at Google, AlphaEvolve developed a more efficient scheduling algorithm for data centers, found a functionally equivalent simplification in the circuit design of hardware accelerators, and accelerated the training of the LLM underpinning AlphaEvolve itself. Furthermore, AlphaEvolve discovered novel, provably correct algorithms that surpass state-of-the-art solutions on a spectrum of problems in mathematics and computer science, significantly expanding the scope of prior automated discovery methods (Romera-Paredes et al., 2023). Notably, AlphaEvolve developed a search algorithm that found a procedure to multiply two $4 \times 4$ complex-valued matrices using $48$ scalar multiplications; offering the first improvement, after 56 years, over Strassen's algorithm in this setting. We believe AlphaEvolve and coding agents like it can have a significant impact in improving solutions of problems across many areas of science and computation.
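For context on the 48-multiplication result: Strassen's scheme multiplies two $2 \times 2$ matrices with 7 multiplications instead of 8, and applying it recursively to a $4 \times 4$ matrix viewed as a $2 \times 2$ matrix of $2 \times 2$ blocks costs $7^2 = 49$ scalar multiplications, against 64 for the schoolbook method. The sketch below only counts multiplications under those two schemes; the function names are illustrative and are not taken from the paper.

```python
def schoolbook_mults(n: int) -> int:
    """Scalar multiplications used by the schoolbook n x n matrix product."""
    return n ** 3


def strassen_mults(n: int) -> int:
    """Scalar multiplications used by Strassen's scheme applied recursively.

    Assumes n is a power of two, so the matrix splits evenly into
    2 x 2 blocks at every level of the recursion.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a positive power of two")
    if n == 1:
        return 1  # multiplying two scalars
    # Each level replaces 8 block products with 7.
    return 7 * strassen_mults(n // 2)


if __name__ == "__main__":
    print(schoolbook_mults(4), strassen_mults(4))
```

For $n = 4$ the two counts are 64 and 49, so the 48 reported in the abstract is one fewer than recursive Strassen in the complex-valued setting.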

citation-role summary

background 33 · baseline 3 · method 3 · dataset 2 · other 1

representative citing papers

LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling

cs.CL · 2026-05-08 · conditional · novelty 8.0 · 2 refs

AutoTTS discovers width-depth test-time scaling controllers through agentic search in a pre-collected trajectory environment, yielding better accuracy-cost tradeoffs than hand-designed baselines on math reasoning tasks at low cost.

VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?

cs.AI · 2026-05-07 · unverdicted · novelty 8.0

VibeServe demonstrates that AI agents can synthesize bespoke LLM serving systems end-to-end, remaining competitive with vLLM in standard settings while outperforming it in six non-standard scenarios involving unusual models, workloads, or hardware.

Prism: Symbolic Superoptimization of Tensor Programs

cs.PL · 2026-04-16 · unverdicted · novelty 8.0

Prism is the first symbolic superoptimizer for tensor programs that uses sGraph for compact representation of program families, two-level search, e-graph equivalence checking, and auto-tuning to achieve up to 2.2x speedup over prior superoptimizers on LLM workloads.

Forecasting Scientific Progress with Artificial Intelligence

cs.AI · 2026-05-21 · unverdicted · novelty 7.0

Introduces the CUSP benchmark across 4760 events and finds frontier AI models can pick plausible directions but fail to predict whether or when scientific advances will occur, with performance varying by domain and insensitive to training cutoffs.

What Do Evolutionary Coding Agents Evolve?

cs.NE · 2026-05-19 · unverdicted · novelty 7.0

Evolutionary coding agents achieve most benchmark gains through a small subset of edit types and by cycling previously deleted code lines rather than developing new algorithmic structures.

Property-Guided LLM Program Synthesis for Planning

cs.AI · 2026-05-15 · unverdicted · novelty 7.0

Property-guided LLM program synthesis with counterexample feedback creates direct heuristics for PDDL planning domains that require far fewer generations and less evaluation cost than score-based baselines.

citing papers explorer

Showing 17 of 17 citing papers after filters.

  • VibeServe: Can AI Agents Build Bespoke LLM Serving Systems? cs.AI · 2026-05-07 · unverdicted · none · ref 54 · internal anchor

    VibeServe demonstrates that AI agents can synthesize bespoke LLM serving systems end-to-end, remaining competitive with vLLM in standard settings while outperforming it in six non-standard scenarios involving unusual models, workloads, or hardware.

  • Harnessing Agentic Evolution cs.AI · 2026-05-13 · unverdicted · none · ref 19 · internal anchor

    AEvo introduces a meta-agent that edits the evolution procedure or agent context based on accumulated state, outperforming baselines by 26% relative improvement on agentic benchmarks and achieving SOTA on open-ended tasks.

  • Budget-Efficient Automatic Algorithm Design via Code Graph cs.AI · 2026-05-11 · unverdicted · none · ref 2 · internal anchor

    A code-graph and correction-based LLM search framework outperforms full-algorithm generation at equal token budgets on three combinatorial optimization problems.

  • AHD Agent: Agentic Reinforcement Learning for Automatic Heuristic Design cs.AI · 2026-05-09 · unverdicted · none · ref 8 · internal anchor

    AHD Agent trains a 4B-parameter LLM via agentic RL to actively use tools for automatic heuristic design, matching or exceeding larger baselines across eight domains with fewer evaluations.

  • Weblica: Scalable and Reproducible Training Environments for Visual Web Agents cs.AI · 2026-05-07 · unverdicted · none · ref 23 · internal anchor

    Weblica scales RL training for visual web agents by building thousands of reproducible environments through HTTP caching for stable replays and LLM synthesis from real sites, yielding an 8B model that beats similar open baselines on navigation benchmarks.

  • Agentic-imodels: Evolving agentic interpretability tools via autoresearch cs.AI · 2026-05-05 · unverdicted · none · ref 51 · internal anchor

    Agentic-imodels evolves scikit-learn regressors via an autoresearch loop to jointly boost predictive performance and LLM-simulatability, improving downstream agentic data science tasks by up to 73% on the BLADE benchmark.

  • Meta-Harness: End-to-End Optimization of Model Harnesses cs.AI · 2026-03-30 · unverdicted · none · ref 37 · internal anchor

    Meta-Harness discovers improved harness code for LLMs via agentic search over prior execution traces, yielding 7.7-point gains on text classification with 4x fewer tokens and 4.7-point gains on math reasoning across held-out models.

  • Agentic Architect: An Agentic AI Framework for Architecture Design Exploration and Optimization cs.AI · 2026-04-28 · accept · none · ref 34 · internal anchor

    An LLM-driven agentic system evolves microarchitectural policies for cache replacement, data prefetching, and branch prediction, producing designs that match or exceed prior state-of-the-art in IPC on standard benchmarks.

  • LLM-Guided Strategy Synthesis for Scalable Equality Saturation cs.AI · 2026-04-19 · unverdicted · none · ref 28 · internal anchor

    EggMind automates EqSat strategy synthesis via LLMs and EqSatL, cutting final cost 45.1% and peak RAM 69.1% versus full equality saturation on vectorization benchmarks while transferring to tensor compilers.

  • EGL-SCA: Structural Credit Assignment for Co-Evolving Instructions and Tools in Graph Reasoning Agents cs.AI · 2026-05-11 · unverdicted · none · ref 1 · internal anchor

    EGL-SCA co-evolves instructions and tools via structural credit assignment in graph reasoning agents and reports 92% average success on four benchmarks.

  • pAI/MSc: ML Theory Research with Humans on the Loop cs.AI · 2026-04-22 · unverdicted · none · ref 3 · 2 links · internal anchor

    pAI/MSc is a customizable multi-agent system that reduces human steering by orders of magnitude when turning a hypothesis into a literature-grounded, mathematically established, experimentally supported manuscript draft in ML theory.

  • AutoResearch AI: Towards AI-Powered Research Automation for Scientific Discovery cs.AI · 2026-05-22 · unverdicted · none · ref 86 · internal anchor

    A survey organizing AI-powered research automation into five workflow stages, defining AutoResearch and Vibe Research, and proposing five evaluation dimensions while noting domain-conditioned limits on autonomy.

  • AI for Auto-Research: Roadmap & User Guide cs.AI · 2026-05-18 · unverdicted · none · ref 140 · internal anchor

    The paper delivers a stage-by-stage roadmap for AI in research, showing reliable assistance in retrieval and tool tasks but fragility in novelty and judgment, advocating human-governed collaboration.

  • Artificial Intelligence and the Structure of Mathematics cs.AI · 2026-04-07 · unverdicted · none · ref 65 · internal anchor

    AI agents exploring Platonic mathematical structures via proof hypergraphs may reveal the overall architecture of formal mathematics and what makes parts of it human-accessible.

  • Agentic Reasoning for Large Language Models cs.AI · 2026-01-18 · unverdicted · none · ref 70 · internal anchor

    The survey structures agentic reasoning for LLMs into foundational, self-evolving, and collective multi-agent layers while distinguishing in-context orchestration from post-training optimization and reviewing applications across domains.

  • Shepherd: Enabling Programmable Meta-Agents via Reversible Agentic Execution Traces cs.AI · 2026-05-11 · unreviewed · ref 26 · internal anchor
  • FitText: Evolving Agent Tool Ecologies via Memetic Retrieval cs.AI · 2026-05-04 · unreviewed · ref 28 · internal anchor