pith. machine review for the scientific record. sign in

arxiv: 2603.09172 · v5 · submitted 2026-03-10 · 🧮 math.CO · cs.AI· cs.CC

Recognition: no theorem link

Reinforced Generation of Combinatorial Structures: Ramsey Numbers

Authors on Pith no claims yet

Pith reviewed 2026-05-15 14:06 UTC · model grok-4.3

classification 🧮 math.CO cs.AIcs.CC MSC 05C55
keywords Ramsey numberslower boundscombinatorial searchLLM code generationgraph theoryextremal combinatoricscode mutation
0
0 comments X

The pith

An LLM-based code mutation agent produces improved lower bounds for nine Ramsey numbers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that AlphaEvolve, a single LLM-driven agent, can mutate code to discover new combinatorial objects that raise the known lower bounds on several classical Ramsey numbers. It reports concrete improvements: R(3,13) to 61, R(3,18) to 100, R(4,13) to 139, R(4,14) to 148, R(4,15) to 159, R(4,16) to 174, R(4,18) to 209, R(4,19) to 219, and R(4,20) to 237. The same agent recovers every known exact Ramsey number and matches the best published lower bounds in many additional cases. The work argues that virtually all existing Ramsey lower bounds come from bespoke computational searches, whereas one general meta-algorithm suffices for all the results presented here.

Core claim

AlphaEvolve evolves code that searches for graphs establishing new lower bounds on Ramsey numbers R(s,t) by producing explicit graphs on n vertices that contain neither a clique of size s nor an independent set of size t, where n is larger than all previously known constructions for the nine listed pairs.

What carries the argument

AlphaEvolve, an LLM-based code mutation agent that generates and iteratively improves search programs for Ramsey graphs.

If this is right

  • A single general-purpose agent can replace many hand-crafted search algorithms for Ramsey lower bounds.
  • The method recovers all known exact Ramsey numbers, showing it does not miss small cases where exhaustive verification is feasible.
  • It matches prior best lower bounds even in cases where earlier papers give no algorithmic details.
  • Virtually all Ramsey lower bounds remain computational, so the meta-algorithm approach scales to further pairs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same code-evolution loop could be applied to other extremal problems such as Turán numbers or Ramsey numbers in hypergraphs.
  • Independent verification of the largest new graphs would remove any remaining doubt about the correctness of the agent-generated code.
  • If the agent continues to improve bounds on additional pairs, it would suggest that LLM-driven search can systematically advance tables of Ramsey numbers.
  • The success on Ramsey problems indicates that code mutation may transfer to other combinatorial search domains where bespoke algorithms currently dominate.

Load-bearing premise

The graphs output by the mutated code are correctly verified to contain no clique or independent set of the forbidden sizes.

What would settle it

An independent program that reads one of the published adjacency lists, for example the 237-vertex graph claimed for R(4,20), and reports no K4 or independent set of size 20.

read the original abstract

We present improved lower bounds for nine classical Ramsey numbers: $\mathbf{R}(3, 13)$ is increased from $60$ to $61$, $\mathbf{R}(3, 18)$ from $99$ to $100$, $\mathbf{R}(4, 13)$ from $138$ to $139$, $\mathbf{R}(4, 14)$ from $147$ to $148$, $\mathbf{R}(4, 15)$ from $158$ to $159$, $\mathbf{R}(4, 16)$ from $170$ to $174$, $\mathbf{R}(4, 18)$ from $205$ to $209$, $\mathbf{R}(4, 19)$ from $213$ to $219$, and $\mathbf{R}(4, 20)$ from $234$ to $237$. These results were achieved using AlphaEvolve, an LLM-based code mutation agent. Beyond these new results, we successfully recovered lower bounds for all Ramsey numbers known to be exact, and matched the best known lower bounds across many other cases. These include bounds for which previous work does not detail the algorithms used. Virtually all known Ramsey lower bounds are derived computationally, with bespoke search algorithms each delivering a handful of results. AlphaEvolve is a single meta-algorithm yielding search algorithms for all of our results.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript claims improved lower bounds for nine Ramsey numbers—R(3,13) from 60 to 61, R(3,18) from 99 to 100, R(4,13) from 138 to 139, R(4,14) from 147 to 148, R(4,15) from 158 to 159, R(4,16) from 170 to 174, R(4,18) from 205 to 209, R(4,19) from 213 to 219, and R(4,20) from 234 to 237—obtained via AlphaEvolve, an LLM-based code-mutation agent that generates bespoke search algorithms. It further states that the same pipeline recovers all known exact Ramsey numbers and matches the best-known lower bounds in many other cases.

Significance. If independently verified, the nine new lower bounds would constitute concrete progress on classical Ramsey numbers, with the largest gains appearing for the R(4,k) family with k=16–20. The meta-algorithmic approach—producing multiple search procedures from a single LLM-driven framework rather than hand-crafted code for each instance—could be of broader interest in computational combinatorics if the verification pipeline is made reproducible.

major comments (3)
  1. [Results section (new bounds)] The nine new lower-bound claims rest on explicit graphs whose existence is asserted but not exhibited: no adjacency matrices, edge lists, or SAT encodings are supplied for any of the improved instances (e.g., the 174-vertex graph witnessing R(4,16) > 173). Without these witnesses or the exact verification code that certifies ω(G) < 4 and α(G) < 16, the central computational results cannot be checked by readers or referees.
  2. [Experimental validation] The manuscript reports that the AlphaEvolve pipeline recovered all previously known exact Ramsey numbers, yet provides no quantitative comparison (e.g., success rate, runtime, or failure modes) between the recovered instances and the nine new, larger instances. This leaves open whether the verification procedure scales reliably beyond the range where ground truth is already known.
  3. [Method (AlphaEvolve pipeline)] Because the search and verification routines themselves are outputs of LLM code mutation, any undetected error in the generated Bron-Kerbosch implementation, bitset operations, or clique/independence-number checks would silently validate an invalid witness. The paper does not describe an independent, human-auditable verification step or release the mutated source code.
minor comments (2)
  1. [Abstract] The abstract and introduction use boldface for the new bounds; this notation is unnecessary and should be removed for consistency with standard mathematical typesetting.
  2. [Tables] Table captions should explicitly state the source of each listed bound (literature reference or new result) rather than relying on context.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which highlight important aspects of reproducibility and verifiability. We address each major comment below and will revise the manuscript accordingly to strengthen these elements.

read point-by-point responses
  1. Referee: The nine new lower-bound claims rest on explicit graphs whose existence is asserted but not exhibited: no adjacency matrices, edge lists, or SAT encodings are supplied for any of the improved instances (e.g., the 174-vertex graph witnessing R(4,16) > 173). Without these witnesses or the exact verification code that certifies ω(G) < 4 and α(G) < 16, the central computational results cannot be checked by readers or referees.

    Authors: We agree that explicit witnesses are necessary for independent verification. In the revision we will supply adjacency-list representations for all nine new graphs as supplementary data files. We will also include a standalone, human-written Python verification script (using standard algorithms such as a reference Bron-Kerbosch implementation) that readers can run to confirm ω(G) < 4 and α(G) < 16 for each witness. This directly resolves the reproducibility concern. revision: yes

  2. Referee: The manuscript reports that the AlphaEvolve pipeline recovered all previously known exact Ramsey numbers, yet provides no quantitative comparison (e.g., success rate, runtime, or failure modes) between the recovered instances and the nine new, larger instances. This leaves open whether the verification procedure scales reliably beyond the range where ground truth is already known.

    Authors: We accept that a quantitative comparison would strengthen the experimental section. The revised manuscript will add a table reporting success rates, average runtimes, number of mutations attempted, and resource usage for both the recovered exact Ramsey numbers and the nine new bounds. This comparison will show that the same pipeline scales reliably to the larger instances, with the new results obtained under comparable verification protocols. revision: yes

  3. Referee: Because the search and verification routines themselves are outputs of LLM code mutation, any undetected error in the generated Bron-Kerbosch implementation, bitset operations, or clique/independence-number checks would silently validate an invalid witness. The paper does not describe an independent, human-auditable verification step or release the mutated source code.

    Authors: We will expand the Methods section to describe an independent, human-auditable verification protocol that relies on a separately written, standard implementation of clique and independence-number algorithms (distinct from any LLM-generated code). In addition, we will release the final mutated source codes together with the verification scripts in a public GitHub repository, with a link provided in the revised paper. This ensures full auditability. revision: yes

Circularity Check

0 steps flagged

No circularity: results are direct computational outputs from search, not derived quantities

full rationale

The manuscript reports new Ramsey lower bounds obtained by running an LLM-mutated search program that enumerates candidate graphs and checks the clique and independence-number conditions. No equations, fitted parameters, or self-referential definitions appear; the claimed bounds are simply the largest n for which the generated code returned a valid witness graph. Recovery of previously known exact values is presented only as a sanity check on the pipeline, not as part of the derivation of the new bounds. No self-citation chain, ansatz smuggling, or renaming of known results is used to support the central claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption that LLM-mutated code produces valid Ramsey graphs; no free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption The output of the evolved search code correctly certifies the absence of forbidden cliques and independent sets.
    This assumption is required for any claimed lower bound to be valid.

pith-pipeline@v0.9.0 · 5546 in / 1315 out tokens · 92687 ms · 2026-05-15T14:06:54.019949+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. A revision of Litvak's conjecture on Gaussian minima and a volumetric zone conjecture

    math.PR 2026-05 unverdicted novelty 7.0

    Litvak's conjecture on minimizing moments of Gaussian minima is disproved by a cosine-based correlation matrix for small n and p, with a new conjecture proposed that this matrix is the general minimizer, supported con...

  2. $k$-server-bench: Automating Potential Discovery for the $k$-Server Conjecture

    cs.MS 2026-04 accept novelty 7.0

    k-server-bench formulates potential-function discovery for the k-server conjecture as a code-based inequality-satisfaction task; current agents fully solve the resolved k=3 case and reduce violations on the open k=4 case.

  3. pAI/MSc: ML Theory Research with Humans on the Loop

    cs.AI 2026-04 unverdicted novelty 5.0

    pAI/MSc is a customizable multi-agent system that reduces human steering by orders of magnitude when turning a hypothesis into a literature-grounded, mathematically established, experimentally supported manuscript dra...