Recognition: no theorem link
k-server-bench: Automating Potential Discovery for the k-Server Conjecture
Pith reviewed 2026-05-10 17:45 UTC · model grok-4.3
The pith
This paper introduces a benchmark that casts the k-server conjecture as a search for a potential function satisfying a graph-structured system of linear inequalities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper's central contribution is the definition of k-server-bench, a code-based challenge in which an automated agent must produce a potential function that satisfies every inequality in a fixed, graph-structured linear system derived from the k-server conjecture. The evaluation is sound but incomplete: any violated inequality immediately refutes the candidate, while satisfying the entire system does not constitute a proof yet still yields strong evidence and, to date, no known potential passes the open k=4 circle formulation. Experiments confirm that agentic methods solve nontrivial k=3 instances completely and reduce violations on k=4 relative to prior potentials without resolving the k
What carries the argument
The graph-structured system of linear inequalities that any candidate potential function must satisfy for the k-server conjecture.
If this is right
- A candidate that passes the full k=4 system would constitute strong evidence for the conjecture and could be paired with a proof to yield a substantial theoretical result.
- The benchmark supplies concrete tooling that lets k-server researchers test new potential hypotheses systematically.
- Current methods already solve the entire k=3 regime and measurably improve on known potentials for k=4.
- The task separates naive random baselines from sophisticated agents more sharply than earlier open-ended discovery benchmarks.
Where Pith is reading between the lines
- The same inequality-system approach could be adapted to other open problems in online algorithms whose proofs rely on potential functions.
- Integrating the benchmark with symbolic or theorem-proving tools might convert a passing candidate into a complete proof.
- The formulation may surface previously unnoticed structural constraints on possible potentials for the k-server problem.
Load-bearing premise
The chosen graph-structured linear inequalities faithfully encode the necessary conditions that every valid potential function for the k-server conjecture must obey.
What would settle it
An automated agent or exhaustive search that produces a potential satisfying every inequality in the k=4 circle formulation would confirm the benchmark can surface useful candidates; conversely, a proof that no such function exists under the given inequalities would show the formulation is too strict.
Figures
read the original abstract
We introduce a code-based challenge for automated, open-ended mathematical discovery based on the $k$-server conjecture, a central open problem in competitive analysis. The task is to discover a potential function satisfying a large graph-structured system of simple linear inequalities. The resulting evaluation procedure is sound but incomplete: any violated inequality definitively refutes a candidate, whereas satisfying all inequalities does not by itself constitute a proof of the corresponding conjecture's special case. Nevertheless, a candidate that passes all constraints would be strong evidence toward a valid proof and, to the best of our knowledge, no currently known potential achieves this under our formulation in the open $k=4$ circle case. As such, a successful candidate would already be an interesting contribution to the $k$-server conjecture, and could become a substantial theoretical result when paired with a full proof. Experiments on the resolved $k=3$ regime show that current agentic methods can solve nontrivial instances, and in the open $k=4$ regime they reduce the number of violations relative to existing potentials without fully resolving the task. Taken together, these results suggest that the task is challenging but plausibly within reach of current methods. Beyond its relevance to the $k$-server community, where the developed tooling enables researchers to test new hypotheses and potentially improve on the current record, the task also serves as a useful \emph{benchmark} for developing code-based discovery agents. In particular, our $k=3$ results show that it mitigates important limitations of existing open-ended code-based benchmarks, including early saturation and the weak separation between naive random baselines and more sophisticated methods.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces k-server-bench, a code-based benchmark for automated discovery of potential functions satisfying a graph-structured system of linear inequalities derived from the k-server conjecture. The evaluation procedure is sound (any violation refutes a candidate) but incomplete (satisfaction provides evidence but not a proof). Experiments demonstrate that agentic methods recover known solutions on resolved k=3 instances and reduce violations relative to existing potentials on the open k=4 circle case, where no known potential satisfies the formulation. The work positions the benchmark as useful both for advancing the k-server conjecture and for evaluating open-ended code-based discovery agents, addressing limitations such as early saturation in prior benchmarks.
Significance. If the results hold, the benchmark supplies a concrete, falsifiable tool that enables systematic testing of candidate potentials for the k-server problem, with a passing candidate constituting strong evidence toward a proof in the open k=4 case. The k=3 experiments confirm recovery of known solutions and demonstrate clear separation between naive and sophisticated methods on nontrivial instances, mitigating early saturation issues in existing discovery benchmarks. The absence of free parameters or self-referential fitting in the inequality system strengthens the claim that the benchmark faithfully captures necessary conditions from the literature.
minor comments (2)
- [§3.2] §3.2: the precise construction of the graph for the k=4 circle case (number of nodes, edge weights) could be stated explicitly with a small example to aid reproducibility without requiring inspection of the released code.
- [Table 2] Table 2: the column reporting number of violations for agent runs on k=4 would benefit from an additional row or note indicating the best-known hand-crafted potential for direct comparison.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the work and for recommending acceptance. No major comments were raised, so we have no specific points requiring response or revision.
Circularity Check
No significant circularity: benchmark defined from external literature
full rationale
The paper defines a sound-but-incomplete benchmark consisting of graph-structured linear inequalities drawn from the established k-server conjecture literature. No parameter is fitted to data and then renamed as a prediction; the k=3 results rest on the independently resolved case rather than any self-derived equation; and the central claim is limited to the benchmark's utility for agentic discovery without asserting a proof of the open k=4 case. No load-bearing step reduces by construction to the paper's own inputs or self-citations.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
- [1]
-
[2]
On the competitive ratio of the work function algorithm for the k-server problem
Yair Bartal and Elias Koutsoupias. “On the competitive ratio of the work function algorithm for the k-server problem”. In:Theoretical computer science324.2-3 (2004), pp. 337–345
2004
-
[3]
The 3-server problem in the plane
Wolfgang W Bein, Marek Chrobak, and Lawrence L Larmore. “The 3-server problem in the plane”. In:Theoretical Computer Science289.1 (2002), pp. 335–354
2002
-
[4]
Michael P. Brenner, Vincent Cohen-Addad, and David Woodruff.Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery. 2026. arXiv:2603.04735 [cs.AI].url:https://arxiv.org/abs/2603.04735
- [5]
- [6]
- [7]
- [8]
- [9]
-
[10]
An optimal on-line algorithm for k servers on trees
Marek Chrobak and Lawrence L Larmore. “An optimal on-line algorithm for k servers on trees”. In:SIAM Journal on Computing20.1 (1991), pp. 144–148
1991
-
[11]
New results on server problems
Marek Chrobak et al. “New results on server problems”. In:SIAM Journal on Discrete Mathematics4.2 (1991), pp. 172–181
1991
- [12]
-
[13]
Advancing mathematics by guiding human intuition with AI
A. Davies et al. “Advancing mathematics by guiding human intuition with AI”. In: Nature600 (2021), pp. 70–74
2021
- [14]
-
[15]
Epoch AI
Epoch AI.FrontierMath Open Problems. Epoch AI. 2026.url:https://epoch.ai/ frontiermath/open-problems(visited on 02/13/2026)
2026
- [16]
- [17]
-
[18]
Competitive k-server algorithms
Amos Fiat, Yuval Rabani, and Yiftach Ravid. “Competitive k-server algorithms”. In: Journal of Computer and System Sciences48.3 (1994), pp. 410–428
1994
- [19]
-
[20]
Random Baselines for Simple Code Problems are Competitive with Code Evolution
Yonatan Gideoni et al. “Random Baselines for Simple Code Problems are Competitive with Code Evolution”. In:NeurIPS 2025 Fourth Workshop on Deep Learning for Code. 2025.url:https://openreview.net/forum?id=rbVIpbmbTc
2025
- [21]
- [22]
- [23]
- [24]
-
[25]
Thek-server problem
Elias Koutsoupias. “Thek-server problem”. en. In:Comput. Sci. Rev.3.2 (May 2009), pp. 105–118
2009
-
[26]
On thek-server conjecture
Elias Koutsoupias and Christos H Papadimitriou. “On thek-server conjecture”. en. In: J. ACM42.5 (Sept. 1995), pp. 971–983
1995
-
[27]
Guillaume Lample and Fran¸ cois Charton.Deep Learning for Symbolic Mathematics
- [28]
- [29]
-
[30]
Chris Lu et al.The AI Scientist: Towards Fully Automated Open-Ended Scientific Dis- covery. 2024. arXiv:2408.06292 [cs.AI].url:https://arxiv.org/abs/2408.06292
work page internal anchor Pith review arXiv 2024
-
[31]
Mark S Manasse, Lyle A McGeoch, and Daniel D Sleator. “Competitive algorithms for server problems”. In:Journal of Algorithms11.2 (1990), pp. 208–230.issn: 0196-6774. doi:https : / / doi . org / 10 . 1016 / 0196 - 6774(90 ) 90003 - W.url:https : / / www . sciencedirect.com/science/article/pii/019667749090003W
- [32]
-
[33]
Ansh Nagda, Prabhakar Raghavan, and Abhradeep Thakurta.Reinforced Generation of Combinatorial Structures: Ramsey Numbers. 2026. arXiv:2603.09172 [math.CO]. url:https://arxiv.org/abs/2603.09172
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[34]
Alexander Novikov et al.AlphaEvolve: A coding agent for scientific and algorithmic discovery. 2025. arXiv:2506.13131 [cs.AI].url:https://arxiv.org/abs/2506. 13131
work page internal anchor Pith review arXiv 2025
- [35]
-
[36]
Mathematical discoveries from program search with large language models
Bernardino Romera-Paredes et al. “Mathematical discoveries from program search with large language models”. en. In:Nature625.7995 (Jan. 2024), pp. 468–475
2024
-
[37]
2025.url: https://github.com/algorithmicsuperintelligence/openevolve
Asankhaya Sharma.OpenEvolve: an open-source evolutionary coding agent. 2025.url: https://github.com/algorithmicsuperintelligence/openevolve. 12
2025
-
[38]
Nat Sothanaphan.Resolution of Erd˝ os Problem #728: a writeup of Aristotle’s Lean proof. 2026. arXiv:2601 . 07421 [math.NT].url:https : / / arxiv . org / abs / 2601 . 07421
2026
- [39]
- [40]
- [41]
- [42]
-
[43]
Online Matching with Stochastic Rewards: Provable Better Bound via Adversarial Reinforcement Learning
Qiankun Zhang et al. “Online Matching with Stochastic Rewards: Provable Better Bound via Adversarial Reinforcement Learning”. In:Proceedings of the 41st Interna- tional Conference on Machine Learning. Ed. by Ruslan Salakhutdinov et al. Vol. 235. Proceedings of Machine Learning Research. PMLR, 21–27 Jul 2024, pp. 59806–59826. url:https://proceedings.mlr.pr...
2024
-
[44]
Tianshi Zheng et al.From Automation to Autonomy: A Survey on Large Language Models in Scientific Discovery. 2025. arXiv:2505 . 13259 [cs.CL].url:https : / / arxiv.org/abs/2505.13259. A Related Work Recent AI systems have shown growing promise in formal scientific domains, including both physics [4, 22] and mathematics [19, 14, 6, 16, 17, 37, 9, 8]. Alongs...
-
[45]
For allwand requestsr, Φ(w′)−Φ(w)≥∇(w,r), wherew′=T r(w)is the work function afterr,
-
[46]
For allw, Φ(w)≤(c+ 1) min X∈C w(X) + const. Then WFA isc-competitive. 15 Proof.Using Lemma 1 and telescoping, OPT(ρ) + WFA(ρ)≤ ∑ i ∇(wi,r i+1)≤Φ(wt+1)−Φ(w0), and the second property yields the claim after absorbing constants. B.3 Work-Function Graph We first note that work functions are shift invariant in the following sense: Lemma 2(Shift-Invariance).For...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.