Iteris: Agentic Research Loops for Computational Mathematics
Pith reviewed 2026-06-28 14:46 UTC · model grok-4.3
The pith
An agentic research system generated numerical evidence, constructions, and proof drafts that contributed to verified results on two open problems in computational mathematics after human review and correction.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Iteris generated numerical evidence, constructions, and proof drafts that led, after expert review and correction, to verified results: a phase diagram for the asymptotic comparison between conjugate gradient and randomized coordinate descent on power-law spectra, and a counterexample showing that QR factorization with column pivoting can fail to select well-conditioned submatrices even under low coherence.
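To make the first comparison concrete, here is a minimal numerical sketch; it is not the paper's experiment. It assumes that "power-law spectra" means eigenvalues λ_i = i^(-α), measures error in the A-norm, and counts n coordinate steps as one epoch so that each recorded point costs roughly one matrix-vector product for both methods. The function names, the exponent α = 2, and the sampling rule (probability proportional to the diagonal) are illustrative assumptions.

```python
import numpy as np


def power_law_spd(n, alpha, rng):
    """SPD matrix with eigenvalues i**(-alpha), i = 1..n (assumed spectrum model)."""
    eigvals = np.arange(1, n + 1, dtype=float) ** (-alpha)
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal basis
    return (q * eigvals) @ q.T


def a_norm_error(a, x, x_star):
    e = x - x_star
    return float(np.sqrt(e @ a @ e))


def conjugate_gradient(a, b, x_star, iters):
    """Textbook CG from x = 0; returns the A-norm error after each iteration."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    errs = [a_norm_error(a, x, x_star)]
    for _ in range(iters):
        ap = a @ p
        step = (r @ r) / (p @ ap)
        x = x + step * p
        r_new = r - step * ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
        errs.append(a_norm_error(a, x, x_star))
    return errs


def randomized_coordinate_descent(a, b, x_star, epochs, rng):
    """Exact coordinate minimization; one epoch = n steps, about one matvec of work."""
    n = len(b)
    probs = np.diag(a) / np.trace(a)  # assumed sampling rule: proportional to a_ii
    x = np.zeros_like(b)
    errs = [a_norm_error(a, x, x_star)]
    for _ in range(epochs):
        for i in rng.choice(n, size=n, p=probs):
            x[i] += (b[i] - a[i] @ x) / a[i, i]
        errs.append(a_norm_error(a, x, x_star))
    return errs


rng = np.random.default_rng(0)
a = power_law_spd(100, alpha=2.0, rng=rng)
x_star = rng.standard_normal(100)
b = a @ x_star
cg_errs = conjugate_gradient(a, b, x_star, iters=20)
rcd_errs = randomized_coordinate_descent(a, b, x_star, epochs=20, rng=rng)
print(f"CG  A-norm error after 20 iterations: {cg_errs[-1]:.3e}")
print(f"RCD A-norm error after 20 epochs:     {rcd_errs[-1]:.3e}")
```

Sweeping `alpha` and the problem size in a script like this would give the kind of raw numerical evidence the claim refers to; the phase diagram itself is an asymptotic statement that a finite experiment can only suggest.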
What carries the argument
Iteris, the agentic research system that runs loops to produce numerical evidence, constructions, and proof drafts for open computational mathematics problems.
If this is right
- Agentic systems can supply initial numerical evidence for asymptotic comparisons between iterative solvers on structured spectra.
- Such systems can surface concrete counterexamples to conjectures about the behavior of matrix algorithms like column-pivoted QR under low coherence.
- Research workflows in computational mathematics can incorporate AI-generated drafts that experts refine into publishable results.
- Human review remains required to convert the system's outputs into verified theorems or diagrams.
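The column-pivoted QR bullet above can be explored with a short script. The sketch below is an illustration under stated assumptions, not the paper's counterexample: it takes a k x n matrix with orthonormal rows, uses (n/k) times the largest squared column norm as the coherence measure (one common definition; the paper's may differ), and implements greedy pivoting directly in NumPy. The names `qrcp_select` and `coherence` are hypothetical.

```python
import numpy as np


def qrcp_select(a, k):
    """First k column indices chosen by greedy column pivoting (largest residual norm)."""
    residual = np.array(a, dtype=float)
    available = np.ones(residual.shape[1], dtype=bool)
    chosen = []
    for _ in range(k):
        norms = np.where(available, np.linalg.norm(residual, axis=0), -1.0)
        j = int(np.argmax(norms))
        chosen.append(j)
        available[j] = False
        q = residual[:, j] / norms[j]
        residual -= np.outer(q, q @ residual)  # project out the chosen direction
    return chosen


def coherence(v):
    """(n/k) * max squared column norm for a k x n matrix with orthonormal rows."""
    k, n = v.shape
    return float(n / k * np.max(np.sum(v**2, axis=0)))


rng = np.random.default_rng(0)
k, n = 5, 40
v = np.linalg.qr(rng.standard_normal((n, k)))[0].T  # k x n, orthonormal rows
cols = qrcp_select(v, k)
sub = v[:, cols]
print("coherence:", coherence(v))
print("selected columns:", cols)
print("condition number of selected k x k block:", np.linalg.cond(sub))
```

A random instance like this one will typically be benign; a counterexample of the kind described needs a deliberately structured matrix, which is what the system's constructions are said to have supplied.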
Where Pith is reading between the lines
- If the pattern holds on further problems, agentic systems could shorten the time spent on initial numerical exploration before humans focus on proofs.
- The approach could be tested on other open questions that combine linear-algebraic analysis with algorithm design.
- Over time, repeated use might reveal which kinds of computational mathematics tasks benefit most from this form of human-AI iteration.
Load-bearing premise
The material generated by the system supplied a non-trivial fraction of the insight or evidence, rather than the human experts performing the core discovery work during the review-and-correction step.
What would settle it
A controlled trial in which human experts solve the same two problems using only standard methods and without access to the system's generated evidence, constructions, or drafts would show whether the system's outputs were necessary for reaching the verified results.
Original abstract
Recent advances in large language models and agentic AI systems have enabled significant progress in mathematical discovery, from solving competition problems to tackling research-level conjectures. However, open problems in computational mathematics have received comparatively less attention: research in this area often requires not only proofs but also numerical experimentation, adversarial constructions, and algorithm design. In this paper, we introduce an agentic research system, Iteris, designed for open problems in computational mathematics. We apply Iteris to two open problems from a recent Simons Workshop collection (arXiv:2602.05394). In these case studies, Iteris generated numerical evidence, constructions, and proof drafts that led, after expert review and correction, to verified results. The first result is a phase diagram for the asymptotic comparison between conjugate gradient and randomized coordinate descent on power-law spectra; the second is a counterexample showing that QR factorization with column pivoting can fail to select well-conditioned submatrices even under low coherence. These case studies suggest that agentic AI systems can participate meaningfully in research workflows for open problems in computational mathematics, while human validation remains essential.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces Iteris, an agentic research system for open problems in computational mathematics. It applies Iteris to two problems from a recent Simons Workshop collection, claiming that the system generated numerical evidence, constructions, and proof drafts that, after expert review and correction, yielded verified results: a phase diagram for the asymptotic comparison between conjugate gradient and randomized coordinate descent on power-law spectra, and a counterexample showing that QR factorization with column pivoting can fail to select well-conditioned submatrices even under low coherence. The paper concludes that such systems can participate meaningfully in research workflows while emphasizing the essential role of human validation.
Significance. If the attribution of non-trivial contributions to Iteris can be substantiated, the work would provide concrete case studies of agentic AI assisting in generating verifiable numerical and constructive results in computational mathematics. This could highlight a pathway for AI participation in areas requiring both experimentation and proof elements. However, the absence of detailed evidence for the system's independent role limits the ability to evaluate novelty or impact beyond existing human-driven workflows.
Major comments (2)
- [Case studies] Case studies section: The central claim that Iteris supplied a non-trivial fraction of the numerical evidence, constructions, or proof drafts for the phase diagram and QR-pivoting counterexample is not supported by any session logs, prompt traces, before/after comparisons, or quantitative measures of the system's contribution versus human corrections. The text only states that results emerged 'after expert review and correction,' leaving the 'meaningful participation' assertion dependent on an unverified assumption about the division of labor.
- [System description] System description: No details are provided on the exact prompts used, the loop structure of Iteris, or any metrics quantifying how much of the final verified evidence originated from the agent versus human intervention. This omission prevents independent assessment of whether the system performed core discovery work or primarily supplied raw material for human refinement.
Minor comments (1)
- [Abstract] The abstract and introduction could more explicitly distinguish between the system's raw outputs and the final verified results to avoid potential overstatement of autonomy.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for improving transparency in both the case studies and system description. We address each major comment below and indicate the revisions we plan to make.
Point-by-point responses
- Referee: [Case studies] Case studies section: The central claim that Iteris supplied a non-trivial fraction of the numerical evidence, constructions, or proof drafts for the phase diagram and QR-pivoting counterexample is not supported by any session logs, prompt traces, before/after comparisons, or quantitative measures of the system's contribution versus human corrections. The text only states that results emerged 'after expert review and correction,' leaving the 'meaningful participation' assertion dependent on an unverified assumption about the division of labor.
  Authors: We agree that the manuscript does not provide direct evidence such as session logs or quantitative contribution metrics to support the extent of Iteris's role. The presentation focused on the verified outcomes rather than process documentation. In the revised version, we will expand the case studies section to describe specific examples of numerical evidence and initial constructions generated by the system, along with the nature of subsequent human corrections, to better substantiate the claim of meaningful participation.
  Revision: partial
- Referee: [System description] System description: No details are provided on the exact prompts used, the loop structure of Iteris, or any metrics quantifying how much of the final verified evidence originated from the agent versus human intervention. This omission prevents independent assessment of whether the system performed core discovery work or primarily supplied raw material for human refinement.
  Authors: We acknowledge the lack of implementation details in the current system description. We will revise this section to include a description of the agentic loop structure (hypothesis generation, tool use for computation, iterative refinement), representative prompts from the case studies, and a qualitative discussion of the division of labor between the system and human experts. Quantitative metrics were not collected during the work, so we will clarify this limitation explicitly.
  Revision: yes
- Complete session logs, full prompt traces, and before/after comparisons from the original experiments are unavailable, as they were not systematically archived during the exploratory process.
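The rebuttal names three stages for the agentic loop: hypothesis generation, tool use for computation, and iterative refinement. The following is a generic sketch of such a loop, written only to show what a logged propose-compute-refine cycle could look like. It is not Iteris's implementation, which the material above does not describe; every name here (`LoopState`, `research_loop`, `measure_exponent`, `adopt_if_different`) is hypothetical, and the "tool call" is a toy calculation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class LoopState:
    hypothesis: float                         # here: a conjectured decay exponent
    log: list = field(default_factory=list)   # (hypothesis, measurement) per round


def research_loop(seed, compute, refine, max_rounds=5):
    """Propose -> compute -> refine until the hypothesis stops changing."""
    state = LoopState(hypothesis=seed)
    for _ in range(max_rounds):
        measurement = compute(state.hypothesis)
        state.log.append((state.hypothesis, measurement))  # keep a trace of every round
        updated = refine(state.hypothesis, measurement)
        if updated == state.hypothesis:
            break
        state.hypothesis = updated
    return state


def measure_exponent(_hypothesis):
    """Toy 'tool call': estimate p from two samples of y = k**(-2)."""
    k1, k2 = 10.0, 100.0
    y1, y2 = k1**-2, k2**-2
    return -(math.log(y2) - math.log(y1)) / (math.log(k2) - math.log(k1))


def adopt_if_different(hypothesis, measurement, tol=1e-6):
    """Toy refinement: replace the conjecture when the data disagree with it."""
    return hypothesis if abs(hypothesis - measurement) < tol else round(measurement, 6)


final = research_loop(1.0, measure_exponent, adopt_if_different)
print(final.hypothesis, len(final.log))
```

Recording each round in `log` is the design point: a trace of this kind is what the referee's request for session logs and before/after comparisons would require.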
Circularity Check
No significant circularity: case studies rely on external workshop problems and post-hoc human validation rather than self-referential derivations or fitted predictions.
Full rationale
The manuscript describes an agentic system applied to two external open problems (arXiv:2602.05394) and states that Iteris outputs were reviewed and corrected by experts to reach verified results. No equations, parameters, or uniqueness claims are defined in terms of the target results; the central narrative does not reduce any prediction or theorem to a fit or self-citation by construction. The attribution of insight is presented as a qualitative case study rather than a closed mathematical derivation, so the load-bearing steps remain independent of the paper's own outputs.
Axiom & Free-Parameter Ledger
Invented entities (1)
- Iteris: no independent evidence
Reference graph
Works this paper leans on
- [1] Mohammed Abouzaid, Andrew J. Blumberg, Martin Hairer, Joe Kileel, Tamara G. Kolda, Paul D. Nelson, Daniel Spielman, Nikhil Srivastava, Rachel Ward, Shmuel Weinberger, and Lauren Williams. First proof, 2026. URL https://arxiv.org/abs/2602.05192
- [2] Ichiro Amemiya and Tadao Ando. Convergence of random products of contractions in Hilbert space. Acta Scientiarum Mathematicarum, 26(3–4):239–244, 1965.
- [3] Noah Amsel, Yves Baumann, Paul Beckman, Peter Bürgisser, Chris Camaño, Tyler Chen, Edmond Chow, Anil Damle, Michal Derezinski, Mark Embree, Ethan N. Epperly, Robert Falgout, Mark Fornace, Anne Greenbaum, Chen Greif, Diana Halikias, Zhen Huang, Elias Jarlebring, Yiannis Koutis, Daniel Kressner, Rasmus Kyng, Jörg Liesen, Jackie Lok, Raphael A. Meyer, Yuji N...
- [4] Daniil A. Boiko, Robert MacKnight, Ben Kline, and Gabe Gomes. Autonomous chemical research with large language models. Nature, 624(7992):570–578, December 2023. ISSN 1476-4687. doi: 10.1038/s41586-023-06792-0. URL http://dx.doi.org/10.1038/s41586-023-06792-0
- [5] Peter Businger and Gene H. Golub. Linear least squares solutions by Householder transformations. Numerische Mathematik, 7(3):269–276, 1965. ISSN 0945-3245. doi: 10.1007/bf01436084. URL http://dx.doi.org/10.1007/bf01436084
- [6] Emmanuel J. Candès and Benjamin Recht. Exact matrix completion via convex optimization. Foundations of Computational Mathematics, 9(6):717–772, April 2009. ISSN 1615-3383. doi: 10.1007/s10208-009-9045-5. URL http://dx.doi.org/10.1007/s10208-009-9045-5
- [7] Gheorghe Comanici, Eric Bieber, Mike Schaekermann, Ice Pasupat, Noveen Sachdeva, Inderjit Dhillon, Marcel Blistein, Ori Ram, Dan Zhang, Evan Rosen, et al. Gemini 2.5: Pushing the frontier with advanced reasoning, multimodality, long context, and next generation agentic capabilities. arXiv preprint arXiv:2507.06261, 2025.
- [8] Alex Davies, Petar Veličković, Lars Buesing, Sam Blackwell, Daniel Zheng, Nenad Tomašev, Richard Tanburn, Peter Battaglia, Charles Blundell, András Juhász, Marc Lackenby, Geordie Williamson, Demis Hassabis, and Pushmeet Kohli. Advancing mathematics by guiding human intuition with AI. Nature, 600(7887):70–74, December 2021. ISSN 1476-4687. doi: 10.1038/s415...
- [9] Michał Dereziński and Elizaveta Rebrova. Sharp analysis of sketch-and-project methods via a connection to randomized singular value decomposition. SIAM Journal on Mathematics of Data Science, 6(1):127–153, February
- [10] ISSN 2577-0187. doi: 10.1137/23m1545537. URL http://dx.doi.org/10.1137/23m1545537
- [11] Aletheia: Towards autonomous mathematics research. Tony Feng, Trieu H. Trinh, Garrett Bingham, Dawsen Hwang, Yuri Chervonyi, Junehyuk Jung, Joonkyung Lee, Carlo Pagano, Sang hyun Kim, Federico Pasqualotto, Sergei Gukov, Jonathan N. Lee, Junsu Kim, Kaiying Hou, Golnaz Ghiasi, Yi Tay, YaGuang Li, Chenkai Kuang, Yuan Liu, Hanzhao Lin, Evan Zheran Liu, Nigamaa Nayakanti, Xiaomeng Yang, Heng-Tze Cheng, Demis H...
- [12] Robert M. Gower and Peter Richtárik. Randomized iterative methods for linear systems. SIAM Journal on Matrix Analysis and Applications, 36(4):1660–1690, January 2015. ISSN 1095-7162. doi: 10.1137/15m1025487. URL http://dx.doi.org/10.1137/15m1025487
- [13] Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Peiyi Wang, Qihao Zhu, Runxin Xu, Ruoyu Zhang, Shirong Ma, Xiao Bi, et al. DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning. Nature, 645(8081):633–638, 2025.
- [14] M.R. Hestenes and E. Stiefel. Methods of conjugate gradients for solving linear systems. Journal of Research of the National Bureau of Standards, 49(6):409, December 1952. ISSN 0091-0635. doi: 10.6028/jres.049.044. URL http://dx.doi.org/10.6028/jres.049.044
- [15] Y. P. Hong and C.-T. Pan. Rank-revealing QR factorizations and the singular value decomposition. Mathematics of Computation, 58(197):213, January 1992. ISSN 0025-5718. doi: 10.2307/2153029. URL http://dx.doi.org/10.2307/2153029
- [16] Haocheng Ju, Guoxiong Gao, Jiedong Jiang, Bin Wu, Zeming Sun, Leheng Chen, Yutong Wang, Yuefeng Wang, Zichen Wang, Wanyi He, Peihao Wu, Liang Xiao, Ruochuan Liu, Bryan Dai, and Bin Dong. Automated conjecture resolution with formal verification, 2026. URL https://arxiv.org/abs/2604.03789
- [17] D. Leventhal and A. S. Lewis. Randomized methods for linear constraints: Convergence rates and conditioning. Mathematics of Operations Research, 35(3):641–654, August 2010. ISSN 1526-5471. doi: 10.1287/moor.1100.0456. URL http://dx.doi.org/10.1287/moor.1100.0456
- [18] Jackie Lok and Elizaveta Rebrova. Subspace-constrained randomized coordinate descent for linear systems with good low-rank matrix approximations, 2026. URL https://arxiv.org/abs/2506.09394
- [19] Chris Lu, Cong Lu, Robert Tjarko Lange, Yutaro Yamada, Shengran Hu, Jakob Foerster, David Ha, and Jeff Clune. Towards end-to-end automation of AI research. Nature, 651(8107):914–919, March 2026. ISSN 1476-4687. doi: 10.1038/s41586-026-10265-5. URL http://dx.doi.org/10.1038/s41586-026-10265-5
- [20] Yu. Nesterov. Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM Journal on Optimization, 22(2):341–362, January 2012. ISSN 1095-7189. doi: 10.1137/100802001. URL http://dx.doi.org/10.1137/100802001
- [21] Alexander Novikov, Ngân Vũ, Marvin Eisenberger, Emilien Dupont, Po-Sen Huang, Adam Zsolt Wagner, Sergey Shirobokov, Borislav Kozlovskii, Francisco J. R. Ruiz, Abbas Mehrabian, M. Pawan Kumar, Abigail See, Swarat Chaudhuri, George Holland, Alex Davies, Sebastian Nowozin, Pushmeet Kohli, and Matej Balog. AlphaEvolve: A coding agent for scientific and algo...
- [22] A.I. Osinsky. Close to optimal column approximation using a single SVD. Linear Algebra and its Applications, 725:359–377, November 2025. ISSN 0024-3795. doi: 10.1016/j.laa.2025.07.016. URL http://dx.doi.org/10.1016/j.laa.2025.07.016
- [23] Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M. Pawan Kumar, Emilien Dupont, Francisco J. R. Ruiz, Jordan S. Ellenberg, Pengming Wang, Omar Fawzi, Pushmeet Kohli, and Alhussein Fawzi. Mathematical discoveries from program search with large language models. Nature, 625(7995):468–475, December 2023. ISSN 1476-4687. doi...
- [24] Yousef Saad. Iterative Methods for Sparse Linear Systems (second ed.). Society for Industrial and Applied Mathematics, January 2003. ISBN 9780898718003. doi: 10.1137/1.9780898718003. URL http://dx.doi.org/10.1137/1.9780898718003
- [25] An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report. arXiv preprint arXiv:2505.09388, 2025.
- [26] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. ReAct: Synergizing reasoning and acting in language models, 2023. URL https://arxiv.org/abs/2210.03629
- [27] Daniel Zheng, Ingrid von Glehn, Yori Zwols, Iuliya Beloshapka, Lars Buesing, Daniel M. Roy, Martin Wattenberg, Bogdan Georgiev, Tatiana Schmidt, Andrew Cowie, Fernanda Viegas, Dimitri Kanevsky, Vineet Kahlon, Hartmut Maennel, Sophia Alj, George Holland, Alex Davies, and Pushmeet Kohli. AI co-mathematician: Accelerating mathematicians with agentic AI, 2026...
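- [28] This constant controls the strictly upper-triangular couplings among the first $m$ selected columns. Define $R_\beta \in \mathbb{R}^{m \times m}$ by
$$(R_\beta)_{t,q} = \begin{cases} d_t, & q = t, \\ -\beta \sigma_t \sigma_q s d_t, & t < q, \\ 0, & t > q. \end{cases}$$
The first $m$ selected columns are
$$a_q = \begin{pmatrix} (R_\beta)_{:,q} \\ 0 \end{pmatrix}, \qquad 1 \le q \le m.$$
In matrix form their first $m$ coordinates are
$$R_\beta = \begin{pmatrix} d_1 & -\beta \sigma_1 \sigma_2 s d_1 & \cdots & -\beta \sigma_1 \sigma_m s d_1 \\ 0 & d_2 & \cdots & -\beta \sigma_2 \sigma_m s d_2 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_m \end{pmatrix}$$
...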
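The triangular matrix defined in this excerpt is easy to assemble numerically, which helps when checking the entries by hand. The sketch below follows the stated rule (diagonal d_t, entries -β σ_t σ_q s d_t for t < q, zeros below the diagonal) and assumes that s is a scalar, since the excerpt does not define it. The function names and the ambient dimension n are hypothetical, and the input values are arbitrary illustrations.

```python
import numpy as np


def build_r_beta(d, sigma, beta, s):
    """m x m upper-triangular matrix with diagonal d_t and, for t < q,
    entries -beta * sigma_t * sigma_q * s * d_t (s assumed to be a scalar)."""
    d = np.asarray(d, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # entry (t, q) of the outer product is sigma_t * d_t * sigma_q
    coupling = -beta * s * np.outer(sigma * d, sigma)
    return np.triu(coupling, k=1) + np.diag(d)


def embed_columns(r_beta, n):
    """Stack zeros below R_beta so column q is a_q = [(R_beta)[:, q]; 0] in R^n."""
    m = r_beta.shape[0]
    return np.vstack([r_beta, np.zeros((n - m, m))])


r = build_r_beta(d=[1.0, 2.0, 3.0], sigma=[1.0, 1.0, 1.0], beta=0.5, s=1.0)
a = embed_columns(r, n=5)
print(r)
```

Because row t of the strictly upper part is scaled by d_t, each off-diagonal entry is a fixed multiple of the diagonal entry in its own row.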