
arxiv: 2606.02484 · v1 · pith:V3UKFL3Wnew · submitted 2026-06-01 · 💻 cs.AI · cs.LG

Iteris: Agentic Research Loops for Computational Mathematics

Pith reviewed 2026-06-28 14:46 UTC · model grok-4.3

classification 💻 cs.AI cs.LG
keywords agentic systems · computational mathematics · numerical linear algebra · phase diagram · counterexample · conjugate gradient · QR factorization

The pith

An agentic research system generated numerical evidence, constructions, and proof drafts that contributed to verified results on two open problems in computational mathematics after human review and correction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Iteris as a system built to handle open problems in computational mathematics, which often mix proofs with numerical experiments, adversarial constructions, and algorithm design. It describes applying the system to two problems from a recent workshop collection, where the system produced evidence and drafts that experts then reviewed, corrected, and used to reach verified outcomes. One outcome is a phase diagram comparing the asymptotic behavior of conjugate gradient and randomized coordinate descent on power-law spectra. The other is a counterexample showing that QR factorization with column pivoting can fail to pick well-conditioned submatrices even when coherence is low. A sympathetic reader would care because the work explores whether such systems can supply usable material in research loops that combine computation and theory, while keeping human validation central.

Core claim

Iteris generated numerical evidence, constructions, and proof drafts that led, after expert review and correction, to verified results: a phase diagram for the asymptotic comparison between conjugate gradient and randomized coordinate descent on power-law spectra, and a counterexample showing that QR factorization with column pivoting can fail to select well-conditioned submatrices even under low coherence.
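
To make the first result concrete, the following is a minimal sketch of the kind of numerical experiment such a comparison rests on. It is not the paper's code. The matrix construction, the exponent `alpha`, the sampling rule for coordinate descent (probability proportional to the diagonal), and the cost model (one conjugate gradient iteration matched against `n` coordinate steps) are all assumptions chosen for illustration.

```python
import numpy as np


def power_law_matrix(n, alpha, rng):
    """SPD test matrix with eigenvalues i**(-alpha), i = 1..n, in a random orthogonal basis."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    eigs = np.arange(1, n + 1, dtype=float) ** (-alpha)
    return (q * eigs) @ q.T


def a_norm_error(a, x, x_star):
    """Error in the A-norm, sqrt((x - x*)^T A (x - x*))."""
    e = x - x_star
    return float(np.sqrt(max(e @ a @ e, 0.0)))


def conjugate_gradient(a, b, x_star, iters):
    """Textbook CG started at x = 0; returns the A-norm error after each iteration."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rs = r @ r
    errors = [a_norm_error(a, x, x_star)]
    for _ in range(iters):
        ap = a @ p
        step = rs / (p @ ap)
        x = x + step * p
        r = r - step * ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
        errors.append(a_norm_error(a, x, x_star))
    return errors


def randomized_coordinate_descent(a, b, x_star, epochs, rng):
    """RCD with coordinates sampled proportionally to diag(A); one epoch is n coordinate steps."""
    n = len(b)
    diag = np.diag(a).copy()
    probs = diag / diag.sum()
    x = np.zeros_like(b)
    errors = [a_norm_error(a, x, x_star)]
    for _ in range(epochs):
        for i in rng.choice(n, size=n, p=probs):
            # Exact minimization of the quadratic along coordinate i.
            x[i] += (b[i] - a[i] @ x) / diag[i]
        errors.append(a_norm_error(a, x, x_star))
    return errors


rng = np.random.default_rng(0)
n, alpha, budget = 100, 1.5, 10  # illustrative values, not taken from the paper
a = power_law_matrix(n, alpha, rng)
x_star = rng.standard_normal(n)
b = a @ x_star

cg_errors = conjugate_gradient(a, b, x_star, budget)
rcd_errors = randomized_coordinate_descent(a, b, x_star, budget, rng)
for k in range(budget + 1):
    print(f"{k:3d}  CG {cg_errors[k]:.3e}   RCD {rcd_errors[k]:.3e}")
```

Sweeping `alpha` and `n` and recording which method reaches a target error first would give a crude empirical picture of the regimes. A single run like this one is only evidence, not an asymptotic statement.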

What carries the argument

Iteris, the agentic research system that runs loops to produce numerical evidence, constructions, and proof drafts for open computational mathematics problems.

If this is right

  • Agentic systems can supply initial numerical evidence for asymptotic comparisons between iterative solvers on structured spectra.
  • Such systems can surface concrete counterexamples to conjectures about the behavior of matrix algorithms like column-pivoted QR under low coherence.
  • Research workflows in computational mathematics can incorporate AI-generated drafts that experts refine into publishable results.
  • Human review remains required to convert the system's outputs into verified theorems or diagrams.
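
The column-pivoted QR item above involves two quantities that are easy to measure numerically: the coherence of a subspace and the conditioning of the submatrix that pivoting selects. The sketch below computes both on a random instance. It does not reproduce the paper's counterexample, which is a specific adversarial construction. The coherence formula is the common Candès-Recht style definition, the pivoting routine is a plain Gram-Schmidt variant rather than a LAPACK call, and the choice to pivot on the transpose of an orthonormal basis is an assumption about the setting.

```python
import numpy as np


def coherence(v):
    """Coherence (n / k) * max_i ||v[i, :]||^2 for v of shape (n, k) with orthonormal columns."""
    n, k = v.shape
    return float(n / k * np.max(np.sum(v**2, axis=1)))


def greedy_column_pivots(m, k):
    """Select k columns of m by greedy column pivoting: take the column with the
    largest residual norm, project it out of all columns, and repeat."""
    r = np.array(m, dtype=float)
    available = np.ones(r.shape[1], dtype=bool)
    picked = []
    for _ in range(k):
        norms = np.linalg.norm(r, axis=0)
        j = int(np.argmax(np.where(available, norms, -1.0)))
        picked.append(j)
        available[j] = False
        q = r[:, j] / norms[j]
        r = r - np.outer(q, q @ r)
    return picked


rng = np.random.default_rng(0)
n, k = 200, 10  # illustrative sizes only
v, _ = np.linalg.qr(rng.standard_normal((n, k)))  # random orthonormal basis, shape (n, k)
picked = greedy_column_pivots(v.T, k)  # pivot on the k x n matrix v^T
sub = v[picked, :]  # the selected k x k submatrix
sigma_min = float(np.linalg.svd(sub, compute_uv=False)[-1])
mu = coherence(v)
print(f"coherence = {mu:.3f}, smallest singular value of selection = {sigma_min:.3e}")
```

A counterexample in the paper's sense would be a family of inputs where `mu` stays small while `sigma_min` of the selected submatrix degrades. Random inputs like this one are not expected to show that behavior.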

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the pattern holds on further problems, agentic systems could shorten the time spent on initial numerical exploration before humans focus on proofs.
  • The approach could be tested on other open questions that combine linear-algebraic analysis with algorithm design.
  • Over time, repeated use might reveal which kinds of computational mathematics tasks benefit most from this form of human-AI iteration.

Load-bearing premise

The material generated by the system supplied a non-trivial fraction of the insight or evidence, rather than the human experts performing the core discovery work during the review-and-correction step.

What would settle it

A controlled trial in which human experts solve the same two problems using only standard methods and without access to the system's generated evidence, constructions, or drafts would show whether the system's outputs were necessary for reaching the verified results.

original abstract

Recent advances in large language models and agentic AI systems have enabled significant progress in mathematical discovery, from solving competition problems to tackling research-level conjectures. However, open problems in computational mathematics have received comparatively less attention: research in this area often requires not only proofs but also numerical experimentation, adversarial constructions, and algorithm design. In this paper, we introduce an agentic research system, Iteris, designed for open problems in computational mathematics. We apply Iteris to two open problems from a recent Simons Workshop collection (arXiv:2602.05394). In these case studies, Iteris generated numerical evidence, constructions, and proof drafts that led, after expert review and correction, to verified results. The first result is a phase diagram for the asymptotic comparison between conjugate gradient and randomized coordinate descent on power-law spectra; the second is a counterexample showing that QR factorization with column pivoting can fail to select well-conditioned submatrices even under low coherence. These case studies suggest that agentic AI systems can participate meaningfully in research workflows for open problems in computational mathematics, while human validation remains essential.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Iteris, an agentic research system for open problems in computational mathematics. It applies Iteris to two problems from a recent Simons Workshop collection, claiming that the system generated numerical evidence, constructions, and proof drafts that, after expert review and correction, yielded verified results: a phase diagram for the asymptotic comparison between conjugate gradient and randomized coordinate descent on power-law spectra, and a counterexample showing that QR factorization with column pivoting can fail to select well-conditioned submatrices even under low coherence. The paper concludes that such systems can participate meaningfully in research workflows while emphasizing the essential role of human validation.

Significance. If the attribution of non-trivial contributions to Iteris can be substantiated, the work would provide concrete case studies of agentic AI assisting in generating verifiable numerical and constructive results in computational mathematics. This could highlight a pathway for AI participation in areas requiring both experimentation and proof elements. However, the absence of detailed evidence for the system's independent role limits the ability to evaluate novelty or impact beyond existing human-driven workflows.

major comments (2)
  1. [Case studies] Case studies section: The central claim that Iteris supplied a non-trivial fraction of the numerical evidence, constructions, or proof drafts for the phase diagram and QR-pivoting counterexample is not supported by any session logs, prompt traces, before/after comparisons, or quantitative measures of the system's contribution versus human corrections. The text only states that results emerged 'after expert review and correction,' leaving the 'meaningful participation' assertion dependent on an unverified assumption about the division of labor.
  2. [System description] System description: No details are provided on the exact prompts used, the loop structure of Iteris, or any metrics quantifying how much of the final verified evidence originated from the agent versus human intervention. This omission prevents independent assessment of whether the system performed core discovery work or primarily supplied raw material for human refinement.
minor comments (1)
  1. [Abstract] The abstract and introduction could more explicitly distinguish between the system's raw outputs and the final verified results to avoid potential overstatement of autonomy.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important areas for improving transparency in both the case studies and system description. We address each major comment below and indicate the revisions we plan to make.

point-by-point responses
  1. Referee: [Case studies] Case studies section: The central claim that Iteris supplied a non-trivial fraction of the numerical evidence, constructions, or proof drafts for the phase diagram and QR-pivoting counterexample is not supported by any session logs, prompt traces, before/after comparisons, or quantitative measures of the system's contribution versus human corrections. The text only states that results emerged 'after expert review and correction,' leaving the 'meaningful participation' assertion dependent on an unverified assumption about the division of labor.

    Authors: We agree that the manuscript does not provide direct evidence such as session logs or quantitative contribution metrics to support the extent of Iteris's role. The presentation focused on the verified outcomes rather than process documentation. In the revised version, we will expand the case studies section to describe specific examples of numerical evidence and initial constructions generated by the system, along with the nature of subsequent human corrections, to better substantiate the claim of meaningful participation.

    revision: partial

  2. Referee: [System description] System description: No details are provided on the exact prompts used, the loop structure of Iteris, or any metrics quantifying how much of the final verified evidence originated from the agent versus human intervention. This omission prevents independent assessment of whether the system performed core discovery work or primarily supplied raw material for human refinement.

    Authors: We acknowledge the lack of implementation details in the current system description. We will revise this section to include a description of the agentic loop structure (hypothesis generation, tool use for computation, iterative refinement), representative prompts from the case studies, and a qualitative discussion of the division of labor between the system and human experts. Quantitative metrics were not collected during the work, so we will clarify this limitation explicitly.

    revision: yes
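
The three stages named in the response above (hypothesis generation, tool use for computation, iterative refinement) can be pictured with a toy loop. This is purely a hypothetical sketch and says nothing about how Iteris is actually built. Every name in it (`Candidate`, `research_loop`, `propose`, `run_tool`, `refine`) is invented for illustration, and the "research question" is a stand-in: refine a guess for the positive root of x^2 = 2.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Candidate:
    claim: float  # toy hypothesis: a conjectured numerical value
    discrepancy: float = float("inf")  # what the computational tool reports
    history: List[str] = field(default_factory=list)


def research_loop(
    propose: Callable[[], float],
    run_tool: Callable[[float], float],
    refine: Callable[[float, float], float],
    max_rounds: int,
    tol: float,
) -> Candidate:
    """Hypothetical propose / compute / refine loop. The returned candidate is a
    draft that would still need human review before being treated as a result."""
    cand = Candidate(claim=propose())
    for round_no in range(max_rounds):
        cand.discrepancy = run_tool(cand.claim)  # tool use for computation
        cand.history.append(
            f"round {round_no}: claim={cand.claim:.12f}, discrepancy={cand.discrepancy:.2e}"
        )
        if abs(cand.discrepancy) < tol:
            break
        cand.claim = refine(cand.claim, cand.discrepancy)  # iterative refinement
    return cand


# Toy instantiation: the "tool" evaluates x^2 - 2 and refinement is a Newton step.
result = research_loop(
    propose=lambda: 1.0,
    run_tool=lambda x: x * x - 2.0,
    refine=lambda x, d: x - d / (2.0 * x),
    max_rounds=20,
    tol=1e-10,
)
print("\n".join(result.history))
```

The `history` list plays the role of the session log that the referee asks for: keeping one entry per round is what would make a later before/after comparison possible.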

standing simulated objections not resolved
  • Complete session logs, full prompt traces, and before/after comparisons from the original experiments are unavailable, as they were not systematically archived during the exploratory process.

Circularity Check

0 steps flagged

No significant circularity: case studies rely on external workshop problems and post-hoc human validation rather than self-referential derivations or fitted predictions.

full rationale

The manuscript describes an agentic system applied to two external open problems (arXiv:2602.05394) and states that Iteris outputs were reviewed and corrected by experts to reach verified results. No equations, parameters, or uniqueness claims are defined in terms of the target results; the central narrative does not reduce any prediction or theorem to a fit or self-citation by construction. The attribution of insight is presented as a qualitative case study rather than a closed mathematical derivation, so the load-bearing steps remain independent of the paper's own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entity

The paper introduces a new named system (Iteris) whose internal design is not detailed in the abstract; no free parameters or mathematical axioms are mentioned.

invented entities (1)
  • Iteris · no independent evidence
    purpose: Agentic research loop system for generating numerical evidence and proof drafts in computational mathematics
    The system is presented as the central new artifact of the paper.

pith-pipeline@v0.9.1-grok · 5723 in / 1114 out tokens · 25036 ms · 2026-06-28T14:46:08.544401+00:00 · methodology

