pith. machine review for the scientific record. sign in

arxiv: 2604.06107 · v1 · submitted 2026-04-07 · 💻 cs.AI · math.HO· math.LO

Recognition: 3 theorem links

· Lean Theorem

Artificial Intelligence and the Structure of Mathematics

Authors on Pith no claims yet

Pith reviewed 2026-05-10 19:08 UTC · model grok-4.3

classification 💻 cs.AI math.HOmath.LO
keywords artificial intelligencemathematical proofsPlatonic worldsautomated discoveryproof hypergraphsfoundational structuremathematics and AI
0
0 comments X

The pith

AI offers a complementary path to mathematical logic by revealing the global structure of formal proofs through automated exploration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that AI can move beyond solving isolated problems to mapping the overall architecture of mathematics. It sketches mathematics as networks of universal proofs and structural hypergraphs, then lists criteria AI models must meet to explore these networks autonomously. If such models succeed in traversing what the authors call Platonic mathematical worlds, they are expected to expose both large-scale patterns and the narrower ribbons of structure that humans can grasp. This route would address the old question of whether mathematics is discovered or invented by letting machines report on the terrain they encounter.

Core claim

The authors argue that AI agents satisfying criteria for automated mathematical discovery can traverse Platonic mathematical worlds represented via universal proofs and structural hypergraphs, thereby illuminating the global structure of formal proofs in a manner complementary to mathematical logic and clarifying both the overall nature of mathematics and the portions accessible to human understanding.

What carries the argument

Universal proof and structural hypergraphs, which encode the formal connections across all mathematical statements and allow systematic traversal by AI agents.

Load-bearing premise

AI models can be built that meet the stated criteria for automated discovery and will meaningfully traverse Platonic mathematical worlds to produce new insights about mathematical structure.

What would settle it

An AI system trained on large corpora of formal proofs that fails to identify any previously unrecognized global patterns or cross-proof dependencies beyond those already captured by standard logical analysis.

Figures

Figures reproduced from arXiv: 2604.06107 by Maissam Barkeshli, Michael H. Freedman, Michael R. Douglas.

Figure 1
Figure 1. Figure 1: Criteria for AMD 2 The hypergraphs of mathematics Our goal in this section is to briefly sketch the overarching structure of formal mathematics in terms of a series of hypergraphs. We consider it to be an important research program to further flesh out and make rigorous the discussion below. We subsequently use the hypergraphs to pose fundamental questions about the structure of mathematics and consider ho… view at source ↗
Figure 2
Figure 2. Figure 2: Figure generated by ChatGPT 35This is time-dependent, making it interesting to train an LLM only on mathematics up to time t and compare its discoveries with those of humans. 34 [PITH_FULL_IMAGE:figures/full_fig_p034_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: The Iota-Red Hyperedge. The blue solid lines represent static construction (Application). The red dashed line represents the computational step. Because the second argument matches the constructor S, the graph explicitly links the expression a + S(b) to its reduction S(a + b). A.3 Example: The Doubling Function Following §1.9 of the HoTT Book, we define the operation double(n) = 2n using the recursion prin… view at source ↗
read the original abstract

Recent progress in artificial intelligence (AI) is unlocking transformative capabilities for mathematics. There is great hope that AI will help solve major open problems and autonomously discover new mathematical concepts. In this essay, we further consider how AI may open a grand perspective on mathematics by forging a new route, complementary to mathematical\textbf{ logic,} to understanding the global structure of formal \textbf{proof}\textbf{s}. We begin by providing a sketch of the formal structure of mathematics in terms of universal proof and structural hypergraphs and discuss questions this raises about the foundational structure of mathematics. We then outline the main ingredients and provide a set of criteria to be satisfied for AI models capable of automated mathematical discovery. As we send AI agents to traverse Platonic mathematical worlds, we expect they will teach us about the nature of mathematics: both as a whole, and the small ribbons conducive to human understanding. Perhaps they will shed light on the old question: "Is mathematics discovered or invented?" Can we grok the terrain of these \textbf{Platonic worlds}?

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript is an essay arguing that recent AI progress offers a new route, complementary to mathematical logic, for understanding the global structure of formal proofs. It sketches mathematics via 'universal proofs' and 'structural hypergraphs,' outlines high-level criteria for AI models of automated discovery, and posits that AI agents traversing Platonic mathematical worlds will reveal insights into the nature of mathematics, including whether it is discovered or invented.

Significance. If the sketched framework and future AI capabilities are realized, the work could broaden foundational studies by integrating AI-driven exploration of proof structures, potentially yielding global insights inaccessible to logic alone and addressing philosophical questions about mathematics. The essay correctly identifies the prospective value of AI beyond problem-solving, but its significance is prospective and depends on operationalizing the proposed concepts.

major comments (2)
  1. In the section providing a sketch of the formal structure of mathematics, the concepts of 'universal proof' and 'structural hypergraphs' are introduced without definitions of their nodes, edges, global properties, or any concrete example. This absence is load-bearing for the central claim that AI traversal will reveal new insights into proof structure, as there is no basis to distinguish such traversal from existing automated theorem proving.
  2. In the section outlining the main ingredients and criteria for AI models capable of automated mathematical discovery, the criteria are presented only as high-level desiderata without formalization, mathematical specification, or a minimal worked example (e.g., for a single axiom system). This undermines the assertion that such models will teach us about the nature of mathematics in ways complementary to logic.
minor comments (2)
  1. Abstract contains visible LaTeX commands such as boldface markers that should be properly rendered in the final version.
  2. The manuscript would benefit from citations to related work in automated theorem proving, proof mining, and the philosophy of mathematical discovery to better situate the proposals.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which correctly identify the need for greater concreteness in our conceptual sketch. We address each point below and will revise the manuscript to incorporate definitions and examples while preserving the essay's exploratory character.

read point-by-point responses
  1. Referee: In the section providing a sketch of the formal structure of mathematics, the concepts of 'universal proof' and 'structural hypergraphs' are introduced without definitions of their nodes, edges, global properties, or any concrete example. This absence is load-bearing for the central claim that AI traversal will reveal new insights into proof structure, as there is no basis to distinguish such traversal from existing automated theorem proving.

    Authors: We agree that the manuscript introduces 'universal proof' and 'structural hypergraphs' at a conceptual level without formal definitions, node/edge specifications, global properties, or examples. This limits the ability to clearly differentiate the proposed AI-driven traversal from existing automated theorem proving. As the work is an essay outlining prospective ideas rather than a technical formalism, the presentation remained high-level. In revision we will add explicit definitions (nodes as statements or subproofs, edges as inference steps or dependency relations), describe global properties such as hypergraph connectivity and centrality, and include a concrete example based on a small system such as propositional logic to illustrate how AI traversal could yield structural insights beyond standard ATP. revision: yes

  2. Referee: In the section outlining the main ingredients and criteria for AI models capable of automated mathematical discovery, the criteria are presented only as high-level desiderata without formalization, mathematical specification, or a minimal worked example (e.g., for a single axiom system). This undermines the assertion that such models will teach us about the nature of mathematics in ways complementary to logic.

    Authors: The referee accurately observes that the criteria for AI models of automated discovery are stated as high-level desiderata without formalization or a minimal worked example. This choice aligns with the essay's goal of sketching broad requirements for future systems rather than providing an implemented specification. To better support the claim of complementarity to logic, we will revise the section to include a minimal worked example, such as applying the criteria to a simple axiom system like basic group theory or a fragment of Peano arithmetic, showing how the AI exploration could reveal structural or accessibility features not directly accessible through logical analysis alone. revision: yes

Circularity Check

0 steps flagged

No circularity: high-level essay with no derivations, equations, or self-referential reductions

full rationale

The paper is a conceptual essay sketching mathematics via universal proofs and structural hypergraphs, then outlining high-level criteria for AI discovery models. It contains no equations, no fitted parameters, no derivations, and no load-bearing self-citations. Claims about AI agents traversing Platonic worlds are presented as expectations and open questions rather than results derived from prior inputs or definitions. The criteria remain desiderata without formalization or reduction to the paper's own assumptions. This matches the reader's zero-circularity assessment; the argument is self-contained as philosophical speculation without any step that reduces by construction to its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The essay relies on assumptions about future AI capabilities and introduces conceptual structures without independent evidence or prior derivation.

axioms (1)
  • domain assumption AI models can be constructed that meet criteria for autonomous mathematical discovery and traversal of proof structures.
    Invoked when outlining main ingredients and criteria for AI models capable of automated discovery.
invented entities (2)
  • universal proof no independent evidence
    purpose: To represent the formal structure of mathematics as a whole.
    Introduced in the sketch of the formal structure of mathematics.
  • structural hypergraphs no independent evidence
    purpose: To model the global structure of formal proofs.
    Proposed as a way to discuss questions about the foundational structure of mathematics.

pith-pipeline@v0.9.0 · 5481 in / 1167 out tokens · 32097 ms · 2026-05-10T19:08:30.632170+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. NeuroClaw Technical Report

    cs.CV 2026-04 unverdicted novelty 6.0

    NeuroClaw introduces a three-tier multi-agent framework and NeuroBench benchmark that improve executability and reproducibility scores for neuroimaging tasks when used with multimodal LLMs.

  2. Ablation and the Meno: Tools for Empirical Metamathematics

    cs.LO 2026-04 unverdicted novelty 6.0

    Meno and tactic ablation on Tao's Analysis I generate proof populations that embed on low one- or two-dimensional submanifolds far from human constructions in Goedel Prover space.

  3. Astrolabe: A Content-Addressable Hypergraph for Semantic Knowledge Management

    math.HO 2026-04 unverdicted novelty 5.0

    Astrolabe is a content-addressable hypergraph for semantic knowledge management using hash-identified entries, arbitrary-width ordered references, and plugin-extensible records.

Reference graph

Works this paper leans on

88 extracted references · 44 canonical work pages · cited by 3 Pith papers · 2 internal anchors

  1. [1]

    Why philosophers should care about computational complexity.Com- plexity, 101:5

    Scott Aaronson. Why philosophers should care about computational complexity.Com- plexity, 101:5

  2. [2]

    Abouzaid, A

    Mohammed Abouzaid, Andrew J Blumberg, Martin Hairer, Joe Kileel, Tamara G Kolda, Paul D Nelson, Daniel Spielman, Nikhil Srivastava, Rachel Ward, Shmuel Wein- berger, et al. First proof.arXiv preprint arXiv:2602.05192, 2026

  3. [3]

    The Univalent Foundations Pro- gram Institute for Advanced Study, 2013

    Peter Aczel, Benedikt Ahrens, Thorsten Altenkirch, Steve Awodey, Bruno Barras, An- drej Bauer, Yves Bertot, Marc Bezem, Thierry Coquand, Eric Finster, et al.Homotopy type theory: univalent foundations of mathematics. The Univalent Foundations Pro- gram Institute for Advanced Study, 2013

  4. [4]

    Freedman, and Michael Mulligan

    Vitaly Aksenov, Eve Bodnia, Michael H. Freedman, and Michael Mulligan. Compres- sion is all you need: Modeling mathematics. 2026. URLhttps://arxiv.org/abs/26 03.20396

  5. [5]

    Mathematicians in the age of ai

    Jeremy Avigad. Mathematicians in the age of ai. PDF, March 2026. URLhttps: //www.andrew.cmu.edu/user/avigad/Papers/mathematicians.pdf. Accessed 2026-03-04. 39

  6. [6]

    Putnam2025: Our solution to putnam 2025 (axiomprover lean proofs)

    Axiom Math. Putnam2025: Our solution to putnam 2025 (axiomprover lean proofs). https://github.com/AxiomMath/putnam2025, January 2026. GitHub repository (released Jan 2026). Accessed 2026-01-21

  7. [7]

    Mlfmf: Data sets for machine learning for mathematical formalization

    Andrej Bauer, Matej Petković, and Ljupčo Todorovski. Mlfmf: Data sets for machine learning for mathematical formalization. InAdvances in Neural Information Processing Systems (NeurIPS), volume 36, 2023. URLhttps://arxiv.org/abs/2310.16005

  8. [8]

    Courier Corporation, 2013

    Rodney J Baxter.Exactly Solved Models in Statistical Mechanics. Courier Corporation, 2013

  9. [9]

    Beckenbach

    Edwin F. Beckenbach. Interesting integers.The American Mathematical Monthly, 52 (4):211, April 1945

  10. [10]

    Machine learning and information theory concepts towards an AI mathematician.Bull

    Yoshua Bengio and Nikolay Malkin. Machine learning and information theory concepts towards an AI mathematician.Bull. Amer. Math. Soc. (N.S.), 61(3):457–469, July

  11. [11]

    doi: 10.1090/bull/1839

  12. [12]

    na, 1988

    Charles H Bennett.Logical depth and physical complexity. na, 1988

  13. [13]

    Erdős problems.https://www.erdosproblems.com/

    Thomas Bloom. Erdős problems.https://www.erdosproblems.com/. URLhttps: //www.erdosproblems.com/. Accessed: 2026-01-20

  14. [14]

    Olausson, Lionel Wong, Gabriel Grand, Joshua B

    Matthew Bowers, Theo X. Olausson, Lionel Wong, Gabriel Grand, Joshua B. Tenen- baum, KevinEllis, andArmandoSolar-Lezama. Top-DownSynthesisforLibraryLearn- ing.Proceedings of the ACM on Programming Languages, 7(POPL):41:1182–41:1213, January 2023. doi: 10.1145/3571234. URLhttps://dl.acm.org/doi/10.1145/357 1234

  15. [15]

    Brenner, Vincent Cohen-Addad, and David Woodruff.Solving an Open Problem in Theoretical Physics using AI-Assisted Discovery

    Michael P. Brenner, Vincent Cohen-Addad, and David Woodruff. Solving an open problem in theoretical physics using ai-assisted discovery, 2026. URLhttps://arxiv. org/abs/2603.04735

  16. [16]

    The motivic class of the space of genus0maps to the flag variety.arXiv preprint arXiv:2601.07222, 2026

    Jim Bryan, Balázs Elek, Freddie Manners, George Salafatinos, and Ravi Vakil. The motivic class of the space of genus0maps to the flag variety, 2026. URLhttps: //arxiv.org/abs/2601.07222

  17. [17]

    Samuel R. Buss. On gödel’s theorems on lengths of proofs i: Number of lines and speedup for arithmetics.Journal of Symbolic Logic, 59(3):737–756, 1994. doi: 10.230 7/2275906

  18. [18]

    How deep neural networks learn compositional data: The random hierarchy model.Physical Review X, 14(3):031001, 2024

    Francesco Cagnetta, Leonardo Petrini, Umberto M Tomasini, Alessandro Favero, and Matthieu Wyart. How deep neural networks learn compositional data: The random hierarchy model.Physical Review X, 14(3):031001, 2024

  19. [19]

    Intrinsically motivated reinforcement learning.Advances in neural information processing systems, 17, 2004

    Nuttapong Chentanez, Andrew Barto, and Satinder Singh. Intrinsically motivated reinforcement learning.Advances in neural information processing systems, 17, 2004

  20. [20]

    Coester,Transposition is nearly optimal for IID list update, preprint, arXiv:2603.10244 (2026)

    Christian Coester. Transposition is nearly optimal for iid list update, 2026. URL https://arxiv.org/abs/2603.10244

  21. [21]

    Springer Science & Business Media, 2012

    Simon Colton.Automated theory formation in pure mathematics. Springer Science & Business Media, 2012. 40

  22. [22]

    On the notion of interestingness in automatedmathematicaldiscovery.International Journal of Human-Computer Studies, 53(3):351–375, 2000

    Simon Colton, Alan Bundy, and Toby Walsh. On the notion of interestingness in automatedmathematicaldiscovery.International Journal of Human-Computer Studies, 53(3):351–375, 2000. doi: 10.1006/ijhc.2000.0394. URLhttps://www.research.ed. ac.uk/files/408775/On_the_notion_of_interestingness_in_automated_mathe matical_discovery.pdf

  23. [23]

    Automatic invention of integer sequences

    Simon Colton, Alan Bundy, and Toby Walsh. Automatic invention of integer sequences. InAAAI/IAAI, pages 558–563, 2000

  24. [24]

    Coarse structure.Wikipedia, The Free Encyclopedia, 2025

    Wikipedia contributors. Coarse structure.Wikipedia, The Free Encyclopedia, 2025. URLhttps://en.wikipedia.org/wiki/Coarse_structure

  25. [25]

    Working with machines in mathematics.Bull

    Alex Davies. Working with machines in mathematics.Bull. Amer. Math. Soc. (N.S.), 61(3):387–394, July 2024. doi: 10.1090/bull/1843

  26. [26]

    Automated conjecturing with \emph{TxGraffiti}

    Randy Davila. Automated conjecturing in mathematics with\emph{TxGraffiti}.arXiv preprint arXiv:2409.19379, 2024

  27. [27]

    Graffiti3: Compact theory libraries for automated mathematical dis- covery.Research Square, 2026

    Randy Davila. Graffiti3: Compact theory libraries for automated mathematical dis- covery.Research Square, 2026. doi: 10.21203/rs.3.rs-8493329/v1. Posted January 19, 2026

  28. [28]

    Z3: An efficient smt solver

    Leonardo De Moura and Nikolaj Bjørner. Z3: An efficient smt solver. InInternational conference on Tools and Algorithms for the Construction and Analysis of Systems, pages 337–340. Springer, 2008

  29. [29]

    Proofs for a price: tomorrow’s ultra-rigorous mathematical culture

    Silvia De Toffoli. Proofs for a price: tomorrow’s ultra-rigorous mathematical culture. Bull. Amer. Math. Soc. (N.S.), 61(3):395–410, July 2024. doi: 10.1090/bull/1823

  30. [30]

    Morgan Kaufmann Publishers, San Francisco, CA, 2003

    Rina Dechter.Constraint Processing. Morgan Kaufmann Publishers, San Francisco, CA, 2003. ISBN 978-1558608906

  31. [31]

    Alephzero and mathematical experience.Bull

    Simon DeDeo. Alephzero and mathematical experience.Bull. Amer. Math. Soc. (N.S.), 61(3):375–386, July 2024. doi: 10.1090/bull/1824

  32. [32]

    Some history of the development of Graffiti

    Ermelinda DeLaViña. Some history of the development of Graffiti. In Siemion Fajt- lowicz, Patrick W. Fowler, Pierre Hansen, Melvin F. Janowitz, and Fred S. Roberts, editors,Graphs and Discovery, volume 69 ofDIMACS Series in Discrete Mathemat- ics and Theoretical Computer Science, pages 81–118. American Mathematical Society, Providence, RI, 2005. doi: 10.1...

  33. [33]

    Dreamcoder: Growing generalizable, interpretable knowledge with wake-sleep bayesian program learning,

    Kevin Ellis, Catherine Wong, Maxwell Nye, Mathias Sable-Meyer, Luc Cary, Lucas Morales, Luke Hewitt, Armando Solar-Lezama, and Joshua B. Tenenbaum. Dream- Coder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian pro- gram learning.arXiv:2006.08381 [cs], June 2020. URLhttp://arxiv.org/abs/2006 .08381. arXiv: 2006.08381

  34. [34]

    Standard models of arithmetic

    Ali Enayat. Standard models of arithmetic. In Martin Kaså, editor,Idées Fixes: a Festschrift dedicated to Christian Bennet on the occasion of his 60th birthday, pages 55–64. University of Gothenburg Publications, Gothenburg, 2014. URLhttps://ww w.researchgate.net/publication/281936805_STANDARD_MODELS_OF_ARITHMETIC. Festschrift chapter. 41

  35. [35]

    Frontiermath: Open problems.https://epoch.ai/frontiermath/open-p roblems,

    Epoch AI. Frontiermath: Open problems.https://epoch.ai/frontiermath/open-p roblems, . Accessed: 2026-01-27

  36. [36]

    Frontiermath: Llm benchmark for advanced ai math reasoning

    Epoch AI. Frontiermath: Llm benchmark for advanced ai math reasoning. Online, . URLhttps://epoch.ai/frontiermath. Accessed: 2026-01-25

  37. [37]

    On conjectures of graffiti

    Siemion Fajtlowicz. On conjectures of graffiti. InAnnals of discrete mathematics, volume 38, pages 113–118. Elsevier, 1988

  38. [38]

    Aletheia tackles FirstProof autonomously.arXiv preprint arXiv:2602.21201, 2026

    Tony Feng, Junehyuk Jung, Sang hyun Kim, Carlo Pagano, Sergei Gukov, Chiang- Chiang Tsai, David Woodruff, Adel Javanmard, Aryan Mokhtari, Dawsen Hwang, Yuri Chervonyi, Jonathan N. Lee, Garrett Bingham, Trieu H. Trinh, Vahab Mirrokni, Quoc V. Le, and Thang Luong. Aletheia tackles firstproof autonomously, 2026. URL https://arxiv.org/abs/2602.21201

  39. [39]

    Truth and speed-up.The Review of Symbolic Logic, 7(2):319–340,

    Martin Fischer. Truth and speed-up.The Review of Symbolic Logic, 7(2):319–340,

  40. [40]

    Will machines change mathematics?Bull

    Maia Fraser, Andrew Granville, Michael Harris, Colin McLarty, Emily Riehl, and Ak- shay Venkatesh. Will machines change mathematics?Bull. Amer. Math. Soc. (N.S.), 61(3):373–374, July 2024. doi: 10.1090/bull/1842

  41. [41]

    Natural deduction via graphs: formal definition and computation rules.Mathematical Structures in Computer Science, 17:485–526, 2007

    Herman Geuvers and Iris Loeb. Natural deduction via graphs: formal definition and computation rules.Mathematical Structures in Computer Science, 17:485–526, 2007. doi: 10.1017/S0960129507006123

  42. [42]

    Lattice-valued bottleneck duality,

    Robert Ghrist, Julian Gould, and Miguel Lopez. Lattice-valued bottleneck duality,

  43. [43]

    URLhttps://arxiv.org/abs/2410.00315

  44. [44]

    Linear logic , journal =

    Jean-Yves Girard. Linear logic.Theoretical Computer Science, 50:1–102, 1987. doi: 10.1016/0304-3975(87)90045-4

  45. [45]

    FrontierMath: A benchmark for evaluating advanced mathematical reasoning in AI.arXiv preprint arXiv:2411.04872, 2024

    Elliot Glazer, Ege Erdil, Tamay Besiroglu, Diego Chicharro, Evan Chen, Alex Gunning, Caroline Falkman Olsson, Jean-Stanislas Denain, Anson Ho, Emily de Oliveira Santos, Olli Järviniemi, Matthew Barnett, Robert Sandler, Matej Vrzala, Jaime Sevilla, Qiuyu Ren, Elizabeth Pratt, Lionel Levine, Grant Barkley, Natalie Stewart, Bogdan Grechuk, Tetiana Grechuk, S...

  46. [46]

    Kac, Random walk and the theory of brownian motion, The American Ma- thematical Monthly54, 369 (1947), https://doi.org/10.1080/00029890.1947.11990189

    Kurt Gödel. What is cantor’s continuum problem?The American Mathematical Monthly, 54(9):515–525, 1947. doi: 10.1080/00029890.1947.11991877

  47. [47]

    Stephen Jay Gould.Wonderful Life: The Burgess Shale and the Nature of History. W. W. Norton & Company, New York, 1989

  48. [48]

    Lilo: Learning interpretable libraries by compressing and documenting code

    Gabriel Grand, Lionel Wong, Matthew Bowers, Theo X. Olausson, Muxin Liu, Joshua B. Tenenbaum, and Jacob Andreas. LILO: Learning Interpretable Libraries by Compressing and Documenting Code, October 2023. URLhttp://arxiv.org/ab s/2310.19791. arXiv:2310.19791 [cs]. 42

  49. [49]

    Guevara, A

    Alfredo Guevara, Alexandru Lupsasca, David Skinner, Andrew Strominger, and Kevin Weil. Single-minus gluon tree amplitudes are nonzero, 2026. URLhttps://arxiv.or g/abs/2602.12176

  50. [50]

    Formal entity graphs as complex networks: Assessing centrality metrics of the archive of formal proofs

    Fabian Huch. Formal entity graphs as complex networks: Assessing centrality metrics of the archive of formal proofs. InIntelligent Computer Mathematics: 15th International Conference, CICM 2022, Prague, Czech Republic, September 19–23, 2022, Proceedings, pages 151–166. Springer, 2022. doi: 10.1007/978-3-031-16681-5_10

  51. [51]

    Uijeong Jang and Ernest K. Ryu. Point convergence of nesterov’s accelerated gradient method: An ai-assisted proof, 2025. URLhttps://arxiv.org/abs/2510.23513

  52. [52]

    Conjecture synthesis for inductive theories.Journal of Automated Reasoning, 47(3):251–289, 2011

    Moa Johansson, Lucas Dixon, and Alan Bundy. Conjecture synthesis for inductive theories.Journal of Automated Reasoning, 47(3):251–289, 2011

  53. [53]

    Martin.Speech and Language Processing

    Daniel Jurafsky and James H. Martin.Speech and Language Processing. Prentice Hall, Upper Saddle River, NJ, 2 edition, 2009. ISBN 978-0131873216

  54. [54]

    Douglas B. Lenat. Automated theory formation in mathematics. InProceedings of the 5th International Joint Conference on Artificial Intelligence (IJCAI-77), pages 833– 842, Cambridge, MA, 1977. URLhttps://www.ijcai.org/Proceedings/77-2/Pap ers/061.pdf

  55. [55]

    Why does deep and cheap learning work so well?Journal of Statistical Physics, 168(6):1223–1247, 2017

    Henry W Lin, Max Tegmark, and David Rolnick. Why does deep and cheap learning work so well?Journal of Statistical Physics, 168(6):1223–1247, 2017

  56. [56]

    Numina-lean-agent: An open and general agentic reasoning system for formal mathematics.arXiv preprint arXiv:2601.14027,

    Junqi Liu, Zihao Zhou, Zekai Zhu, Marco Dos Santos, Weikun He, Jiawei Liu, Ran Wang, Yunzhou Xie, Junqiao Zhao, Qiufeng Wang, Lihong Zhi, Jia Li, and Wenda Li. Numina-lean-agent: An open and general agentic reasoning system for formal mathematics, 2026. URLhttps://arxiv.org/abs/2601.14027

  57. [57]

    Advanced version of gemini with deep think officially achieves gold-medal standard at the international mathematical olympiad

    Thang Luong and Edward Lockhart. Advanced version of gemini with deep think officially achieves gold-medal standard at the international mathematical olympiad. https://deepmind.google/discover/blog/advanced-version-of-gemini-with-d eep-think-officially-achieves-gold-medal-standard-at-the-international -mathematical-olympiad/, 7 2025. Google DeepMind; acce...

  58. [58]

    Introducing gauss, an agent for autoformalization, 2025

    Math, Inc. Introducing gauss, an agent for autoformalization, 2025. URLhttps: //www.math.inc/gauss. Announcement of Gauss; completion of the strong Prime Number Theorem formalization in Lean

  59. [59]

    Sphere-packing-lean: Sphere packing in lean

    Math, Inc. Sphere-packing-lean: Sphere packing in lean. GitHub repository, 2026. URL https://github.com/math-inc/Sphere-Packing-Lean. A Lean formalisation of sphere-packingoptimalityindimensions8and24(andperiodicuniquenessindimension 24). Accessed: 2026-03-05

  60. [60]

    MathZero, The Classification Problem, and Set-Theoretic Type Theory.arXiv:2005.05512 [cs], May 2020

    David McAllester. MathZero, The Classification Problem, and Set-Theoretic Type Theory.arXiv:2005.05512 [cs], May 2020. URLhttp://arxiv.org/abs/2005.05512. arXiv: 2005.05512

  61. [61]

    Single axioms for groups and abelian groups with various operations

    William McCune. Single axioms for groups and abelian groups with various operations. Journal of Automated Reasoning, 10(1):1–13, 1993. 43

  62. [62]

    Poincaré on the value of reasoning machines.Bull

    Colin McLarty. Poincaré on the value of reasoning machines.Bull. Amer. Math. Soc. (N.S.), 61(3):411–422, July 2024. doi: 10.1090/bull/1822

  63. [63]

    Google a.i

    Cade Metz. Google a.i. system wins gold medal in international math olympiad.https: //www.nytimes.com/2025/07/21/technology/google-ai-international-mathema tics-olympiad.html, 7 2025. The New York Times; accessed 2025-10-20

  64. [64]

    Mathematical discovery in the age of artificial intel- ligence.Nature Physics, pages 1–3, 2025

    Bartosz Naskręcki and Ken Ono. Mathematical discovery in the age of artificial intel- ligence.Nature Physics, pages 1–3, 2025

  65. [65]

    AlphaEvolve: A coding agent for scientific and algorithmic discovery

    Alexander Novikov, Ngân V˜ u, Marvin Eisenberger, Emilien Dupont, Po-Sen Huang, Adam Zsolt Wagner, Sergey Shirobokov, Borislav Kozlovskii, Francisco JR Ruiz, Abbas Mehrabian, et al. Alphaevolve: A coding agent for scientific and algorithmic discovery. arXiv preprint arXiv:2506.13131, 2025

  66. [66]

    Automatedmathematicsandthereconfigurationofproofandlabor

    RodrigoOchigame. Automatedmathematicsandthereconfigurationofproofandlabor. Bull. Amer. Math. Soc. (N.S.), 61(3):423–437, July 2024. doi: 10.1090/bull/1821

  67. [67]

    Our first proof submissions

    OpenAI. Our first proof submissions. OpenAI, February 2026. URLhttps://openai .com/index/first-proof-submissions/. Accessed: 2026-02-25

  68. [68]

    Gabriel Poesia, David Broman, Nick Haber, and Noah D. Goodman. Learning Formal Mathematics From Intrinsic Motivation, June 2024. URLhttp://arxiv.org/abs/24 07.00695. arXiv:2407.00695 [cs]

  69. [69]

    Compositional sparsity of learnable functions.Bull

    Tomaso Poggio and Maia Fraser. Compositional sparsity of learnable functions.Bull. Amer. Math. Soc. (N.S.), 61(3):438–456, July 2024. doi: 10.1090/bull/1820

  70. [70]

    The lengths of proofs

    Pavel Pudlák. The lengths of proofs. In Samuel R. Buss, editor,Handbook of Proof Theory, volume 137 ofStudies in Logic and the Foundations of Mathematics, pages 547–637. Elsevier–North-Holland, Amsterdam, 1998. URLhttps://users.math.cas .cz/~pudlak/length.pdf. See Theorem 7.2.2 (speed-up by any function provably total in the base theory) and Theorem 7.3.2

  71. [71]

    Finite entropy sums in quantum field theory, 2025

    Mark Van Raamsdonk. Finite entropy sums in quantum field theory, 2025. URL https://arxiv.org/abs/2508.21276

  72. [72]

    Mathematical discoveries from program search with large language models.Nature, 625(7995):468–475, 2024

    Bernardino Romera-Paredes, Mohammadamin Barekatain, Alexander Novikov, Matej Balog, M Pawan Kumar, Emilien Dupont, Francisco JR Ruiz, Jordan S Ellenberg, Pengming Wang, Omar Fawzi, et al. Mathematical discoveries from program search with large language models.Nature, 625(7995):468–475, 2024

  73. [73]

    Learning hierarchical cate- gories in deep neural networks

    Andrew M Saxe, James L McClellans, and Surya Ganguli. Learning hierarchical cate- gories in deep neural networks. InProceedings of the Annual Meeting of the Cognitive Science Society, volume 35, 2013

  74. [74]

    Vibe physics: the AI grad student,

    Matthew D. Schwartz. Resummation of the c-parameter sudakov shoulder using effec- tive field theory, 2026. URLhttps://arxiv.org/abs/2601.02484

  75. [75]

    MIT press Cambridge, 1998

    Richard S Sutton, Andrew G Barto, et al.Reinforcement learning: An introduction, volume 1. MIT press Cambridge, 1998. 44

  76. [76]

    Mastodon post.https://mathstodon.xyz/@tao/115306424727150237, 10 2025

    Terence Tao. Mastodon post.https://mathstodon.xyz/@tao/115306424727150237, 10 2025

  77. [77]

    Hilbert’stwenty-fourthproblem.The American Mathematical Monthly, 110(1):1–24, 2003

    RüdigerThiele. Hilbert’stwenty-fourthproblem.The American Mathematical Monthly, 110(1):1–24, 2003. doi: 10.1080/00029890.2003.11919933. URLhttps://doi.org/10 .1080/00029890.2003.11919933

  78. [78]

    Putnambench leaderboard.https://trishullab.github.io/PutnamB ench/leaderboard.html

    Trishul Lab. Putnambench leaderboard.https://trishullab.github.io/PutnamB ench/leaderboard.html. Webpage. Accessed 2026-01-21

  79. [79]

    PutnamBench: Evaluating Neural Theorem‑Provers on the Putnam Mathematical Com- petition, 2024

    George Tsoukalas, Jasper Lee, John Jennings, Jimmy Xin, Michelle Ding, Michael Jen- nings, Amitayush Thakur, and Swarat Chaudhuri. Putnambench: Evaluating neural theorem-provers on the putnam mathematical competition.https://arxiv.org/abs/ 2407.11214, 2024. Accessed 2026-01-21

  80. [80]

    Learning Interestingness in Automated Mathematical Theory Formation, November 2025

    George Tsoukalas, Rahul Saha, Amitayush Thakur, Sabrina Reguyal, and Swarat Chaudhuri. Learning Interestingness in Automated Mathematical Theory Formation, November 2025. URLhttp://arxiv.org/abs/2511.14778. arXiv:2511.14778 [cs] version: 1

Showing first 80 references.