Mapping Uncharted Symmetries: Machine Discovery in Combinatorics
Pith reviewed 2026-05-20 11:45 UTC · model grok-4.3
The pith
Machine learning discovers a new combinatorial interpretation of the q,t-Narayana polynomials based on noncrossing partitions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central discovery is a new combinatorial interpretation of the q,t-Narayana polynomials that arises from representation theory and is realized by statistics on noncrossing partitions. Machine learning under exact proportion constraints identifies these statistics, and one of them directly yields a combinatorial proof of the polynomials' symmetry in a previously unsolved case. All findings are formalized in Lean 4.
What carries the argument
Statistics on noncrossing partitions discovered by MapSeek-Functional and MapSeek-Symbolic that generate the q,t-Narayana polynomials and establish their symmetry.
If this is right
- The interpretation supplies the first combinatorial model of these polynomials using noncrossing partitions.
- One discovered statistic gives a direct combinatorial proof of symmetry for a case without prior proof.
- The SLURP framework and MapSeek procedures are shown to produce verifiable mathematical objects in algebraic combinatorics.
- Full Lean 4 formalization and released code make the discoveries immediately checkable by others.
Where Pith is reading between the lines
- The same constrained search approach could be tried on other families of polynomials whose combinatorial interpretations remain incomplete.
- Success here indicates that rigid-proportion constraints can guide machine search toward objects that admit human-readable proofs.
- Combining the methods with proof assistants may create a repeatable workflow for generating candidate combinatorial identities.
Load-bearing premise
The statistics produced by the search procedures must reflect genuine combinatorial structure rather than artifacts of the training distribution or search heuristic.
What would settle it
Explicit enumeration of noncrossing partitions for small n and k must show that the proposed statistic reproduces the exact coefficients of the q,t-Narayana polynomial.
Figures
read the original abstract
Inspired by long-standing open problems in algebraic combinatorics, we show that modern machine learning can meaningfully contribute to verifiable mathematical discoveries. In particular, we focus on the construction of simple mathematical functions under exact distributional constraints, a setting we formalize as Simple Learning Under Rigid Proportions (SLURP). We tackle this problem by introducing two methods: MapSeek-Functional, which models the desired function alternating pseudo-labeling and supervised training steps; and MapSeek-Symbolic, designed to directly produce symbolic formulas. We successfully apply both methods to a research problem in algebraic combinatorics, discovering a new combinatorial interpretation of the $q,t$-Narayana polynomials arising from representation theory. To our knowledge, this is the first such interpretation based on noncrossing partitions. Using one discovered statistic, we find a combinatorial proof of the symmetry of these polynomials in a previously unsolved case. To streamline verification and reproducibility, we release all code, including a formalization of all the mathematical discoveries of this paper in Lean 4.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the SLURP setting for discovering simple functions under exact distributional constraints and develops two methods (MapSeek-Functional, which alternates pseudo-labeling with supervised training, and MapSeek-Symbolic) to address it. These methods are applied to an open problem in algebraic combinatorics, yielding a new combinatorial interpretation of the q,t-Narayana polynomials in terms of noncrossing partitions (the first such interpretation) together with a combinatorial proof of symmetry in a previously unsolved case; all identities and the proof are formalized and machine-checked in Lean 4, with full code released.
Significance. If the results hold, the work demonstrates that modern machine-learning techniques can produce verifiable, parameter-free contributions to algebraic combinatorics by supplying the first noncrossing-partition interpretation of the q,t-Narayana polynomials and resolving an open symmetry question combinatorially. The explicit Lean 4 formalization of the discovered statistics, identities, and proof, together with the public release of all code, constitutes a major strength: the mathematical claims become independently verifiable and decoupled from the training procedure once found.
minor comments (3)
- The precise proportions and sampling procedure used to generate the SLURP training instances are described only at a high level in the application section; adding an explicit table or pseudocode listing the exact distributional constraints would improve reproducibility of the discovery process.
- Notation for the newly discovered statistics on noncrossing partitions is introduced inline; a dedicated subsection collecting the definitions, together with their q,t-generating functions, would aid readability.
- The Lean 4 formalization is referenced in the abstract and conclusion, but the main text should include a direct pointer to the repository or a short summary of which theorems have been checked (e.g., the symmetry identity in the open case).
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the manuscript, including recognition of the SLURP framework, the application to the q,t-Narayana polynomials, the first noncrossing-partition interpretation, the combinatorial symmetry proof, and the Lean 4 formalization with public code release. We note the recommendation for minor revision and will address any editorial or presentational suggestions in the revised version.
Circularity Check
No significant circularity identified
full rationale
The paper employs MapSeek-Functional and MapSeek-Symbolic to surface candidate statistics on noncrossing partitions, then extracts an explicit combinatorial interpretation of the q,t-Narayana polynomials together with a symmetry proof; both are stated as ordinary combinatorial objects and are independently formalized and machine-checked in Lean 4 with released code. Because the final mathematical statements are parameter-free, externally verifiable, and do not redefine any quantity in terms of the search procedure or fitted values, the derivation chain does not reduce to its inputs by construction. No self-definitional equations, fitted-input predictions, or load-bearing self-citations appear in the central claims.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Standard properties of q,t-Narayana polynomials and noncrossing partitions as defined in the algebraic combinatorics literature
Reference graph
Works this paper leans on
-
[1]
Aristotle: IMO-level Automated Theorem Proving
Tudor Achim, Alex Best, Alberto Bietti, Kevin Der, Mathïs Fédérico, Sergei Gukov, Daniel Halpern-Leistner, Kirsten Henningsgard, Yury Kudryashov, Alexander Meiburg, Martin Michelsen, Riley Patterson, Eric Rodriguez, Laura Scharff, Vikram Shanker, Vladmir Sicca, 10 Hari Sowrirajan, Aidan Swope, Matyas Tamas, Vlad Tenev, Jonathan Thomm, Harold Williams, and...
work page internal anchor Pith review Pith/arXiv arXiv
-
[3]
Global lyapunov functions: a long- standing open problem in mathematics, with symbolic transformers
Alberto Alfarano, François Charton, and Amaury Hayat. Global lyapunov functions: a long- standing open problem in mathematics, with symbolic transformers. In Amir Globersons, Lester Mackey, Danielle Belgrave, Angela Fan, Ulrich Paquet, Jakub M. Tomczak, and Cheng Zhang, editors,Advances in Neural Information Processing Systems 38: Annual Conference on Neu...
-
[4]
Statistics on parallelogram polyominoes and a q, t-analogue of the Narayana numbers.J
Jean-Christophe Aval, Michele D’Adderio, Mark Dukes, Angela Hicks, and Yvan Le Borgne. Statistics on parallelogram polyominoes and a q, t-analogue of the Narayana numbers.J. Combin. Theory Ser. A, 123:271–286, 2014. ISSN 0097-3165. URL https://doi.org/10. 1016/j.jcta.2013.09.001
work page 2014
-
[6]
Easy Learning from Label Proportions.CoRR, abs/2302.03115, 2023
Róbert Istvan Busa-Fekete, Heejin Choi, Travis Dick, Claudio Gentile, and Andrés Muñoz Medina. Easy Learning from Label Proportions.CoRR, abs/2302.03115, 2023. doi: 10.48550/ ARXIV .2302.03115. URLhttps://doi.org/10.48550/arXiv.2302.03115
-
[7]
Neurosymbolic programming.Found
Swarat Chaudhuri, Kevin Ellis, Oleksandr Polozov, Rishabh Singh, Armando Solar-Lezama, and Yisong Yue. Neurosymbolic programming.Found. Trends Program. Lang., 7(3):158–243,
-
[8]
URLhttps://doi.org/10.1561/2500000049
doi: 10.1561/2500000049. URLhttps://doi.org/10.1561/2500000049
-
[11]
Michele D’Adderio, Alessandro Iraci, and Anna Vanden Wyngaerd. Theta operators, refined delta conjectures, and coinvariants.Adv. Math., 376:60, 2021. ISSN 0001-8708. doi: 10.1016/j. aim.2020.107447. Id/No 107447
work page doi:10.1016/j 2021
-
[12]
Michele D’Adderio, Alessandro Iraci, and Anna Vanden Wyngaerd.Decorated Dyck paths, polyominoes, and the Delta conjecture, volume 278 ofMemoirs of the American Mathemat- ical Society, no. 1370. American Mathematical Society, July 2022. ISBN 9781470471705 9781470471576. doi: 10.1090/memo/1370. URLhttps://www.ams.org/memo/1370/
-
[13]
Alex Davies, Petar Velickovic, Lars Buesing, Sam Blackwell, Daniel Zheng, Nenad Tomasev, Richard Tanburn, Peter W. Battaglia, Charles Blundell, András Juhász, Marc Lackenby, Geordie Williamson, Demis Hassabis, and Pushmeet Kohli. Advancing mathematics by guiding human intuition with AI.Nat., 600(7887):70–74, 2021. doi: 10.1038/S41586-021-04086-X. URL http...
-
[14]
Adriano M. Garsia and James Haglund. A proof of the q, t-Catalan positivity conjecture. Discrete Math, 256(3):677–717, 2002. ISSN 0012-365X. URL https://doi.org/10.1016/ S0012-365X(02)00343-6. LaCIM 2000 Conference on Combinatorics, Computer Science and Applications (Montreal, QC). 11
work page 2002
-
[15]
Generalized q, t- Catalan numbers.Algebraic Combinatorics, 3(4):855–886, 2020
Eugene Gorsky, Graham Hawkes, Anne Schilling, and Julianne Rainbolt. Generalized q, t- Catalan numbers.Algebraic Combinatorics, 3(4):855–886, 2020. URL https://arxiv.org/ abs/1905.10973
-
[16]
Sumit Gulwani, Oleksandr Polozov, and Rishabh Singh. Program synthesis.Found. Trends Program. Lang., 4(1-2):1–119, 2017. doi: 10.1561/2500000010. URL https://doi.org/10. 1561/2500000010
-
[17]
With an appendix on the combinatorics of Macdonald polynomials, volume 41 ofUniv
James Haglund.The q, t-Catalan numbers and the space of diagonal harmonics. With an appendix on the combinatorics of Macdonald polynomials, volume 41 ofUniv. Lect. Ser. Providence, RI: American Mathematical Society (AMS), 2008. ISBN 978-0-8218-4411-3
work page 2008
-
[18]
Hilbert schemes, polygraphs and the Macdonald positivity conjecture.J
Mark Haiman. Hilbert schemes, polygraphs and the Macdonald positivity conjecture.J. Amer. Math. Soc., 14(4):941–1006, 2001. ISSN 0894-0347. URL https://doi.org/10.1090/ S0894-0347-01-00373-3
work page 2001
-
[19]
Learning from label proportions: Bootstrapping supervised learners via belief propagation
Shreyas Havaldar, Navodita Sharma, Shubhi Sareen, Karthikeyan Shanmugam, and Aravindan Raghuveer. Learning from label proportions: Bootstrapping supervised learners via belief propagation. InThe Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024. URL https://openreview.net/ forum?...
work page 2024
-
[20]
Xiaoyu Huang, Blake Jackson, and Kyu-Hwan Lee. From black box to bijection: Interpreting machine learning to build a zeta map algorithm.CoRR, abs/2511.12421, 2025. doi: 10.48550/ ARXIV .2511.12421. URLhttps://doi.org/10.48550/arXiv.2511.12421
-
[21]
I. G. Macdonald. A new class of symmetric functions.Sémin. Lothar. Comb., 20:b20a, 41, 1988. ISSN 1286-4889. URLhttps://eudml.org/doc/120965
work page 1988
-
[22]
Mathlib community. The Lean mathematical library. InProceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs, CPP 2020, pages 367–381, New York, NY , USA, 2020. Association for Computing Machinery. ISBN 9781450370974. doi: 10.1145/3372885.3373824. URLhttps://doi.org/10.1145/3372885.3373824
-
[23]
The Lean 4 theorem prover and programming language
Leonardo de Moura and Sebastian Ullrich. The Lean 4 theorem prover and programming language. InAutomated Deduction – CADE 28: 28th International Conference on Automated Deduction, Virtual Event, July 12–15, 2021, Proceedings, pages 625–635, Berlin, Heidelberg,
work page 2021
-
[24]
TheLean4theoremproverandprogramming language
Springer-Verlag. doi: 10.1007/978-3-030-79876-5_37. URL https://doi.org/10. 1007/978-3-030-79876-5_37
-
[25]
Langdon, and Nicholas Freitag McPhee.A Field Guide to Genetic Pro- gramming
Riccardo Poli, William B. Langdon, and Nicholas Freitag McPhee.A Field Guide to Genetic Pro- gramming. lulu.com, 2008. ISBN 978-1-4092-0073-4. URL http://www.gp-field-guide. org.uk/
work page 2008
-
[26]
Novi Quadrianto, Alexander J. Smola, Tibério S. Caetano, and Quoc V . Le. Estimating Labels from Label Proportions.J. Mach. Learn. Res., 10:2349–2374, 2009. doi: 10.5555/1577069. 1755865. URLhttps://dl.acm.org/doi/10.5555/1577069.1755865
-
[27]
Jeffrey B. Remmel and Andrew T. Wilson. An extension of MacMahon’s equidistribution theorem to ordered set partitions.J. Combin. Theory Ser. A, 134:242–277, 2015. ISSN 0097-
work page 2015
-
[28]
URL https://doi.org/10.1016/j.jcta.2015
doi: 10.1016/j.jcta.2015.03.012. URL https://doi.org/10.1016/j.jcta.2015. 03.012
-
[29]
Learning from Label Proportions: A Mutual Contamination Framework
Clayton Scott and Jianxin Zhang. Learning from Label Proportions: A Mutual Contamination Framework. In Hugo Larochelle, Marc’Aurelio Ranzato, Raia Hadsell, Maria-Florina Balcan, and Hsuan-Tien Lin, editors,Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December 6-12, 202...
work page 2020
-
[30]
Sara Silva and Ernesto Costa. Dynamic limits for bloat control in genetic programming and a review of past and current bloat theories.Genet. Program. Evolvable Mach., 10(2): 141–179, 2009. doi: 10.1007/S10710-008-9075-9. URL https://doi.org/10.1007/ s10710-008-9075-9. 12
-
[32]
Constructions in combinatorics via neural networks.CoRR, abs/2104.14516, 2021
Adam Zsolt Wagner. Constructions in combinatorics via neural networks.CoRR, abs/2104.14516, 2021. URLhttps://arxiv.org/abs/2104.14516
-
[33]
From scale to speed: Adaptive test-time scaling for image editing.CoRR, abs/2603.00141, 2026
Geordie Williamson. Is deep learning a useful tool for the pure mathematician?CoRR, abs/2304.12602, 2023. doi: 10.48550/ARXIV .2304.12602. URL https://doi.org/10. 48550/arXiv.2304.12602. 13 Aq, t-Combinatorics Background & Narayana Models Triggered by the discovery of the famous Macdonald polynomials [19], q, t-combinatorics studies families of bivariate ...
work page internal anchor Pith review doi:10.48550/arxiv 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.