pith. machine review for the scientific record. sign in

arxiv: 2604.21187 · v1 · submitted 2026-04-23 · 🧮 math.CO · cs.AI

Recognition: unknown

Doubly Saturated Ramsey Graphs: A Case Study in Computer-Assisted Mathematical Discovery

Authors on Pith no claims yet

Pith reviewed 2026-05-09 21:27 UTC · model grok-4.3

classification 🧮 math.CO cs.AI
keywords doubly saturated Ramsey graphsSAT solvinglarge language modelsLean formalizationcomputer-assisted mathematicsGrinstead-Roberts questioninfinite graph families
0
0 comments X

The pith

Combining SAT solvers with LLM-generated code discovers infinite families of doubly saturated Ramsey graphs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Ramsey-good graphs contain neither an s-clique nor a t-independent set. Doubly saturated versions of these graphs have the additional property that adding or removing any single edge forces either an s-clique or a t-independent set. The paper demonstrates a workflow that uses SAT solvers together with code produced by large language models to locate base examples and then extend them into infinite families. This construction directly resolves a question posed by Grinstead and Roberts in 1982. The authors further employ large language models to produce and certify correctness proofs for these families inside the Lean theorem prover.

Core claim

The authors establish that infinite families of doubly saturated Ramsey-good graphs exist, constructed by first using SAT solvers on LLM-generated encodings to find finite instances and then generalizing the pattern, with all correctness arguments formalized and checked in Lean.

What carries the argument

Doubly saturated Ramsey-good graphs, which avoid s-cliques and t-independent sets yet become non-good under the addition or deletion of any edge.

If this is right

  • Infinite families of doubly saturated Ramsey-good graphs exist for the parameters examined.
  • The 1982 Grinstead-Roberts question receives an explicit affirmative answer via concrete constructions.
  • Machine-checked Lean proofs confirm the infinite families are correct.
  • The SAT-plus-LLM workflow can be applied to locate similar structures in other combinatorial settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Large language models can be used both to generate search code and to produce formal proofs for infinite combinatorial families.
  • Hybrid automated workflows may address additional open questions in extremal graph theory beyond the current parameters.
  • The method opens the possibility of systematically exploring doubly saturated properties for larger values of s and t.

Load-bearing premise

The LLM-generated code correctly encodes the definition of doubly saturated graphs as a SAT problem and the Lean scripts faithfully prove the properties of the resulting infinite families without introducing errors.

What would settle it

An independent re-implementation of the SAT encoding that returns a finite graph in one of the claimed families violating the doubly saturated condition, or an error in the Lean proof that permits a counterexample.

Figures

Figures reproduced from arXiv: 2604.21187 by Benjamin Przybocki, Bernardo Subercaseaux, John Mackey, Marijn J. H. Heule.

Figure 1
Figure 1. Figure 1: Hasse diagram for all 640 R(4, 4)-good graphs on 15 vertices, including 2 doubly saturated graphs. The size of each component is given above it. In fact, on the basis of experimental evidence, we conjecture that there are doubly saturated R(s, t)-good graphs for almost all choices of parameters: Conjecture 1. Let s, t ≥ 3 with s ≤ t and (s, t) ∈ { / (3, 4),(3, 6)}. Then, there is a doubly saturated R(s, t)… view at source ↗
Figure 2
Figure 2. Figure 2: Computer-assisted mathematics research paradigm [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Illustration of the encoding for maximality. If edge [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Illustration of the construction for t = 4 and t = 5 (13 and 19 vertices respectively). Dashed red edges are not part of the construction, and adding them introduces the K4 subgraph given by the green vertices and already present thick edges. An interesting lingering question is whether the construction in Theorem 1 is as small as possible. Our SAT experiments demonstrate this for t ≤ 5. Question 1. For ea… view at source ↗
Figure 5
Figure 5. Figure 5: A doubly saturated R(3, 7)-good graph on 20 vertices Using a similar workflow as described in Section 3, we collaborated with ChatGPT-5.1 Pro to find circulant constructions for doubly saturated R(3, t)- good graphs. After much experimentation, we conjecture the following [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Cumulative proportion of prime-order Paley graphs that are doubly sat [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗
read the original abstract

Ramsey-good graphs are graphs that contain neither a clique of size $s$ nor an independent set of size $t$. We study doubly saturated Ramsey-good graphs, defined as Ramsey-good graphs in which the addition or removal of any edge necessarily creates an $s$-clique or a $t$-independent set. We present a method combining SAT solving with bespoke LLM-generated code to discover infinite families of such graphs, answering a question of Grinstead and Roberts from 1982. In addition, we use LLMs to generate and formalize correctness proofs in Lean. This case study highlights the potential of integrating automated reasoning, large language models, and formal verification to accelerate mathematical discovery. We argue that such tool-driven workflows will play an increasingly central role in experimental mathematics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper claims to combine SAT solving with bespoke LLM-generated code to discover infinite families of doubly saturated Ramsey-good graphs (Ramsey-good graphs where adding or removing any edge creates an s-clique or t-independent set), thereby answering a 1982 question of Grinstead and Roberts. It further uses LLMs to generate and formalize correctness proofs in Lean, presenting the work as a case study highlighting the integration of automated reasoning, LLMs, and formal verification for accelerating mathematical discovery.

Significance. If the reported families and formalizations are correct, the result would be significant as the first explicit infinite families of doubly saturated Ramsey graphs, directly resolving the Grinstead-Roberts question where prior work had only finite examples. The hybrid workflow of SAT solvers, LLM code generation, and Lean verification demonstrates a scalable approach for experimental combinatorics that could be applied to other open problems in Ramsey theory and graph theory.

major comments (1)
  1. Abstract: The central claim that SAT solving combined with LLM-generated code yields infinite families rests on the faithfulness of the (unspecified) SAT encoding for both the Ramsey-good property and the saturation condition, plus the correctness of the LLM-generated Lean proof script for lifting finite examples to an infinite parametric family. No encoding, graph parameters, or proof excerpt is provided, so the load-bearing assumption cannot be checked.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive assessment of the significance of our results, particularly in resolving the Grinstead-Roberts question with the first explicit infinite families of doubly saturated Ramsey graphs via a hybrid SAT-LLM workflow. We address the concern about verifiability of the central technical components below.

read point-by-point responses
  1. Referee: Abstract: The central claim that SAT solving combined with LLM-generated code yields infinite families rests on the faithfulness of the (unspecified) SAT encoding for both the Ramsey-good property and the saturation condition, plus the correctness of the LLM-generated Lean proof script for lifting finite examples to an infinite parametric family. No encoding, graph parameters, or proof excerpt is provided, so the load-bearing assumption cannot be checked.

    Authors: We agree that the abstract, as a high-level summary, omits the specific SAT encoding, graph parameters, and Lean proof excerpt, which prevents immediate verification of the core claims. The manuscript describes the overall method but does not include these details in a form that allows direct checking. To resolve this, the revised version will add an appendix containing: (1) the full SAT encoding (variables for edges, clauses enforcing no s-clique or t-independent set, and clauses for double saturation under edge addition/removal); (2) the concrete graph parameters (s, t, and the parametric family construction) used to generate the infinite families; and (3) key excerpts from the LLM-generated Lean proof script, including the main theorem statement and the lifting argument from finite SAT solutions to the infinite case. This will make the faithfulness of the encoding and the correctness of the formalization directly checkable while preserving the case-study focus of the paper. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external SAT and Lean verification

full rationale

The paper discovers infinite families of doubly saturated Ramsey-good graphs by combining SAT solving with LLM-generated code to answer the 1982 Grinstead-Roberts question, then formalizes the constructions in Lean. This chain depends on independent external tools (SAT solvers and the Lean theorem prover) rather than any self-definitional loop, fitted parameters renamed as predictions, or load-bearing self-citations. No equations or steps reduce by construction to the paper's own inputs; the result is a computational discovery verified outside the fitted values or definitions internal to the manuscript. The central claim remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the correctness of an external SAT solver, the soundness of an LLM-generated encoding, and the fidelity of a Lean proof script. No free parameters are introduced; the only background assumptions are standard properties of Ramsey numbers and the completeness of the SAT encoding, both of which are domain-standard.

axioms (1)
  • domain assumption Standard properties of Ramsey numbers and the completeness of the SAT encoding of graph properties
    Invoked implicitly when the SAT solver is used to certify absence of cliques and independent sets.

pith-pipeline@v0.9.0 · 5439 in / 1338 out tokens · 36882 ms · 2026-05-09T21:27:48.984134+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

48 extracted references · 32 canonical work pages · 1 internal anchor

  1. [1]

    Achim, T., Best, A., Bietti, A., Der, K., Fédérico, M., Gukov, S., Halpern-Leistner, D., Henningsgard, K., Kudryashov, Y., Meiburg, A., Michelsen, M., Patterson, R., Rodriguez, E., Scharff, L., Shanker, V., Sicca, V., Sowrirajan, H., Swope, A., Tamas, M., Tenev, V., Thomm, J., Williams, H., Wu, L.: Aristotle: IMO-level automated theorem proving (2025), ht...

  2. [2]

    Ajtai, M., Komlós, J., Szemerédi, E.: A note on Ramsey numbers. J. Combin. The- ory Ser. A29(3), 354–360 (1980). https://doi.org/10.1016/0097-3165(80)90030-8 Doubly Saturated Ramsey Graphs 15

  3. [3]

    Angeltveit, V., McKay, B.D.:R(5,5)≤46. J. Graph Theory (2026). https://doi. org/10.1002/jgt.70029

  4. [4]

    Avigad, J.: Varieties of mathematical understanding. Bull. Amer. Math. Soc. (N.S.) 59(1), 99–117 (2022). https://doi.org/10.1090/bull/1726

  5. [5]

    In: Proc

    Biere, A., Faller, T., Fazekas, K., Fleury, M., Froleyks, N., Pollitt, F.: CaDiCaL 2.0. In: Gurfinkel, A., Ganesh, V. (eds.) Computer Aided Verification - 36th Interna- tional Conference, CAV 2024, Montreal, QC, Canada, July 24-27, 2024, Proceed- ings, Part I. Lecture Notes in Computer Science, vol. 14681, pp. 133–152. Springer (2024). https://doi.org/10....

  6. [6]

    Brakensiek, J., Heule, M., Mackey, J., Narváez, D.E.: The resolution of Keller’s conjecture. J. Autom. Reason.66(3), 277–300 (2022), https://doi.org/10.1007/ s10817-022-09623-5

  7. [7]

    Brenner, M.P., Cohen-Addad, V., Woodruff, D.: Solving an open problem in the- oretical physics using AI-assisted discovery (2026), https://arxiv.org/abs/2603. 04735

  8. [8]

    Bubeck, S., Coester, C., Eldan, R., Gowers, T., Lee, Y.T., Lupsasca, A., Sawh- ney, M., Scherrer, R., Sellke, M., Spears, B.K., Unutmaz, D., Weil, K., Yin, S., Zhivotovskiy, N.: Early science acceleration experiments with GPT-5 (2025), https://arxiv.org/abs/2511.16072

  9. [9]

    Chvátal, V., Harary, F.: Generalized Ramsey theory for graphs. I. Diagonal numbers. Period. Math. Hungar.3, 115–124 (1973). https://doi.org/10.1007/ BF02018466, collection of articles dedicated to the memory of Alfred Renyi, II

  10. [10]

    Chvátal, V., Harary, F.: Generalized Ramsey theory for graphs. II. Small diagonal numbers. Proc. Amer. Math. Soc.32, 389–394 (1972). https://doi.org/10.2307/ 2037824

  11. [11]

    Chvátal, V., Harary, F.: Generalized Ramsey theory for graphs. III. Small off- diagonal numbers. Pacific J. Math.41, 335–345 (1972), http://projecteuclid.org/ euclid.pjm/1102968279

  12. [12]

    Constraints24(1), 1–24 (Jan 2019)

    Codish, M., Miller, A., Prosser, P., Stuckey, P.J.: Constraints for symmetry break- ing in graph representation. Constraints24(1), 1–24 (Jan 2019). https://doi.org/ 10.1007/s10601-018-9294-5

  13. [13]

    In: Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs

    mathlib Community, T.: The Lean Mathematical Library. In: Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs. CPP ’20, ACM (January 2020). https://doi.org/10.1145/3372885.3373824

  14. [14]

    The Electronic Journal of CombinatoricsDynamic Sur- veys, #DS19: Oct 11, 2021 (Jul 2011)

    Currie, B.L., Faudree, J.R., Faudree, R.J., Schmitt, J.R.: A survey of mini- mum saturated graphs. The Electronic Journal of CombinatoricsDynamic Sur- veys, #DS19: Oct 11, 2021 (Jul 2011). https://doi.org/10.37236/41, https://www. combinatorics.org/ojs/index.php/eljc/article/view/DS19

  15. [15]

    European J

    Damásdi, G., Keszegh, B., Malec, D., Tompkins, C., Wang, Z., Zamora, O.: Satu- ration problems in the Ramsey theory of graphs, posets and point sets. European J. Combin.95, Paper No. 103321, 21 (2021). https://doi.org/10.1016/j.ejc.2021. 103321

  16. [16]

    Eisenbrand, F., Grandoni, F.: On the complexity of fixed parameter clique and dominating set. Theor. Comput. Sci.326(1-3), 57–67 (2004). https://doi.org/10. 1016/J.TCS.2004.05.009

  17. [17]

    Erdős, P., Hajnal, A., Moon, J.W.: A problem in graph theory. Amer. Math. Monthly71, 1107–1110 (1964). https://doi.org/10.2307/2311408

  18. [18]

    Semi-autonomous mathematics discovery with gemini: A case study on the erd\h{o}s problems

    Feng, T., Trinh, T., Bingham, G., Kang, J., Zhang, S., hyun Kim, S., Barreto, K., Schildkraut, C., Jung, J., Seo, J., Pagano, C., Chervonyi, Y., Hwang, D., Hou, K., Gukov, S., Tsai, C.C., Choi, H., Jin, Y., Li, W.Y., Wu, H.A., Shiu, R.A., Shih, 16 Przybocki et al. Y.S., Le, Q.V., Luong, T.: Semi-autonomous mathematics discovery with Gemini: A case study o...

  19. [19]

    Feng, T., Trinh, T.H., Bingham, G., Hwang, D., Chervonyi, Y., Jung, J., Lee, J., Pagano, C., hyun Kim, S., Pasqualotto, F., Gukov, S., Lee, J.N., Kim, J., Hou, K., Ghiasi, G., Tay, Y., Li, Y., Kuang, C., Liu, Y., Lin, H., Liu, E.Z., Nayakanti, N., Yang, X., Cheng, H.T., Hassabis, D., Kavukcuoglu, K., Le, Q.V., Luong, T.: Towards autonomous mathematics res...

  20. [20]

    Gerencsér, L., Gyárfás, A.: On Ramsey-type problems. Ann. Univ. Sci. Budapest. Eötvös Sect. Math.10, 167–170 (1967)

  21. [21]

    Electron

    Goedgebeur, J., Radziszowski, S.P.: New computational upper bounds for Ramsey numbersR(3, k). Electron. J. Combin.20(1), Paper 30, 22 (2013). https://doi.org/ 10.37236/2824

  22. [22]

    Gowers, W.T.: Rough structure and classification. pp. 79–117 (2000). https://doi. org/10.1007/978-3-0346-0422-2_4, GAFA 2000 (Tel Aviv, 1999)

  23. [23]

    Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, Inc., New York, second edn

    Graham, R.L., Rothschild, B.L., Spencer, J.H.: Ramsey theory. Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley & Sons, Inc., New York, second edn. (1990), a Wiley-Interscience Publication

  24. [24]

    Grinstead, C.M., Roberts, S.M.: On the Ramsey numbersR(3,8)andR(3,9). J. Combin. Theory Ser. B33(1), 27–51 (1982). https://doi.org/10.1016/ 0095-8956(82)90055-7

  25. [25]

    Canadian J

    Hajnal, A.: A theorem onk-saturated graphs. Canadian J. Math.17, 720–724 (1965). https://doi.org/10.4153/CJM-1965-072-1

  26. [26]

    In: McIlraith, S.A., Weinberger, K.Q

    Heule, M.J.H.: Schur number five. In: McIlraith, S.A., Weinberger, K.Q. (eds.) Pro- ceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI- 18), the 30th innovative Applications of Artificial Intelligence (IAAI-18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI- 18), New Orleans, Louisiana, US...

  27. [27]

    In: Creignou, N., Berre, D.L

    Heule, M.J.H., Kullmann, O., Marek, V.W.: Solving and verifying the Boolean Pythagorean triples problem via cube-and-conquer. In: Creignou, N., Berre, D.L. (eds.) Theory and Applications of Satisfiability Testing - SAT 2016 - 19th Interna- tional Conference, Bordeaux, France, July 5-8, 2016, Proceedings. Lecture Notes in Computer Science, vol. 9710, pp. 2...

  28. [28]

    National Science Review9(6), nwac035 (03 2022)

    Hitzler, P., Eberhart, A., Ebrahimi, M., Sarker, M.K., Zhou, L.: Neuro-symbolic approaches in artificial intelligence. National Science Review9(6), nwac035 (03 2022). https://doi.org/10.1093/nsr/nwac035

  29. [29]

    Olympiad- level formal mathematical reasoning with reinforcement learning

    Hubert, T., Mehta, R., Sartran, L., Horváth, M.Z., Žužić, G., Wieser, E., Huang, A., Schrittwieser, J., Schroecker, Y., Masoom, H., Bertolli, O., Zahavy, T., Mand- hane, A., Yung, J., Beloshapka, I., Ibarz, B., Veeriah, V., Yu, L., Nash, O., Lezeau, P., Mercuri, S., Sönne, C., Mehta, B., Davies, A., Zheng, D., Pedregosa, F., Li, Y., von Glehn, I., Rowland...

  30. [30]

    Ignatiev, A., Morgado, A., Marques-Silva, J.: PySAT: A Python toolkit for pro- totyping with SAT oracles. In: SAT. pp. 428–437 (2018). https://doi.org/10.1007/ 978-3-319-94144-8_26, https://doi.org/10.1007/978-3-319-94144-8_26

  31. [31]

    Kászonyi, L., Tuza, Z.: Saturated graphs with minimal number of edges. J. Graph Theory10(2), 203–210 (1986). https://doi.org/10.1002/jgt.3190100209 Doubly Saturated Ramsey Graphs 17

  32. [32]

    In: Sinz, C., Egly, U

    Konev, B., Lisitsa, A.: A SAT attack on the Erdős discrepancy conjecture. In: Sinz, C., Egly, U. (eds.) Theory and Applications of Satisfiability Testing - SAT 2014 - 17th International Conference, Held as Part of the Vienna Summer of Logic, VSL 2014, Vienna, Austria, July 14-17, 2014. Proceedings. Lecture Notes in Com- puter Science, vol. 8561, pp. 219–2...

  33. [33]

    Lovász, L., Plummer, M.D.: On a family of planar bicritical graphs. Proc. London Math. Soc. (3)30, 160–176 (1975). https://doi.org/10.1112/plms/s3-30.2.160

  34. [34]

    Discrete Math.312(21), 3096–3106 (2012)

    Martin, R.R., Smith, J.J.: Induced saturation number. Discrete Math.312(21), 3096–3106 (2012). https://doi.org/10.1016/j.disc.2012.06.015

  35. [35]

    Mattheus, S., Verstraete, J.: The asymptotics ofr(4, t). Ann. of Math. (2)199(2), 919–941 (2024). https://doi.org/10.4007/annals.2024.199.2.8

  36. [36]

    In: Automated Deduction – CADE 28: 28th International Conference on Automated Deduction, Virtual Event, July 12–15, 2021, Proceedings

    de Moura, L., Ullrich, S.: The Lean 4 theorem prover and programming lan- guage. In: Automated Deduction – CADE 28: 28th International Conference on Automated Deduction, Virtual Event, July 12–15, 2021, Proceedings. p. 625–635. Springer-Verlag, Berlin, Heidelberg (2021). https://doi.org/10.1007/ 978-3-030-79876-5_37

  37. [37]

    Niskanen, S., Östergård, P.R.J.: Cliquer user’s guide, version 1.0. Tech. Rep. T48, Communications Laboratory, Helsinki University of Technology, Espoo, Finland (2003)

  38. [38]

    Electron

    Przybocki, B.: Lengths of irreducible and delicate words. Electron. J. Combin. 29(3), Paper No. 3.7, 12 (2022). https://doi.org/10.37236/10669

  39. [39]

    Ramsey, F.P.: On a problem of formal logic. Proc. London Math. Soc. (2)30(4), 264–286 (1929). https://doi.org/10.1112/plms/s2-30.1.264

  40. [40]

    Journal of Artificial Intelligence Research (JAIR)80, 1437–1495 (2024)

    Schreiber, D., Sanders, P.: MallobSat: Scalable SAT solving by clause sharing. Journal of Artificial Intelligence Research (JAIR)80, 1437–1495 (2024). https: //doi.org/10.1613/jair.1.15827

  41. [41]

    2005 , publisher =

    Sinz, C.: Towards an optimal CNF encoding of boolean cardinality constraints. In: van Beek, P. (ed.) Principles and Practice of Constraint Programming - CP. Lecture Notes in Computer Science, vol. 3709, pp. 827–831. Springer (2005). https: //doi.org/10.1007/11564751_73

  42. [42]

    Subercaseaux and M

    Subercaseaux, B., Heule, M.J.H.: The packing chromatic number of the infinite square grid is 15. In: Sankaranarayanan, S., Sharygina, N. (eds.) Tools and Algo- rithms for the Construction and Analysis of Systems - 29th International Confer- ence, TACAS 2023, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022, Par...

  43. [43]

    In: Kohlhase, A., Kovács, L

    Subercaseaux, B., Mackey, J., Heule, M.J.H., Martins, R.: Automated mathemat- ical discovery and verification: Minimizing pentagons in the plane. In: Kohlhase, A., Kovács, L. (eds.) Intelligent Computer Mathematics - 17th International Con- ference, CICM 2024, Montréal, QC, Canada, August 5-9, 2024, Proceedings. pp. 21–41. Lecture Notes in Computer Scienc...

  44. [44]

    Urban, J.: 130k lines of formal topology in two weeks: Simple and cheap autofor- malization for everyone? (2026), https://arxiv.org/abs/2601.03298

  45. [45]

    23486 18 Przybocki et al

    Weng, K., Du, L., Li, S., Lu, W., Sun, H., Liu, H., Zhang, T.: Autoformalization in the era of large language models: A survey (2025), https://arxiv.org/abs/2505. 23486 18 Przybocki et al

  46. [46]

    Woodruff, D.P., Cohen-Addad, V., Jain, L., Mao, J., Zuo, S., Bateni, M., Branzei, S., Brenner, M.P., Chen, L., Feng, Y., Fortnow, L., Fu, G., Guan, Z., Hadizadeh, Z., Hajiaghayi, M.T., JafariRaviz, M., Javanmard, A., C. S., K., Kawarabayashi, K., Kumar, R., Lattanzi, S., Lee, E., Li, Y., Panageas, I., Paparas, D., Przy- bocki, B., Subercaseaux, B., Svenss...

  47. [47]

    Xia, H., Gomes, C.P., Selman, B., Szeider, S.: Agentic neurosymbolic collaboration for mathematical discovery: A case study in combinatorial design (2026), https: //arxiv.org/abs/2603.08322

  48. [48]

    In: Che, W., Nabende, J., Shutova, E., Pilehvar, M.T

    Yan, Y., Su, J., He, J., Fu, F., Zheng, X., Lyu, Y., Wang, K., Wang, S., Wen, Q., Hu,X.:Asurveyofmathematicalreasoningintheeraofmultimodallargelanguage model: Benchmark, method & challenges. In: Che, W., Nabende, J., Shutova, E., Pilehvar, M.T. (eds.) Findings of the Association for Computational Linguistics, ACL 2025, Vienna, Austria, July 27 - August 1,...