pith. machine review for the scientific record.

arxiv: 2605.11117 · v1 · submitted 2026-05-11 · 💻 cs.LG · cs.MA · math.PR

Recognition: 1 theorem link

· Lean Theorem

GRAFT-ATHENA: Self-Improving Agentic Teams for Autonomous Discovery and Evolutionary Numerical Algorithms

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 06:58 UTC · model grok-4.3

classification 💻 cs.LG · cs.MA · math.PR
keywords agentic AI · physics-informed machine learning · self-improving systems · factored decision trees · numerical method discovery · autonomous discovery · PIML benchmarks

The pith

GRAFT-ATHENA projects combinatorial decisions into factored trees so agentic teams can match past solution paths by fingerprint closeness and self-improve across problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that scientific discovery consists of sequences of methodological choices that can be factored into probabilistic trees rather than treated as isolated exponential search spaces. Each complete method becomes a unique path that embeds as a fingerprint in a metric space, allowing a new problem to retrieve and adapt experience from similar prior paths. This substrate lets the system autonomously expand its action space, propose regularization terms, and invent new solvers such as a spectral PINN. A sympathetic reader would care because isolated agentic systems waste effort rediscovering tactics while this mechanism accumulates transferable knowledge across engineering and physics domains.

Core claim

GRAFT reduces combinatorial decision spaces to adaptive factored probabilistic trees in which each method is a single path, moving the parameter footprint from exponential to linear. The factorization serves as an I-map of the policy, and the paths embed as unique fingerprints whose metric closeness enables each new problem to learn from similar past ones. The resulting self-improving framework outperforms human and prior agentic baselines on PIML benchmarks, reconstructs Mach-10 flow over the Apollo module, recovers blood rheology, and autonomously discovers new numerical methods.
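The exponential-to-linear reduction can be made concrete with a toy count. The sketch below is ours, not the paper's code: the decision slots and option names are hypothetical, but the arithmetic shows why a joint policy over all combinations grows multiplicatively while a factored tree, in which each complete method is one path through the slots, grows additively.

```python
from itertools import product

# Hypothetical methodological decision slots, each with a few options
# (names are illustrative, not taken from the paper).
slots = {
    "discretization": ["fdm", "fem", "spectral"],
    "loss_weighting": ["uniform", "rba", "ntk"],
    "optimizer": ["adam", "lbfgs", "soap"],
    "sampling": ["grid", "latin_hypercube"],
}

# Joint (unfactored) policy: one probability per full combination,
# so the footprint is multiplicative -- exponential in the number of slots.
joint_params = 1
for options in slots.values():
    joint_params *= len(options)

# Factored tree/chain policy: one distribution per slot,
# so the footprint grows additively with the number of slots.
factored_params = sum(len(options) for options in slots.values())

print(joint_params)     # 3 * 3 * 3 * 2 = 54
print(factored_params)  # 3 + 3 + 3 + 2 = 11

# Every complete method is a single root-to-leaf path through the slots:
one_path = tuple(options[0] for options in slots.values())
assert one_path in set(product(*slots.values()))
```

With four slots the gap is modest (54 vs. 11); with the dozens of slots a real solver pipeline involves, the joint count is astronomically larger while the factored count stays linear.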

What carries the argument

GRAFT, the projection of decision graphs into factored probabilistic trees that represent each method as a linear path with an embeddable fingerprint.

If this is right

  • The system improves over human and prior agentic baselines on standard PIML benchmarks.
  • It solves complex inverse problems such as reconstructing Mach-10 flow over the Apollo Command Module and recovering shear-thinning blood-cell rheology.
  • It autonomously proposes regularization constraints for ill-posed inverse problems.
  • It discovers new numerical methods such as a spectral PINN exhibiting exponential convergence.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Shared fingerprint libraries could be maintained across multiple independent agent teams to accelerate discovery in additional scientific fields.
  • The metric space might be used to rank candidate methods by expected success before any evaluation is run.
  • Closing the loop with physical experiments would turn the framework into a continuously self-calibrating autonomous laboratory.
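The second extension, ranking candidates by metric closeness before any evaluation, can be sketched in a few lines. Everything below is an assumption for illustration: fingerprints are modeled as sets of active cells on the tree substrate, and the Jaccard score loosely echoes the J similarity reported in Figure 3, not the paper's exact JK⋆ definition.

```python
# Hedged sketch: rank previously solved problems by fingerprint closeness
# before running any evaluation. Library contents and cell indices are
# invented for illustration.

def jaccard(a: set, b: set) -> float:
    """Similarity of two path fingerprints (sets of active cells)."""
    return len(a & b) / len(a | b) if a | b else 1.0

# Hypothetical library mapping solved problems to their path fingerprints.
library = {
    "SupersonicCylinder": {1, 4, 7, 9, 12},
    "ViscousBurgers":     {1, 4, 8, 9, 14},
    "LidDrivenCavity":    {2, 5, 7, 10, 12},
}

target = {1, 4, 7, 9, 13}  # fingerprint of the new, unsolved problem

# Retrieve neighbors in descending similarity; the top entries are the
# candidate methods to adapt first.
ranked = sorted(library, key=lambda k: jaccard(target, library[k]), reverse=True)
print(ranked[0])  # SupersonicCylinder (shares 4 of 6 union cells with target)
```

Ranking by closeness is cheap relative to running a solver, so even a noisy similarity score can prune the candidate pool before any expensive evaluation is launched.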

Load-bearing premise

Closeness between decision-path fingerprints reliably indicates transferable methodological experience without missing domain-specific constraints or introducing systematic errors.

What would settle it

Run a new problem whose fingerprint is close to a prior successful path but whose domain imposes an unrepresented constraint; if performance collapses or the transferred method produces non-physical results, the transfer assumption fails.

Figures

Figures reproduced from arXiv: 2605.11117 by George Em Karniadakis, Juan Diego Toscano, Zhaojie Chai.

Figure 1: GRAFT-ATHENA system overview. (A) Knowledge-graph extension: an Expansion team distils a draft DAG and hints from solver documentation, methods papers, or a domain expert; a Construction team integrates the draft, classifying each hint as a cross-attribute dependency or a general node decoration, with GRAFT (Graph Reduction to Adaptive Factored Trees, §2.2) admitting only fragments that yield a rule-preser… view at source ↗

Figure 2: From example domain to production landscape: the GRAFT encoding pipeline. (A) Example DAG: green c-edges, blue s-edges, and a red cross-edge for forced combinations. Three chains hang off morning (breakfast, clothes, transport); clothes splits via two further c-edges into style and helmet sub-chains, and the cross-rule pins helmet to yes whenever bike is picked. (B) Decision-level projection: rule-free cha… view at source ↗

Figure 3: Self-updating priors and converged solutions across four PDE benchmarks. (A) Target fingerprint of Viscous Burgers (ν = 1/(100π)) on TP, displayed at the coarse visualization resolution K = 32 (Eq. 10), with neighbor ranking on D performed at the identity-preserving resolution K⋆ (Methods, §4.1.5); black cells mark its root-to-leaf path. (B) Three nearest past problems in D under JK⋆ (J = {0.60, 0.45, 0.4… view at source ↗

Figure 4: Autonomous Mach-10 solution of the Apollo command module from a 1968 engineering report. (A) Input geometry from the report: dimensioned schematic of the command-module forebody (top) and archival photograph (bottom). (B) From problem to method on the GRAFT substrate: binned fingerprint of the target (leftmost), the fingerprint of the closest already-solved problem under JK⋆ (Eq. 11), SupersonicCylinder (… view at source ↗

Figure 5: ATHENA-driven DPD shear-thinning study of a red-blood-cell suspension. All panels share the same simulation geometry, coarse-graining strategy, and thermostat target; the two main runs differ only in the membrane parameters that the disease perturbs (shear modulus and cell–cell disaggregation threshold) and in the wall body force matched to each arm's physiological vessel class. The two main runs are repor… view at source ↗

Figure 6: Autonomous reformulation and solution of an in-vivo perivascular-flow inverse problem on mouse-brain data. (A) Experimental input from artificial intelligence velocimetry: two-photon microscopy of the perivascular space (PVS) in mouse, with tracer particles tracked from a cisterna-magna injection under the moving-boundary formulation; the inverse problem is to recover the full pressure and velocity fields… view at source ↗

Figure 7: Agent-designed spectral PINN for viscous Burgers, ν = 1/100. (A) Architecture locked by the Formalization team: a sine-only Galerkin truncation with hard IC, diagonal viscous damping, pseudospectral evaluation of uux on a dealias grid M > 3N, and a per-mode, per-time vRBA-weighted MSE as the only active loss. (B) Target problem fingerprint on TP (53 active cells), the problem signature against which the a… view at source ↗
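The ingredients named in the Figure 7 caption (sine-only Galerkin truncation, hard initial condition, diagonal viscous damping, dealiased pseudospectral evaluation of u·u_x with M > 3N) have a classical analogue that clarifies what the agent's architecture builds on. The sketch below is our own plain ODE time-stepper for viscous Burgers using those same ingredients; it is a point of reference, not the paper's learned spectral PINN, and every numerical choice (N, M, ν, dt, explicit Euler) is an illustrative assumption.

```python
import math

# Classical sine-Galerkin / pseudospectral sketch for viscous Burgers,
# u_t + u u_x = nu * u_xx on (0, pi) with u = 0 at the boundary.
# Not the agent's PINN: just the same spectral ingredients as a baseline.

N = 8                 # retained sine modes
M = 32                # collocation grid, M > 3N for dealiasing
nu = 0.01             # viscosity
dt = 1e-3             # explicit Euler step (illustrative only)

x = [math.pi * (j + 0.5) / M for j in range(M)]  # midpoint grid

def eval_u(a, xs):
    """u(x) = sum_k a_k sin(k x), k = 1..N."""
    return [sum(a[k] * math.sin((k + 1) * xj) for k in range(N)) for xj in xs]

def eval_ux(a, xs):
    """u_x(x) = sum_k k a_k cos(k x)."""
    return [sum((k + 1) * a[k] * math.cos((k + 1) * xj) for k in range(N))
            for xj in xs]

def rhs(a):
    """da_k/dt = -nu k^2 a_k - P_k[u u_x]: diagonal viscous damping plus
    the nonlinearity evaluated pointwise on the dealias grid and projected
    back onto sin(k x) by midpoint quadrature (<f, sin kx> * 2/pi)."""
    u, ux = eval_u(a, x), eval_ux(a, x)
    nl = [ui * uxi for ui, uxi in zip(u, ux)]
    proj = [2.0 / M * sum(nl[j] * math.sin((k + 1) * x[j]) for j in range(M))
            for k in range(N)]
    return [-nu * (k + 1) ** 2 * a[k] - proj[k] for k in range(N)]

# Hard initial condition u(x, 0) = sin(x): a single active mode.
a = [1.0] + [0.0] * (N - 1)

for _ in range(100):
    da = rhs(a)
    a = [ak + dt * dak for ak, dak in zip(a, da)]

print(a[0])  # leading mode decays slightly under viscosity and mode transfer
```

A spectral PINN replaces the explicit time march with learned mode amplitudes trained against the same residual; the claimed exponential convergence comes from the spectral basis, which this classical stepper shares.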
read the original abstract

Scientific discovery can be modeled as a sequence of probabilistic decisions that map physical problems to numerical solutions. Recent agentic AI systems automate individual scientific tasks by orchestrating LLM-driven planners, solvers, and evaluators. Each method is a combination of methodological actions, with many viable combinations for any given problem and structural dependencies between choices. However, existing frameworks treat each problem in isolation, with no shared substrate to accumulate methodological experience across domains. Here we show that GRAFT-ATHENA, a self-improving agentic framework, learns from past problems and autonomously expands its own action space across diverse domains. GRAFT (Graph Reduction to Adaptive Factored Trees) projects combinatorial decision spaces into factored probabilistic trees in which each method is a single path, taking the parameter footprint from exponential to linear. In the lineage of classical Bayesian networks, the factorization is an $I$-map of the policy, and the resulting paths embed as unique fingerprints in a metric space whose closeness lets each new problem learn from similar past ones. On canonical physics-informed machine learning (PIML) benchmarks, GRAFT-ATHENA improves over human and prior agentic baselines, and on production solvers, it tackles complex engineering problems such as reconstructing Mach-10 flow over the Apollo Command Module from a 1968 report and recovering shear-thinning blood-cell rheology. Notably, the system grows its own knowledge substrate, autonomously proposing regularization constraints for ill-posed inverse problems and discovering new numerical methods such as a spectral PINN with exponential convergence. These results provide a foundation for autonomous laboratories that grow more capable with every problem they solve.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript introduces GRAFT-ATHENA, a self-improving agentic framework for autonomous scientific discovery. GRAFT (Graph Reduction to Adaptive Factored Trees) projects combinatorial methodological decision spaces into factored probabilistic trees, where each method corresponds to a single path whose parameter footprint is reduced from exponential to linear. These paths are embedded as fingerprints in a metric space; closeness in this space is used to retrieve and adapt experience from prior problems. The system is claimed to outperform human and prior agentic baselines on physics-informed machine learning (PIML) benchmarks, to solve complex engineering tasks such as Mach-10 flow reconstruction over the Apollo Command Module and recovery of shear-thinning blood-cell rheology, and to autonomously propose regularization constraints and discover new methods such as a spectral PINN with exponential convergence.

Significance. If the path-fingerprint transfer mechanism reliably preserves critical conditional dependencies (physics constraints, solver stability, regularization needs), the work could provide a foundation for cumulative, self-improving agentic systems in scientific discovery. The reduction of combinatorial policies to linear factored trees and the explicit use of an I-map factorization are technically interesting and could generalize beyond the reported PIML and engineering cases. However, the absence of quantitative metrics, ablations, or validation of the metric-space transfer in the manuscript description makes it difficult to determine whether the reported gains are attributable to the proposed mechanism.

major comments (3)
  1. [Abstract and §3] Abstract and §3 (GRAFT factorization): the claim that path closeness in the metric space 'lets each new problem learn from similar past ones' is load-bearing for the self-improving claim, yet no quantitative validation (transfer success rate, false-positive retrieval frequency, or ablation of performance with vs. without fingerprint lookup) is supplied for the Mach-10 Apollo or shear-thinning rheology cases.
  2. [§4] §4 (PIML benchmarks): the statement that GRAFT-ATHENA 'improves over human and prior agentic baselines' is unsupported by any reported metrics, error bars, statistical tests, or ablation tables, preventing evaluation of whether the gains are statistically meaningful or method-specific.
  3. [§5] §5 (engineering applications): the autonomous discovery of a spectral PINN with exponential convergence and the proposal of regularization constraints are presented as outcomes of the framework, but no derivation details, convergence plots, or comparison against standard PINN formulations are provided to substantiate the exponential-convergence claim.
minor comments (2)
  1. [Abstract] The term 'I-map' is introduced in the abstract without a brief definition or reference to Pearl's definition of independence maps; add a short clarification in the introduction.
  2. [§3] Notation for the metric on path fingerprints is not defined in the abstract; ensure the distance function and embedding procedure are explicitly stated in §3.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their detailed and constructive feedback on our manuscript. We have carefully considered each major comment and revised the paper to address the concerns regarding quantitative validation and substantiation of claims. Below we provide point-by-point responses.

read point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (GRAFT factorization): the claim that path closeness in the metric space 'lets each new problem learn from similar past ones' is load-bearing for the self-improving claim, yet no quantitative validation (transfer success rate, false-positive retrieval frequency, or ablation of performance with vs. without fingerprint lookup) is supplied for the Mach-10 Apollo or shear-thinning rheology cases.

    Authors: We agree that quantitative validation of the metric-space transfer is essential to support the self-improving aspect. The original manuscript emphasized the conceptual framework and qualitative outcomes but did not include explicit metrics for retrieval accuracy or ablations. In the revised manuscript, we have added a new subsection in §3 with quantitative results: transfer success rates on a suite of held-out problems, false-positive rates for retrieval, and performance ablations (with vs. without fingerprint lookup) specifically for the Mach-10 Apollo flow reconstruction and shear-thinning rheology recovery tasks. These additions demonstrate that the fingerprint mechanism contributes measurably to performance by enabling relevant experience transfer while respecting conditional dependencies. revision: yes

  2. Referee: [§4] §4 (PIML benchmarks): the statement that GRAFT-ATHENA 'improves over human and prior agentic baselines' is unsupported by any reported metrics, error bars, statistical tests, or ablation tables, preventing evaluation of whether the gains are statistically meaningful or method-specific.

    Authors: We acknowledge the omission of detailed performance metrics in the original submission. The PIML benchmarks section has been substantially expanded to include comprehensive tables with mean performance metrics, standard deviations across multiple independent runs, error bars, and statistical tests (including p-values from paired t-tests) comparing GRAFT-ATHENA against human-designed methods and prior agentic baselines. Ablation tables isolating the contribution of the GRAFT factorization and the transfer mechanism are also provided. These revisions allow for a rigorous assessment of the statistical significance and method-specific improvements. revision: yes

  3. Referee: [§5] §5 (engineering applications): the autonomous discovery of a spectral PINN with exponential convergence and the proposal of regularization constraints are presented as outcomes of the framework, but no derivation details, convergence plots, or comparison against standard PINN formulations are provided to substantiate the exponential-convergence claim.

    Authors: We appreciate the referee pointing out the need for more detailed substantiation of the discovered methods. In the revised §5, we have included the full derivation of the autonomously proposed spectral PINN, specifying the choice of basis functions and how it achieves exponential convergence. We added convergence plots showing the residual decay rates and direct comparisons to standard PINN formulations on the same benchmark problems. Similar details and plots are provided for the proposed regularization constraints in the ill-posed inverse problems. These additions substantiate the claims with concrete evidence. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The GRAFT mechanism is introduced as an explicit construction that maps combinatorial policy spaces to factored trees whose paths serve as fingerprints in a metric space, with the I-map property stated as a direct consequence of the factorization itself rather than derived from performance data. No claimed prediction (e.g., improved benchmark scores or autonomous discovery of new methods) is shown to reduce by construction to a fitted parameter or to a self-citation chain; the transfer via metric closeness is presented as an operational assumption whose validity is left to empirical testing on the reported engineering cases. The abstract and described framework remain self-contained against external benchmarks, with no load-bearing step that equates an output to its input definition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on the unproven transferability of methodological experience via metric-space closeness of tree paths and on the assumption that LLM agents can autonomously expand the action space without external validation.

axioms (1)
  • domain assumption The factorization is an I-map of the policy
    Invoked in the abstract to justify that tree paths preserve the necessary dependencies for reuse across problems.
invented entities (2)
  • factored probabilistic trees (GRAFT) no independent evidence
    purpose: To reduce combinatorial decision spaces to linear parameter footprint while preserving policy dependencies
    New representational structure introduced by the paper; no independent evidence supplied in abstract.
  • metric-space fingerprints of solution paths no independent evidence
    purpose: To measure similarity between past and new problems for experience transfer
    Invented embedding mechanism; no external falsifiable handle given in abstract.
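The ledger's single axiom is the I-map property, which in Pearl's sense means every independence the factorization asserts actually holds in the distribution it represents. For a chain factorization this is checkable by enumeration. The toy check below is ours: a hypothetical three-slot chain a → b → c with invented probabilities, verifying that the asserted independence c ⟂ a | b holds in the joint built from the factors.

```python
from itertools import product

# Pearl-style I-map check on a toy three-slot chain a -> b -> c:
# the factorization p(a) p(b|a) p(c|b) asserts c is independent of a given b.
# All numbers are invented for illustration; nothing here is from the paper.

p_a = {0: 0.6, 1: 0.4}
p_b_a = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}   # p(b | a)
p_c_b = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.5, 1: 0.5}}   # p(c | b)

# Joint distribution assembled from the factors.
joint = {(a, b, c): p_a[a] * p_b_a[a][b] * p_c_b[b][c]
         for a, b, c in product((0, 1), repeat=3)}

def cond_c_given_b(c, b):
    """p(c | b) recovered by marginalizing the joint over a."""
    num = sum(p for (a2, b2, c2), p in joint.items() if b2 == b and c2 == c)
    den = sum(p for (a2, b2, _), p in joint.items() if b2 == b)
    return num / den

def cond_c_given_ab(c, a, b):
    """p(c | a, b) read directly from the joint."""
    return joint[(a, b, c)] / sum(joint[(a, b, c2)] for c2 in (0, 1))

# I-map property for this chain: p(c | a, b) == p(c | b) for all a, b, c.
ok = all(abs(cond_c_given_ab(c, a, b) - cond_c_given_b(c, b)) < 1e-12
         for a, b, c in product((0, 1), repeat=3))
print(ok)  # True: every independence read off the factorization holds
```

The axiom in the ledger is stronger than this toy: it asserts the same property for the full policy over methodological decisions, which is exactly the unproven part the referee flags.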

pith-pipeline@v0.9.0 · 5603 in / 1478 out tokens · 73618 ms · 2026-05-13T06:58:39.460609+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

111 extracted references · 111 canonical work pages · 7 internal anchors

  1. [1]

    Towards an AI co-scientist

    J. Gottweis, W.-H. Weng, A. Daryin, T. Tu, A. Palepu, P. Sirkovic, A. Myaskovsky, F. Weissenberger, K. Rong, R. Tanno, et al., Towards an ai co-scientist, arXiv preprint arXiv:2502.18864 (2025)

  2. [2]

    Ghafarollahi, M

    A. Ghafarollahi, M. J. Buehler, SciAgents: automating scientific discovery through bioinspired multi-agent intelligent graph reasoning, Advanced Materials 37 (22) (2025) 2413523

  3. [3]

    Villaescusa-Navarro, B

    F. Villaescusa-Navarro, B. Bolliet, P. Villanueva-Domingo, A. E. Bayer, A. Acquah, C. Amancharla, A. Barzilay-Siegal, P. Bermejo, C. Bilodeau, P. C. Ram´ ırez, et al., The denario project: Deep knowledge ai agents for scientific discovery, arXiv preprint arXiv:2510.26887 (2025)

  4. [4]

    M. J. Buehler, Agentic deep graph reasoning yields self-organizing knowledge networks, Journal of Materials Research 40 (15) (2025) 2204–2242

  5. [5]

    I. A. Stewart, T. P. Hage, Y.-C. Hsu, M. J. Buehler, Graphagents: Knowledge graph- guided agentic ai for cross-domain materials design, arXiv preprint arXiv:2602.07491 (2026)

  6. [6]

    I. A. Stewart, M. J. Buehler, Higher-order knowledge representations for agentic sci- entific reasoning, arXiv preprint arXiv:2601.04878 (2026)

  7. [7]

    B. Ni, M. J. Buehler, Vibegen: Agentic end-to-end de novo protein design for tailored dynamics using a language diffusion model, Matter (2026)

  8. [8]

    Jiang, G

    Q. Jiang, G. Karniadakis, AgenticSciML: Collaborative multi-agent systems for emer- gent discovery in scientific machine learning, npj Artificial Intelligence (2026)

  9. [9]

    J. D. Toscano, D. T. Chen, G. E. Karniadakis, Athena: Agentic team for hierarchical evolutionary numerical algorithms, arXiv preprint arXiv:2512.03476 (2025). 43

  10. [10]

    Deotale, A

    R. Deotale, A. Srinivasan, M. Golestanian, Y. Tian, T. Zhang, P. Vlachos, H. Gomez, All-fem: Agentic large language models fine-tuned for finite element methods, Com- puter Methods in Applied Mechanics and Engineering 457 (2026) 118985

  11. [11]

    T. Feng, T. Trinh, G. Bingham, J. Kang, S. Zhang, S.-h. Kim, K. Barreto, C. Schild- kraut, J. Jung, J. Seo, et al., Semi-autonomous mathematics discovery with gemini: A case study on the erd\h{o}s problems, arXiv preprint arXiv:2601.22401 (2026)

  12. [12]

    U. Jeon, J. Kwon, M. A. Sullivan, C. E. Lee, G. Lin, Atlas: Adaptive self- evolutionary research agent with task-distributed multi-llm supporters, arXiv preprint arXiv:2602.02709 (2026)

  13. [13]

    F. Y. Wang, L. Marom, S. Pal, R. K. Luu, W. Lu, J. A. Berkovich, M. J. Buehler, Autonomous agents coordinating distributed discovery through emergent artifact ex- change, arXiv preprint arXiv:2603.14312 (2026)

  14. [14]

    Subramaniam, Y

    V. Subramaniam, Y. Du, J. B. Tenenbaum, A. Torralba, S. Li, I. Mordatch, Mul- tiagent finetuning: Self improvement with diverse reasoning chains, arXiv preprint arXiv:2501.05707 (2025)

  15. [15]

    Pearl, The book of why: The new science of cause and effect, Basic Books, 2018

    J. Pearl, The book of why: The new science of cause and effect, Basic Books, 2018

  16. [16]

    Pearl, The seven tools of causal inference, with reflections on machine learning, Communications of the ACM 62 (3) (2019) 54–60

    J. Pearl, The seven tools of causal inference, with reflections on machine learning, Communications of the ACM 62 (3) (2019) 54–60

  17. [17]

    C. Wu, A. J. Varghese, V. Oommen, G. E. Karniadakis, Gpt vs human for scientific reviews: A dual source review on applications of chatgpt in science, arXiv preprint arXiv:2312.03769 (2023)

  18. [18]

    Georgiev, J

    B. Georgiev, J. G´ omez-Serrano, T. Tao, A. Z. Wagner, Mathematical exploration and discovery at scale, arXiv preprint arXiv:2511.02864 (2025)

  19. [19]

    J. Xu, Q. Sun, P. Schwendeman, S. Nielsen, E. Cetin, Y. Tang, Trinity: An evolved llm coordinator, arXiv preprint arXiv:2512.04695 (2025)

  20. [20]

    X. Yang, J. Zou, R. Pan, R. Qiu, P. Lu, S. Diao, J. Jiang, H. Tong, T. Zhang, M. J. Buehler, et al., Recursive multi-agent systems, arXiv preprint arXiv:2604.25917 (2026)

  21. [21]

    C. Luo, Z. Zeng, M. Jia, Y. Du, C. Sun, Self-improving loops for visual robotic plan- ning, in: The Fourteenth International Conference on Learning Representations

  22. [22]

    Verma, J

    T. Verma, J. Pearl, Causal networks: Semantics and expressiveness, in: Proceedings of the 4th Workshop on Uncertainty in Artificial Intelligence (UAI-1988), 1988, pp. 352–359

  23. [23]

    C. D. Cantwell, D. Moxey, A. Comerford, A. Bolis, G. Rocco, G. Mengaldo, D. De Grazia, S. Yakovlev, J.-E. Lombard, D. Ekelschot, et al., Nektar++: An open- source spectral/hp element framework, Computer physics communications 192 (2015) 205–219. 44

  24. [24]

    Ranocha, M

    H. Ranocha, M. Schlottke-Lakemper, A. R. Winters, E. Faulhaber, J. Chan, G. J. Gassner, Adaptive numerical simulations with trixi. jl: A case study of julia for scien- tific computing, arXiv preprint arXiv:2108.06476 (2021)

  25. [25]

    J. D. Toscano, D. T. Chen, V. Ooomen, J. Darbon, G. E. Karniadakis, A variational framework for residual-based adaptivity in neural pde solvers and operator learning, NPJ Artificial Intelligence 2 (1) (2026) 32

  26. [26]

    A. P. Thompson, H. M. Aktulga, R. Berger, D. S. Bolintineanu, W. M. Brown, P. S. Crozier, P. J. in ’t Veld, A. Kohlmeyer, S. G. Moore, T. D. Nguyen, R. Shan, M. J. Stevens, J. Tranchida, C. Trott, S. J. Plimpton, LAMMPS - a flexible simulation tool for particle-based materials modeling at the atomic, meso, and continuum scales, Comp. Phys. Comm. 271 (2022...

  27. [27]

    Griffith, D

    B. Griffith, D. Boylan, Postflight (as-202) apollo command module aerodynamic sim- ulation tests, Tech. rep., Arnold Engineering Development Center, Arnold Air Force Station, TN (1968)

  28. [28]

    Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Infer- ence, Morgan Kaufmann, San Mateo, CA, 1988

    J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Infer- ence, Morgan Kaufmann, San Mateo, CA, 1988

  29. [29]

    S. L. Lauritzen, D. J. Spiegelhalter, Local computations with probabilities on graphical structures and their application to expert systems, Journal of the Royal Statistical Society: Series B (Methodological) 50 (2) (1988) 157–194

  30. [30]

    Verma, J

    T. Verma, J. Pearl, Causal networks: Semantics and expressiveness, in: Machine in- telligence and pattern recognition, Vol. 9, Elsevier, 1990, pp. 69–76

  31. [31]

    Curvature-Aware Optimization for High-Accuracy Physics-Informed Neural Networks

    A. Jnini, E. Kiyani, K. Shukla, J. F. Urban, N. A. Daryakenari, J. Muller, M. Zeinhofer, G. E. Karniadakis, Curvature-aware optimization for high-accuracy physics-informed neural networks, arXiv preprint arXiv:2604.05230 (2026)

  32. [32]

    S. Wang, A. K. Bhartari, B. Li, P. Perdikaris, Gradient alignment in physics- informed neural networks: A second-order optimization perspective, arXiv preprint arXiv:2502.00604 (2025)

  33. [33]

    W. Chen, A. A. Howard, P. Stinis, Self-adaptive weights based on balanced residual decay rate for physics-informed neural networks and deep operator networks, Journal of Computational Physics (2025) 114226

  34. [34]

    Q. Wuwu, C. Gao, T. Chen, Y. Huang, Y. Zhang, J. Wang, J. Li, H. Zhou, S. Zhang, PINNsAgent: Automated PDE surrogation with large language models, arXiv preprint arXiv:2501.12053 (2025)

  35. [35]

    X. He, L. You, H. Tian, B. Han, I. Tsang, Y.-S. Ong, Lang-PINN: From language to physics-informed neural networks via a multi-agent framework, arXiv preprint arXiv:2510.05158 (2025). 45

  36. [36]

    J. F. Urb´ an, P. Stefanou, J. A. Pons, Unveiling the optimization process of Physics Informed Neural Networks: How accurate and competitive can PINNs be?, arXiv preprint arXiv:2405.04230 (2024)

  37. [37]

    Kiyani, K

    E. Kiyani, K. Shukla, J. F. Urb´ an, J. Darbon, G. E. Karniadakis, Optimizing the op- timizer for physics-informed neural networks and kolmogorov-arnold networks, Com- puter Methods in Applied Mechanics and Engineering 446 (2025) 118308

  38. [38]

    M¨ uller, M

    J. M¨ uller, M. Zeinhofer, Achieving high accuracy with pinns via energy natural gradient descent, in: International Conference on Machine Learning, PMLR, 2023, pp. 25471– 25485

  39. [39]

    Z. Chai, G. Li, P. A. Ndour, P. Connes, P. A. Buffet, M. Franco, G. E. Karniadakis, In silico biophysics and rheology of blood and red blood cells in gaucher disease, PLOS Computational Biology 21 (9) (2025) e1012705

  40. [40]

    Z. Chai, N. Ahmadi Daryakenari, G. E. Karniadakis, A multiscale signaling– biophysical framework reveals mechanisms of macrophage-mediated rbc clearance in sickle cell and gaucher disease, bioRxiv (2026) 2026–04

  41. [41]

    I. V. Pivkin, G. E. Karniadakis, Accurate coarse-grained modeling of red blood cells, Physical Review Letters 101 (11) (2008) 118105

  42. [42]

    D. A. Fedosov, B. Caswell, G. E. Karniadakis, A multiscale red blood cell model with accurate mechanics, rheology, and dynamics, Biophysical Journal 98 (10) (2010) 2215–2225

  43. [43]

    Z. Chai, S. Gu, G. Lykotrafitis, Dynamics of the axon plasma membrane skeleton, Soft Matter 19 (14) (2023) 2514–2528

  44. [44]

    Z. Chai, A. V. Tzingounis, G. Lykotrafitis, The periodic axon membrane skeleton leads to na nanodomains but does not impact action potentials, Biophysical Journal 121 (18) (2022) 3334–3344

  45. [45]

    D. A. Fedosov, M. Dao, G. E. Karniadakis, S. Suresh, Computational biorheology of human blood flow in health and disease, Annals of Biomedical Engineering 42 (2) (2014) 368–387

  46. [46]

    Zhang, Z

    Y. Zhang, Z. Chai, Y. Sun, G. Lykotrafitis, A deep reinforcement learning model based on deterministic policy gradient for collective neural crest cell migration, arXiv preprint arXiv:2007.03190 (2020)

  47. [47]

    Zhang, Z

    Y. Zhang, Z. Chai, G. Lykotrafitis, Deep reinforcement learning with a particle dynam- ics environment applied to emergency evacuation of a room with obstacles, Physica A: Statistical Mechanics and its Applications 571 (2021) 125845

  48. [48]

    K. A. Boster, S. Cai, A. Ladr´ on-de Guevara, J. Sun, X. Zheng, T. Du, J. H. Thomas, M. Nedergaard, G. E. Karniadakis, D. H. Kelley, Artificial intelligence velocimetry reveals in vivo flow rates, pressure gradients, and shear stresses in murine perivascular flows, Proceedings of the National Academy of Sciences 120 (14) (2023) e2217744120. 46

  49. [49]

    J. D. Toscano, C. Wu, A. Ladr´ on-de Guevara, T. Du, M. Nedergaard, D. H. Kelley, G. E. Karniadakis, K. A. Boster, Inferring in vivo murine cerebrospinal fluid flow using artificial intelligence velocimetry with moving boundaries and uncertainty quantifica- tion, Interface Focus 14 (6) (2024) 20240030

  50. [50]

    A. D. Jagtap, K. Kawaguchi, G. E. Karniadakis, Adaptive activation functions accel- erate convergence in deep and physics-informed neural networks, Journal of Compu- tational Physics 404 (2020) 109136

  51. [51]

    N. Vyas, D. Morwani, R. Zhao, M. Kwun, I. Shapira, D. Brandfonbrener, L. Janson, S. Kakade, Soap: Improving and stabilizing shampoo using adam, arXiv preprint arXiv:2409.11321 (2024)

  52. [52]

    S. J. Anagnostopoulos, J. D. Toscano, N. Stergiopulos, G. E. Karniadakis, Residual- based attention in physics-informed neural networks, Computer Methods in Applied Mechanics and Engineering 421 (2024) 116805

  53. [53]

    S. Wang, X. Yu, P. Perdikaris, When and why PINNs fail to train: A neural tangent kernel perspective, Journal of Computational Physics 449 (2022) 110768

  54. [54]

    Raynaud, S

    G. Raynaud, S. Houde, F. P. Gosselin, Modalpinn: An extension of physics-informed neural networks with enforced truncated fourier decomposition for periodic flow re- construction using a limited number of imperfect sensors, Journal of Computational Physics 464 (2022) 111271

  55. [55]

    T. Yu, Y. Qi, I. Oseledets, S. Chen, Spectral informed neural networks, Journal of Computational and Applied Mathematics (2025) 117178

  56. [56]

    Y. Du, N. Chalapathi, A. Krishnapriyan, Neural spectral methods: Self-supervised learning in the spectral domain, arXiv preprint arXiv:2312.05225 (2023)

  57. [57]

    Meuris, S

    B. Meuris, S. Qadeer, P. Stinis, Machine-learning-based spectral methods for partial differential equations, Scientific Reports 13 (1) (2023) 1739

  58. [58]

    Basdevant, M

    C. Basdevant, M. Deville, P. Haldenwang, J. M. Lacroix, J. Ouazzani, R. Peyret, P. Orlandi, A. Patera, Spectral and finite difference solutions of the burgers equation, Computers & fluids 14 (1) (1986) 23–41

  [59]

    T. Zahavy, LLMs can't jump (2026)

  [60]

    U. Braga-Neto, The AI scientific community: Agentic virtual lab swarms, arXiv preprint arXiv:2603.21344 (2026)

  [61]

    R. Tarjan, Depth-first search and linear graph algorithms, SIAM Journal on Computing 1 (2) (1972) 146–160. doi:10.1137/0201010

  [62]

    J. A. Hoeting, D. Madigan, A. E. Raftery, C. T. Volinsky, Bayesian model averaging: A tutorial, Statistical Science 14 (4) (1999) 382–417

  [63]

    S. K. Jha, S. Jha, P. Lincoln, N. D. Bastian, A. Velasquez, R. Ewetz, S. Neema, Counterexample guided inductive synthesis using large language models and satisfiability solving, in: MILCOM 2023-2023 IEEE Military Communications Conference (MILCOM), IEEE, 2023, pp. 944–949

  [64]

    M. R. Quillian, Semantic memory, Tech. rep. (1966)

  [65]

    A. Newell, H. Simon, The logic theory machine–a complex information processing system, IRE Transactions on Information Theory 2 (3) (1956) 61–79

  [66]

    A. Newell, A guide to the general problem-solver program GPS-2-2, Rand Corporation, 1963

  [67]

    P. E. Hart, N. J. Nilsson, B. Raphael, A formal basis for the heuristic determination of minimum cost paths, IEEE Transactions on Systems Science and Cybernetics 4 (2) (1968) 100–107. doi:10.1109/TSSC.1968.300136

  [68]

    M. Wooldridge, An Introduction to MultiAgent Systems, John Wiley & Sons, Chichester, UK, 2002

  [69]

    J. V. Roggeveen, E. Y. Wang, W. Flintoft, P. Donets, L. S. Nathwani, N. Gutierrez, D. Ettel, A. M. Graf, S. Dandavate, A. Nageswaran, et al., HARDMath2: A benchmark for applied mathematics built by students as part of a graduate class, arXiv preprint arXiv:2505.11774 (2025)

  [70]

    G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems 2 (4) (1989) 303–314

  [71]

    K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (5) (1989) 359–366

  [72]

    M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707

  [73]

    G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, Physics-informed machine learning, Nature Reviews Physics 3 (6) (2021) 422–440

  [74]

    S. Wang, H. Wang, P. Perdikaris, On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks, arXiv preprint arXiv:2012.10047 (2020)

  [75]

    S. Wang, H. Wang, P. Perdikaris, On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks, Computer Methods in Applied Mechanics and Engineering 384 (2021) 113938

  [76]

    A. Kolmogorov, On the representation of continuous functions of several variables as superpositions of continuous functions of one variable and addition (1957); English translation: Amer. Math. Soc. Transl., 28: Sixteen Papers on Analysis (1963)

  [77]

    Z. Liu, Y. Wang, S. Vaidya, F. Ruehle, J. Halverson, M. Soljačić, T. Y. Hou, M. Tegmark, KAN: Kolmogorov-Arnold Networks, in: International Conference on Learning Representations, Vol. 2025, 2025, pp. 70367–70413

  [78]

    L. Song, J. D. Toscano, L.-L. Wang, Explicit construction of approximate Kolmogorov-Arnold superpositions with C2-smoothness, arXiv preprint arXiv:2508.04392 (2025)

  [79]

    J. D. Toscano, L.-L. Wang, G. E. Karniadakis, KKANs: Kurkova-Kolmogorov-Arnold networks and their learning dynamics, Neural Networks (2025) 107831

  [80]

    Y. Wang, J. W. Siegel, Z. Liu, T. Y. Hou, On the expressiveness and spectral bias of KANs, in: International Conference on Learning Representations, Vol. 2025, 2025, pp. 27492–27511

Showing first 80 references.