AutoPDE: Reliable Agentic PDE Solving via Explicitly Represented Solver Strategies
Pith reviewed 2026-06-27 13:22 UTC · model grok-4.3
The pith
AutoPDE keeps an explicit solver strategy object separate from code so that numerical decisions can be built first and revised from failure evidence.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
AutoPDE maintains the solver strategy as an independent, inspectable object that is built before any code is written and revised using numerical evidence whenever a solve fails. The object is populated in three stages that draw from a library of reusable PDE-solving skills: PDE analysis identifies the equation type and algebraic structure; numerical method selection commits to a discretization, stabilization, and linear solver; and adaptive tuning runs low-cost pilot solves to set resolution and tolerances. This explicit representation allows the strategy to be checked and corrected before code generation and to receive targeted revisions from numerical feedback.
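As a rough illustration of what such an object could look like, the sketch below models the strategy as a Python dataclass with one group of fields per stage. The class name `SolverStrategy`, its field names, and the example values are all assumptions made for illustration; the summary above does not specify AutoPDE's actual schema.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class SolverStrategy:
    """Hypothetical strategy object; the schema is an assumption, not AutoPDE's."""

    # Stage 1: PDE analysis
    equation_type: Optional[str] = None
    algebraic_structure: Optional[str] = None
    # Stage 2: numerical method selection
    discretization: Optional[str] = None
    stabilization: Optional[str] = None  # use "none" when no stabilization is needed
    linear_solver: Optional[str] = None
    # Stage 3: adaptive tuning
    resolution: Optional[int] = None
    tolerance: Optional[float] = None
    # Numerical evidence that motivated later revisions
    evidence: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Names of decisions that have not been made yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]

    def is_complete(self) -> bool:
        """True once every stage has filled in its decisions."""
        return not self.missing()


# Illustrative values for a Poisson-type problem (assumed, not from the paper).
strategy = SolverStrategy(
    equation_type="elliptic",
    algebraic_structure="symmetric positive definite",
)
after_analysis = strategy.missing()  # decisions still open after stage 1

strategy.discretization = "P1 finite elements"
strategy.stabilization = "none"
strategy.linear_solver = "conjugate gradient"
strategy.resolution = 64
strategy.tolerance = 1e-8
print(after_analysis, strategy.is_complete())
```

Because the object exists before any solver code does, a check such as `missing()` can run ahead of code generation, which is the kind of inspection the core claim describes.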
What carries the argument
The explicitly represented solver strategy object: an independent, inspectable structure that is built across the PDE analysis, method selection, and adaptive tuning stages before any code is written, and revised from numerical evidence when a solve fails.
If this is right
- Numerical decisions become inspectable before code is generated.
- Failure signals can be used to revise the underlying strategy rather than only the implementation.
- A shared library of reusable PDE-solving skills can be applied across different equations.
- Low-cost pilot solves can calibrate resolution and tolerances under given accuracy and runtime budgets.
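The last point, pilot-solve calibration, can be sketched on a toy problem. The example below is not AutoPDE's tuning procedure. It assumes a 1D Poisson problem with a known exact solution, so the pilot error can be measured directly; a real agent would need an error estimator instead. It also assumes a simple error model e(h) ≈ C·h^p, a cap on the number of cells as a stand-in for the runtime budget, and a hypothetical safety factor of 1.2.

```python
import math


def pilot_error(cells):
    """Max-norm error of a second-order finite-difference solve of
    -u'' = pi^2 sin(pi x) on (0, 1) with u(0) = u(1) = 0 (exact u = sin(pi x))."""
    n = cells - 1  # number of interior unknowns
    h = 1.0 / cells
    x = [(i + 1) * h for i in range(n)]
    rhs = [h * h * math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
    # Thomas algorithm for the tridiagonal system with stencil (-1, 2, -1).
    c = [0.0] * n
    d = [0.0] * n
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))


def calibrate(target_error, max_cells, safety=1.2):
    """Pick a resolution from two cheap pilot solves (hypothetical procedure)."""
    coarse, fine = 8, 16
    e_coarse, e_fine = pilot_error(coarse), pilot_error(fine)
    # Observed convergence order from the two pilot grids.
    order = math.log(e_coarse / e_fine) / math.log(fine / coarse)
    # Fit the constant in the assumed error model e(h) ~ constant * h**order.
    constant = e_fine * fine ** order
    cells_needed = safety * (constant / target_error) ** (1.0 / order)
    capped = cells_needed > max_cells  # runtime budget wins over accuracy
    cells = max_cells if capped else math.ceil(cells_needed)
    return {"cells": cells, "observed_order": order, "budget_hit": capped}


plan = calibrate(target_error=1e-4, max_cells=4096)
achieved = pilot_error(plan["cells"])
print(plan, achieved)
```

When `budget_hit` is true, the accuracy target cannot be met within the cell cap, which is the kind of numerical evidence that could be fed back into the strategy rather than into the code.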
Where Pith is reading between the lines
- The same separation of plan from implementation could be applied to other scientific computing tasks where implicit choices in generated code are hard to debug.
- Explicit strategy objects might support human oversight or transfer of successful numerical plans between similar PDEs.
- If strategy revision proves reliable, future agents could maintain libraries of proven strategy templates rather than regenerating plans from scratch each time.
Load-bearing premise
Routing numerical failure feedback to revisions of the explicit strategy object rather than directly to code edits will produce measurable gains in solve success rate.
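A minimal sketch of what routing feedback to the strategy might mean in practice follows. The failure labels, the three revision rules, and the stand-in solver are hypothetical; the paper's actual decision rules are not given in this summary, and the referee report notes that they are not specified in the manuscript either.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Strategy:
    discretization: str
    stabilization: str
    linear_solver: str
    resolution: int


def revise(strategy: Strategy, failure: str) -> Optional[Strategy]:
    """Map a failure signal to a strategy change (hypothetical rules)."""
    if failure == "spurious_oscillations" and strategy.stabilization == "none":
        return replace(strategy, stabilization="SUPG")
    if failure == "linear_solver_stagnated" and strategy.linear_solver == "cg":
        return replace(strategy, linear_solver="gmres")
    if failure == "accuracy_violation":
        return replace(strategy, resolution=strategy.resolution * 2)
    return None  # no rule applies


def solve_with_revision(
    strategy: Strategy,
    run_solver: Callable[[Strategy], Optional[str]],
    max_rounds: int = 5,
) -> Tuple[Optional[Strategy], List[Strategy]]:
    """Revise the strategy, not the code, until the solve passes or rounds run out."""
    trace = [strategy]
    for _ in range(max_rounds):
        failure = run_solver(strategy)  # None means the solve passed
        if failure is None:
            return strategy, trace
        revised = revise(strategy, failure)
        if revised is None:
            break
        strategy = revised
        trace.append(strategy)
    return None, trace


def fake_solver(s: Strategy) -> Optional[str]:
    """Stand-in for generating and running solver code from a strategy."""
    if s.stabilization == "none":
        return "spurious_oscillations"
    if s.resolution < 64:
        return "accuracy_violation"
    return None


start = Strategy("P1 finite elements", "none", "gmres", 16)
final, trace = solve_with_revision(start, fake_solver)
print(final, len(trace))
```

The `trace` list records every strategy that was tried, so each revision stays inspectable. A code-only feedback loop, by contrast, would leave these decisions buried in successive diffs.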
What would settle it
An ablation that disables strategy revision and routes all failure feedback only to code edits, then measures whether the pass rate falls back to the 40.3 percent baseline level.
Original abstract
Numerical solvers for partial differential equations (PDEs) are core computational tools in science and engineering. Building reliable PDE solvers requires not only executable code, but a numerical solver strategy, a set of decisions about discretization, stabilization, solver configuration, and resolution control, that matches the PDE structure. Recent LLM-based coding agents have begun to reduce the programming burden by generating and debugging solver implementations. However, they typically move directly from a PDE problem to solver code, leaving the solver strategy implicit in implementation details. Feedback from a failed solve is therefore routed back to code edits rather than to the underlying strategy, so numerical decisions remain hard to check before code is generated and hard to revise using numerical evidence when it fails. To address this limitation, we propose AutoPDE, a code agent that maintains the solver strategy as an explicitly represented object throughout the solving process: an independent, inspectable object that is built before any code is written and can be revised, using numerical evidence, whenever a solve fails. AutoPDE builds and maintains this object in three stages, all drawing from a library of reusable PDE-solving skills: PDE analysis identifies the equation type and algebraic structure; numerical method selection chooses a numerical method that matches the analysis result and commits to a discretization, stabilization, and linear solver accordingly; and adaptive tuning runs low-cost pilot solves to calibrate resolution and tolerances under the prescribed accuracy and runtime budget. We evaluate AutoPDE on the PDE Agent Bench, where experimental results show that AutoPDE achieves a pass rate of $54.5\%$, improving over the strongest baseline by $14.2$ percentage points.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces AutoPDE, a code agent for solving partial differential equations (PDEs) that maintains an explicitly represented solver strategy object. This object is constructed in three stages—PDE analysis, numerical method selection, and adaptive tuning—using a library of reusable skills, before any code is generated. The strategy can be revised based on numerical evidence from failed solves. The authors evaluate it on the PDE Agent Bench and report a pass rate of 54.5%, which is 14.2 percentage points higher than the strongest baseline.
Significance. If the performance improvement can be attributed to the explicit strategy representation and its revision mechanism, this work would offer a valuable contribution to the development of reliable LLM-based agents for scientific computing tasks. By making solver strategies inspectable and separable from code, it addresses a limitation in existing approaches where feedback is applied directly to implementations. The staged pipeline and skill library provide a structured framework that could be extended to other domains requiring numerical expertise.
major comments (3)
- [§4 (Evaluation)] The experimental results report a 54.5% pass rate and 14.2 pp improvement without providing details on the PDE Agent Bench (e.g., number and types of PDEs, difficulty distribution), the implementation of baseline agents, the number of independent runs, or any statistical tests for significance. This omission prevents verification that the gain is not due to confounding factors such as differences in prompting or library access.
- [§3 (Method)] The central claim relies on routing numerical failure feedback to revisions of the explicit strategy object rather than direct code edits. However, the description of the three-stage pipeline does not include the specific mechanism, decision rules, or pseudocode for how numerical evidence updates the strategy object. Without this, it is unclear how the revision process operates or is validated.
- [§4 (Evaluation)] No ablation study is presented that isolates the contribution of maintaining and revising the explicit strategy object while holding other elements (such as the skill library and pilot solves) constant. The performance delta cannot be attributed to this design choice based on the current evaluation.
minor comments (1)
- The abstract contains LaTeX markup for the percentage values that should be rendered consistently in the final manuscript.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to incorporate the requested clarifications and additions.
- Referee: [§4 (Evaluation)] The experimental results report a 54.5% pass rate and 14.2 pp improvement without providing details on the PDE Agent Bench (e.g., number and types of PDEs, difficulty distribution), the implementation of baseline agents, the number of independent runs, or any statistical tests for significance. This omission prevents verification that the gain is not due to confounding factors such as differences in prompting or library access.
  Authors: We agree that the current experimental section lacks sufficient detail for reproducibility and to rule out confounds. In the revised manuscript we will expand §4 with: the exact composition of PDE Agent Bench (number of problems, PDE types, and difficulty distribution); full implementation details for each baseline (prompting, library access, and any other configuration); the number of independent runs and observed variance; and statistical significance tests (e.g., McNemar or bootstrap intervals) on the 14.2 pp difference. These additions will appear in the main text or a new appendix.
  Revision: yes
- Referee: [§3 (Method)] The central claim relies on routing numerical failure feedback to revisions of the explicit strategy object rather than direct code edits. However, the description of the three-stage pipeline does not include the specific mechanism, decision rules, or pseudocode for how numerical evidence updates the strategy object. Without this, it is unclear how the revision process operates or is validated.
  Authors: We acknowledge that the revision mechanism is described only at a high level. We will add a dedicated subsection to §3 that specifies the decision rules (e.g., how convergence failure or accuracy violation triggers changes to discretization order, stabilization technique, or method selection from the skill library) and includes pseudocode for the update loop. This will make the routing of numerical evidence to the strategy object explicit and verifiable.
  Revision: yes
- Referee: [§4 (Evaluation)] No ablation study is presented that isolates the contribution of maintaining and revising the explicit strategy object while holding other elements (such as the skill library and pilot solves) constant. The performance delta cannot be attributed to this design choice based on the current evaluation.
  Authors: We agree that an ablation isolating the explicit strategy object is needed to attribute the observed gain. We will add such an ablation to the revised §4, comparing the full system against a variant that routes feedback directly to code while keeping the skill library and pilot solves fixed. Results and controls will be reported in the main text or appendix.
  Revision: yes
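For the significance test mentioned in the first response, one option is an exact McNemar test on paired pass/fail outcomes, since both agents are run on the same benchmark problems. The sketch below is a generic implementation of that test, not the authors' analysis. The function name `mcnemar_exact` is an assumption, and the ten outcomes are synthetic placeholders invented purely to exercise the code; they are not results from PDE Agent Bench.

```python
from math import comb


def mcnemar_exact(passes_a, passes_b):
    """Two-sided exact McNemar test on paired pass/fail outcomes.

    Returns (only_a, only_b, p_value), where only_a counts problems that
    agent A passed and agent B failed, and only_b counts the reverse.
    """
    if len(passes_a) != len(passes_b):
        raise ValueError("outcomes must be paired, one entry per problem")
    only_a = sum(1 for a, b in zip(passes_a, passes_b) if a and not b)
    only_b = sum(1 for a, b in zip(passes_a, passes_b) if b and not a)
    n = only_a + only_b  # only discordant pairs carry information
    if n == 0:
        return only_a, only_b, 1.0
    k = min(only_a, only_b)
    # Binomial(n, 1/2) tail probability, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return only_a, only_b, min(1.0, 2.0 * tail)


# Synthetic placeholder outcomes for ten problems (NOT benchmark results).
agent_a = [True, True, True, True, True, True, True, True, False, False]
agent_b = [False, False, False, False, False, False, True, True, True, False]

only_a, only_b, p_value = mcnemar_exact(agent_a, agent_b)
print(only_a, only_b, p_value)
```

The test ignores problems where both agents agree, so a gap in pass rate can fail to reach significance when few problems are discordant. That is one reason to report the number of problems and runs alongside the percentage.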
Circularity Check
No circularity: empirical architecture with no derivation chain or fitted predictions
The paper describes an LLM-based agent architecture for PDE solving that maintains an explicit solver strategy object. No mathematical derivations, equations, parameter fitting, or predictions are present. Claims rest on benchmark pass rates (54.5% vs. baseline) rather than any self-referential reduction. No self-citations are load-bearing for uniqueness or ansatz; the work is self-contained as an engineering contribution. No steps match any enumerated circularity pattern.