Data-Efficient Neural Operator Training via Physics-Based Active Learning

Alicja Polanska; Lorenzo Zanisi; Stanislas Pamela; Vignesh Gopakumar

arXiv: 2605.21348 · v1 · pith:LE3GZBZDnew · submitted 2026-05-20 · cs.LG · cs.AI · cs.NA · math.NA · physics.comp-ph

Pith reviewed 2026-05-21 05:29 UTC · model grok-4.3

classification cs.LG · cs.AI · cs.NA · math.NA · physics.comp-ph

keywords neural operators · active learning · physics-informed learning · partial differential equations · data efficiency · Burgers equation · Navier-Stokes equations

The pith

Physics-based active learning uses the PDE residual to select training points where neural operators understand the physics least.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an active learning method that trains neural operators to solve partial differential equations with far less training data than usual. It selects new simulation points by measuring how strongly the current model's prediction violates the governing PDE at each candidate location. This replaces random sampling and is tested on the one-dimensional Burgers equation and the two-dimensional compressible Navier-Stokes equations. In these experiments the approach beats random selection and matches existing state-of-the-art techniques in data efficiency. The method embeds physical knowledge directly into the choice of which expensive simulations to run next.

Core claim

We introduce physics-based acquisition, a novel physics-informed active learning algorithm that leverages the partial differential equation residual to guide data selection. Numerical experiments on the 1D Burgers equation and the 2D compressible Navier-Stokes equations show that physics-based acquisition consistently outperforms random acquisition and matches the state of the art in data efficiency. It has the unique advantage of injecting a physics inductive bias into the training process, ensuring that simulation cost is spent where the model's physical understanding is weakest.
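For orientation, one way to write a residual-driven acquisition rule of this kind is sketched below. The notation (the differential operator $\mathcal{D}$, the neural operator $G_\theta$, a candidate input $a$ drawn from a pool $\mathcal{P}$, and the score $\alpha$) and the choice of an $L^2$ norm are assumptions made here for illustration; this page does not state the paper's exact form, including any normalization.

```latex
\begin{align}
  % Governing PDE in residual form, and the model's prediction for input a
  \mathcal{D}[u](x,t) &= 0, & \hat{u}_a &= G_\theta(a), \\
  % Residual of the prediction and an (assumed) scalar acquisition score
  r_\theta(a)(x,t) &= \mathcal{D}[\hat{u}_a](x,t), & \alpha(a) &= \lVert r_\theta(a) \rVert_{L^2}, \\
  % Next input to simulate: the pool element with the largest score
  a^\star &= \arg\max_{a \in \mathcal{P}} \alpha(a).
\end{align}
% Example: for the 1D viscous Burgers equation with viscosity \nu,
% \mathcal{D}[u] = \partial_t u + u\,\partial_x u - \nu\,\partial_x^2 u.
```

Under this reading, a prediction that satisfies the PDE exactly receives a score of zero and is never prioritized for simulation.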

What carries the argument

Physics-based acquisition function that ranks candidate points by the magnitude of the PDE residual evaluated with the current neural operator model.
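A minimal sketch of such a ranking for the 1D Burgers equation follows. It assumes a periodic, uniformly spaced grid, central finite differences, and a mean-absolute-residual score per candidate; the function names (`burgers_residual`, `acquisition_scores`, `select_top_k`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def burgers_residual(u, dx, dt, nu):
    """Finite-difference residual of u_t + u*u_x - nu*u_xx on a periodic grid.

    u has shape (n_t, n_x). Central differences in time drop the first and
    last time step, so the result has shape (n_t - 2, n_x).
    """
    u_mid = u[1:-1]
    u_t = (u[2:] - u[:-2]) / (2.0 * dt)
    left = np.roll(u_mid, 1, axis=1)    # u(x - dx), periodic wrap
    right = np.roll(u_mid, -1, axis=1)  # u(x + dx), periodic wrap
    u_x = (right - left) / (2.0 * dx)
    u_xx = (right - 2.0 * u_mid + left) / dx**2
    return u_t + u_mid * u_x - nu * u_xx

def acquisition_scores(predictions, dx, dt, nu):
    """One score per candidate: mean absolute residual of its predicted field."""
    return np.array(
        [np.mean(np.abs(burgers_residual(u, dx, dt, nu))) for u in predictions]
    )

def select_top_k(scores, k):
    """Indices of the k largest scores, highest first."""
    return np.argsort(scores)[::-1][:k]
```

Here `predictions` stands for the current model's outputs on the candidate pool. Because the residual is computed with finite differences, it also carries discretization error, which is one reason the residual need not track the true prediction error.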

Load-bearing premise

The PDE residual at a point reliably signals where the neural operator's approximation of the true solution is weakest.

What would settle it

A controlled experiment in which models trained with residual-based selection show equal or lower accuracy on test data than models trained with random selection at the same data budget.

Figures

Figures reproduced from arXiv: 2605.21348 by Alicja Polanska, Lorenzo Zanisi, Stanislas Pamela, Vignesh Gopakumar.

Original abstract

Solving partial differential equations with neural operators significantly reduces computational costs but remains bottlenecked by high training data requirements. Active learning offers a natural framework to mitigate this by selectively acquiring the most informative samples in an iterative manner. We introduce physics-based acquisition, a novel physics-informed active learning algorithm that leverages the partial differential equation residual to guide data selection. We validate the method by presenting numerical experiments for the 1D Burgers equation and the 2D compressible Navier-Stokes equations. We show that, in our experiments, physics-based acquisition consistently outperforms random acquisition and matches the state of the art in data efficiency. At the same time, it has the unique advantage of injecting a physics inductive bias into the training process, ensuring that simulation cost is spent where the model's physical understanding is weakest.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper's core idea is to use the PDE residual from the current neural operator as the acquisition function for active learning, and the experiments on Burgers and Navier-Stokes show it beats random sampling while matching other data-efficient methods.

The punchline is that this work gives a simple physics-driven way to pick new training solves for neural operators instead of random or uncertainty-based selection. They compute the residual of the governing PDE on the model's current output and use high-residual locations to decide where to run the next expensive simulation. That is the main novelty they claim over prior active learning for operators.

The experiments on 1D Burgers and 2D compressible Navier-Stokes are the concrete evidence: physics-based acquisition consistently needs fewer samples than random to reach the same accuracy and performs on par with existing state-of-the-art active learning baselines. The framing that this injects a physics inductive bias is reasonable and distinguishes it from purely data-driven acquisition functions.

What is actually new is the direct use of the residual magnitude for sample selection in the neural operator setting; earlier work on physics-informed neural networks uses residuals for loss terms, but not this way for data acquisition. The method is straightforward to implement once you have a residual evaluator, which is already available in most neural operator pipelines.

The soft spots are in the experimental reporting. The abstract gives no error bars, no details on how the baselines were re-implemented, and no statistical tests, so the size of the gains is hard to judge from the summary alone. More importantly, there is no direct check that high residual regions actually coincide with high pointwise error or with dynamically important features; if discretization artifacts or architecture biases dominate the residual, the acquisition could steer effort to the wrong places. That concern from the stress-test note is worth checking in the full paper.

This paper is for researchers already working on neural operators for PDEs who want to reduce the number of high-fidelity solves needed. A reader who cares about active learning or physics-informed sampling will find the method description and the two-equation results useful.

I would send it to peer review. The idea is clean, the experiments are on standard benchmarks, and the central claim is falsifiable with the right plots, so referees can evaluate it properly even if revisions are needed on the validation side.
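The loop described in the letter can be sketched generically as follows. This is an illustration under stated assumptions, not the authors' implementation: `simulate`, `fit`, and `score` are hypothetical callables standing in for the expensive solver, the neural operator training routine, and a residual-based score, and the random initial subset and top-k batch selection are choices made here for simplicity.

```python
import numpy as np

def active_learning_loop(pool, simulate, fit, score,
                         n_init, n_rounds, batch_size, seed=0):
    """Grow a training set by simulating the highest-scoring candidates.

    pool     : sequence of candidate inputs (e.g. initial conditions)
    simulate : candidate -> reference solution (the expensive solver)
    fit      : (inputs, solutions) -> trained model
    score    : (model, candidate) -> float, e.g. a PDE residual norm
    """
    rng = np.random.default_rng(seed)
    # Start from a small random subset so the first model has data to fit.
    chosen = rng.choice(len(pool), size=n_init, replace=False).tolist()
    solutions = {i: simulate(pool[i]) for i in chosen}

    def refit():
        idx = sorted(solutions)
        return fit([pool[i] for i in idx], [solutions[i] for i in idx])

    model = refit()
    for _ in range(n_rounds):
        remaining = [i for i in range(len(pool)) if i not in solutions]
        if not remaining:
            break
        scores = np.array([score(model, pool[i]) for i in remaining])
        # Highest score first; only these candidates are sent to the solver.
        for j in np.argsort(scores)[::-1][:batch_size]:
            i = remaining[j]
            solutions[i] = simulate(pool[i])
        model = refit()
    return model, sorted(solutions)
```

Replacing `score` with a function that returns random numbers turns the same loop into the random-acquisition baseline, which keeps the comparison at an equal simulation budget.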

Referee Report

2 major / 2 minor

Summary. The paper proposes physics-based active learning for neural operator training on PDEs, where new training samples are selected iteratively by computing the PDE residual on the current model's predictions. Experiments on the 1D Burgers equation and 2D compressible Navier-Stokes equations are used to claim that this approach outperforms random acquisition, matches state-of-the-art data efficiency, and injects a useful physics inductive bias by focusing simulations where the model is physically weakest.

Significance. If validated, the method could meaningfully reduce the high data requirements for neural operators by directing expensive simulations toward regions of high physical inconsistency. The direct use of the PDE residual as an acquisition function provides a distinctive physics-informed alternative to standard uncertainty or diversity-based strategies, with potential for broader impact in scientific machine learning.

major comments (2)

[Abstract] The central claim that physics-based acquisition 'consistently outperforms random acquisition and matches the state of the art in data efficiency' is presented without error bars, exact quantitative metrics, baseline implementation details, or statistical significance tests. This absence prevents verification of the reported gains and weakens the empirical support for the method's superiority.
[Method and Numerical Experiments] The key assumption that PDE residual magnitude reliably identifies locations where the neural operator's physical understanding is weakest (and that sampling there improves global performance) is not validated. The manuscript provides neither scatter plots or quantitative analysis correlating residual values with pointwise true error on held-out solves, nor ablations on residual normalization or discretization effects. If the residual is dominated by artifacts rather than model error, the acquisition function may select uninformative samples.
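The residual-versus-error check requested in the second comment could be run along these lines. The sketch computes a Spearman rank correlation with NumPy only; it assumes the residual magnitudes and true errors are available as arrays of equal size and that ties are rare, since this simple ranking does not average tied ranks (`scipy.stats.spearmanr` handles ties if SciPy is available). The function names are hypothetical.

```python
import numpy as np

def ranks(x):
    """Rank of each element (0 = smallest). Ties are broken by position."""
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(len(x))
    return r

def spearman(residual_magnitude, true_error):
    """Spearman rank correlation between two arrays of equal size."""
    ra = ranks(np.ravel(residual_magnitude))
    rb = ranks(np.ravel(true_error))
    return float(np.corrcoef(ra, rb)[0, 1])
```

A value near 1 on held-out solves would support the premise that high residual marks high error; a value near 0 would suggest the residual is dominated by something other than model error.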

minor comments (2)

[Method] The manuscript would benefit from a clearer statement of the precise form of the residual-based acquisition function, including any normalization or thresholding steps.
[Numerical Experiments] Figure captions and axis labels in the experimental results should explicitly state the number of independent runs and the precise error metric (e.g., relative L2) used for comparison.
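As a concrete reading of the second minor comment, the sketch below shows a relative L2 error and a per-budget summary over independent runs. Relative L2 is the metric the referee names only as an example, so treating it as the paper's metric is an assumption, as are the function names.

```python
import numpy as np

def relative_l2(pred, true):
    """Relative L2 error: ||pred - true||_2 / ||true||_2."""
    return float(np.linalg.norm(pred - true) / np.linalg.norm(true))

def summarize_runs(errors):
    """Mean and sample standard deviation of one error value per run."""
    errors = np.asarray(errors, dtype=float)
    return {
        "n_runs": int(errors.size),
        "mean": float(errors.mean()),
        "std": float(errors.std(ddof=1)),  # ddof=1: sample, not population
    }
```

Reporting `n_runs` alongside the mean and standard deviation is what lets a reader interpret the error bars on a data-efficiency curve.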

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments. We address each major comment below and outline the revisions we will make to strengthen the empirical support and validation of our assumptions.

Referee: [Abstract] The central claim that physics-based acquisition 'consistently outperforms random acquisition and matches the state of the art in data efficiency' is presented without error bars, exact quantitative metrics, baseline implementation details, or statistical significance tests. This absence prevents verification of the reported gains and weakens the empirical support for the method's superiority.

Authors: The abstract is a concise summary; the Numerical Experiments section contains the detailed comparisons, including baseline results. To improve verifiability as suggested, we will revise the manuscript to include error bars on all performance plots (from multiple random seeds), report exact metrics with standard deviations, expand baseline implementation details in the main text, and add statistical significance tests (e.g., paired t-tests) for the reported improvements.
revision: yes

Referee: [Method and Numerical Experiments] The key assumption that PDE residual magnitude reliably identifies locations where the neural operator's physical understanding is weakest (and that sampling there improves global performance) is not validated. The manuscript provides neither scatter plots or quantitative analysis correlating residual values with pointwise true error on held-out solves, nor ablations on residual normalization or discretization effects. If the residual is dominated by artifacts rather than model error, the acquisition function may select uninformative samples.

Authors: We agree that explicit validation of this assumption would strengthen the paper. The observed performance gains provide indirect support, but we will add direct evidence in the revision: scatter plots and quantitative correlation analysis between PDE residual values and pointwise true errors on held-out solves. We will also include ablations on residual normalization choices and discretization effects to confirm that the acquisition targets genuine model weaknesses rather than numerical artifacts.
revision: yes
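The paired t-test mentioned in the first response can be sketched as below. It assumes both acquisition strategies were run with the same seeds, so that errors can be paired run by run, and it returns only the statistic and degrees of freedom; a p-value would come from the t distribution (for example via `scipy.stats.ttest_rel`, which performs the whole test). The function name is hypothetical.

```python
import numpy as np

def paired_t_statistic(errors_a, errors_b):
    """Paired t statistic for per-seed errors of two methods.

    Returns (t, degrees_of_freedom). A negative t means method A has the
    lower mean error across the paired runs.
    """
    d = np.asarray(errors_a, dtype=float) - np.asarray(errors_b, dtype=float)
    n = d.size
    # Standard error of the mean difference, using the sample std (ddof=1).
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return float(t), n - 1
```

With only a handful of seeds the test has little power, so a non-significant result at small n would not by itself show that the two strategies perform equally.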

Circularity Check

0 steps flagged

No significant circularity detected

The paper defines a physics-based acquisition function directly from the PDE residual evaluated on the current neural operator output and then validates the resulting active learning loop through numerical experiments on Burgers and Navier-Stokes equations. This is a standard algorithmic construction for physics-informed sampling; the claimed performance advantage (outperforming random acquisition) is presented as an empirical outcome rather than a quantity that reduces to the acquisition definition itself. No self-definitional steps, fitted inputs relabeled as predictions, or load-bearing self-citations appear in the abstract or method description. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that PDE residual correlates with model weakness in physical understanding; no free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption: The PDE residual serves as a reliable indicator of where the neural operator's physical understanding is weakest. This premise directly guides the data selection in the physics-based acquisition algorithm.

Reference graph

Works this paper leans on

16 extracted references · 16 canonical work pages · 2 internal anchors

[1]

Universal physics transformers. arXiv preprint arXiv:2402.12365,

URL https://arxiv.org/abs/2402.12365. Pradeep Bajracharya, Javier Quetzalcoatl Toledo-Marín, Geoffrey Fox, Shantenu Jha, and Linwei Wang. Feasibility study on active learning of smart surrogates for scientific simulations. arXiv preprint arXiv:2407.07674,

work page arXiv
[2]

Qianying Cao, Somdatta Goswami, and George Em Karniadakis

URL https://arxiv.org/abs/2202.07643. Qianying Cao, Somdatta Goswami, and George Em Karniadakis. LNO: Laplace neural operator for solving differential equations,

work page arXiv
[3]

LNO: Laplace Neural Operator for Solving Differential Equations, May 2023

URL https://arxiv.org/abs/2303.10528. Maarten V. de Hoop, Daniel Zhengyu Huang, Elizabeth Qian, and Andrew M. Stuart. The cost-accuracy trade-off in operator learning with neural networks,

work page arXiv
[4]

De Hoop, D

URL https://arxiv.org/abs/2203.13181. Joel H. Ferziger, Milovan Perić, and Robert L. Street. Computational Methods for Fluid Dynamics. Springer Cham, Switzerland, 4th edition,

work page arXiv
[5]

doi: 10.1007/978-3-319-99693-6

ISBN 978-3-319-99693-6. doi: 10.1007/978-3-319-99693-6. Vignesh Gopakumar, Ander Gray, Lorenzo Zanisi, Timothy Nunn, Daniel Giles, Matt J Kusner, Stanislas Pamela, and Marc Peter Deisenroth. Calibrated physics-informed uncertainty quantification. arXiv preprint arXiv:2502.04406,

work page doi:10.1007/978-3-319-99693-6
[6]

Fourier Neural Operator for Parametric Partial Differential Equations

URL https://arxiv.org/abs/2405.00334. Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations. Oct 2020a. doi: 10.48550/arXiv.2010.08895. Funding by Kortschak Scholars Program. 5 Published as a conference paper at ...

work page internal anchor Pith review doi:10.48550/arxiv.2010.08895 2010
[7]

Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators

ISSN 2522-5839. doi: 10.1038/s42256-021-00302-5. URL http://dx.doi.org/10.1038/s42256-021-00302-5. David J. C. MacKay. Information-based objective functions for active data selection. Neural Computation, 4(4):590–604, Jul

work page doi:10.1038/s42256-021-00302-5
[8]

doi: 10.1162/neco.1992.4.4.590

ISSN 0899-7667. doi: 10.1162/neco.1992.4.4.590. Funding by Caltech Fellowship. Keith W Morton and David Francis Mayers. Numerical solution of partial differential equations: an introduction. Cambridge University Press,

work page doi:10.1162/neco.1992.4.4.590 1992
[9]

Active learning for neural PDE solvers. CoRR, abs/2408.01536,

Daniel Musekamp, Marimuthu Kalimuthu, David Holzmüller, Makoto Takamoto, and Mathias Niepert. Active learning for neural PDE solvers. CoRR, abs/2408.01536,

work page arXiv
[10]

Active learning for neural PDE solvers. CoRR, abs/2408.01536,

doi: 10.48550/ARXIV.2408.01536. URL https://doi.org/10.48550/arXiv.2408.01536. Raphaël Pestourie, Youssef Mroueh, Thanh V Nguyen, et al. Active learning of deep surrogates for PDEs: application to metasurface design. Computational Materials, 6(1):164,

work page doi:10.48550/arxiv.2408.01536
[11]

First results from the IllustrisTNG simulations: the stellar mass content of groups and clusters of galaxies

ISSN 0035-8711. doi: 10.1093/mnras/stx3112. URL https://doi.org/10.1093/mnras/stx3112. Alfio M. Quarteroni and Alberto Valli. Numerical Approximation of Partial Differential Equations. Springer Publishing Company, Incorporated, 1st ed

work page internal anchor Pith review doi:10.1093/mnras/stx3112
[12]

Anirudh Satheesh, Anant Khandelwal, Mucong Ding, and Radu Balan

URL https://arxiv.org/abs/2009.00236. Anirudh Satheesh, Anant Khandelwal, Mucong Ding, and Radu Balan. PICore: Physics-informed unsupervised coreset selection for data efficient neural operator training. Transactions on Machine Learning Research,

work page arXiv 2009
[13]

URL https://arxiv.org/abs/2406.02176. S.F. Smith, S.J.P. Pamela, A. Fil, M. Hölzl, G.T.A. Huijsmans, A. Kirk, D. Moulton, O. Myatra, A.J. Thornton, H.R. Wilson, and the JOREK team. Simulations of edge localised mode instabilities in MAST-U Super-X tokamak plasmas. Nuclear Fusion, 60(6):066021, May

work page arXiv
[14]

URL https://dx.doi.org/10.1088/1741-4326/ab826a

doi: 10.1088/1741-4326/ab826a. URL https://dx.doi.org/10.1088/1741-4326/ab826a. 6 Published as a conference paper at ICLR 2026 Workshop on AI and PDEs. Makoto Takamoto, Timothy Praditia, Raphael Leiteritz, Daniel MacKinlay, Francesco Alesiani, Dirk Pflüger, and Mathias Niepert. PDEBench: An extensive benchmark for scientific machine learning. In S. Koyejo...

work page doi:10.1088/1741-4326/ab826a 2026
[15]

Makoto Takamoto, Francesco Alesiani, and Mathias Niepert

URL https://proceedings.neurips.cc/paper_files/paper/2022/file/0a9747136d411fb83f0cf81820d44afb-Paper-Datasets_and_Benchmarks.pdf. Makoto Takamoto, Francesco Alesiani, and Mathias Niepert. Learning neural PDE solvers with parameter-guided channel attention,

work page 2022
[16]

Dongxia Wu, Ruijia Niu, Matteo Chinazzi, Yian Ma, and Rose Yu

URL https://arxiv.org/abs/2304.14118. Dongxia Wu, Ruijia Niu, Matteo Chinazzi, Yian Ma, and Rose Yu. Disentangled multi-fidelity deep Bayesian active learning. In International Conference on Machine Learning, pp. 37624–37634. PMLR, 2023a. Dongxia Wu, Ruijia Niu, Matteo Chinazzi, Alessandro Vespignani, Yi-An Ma, and Rose Yu. Deep Bayesian active learning for...

work page arXiv
