arxiv: 2512.24365 · v2 · pith:VPZLPZSDnew · submitted 2025-12-30 · ⚛️ physics.geo-ph · cs.LG

A Critical Assessment of PINNs and Operator Learning for Geotechnical Engineering

Pith reviewed 2026-05-21 15:46 UTC · model grok-4.3

classification ⚛️ physics.geo-ph cs.LG
keywords physics-informed neural networks · operator learning · geotechnical engineering · finite difference methods · extrapolation · inverse analysis · scientific machine learning

The pith

Neural networks fit geotechnical data inside training ranges but fail to extrapolate and cost far more than finite-difference solvers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper benchmarks multi-layer perceptrons, physics-informed neural networks, deep operator networks, and graph network simulators against finite-difference and particle-based methods on problems such as Terzaghi consolidation, damped oscillators, one-dimensional waves, and beams on elastic foundations. It tracks accuracy outside sampled domains, training and inference costs, transfer to new instances, and performance on inverse tasks. Neural approaches match reference solutions only where data or residuals are supplied during training, diverge sharply beyond those intervals, and require computation equivalent to thousands or millions of traditional solves. This matters because scientific machine learning is promoted for replacing or accelerating engineering simulations, yet the results indicate neural networks serve best for interpolation within validated domains. For inverse analysis the paper shows automatic differentiation through a conventional solver recovers material properties in seconds with about one percent error.
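As a rough illustration of differentiating through a conventional solver, the sketch below pushes forward-mode dual numbers through an explicit finite-difference wave solver and uses the resulting derivatives in a Gauss-Newton update. It is not the paper's implementation: the `Dual` class, `solve_wave`, `recover_speed`, the grid sizes, the single scalar wave speed, and the starting guess are all assumptions made for this example, and hand-rolled dual numbers stand in for whichever AD framework the paper uses.

```python
import math


class Dual:
    """Value plus derivative with respect to one chosen parameter (forward-mode AD)."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule carries the derivative through every multiplication.
        return Dual(self.val * o.val, self.val * o.der + self.der * o.val)

    __rmul__ = __mul__


def solve_wave(c, nx=51, nt=120, dx=0.02, dt=0.01):
    """Leapfrog solve of u_tt = c^2 u_xx on [0, 1], fixed ends; returns interior nodes."""
    courant = c * (dt / dx)  # hypothetical grid; needs courant <= 1 for stability
    r2 = courant * courant
    prev = [math.sin(math.pi * i * dx) for i in range(nx)]  # initial shape, zero velocity
    prev[0] = prev[-1] = 0.0
    curr = prev[:]
    for i in range(1, nx - 1):  # first step from a Taylor expansion
        curr[i] = prev[i] + 0.5 * r2 * (prev[i + 1] - 2 * prev[i] + prev[i - 1])
    for _ in range(nt - 1):
        nxt = curr[:]
        for i in range(1, nx - 1):
            lap = curr[i + 1] - 2 * curr[i] + curr[i - 1]
            nxt[i] = 2 * curr[i] - prev[i] + r2 * lap
        prev, curr = curr, nxt
    return curr[1:-1]


def recover_speed(data, c0, iters=6):
    """Gauss-Newton on a scalar wave speed, using AD derivatives at every node."""
    c = c0
    for _ in range(iters):
        u = solve_wave(Dual(c, 1.0))  # seed dc/dc = 1
        num = sum((ui.val - di) * ui.der for ui, di in zip(u, data))
        den = sum(ui.der * ui.der for ui in u)
        c -= num / den
    return c


observed = solve_wave(1.0)  # synthetic data from an assumed true speed of 1.0
estimate = recover_speed(observed, c0=0.9)
print(f"recovered wave speed: {estimate:.6f}")
```

Forward mode is adequate here because there is only one unknown. Recovering a full material profile with many unknowns would normally use reverse-mode differentiation, which this sketch does not attempt.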

Core claim

Neural networks match reference solutions inside training intervals yet diverge outside them, as when a ReLU MLP trained on two years of Terzaghi consolidation predicts about 290 mm instead of the reference 99.3 mm at year ten. PINN training on the one-dimensional wave equation runs roughly 96,000 times slower than finite differences and yields lower accuracy. DeepONet training for the beam problem costs about as much as 1.8 million finite-difference solves, while graph networks demand full trajectories and large memory. Automatic differentiation through an existing finite-difference solver recovers the material profile in seconds with roughly one percent error. These outcomes support using neural methods for interpolation inside validated domains.
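One way to see why a ReLU network overshoots on consolidation: a ReLU MLP is piecewise linear, so beyond its training data it continues along a straight line, while the consolidation curve flattens. The sketch below evaluates Terzaghi's series solution for the average degree of consolidation and compares it with a straight-line continuation from the end of a two-year window. The soil parameters (`cv`, `drainage_path`, `final_mm`) are hypothetical and do not reproduce the paper's 99.3 mm reference. The secant line is a stand-in for the network's behaviour, not a trained model.

```python
import math


def degree_of_consolidation(time_factor, terms=200):
    """Average degree of consolidation U(Tv) from Terzaghi's 1D series solution."""
    total = 0.0
    for m in range(terms):
        big_m = (2 * m + 1) * math.pi / 2
        total += 2.0 / big_m**2 * math.exp(-big_m**2 * time_factor)
    return 1.0 - total


def settlement_mm(t_years, cv=1.0, drainage_path=4.0, final_mm=100.0):
    """Settlement at time t for assumed cv (m^2/yr), drainage path (m), final settlement (mm)."""
    time_factor = cv * t_years / drainage_path**2
    return final_mm * degree_of_consolidation(time_factor)


def linear_extrapolation(t_query, t_a=1.9, t_b=2.0):
    """Continue the secant through the last two 'training' times as a straight line."""
    slope = (settlement_mm(t_b) - settlement_mm(t_a)) / (t_b - t_a)
    return settlement_mm(t_b) + slope * (t_query - t_b)


for year in (2.0, 5.0, 10.0):
    print(year, round(settlement_mm(year), 1), round(linear_extrapolation(year), 1))
```

Because the settlement curve is concave in time, the straight-line continuation lies above it everywhere past the training window, which matches the direction of the ReLU error reported in the paper.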

What carries the argument

Benchmark comparisons of extrapolation error, training cost, inference speed, and physics accuracy between neural architectures and conventional numerical solvers on geotechnical test cases.

Load-bearing premise

The chosen benchmarks of consolidation, oscillation, wave propagation, and beam deflection represent the main difficulties encountered in practical geotechnical engineering.

What would settle it

A demonstration that a PINN trained only on the interval [0,1] for the damped oscillator produces accurate predictions at later times with error below one percent would challenge the claim of limited extrapolation.
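The criterion can be made checkable with a small harness. The sketch below is one possible reading of it, not the paper's protocol: the damping ratio, natural frequency, initial conditions, the evaluation window [1, 3], and the choice of a relative L2 error are all assumptions, and `passes_extrapolation_test` is a hypothetical helper name.

```python
import math


def damped_oscillator(t, zeta=0.1, omega=20.0):
    """Closed form exp(-zeta*omega*t) * cos(omega_d*t) of u'' + 2*zeta*omega*u' + omega^2*u = 0."""
    omega_d = omega * math.sqrt(1.0 - zeta**2)
    return math.exp(-zeta * omega * t) * math.cos(omega_d * t)


def relative_l2_error(predict, t_start, t_end, n=400):
    """Relative L2 error of `predict` against the closed form on [t_start, t_end]."""
    ts = [t_start + (t_end - t_start) * i / (n - 1) for i in range(n)]
    num = sum((predict(t) - damped_oscillator(t)) ** 2 for t in ts)
    den = sum(damped_oscillator(t) ** 2 for t in ts)
    return math.sqrt(num / den)


def passes_extrapolation_test(predict, t_start=1.0, t_end=3.0, tol=0.01):
    """True if the model stays within `tol` relative error beyond the training interval."""
    return relative_l2_error(predict, t_start, t_end) < tol


def frozen(t):
    """Stand-in for a model that stops tracking the physics at t = 1."""
    return damped_oscillator(1.0)


print(passes_extrapolation_test(damped_oscillator), passes_extrapolation_test(frozen))
```

Any trained model exposed as a function of time could be passed in place of `frozen`.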

Figures

Figures reproduced from arXiv: 2512.24365 by Krishna Kumar.

Figure 1: A single neuron (perceptron) computes a weighted sum of inputs plus bias, then …
Figure 2: Extrapolation failure for multi-layer perceptrons with different activation functions …
Figure 3: Spatial autocorrelation in geotechnical data. (a) Random split creates spatial …
Figure 4: Physics-Informed Neural Network architecture for 1D wave propagation. The …
Figure 5: Wave propagation comparison between PINN and finite difference from …
Figure 6: Automatic differentiation modes. Left: Forward mode accumulates derivatives …
Figure 7: Inverse problem comparison: recovering velocity profiles from waveform data …
Figure 8: Deep Operator Network (DeepONet) architecture. The branch network encodes …
Figure 9: GNS generalization to unseen boundary conditions. The model trains on granular …
Original abstract

Scientific machine learning (SciML) offers neural-network alternatives to numerical workflows in geotechnical engineering. This paper benchmarks multi-layer perceptrons (MLPs), physics-informed neural networks (PINNs), deep operator networks (DeepONet), and graph network simulators (GNS) against finite-difference and particle-based references on geotechnical benchmarks, and compares PINN inversion with automatic differentiation (AD) through a conventional solver. We evaluate each method for extrapolation, training, and inference cost, transfer across problem instances, and physics accuracy. An MLP trained on two years of Terzaghi consolidation fits the data, but at year ten predicts ~290 mm with ReLU and ~60 mm with tanh or sigmoid, against a reference of 99.3 mm. A PINN on a damped oscillator with a time domain inside [0,1] matches the closed form within that interval but fails outside, since the residual constrains the fit only where it is sampled. For the 1D wave equation, PINN training is ~96,000 times slower than finite-difference methods and less accurate. DeepONet avoids PINN retraining, yet for the beam on elastic foundation, its training cost equals ~1.8 million finite-difference solves, and inference is slower per query than the direct solver. GNS improves geometric transfer through local particle interactions, though formulations still need trajectories, large training sets, and substantial memory. In the inverse wave benchmark, AD through the finite-difference solver recovers the material profile in seconds with ~1% error. The results support a cautious role for SciML. Neural networks suit interpolation and pattern recognition inside validated domains, while inverse analysis should first try differentiable physics-based solvers when a reliable forward solver exists.
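The DeepONet cost figures in the abstract can be read as a break-even calculation: a surrogate pays off only if each query is cheaper than a direct solve by enough to recover the training cost. The sketch below expresses all costs in units of one finite-difference solve. The training cost of 1.8 million solves comes from the abstract. The per-query costs of 1.5 and 0.5 are hypothetical values chosen only to show both branches, and `break_even_queries` is an illustrative helper.

```python
import math


def break_even_queries(training_cost, surrogate_query_cost, solver_cost=1.0):
    """Smallest query count at which the surrogate's total cost no longer exceeds
    the solver's, or None if the surrogate never catches up."""
    saving_per_query = solver_cost - surrogate_query_cost
    if saving_per_query <= 0:
        return None  # each query costs at least as much as a direct solve
    return math.ceil(training_cost / saving_per_query)


# Inference slower than the solver (the situation the abstract reports): no break-even.
print(break_even_queries(1.8e6, surrogate_query_cost=1.5))

# Hypothetical faster surrogate at half the cost of a solve.
print(break_even_queries(1.8e6, surrogate_query_cost=0.5))
```

Under the abstract's finding that inference is slower per query than the direct solver, the first branch applies and the training cost is never recovered.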

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript benchmarks SciML approaches (MLPs, PINNs, DeepONet, GNS) against finite-difference and particle-based references on geotechnical problems: Terzaghi consolidation (MLP extrapolation from 2-year training to year-10 prediction of ~290 mm ReLU / ~60 mm tanh vs 99.3 mm reference), damped oscillator (PINN accurate only inside sampled [0,1] domain), 1D wave (PINN ~96,000× slower and less accurate than FD), beam on elastic foundation (DeepONet training cost ~1.8 million FD solves), and an inverse wave problem (AD through FD solver recovers material profile in seconds with ~1% error). The central claim is that neural networks are suited to interpolation inside validated domains while inverse analysis should prefer differentiable physics-based solvers when a reliable forward model exists.

Significance. If the numerical gaps hold, the work supplies concrete, reproducible evidence that can temper enthusiasm for direct SciML replacement of established workflows in geotechnical engineering. Credit is due for the direct, non-circular comparisons to independent reference solutions and for the explicit cost and accuracy quantifications that allow readers to judge the practical trade-offs.

major comments (3)
  1. [Conclusions / inverse-wave benchmark] The broader recommendation that inverse analysis should first try differentiable physics-based solvers is load-bearing on the assumption that the chosen benchmarks (predominantly 1D linear or low-dimensional problems with closed-form or simple numerical references) are representative of geotechnical practice. The manuscript should add an explicit discussion, or an additional nonlinear 3D heterogeneous case, in the conclusions or limitations section to address whether the observed failures (extrapolation error, domain restriction, training cost) are intrinsic or artifacts of problem simplicity.
  2. [1D wave equation benchmark] In the 1D wave equation results, the claim of ~96,000× slowdown for PINN training versus finite-difference is a central quantitative support for the cost critique. The comparison would be more robust if the manuscript reported the precise accuracy target, hardware, and whether the FD solver was also run to equivalent residual tolerance; without these, it is unclear whether the gap is intrinsic to the PINN formulation or implementation-dependent.
  3. [Terzaghi consolidation / MLP extrapolation] The MLP extrapolation result for Terzaghi consolidation reports point predictions (~290 mm ReLU, ~60 mm tanh/sigmoid vs 99.3 mm reference) but provides only summary statistics. Adding error distributions across random seeds or statistical significance tests would strengthen the claim that the discrepancy is systematic rather than run-specific.
minor comments (2)
  1. [Abstract] The abstract uses approximate symbols (~) for several quantities; replacing them with exact reported values or explicit ranges would improve precision and reproducibility.
  2. [Methods] Notation for activation functions and network architectures is introduced without a consolidated table; a short summary table would aid readers comparing the four methods.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and positive assessment of our manuscript, particularly the recognition of our direct, non-circular comparisons and quantitative cost/accuracy metrics. We address each major comment below and will revise the manuscript accordingly to improve robustness and clarity.

  1. Referee: [Conclusions / inverse-wave benchmark] The broader recommendation that inverse analysis should first try differentiable physics-based solvers is load-bearing on the assumption that the chosen benchmarks (predominantly 1D linear or low-dimensional problems with closed-form or simple numerical references) are representative of geotechnical practice. The manuscript should add an explicit discussion, or an additional nonlinear 3D heterogeneous case, in the conclusions or limitations section to address whether the observed failures (extrapolation error, domain restriction, training cost) are intrinsic or artifacts of problem simplicity.

    Authors: We agree that the selected benchmarks are primarily 1D and linear to permit exact comparisons against closed-form or converged references, which is essential for isolating specific SciML limitations. These problems are standard in the geotechnical literature precisely because they allow reproducible quantification of extrapolation failure and training overhead. We will add an explicit discussion paragraph in the Conclusions and Limitations section addressing the scope of the findings, noting that issues such as domain-restricted accuracy and high training costs arise from fundamental properties of neural network optimization and sampling rather than problem simplicity alone. While a new nonlinear 3D heterogeneous benchmark is beyond the scope of the current revision, the added text will outline how the observed patterns are expected to generalize and suggest directions for future work. revision: yes

  2. Referee: [1D wave equation benchmark] In the 1D wave equation results, the claim of ~96,000× slowdown for PINN training versus finite-difference is a central quantitative support for the cost critique. The comparison would be more robust if the manuscript reported the precise accuracy target, hardware, and whether the FD solver was also run to equivalent residual tolerance; without these, it is unclear whether the gap is intrinsic to the PINN formulation or implementation-dependent.

    Authors: We thank the referee for this suggestion to strengthen the quantitative claim. The reported factor was obtained by training the PINN until the PDE residual reached approximately 1e-4 on an NVIDIA A100 GPU, while the finite-difference solver used a fixed discretization run to machine-precision convergence. In the revised manuscript we will explicitly state the residual tolerance, hardware platform, and confirm that both methods were driven to comparable accuracy levels. This clarification will demonstrate that the large gap is attributable to the iterative optimization and automatic differentiation overhead inherent to PINNs rather than implementation artifacts. revision: yes

  3. Referee: [Terzaghi consolidation / MLP extrapolation] The MLP extrapolation result for Terzaghi consolidation reports point predictions (~290 mm ReLU, ~60 mm tanh/sigmoid vs 99.3 mm reference) but provides only summary statistics. Adding error distributions across random seeds or statistical significance tests would strengthen the claim that the discrepancy is systematic rather than run-specific.

    Authors: We agree that reporting variability across initializations would make the extrapolation result more convincing. In the revision we will include results from ten independent training runs with different random seeds, presenting the mean and standard deviation of the year-10 settlement predictions for each activation function. These statistics will show that the large systematic deviations (approximately 290 mm for ReLU and 60 mm for tanh/sigmoid versus the 99.3 mm reference) are consistent across seeds and not attributable to a single atypical run. revision: yes

Circularity Check

0 steps flagged

No significant circularity: claims rest on independent numerical benchmarks

Full rationale

The paper conducts direct empirical comparisons of MLP, PINN, DeepONet, and GNS performance against independent finite-difference and particle-based reference solutions on specific benchmarks (Terzaghi consolidation, damped oscillator, 1D wave, beam on elastic foundation, inverse wave). All reported errors, costs, and extrapolation failures are computed from these external references rather than from any fitted parameter that is then reused as a prediction target. No self-definitional loops, fitted-input predictions, or load-bearing self-citation chains appear in the evaluation chain. The conclusions follow from explicit numerical results against non-derived baselines, making the assessment self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard numerical methods and loss formulations from the SciML literature; it introduces no new free parameters, axioms, or postulated entities beyond those already used in the referenced PINN and operator-learning frameworks.

axioms (2)
  • domain assumption: The residual loss of a PINN is only enforced at the collocation points sampled during training.
    Invoked when explaining why PINN predictions fail outside the training interval.
  • standard math: Finite-difference and particle-based codes provide ground-truth reference solutions for the chosen benchmarks.
    Used throughout the comparison sections.
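The first axiom can be illustrated without training a network. The sketch below builds two hand-written candidates for a damped oscillator: the closed-form solution, and a copy that matches it on [0, 1] but drifts away smoothly afterwards. Both receive the same residual loss when collocation points are confined to [0, 1]. The damping ratio, natural frequency, drift amplitude, and point counts are hypothetical, and `drifting` is a constructed counterexample rather than a trained PINN.

```python
import math

ZETA, OMEGA = 0.1, 20.0  # hypothetical damping ratio and natural frequency
A = ZETA * OMEGA
W = OMEGA * math.sqrt(1.0 - ZETA**2)


def exact(t):
    """Return (u, u', u'') for u(t) = exp(-A t) cos(W t)."""
    e, c, s = math.exp(-A * t), math.cos(W * t), math.sin(W * t)
    return e * c, e * (-A * c - W * s), e * ((A * A - W * W) * c + 2 * A * W * s)


def drifting(t, amp=5.0):
    """Identical to `exact` on [0, 1]; adds a smooth drift amp*(t-1)^3 afterwards."""
    u, du, ddu = exact(t)
    if t > 1.0:
        d = t - 1.0
        u, du, ddu = u + amp * d**3, du + 3 * amp * d**2, ddu + 6 * amp * d
    return u, du, ddu


def residual_loss(candidate, points):
    """Mean squared residual of u'' + 2*zeta*omega*u' + omega^2*u at the given points."""
    total = 0.0
    for t in points:
        u, du, ddu = candidate(t)
        total += (ddu + 2 * A * du + OMEGA**2 * u) ** 2
    return total / len(points)


collocation = [i / 50 for i in range(51)]       # samples confined to [0, 1]
later = [1.0 + i / 50 for i in range(1, 101)]   # unsampled region (1, 3]

print(residual_loss(exact, collocation), residual_loss(drifting, collocation))
print(residual_loss(drifting, later), drifting(3.0)[0] - exact(3.0)[0])
```

The training loss cannot distinguish the two candidates, so nothing in the optimisation favours the physically correct one outside the sampled interval.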


