arxiv: 2601.03613 · v2 · pith:YVLQOJZBnew · submitted 2026-01-07 · ⚛️ physics.flu-dyn

A Simple but Efficient Transformer-Based Physics-Informed Neural Network for Incompressible Navier--Stokes Equations

Pith reviewed 2026-05-21 16:58 UTC · model grok-4.3

classification ⚛️ physics.flu-dyn
keywords flow · framework · physicsformer · proposed · computational · efficient · equation

The pith

PhysicsFormer applies a lightweight Transformer PINN with pseudo-sequential representations to convection, Burgers, lid-driven cavity, and inverse Navier-Stokes problems, reporting near-zero error in parameter identification and flow reconstruction from sparse noisy data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Simulating fluid motion with conventional solvers is usually slow and needs a lot of memory for computational grids. This work replaces the multilayer perceptron of a standard physics-informed neural network with a Transformer whose attention mechanism relates points across long stretches of time and space. It adds a dynamics-weighted loss that adjusts how strongly the physics equations count during training. Tests on the convection and Burgers' equations, lid-driven cavity flow, and flow past a circular cylinder show that it can recover the full flow field and unknown equation parameters even when only a few measurements are available and some of the data is noisy.
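To make "attention" concrete, here is a minimal single-head scaled dot-product attention in plain Python. It is a generic sketch of the mechanism from the Transformer literature, not the paper's implementation: the learned query/key/value projections and the multi-head split are omitted, and the names `softmax` and `attention` are illustrative.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Each query row returns a weighted average of the value rows."""
    scale = math.sqrt(len(q[0]))
    out: Matrix = []
    for q_row in q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q_row, k_row)) / scale for k_row in k]
        weights = softmax(scores)
        # Mix the value rows column by column using the attention weights.
        out.append([
            sum(w * v_row[col] for w, v_row in zip(weights, v))
            for col in range(len(v[0]))
        ])
    return out


# A zero query scores every key equally, so the result is the mean of the values.
print(attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 4.0]]))
```

Because every query can weight every position in the sequence, information from distant time steps can reach a given point in a single layer, which is the property the summary above alludes to.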

Core claim

For the inverse Navier-Stokes problem at Re=100, the proposed framework simultaneously reconstructs the flow field and identifies governing equation parameters with nearly 0% absolute error under both clean and noisy data conditions.
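For context, a sketch of what "identifying governing equation parameters" usually means on this benchmark. The figures reuse the cylinder-wake dataset of Raissi et al. (2019), where the two-dimensional Navier-Stokes equations carry two unknown coefficients. The Figure 18 caption's 𝜆1 and 𝜆2 suggest the same parameterization here, but that is an assumption, as is the percentage-error definition in the last line.

```latex
\begin{aligned}
u_t + \lambda_1 (u\,u_x + v\,u_y) &= -p_x + \lambda_2 (u_{xx} + u_{yy}), \\
v_t + \lambda_1 (u\,v_x + v\,v_y) &= -p_y + \lambda_2 (v_{xx} + v_{yy}), \\
u_x + v_y &= 0, \\
\text{error}(\lambda_i) &= 100\% \times
  \frac{\lvert \lambda_i^{\text{pred}} - \lambda_i^{\text{true}} \rvert}{\lvert \lambda_i^{\text{true}} \rvert}.
\end{aligned}
```

In the Raissi et al. benchmark the reference values are λ1 = 1 and λ2 = 0.01, the latter being 1/Re at Re = 100.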

Load-bearing premise

The dynamics-weighted loss and pseudo-sequential spatio-temporal representations will produce stable convergence and accurate predictions for strongly nonlinear time-dependent flows without post-hoc tuning or additional regularization beyond what is described.

Figures

Figures reproduced from arXiv: 2601.03613 by Biswanath Barman, Debdeep Chatterjee, Rajendra K. Ray.

Figure 1: Flowchart illustrating the Physics-Informed Neural Networks (PINNs) architecture for solving partial differential equations (PDEs). Barman, Chatterjee, Ray: Preprint submitted to Elsevier, page 6 of 35.
Figure 2: Flowchart of PhysicsFormer for solving general PDEs.
Figure 3: The dark-shaded region indicates the supervised data used by the PhysicsFormer model to reconstruct the flow past a circular cylinder and identify unknown physical parameters, while the remaining field is inferred through embedded physics constraints. The source dataset was reused with the permission of the author from Raissi et al., J. Comput. Phys. 378, 686–707 (2019).
Figure 4: Training data distribution for the flow past a circular cylinder. The left panel shows the 𝑢-velocity and the right panel shows the 𝑣-velocity data, randomly sampled from time slices between 𝑡 = 0.0 s and 𝑡 = 19.90 s. A total of 1500 spatial–temporal data points were used for training the PhysicsFormer model. The source dataset was reused with the permission of the author from Raissi et al., J. Comput. Phys.…
Figure 5: Convection equation results at 𝛽 = 50 using PINNs and proposed PhysicsFormer. In Figure (a), the exact solution is computed using data reused from Zhao et al., PINNsFormer: A Transformer-Based Framework for Physics-Informed Neural Networks, arXiv:2307.11833 (2023), licensed under a Creative Commons Attribution (CC BY 4.0) license. (b) and (c) are the PINNs prediction and absolute error, while (d) and (e) a…
Figure 6: Comparison of solutions to Burgers’ equation: (a) exact reference solution [The source dataset was reused with the permission of the author from Raissi et al., J. Comput. Phys. 378, 686–707 (2019).], (b) prediction derived from the proposed PhysicsFormer model, and (c) distribution of absolute error. The findings demonstrate that PhysicsFormer effectively represents the shock behavior while preserving a mi…
Figure 7: Comparison of the predicted and exact solutions of Burgers’ equation at three time instants: 𝑡 = 0.25 s, 𝑡 = 0.50 s, and 𝑡 = 0.75 s. The proposed PhysicsFormer accurately captures the evolution of the shock wave, with predicted results consistently aligning with the exact solutions across all snapshots.
Figure 8: Burgers’ equation training setup and convergence performance. The left image shows the initial and boundary training data distribution in the solution space, while the right image depicts the training loss versus epoch for the PhysicsFormer model. The model demonstrates stable convergence and reaches an accurate solution after approximately 500 epochs. The source dataset was reused with the permission of t…
Figure 9: Comparison of training loss versus epoch across different approaches, including PhysicsFormer, PINNsFormer, PINNs, QRes, and FLS. The results indicate that the proposed PhysicsFormer demonstrates consistent convergence behavior with improved agreement across all cases. Part of the data in this figure were reused from Zhao et al., PINNsFormer: A Transformer-Based Framework for Physics-Informed Neural Networks…
Figure 10: Pressure field reconstruction results for the incompressible Navier–Stokes equations. The top row (a) shows the reference exact solution; the source dataset was reused with permission from the author of Raissi et al., J. Comput. Phys. 378, 686–707 (2019). The middle row (b)-(f) presents the predicted pressure fields obtained using different models, and the bottom row (g)-(k) illustrates the corresponding …
Figure 11: Comparison of the reference exact solution [The source dataset was reused with permission from the author of Raissi et al., J. Comput. Phys. 378, 686–707 (2019).] (left column) with the predictions derived from the proposed PhysicsFormer model (right column) for 𝑢-velocity, 𝑣-velocity, and pressure fields. The reconstruction utilizes merely 0.15% (1500) sparse supervised velocity and pressure data, demonst…
Figure 12: Comparison of the vorticity field in the wake of a circular cylinder: CFD benchmark solution (a) vs the reconstruction derived from the proposed PhysicsFormer model (b). The model is trained with about 0.15% (1500 samples) of the available data, yet it effectively captures the typical vorticity structures within the wake zone. The source dataset was reused with permission from the author of Raissi et al.,…
Figure 13: Analysis of streamline configurations in the wake region of a circular cylinder. The left column displays the CFD benchmark solution (a), whereas the right column (b) illustrates the reconstruction derived from the proposed PhysicsFormer model. Despite its training on only 0.15% (1500 samples) of the velocity data, PhysicsFormer proficiently reproduces the wake streamlines and delineates the fundamental fl…
Figure 14: Inverse problem of the Navier-Stokes equations utilizing 1500 velocity samples. The top two rows present a comparison of the exact results and those predicted by PhysicsFormer for 𝑢- and 𝑣-velocity, together with the associated absolute error for clean data. The bottom two rows depict the reconstruction of the 𝑢- and 𝑣-velocity under 1% Gaussian noise, along with the corresponding absolute error. The exac…
Figure 15
Figure 16: Comparison of vorticity fields for clean and noisy data cases. The left two panels show the exact vorticity results, while the right two panels present the predictions obtained using the proposed PhysicsFormer model. The PhysicsFormer predictions demonstrate good agreement with the exact vorticity in both clean and noisy data scenarios. The exact vorticity source dataset was reused with permission from th…
Figure 17: Optimized reconstruction in the wake zone of a circular cylinder for the inverse Navier–Stokes problem utilizing 1500 supervised velocity samples. The left two panels show the CFD benchmark streamlines, while the right two panels present the predictions obtained using the proposed PhysicsFormer model for clean and noisy data, respectively. The CFD Benchmark source dataset was reused with permission from the …
Figure 18: The convergence of 𝜆1 and 𝜆2 during training on both clean and noisy data over 5000 epochs illustrates their approach towards the true values; the left (a) plot depicts the convergence of 𝜆1, while the right (b) plot represents the convergence of 𝜆2.
Figure 19: Temporal variations in vorticity fields across a complete oscillation cycle. (a) denotes the vorticity field at the initial reference time 𝑡 = 𝑡0. (b) illustrates the vorticity at 𝑡 = 𝑡0 + 𝑇/2, the midpoint of a complete period (𝑇), depicting the inverted configuration of the flow. (c) exhibits the vorticity at 𝑡 = 𝑡0 + 𝑇, concluding the complete oscillation period and demonstrating a mirror image of …
Original abstract

Traditional computational fluid dynamics and physics-informed neural networks (PINNs) often suffer from high computational cost, mesh sensitivity, and reduced accuracy for strongly nonlinear and time-dependent flows. To address these limitations, we propose \textit{PhysicsFormer}, a simple and efficient Transformer-based physics-informed neural network framework for complex fluid flow simulations. The proposed architecture employs encoder--decoder multi-head attention to capture long-range temporal dependencies and enhance spatio-temporal information propagation. Unlike conventional multilayer perceptron-based PINNs, \textit{PhysicsFormer} utilizes pseudo-sequential spatio-temporal representations together with a dynamics-weighted loss formulation to improve convergence, stability, and predictive accuracy. Owing to its lightweight architecture and parallel learning strategy, the proposed framework achieves faster training and lower computational cost than existing Transformer-based PINN models. The performance of the proposed framework is demonstrated on the convection equation, Burgers' equation, lid-driven cavity flow at $Re=100$, and inverse Navier--Stokes and flow reconstruction problems for flow past a circular cylinder at $Re=100$ and $Re=3900$. For the inverse Navier--Stokes problem at $Re=100$, the proposed framework simultaneously reconstructs the flow field and identifies governing equation parameters with nearly $0\%$ absolute error under both clean and noisy data conditions. Furthermore, for the high-Reynolds-number case at $Re=3900$, \textit{PhysicsFormer} accurately reconstructs the velocity and pressure fields using only $25$ spatial measurements per snapshot over $100$ temporal snapshots. The obtained results demonstrate that \textit{PhysicsFormer} provides an accurate, robust, and computationally efficient framework for complex time-dependent fluid flow problems.
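The abstract does not spell out how the "pseudo-sequential spatio-temporal representations" are built. One common construction in Transformer-based PINNs such as PINNsFormer expands each collocation point into a short sequence of time-shifted copies. The sketch below assumes that construction; the function name, the sequence length `k`, and the step `dt` are hypothetical and may differ from what PhysicsFormer does.

```python
from typing import List, Tuple


def pseudo_sequence(x: float, t: float, k: int = 5, dt: float = 1e-3) -> List[Tuple[float, float]]:
    """Expand one point (x, t) into k copies shifted forward in time by dt each.

    Assumption: a PINNsFormer-style expansion; k and dt are illustrative defaults.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [(x, t + i * dt) for i in range(k)]


# One spatial location at t = 1.0 becomes a 4-step sequence the attention layers can read.
for point in pseudo_sequence(0.5, 1.0, k=4, dt=0.25):
    print(point)
```

The appeal of this design is that a pointwise input, which has no natural ordering, acquires a short temporal context without any extra data.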

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript presents PhysicsFormer, a Transformer-based physics-informed neural network for incompressible Navier-Stokes equations. It utilizes an encoder-decoder multi-head attention mechanism to capture long-range temporal dependencies and employs pseudo-sequential spatio-temporal representations with a dynamics-weighted loss to enhance convergence and accuracy. The approach is demonstrated on benchmark problems including the convection equation, Burgers' equation, lid-driven cavity flow at Re=100, and inverse flow reconstruction for a circular cylinder at Re=100 and Re=3900, with claims of near-zero error in parameter identification for the inverse problem at Re=100 even with noisy data and accurate reconstruction using sparse measurements at high Re.
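The sparse-measurement setting in the summary (25 spatial points per snapshot over 100 snapshots, per the abstract) can be sketched as follows, together with the relative L2 error commonly used to score such reconstructions. Everything geometric here is an assumption for illustration: the domain bounds, a cylinder of radius 0.5 centred at the origin, the boundary margin, and resampling the sensor locations for every snapshot.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def sample_sensors(n_points: int, x_range: Tuple[float, float], y_range: Tuple[float, float],
                   radius: float, margin: float, rng: random.Random) -> List[Point]:
    """Rejection-sample points uniformly in the box, away from the cylinder and the edges."""
    points: List[Point] = []
    while len(points) < n_points:
        x = rng.uniform(x_range[0] + margin, x_range[1] - margin)
        y = rng.uniform(y_range[0] + margin, y_range[1] - margin)
        if math.hypot(x, y) > radius + margin:  # cylinder assumed centred at the origin
            points.append((x, y))
    return points


def relative_l2(pred: Sequence[float], ref: Sequence[float]) -> float:
    """Relative L2 error ||pred - ref|| / ||ref||."""
    num = math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)))
    den = math.sqrt(sum(r ** 2 for r in ref))
    return num / den


rng = random.Random(0)  # fixed seed so the sensor layout is reproducible
snapshots = [
    sample_sensors(25, (-5.0, 15.0), (-5.0, 5.0), radius=0.5, margin=0.1, rng=rng)
    for _ in range(100)
]
print(sum(len(s) for s in snapshots))  # 25 points x 100 snapshots
```

Repeating the whole procedure with several seeds, and reporting the spread of `relative_l2` across runs, is one straightforward way to obtain error bars for a setup like this.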

Significance. Should the numerical results prove reproducible and generalizable, the work could contribute to the development of more efficient PINN architectures for complex fluid flows by leveraging Transformer attention mechanisms. The lightweight design and parallel learning strategy are noted strengths that could reduce computational costs compared to existing models. However, the absence of detailed ablation studies and baseline comparisons limits the immediate impact assessment.

major comments (3)
  1. Abstract: The claim that the framework identifies governing equation parameters with nearly 0% absolute error for the inverse Navier-Stokes problem at Re=100 under both clean and noisy data is load-bearing but lacks supporting evidence in the form of the explicit dynamics-weighted loss formulation or sensitivity analysis to noise levels and weighting parameters.
  2. Results (high-Re case): For the Re=3900 cylinder flow reconstruction using only 25 spatial points per snapshot over 100 temporal snapshots, the manuscript does not provide error bars, comparisons to standard PINN baselines, or details on data exclusion criteria, which undermines the robustness claim for strongly nonlinear flows.
  3. Method (dynamics-weighted loss): The dynamics-weighted loss formulation is central to the stability and accuracy claims but is not accompanied by the weighting schedule, relative coefficients between PDE and data terms, or ablation studies, raising concerns about whether the near-zero errors are due to implicit tuning rather than the architecture.
minor comments (1)
  1. Abstract: Consider adding a brief mention of the specific Transformer architecture details, such as number of layers or attention heads, for better context.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed review of our manuscript. We have carefully addressed each major comment below with point-by-point responses. Where revisions are warranted, we will update the manuscript to improve clarity, robustness, and completeness of the presented results.

Point-by-point responses
  1. Referee: Abstract: The claim that the framework identifies governing equation parameters with nearly 0% absolute error for the inverse Navier-Stokes problem at Re=100 under both clean and noisy data is load-bearing but lacks supporting evidence in the form of the explicit dynamics-weighted loss formulation or sensitivity analysis to noise levels and weighting parameters.

    Authors: We appreciate the referee's emphasis on supporting evidence for this key claim. The explicit formulation of the dynamics-weighted loss appears in Section 3.2, where the weights are computed dynamically as the inverse of the exponential moving average of each loss component to automatically balance the PDE residual and data fidelity terms. The reported near-zero absolute errors for parameter recovery (viscosity and other coefficients) under clean and noisy conditions are shown quantitatively in the results for the inverse problem. That said, we agree that a dedicated sensitivity study to noise amplitude and weighting hyperparameters would strengthen the presentation. We will add this analysis, including plots for noise levels between 1% and 10%, in a new appendix of the revised manuscript. revision: yes

  2. Referee: Results (high-Re case): For the Re=3900 cylinder flow reconstruction using only 25 spatial points per snapshot over 100 temporal snapshots, the manuscript does not provide error bars, comparisons to standard PINN baselines, or details on data exclusion criteria, which undermines the robustness claim for strongly nonlinear flows.

    Authors: We thank the referee for identifying these gaps in the high-Re results. The current manuscript reports L2 relative errors for velocity and pressure but does not include variability across runs or direct baseline comparisons. We will augment the results section with error bars computed from multiple independent trainings and add a side-by-side comparison against a standard MLP-based PINN using the same data and loss settings. Regarding data selection, the 25 points per snapshot were drawn uniformly at random from the interior domain (excluding the cylinder surface and far-field boundaries); we will state this selection procedure explicitly in the revised Methods and figure captions. revision: yes

  3. Referee: Method (dynamics-weighted loss): The dynamics-weighted loss formulation is central to the stability and accuracy claims but is not accompanied by the weighting schedule, relative coefficients between PDE and data terms, or ablation studies, raising concerns about whether the near-zero errors are due to implicit tuning rather than the architecture.

    Authors: We acknowledge that the weighting schedule and coefficient choices merit more explicit documentation. In the manuscript the PDE weight follows a linear ramp from a small initial value to unity over the first 2000 epochs, while the data weight remains fixed at unity; these choices were selected to ensure early data-driven fitting before enforcing the physics constraints. Although we performed limited internal checks on the weighting parameters, we did not report a full ablation study. We will expand the Methods section with the precise schedule and coefficient values and include a concise ablation table (in the main text or supplementary material) that varies the initial PDE weight and ramp duration to demonstrate that the reported accuracy stems from the combination of architecture and loss rather than from hidden hyperparameter tuning alone. revision: partial
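The responses above describe two weighting mechanisms: weights equal to the inverse of an exponential moving average (EMA) of each loss term (response 1), and a PDE weight ramped linearly to unity over the first 2000 epochs with the data weight fixed at unity (response 3). How, or whether, the two are combined is not stated, so the sketch below shows each on its own. The class and function names, the EMA decay, the epsilon, and the initial ramp value of 0.01 are assumptions, not values from the paper.

```python
from typing import Dict


class InverseEmaWeights:
    """Weight each loss term by the inverse of its exponential moving average."""

    def __init__(self, decay: float = 0.9, eps: float = 1e-8) -> None:
        self.decay = decay  # hypothetical smoothing factor
        self.eps = eps      # guards against division by zero
        self.ema: Dict[str, float] = {}

    def update(self, losses: Dict[str, float]) -> Dict[str, float]:
        for name, value in losses.items():
            prev = self.ema.get(name, value)  # seed the average with the first value
            self.ema[name] = self.decay * prev + (1.0 - self.decay) * value
        return {name: 1.0 / (self.ema[name] + self.eps) for name in losses}


def ramped_pde_weight(epoch: int, w_start: float = 0.01, ramp_epochs: int = 2000) -> float:
    """Linear ramp from w_start (hypothetical) to 1.0 over ramp_epochs, then constant."""
    if ramp_epochs <= 0:
        raise ValueError("ramp_epochs must be positive")
    frac = min(max(epoch / ramp_epochs, 0.0), 1.0)
    return w_start + (1.0 - w_start) * frac


balancer = InverseEmaWeights(decay=0.5)
print(balancer.update({"pde": 4.0, "data": 1.0}))  # the larger loss gets the smaller weight
print([round(ramped_pde_weight(e), 3) for e in (0, 1000, 2000, 5000)])
```

In an actual training loop the EMA-based weights would normally be treated as constants when differentiating the loss, so that the optimizer cannot reduce the objective simply by inflating a loss term to shrink its weight.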

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on standard incompressible Navier-Stokes assumptions and neural network optimization practices; no new entities are postulated.

axioms (1)
  • domain assumption: Incompressible Navier-Stokes equations accurately describe the target flows at the tested Reynolds numbers.
    Invoked throughout the problem statements and loss formulation.
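For reference, the standard nondimensional form of the incompressible Navier-Stokes equations that this assumption refers to is given below. The paper's exact scaling is not reproduced on this page, so the choice of reference velocity U and length L (for example, free-stream speed and cylinder diameter) is an assumption.

```latex
\begin{aligned}
\nabla \cdot \mathbf{u} &= 0, \\
\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\,\mathbf{u}
  &= -\nabla p + \frac{1}{Re}\,\nabla^{2}\mathbf{u},
\qquad Re = \frac{U L}{\nu}.
\end{aligned}
```

Here \mathbf{u} is the velocity, p the pressure, and \nu the kinematic viscosity. A PINN enforces both equations by penalizing their residuals at collocation points.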

pith-pipeline@v0.9.0 · 5848 in / 1117 out tokens · 52320 ms · 2026-05-21T16:58:21.394702+00:00 · methodology

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · 4 internal anchors

  1. Ding, H., Shu, C., Yeo, K.S. and Xu, D. (2004). Simulation of incompressible viscous flows past a circular cylinder by hybrid FD scheme and meshless least square-based finite difference method. Computer Methods in Applied Mechanics and Engineering, 193(9–11), 727–744.
  2. Liu, F. and Zheng, X. (1996). A strongly coupled time-marching method for solving the Navier–Stokes and 𝑘–𝜔 turbulence model equations with multigrid. Journal of Computational Physics, 128(2), 289–300.
  3. Lucia, D.J., Beran, P.S. and Silva, W.A. (2004). Reduced-order modeling: new approaches for computational physics. Progress in Aerospace Sciences, 40(1–2), 51–117.
  4. Henshaw, M.D.C., Badcock, K.J., Vio, G.A., Allen, C.B., Chamberlain, J., Kaynes, I., Dimitriadis, G., Cooper, J.E., Woodgate, M.A., Rampurawala, A.M. and Jones, D. (2007). Non-linear aeroelastic prediction for aircraft applications. Progress in Aerospace Sciences, 43(4–6), 65–137.
  5. Jovanović, M.R., Schmid, P.J. and Nichols, J.W. (2014). Sparsity-promoting dynamic mode decomposition. Physics of Fluids, 26(2).
  6. Hemati, M.S., Williams, M.O. and Rowley, C.W. (2014). Dynamic mode decomposition for large and streaming datasets. Physics of Fluids, 26(11).
  7. Hemmasian, A. and Barati Farimani, A. (2023). Reduced-order modeling of fluid flows with transformers. Physics of Fluids, 35(5).
  8. Lagaris, I.E., Likas, A. and Fotiadis, D.I. (1998). Artificial neural networks for solving ordinary and partial differential equations. IEEE Transactions on Neural Networks, 9(5), 987–1000.
  9. Raissi, M., Perdikaris, P. and Karniadakis, G.E. (2019). Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378, 686–707.
  10. Raissi, M. (2018). Deep hidden physics models: Deep learning of nonlinear partial differential equations. Journal of Machine Learning Research, 19(25), 1–24.
  11. Fuks, O. and Tchelepi, H.A. (2020). Limitations of physics-informed machine learning for nonlinear two-phase transport in porous media. Journal of Machine Learning for Modeling and Computing, 1(1).
  12. Krishnapriyan, A., Gholami, A., Zhe, S., Kirby, R. and Mahoney, M.W. (2021). Characterizing possible failure modes in physics-informed neural networks. Advances in Neural Information Processing Systems, 34, 26548–26560.
  13. Wang, S., Yu, X. and Perdikaris, P. (2022). When and why PINNs fail to train: A neural tangent kernel perspective. Journal of Computational Physics, 449, 110768.
  14. Raissi, M., Perdikaris, P. and Karniadakis, G.E. (2017). Physics-informed deep learning (Part I): Data-driven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561.
  15. Zhu, Y., Zabaras, N., Koutsourelakis, P.S. and Perdikaris, P. (2019). Physics-constrained deep learning for high-dimensional surrogate modeling and uncertainty quantification without labeled data. Journal of Computational Physics, 394, 56–81.
  16. Chen, Z., Liu, Y. and Sun, H. (2021). Physics-informed learning of governing equations from scarce data. Nature Communications, 12(1), 6136.
  17. Mao, Z., Jagtap, A.D. and Karniadakis, G.E. (2020). Physics-informed neural networks for high-speed flows. Computer Methods in Applied Mechanics and Engineering, 360, 112789.
  18. Wang, S., Teng, Y. and Perdikaris, P. (2021). Understanding and mitigating gradient flow pathologies in physics-informed neural networks. SIAM Journal on Scientific Computing, 43(5), A3055–A3081.
  19. Huebner, K.H., Dewhirst, D.L., Smith, D.E. and Byrom, T.G. (2001). The Finite Element Method for Engineers. John Wiley & Sons.
  20. Li, Z., Kovachki, N., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A. and Anandkumar, A. (2020). Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895.
  21. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł. and Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
  22. Zhao, Z., Ding, X. and Prakash, B.A. (2023). PINNsFormer: A transformer-based framework for physics-informed neural networks. arXiv preprint arXiv:2307.11833.
  23. Zhu, Z., Huang, Y. and Liu, L. (2025). PhysicsSolver: Transformer-enhanced physics-informed neural networks for forward and forecasting problems in partial differential equations. arXiv preprint arXiv:2502.19290.
  24. Sod, G. A. (1978). A survey of several finite difference methods for systems of nonlinear hyperbolic conservation laws. Journal of Computational Physics, 27(1), 1–31.
  25. Ciarlet, P. G. and Lions, J. L. (1990). Handbook of Numerical Analysis (Vol. 11). Gulf Professional Publishing.
  26. Umetani, N. and Bickel, B. (2018). Learning three-dimensional flow for interactive aerodynamic design. ACM Transactions on Graphics (TOG), 37(4), 1–10.
  27. Yu, B. (2018). The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems. Communications in Mathematics and Statistics, 6(1), 1–12.
  28. Jin, S., Ma, Z. and Wu, K. (2023). Asymptotic-preserving neural networks for multiscale kinetic equations. arXiv preprint arXiv:2306.15381.
  29. Liu, L., Wang, Y., Zhu, X. and Zhu, Z. (2025). Asymptotic-preserving neural networks for the semiconductor Boltzmann equation and its application on inverse problems. Journal of Computational Physics, 523, 113669.
  30. Lu, L., Jin, P., Pang, G., Zhang, Z. and Karniadakis, G.E. (2021). Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3), 218–229.
  31. Li, Z., Kovachki, N., Choy, C., Li, B., Kossaifi, J., Otta, S., Nabian, M.A., Stadler, M., Hundt, C., Azizzadenesheli, K. and Anandkumar, A. (2023). Geometry-informed neural operator for large-scale 3D PDEs. Advances in Neural Information Processing Systems, 36, 35836–35854.
  32. Rahman, M.A., Ross, Z.E. and Azizzadenesheli, K. (2022). U-NO: U-shaped neural operators. arXiv preprint arXiv:2204.11127.
  33. Yin, Y., Kirchmeyer, M., Franceschi, J.Y., Rakotomamonjy, A. and Gallinari, P. (2022). Continuous PDE dynamics forecasting with implicit neural representations. arXiv preprint arXiv:2209.14855.
  34. Carleo, G., Cirac, I., Cranmer, K., Daudet, L., Schuld, M., Tishby, N., Vogt-Maranto, L. and Zdeborová, L. (2019). Machine learning and the physical sciences. Reviews of Modern Physics, 91(4), 045002.
  35. Yang, L., Zhang, D. and Karniadakis, G.E. (2020). Physics-informed generative adversarial networks for stochastic differential equations. SIAM Journal on Scientific Computing, 42(1), A292–A317.
  36. Wang, Y., Han, X., Chang, C.Y., Zha, D., Braga-Neto, U. and Hu, X. (2022). Auto-PINN: Understanding and optimizing physics-informed neural architecture. arXiv preprint arXiv:2205.13748.
  37. Cuomo, S., Di Cola, V.S., Giampaolo, F., Rozza, G., Raissi, M. and Piccialli, F. (2022). Scientific machine learning through physics–informed neural networks: Where we are and what’s next. Journal of Scientific Computing, 92(3), 88.
  38. Braga-Neto, L.M.U. (2021). Self-Adaptive Physics-Informed Neural Networks using a Soft Attention Mechanism.
  39. Han, J., Jentzen, A. and E, W. (2018). Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences, 115(34), 8505–8510.
  40. Lou, Q., Meng, X. and Karniadakis, G.E. (2021). Physics-informed neural networks for solving forward and inverse flow problems via the Boltzmann–BGK formulation. Journal of Computational Physics, 447, 110676.
  41. Kalyan, K.S., Rajasekharan, A. and Sangeetha, S. (2021). Ammus: A survey of transformer-based pretrained models in natural language processing. arXiv preprint arXiv:2108.05542.
  42. Dong, L., Xu, S. and Xu, B. (2018, April). Speech-transformer: A no-recurrence sequence-to-sequence model for speech recognition. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 5884–5888). IEEE.
  43. Han, K., Wang, Y., Chen, H., Chen, X., Guo, J., Liu, Z., Tang, Y., Xiao, A., Xu, C., Xu, Y. and Yang, Z. (2022). A survey on vision transformer. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(1), 87–110.
  44. Wen, Q., Zhou, T., Zhang, C., Chen, W., Ma, Z., Yan, J. and Sun, L. (2022). Transformers in time series: A survey. arXiv preprint arXiv:2202.07125.
  45. Cao, S. (2021). Choose a transformer: Fourier or Galerkin. Advances in Neural Information Processing Systems, 34, 24924–24940.
  46. Wu, H., Luo, H., Wang, H., Wang, J. and Long, M. (2024). Transolver: A fast transformer solver for PDEs on general geometries. arXiv preprint arXiv:2402.02366.
  47. Baydin, A.G., Pearlmutter, B.A., Radul, A.A. and Siskind, J.M. (2018). Automatic differentiation in machine learning: A survey. Journal of Machine Learning Research, 18(153), 1–43.
  48. Kingma, D.P. and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
  49. Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2), 251–257.
  50. Devlin, J., Chang, M.W., Lee, K. and Toutanova, K. (2019, June). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171–4186).
  51. Lai, B., Liu, Y. and Wen, X. (2024). Temporal and spatial flow field reconstruction from low-resolution PIV data and pressure probes using physics-informed neural networks. Measurement Science and Technology, 35(6), 065304.
  52. Bu, J. and Karpatne, A. (2021). Quadratic residual networks: A new class of neural networks for solving forward and inverse problems in physics involving PDEs. In Proceedings of the 2021 SIAM International Conference on Data Mining (SDM) (pp. 675–683). Society for Industrial and Applied Mathematics.
  53. Wong, J.C., Ooi, C.C., Gupta, A. and Ong, Y.S. (2022). Learning in sinusoidal spaces with physics-informed neural networks. IEEE Transactions on Artificial Intelligence, 5(3), 985–1000.
  54. Bateman, H. (1915). Some recent researches on the motion of fluids. Monthly Weather Review, 43(4), 163–170.
  55. Burgers, J.M. (1948). A mathematical model illustrating the theory of turbulence. Advances in Applied Mechanics, 1, 171–199.
  56. Liu, D.C. and Nocedal, J. (1989). On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1), 503–528.
  57. Cheng, C. and Zhang, G.T. (2021). Deep learning method based on physics-informed neural network with ResNet block for solving fluid flow problems. Water, 13(4), 423.
  58. Mienye, I.D., Swart, T.G. and Obaido, G. (2024). Recurrent neural networks: A comprehensive review of architectures, variants, and applications. Information, 15(9), 517.