Attention mechanism for scalable mesh-based neural surrogates of free-surface fluids

Federico Lanteri; Massimiliano Cremonesi

arxiv: 2606.23251 · v1 · pith:V5MQGSZ4new · submitted 2026-06-22 · 💻 cs.CE · cs.LG

Pith reviewed 2026-06-26 06:08 UTC · model grok-4.3

keywords: neural surrogate · self-attention · PFEM · free-surface flows · mesh discretization · non-Newtonian fluids · scalable modeling

The pith

Self-attention neural surrogates predict free-surface fluid flows on evolving meshes with improved scalability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces a self-attention architecture to create neural surrogates that approximate the dynamics of free-surface fluid simulations performed with the Particle Finite Element Method. The goal is to achieve faster computations than full PFEM runs while handling evolving geometries and non-Newtonian materials. By operating directly on the mesh nodes, the model maintains the discretization needed for accurate long-term predictions and for calculating derived quantities like stresses using existing finite element tools. Readers interested in engineering simulations would care because it offers a way to run many scenarios or longer times without the full computational burden of traditional methods.

Core claim

The self-attention-based neural surrogate accurately predicts transient dynamics and final configurations of free-surface flows with significantly improved scalability while preserving the PFEM mesh discretization and enabling reconstruction of derived mechanical quantities.

What carries the argument

Self-attention layers that model interactions between nodes on the PFEM mesh to capture spatial dependencies in evolving free-surface flows.
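
As a rough illustration of what "interactions between nodes" means here, the sketch below applies single-head scaled dot-product self-attention to a handful of mesh nodes. It is a dependency-free toy, not the paper's architecture: the feature layout (position plus velocity), the identity projections, and the names `self_attention`, `nodes`, and `identity` are all assumptions made for the example.

```python
import math

def softmax(row):
    """Numerically stable softmax of a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over mesh nodes.

    x: N x d_in nodal features; w_q, w_k, w_v: d_in x d projection matrices.
    Every node attends to every other node, so the weight matrix is N x N.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, v), weights

# Hypothetical toy input: 3 nodes with features (x, y, v_x, v_y)
nodes = [[0.0, 0.0, 1.0, 0.0],
         [1.0, 0.0, 0.5, 0.0],
         [0.0, 1.0, 0.0, -0.2]]
# Identity projections stand in for learned weights
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
out, attn = self_attention(nodes, identity, identity, identity)
```

Each row of `attn` sums to one, and the N x N size of that matrix is what makes the standard formulation expensive on large meshes.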

If this is right

Accurate predictions hold for two- and three-dimensional benchmarks with varying material parameters and non-Newtonian fluids.
The linear attention variant reduces computational cost and improves scalability over standard self-attention.
Mesh preservation allows direct reconstruction of stress fields and other mechanical quantities via standard finite element operators.
Long-term stability during rollouts on changing geometries improves without additional remeshing rules.
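
The scalability point can be made concrete with a small sketch. Once the softmax is dropped, the product (Q Kᵀ) V can be regrouped as Q (Kᵀ V), which replaces the N x N score matrix with a d x d summary. The 1/N scaling and the absence of any feature map or normalization layer are assumptions chosen for illustration; the paper's exact linear variant is not described on this page.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def quadratic_order(q, k, v):
    """(Q K^T) V / N: builds the N x N score matrix, cost O(N^2 d)."""
    n = len(q)
    scores = matmul(q, transpose(k))          # N x N
    return [[s / n for s in row] for row in matmul(scores, v)]

def linear_order(q, k, v):
    """Q (K^T V) / N: builds a d x d summary first, cost O(N d^2)."""
    n = len(q)
    kv = matmul(transpose(k), v)              # d x d, independent of N
    return [[s / n for s in row] for row in matmul(q, kv)]

# Hypothetical toy data: N = 4 nodes, d = 2 features
q = [[1.0, 2.0], [0.5, -1.0], [0.0, 3.0], [2.0, 1.0]]
k = [[0.2, 0.1], [1.0, 0.0], [-0.5, 0.5], [0.3, 0.3]]
v = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [-1.0, 0.5]]
a = quadratic_order(q, k, v)
b = linear_order(q, k, v)
max_diff = max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Both orderings give the same result up to floating-point rounding; only the second avoids a cost that grows quadratically with the number of nodes.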

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Such surrogates might be combined with optimization loops for design of fluid-handling structures.
The approach could apply to other mesh-based Lagrangian simulations beyond free-surface flows.
Future work might test generalization to unseen rheologies or larger domains not in the training set.

Load-bearing premise

Attention layers reliably capture spatial dependencies for stable long-term rollouts on arbitrary evolving meshes without needing extra constraints.

What would settle it

Running the surrogate on a long-duration 3D simulation of a non-Newtonian free-surface flow and comparing the predicted mesh evolution and quantities against a full PFEM reference solution to check for accumulating errors or instability.
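
A minimal sketch of such a comparison, under stated assumptions: surrogate and reference share the same node ordering at every step (which remeshing may break in practice), and the relative L2 norm is the error measure. The function names and the toy trajectories are hypothetical and are not results from the paper.

```python
import math

def relative_l2_error(pred, ref):
    """Relative L2 error between two equally ordered lists of nodal vectors."""
    num = sum((p - r) ** 2 for pv, rv in zip(pred, ref) for p, r in zip(pv, rv))
    den = sum(r ** 2 for rv in ref for r in rv)
    return math.sqrt(num / den)

def rollout_errors(pred_steps, ref_steps):
    """Per-step error curve; a steadily growing curve signals error accumulation."""
    return [relative_l2_error(p, r) for p, r in zip(pred_steps, ref_steps)]

def front_position(nodes):
    """Largest x-coordinate, a geometry-based measure needing no node matching."""
    return max(x for x, _ in nodes)

# Toy trajectories (made up): the predicted front drifts ahead of the reference
ref_steps = [[[1.0, 0.0], [0.0, 1.0]],
             [[2.0, 0.0], [0.0, 1.0]],
             [[3.0, 0.0], [0.0, 1.0]]]
pred_steps = [[[1.0, 0.0], [0.0, 1.0]],
              [[2.1, 0.0], [0.0, 1.0]],
              [[3.3, 0.0], [0.0, 1.0]]]
errors = rollout_errors(pred_steps, ref_steps)
front_gap = [front_position(p) - front_position(r)
             for p, r in zip(pred_steps, ref_steps)]
```

When node correspondence is lost after remeshing, a geometric quantity such as the front position is the safer comparison.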

Figures

Figures reproduced from arXiv: 2606.23251 by Federico Lanteri, Massimiliano Cremonesi.

**Figure 2.** A schematic of the GNN architecture during the prediction phase. The model leverages mesh connectivity to …

**Figure 3.** Standard self-attention architecture during the prediction phase. Here, …

**Figure 4.** Linear self-attention architecture during the prediction phase. By removing the softmax nonlinearity, attention …

**Figure 5.** 2D inclined plane benchmark: initial configuration (left) and final equilibrium state (right).

**Figure 6.** Temporal snapshots of NeuralPFEM with standard self-attention predictions (top row of each pair) and …

**Figure 7.** Comparison of the normalized runout evolution over time obtained with PFEM ( …

**Figure 8.** 3D Bingham cone slump experiment: initial configuration (left) and final equilibrium state (right).

**Figure 9.** Temporal snapshots of NeuralPFEM with standard self-attention predictions (top row of each pair) and …

**Figure 10.** 3D casting experiment: the fluid, initially suspended above the container, is released and flows under gravity …

**Figure 11.** Temporal snapshots of NeuralPFEM with linear self-attention predictions and reference PFEM solutions …

Original abstract

High-fidelity simulations of free-surface flows using Lagrangian methods such as the Particle Finite Element Method (PFEM) are computationally demanding due to continuous domain updates and repeated solution of the governing equations. This challenge is further amplified by non-Newtonian rheologies, where material nonlinearities increase computational cost. These limitations motivate the development of efficient surrogate models to approximate PFEM dynamics at reduced cost. While data-driven deep learning approaches are promising, a key challenge is designing models that operate on arbitrary and evolving geometries. We propose a self-attention-based neural surrogate for PFEM simulations of free-surface flows. The architecture leverages attention mechanisms to model node interactions and capture complex spatial dependencies, while preserving the PFEM mesh discretization. This provides a geometric and topological framework for remeshing and node redistribution, maintaining high-quality spatial discretization during rollouts, improving long-term stability, and enabling reconstruction of derived mechanical quantities via standard finite element operators. Two attention formulations are considered: a standard self-attention mechanism and a linear variant that reduces computational cost and improves scalability. The models are evaluated on two- and three-dimensional free-surface flow benchmarks with evolving geometries, varying material parameters, and non-Newtonian fluids. Results show accurate prediction of transient dynamics and final configurations, with significantly improved scalability. The mesh-based formulation also enables direct reconstruction of quantities such as stress fields. Overall, the framework provides an accurate and scalable surrogate strategy for PFEM simulations in engineering-scale applications.
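
To illustrate what reconstruction "via standard finite element operators" can look like, the sketch below recovers a velocity gradient and a deviatoric stress on one linear (P1) triangle from nodal velocities. It assumes a Newtonian law with constant viscosity `mu` purely for simplicity; a non-Newtonian model would replace that constant with a shear-rate-dependent viscosity. The function names and test values are hypothetical and are not taken from the paper.

```python
def p1_velocity_gradient(xy, vel):
    """Constant velocity gradient on a linear (P1) triangle.

    xy:  three (x, y) vertex coordinates, counter-clockwise.
    vel: three (vx, vy) nodal velocities, e.g. predicted by a surrogate.
    Returns [[dvx/dx, dvx/dy], [dvy/dx, dvy/dy]].
    """
    (x1, y1), (x2, y2), (x3, y3) = xy
    two_area = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    # Gradients of the three linear shape functions
    dndx = [(y2 - y3) / two_area, (y3 - y1) / two_area, (y1 - y2) / two_area]
    dndy = [(x3 - x2) / two_area, (x1 - x3) / two_area, (x2 - x1) / two_area]
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for n, (vx, vy) in enumerate(vel):
        grad[0][0] += dndx[n] * vx
        grad[0][1] += dndy[n] * vx
        grad[1][0] += dndx[n] * vy
        grad[1][1] += dndy[n] * vy
    return grad

def newtonian_deviatoric_stress(grad, mu):
    """tau = 2 * mu * D, with D the symmetric part of the velocity gradient."""
    d = [[0.5 * (grad[i][j] + grad[j][i]) for j in range(2)] for i in range(2)]
    return [[2.0 * mu * d[i][j] for j in range(2)] for i in range(2)]

# Hypothetical check: simple shear v = (gamma * y, 0) on the unit right triangle
gamma, mu = 2.0, 0.5
xy = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vel = [(gamma * y, 0.0) for _, y in xy]
tau = newtonian_deviatoric_stress(p1_velocity_gradient(xy, vel), mu)
```

For simple shear the only non-zero component is the shear stress `mu * gamma`, which is a convenient sanity check. The operator needs element connectivity, which is why keeping the mesh matters for this step.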

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Attention on PFEM meshes is a reasonable new application but the abstract gives no numbers to support the accuracy or long-term stability claims.

The main takeaway is that the paper puts self-attention on top of PFEM meshes to build a surrogate for free-surface flows that keeps the original discretization. This lets them run the model on evolving geometries, do remeshing inside the same framework, and pull out stresses with standard finite-element operators afterward. They also test a linear-attention version for speed and include non-Newtonian cases.

What stands out as useful is the explicit choice to stay inside the PFEM topology instead of moving to a graph or point-cloud representation that would lose the mesh structure. That decision is practical for anyone who already has PFEM code and wants to reuse the existing remeshing and post-processing tools.

The soft spot is the complete absence of quantitative results in the abstract. No error norms, no baseline comparisons against other surrogates or reduced-order models, and no description of how many steps the rollouts actually run before quality degrades. The stress-test worry about attention failing to enforce element quality or conservation on long horizons is still open; if the training data only covers short, well-behaved trajectories, the claimed stability for engineering-scale runs would need stronger evidence than is visible here.

The work is aimed at people already doing mesh-based Lagrangian fluid surrogates who need to keep the discretization intact. A reader in that niche can extract the architectural idea and the motivation even if the numbers are thin.

It is worth sending to peer review so the full experiments can be checked, but the current version sits at the lower end of what most editors would forward without seeing the actual error tables and rollout lengths.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a self-attention-based neural surrogate for Particle Finite Element Method (PFEM) simulations of free-surface flows. It employs standard and linear self-attention layers to model node interactions on arbitrary evolving meshes, preserving the PFEM discretization to support remeshing, long-term stability, and reconstruction of derived mechanical quantities via standard FE operators. The models are evaluated on 2D and 3D benchmarks involving varying material parameters and non-Newtonian fluids, with claims of accurate transient and final-state predictions plus significantly improved scalability.

Significance. If the quantitative claims hold, the work would supply a practical mesh-preserving surrogate that reduces the cost of repeated PFEM solves while retaining the ability to apply standard remeshing and post-process stress/strain fields, which is a concrete advantage over purely particle- or grid-based neural surrogates for engineering free-surface problems.

major comments (2)

[Abstract] Abstract and evaluation description: the central claim that the surrogate 'accurately predicts transient dynamics and final configurations' and 'significantly improved scalability' is asserted without any reported error metrics (e.g., L2 velocity or interface error), baseline comparisons against PFEM or other surrogates, or validation protocols for long-horizon rollouts. This absence directly undermines assessment of the accuracy and stability assertions.
[Architecture / Model description] The architecture description does not specify any explicit mechanism (regularization, loss term, or architectural constraint) that would enforce mesh-quality metrics, volume preservation, or divergence-free conditions on the evolving node sets. Because the skeptic correctly notes that standard self-attention supplies no such guarantees, the claim that the model 'maintains high-quality spatial discretization during rollouts' without post-hoc rules remains unsecured.

minor comments (1)

[Abstract] The abstract is unusually long and contains several forward-looking claims that would be better placed in the conclusions or results summary.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below.

Referee: [Abstract] Abstract and evaluation description: the central claim that the surrogate 'accurately predicts transient dynamics and final configurations' and 'significantly improved scalability' is asserted without any reported error metrics (e.g., L2 velocity or interface error), baseline comparisons against PFEM or other surrogates, or validation protocols for long-horizon rollouts. This absence directly undermines assessment of the accuracy and stability assertions.

Authors: The abstract is a high-level summary. Quantitative error metrics (L2 norms on velocity and interface position), direct PFEM baseline comparisons, and long-horizon rollout statistics are reported in Sections 4 and 5 of the manuscript. We will revise the abstract to include representative quantitative values and a brief statement on the validation protocol. revision: yes
Referee: [Architecture / Model description] The architecture description does not specify any explicit mechanism (regularization, loss term, or architectural constraint) that would enforce mesh-quality metrics, volume preservation, or divergence-free conditions on the evolving node sets. Because the skeptic correctly notes that standard self-attention supplies no such guarantees, the claim that the model 'maintains high-quality spatial discretization during rollouts' without post-hoc rules remains unsecured.

Authors: The surrogate is formulated to output updated nodal positions and velocities on the existing PFEM mesh connectivity. This design choice deliberately retains the PFEM discretization so that the standard PFEM remeshing and node-redistribution algorithms can be applied after each surrogate step exactly as in the original solver. No additional regularization terms for volume or divergence are present in the loss; the network is trained to reproduce the PFEM trajectories, which already satisfy these constraints. The claim of maintained discretization quality therefore rests on the ability to invoke the existing PFEM remeshing machinery rather than on an internal architectural guarantee. revision: partial
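
The loop described in this response can be sketched as follows. Both callables are placeholders: `toy_step` is a free-fall update standing in for the trained surrogate, and `toy_remesh` is a no-op standing in for the PFEM remeshing and node-redistribution routines. Neither reflects the authors' code, and the state layout is an assumption.

```python
def rollout(state, surrogate_step, remesh, n_steps):
    """Autoregressive rollout: surrogate update, then solver-side remeshing.

    state: dict with 'pos' and 'vel' (lists of nodal vectors) and 'cells'
           (element connectivity).
    surrogate_step: hypothetical callable returning new (pos, vel).
    remesh: hypothetical callable standing in for the PFEM remeshing routine;
            it returns a state with rebuilt connectivity.
    """
    history = [state]
    for _ in range(n_steps):
        pos, vel = surrogate_step(state)
        state = remesh({"pos": pos, "vel": vel, "cells": state["cells"]})
        history.append(state)
    return history

# Toy stand-ins so the loop runs; not the paper's model or a real remesher
def toy_step(state, dt=0.1, g=-9.81):
    vel = [(vx, vy + g * dt) for vx, vy in state["vel"]]
    pos = [(x + vx * dt, y + vy * dt)
           for (x, y), (vx, vy) in zip(state["pos"], vel)]
    return pos, vel

def toy_remesh(state):
    return state  # a real PFEM code would rebuild the triangulation here

initial = {"pos": [(0.0, 1.0), (1.0, 1.0), (0.0, 2.0)],
           "vel": [(0.0, 0.0)] * 3,
           "cells": [(0, 1, 2)]}
history = rollout(initial, toy_step, toy_remesh, n_steps=5)
```

Written this way, mesh quality during a rollout is the responsibility of the `remesh` hook rather than of the network, which matches the authors' reading of their own claim.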

Circularity Check

0 steps flagged

No significant circularity; standard data-driven surrogate

The paper describes a neural surrogate architecture (self-attention or linear attention) trained on PFEM simulation trajectories to predict node positions and velocities on evolving meshes. No load-bearing equations, uniqueness theorems, or ansatzes are presented that reduce a claimed prediction to a fitted input or self-citation by construction. Evaluation relies on empirical rollout accuracy against held-out PFEM runs; the mesh-based output enables post-hoc FE operators but does not redefine or tautologically derive any mechanical quantity. This is the expected non-finding for a purely empirical ML modeling paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities; the model is described as a standard neural architecture trained on simulation data.

[1] [1]

Idelsohn, E

S. Idelsohn, E. Oñate, F. D. Pin, The particle finite element method: A powerful tool to solve incompressible flows with free-surfaces and breaking waves, International Journal for Numerical Methods in Engineering 61 (7) (2004) 964–989.doi:10.1002/nme.1096

work page doi:10.1002/nme.1096 2004

[2] [2]

Cremonesi, A

M. Cremonesi, A. Franci, S. Idelsohn, E. Oñate, A State of the Art Review of the Particle Finite Element Method (PFEM), Arch Computat Methods Eng 27 (5) (2020) 1709–1735.doi:10.1007/s11831-020-09468-4

work page doi:10.1007/s11831-020-09468-4 2020

[3]

A. Franci, M. Cremonesi, U. Perego, G. Crosta, E. Oñate, 3D simulation of Vajont disaster. Part 1: Numerical formulation and validation, Engineering Geology 279 (2020) 105854. doi:10.1016/j.enggeo.2020.105854

[4]

J. M. Carbonell, L. Monforte, M. O. Ciantia, M. Arroyo, A. Gens, Geotechnical particle finite element method for modeling of soil-structure interaction under large deformation conditions, Journal of Rock Mechanics and Geotechnical Engineering 14 (3) (2022) 967–983. doi:10.1016/j.jrmge.2021.12.006

[5]

G. Rizzieri, L. Ferrara, M. Cremonesi, Simulation of viscoelastic free-surface flows with the Particle Finite Element Method, Comp. Part. Mech. 11 (5) (2024) 2043–2067. doi:10.1007/s40571-024-00730-1

[6]

G. Rizzieri, L. Ferrara, M. Cremonesi, A partitioned Lagrangian finite element approach for the simulation of viscoelastic and elasto-viscoplastic free-surface flows, Computer Methods in Applied Mechanics and Engineering 443 (2025) 118071. doi:10.1016/j.cma.2025.118071

[7]

S. Idelsohn, M. Mier-Torrecilla, E. Oñate, Multi-fluid flows with the Particle Finite Element Method, Computer Methods in Applied Mechanics and Engineering 198 (33) (2009) 2750–2767. doi:10.1016/j.cma.2009.04.002

[8]

S. Meduri, M. Cremonesi, U. Perego, O. Bettinotti, A. Kurkchubasche, V. Oancea, A partitioned fully explicit Lagrangian finite element method for highly nonlinear fluid-structure interaction problems, International Journal for Numerical Methods in Engineering 113 (1) (2018) 43–64. doi:10.1002/nme.5602

[10]

C. Fu, M. Cremonesi, U. Perego, A hybrid Lagrangian–Eulerian particle finite element method for free-surface and fluid–structure interaction problems, International Journal for Numerical Methods in Engineering 125 (5) (2024) e7402. doi:10.1002/nme.7402

[11]

P. B. Ryzhakov, J. García, E. Oñate, Lagrangian finite element model for the 3D simulation of glass forming processes, Computers & Structures 177 (2016) 126–140. doi:10.1016/j.compstruc.2016.09.007

[12]

G. Rizzieri, L. Ferrara, M. Cremonesi, Numerical simulation of the extrusion and layer deposition processes in 3D concrete printing with the Particle Finite Element Method, Comput Mech 73 (2) (2024) 277–295. doi:10.1007/s00466-023-02367-y

[13]

T. Leyssens, M. Henry, J. Lambrechts, J.-F. Remacle, A Delaunay refinement algorithm for the particle finite element method applied to free surface flows, International Journal for Numerical Methods in Engineering 125 (18) (2024) e7554. doi:10.1002/nme.7554

[14]

A. Quarteroni, A. Manzoni, F. Negri, Reduced Basis Methods for Partial Differential Equations, Vol. 92 of UNITEXT, Springer International Publishing, Cham, 2016. doi:10.1007/978-3-319-15431-2

[15]

M. Beckermann, R. Scanff, M. Cremonesi, A. Barbarulo, A new strategy using the Proper Generalized Decomposition to model time evolving spatial domains, Computers & Structures 316 (2025) 107860. doi:10.1016/j.compstruc.2025.107860

[16]

S. Brivio, S. Fresca, A. Manzoni, Handling geometrical variability in nonlinear reduced order modeling through Continuous Geometry-Aware DL-ROMs, Computer Methods in Applied Mechanics and Engineering 442 (2025) 117989. doi:10.1016/j.cma.2025.117989

[17]

A. Tierz, I. Alfaro, D. González, F. Chinesta, E. Cueto, Graph neural networks informed locally by thermodynamics, Engineering Applications of Artificial Intelligence 144 (2025) 110108. doi:10.1016/j.engappai.2025.110108

[18]

V. Sharma, O. Fink, A physics-informed graph neural network conserving linear and angular momentum for dynamical systems, Nat Commun 17 (1) (2026) 1045. doi:10.1038/s41467-025-67802-5

[19]

A. Sanchez-Gonzalez, J. Godwin, T. Pfaff, R. Ying, J. Leskovec, P. W. Battaglia, Learning to Simulate Complex Physics with Graph Networks (Sep. 2020). arXiv:2002.09405, doi:10.48550/arXiv.2002.09405

[20]

Z. Li, A. B. Farimani, Graph neural network-accelerated Lagrangian fluid simulation, Computers & Graphics 103 (2022) 201–211. doi:10.1016/j.cag.2022.02.004

[21]

S. Zhao, H. Chen, J. Zhao, A physical-information-flow-constrained temporal graph neural network-based simulator for granular materials, Computer Methods in Applied Mechanics and Engineering 433 (2025) 117536. doi:10.1016/j.cma.2024.117536

[22]

Y. Choi, K. Kumar, Graph Neural Network-based surrogate model for granular flows, Computers and Geotechnics 166 (2024) 106015. doi:10.1016/j.compgeo.2023.106015

[23]

Z. Li, K. Meidani, P. Yadav, A. Barati Farimani, Graph neural networks accelerated molecular dynamics, J. Chem. Phys. 156 (14) (2022) 144103. doi:10.1063/5.0083060

[24]

L. Tesán, M. M. Iparraguirre, D. González, P. Martins, E. Cueto, On the under-reaching phenomenon in message passing neural PDE solvers: Revisiting the CFL condition, Computer Methods in Applied Mechanics and Engineering 449 (2026) 118476. doi:10.1016/j.cma.2025.118476

[25]

F. Lanteri, M. Cremonesi, A mesh-based Graph Neural Network approach for surrogate modeling of Lagrangian free surface fluid flows, Computers & Fluids 301 (2025) 106773. doi:10.1016/j.compfluid.2025.106773

[26]

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, I. Polosukhin, Attention is All you Need, in: Advances in Neural Information Processing Systems, Vol. 30, Curran Associates, Inc., 2017

[27]

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, N. Houlsby, An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (Jun. 2021). arXiv:2010.11929, doi:10.48550/arXiv.2010.11929

[28]

J. Jumper, R. Evans, A. Pritzel, T. Green, M. Figurnov, O. Ronneberger, K. Tunyasuvunakool, R. Bates, A. Žídek, A. Potapenko, A. Bridgland, C. Meyer, S. A. A. Kohl, A. J. Ballard, A. Cowie, B. Romera-Paredes, S. Nikolov, R. Jain, J. Adler, T. Back, S. Petersen, D. Reiman, E. Clancy, M. Zielinski, M. Steinegger, M. Pacholska, T. Berghammer, S. Bodenstein, ...

2021

[29]

Z. Li, K. Meidani, A. B. Farimani, Transformer for Partial Differential Equations’ Operator Learning (Apr. 2023). arXiv:2205.13671, doi:10.48550/arXiv.2205.13671

[30]

H. Wu, H. Luo, H. Wang, J. Wang, M. Long, Transolver: A Fast Transformer Solver for PDEs on General Geometries (Jun. 2024). arXiv:2402.02366, doi:10.48550/arXiv.2402.02366

[31]

C. Adams, R. Ranade, R. Cherukuri, S. Choudhry, GeoTransolver: Learning Physics on Irregular Domains Using Multi-scale Geometry Aware Physics Attention Transformer (Dec. 2025). arXiv:2512.20399, doi:10.48550/arXiv.2512.20399

[32]

M. M. Iparraguirre, I. Alfaro, D. Gonzalez, E. Cueto, MeshGraphNet-Transformer: Scalable Mesh-based Learned Simulation for Solid Mechanics (Feb. 2026). arXiv:2601.23177, doi:10.48550/arXiv.2601.23177

[33]

N. Wang, S. Zheng, Y. Chen, H. Zhao, Z. Fang, FluidFormer: Transformer with continuous convolution for particle-based fluid simulation, Neural Networks 198 (2026) 108631. doi:10.1016/j.neunet.2026.108631

[34]

B. Alkin, M. Bleeker, R. Kurle, T. Kronlachner, R. Sonnleitner, M. Dorfer, J. Brandstetter, AB-UPT: Scaling Neural CFD Surrogates for High-Fidelity Automotive Aerodynamics Simulations via Anchored-Branched Universal Physics Transformers (Oct. 2025). arXiv:2502.09692, doi:10.48550/arXiv.2502.09692

[35]

M. Saberi, A. B. Farimani, S. Jamali, RheOFormer: A generative transformer model for simulation of complex fluids and flows (Oct. 2025). arXiv:2510.01365, doi:10.48550/arXiv.2510.01365

[36]

T. Dao, D. Fu, S. Ermon, A. Rudra, C. Ré, FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness, Advances in Neural Information Processing Systems 35 (2022) 16344–16359

[37]

H. Zhou, H. Wu, H. Shangguan, Y. Ma, H. Weng, J. Wang, M. Long, Transolver-3: Scaling Up Transformer Solvers to Industrial-Scale Geometries (Feb. 2026). arXiv:2602.04940, doi:10.48550/arXiv.2602.04940

[38]

B. Alkin, T. Kronlachner, S. Papa, S. Pirker, T. Lichtenegger, J. Brandstetter, NeuralDEM – Real-time Simulation of Industrial Particulate Flows (Feb. 2025). arXiv:2411.09678, doi:10.48550/arXiv.2411.09678

[39]

A. Katharopoulos, A. Vyas, N. Pappas, F. Fleuret, Transformers are RNNs: Fast autoregressive transformers with linear attention, in: Proceedings of the 37th International Conference on Machine Learning, Vol. 119 of ICML’20, JMLR.org, 2020, pp. 5156–5165

[40]

S. Zhuoran, Z. Mingyuan, Z. Haiyu, Y. Shuai, L. Hongsheng, Efficient Attention: Attention with Linear Complexities, in: 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), 2021, pp. 3530–3538. doi:10.1109/WACV48630.2021.00357

[41]

T. C. Papanastasiou, Flows of materials with yield, J. Rheol. 31 (5) (1987) 385–404. doi:10.1122/1.549926

[42]

T. J. R. Hughes, L. P. Franca, M. Balestra, A new finite element formulation for computational fluid dynamics: V. Circumventing the Babuška-Brezzi condition: A stable Petrov-Galerkin formulation of the Stokes problem accommodating equal-order interpolations, Computer Methods in Applied Mechanics and Engineering 59 (1) (1986) 85–99. doi:10.1016/0045-7825(86)90025-3

[43]

E. Oñate, S. R. Idelsohn, F. Del Pin, R. Aubry, The particle finite element method — an overview, Int. J. Comput. Methods 01 (02) (2004) 267–307.doi:10.1142/S0219876204000204


[44]

H. Edelsbrunner, E. P. Mücke, Three-dimensional alpha shapes, in: Proceedings of the 1992 Workshop on Volume Visualization, VVS ’92, Association for Computing Machinery, New York, NY, USA, 1992, pp. 75–82. doi:10.1145/147130.147153

[45]

S. Meduri, M. Cremonesi, U. Perego, An efficient runtime mesh smoothing technique for 3D explicit Lagrangian free-surface fluid flow simulations, International Journal for Numerical Methods in Engineering 117 (4) (2019) 430–452. doi:10.1002/nme.5962

[46]

H. Zhou, S. Cheng, Improving long-term autoregressive spatiotemporal predictions: A proof of concept with fluid dynamics, Computer Methods in Applied Mechanics and Engineering 447 (2025) 118332. doi:10.1016/j.cma.2025.118332

[47]

M. McCabe, P. Harrington, S. Subramanian, J. Brown, Towards Stability of Autoregressive Neural Operators, Transactions on Machine Learning Research (Jun. 2023)

[48]

F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, G. Monfardini, The Graph Neural Network Model, IEEE Transactions on Neural Networks 20 (1) (2009) 61–80. doi:10.1109/TNN.2008.2005605

[49]

P. W. Battaglia, J. B. Hamrick, V. Bapst, A. Sanchez-Gonzalez, V. Zambaldi, M. Malinowski, A. Tacchetti, D. Raposo, A. Santoro, R. Faulkner, C. Gulcehre, F. Song, A. Ballard, J. Gilmer, G. Dahl, A. Vaswani, K. Allen, C. Nash, V. Langston, C. Dyer, N. Heess, D. Wierstra, P. Kohli, M. Botvinick, O. Vinyals, Y. Li, R. Pascanu, Relational inductive biases...

2018

[51]

T. K. Rusch, M. M. Bronstein, S. Mishra, A Survey on Oversmoothing in Graph Neural Networks (Mar. 2023). arXiv:2303.10993, doi:10.48550/arXiv.2303.10993

[52]

Z. Li, N. B. Kovachki, C. Choy, B. Li, J. Kossaifi, S. P. Otta, M. A. Nabian, M. Stadler, C. Hundt, K. Azizzadenesheli, A. Anandkumar, Geometry-informed neural operator for large-scale 3D PDEs, in: Proceedings of the 37th International Conference on Neural Information Processing Systems, NIPS ’23, Curran Associates Inc., Red Hook, NY, USA, 2023, pp. 35...

[53]

C. K. Joshi, Transformers are Graph Neural Networks (Jun. 2025). arXiv:2506.22084, doi:10.48550/arXiv.2506.22084
