
arXiv: 2606.10909 · v1 · pith:S7YC2XDFnew · submitted 2026-06-09 · 💻 cs.CE · cs.LG · physics.comp-ph

Non-linear mechanical field reconstruction coupling recurrent neural networks with physics-informed graph neural networks

Pith reviewed 2026-06-27 11:18 UTC · model grok-4.3

classification 💻 cs.CE · cs.LG · physics.comp-ph
keywords graph neural networks · LSTM · stress field reconstruction · elasto-plasticity · physics-informed neural networks · mesh-agnostic models · multi-scale simulation · nonlinear mechanics

The pith

A coupled LSTM-GNN reconstructs local stress fields under nonlinear history-dependent loading three orders of magnitude faster than finite elements while generalizing to longer sequences and different meshes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a framework that pairs an LSTM, which encodes path-dependent macroscopic stress-strain histories, with a physics-informed GNN that outputs the spatially resolved stress field at each step. A relative weighting scheme with linear warm-up balances the data reconstruction loss against a discrete divergence-based equilibrium penalty; the authors report that fixed-weight formulations fail to converge in the elasto-plastic regime because of a scale mismatch between the two terms. Trained on 10,000 non-proportional paths for a periodic plate-with-a-hole under von Mises plasticity, the surrogate reproduces high-fidelity quad-element finite-element fields. Because the GNN operates on mesh connectivity, the same trained model applies without retraining to coarser or finer meshes and to triangular elements. A reader would care because local stress field reconstruction is a major computational bottleneck in multi-scale modeling of heterogeneous materials under complex loading, and the surrogate performs that step at a small fraction of the finite-element cost.

Core claim

The central claim concerns a surrogate in which an LSTM hidden state encodes the macroscopic loading path and is passed at each time step to a MeshGraphNet-style GNN, whose message-passing layers are regularized by a divergence-based equilibrium penalty. According to the paper, this surrogate matches finite-element stress fields to high accuracy, runs three orders of magnitude faster, and generalizes to loading sequences twice the training length with 1.9 percent cumulative error. It also transfers directly to meshes of different element type and resolution, because the architecture depends only on nodal connectivity rather than element shape functions.
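The data flow in this claim can be sketched in a few lines. The block below is only an illustration under stated assumptions: it uses a single-unit LSTM cell with the standard gate equations (the paper's figure captions describe a two-layer stacked LSTM with 64 hidden units), the weights and strain values are made up, and appending the hidden state to each node's coordinates is an assumed conditioning scheme, not one confirmed by the text reproduced here.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_cell(x, h, c, w):
    """One step of a single-unit LSTM cell (standard gate equations)."""
    i = sigmoid(w["wi"] * x + w["ui"] * h + w["bi"])    # input gate
    f = sigmoid(w["wf"] * x + w["uf"] * h + w["bf"])    # forget gate
    o = sigmoid(w["wo"] * x + w["uo"] * h + w["bo"])    # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])  # candidate cell value
    c_new = f * c + i * g          # cell state accumulates the loading history
    h_new = o * math.tanh(c_new)   # hidden state handed to the spatial model
    return h_new, c_new


def node_inputs_per_step(strain_history, node_coords, w):
    """For each time step, append the current hidden state to every node's coordinates."""
    h, c = 0.0, 0.0
    per_step = []
    for strain in strain_history:
        h, c = lstm_cell(strain, h, c, w)
        per_step.append([list(xy) + [h] for xy in node_coords])
    return per_step


# Hypothetical weights and a toy load-unload path; none of these values come from the paper.
keys = ("wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wg", "ug", "bg")
weights = {k: 0.5 for k in keys}
history = [0.0, 0.01, 0.02, 0.01, 0.0]
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
inputs = node_inputs_per_step(history, coords, weights)
```

The first and last strain values are both zero, yet the hidden state differs between those steps, which is the path dependence the LSTM is meant to carry. A GNN would then map each entry of `inputs` to a nodal stress field.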

What carries the argument

The coupled LSTM-GNN with relative weighting and linear warm-up of the discrete divergence-based equilibrium penalty.
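One way to read "relative weighting with linear warm-up" is sketched below. The exact formula used by the authors is not given on this page, so the function names, the `target_ratio` parameter, and the choice to scale the penalty relative to the current data loss are all assumptions made for illustration.

```python
def warmup_factor(step, warmup_steps):
    """Linear ramp from 0 to 1 over warmup_steps, then constant at 1."""
    if warmup_steps <= 0:
        return 1.0
    return min(1.0, max(0.0, step / warmup_steps))


def combined_loss(data_loss, div_penalty, step, warmup_steps=1000,
                  target_ratio=0.1, eps=1e-12):
    """Data loss plus an equilibrium penalty whose weight is set relative to the data loss.

    After warm-up, the weighted penalty contributes target_ratio * data_loss,
    whatever the raw magnitude of div_penalty is. In an autodiff framework the
    relative weight would normally be treated as a constant (no gradient).
    """
    relative_weight = target_ratio * data_loss / (div_penalty + eps)
    lam = warmup_factor(step, warmup_steps) * relative_weight
    return data_loss + lam * div_penalty, lam


# Hypothetical magnitudes: the raw penalty is 2000 times larger than the data loss.
total_start, lam_start = combined_loss(2.0, 4000.0, step=0)
total_end, lam_end = combined_loss(2.0, 4000.0, step=1000)
```

With these toy numbers the penalty is switched off at step 0 and contributes one tenth of the data loss once the ramp has finished, which is the kind of scale balancing a fixed weight cannot provide when the two terms differ by orders of magnitude.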

If this is right

  • The surrogate delivers three orders of magnitude speedup over finite-element simulation for the same microstructure and constitutive model.
  • Loading sequences up to twice the training length are reproduced with 1.9 percent cumulative error.
  • A single trained model reproduces the high-fidelity quad-element solution on meshes with different element types and on both coarser and finer resolutions.
  • LSTM hidden states exhibit a low-dimensional structure correlated with the internal variables of the underlying constitutive model.
  • Message passing on mesh connectivity renders the reconstruction mesh-agnostic without retraining.
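The mesh-agnostic point in the last bullet can be illustrated with a toy example. The sketch below is not the authors' MeshGraphNet: it replaces learned message and update functions with a plain mean over neighbours, and the helper names are hypothetical. It only shows that the same code runs on a quad element and on the two triangles obtained by splitting it, because it consumes nothing but an edge list.

```python
def edges_from_elements(elements):
    """Collect undirected edges from element boundary loops (tri, quad, or any polygon)."""
    edges = set()
    for element in elements:
        n = len(element)
        for k in range(n):
            a, b = element[k], element[(k + 1) % n]
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)


def mean_message_passing(node_values, edges):
    """One round: each node's value becomes the mean of itself and its neighbours."""
    neighbours = {i: [] for i in range(len(node_values))}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    updated = []
    for i, value in enumerate(node_values):
        group = [value] + [node_values[j] for j in neighbours[i]]
        updated.append(sum(group) / len(group))
    return updated


quad_mesh = [(0, 1, 2, 3)]            # one quadrilateral
tri_mesh = [(0, 1, 2), (0, 2, 3)]     # same nodes, split along the 0-2 diagonal
values = [1.0, 0.0, 0.0, 0.0]

quad_edges = edges_from_elements(quad_mesh)
tri_edges = edges_from_elements(tri_mesh)
quad_out = mean_message_passing(values, quad_edges)
tri_out = mean_message_passing(values, tri_edges)
```

The triangulated mesh has one extra edge (the diagonal), so information from node 0 reaches node 2 in a single round there, while on the quad it needs two. The operator itself is unchanged, which is the sense in which connectivity-based models transfer across element types.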

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The architecture could be inserted into existing multi-scale homogenization loops to replace repeated micro-scale solves.
  • Analysis of the LSTM state trajectory might yield reduced-order constitutive models whose internal variables are directly interpretable.
  • The same coupling could be tested on three-dimensional microstructures or on data drawn from digital image correlation experiments.
  • If the equilibrium penalty generalizes, the method may extend to other history-dependent problems such as viscoplasticity or damage evolution.

Load-bearing premise

The discrete divergence-based equilibrium penalty, when combined with relative weighting and linear warm-up, is sufficient to enforce mechanical equilibrium throughout training and inference in the elasto-plastic regime without introducing systematic bias.
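To make the premise concrete, here is a minimal sketch of a divergence-based equilibrium residual. It assumes quasi-static equilibrium without body forces (div σ = 0) and a regular 2D grid with central differences; the paper's operator acts on an unstructured mesh graph and is not reproduced on this page, so this is a stand-in for illustration only.

```python
def divergence_residual(sxx, syy, sxy, dx, dy):
    """Mean squared norm of div(sigma) at interior grid points (central differences).

    Fields are indexed as field[row][col], with x along columns and y along rows.
    """
    rows, cols = len(sxx), len(sxx[0])
    total, count = 0.0, 0
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # x-equilibrium: d(sxx)/dx + d(sxy)/dy
            rx = ((sxx[r][c + 1] - sxx[r][c - 1]) / (2 * dx)
                  + (sxy[r + 1][c] - sxy[r - 1][c]) / (2 * dy))
            # y-equilibrium: d(sxy)/dx + d(syy)/dy
            ry = ((sxy[r][c + 1] - sxy[r][c - 1]) / (2 * dx)
                  + (syy[r + 1][c] - syy[r - 1][c]) / (2 * dy))
            total += rx * rx + ry * ry
            count += 1
    return total / count


n = 5
xs = [float(c) for c in range(n)]
ys = [float(r) for r in range(n)]
zeros = [[0.0] * n for _ in range(n)]

# Equilibrated toy field: sxx depends only on y, syy only on x, no shear.
sxx_eq = [[ys[r] for c in range(n)] for r in range(n)]
syy_eq = [[xs[c] for c in range(n)] for r in range(n)]
res_eq = divergence_residual(sxx_eq, syy_eq, zeros, 1.0, 1.0)

# Unbalanced toy field: sxx grows with x, so d(sxx)/dx = 1 everywhere.
sxx_bad = [[xs[c] for c in range(n)] for r in range(n)]
res_bad = divergence_residual(sxx_bad, zeros, zeros, 1.0, 1.0)
```

A residual of this kind is what would be monitored to check whether the penalty actually holds the predicted fields close to equilibrium; a penalty term in the loss would be built from the same quantity.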

What would settle it

Run the trained model on a new triangular-element mesh under a non-proportional loading path longer than those in the training data, and compare the reconstructed stress field pointwise against an independent finite-element solution on the identical mesh; systematic deviation above 2 percent cumulative error would falsify the generalization claim.

Figures

Figures reproduced from arXiv: 2606.10909 by Björn Kiefer, Étienne Prulière, Manuel Ricardo Guevara Garban, Martin Abendroth, Michaël Clément, Yves Chemisky.

Figure 1: Implemented two-layer, stacked LSTM architecture, unrolled through time. At each step …
Figure 2: GNN architecture based on the encode-process-decode paradigm. The physical mesh …
Figure 3: Architecture of the coupled LSTM-GNN framework. The LSTM processes the mean strain history to …
Figure 4: Examples of non-proportional strain paths randomly sampled from the test database. Each path consists of …
Figure 5: FE mesh of the plate with central hole (1,512 nodes, 1,402 quad elements). (a) Triangular mesh (1,487 nodes, 2,758 CST elements): visible locking artifacts. (b) Quadrilateral mesh (1,512 nodes, 1,402 elements): smooth stress field …
Figure 6: Comparison of constant strain triangular (CST) and quadrilateral discretizations for the same elasto-plastic …
Figure 7: Example of a local stress field snapshot (MPa) with its corresponding mean stress-strain curve. The red point …
Figure 8: The total training time is approximately 10 minutes on an NVIDIA RTX A1000 GPU (4 GB VRAM), …
Figure 8: LSTM training and validation loss convergence over …
Figure 9: GNN and P-DivGNN training convergence over 100 epochs. 4 Results: The following sections evaluate the coupled LSTM-GNN framework against reference FE solutions. Macroscopic accuracy is quantified using the Mean Squared Error (MSE), Mean Absolute Error (MAE), and weighted Mean Absolute Percentage Error (wMAPE): $\mathrm{MSE}(y, \hat{y}) = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2$ (26), $\mathrm{MAE}(y, \hat{y}) = \frac{1}{N} \sum_{i=1}^{N} |\hat{y}_i - y_i|$ (27), wMAPE …
Figure 10: Macroscopic stress-strain responses for the two best test predictions, ranked by mean absolute error: FE …
Figure 11: Macroscopic stress-strain responses for the two worst test predictions, ranked by mean absolute error: FE …
Figure 12: Explained variance ratio of the principal components of the LSTM hidden states. The dashed line marks …
Figure 13: Dimensionality reduction of the LSTM hidden states averaged across all test simulations. Each point …
Figure 14: Activation heatmap of the 64 LSTM hidden units, averaged over the test set. Persistent units encode accumulated plastic state while transient units respond to loading changes. units shown here do neither. The LSTM appears to encode a mixture: fields that evolve rapidly in the elastic regime, such as stress, together with variables that track the change of microstructural state once a dissipative process i…
Figure 15: Eight-segment, non-proportional strain loading path used for out-of-distribution testing ( …
Figure 16: Macroscopic stress-strain response for the eight-segment path: FE reference (solid blue) vs. LSTM predic …
Figure 17: Dimensionality reduction of the LSTM hidden states for the eight-segment path. Colors indicate time step …
Figure 18: Activation heatmap of the 64 LSTM hidden units across 201 time steps. Eight vertical band transitions are visible. Persistent units encode accumulated plastic history, while transient units respond to loading reversals.
Figure 19: Microscopic loading case: prescribed strain sequence, representative mesh points, and macroscopic stress …
Figure 20: Local stress fields (MPa) at the last loading step: FE (top), GNN (middle), and …
Figure 21: Per-node NMSE of the local stress field at the last loading step, relative to the FE (Quad) solution: GNN …
Figure 22: Local stress divergence (MPa per unit length) at the last loading step for FE, GNN, and …
Figure 23: Time evolution of the stress state at four representative points: FE (solid blue), GNN (dashed red), and …
Figure 24: Quad (left) and derived tri (right) meshes used for the cross-mesh generalization experiment. Both meshes …
Figure 25: Stress fields (MPa) at the last time step: FE (Quad) reference (a), FE (Tri) (b), GNN(Tri) (c), …
Figure 26: Time evolution of the stress state at four points: FE (Quad) (solid blue), FE (Tri) (dash-dot orange), GNN …
Figure 27: NMSE and mean stress divergence (MPa per unit length) for FE, GNN, and …
Figure 28: Temporal evolution of the mean stress divergence (MPa per unit length) for all six model-mesh combina …
Figure 29: Coarse-to-fine sequence of tri6 meshes (left to right).
Figure 30: FE stress fields (MPa) across the refinement levels.
Figure 31: GNN stress predictions (MPa) across the refinement levels.
Figure 32: P-DivGNN stress predictions (MPa) across the refinement levels.
Figure 33: Per-component NMSE of the coarse-mesh GNN and …
Figure 34: Computational time comparison for FE, LSTM-GNN, and LSTM at the last inference step (double log …
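The Figure 9 excerpt above quotes the paper's MSE and MAE definitions (Eqs. 26 and 27) and names wMAPE, whose formula is cut off in the extraction. A minimal sketch of the three metrics follows. MSE and MAE follow the quoted formulas; the wMAPE shown is the commonly used definition (sum of absolute errors divided by sum of absolute reference values), which is an assumption and may differ from the paper's. The sample numbers are invented.

```python
def mse(y_true, y_pred):
    """Mean squared error, as in Eq. (26) of the excerpt."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error, as in Eq. (27) of the excerpt."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def wmape(y_true, y_pred):
    """Assumed common definition: sum |pred - true| / sum |true|."""
    denom = sum(abs(t) for t in y_true)
    if denom == 0.0:
        raise ValueError("wMAPE is undefined when all reference values are zero")
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / denom


# Invented reference and predicted stresses (MPa), for illustration only.
reference = [100.0, 200.0, 300.0]
prediction = [110.0, 190.0, 300.0]

mse_value = mse(reference, prediction)
mae_value = mae(reference, prediction)
wmape_value = wmape(reference, prediction)
```

Weighting by the reference magnitude keeps near-zero stress values from dominating the percentage error, which is the usual reason for preferring wMAPE over a plain MAPE on stress-strain curves that pass through zero.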
Original abstract

Reconstructing local stress fields in heterogeneous microstructures under non-linear, history-dependent loading remains a major computational bottleneck in multi-scale simulations. We propose a coupled LSTM-GNN framework that links the temporal and spatial aspects of local stress field reconstruction. A Long Short-Term Memory network encodes macroscopic stress-strain sequences into a compact hidden state that captures the path-dependent constitutive response, while a physics-informed Graph Neural Network reconstructs the spatially-resolved stress field at each time step. We introduce a relative weighting strategy with linear warm-up to balance the data-driven reconstruction loss and a discrete divergence-based equilibrium penalty. This resolves the scale mismatch that prevents fixed-weight formulations from converging in the elasto-plastic regime. The model is trained on 10,000 non-proportional loading paths applied to a periodic plate-with-a-hole microstructure and von Mises elasto-plasticity. The model achieves three orders of magnitude speedup over finite element simulations and generalizes to loading sequences twice the training length, with 1.9% cumulative error. Because the graph relies on mesh connectivity instead of the specific element type, one trained surrogate can be applied directly without retraining to meshes with different element types and to both coarser and finer resolutions, while in all cases reproducing the high-fidelity quad-element FE field used during training. Indeed, the message passing characteristics inherent to GNN and MeshGraphNet architecture render the model mesh-agnostic. Analysis of the LSTM hidden states suggests a low-dimensional structure related to the internal state variables of the constitutive model.
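The closing sentence of the abstract, together with the captions of Figures 12 and 13, points to a principal component analysis of the LSTM hidden states. The sketch below shows how an explained variance ratio for the leading component could be computed. It is a dependency-free illustration using power iteration on toy data; the authors' actual procedure and tooling are not described on this page, and a real analysis would more likely use a library PCA over all components.

```python
import math


def covariance(rows):
    """Sample covariance matrix (one row per time step, one column per hidden unit)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    return [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
            for a in range(d)]


def leading_variance_ratio(rows, iterations=200):
    """Share of total variance captured by the first principal component.

    Uses power iteration on the covariance matrix. The start vector is uniform,
    so the method assumes it is not orthogonal to the leading direction.
    """
    cov = covariance(rows)
    d = len(cov)
    total = sum(cov[a][a] for a in range(d))  # trace = total variance
    if total == 0.0:
        return 0.0
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    cov_v = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
    eigenvalue = sum(v[a] * cov_v[a] for a in range(d))  # Rayleigh quotient
    return eigenvalue / total


# Toy "hidden states": three units driven by a single latent variable t.
states = [[t, 2.0 * t, 0.0] for t in (0.0, 1.0, 2.0, 3.0)]
ratio = leading_variance_ratio(states)
```

When a few components carry most of the variance, as in this toy case where one latent variable drives every unit, the hidden state is effectively low-dimensional. That is the property the abstract relates to the internal state variables of the constitutive model.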

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a coupled LSTM-GNN surrogate for reconstructing spatially resolved stress fields in a periodic plate-with-a-hole microstructure under non-proportional elasto-plastic loading. An LSTM encodes macroscopic stress-strain paths into a hidden state; a physics-informed GNN then reconstructs the field at each step using a data loss plus a discrete divergence-based equilibrium penalty whose relative weight follows a linear warm-up schedule. The model is trained on 10,000 FE paths and is reported to deliver three orders of magnitude speedup, 1.9 % cumulative error on sequences twice the training length, and direct transfer to meshes with different element types and resolutions while reproducing the training quad-element FE solution.

Significance. If the equilibrium penalty is shown to remain effective throughout the elasto-plastic regime, the approach would provide a mesh-agnostic, history-dependent surrogate that could accelerate multi-scale simulations by orders of magnitude while preserving mechanical consistency; the combination of recurrent encoding with graph-based spatial reconstruction and the explicit handling of loss-scale mismatch are technically distinctive.

major comments (2)
  1. [Training procedure / loss formulation] The relative weighting strategy with linear warm-up is presented as the mechanism that resolves the scale mismatch between data loss and equilibrium penalty in the von Mises elasto-plastic regime, yet the manuscript supplies no quantitative monitoring of the equilibrium residual (e.g., divergence norm) during training or inference, nor any ablation on the warm-up schedule parameters; without such evidence the 1.9 % error and mesh-transfer claims rest on an unverified assumption that the penalty remains load-bearing.
  2. [Results and generalization experiments] All reported performance figures (speedup, cumulative error on 2×-length sequences, mesh transfer) presuppose that the reconstructed fields satisfy equilibrium at every time step; the only supporting mechanism is the discrete divergence penalty whose effectiveness is not demonstrated by residual statistics or comparison against a pure data-driven baseline in plastic zones.
minor comments (2)
  1. [Numerical experiments] Validation details (train/validation/test split sizes, number of independent microstructures, error bars on the 1.9 % figure, sensitivity to LSTM hidden dimension) are not reported, limiting reproducibility.
  2. [Model architecture] The claim that the model is “mesh-agnostic” because of GNN message passing would be strengthened by an explicit statement of the node and edge feature definitions that remain invariant under element-type change.
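As an illustration of what such a statement of invariant features might look like, the sketch below computes MeshGraphNets-style edge features: the relative position between connected nodes and its length. Whether the manuscript uses exactly these features is not stated on this page, so the choice and the function name are assumptions.

```python
import math


def edge_features(coords, edges):
    """Per directed edge (i -> j): relative position (dx, dy) and its length."""
    feats = {}
    for a, b in edges:
        for i, j in ((a, b), (b, a)):
            dx = coords[j][0] - coords[i][0]
            dy = coords[j][1] - coords[i][1]
            feats[(i, j)] = (dx, dy, math.hypot(dx, dy))
    return feats


# A 3 x 4 rectangle meshed as one quad, then as two triangles sharing the 0-2 diagonal.
coords = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0), (0.0, 4.0)]
quad_edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
tri_edges = quad_edges + [(0, 2)]

quad_feats = edge_features(coords, quad_edges)
tri_feats = edge_features(coords, tri_edges)
```

Features of this kind depend only on node coordinates and connectivity. Edges shared by the quad and triangle meshes carry identical features, the triangulation merely adds the diagonal, and a rigid translation of the mesh leaves every feature unchanged.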

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major comment below.

  1. Referee: [Training procedure / loss formulation] The relative weighting strategy with linear warm-up is presented as the mechanism that resolves the scale mismatch between data loss and equilibrium penalty in the von Mises elasto-plastic regime, yet the manuscript supplies no quantitative monitoring of the equilibrium residual (e.g., divergence norm) during training or inference, nor any ablation on the warm-up schedule parameters; without such evidence the 1.9 % error and mesh-transfer claims rest on an unverified assumption that the penalty remains load-bearing.

    Authors: We agree that direct quantitative monitoring of the equilibrium residual would provide stronger verification. Although the reported 1.9 % cumulative error on extended sequences and successful mesh transfer offer indirect support, we will add plots of the divergence norm evolution during training and inference for representative paths in the revised manuscript. We will also include a sensitivity analysis on the warm-up schedule parameters in the supplementary material. revision: yes

  2. Referee: [Results and generalization experiments] All reported performance figures (speedup, cumulative error on 2×-length sequences, mesh transfer) presuppose that the reconstructed fields satisfy equilibrium at every time step; the only supporting mechanism is the discrete divergence penalty whose effectiveness is not demonstrated by residual statistics or comparison against a pure data-driven baseline in plastic zones.

    Authors: The performance metrics derive from the physics-informed model. To explicitly demonstrate the penalty's contribution, we will add a comparison to a pure data-driven baseline (identical architecture without the divergence term) in the revised version, focusing on accuracy within plastic zones. Residual statistics will be reported alongside the monitoring figures noted above. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on independent FE benchmarks


The paper trains an LSTM-GNN surrogate on 10,000 independent FE-generated loading paths and reports empirical metrics (speedup, 1.9% error on longer sequences, mesh transfer) against those external simulations. The discrete divergence penalty is an external mechanical constraint, not a fitted output or self-referential definition. Relative weighting and linear warm-up are hyperparameters selected for training convergence; they do not reduce the reported field accuracy or generalization results to the inputs by construction. No self-citations, uniqueness theorems, or ansatzes are invoked in a load-bearing manner. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on the assumption that a discrete divergence penalty plus data-driven loss, balanced by a hand-tuned warm-up schedule, is sufficient to recover equilibrium-consistent fields; no new physical entities are postulated. The LSTM hidden state is treated as implicitly learning internal variables without an explicit mapping being derived.

free parameters (2)
  • relative weighting schedule parameters
    The linear warm-up and relative weighting coefficients between reconstruction loss and equilibrium penalty are chosen to achieve convergence and are not derived from first principles.
  • LSTM hidden-state dimension
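    Chosen to capture the path-dependent response; its size (the Figure 14 and Figure 18 captions mention 64 hidden units) is a modeling hyperparameter set before training rather than a quantity learned during it.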
axioms (2)
  • domain assumption: The discrete divergence operator on the graph approximates the continuous equilibrium condition sufficiently well for the chosen mesh resolution.
    Invoked when the physics-informed penalty is introduced; no proof of approximation quality is supplied in the abstract.
  • domain assumption: von Mises elasto-plasticity on the periodic plate-with-hole constitutes a representative test case for general non-linear history-dependent microstructures.
    The training corpus is generated exclusively from this constitutive model and geometry.



Reference graph

Works this paper leans on

71 extracted references · 42 canonical work pages

  1. [1]

    Feyel, Multiscale FE 2 elastoviscoplastic analysis of composite structures, Computational Materials Science 16 (1999) 344–354

    F. Feyel, Multiscale FE 2 elastoviscoplastic analysis of composite structures, Computational Materials Science 16 (1999) 344–354. doi:10.1016/s0927-0256(99)00077-4

  2. [2]

    F. Feyel, A multilevel finite element method (FE 2) to describe the response of highly non-linear structures using generalized continua, Computer Methods in Applied Mechanics and Engineering (CMAME) 192 (2003) 3233–3244. doi:10.1016/s0045-7825(03)00348-7

  3. [3]

    J. Qu, M. Cherkaoui, Fundamentals of Micromechanics of Solids, 1st ed., Wiley, 2006

  4. [4]

    Dvorak, Transformation field analysis of inelastic composite materials, Proceedings of the Royal Society of London

    G. Dvorak, Transformation field analysis of inelastic composite materials, Proceedings of the Royal Society of London. Series A: Mathematical and Physical Sciences 437 (1992) 311–327. doi:10.1098/rspa.1992.0063

  5. [5]

    Michel, P

    J. Michel, P. Suquet, Nonuniform transformation field analysis, International Journal of Solids and Structures 40 (2003) 6937–6955. doi:10.1016/s0020-7683(03)00346-9

  6. [6]

    Roussette, J

    S. Roussette, J. Michel, P. Suquet, Nonuniform transformation field analysis of elastic–viscoplastic composites, Composites Science and Technology 69 (2009) 22–27. doi:10.1016/j.compscitech.2007.10.032

  7. [7]

    Danoun, E

    A. Danoun, E. Prulière, Y . Chemisky, FE-LSTM: A hybrid approach to accelerate multiscale simulations of ar- chitectured materials using recurrent neural networks and finite element analysis, Computer Methods in Applied Mechanics and Engineering (CMAME) 429 (2024) 117192. doi:10.1016/j.cma.2024.117192

  8. [8]

    Malik, M

    A. Malik, M. Abendroth, G. Hütter, B. Kiefer, A hybrid approach employing neural networks to simulate the elasto–plastic deformation behavior of 3D-foam structures, Advanced Engineering Materials 24 (2022) 2100641. doi:10.1002/adem.202100641

  9. [9]

    Lange, G

    N. Lange, G. Hütter, B. Kiefer, An efficient monolithic solution scheme for FE 2 problems, Computer Methods in Applied Mechanics and Engineering (CMAME) 382 (2021) 113886. doi:10.1016/j.cma.2021.113886

  10. [10]

    Lange, G

    N. Lange, G. Hütter, B. Kiefer, A monolithic hyper ROM FE 2 method with clustered training at finite deformations, Computer Methods in Applied Mechanics and Engineering (CMAME) 418 (2024) 116522. doi:10.1016/j.cma.2023.116522

  11. [11]

    Lange, A

    N. Lange, A. Malik, M. Abendroth, G. Hütter, B. Kiefer, A comparison of classical phenomenological, hy- brid neural network, and high-performance ROM FE 2 models for open-cell foams: Efficiency, accuracy, and flexibility, GAMM-Mitteilungen 48 (2025) e70004. doi:10.1002/gamm.70004

  12. [12]

    Abendroth, N

    M. Abendroth, N. Lange, A. Malik, G. Hütter, B. Kiefer, Supplementary data for multi-scale simulations of a Weaire–Phelan foam, 2025. doi:10.5281/zenodo.15076895

  13. [13]

    Z. Liu, M. Bessa, W. K. Liu, Self-consistent clustering analysis: An efficient multi-scale scheme for inelastic heterogeneous materials, Computer Methods in Applied Mechanics and Engineering (CMAME) 306 (2016) 319–341. doi:10.1016/j.cma.2016.04.004

  14. [14]

    Bhaduri, Y

    A. Bhaduri, Y . He, M. D. Shields, L. Graham-Brady, R. M. Kirby, Stochastic collocation approach with adaptive mesh refinement for parametric uncertainty analysis, Journal of Computational Physics 371 (2018) 732–750. doi:10.1016/j.jcp.2018.06.003

  15. [15]

    Bhaduri, C

    A. Bhaduri, C. S. Meyer, J. W. Gillespie, B. Z. G. Haque, M. D. Shields, L. Graham-Brady, Probabilistic modeling of discrete structural response with application to composite plate penetration models, Journal of Engineering Mechanics 147 (2021) 04021087. doi:10.1061/(asce)em.1943-7889.0001996

  16. [16]

    Haghighat, M

    E. Haghighat, M. Raissi, A. Moure, H. Gomez, R. Juanes, A physics-informed deep learning framework for inversion and surrogate modeling in solid mechanics, Computer Methods in Applied Mechanics and Engineering (CMAME) 379 (2021) 113741. doi:10.1016/j.cma.2021.113741

  17. [17]

    Proceedings of the IEEE , author =

    Y . Lecun, L. Bottou, Y . Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proceed- ings of the IEEE 86 (1998) 2278–2324. doi:10.1109/5.726791

  18. [18]

    Z. Nie, H. Jiang, L. B. Kara, Stress field prediction in cantilevered structures using convolutional neural networks, Journal of Computing and Information Science in Engineering 20 (2019) 011002. doi:10.1115/1.4044097

  19. [19]

    Gupta, A

    A. Gupta, A. Bhaduri, L. Graham-Brady, Accelerated multiscale mechanics modeling in a deep learning frame- work, Mechanics of Materials 184 (2023) 104709. doi:10.1016/j.mechmat.2023.104709

  20. [20]

    Ronneberger, P

    O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional networks for biomedical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention - MICCAI, 2015

  21. [21]

    B. P. Croom, M. Berkson, R. K. Mueller, M. Presley, S. Storck, Deep learning prediction of stress fields in additively manufactured metals with intricate defect networks, Mechanics of Materials 165 (2022) 104191. doi:10.1016/j.mechmat.2021.104191. 34 PREPRINT

  22. [22]

    Goodfellow, J

    I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y . Bengio, Genera- tive adversarial nets, in: Advances in Neural Information Processing Systems, 2014

  23. [23]

    J. Ho, A. Jain, P. Abbeel, Denoising diffusion probabilistic models, in: Advances in Neural Information Pro- cessing Systems 33: Annual Conference on Neural Information Processing Systems (NeurIPS), 2020

  24. [24]

    Yang, C.-H

    Z. Yang, C.-H. Yu, K. Guo, M. J. Buehler, End-to-end deep learning method to predict complete strain and stress tensors for complex hierarchical composite microstructures, Journal of the Mechanics and Physics of Solids 154 (2021) 104506. doi:10.1016/j.jmps.2021.104506

  25. [25]

    Jadhav, J

    Y . Jadhav, J. Berthel, C. Hu, R. Panat, J. Beuth, A. Barati Farimani, StressD: 2D stress estimation using denoising diffusion model, Computer Methods in Applied Mechanics and Engineering (CMAME) 416 (2023) 116343. doi:10.1016/j.cma.2023.116343

  26. [26]

    J. He, S. Koric, D. Abueidda, A. Najafi, I. Jasiuk, Geom-DeepONet: A point-cloud-based deep operator network for field predictions on 3D parameterized geometries, Computer Methods in Applied Mechanics and Engineering (CMAME) 429 (2024) 117130. doi:10.1016/j.cma.2024.117130

  27. [27]

    Scarselli, M

    F. Scarselli, M. Gori, A. C. Tsoi, M. Hagenbuchner, G. Monfardini, The graph neural network model, IEEE Transactions on Neural Networks 20 (2009) 61–80. doi:10.1109/tnn.2008.2005605

  28. [28]

    Pfaff, M

    T. Pfaff, M. Fortunato, A. Sanchez-Gonzalez, P. W. Battaglia, Learning mesh-based simulation with graph networks, in: International Conference on Learning Representations, (ICLR), 2021

  29. [29]

    Maurizi, C

    M. Maurizi, C. Gao, F. Berto, Predicting stress, strain and deformation fields in materials and structures with graph neural networks, Scientific Reports 12 (2022) 21834. doi:10.1038/s41598-022-26424-3

  30. [30]

    Storm, I

    J. Storm, I. Rocha, F. van der Meer, A microstructure-based graph neural network for accelerating multiscale simulations, Computer Methods in Applied Mechanics and Engineering (CMAME) 427 (2024) 117001. doi:10 .1016/j.cma.2024.117001

  31. [31]

    Fortunato, T

    M. Fortunato, T. Pfaff, P. Wirnsberger, A. Pritzel, P. W. Battaglia, MultiScale MeshGraphNets, CoRR abs/2210.00612 (2022)

  32. [32]

    Kar- niadakis

    M. Raissi, P. Perdikaris, G. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707. doi:10.1016/j.jcp.2018.10.045

  33. [33]

    Horie, N

    M. Horie, N. Mitsume, Physics-embedded neural networks: Graph neural PDE solvers with mixed boundary conditions, in: Advances in Neural Information Processing Systems 35: Annual Conference on Neural Informa- tion Processing Systems (NeurIPS), 2022

  34. [34]

    Hernández, A

    Q. Hernández, A. Badías, F. Chinesta, E. Cueto, Thermodynamics-informed graph neural networks, IEEE Transactions on Artificial Intelligence 5 (2024) 967–976. doi:10.1109/tai.2022.3179681

  35. [35]

    Richter-Powell, Y

    J. Richter-Powell, Y . Lipman, R. T. Q. Chen, Neural conservation laws: A divergence-free perspective, in: Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems (NeurIPS), 2022

  36. [36]

    M. R. Guevara Garban, Y . Chemisky, M. Clément, E. Prulière, Physics-informed graph neural networks to reconstruct local fields considering finite strain hyperelasticity, International Journal for Numerical Methods in Engineering 126 (2025) e70193. doi:10.1002/nme.70193

  37. [37]

    Mozaffar, R

    M. Mozaffar, R. Bostanabad, W. Chen, K. Ehmann, J. Cao, M. A. Bessa, Deep learning predicts path-dependent plasticity, Proceedings of the National Academy of Sciences 116 (2019) 26414–26420. doi:10.1073/pnas.1 911815116

  38. [38]

    M. B. Gorji, M. Mozaffar, J. N. Heidenreich, J. Cao, D. Mohr, On the potential of recurrent neural networks for modeling path dependent plasticity, Journal of the Mechanics and Physics of Solids 143 (2020) 103972. doi:10.1016/j.jmps.2020.103972

  39. [39]

    Ghavamian, A

    F. Ghavamian, A. Simone, Accelerating multiscale finite element simulations of history-dependent materials using a recurrent neural network, Computer Methods in Applied Mechanics and Engineering (CMAME) 357 (2019) 112594. doi:10.1016/j.cma.2019.112594

  40. [40]

    L. Wu, V . D. Nguyen, N. G. Kilingar, L. Noels, A recurrent neural network-accelerated multi-scale model for elasto-plastic heterogeneous materials subjected to random cyclic and non-proportional loading paths, Computer Methods in Applied Mechanics and Engineering (CMAME) 369 (2020) 113234. doi:10.1016/j.cma.2020.1 13234. 35 PREPRINT

  41. [42]

    Y . Wu, Y . Li, B. He, J. Liu, M. Huang, Z. Li, A physics-informed GNN-LSTM framework for prediction of microvoid growth in heterogeneous polycrystals, Engineering Fracture Mechanics 336 (2026) 111938. doi:10.1 016/j.engfracmech.2026.111938

  42. [43]

    T. Kirchdoerfer, M. Ortiz, Data-driven computational mechanics, Computer Methods in Applied Mechanics and Engineering (CMAME) 304 (2016) 81–101. doi:10.1016/j.cma.2016.02.001

  43. [44]

    F. Masi, I. Stefanou, P. Vannucci, V. Maffi-Berthier, Thermodynamics-based artificial neural networks for constitutive modeling, Journal of the Mechanics and Physics of Solids 147 (2021) 104277. doi:10.1016/j.jmps.2020.104277

  44. [45]

    S. Im, J. Lee, M. Cho, Surrogate modeling of elasto-plastic problems via long short-term memory neural networks and proper orthogonal decomposition, Computer Methods in Applied Mechanics and Engineering (CMAME) 385 (2021) 114030. doi:10.1016/j.cma.2021.114030

  45. [46]

    M. A. Maia, I. B. C. M. Rocha, P. Kerfriden, F. P. van der Meer, Physically recurrent neural networks for path-dependent heterogeneous materials: Embedding constitutive models in a data-driven surrogate, Computer Methods in Applied Mechanics and Engineering (CMAME) 407 (2023) 115934. doi:10.1016/j.cma.2023.115934

  46. [47]

    A. Sanchez-Gonzalez, J. Godwin, T. Pfaff, R. Ying, J. Leskovec, P. W. Battaglia, Learning to simulate complex physics with graph networks, in: Proceedings of the 37th International Conference on Machine Learning (ICML), volume 119, 2020, pp. 8459–8468

  47. [48]

    J. Brandstetter, D. E. Worrall, M. Welling, Message passing neural PDE solvers, in: International Conference on Learning Representations (ICLR), 2022

  48. [49]

    Y. Hu, G. Zhou, M.-G. Lee, P. Wu, D. Li, A temporal graph neural network for cross-scale modelling of polycrystals considering microstructure interaction, International Journal of Plasticity 179 (2024) 104017. doi:10.1016/j.ijplas.2024.104017

  49. [50]

    Z. Li, N. B. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, A. Anandkumar, Fourier neural operator for parametric partial differential equations, in: International Conference on Learning Representations (ICLR), 2021

  50. [51]

    L. Lu, P. Jin, G. Pang, Z. Zhang, G. E. Karniadakis, Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators, Nature Machine Intelligence 3 (2021) 218–229. doi:10.1038/s42256-021-00302-5

  51. [52]

    J. He, S. Kushwaha, J. Park, S. Koric, D. Abueidda, I. Jasiuk, Sequential deep operator networks (S-DeepONet) for predicting full-field solutions under time-dependent loads, Engineering Applications of Artificial Intelligence 127 (2024) 107258. doi:10.1016/j.engappai.2023.107258

  52. [53]

    N. B. Kovachki, Z. Li, B. Liu, K. Azizzadenesheli, K. Bhattacharya, A. M. Stuart, A. Anandkumar, Neural operator: Learning maps between function spaces with applications to PDEs, Journal of Machine Learning Research 24 (2023) 1–97

  53. [54]

    P. M. Suquet, Elements of homogenization for inelastic solid mechanics, in: E. Sanchez-Palencia, A. Zaoui (Eds.), Homogenization Techniques for Composite Media, 1987, pp. 193–198

  54. [55]

    G. Chatzigeorgiou, N. Charalambakis, Y. Chemisky, F. Meraghni, Periodic homogenization for fully coupled thermomechanical modeling of dissipative generalized standard materials, International Journal of Plasticity 81 (2016) 18–39. doi:10.1016/j.ijplas.2016.01.013

  55. [56]

    J. Lemaitre, J.-L. Chaboche, Mechanics of Solid Materials, Cambridge University Press, 1990

  56. [57]

    J. C. Simo, T. J. R. Hughes, Computational Inelasticity, Springer-Verlag, New York, 1998

  57. [58]

    S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation 9 (1997) 1735–1780. doi:10.1162/neco.1997.9.8.1735

  58. [59]

    S. Hochreiter, The vanishing gradient problem during learning recurrent neural nets and problem solutions, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 6 (1998) 107–116. doi:10.1142/S0218488598000094

  59. [60]

    A. Danoun, Numerical simulation of heterogeneous materials combining Artificial Intelligence and physics-based modeling, Ph.D. thesis, Université de Bordeaux, 2022

  60. [61]

    E. Prulière, Y. Chemisky, 3MAH : Un ensemble de librairies pour analyser le comportement complexe de matériaux hétérogènes, in: Colloque National en Calcul des Structures (CSMA), 2023

  61. [62]

    T. Belytschko, W. K. Liu, B. Moran, Nonlinear Finite Elements for Continua and Structures, John Wiley & Sons, 2000

  62. [63]

    D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, in: International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015

  63. [64]

    A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Köpf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, S. Chintala, PyTorch: An imperative style, high-performance deep learning library, in: Advances in Neural Information Processing System...

  64. [65]

    M. Fey, J. E. Lenssen, Fast graph representation learning with PyTorch Geometric, in: ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019

  65. [66]

    S. Wang, Y. Teng, P. Perdikaris, Understanding and mitigating gradient flow pathologies in physics-informed neural networks, SIAM Journal on Scientific Computing 43 (2021) A3055–A3081. doi:10.1137/20M1318043

  66. [67]

    S. Wang, X. Yu, P. Perdikaris, When and why PINNs fail to train: A neural tangent kernel perspective, Journal of Computational Physics 449 (2022) 110768. doi:10.1016/j.jcp.2021.110768

  67. [68]

    Z. Chen, V. Badrinarayanan, C.-Y. Lee, A. Rabinovich, GradNorm: Gradient normalization for adaptive loss balancing in deep multitask networks, in: Proceedings of the 35th International Conference on Machine Learning (ICML), volume 80, 2018, pp. 794–803

  68. [69]

    R. Bischof, M. A. Kraus, Multi-objective loss balancing for physics-informed deep learning, Computer Methods in Applied Mechanics and Engineering (CMAME) 439 (2025) 117914. doi:10.1016/j.cma.2025.117914

  69. [70]

    I. T. Jolliffe, Principal Component Analysis, 2nd ed., Springer, 2002. doi:10.1007/b98835

  70. [71]

    L. van der Maaten, G. Hinton, Visualizing data using t-SNE, Journal of Machine Learning Research 9 (2008) 2579–2605

  71. [72]

    W. Schroeder, K. Martin, B. Lorensen, The Visualization Toolkit, 4th ed., Kitware, 2006