Physics-Informed Neural Networks with Attention Feature Expansion for Monge-Ampère Equations

Anxiao Yu; Bangmin Wu; Dongwoo Sheen; Xinlong Feng; Zhengbang Zha

arxiv: 2605.22115 · v1 · pith:OBR7R2GUnew · submitted 2026-05-21 · 🧮 math.NA · cs.NA

Pith reviewed 2026-05-22 04:10 UTC · model grok-4.3

classification 🧮 math.NA cs.NA

keywords Monge-Ampère equation · physics-informed neural networks · attention feature expansion · input convex neural networks · image enhancement · medical image registration · numerical solution · convexity guarantees

The pith

Physics-informed neural networks with attention and input convexity solve the Monge-Ampère equation accurately with theoretical guarantees.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a new approach called PINN-AFE to find numerical solutions to the Monge-Ampère equation, a key nonlinear partial differential equation used in many areas. It builds a feature pool using multi-head attention to represent nonlinear aspects adaptively and employs input convex neural networks to ensure the solution is strictly convex, which comes with mathematical proofs. The training uses a weighted loss that changes dynamically along with hybrid optimization to speed up convergence. Tests show this gives precise and fast results, and the same idea works well for enhancing images and registering medical images in a way that respects physical properties.

Core claim

The PINN-AFE framework integrates multi-head attention enhanced feature pool for adaptive nonlinear feature representation and input convex neural networks to impose strict convexity of solutions with rigorous theoretical guarantees, while using a dynamically weighted loss function combined with hybrid optimization to accelerate training convergence, achieving accurate and computationally efficient solutions to the Monge-Ampère equation that extend to high-quality results in image enhancement and medical image registration.

What carries the argument

The multi-head attention enhanced feature pool combined with input convex neural networks, which together enable adaptive feature representation and enforce strict convexity with theoretical backing.

If this is right

The solutions produced satisfy the strict convexity required by the Monge-Ampère equation.
Training converges faster due to the dynamically weighted loss and hybrid optimization.
The method produces accurate numerical solutions for the equation.
High-quality and physically consistent results are obtained when applied to image enhancement and medical image registration.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The attention feature expansion could potentially improve performance in solving other fully nonlinear elliptic equations.
Input convex neural networks might be useful in other contexts where convexity constraints are needed in neural network approximations.
Success in image tasks suggests the framework could handle inverse problems or data-driven modeling in related fields.

Load-bearing premise

The multi-head attention enhanced feature pool provides adaptive nonlinear feature representation and input convex neural networks impose strict convexity of solutions with rigorous theoretical guarantees.

What would settle it

Compare the neural network solution to an exact known solution of a Monge-Ampère problem on a simple domain and verify that the maximum error is below a small threshold and that the computed solution remains convex.
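
The check described above can be sketched as follows. This is a minimal illustration, not the paper's test setup: the exact solution u(x, y) = exp((x² + y²)/2) with right-hand side f = (1 + x² + y²)·exp(x² + y²) is a standard smooth benchmark chosen here as an assumption, the helper names (`check_candidate`, `hessian_fd`) are hypothetical, and the Hessian is approximated by central finite differences rather than automatic differentiation.

```python
import math

def u_exact(x, y):
    # Smooth convex benchmark solution (an assumption, not taken from the paper).
    return math.exp(0.5 * (x * x + y * y))

def f_rhs(x, y):
    # Matching right-hand side: det(D^2 u) = (1 + x^2 + y^2) * exp(x^2 + y^2).
    return (1.0 + x * x + y * y) * math.exp(x * x + y * y)

def hessian_fd(u, x, y, h=1e-3):
    # Second-order central differences for u_xx, u_yy and u_xy.
    uxx = (u(x + h, y) - 2.0 * u(x, y) + u(x - h, y)) / (h * h)
    uyy = (u(x, y + h) - 2.0 * u(x, y) + u(x, y - h)) / (h * h)
    uxy = (u(x + h, y + h) - u(x + h, y - h)
           - u(x - h, y + h) + u(x - h, y - h)) / (4.0 * h * h)
    return uxx, uyy, uxy

def check_candidate(u, n=21, h=1e-3):
    """Return (max |u - u_exact|, max |det(D^2 u) - f|, convex_on_grid)
    evaluated on an n-by-n grid over the unit square."""
    max_err, max_res, convex = 0.0, 0.0, True
    for i in range(n):
        for j in range(n):
            x, y = i / (n - 1), j / (n - 1)
            max_err = max(max_err, abs(u(x, y) - u_exact(x, y)))
            uxx, uyy, uxy = hessian_fd(u, x, y, h)
            det = uxx * uyy - uxy * uxy
            max_res = max(max_res, abs(det - f_rhs(x, y)))
            # A symmetric 2x2 matrix is positive definite iff uxx > 0 and det > 0.
            convex = convex and (uxx > 0.0 and det > 0.0)
    return max_err, max_res, convex

# Stand-in for a trained network: the exact solution itself.
max_err, max_res, convex = check_candidate(u_exact)
```

In practice `u` would be the trained network's forward pass, and the grid check only samples convexity at finitely many points, so it can refute strict convexity but cannot prove it.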

Figures

Figures reproduced from arXiv: 2605.22115 by Anxiao Yu, Bangmin Wu, Dongwoo Sheen, Xinlong Feng, Zhengbang Zha.

**Figure 1.** PINN-AFE framework. The PINN-AFE framework adopts a cascaded three-module design. The overall logical flow is: (x, y) → F (ℝ^n) → F_attn (ℝ^m) → h-layer ICNN → û(x) (ℝ^1). From §2.2, Key Design Choices: the proposed PINN-AFE framework incorporates four design principles to address the inherent challenges of solving elliptic Monge-Ampère equations, including strong nonlinearity, str…

**Figure 2.** Exact solution, PINN-AFE predicted solution, and absolute error distribution over the unit square domain. The predicted result captures the convex profile of the exact solution well, and the absolute error is maintained at the order of 10⁻⁶. This validates the high accuracy of the PINN-AFE method for solving smooth Monge-Ampère equations. (a) Exact solution (b) Predicted solution (c) Abso…

**Figure 3.** Training loss curve. Hardware: Intel Core i5-12450H CPU, NVIDIA RTX 3050 6GB GPU, Windows 11 OS. The PINN-AFE costs 84 seconds in total.

**Figure 4.** Core numerical results of the singular case, including the exact solution, predicted solution, and absolute error distribution. The proposed method accurately reproduces the distribution of the exact solution, and the predicted solution maintains strict convexity consistent with the theoretical property of the Monge-Ampère equation. The absolute error is globally cont…

**Figure 5.** Contour slices. The logarithm-scaled absolute error plots enable simultaneous visualization of errors across multiple orders of magnitude. The error distribution is perfectly symmetric on all three planes with no abnormal peaks, indicating stable and uniform training. The error reaches a minimum of 10⁻⁸ in the central region and increases gradually to approximately 10⁻⁴ at the corners, while the global mean…

**Figure 6.** 3D surface slices.

**Figure 7.** Orthogonal slices of absolute error.

**Figure 8.** Isosurfaces of predicted solution.

**Figure 9.** Volume rendering.

**Figure 10.** Image enhancement results via the PINN-AFE framework.

**Figure 11.** Qualitative visualization of T1-FDG PET registration.

**Figure 12.** Final overlay.

Original abstract

The Monge-Ampère equation is a fundamental fully nonlinear elliptic partial differential equation that finds extensive applications across multiple disciplines. This study proposes a novel physics-informed neural network integrated with attention feature expansion (PINN-AFE) for its numerical solution. A multi-head attention enhanced feature pool is constructed to enable adaptive nonlinear feature representation, and input convex neural networks are adopted to impose strict convexity of solutions with rigorous theoretical guarantees. Meanwhile, a dynamically weighted loss function combined with hybrid optimization is formulated to accelerate training convergence. Comprehensive numerical experiments validate the accuracy and computational efficiency of the developed framework. The PINN-AFE paradigm is further extended to image processing tasks, delivering high-quality and physically consistent results in both image enhancement and medical image registration scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PINN with attention and ICNNs for Monge-Ampère claims convexity guarantees that likely fail once attention is added, though the image applications are a reasonable extension.

The paper builds a PINN variant called PINN-AFE that adds multi-head attention for feature expansion and input convex networks to enforce strict convexity when solving the Monge-Ampère equation, plus a dynamic loss weighting scheme. It then shows results on image enhancement and medical image registration where the outputs stay physically consistent. That application step is the part that actually lands as useful progress rather than just another architecture tweak. The targeted mix for this fully nonlinear PDE is new enough on its own terms and builds directly on existing PINN and ICNN ideas without obvious circularity. The experiments are described as comprehensive, which at least suggests they ran tests beyond toy cases.

The soft spot sits in the theory. Standard ICNN convexity requires non-negative weights and convex non-decreasing activations, and strict convexity needs the Hessian to stay positive definite. Inserting attention layers introduces data-dependent mixing and softmax operations that can produce effective negative contributions or break the required conditions. The abstract asserts rigorous guarantees for the combined model, but if the paper only recycles the usual ICNN proof without re-deriving the conditions under attention and the dynamic loss, the guarantee does not carry over to the trained network that actually approximates det(D²u) = f. That is a load-bearing claim, so it needs explicit verification or numerical checks on the Hessian. Without seeing the full error tables or baseline comparisons, it is hard to judge how large the practical gain is over plain PINNs.

This work is for people who already work on numerical methods for nonlinear elliptic PDEs or on PINN customizations for scientific computing. A reader focused on optimal transport or imaging applications would get the most out of the later sections.

It deserves a serious referee because the core construction is coherent and the applications are relevant, even if the convexity argument needs tightening.
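
The convexity concern in this letter can be made concrete with a one-dimensional toy. The sketch below is not the paper's architecture: it assumes a two-token softmax attention with logits (x, 0) and values (1, 0), whose output is the logistic sigmoid of x, and feeds it through a single Softplus unit with a non-negative weight. All function names are hypothetical. It only shows that non-negative weights and a convex non-decreasing activation are not sufficient once the layer input is itself a non-convex function of x.

```python
import math

def softplus(z):
    # Smooth, convex, non-decreasing activation.
    return math.log1p(math.exp(z))

def softmax_gate(x):
    # Two-token softmax attention with logits (x, 0) and values (1, 0):
    # the attended value equals sigmoid(x), which is not convex in x.
    return 1.0 / (1.0 + math.exp(-x))

def plain_layer(x, w=1.0):
    # Convex in x: Softplus applied to an affine function of the input.
    return softplus(w * x)

def gated_layer(x, w=1.0):
    # Same unit and same non-negative weight, fed by the attention output.
    return softplus(w * softmax_gate(x))

def midpoint_gap(f, a, b):
    # Convexity requires f((a + b) / 2) <= (f(a) + f(b)) / 2, i.e. a gap >= 0.
    return 0.5 * (f(a) + f(b)) - f(0.5 * (a + b))

gap_plain = midpoint_gap(plain_layer, 2.0, 4.0)   # non-negative: convex here
gap_gated = midpoint_gap(gated_layer, 2.0, 4.0)   # negative: midpoint test fails
```

Whether the actual PINN-AFE construction avoids this failure depends on where the attention block sits and how its outputs are constrained, which is exactly what the referee asks the authors to derive.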

Referee Report

2 major / 2 minor

Summary. The manuscript proposes PINN-AFE, a physics-informed neural network augmented with attention feature expansion, for numerically solving the Monge-Ampère equation. It constructs a multi-head attention enhanced feature pool for adaptive nonlinear representations, adopts input convex neural networks (ICNNs) to enforce strict convexity of solutions together with claimed rigorous theoretical guarantees, introduces a dynamically weighted loss combined with hybrid optimization for faster convergence, validates the framework via numerical experiments, and extends it to image enhancement and medical image registration tasks.

Significance. If the convexity guarantees survive the attention modification and the numerical results establish clear accuracy and efficiency gains over standard PINN or finite-difference baselines for the Monge-Ampère equation, the work would strengthen the applicability of physics-informed networks to fully nonlinear elliptic problems and supply a practical tool for imaging applications. The explicit use of ICNNs to target convexity is a constructive idea worth developing further.

major comments (2)

[§3] §3 (Architecture and convexity analysis): The abstract and method description assert that ICNNs supply 'rigorous theoretical guarantees' of strict convexity for the learned solution. Standard ICNN convexity requires non-negative weights on all relevant paths and convex non-decreasing activations; however, inserting a multi-head attention enhanced feature pool before or inside the ICNN layers introduces data-dependent mixing that can produce effective negative weights or non-convex operations. No re-derivation of the convexity conditions under this modification, nor verification that the trained network remains strictly convex (positive-definite Hessian everywhere), is supplied. This directly affects the central claim that the framework delivers solutions with rigorous convexity guarantees.
[§4] §4 (Numerical validation): The abstract states that 'comprehensive numerical experiments validate the accuracy and computational efficiency,' yet the provided text supplies no quantitative error tables, convergence rates, baseline comparisons (e.g., against standard PINNs or monotone finite-difference schemes), or dataset specifications. Without these, the empirical support for the accuracy and efficiency claims cannot be assessed.

minor comments (2)

[Abstract] Abstract: The claim of 'high-quality and physically consistent results' in image tasks would be strengthened by a brief mention of the quantitative metrics used (e.g., PSNR, SSIM, or registration error).
[§3.3] Notation: The dynamic weighting scheme in the loss function should be given an explicit equation number and a short description of how the weights are updated at each epoch.
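
To illustrate the level of detail the second minor comment asks for, here is one possible per-epoch weight update. It is a hypothetical rule and not the scheme used in the paper: each loss term keeps an exponential moving average, and weights are set proportional to that average and rescaled to sum to the number of terms. The names `update_loss_weights`, `beta`, and the two-term (PDE residual, boundary) split are assumptions made for the example.

```python
def update_loss_weights(losses, ema, beta=0.9, eps=1e-12):
    """Hypothetical per-epoch update (not the paper's rule).

    losses: current value of each loss term
    ema:    running average of each term from previous epochs
    Returns (weights, new_ema).
    """
    # Exponential moving average of each loss term.
    new_ema = [beta * e + (1.0 - beta) * l for e, l in zip(ema, losses)]
    total = sum(new_ema) + eps
    k = len(new_ema)
    # Terms with a larger running loss get a larger weight; weights sum to about k.
    weights = [k * e / total for e in new_ema]
    return weights, new_ema

def weighted_loss(weights, losses):
    # Total training objective for the current epoch.
    return sum(w * l for w, l in zip(weights, losses))

ema = [1.0, 1.0]                       # running averages: (PDE residual, boundary)
losses = [0.5, 0.02]                   # illustrative values, not measured data
weights, ema = update_loss_weights(losses, ema)
total = weighted_loss(weights, losses)
```

A description of this form, with an equation number, would let readers reproduce the weighting regardless of which rule the authors actually use.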

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed report. We address each major comment below and indicate the revisions we will make to the manuscript.

Referee: [§3] §3 (Architecture and convexity analysis): The abstract and method description assert that ICNNs supply 'rigorous theoretical guarantees' of strict convexity for the learned solution. Standard ICNN convexity requires non-negative weights on all relevant paths and convex non-decreasing activations; however, inserting a multi-head attention enhanced feature pool before or inside the ICNN layers introduces data-dependent mixing that can produce effective negative weights or non-convex operations. No re-derivation of the convexity conditions under this modification, nor verification that the trained network remains strictly convex (positive-definite Hessian everywhere), is supplied. This directly affects the central claim that the framework delivers solutions with rigorous convexity guarantees.

Authors: We acknowledge the validity of this observation. The introduction of the multi-head attention feature pool does require explicit verification that the overall architecture preserves the strict convexity property of the ICNN. In the revised manuscript we will add a dedicated subsection in §3 that re-derives the convexity conditions under the attention modification, including constraints on attention weights to maintain non-negative paths and a numerical check confirming positive-definiteness of the Hessian at representative points. This will directly support the theoretical guarantees claim.

revision: yes

Referee: [§4] §4 (Numerical validation): The abstract states that 'comprehensive numerical experiments validate the accuracy and computational efficiency,' yet the provided text supplies no quantitative error tables, convergence rates, baseline comparisons (e.g., against standard PINNs or monotone finite-difference schemes), or dataset specifications. Without these, the empirical support for the accuracy and efficiency claims cannot be assessed.

Authors: We agree that the numerical results section must contain explicit quantitative comparisons to allow independent assessment. The current manuscript contains error metrics and some baseline runs, but these are not presented in tabular form with convergence rates or full dataset details. In the revision we will insert clear error tables (L2 and max-norm errors versus finite-difference references), convergence plots, direct comparisons against standard PINNs and monotone finite-difference schemes, and explicit dataset specifications for all test cases.

revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper describes a PINN-AFE framework that integrates multi-head attention for feature expansion with input convex neural networks (ICNNs) drawn from established prior literature to enforce convexity, alongside a dynamically weighted loss. These components are presented as extensions of standard PINN methodology, with accuracy claims supported by numerical experiments on the Monge-Ampère equation and downstream tasks rather than any reduction of outputs to fitted parameters, self-definitions, or unverified self-citations. No load-bearing step equates a prediction or guarantee directly to its own inputs by construction; the approach remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption that attention-based feature expansion yields adaptive nonlinear representations and that input convex networks deliver strict convexity with rigorous guarantees; these are introduced without independent external benchmarks in the abstract.

axioms (1)

domain assumption: Input convex neural networks impose strict convexity of solutions with rigorous theoretical guarantees. Explicitly adopted in the abstract as a core component of the framework.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "input convex neural networks are adopted to impose strict convexity of solutions with rigorous theoretical guarantees... all weight matrices between consecutive hidden layers are constrained to be element-wise non-negative; all activation functions are smooth convex functions, specifically the Softplus function"

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear

Paper passage: "the convexity of the mapping can be obtained by the property that the composition of a convex function and a convex mapping preserves convexity"
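
The construction in the quoted passages (non-negative hidden-to-hidden weights, Softplus activations) can be sketched as a small input convex network. This is an illustrative sketch under assumptions, not the paper's network: the width, random initialization, and the unconstrained input passthrough terms (`A1`, following the usual ICNN layout) are choices made here, and all names are hypothetical. Note that the composition rule holds when the outer function is convex and also non-decreasing, which Softplus satisfies.

```python
import math
import random

def softplus(z):
    # Numerically stable Softplus: convex and non-decreasing.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def make_icnn(width=4, seed=0):
    rng = random.Random(seed)

    def g():
        return rng.uniform(-1.0, 1.0)

    return {
        "A0": [[g(), g()] for _ in range(width)],        # input -> hidden 1, any sign
        "b0": [g() for _ in range(width)],
        "W1": [[abs(g()) for _ in range(width)]          # hidden -> hidden, >= 0
               for _ in range(width)],
        "A1": [[g(), g()] for _ in range(width)],        # input passthrough, any sign
        "b1": [g() for _ in range(width)],
        "w2": [abs(g()) for _ in range(width)],          # hidden -> output, >= 0
    }

def icnn(p, x):
    # Layer 1: Softplus of an affine map of x, hence convex in x.
    z1 = [softplus(a[0] * x[0] + a[1] * x[1] + b)
          for a, b in zip(p["A0"], p["b0"])]
    # Layer 2: non-negative mix of convex z1 plus affine term, then Softplus.
    z2 = [softplus(sum(w * z for w, z in zip(row, z1))
                   + a[0] * x[0] + a[1] * x[1] + b)
          for row, a, b in zip(p["W1"], p["A1"], p["b1"])]
    # Output: non-negative combination of convex functions.
    return sum(w * z for w, z in zip(p["w2"], z2))

def min_midpoint_gap(p, trials=200, seed=1):
    # Smallest value of (f(a) + f(b)) / 2 - f((a + b) / 2) over random pairs.
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        a = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        b = (rng.uniform(-2, 2), rng.uniform(-2, 2))
        m = (0.5 * (a[0] + b[0]), 0.5 * (a[1] + b[1]))
        worst = min(worst, 0.5 * (icnn(p, a) + icnn(p, b)) - icnn(p, m))
    return worst

params = make_icnn()
worst_gap = min_midpoint_gap(params)   # non-negative up to rounding
```

The midpoint test samples convexity rather than proving it; the guarantee itself comes from the sign constraints and the choice of activation.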

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

[1] [1]

A. D. Aleksandrov. Uniqueness theorems for surfaces in the large. i, ii.Am. Math. Soc. Transl. Ser. 2, 21:341–388, 1962. doi: 10.1090/trans2/21

work page doi:10.1090/trans2/21 1962

[2] [2]

B. Amos, L. Xu, and J. Z. Kolter. Input convex neural networks. InICML, pages 146–155, 2017. URLhttps://arxiv.org/abs/1609.07152

work page internal anchor Pith review Pith/arXiv arXiv 2017

[3] [3]

M. M. S. Andreassen, P. E. Goa, T. E. Sjøbakk, and et al. Semi-automatic segmenta- tion from intrinsically-registered 18f-fdg–pet/mri for treatment response assessment in a breast cancer cohort.Magn. Reson. Mater. Phys. Biol. Med., 33:317–328, 2020. doi: 10.1007/s10334-020-00839-9

work page doi:10.1007/s10334-020-00839-9 2020

[4] [4]

B. B. Avants, C. L. Epstein, M. Grossman, and J. C. Gee. Symmetric diffeomorphic image registration with cross-correlation.Med. Image Anal., 12:26–41, 2008. doi: 10. 1016/j.media.2007.06.004. 32

work page 2008

[5] [5]

E. J. Bacon, C. Jin, D. He, S. Hu, L. Wang, H. Li, and S. Qi. Epileptogenic zone localization in refractory epilepsy by fdg-pet.Front. Neurol., 12:724680, 2021. doi: 10.3389/fneur.2021.724680

work page doi:10.3389/fneur.2021.724680 2021

[6] [6]

J. D. Benamou and Y. Brenier. A computational fluid mechanics solution to the monge- kantorovich mass transfer problem.Numer. Math., 84:375–393, 2003. doi: 10.1007/ s00211-002-0421-z

work page 2003

[7] [7]

L. Bottou. Stochastic gradient descent tricks. InNeural Networks: Tricks of the Trade, pages 421–436. Springer, 2012. doi: 10.1007/978-3-642-35289-8 25

work page doi:10.1007/978-3-642-35289-8 2012

[8] [8]

Boyd and L

S. Boyd and L. Vandenberghe.Convex Optimization. Cambridge Univ. Press, 2004. doi: 10.1017/CBO9780511804441

work page doi:10.1017/cbo9780511804441 2004

[9] [9]

Broggi, E

S. Broggi, E. Scalco, M. L. Belli, and et al. A comparative evaluation of 3 different free- form deformable image registration methods.Technol. Cancer Res. Treat., 16:220–229,

work page

[10] [10]

doi: 10.1177/1533034617703760

work page doi:10.1177/1533034617703760

[11] [11]

B¨ ohmer

K. B¨ ohmer. On finite element methods for fully nonlinear elliptic equations of second order.SIAM J. Numer. Anal., 46:1212–1249, 2008. doi: 10.1137/070686353

work page doi:10.1137/070686353 2008

[12] [12]

L. A. Caffarelli, L. Nirenberg, and J. Spruck. The Dirichlet problem for nonlinear second-order elliptic equations I. Monge-Ampère equation. Commun. Pure Appl. Math., 37:369–402, 1984. doi: 10.1002/cpa.3160370306

[13] [13]

K. Cao, X. Ding, J. Zhao, and X. Feng. Self-learning multi-head weight and enhanced physics-informed residual connection neural networks. Physics of Fluids, 37(4):046121. doi: 10.1063/5.0260860

[15] [15]

W. Chen, A. Howard, and P. Stinis. Self-adaptive weights based on balanced residual decay rate for PINNs. J. Comput. Phys., 542:114226, 2025. doi: 10.1016/j.jcp.2025.114226

[16] [16]

Y. Chen, Y. Shi, and B. Zhang. Optimal control via neural networks: A convex approach. In ICLR, 2019. URL https://arxiv.org/abs/1810.04337

[17] [17]

E. J. Dean and R. Glowinski. Numerical methods for fully nonlinear elliptic equations of the Monge-Ampère type. Comput. Methods Appl. Mech. Eng., 195:1344–1386, 2006. doi: 10.1016/j.cma.2005.04.017

[18] [18]

X. Ding, K. Cao, J. Zhao, and X. Feng. Enhanced architecture with adaptive sampling method for solving elliptic partial differential equations. Physics of Fluids, 37(7):077170. doi: 10.1063/5.0274928

[20] [20]

D. Dung and V. K. Nguyen. Deep ReLU neural networks in high-dimensional approximation. Neural Netw., 142:619–635, 2021. doi: 10.1016/j.neunet.2021.06.015

[21] [22]

R. Franzen. Kodak lossless true color image suite. https://r0k.us/graphics/kodak/index.html, 2024. Accessed May 13, 2026

[22] [23]

R. Hacking et al. A neural network approach for solving the Monge–Ampère equation with transport boundary condition. J. Comput. Math. Data Sci., 15:100119, 2025. doi: 10.1016/j.jcmds.2025.100119

[23] [24]

S. Haker, A. Tannenbaum, and R. Kikinis. Mass preserving mappings and image registration. In MICCAI, pages 120–127, 2001. doi: 10.1007/3-540-45468-3_15

[24] [25]

C. W. Huang, R. T. Q. Chen, C. Tsirigotis, and A. C. Courville. Convex potential flows: Universal probability distributions with optimal transport. In ICLR, 2021. URL https://arxiv.org/abs/2012.05932

[25] [26]

A. Jacot, F. Gabriel, and C. Hongler. Neural tangent kernel: Convergence and generalization in neural networks. In NeurIPS, pages 8571–8580, 2018. URL https://arxiv.org/abs/1806.07572

[26] [27]

K. A. Johnson and J. A. Becker. Neuroimaging primer: Introduction to neuroimaging. https://www.med.harvard.edu/aanlib/home.html, 2024. Accessed May 13, 2026


[27] [28]

J. H. Jung, Y. Choi, and K. C. Im. PET/MRI: Technical challenges and recent advances. Nucl. Med. Mol. Imaging, 50:3–12, 2016. doi: 10.1007/s13139-015-0368-9

[28] [29]

S. J. Kiebel, J. Ashburner, J. B. Poline, and K. J. Friston. MRI and PET coregistration. NeuroImage, 5:271–279, 1997. doi: 10.1006/nimg.1997.0262

[29] [30]

T. Liu, Y. Wang, W. Yao, X. Feng, and J. Liu. A POD-driven deep learning prediction model for supersonic combustion. Aerospace Science and Technology, 175:112005, 2026. doi: 10.1016/j.ast.2026.112005

[30] [31]

Z. Long, Y. Lu, B. Dong, et al. PDE-Net 2.0: Learning PDEs from data with a numeric-symbolic hybrid deep network. J. Comput. Phys., 399:108925, 2019. doi: 10.1016/j.jcp.2019.108925

[31] [32]

J. Lu, Z. Shen, H. Yang, and S. Zhang. Deep network approximation for smooth functions. SIAM J. Math. Anal., 53:5465–5506, 2021. doi: 10.1137/20M1357215

[32] [33]

P. Arratia López, H. Mella, S. Uribe, D. E. Hurtado, and F. Sahli Costabal. WarpPINN: Cine-MR image registration with physics-informed neural networks. Med. Image Anal., 89:102925, 2023. doi: 10.1016/j.media.2023.102925

[33] [34]

S. N. Maqbool, F. Ali, X. Feng, M. Usman, and M. Islam. PyTorch-based deep neural network model for the calendering process of non-Newtonian fluids with temperature-dependent viscosity. Heat Transfer, 55(1):574–617, 2026. doi: 10.1002/htj.70095

[34] [35]

Z. Min, Z. M. C. Baum, S. U. Saeed, M. Emberton, D. C. Barratt, Z. A. Taylor, and Y. Hu. Biomechanics-informed non-rigid medical image registration. In MICCAI, pages 564–574, 2024. doi: 10.1007/978-3-031-72069-7_55

[35] [36]

K. Nyström and M. Vestberg. Solving the Dirichlet problem for the Monge–Ampère equation using neural networks. J. Comput. Math. Data Sci., 8, 2023. URL https://arxiv.org/abs/2211.04218

[36] [37]

P. Perona and J. Malik. Scale-space and edge detection using anisotropic diffusion. IEEE Trans. Pattern Anal. Mach. Intell., 12:629–639, 1990. doi: 10.1109/34.57686

[37] [38]

A. V. Pogorelov. Monge-Ampère equations of elliptic type. P. Noordhoff, 1964. doi: 10.1007/978-94-011-8034-1

[38] [39]

N. Rahaman, A. Baratin, D. Arpit, et al. On the spectral bias of neural networks. In ICML, pages 5301–5310, 2019. URL https://arxiv.org/abs/1905.08573

[39] [40]

M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving PDEs. J. Comput. Phys., 378:686–707, 2019. doi: 10.1016/j.jcp.2018.10.045

[40] [41]

L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D, 60:259–268, 1992. doi: 10.1016/0167-2789(92)90242-F

[41] [42]

O. Savin. The obstacle problem for Monge-Ampère equation. Calc. Var. Partial Differ. Equ., 22:303–320, 2005. doi: 10.1007/s00526-004-0289-z

[42] [43]

A. Sotiras, C. Davatzikos, and N. Paragios. Deformable medical image registration: A survey. IEEE Trans. Med. Imaging, 32(7):1153–1190, 2013. doi: 10.1109/TMI.2013.2256013

[43] [44]

J. I. E. Urbas. The generalized Dirichlet problem for equations of Monge-Ampère type. Ann. Inst. H. Poincaré Anal. Non Linéaire, 3:209–228, 1986. doi: 10.1016/S0294-1449(86)80014-5

[44] [45]

V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, 1995. doi: 10.1007/978-1-4757-3264-1

[45] [46]

A. Vaswani, N. Shazeer, N. Parmar, et al. Attention is all you need. In NeurIPS, pages 5998–6008, 2017. URL https://arxiv.org/abs/1706.03762

[46] [47]

C. Villani. Optimal Transport: Old and New. Springer, 2009. doi: 10.1007/978-3-540-71050-9

[47] [48]

S. Wang, X. Yu, and P. Perdikaris. When and why PINNs fail to train: A neural tangent kernel perspective. J. Comput. Phys., 449:110768, 2022. doi: 10.1016/j.jcp.2021.110768

[48] [49]

D. Yarotsky. Error bounds for approximations with deep ReLU networks. Neural Netw., 94:103–114, 2017. doi: 10.1016/j.neunet.2017.07.002

[49] [50]

Z. Zhao, X. Ding, and B. A. Prakash. PINNsFormer: A transformer-based framework for physics-informed neural networks. In ICLR, 2024. URL https://arxiv.org/abs/2307.11833

[50] [51]

X. P. Zong, H. B. Zhang, L. Hao, et al. Improved ant colony algorithm for prostate DWI registration. In Adv. Mater. Res., pages 530–534, 2014. doi: 10.4028/www.scientific.net/AMR.1049-1050.530