
arxiv: 2605.24876 · v1 · pith:TMYMXTSUnew · submitted 2026-05-24 · 🧮 math.NA · cs.LG · cs.NA

IV-Net: A neural network for elliptic PDEs with random and highly varying coefficients

Pith reviewed 2026-06-29 23:51 UTC · model grok-4.3

classification 🧮 math.NA · cs.LG · cs.NA
keywords neural operators · elliptic PDEs · high-contrast coefficients · multigrid methods · convolutional layers · uncertainty quantification · inverse problems

The pith

A neural network structured like a multigrid V-cycle solves linear elliptic PDEs with high-contrast random coefficients by mapping inputs directly to solution fields.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents IV-Net as a neural operator that approximates solutions to linear elliptic partial differential equations whose coefficients vary sharply in space. The architecture copies the structure of a V-cycle multigrid iteration but replaces the fixed smoothing and restriction steps with trainable convolutional layers that operate in the physical domain. For coercive problems whose coefficients are highly heterogeneous, the network produces more accurate fields than proper orthogonal decomposition or several other neural operator models. The authors also examine how the approximation error behaves with mesh size and data volume and illustrate use in uncertainty quantification and inverse problems.

Core claim

IV-Net maps spatially varying coefficients and a right-hand side to the solution field by implementing an iterated V-cycle whose smoothing, restriction, and prolongation operators are convolutional layers defined on the physical mesh. For coercive elliptic problems with high-contrast coefficients, this learned iteration yields lower error than POD-based reduced models and several existing neural operators; on low-frequency Helmholtz problems with smooth coefficients, its performance is comparable to that of a Fourier neural operator.

What carries the argument

The Iterated V-shaped Net (IV-Net), an architecture that embeds a fixed V-cycle multigrid template inside trainable convolutional layers acting in the physical domain to learn the coefficient-to-solution map.
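For readers unfamiliar with the template being referred to, the following is a minimal sketch of a classical (non-learned) V-cycle for the 1D model problem -(a u')' = f with zero Dirichlet data. It is not the authors' network: in IV-Net the smoothing, restriction, and prolongation steps are described as trainable convolutional layers, whereas here they are the textbook fixed choices. The 1D setting, weighted Jacobi smoother, full-weighting restriction, linear interpolation, arithmetic coefficient averaging, and all function names are assumptions made for illustration.

```python
import math

def apply_A(a, u, h):
    """3-point stencil for -(a u')' with zero Dirichlet data.
    a holds one coefficient per cell, so len(a) == len(u) + 1."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((a[i] * (u[i] - left) + a[i + 1] * (u[i] - right)) / h**2)
    return out

def smooth(a, u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps (illustrative fixed smoother)."""
    for _ in range(sweeps):
        Au = apply_A(a, u, h)
        u = [u[i] + omega * (f[i] - Au[i]) * h**2 / (a[i] + a[i + 1])
             for i in range(len(u))]
    return u

def restrict(r):
    """Full weighting; coarse point j sits at fine index 2j + 1."""
    return [0.25 * r[2 * j] + 0.5 * r[2 * j + 1] + 0.25 * r[2 * j + 2]
            for j in range((len(r) - 1) // 2)]

def prolong(e, n_fine):
    """Linear interpolation from the coarse grid to the fine grid."""
    out = [0.0] * n_fine
    for j, v in enumerate(e):
        out[2 * j] += 0.5 * v
        out[2 * j + 1] += v
        out[2 * j + 2] += 0.5 * v
    return out

def coarsen(a):
    """Arithmetic mean of the two fine cells inside each coarse cell (assumption)."""
    return [0.5 * (a[2 * j] + a[2 * j + 1]) for j in range(len(a) // 2)]

def v_cycle(a, u, f, h):
    """One V-cycle: pre-smooth, coarse-grid correction, post-smooth."""
    n = len(u)
    if n == 1:  # coarsest level: solve the single equation exactly
        return [f[0] * h**2 / (a[0] + a[1])]
    u = smooth(a, u, f, h)
    Au = apply_A(a, u, h)
    residual = [f[i] - Au[i] for i in range(n)]
    coarse_rhs = restrict(residual)
    coarse_err = v_cycle(coarsen(a), [0.0] * len(coarse_rhs), coarse_rhs, 2 * h)
    correction = prolong(coarse_err, n)
    u = [u[i] + correction[i] for i in range(n)]
    return smooth(a, u, f, h)

def residual_norm(a, u, f, h):
    Au = apply_A(a, u, h)
    return math.sqrt(sum((f[i] - Au[i]) ** 2 for i in range(len(u))))

# Example: 63 interior points (2**6 - 1) and a smooth coefficient.
n = 63
h = 1.0 / (n + 1)
a = [1.0 + 0.5 * math.sin(2 * math.pi * (i + 0.5) * h) for i in range(n + 1)]
f = [1.0] * n
u = [0.0] * n
for _ in range(10):
    u = v_cycle(a, u, f, h)
```

Repeating `v_cycle` is the "iterated" part of the template; a learned variant would keep this control flow and swap each fixed operator for a trained one.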

If this is right

  • The learned operator produces lower pointwise and energy-norm errors than POD or other neural operators on coercive high-contrast test problems.
  • Error and convergence rates can be bounded in terms of the underlying discretization mesh and the number of training samples.
  • The same architecture yields useful predictions for quantities of interest in uncertainty quantification and inverse problems without retraining.
  • Performance on low-frequency oscillatory Helmholtz problems with smooth coefficients matches that of a Fourier neural operator.
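The error measures named in the list above can be made concrete with a short sketch. The definitions below (maximum pointwise error, relative discrete L2 error, and a relative discrete energy-norm error for a 1D grid with zero Dirichlet data) are common choices and are assumptions here; the paper's exact metric definitions are not available on this page.

```python
import math

def max_pointwise_error(u_pred, u_ref):
    """Largest absolute nodal difference."""
    return max(abs(p - r) for p, r in zip(u_pred, u_ref))

def relative_l2_error(u_pred, u_ref):
    """Discrete L2 norm of the error divided by that of the reference."""
    num = math.sqrt(sum((p - r) ** 2 for p, r in zip(u_pred, u_ref)))
    den = math.sqrt(sum(r ** 2 for r in u_ref))
    return num / den

def energy_norm(a, u, h):
    """sqrt(sum over cells of a_cell * (du/dx)^2 * h), zero boundary values.
    a holds one coefficient per cell, so len(a) == len(u) + 1."""
    padded = [0.0] + list(u) + [0.0]
    return math.sqrt(sum(a[i] * ((padded[i + 1] - padded[i]) / h) ** 2 * h
                         for i in range(len(a))))

def relative_energy_error(a, u_pred, u_ref, h):
    diff = [p - r for p, r in zip(u_pred, u_ref)]
    return energy_norm(a, diff, h) / energy_norm(a, u_ref, h)
```

The energy norm weights gradient errors by the coefficient, so for high-contrast fields it can rank two surrogates differently than the plain L2 error does.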

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the convolutional layers truly inherit the contraction properties of multigrid smoothing, the same template could be applied to time-dependent or nonlinear problems whose linearizations admit similar V-cycle structure.
  • Because the layers act directly on the physical mesh, the network may transfer across different discretizations more readily than methods that require a fixed spectral basis.
  • Data efficiency claims suggest that modest numbers of high-fidelity solves could suffice to train the network for families of coefficients drawn from the same statistical distribution.

Load-bearing premise

A fixed V-cycle multigrid template, once its components are replaced by trainable convolutions, remains stable and accurate for arbitrary high-contrast coefficient fields without any change in iteration count or relaxation parameters.

What would settle it

Numerical experiments on a sequence of coefficient realizations whose contrast ratio exceeds the training range, showing that the network error fails to decrease with additional layers or data while a standard multigrid solver still converges.
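A sketch of how the test inputs for such an experiment might be assembled is given below. The two-phase piecewise-constant field, the block size, and the helper names are hypothetical and are not taken from the paper; the point is only to show how a contrast ratio can be computed and compared against the range seen in training.

```python
import random

def two_phase_coefficient(n_cells, contrast, block=4, seed=0):
    """Piecewise-constant field taking the values 1.0 and `contrast`
    on random blocks of `block` cells (illustrative ensemble)."""
    rng = random.Random(seed)
    a = []
    while len(a) < n_cells:
        value = contrast if rng.random() < 0.5 else 1.0
        a.extend([value] * block)
    return a[:n_cells]

def contrast_ratio(a):
    """Ratio of the largest to the smallest coefficient value."""
    return max(a) / min(a)

def out_of_range(a, trained_max_contrast):
    """True when the field's contrast exceeds the training range."""
    return contrast_ratio(a) > trained_max_contrast

# Sweep contrasts past a hypothetical training maximum of 1e3.
trained_max = 1e3
fields = [two_phase_coefficient(64, c, seed=k)
          for k, c in enumerate([1e1, 1e2, 1e3, 1e4, 1e5])]
held_out = [a for a in fields if out_of_range(a, trained_max)]
```

Each field in `held_out` would then be passed both to the trained network and to a reference multigrid solver, and the two errors compared as the contrast grows.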

Figures

Figures reproduced from arXiv: 2605.24876 by George Biros, Shan Zhong.

Figure 2: Example coefficients and solutions for Poisson, Helmholtz, and Darcy problem.
Figure 3: The architecture of a block completing one iterative update. Firstly, concatenate […]
Figure 4: Convergence plot and error distribution for the Poisson prob […]
Figure 4: Test errors for the high contrast Poisson problem ( […]
Figure 5: Explored structural variants of a single block. (a) No residual connection; (b) […]
Original abstract

We introduce a novel neural operator architecture designed to approximate solutions of linear elliptic partial differential equations with high-contrast, spatially varying coefficients. The network, termed the Iterated V-shaped Net (IV-Net), realizes a mapping from the input coefficients and right-hand side to the corresponding solution field. The architecture of IV-Net is informed by, and closely resembles, a V-cycle multigrid solver. The IV-Net model is parameterized via convolutional layers defined in the physical domain. For coercive problems with highly heterogeneous coefficients, the proposed network exhibits superior performance relative to a proper orthogonal decomposition (POD) approach and several existing neural operator architectures. For low-frequency oscillatory Helmholtz problems with smooth coefficients, its performance is similar to that of a Fourier neural operator. We analyze the approximation error and convergence behavior of IV-Net, its data efficiency, and its dependence on the underlying discretization mesh. Furthermore, we demonstrate the practical effectiveness of the architecture through a series of numerical experiments, including applications to uncertainty quantification, inverse problems, and prediction of quantities of interest.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces IV-Net, a neural operator architecture modeled on the V-cycle multigrid method for approximating solutions of linear elliptic PDEs with high-contrast, spatially varying coefficients. The network maps coefficients and right-hand side to the solution via trainable convolutional layers defined in the physical domain. It claims superior performance relative to POD and existing neural operators on coercive high-contrast problems (with similar performance to FNO on low-frequency Helmholtz), supported by analyses of approximation error, convergence, data efficiency, and mesh dependence, plus numerical experiments on uncertainty quantification, inverse problems, and quantities of interest.

Significance. If the empirical superiority and supporting analyses hold under the stated conditions, the work provides a concrete way to embed classical multigrid structure into neural operators, potentially improving robustness and data efficiency for heterogeneous-coefficient elliptic problems. The explicit study of mesh dependence and the demonstration on downstream tasks (UQ, inversion) are positive features that could influence subsequent operator-learning research.

major comments (2)
  1. [Analysis of approximation error and convergence behavior] The strongest claim (superior performance on arbitrary high-contrast coercive problems) rests on the assumption that a fixed V-cycle template with trainable convolutional layers remains stable and accurate without per-realization tuning of iteration count or relaxation parameters. The analysis of approximation error does not appear to supply a worst-case bound that survives changes in contrast ratio, correlation length, or spatial arrangement outside the training ensemble; this leaves open whether observed gains are general or ensemble-specific.
  2. [Numerical experiments section] The manuscript states that numerical experiments support the performance claims, yet the description of data splits, training/validation protocols, and the precise range of contrast ratios tested is not sufficiently detailed to allow independent verification that the superiority versus POD and other operators is not an artifact of the chosen ensemble.
minor comments (2)
  1. [Architecture description] Notation for the convolutional layers and their relation to the classical multigrid restriction/prolongation operators should be made fully explicit, including any assumptions on the underlying mesh.
  2. [Numerical experiments] A short table summarizing the contrast ratios, mesh sizes, and number of training samples across all reported experiments would improve readability and allow direct comparison with the cited baselines.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment below and indicate the revisions planned for the manuscript.

  1. Referee: [Analysis of approximation error and convergence behavior] The strongest claim (superior performance on arbitrary high-contrast coercive problems) rests on the assumption that a fixed V-cycle template with trainable convolutional layers remains stable and accurate without per-realization tuning of iteration count or relaxation parameters. The analysis of approximation error does not appear to supply a worst-case bound that survives changes in contrast ratio, correlation length, or spatial arrangement outside the training ensemble; this leaves open whether observed gains are general or ensemble-specific.

    Authors: We acknowledge that the approximation error analysis is derived under the distribution of the training ensemble and does not furnish a worst-case bound that is independent of contrast ratio, correlation length, or spatial arrangement outside that ensemble. The performance claims are therefore scoped to problems statistically similar to the training data, with empirical support across the tested range of contrasts. We will revise the manuscript to state this scope explicitly, to clarify that no per-realization tuning is performed, and to discuss the distinction between ensemble-specific gains and fully general guarantees. revision: yes

  2. Referee: [Numerical experiments section] The manuscript states that numerical experiments support the performance claims, yet the description of data splits, training/validation protocols, and the precise range of contrast ratios tested is not sufficiently detailed to allow independent verification that the superiority versus POD and other operators is not an artifact of the chosen ensemble.

    Authors: We agree that the experimental protocol requires additional detail for independent verification. In the revised version we will expand the numerical experiments section to specify the data-generation procedure, the exact training/validation/test splits, the full training protocol (including optimizer, learning-rate schedule, and stopping criteria), and the precise ranges of contrast ratios and correlation lengths used in all reported comparisons. revision: yes
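One ingredient of such a protocol, a seeded and therefore reproducible train/validation/test split, can be sketched as follows. The 80/10/10 fractions, the seed, and the function name are illustrative assumptions; the manuscript's actual splits are not stated on this page.

```python
import random

def split_indices(n_samples, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle sample indices with a fixed seed and cut them into
    disjoint train / validation / test index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n_samples)
    n_val = int(fractions[1] * n_samples)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_indices(100)
```

Reporting the seed and fractions alongside the contrast ranges lets a reader regenerate exactly the same partition of the ensemble.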

Circularity Check

0 steps flagged

No circularity: architecture and empirical validation are self-contained


The paper introduces IV-Net as a convolutional parameterization of a V-cycle multigrid template and reports empirical performance against external baselines (POD and published neural operators) plus mesh-dependence and approximation-error analysis. No load-bearing claim reduces by construction to a fitted parameter renamed as prediction, a self-citation chain, or an ansatz smuggled from prior author work; the central mapping is learned from data and the comparisons are to independent methods. The derivation chain therefore remains non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only the abstract is available; therefore the ledger records only the minimal background assumptions visible in the abstract. No free parameters, invented entities, or ad-hoc axioms are stated explicitly.

axioms (1)
  • Domain assumption: the linear elliptic PDE is coercive for the coefficient fields considered.
    Required for the well-posedness of the problems on which superiority is claimed.
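
For orientation, the coercivity assumption can be written out for a scalar diffusion problem. The specific form below (divergence-form operator, homogeneous Dirichlet data, bounds a_min and a_max, contrast κ) is an assumption chosen for illustration, since only the abstract is available here and it does not state the equations.

```latex
% Model problem (assumed form): find u with
%   -\nabla\cdot(a\nabla u) = f in \Omega,   u = 0 on \partial\Omega.
\[
  B(u,v) = \int_\Omega a(x)\,\nabla u\cdot\nabla v\,dx ,
  \qquad
  0 < a_{\min} \le a(x) \le a_{\max} \ \text{ for a.e. } x\in\Omega .
\]
% Coercivity and continuity on H_0^1(\Omega):
\[
  B(v,v) \ge a_{\min}\,\|\nabla v\|_{L^2(\Omega)}^2 ,
  \qquad
  |B(u,v)| \le a_{\max}\,\|\nabla u\|_{L^2(\Omega)}\,\|\nabla v\|_{L^2(\Omega)} .
\]
% Contrast ratio of the coefficient field:
\[
  \kappa = \frac{a_{\max}}{a_{\min}} .
\]
```

Under these bounds the Lax–Milgram theorem gives a unique weak solution, and the ratio of the continuity constant to the coercivity constant is exactly κ, which is why high contrast makes the problem harder for both classical solvers and learned surrogates.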


