Frequency-adaptive tensor neural networks for high-dimensional multi-scale problems

Jizu Huang; Rukang You; Yue Qiu

arXiv: 2508.15198 · v2 · pith:NQTYTVFUnew · submitted 2025-08-21 · cs.LG · math-ph · math.MP

Pith reviewed 2026-05-18 21:33 UTC · model grok-4.3

keywords: tensor neural networks, frequency principle, high-dimensional problems, multi-scale problems, random Fourier features, discrete Fourier transform, curse of dimensionality

The pith

Tensor neural networks capture high-frequency features in high-dimensional multi-scale problems by applying discrete Fourier transforms to their one-dimensional component functions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Tensor neural networks already handle high-dimensional problems well through their tensor structure, but like other networks they follow the frequency principle and therefore miss high-frequency details in multi-scale solutions. The authors incorporate random Fourier features to boost expressivity and then extract frequency information by running the discrete Fourier transform only on the one-dimensional factors that make up each tensor. Because the transform acts separately on these lower-dimensional pieces, the method sidesteps the exponential cost that would come from analyzing the full high-dimensional function at once. Numerical tests confirm that the resulting frequency-adaptive algorithm solves complex multi-scale problems more accurately than standard tensor networks.

Core claim

Performing the Discrete Fourier Transform on the one-dimensional component functions of a tensor neural network extracts the frequency content of the corresponding high-dimensional function without incurring the curse of dimensionality; combining this extraction with random Fourier features produces a frequency-adaptive TNN whose training dynamics can represent both low- and high-frequency structures, thereby enabling accurate solutions to high-dimensional multi-scale problems.
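The factorization behind this claim can be checked numerically. The sketch below is a minimal illustration, not the authors' implementation: it assumes a function already given in CP form on a periodic grid, with hypothetical values for the dimension, rank, and grid size, and with randomly chosen trigonometric component functions. It compares the full d-dimensional DFT with the spectrum assembled from 1D DFTs of the components.

```python
import numpy as np

rng = np.random.default_rng(0)
d, rank, n = 3, 2, 16                      # illustrative values, not from the paper
x = 2 * np.pi * np.arange(n) / n           # uniform periodic grid on [0, 2*pi)

# Hypothetical 1D component functions g_{r,i}(x): short random sine sums.
factors = np.empty((rank, d, n))
for r in range(rank):
    for i in range(d):
        freqs = rng.integers(1, 6, size=2)
        coeffs = rng.normal(size=2)
        factors[r, i] = sum(c * np.sin(k * x + 0.3 * k) for k, c in zip(freqs, coeffs))

# Full tensor f(x1, x2, x3) = sum_r prod_i g_{r,i}(x_i): needs n**d samples.
full = sum(np.einsum("a,b,c->abc", *factors[r]) for r in range(rank))
full_hat = np.fft.fftn(full)

# Component route: only rank * d transforms of length n.
factor_hat = np.fft.fft(factors, axis=-1)
cp_hat = sum(np.einsum("a,b,c->abc", *factor_hat[r]) for r in range(rank))

# The two spectra agree up to floating-point error.
spectra_match = np.allclose(full_hat, cp_hat)
```

The full transform is formed here only to verify the identity. In a high-dimensional setting one would work with `factor_hat` alone and never build the dense tensor.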

What carries the argument

Frequency-adaptive TNN algorithm that augments standard tensor networks with random Fourier features and extracts frequency information via discrete Fourier transforms applied separately to each one-dimensional component function.
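For readers unfamiliar with random Fourier features, the following is a minimal sketch of the common form used in the literature (as in Tancik et al. [36]), applied to a single coordinate before it enters a one-dimensional sub-network. The paper's own transformation (3.1) is not reproduced in this summary, so the function name `fourier_features`, the scale `sigma`, and the number of frequencies are assumptions made for illustration.

```python
import numpy as np

def fourier_features(x, freqs):
    """Map 1D samples x of shape (n,) to features of shape (n, 2 * len(freqs))."""
    phase = 2.0 * np.pi * np.outer(x, freqs)
    return np.concatenate([np.cos(phase), np.sin(phase)], axis=1)

rng = np.random.default_rng(0)
sigma = 5.0                                  # hypothetical frequency scale
freqs = rng.normal(0.0, sigma, size=8)       # hypothetical: 8 random frequencies
x = np.linspace(0.0, 1.0, 32)                # samples of one input coordinate
features = fourier_features(x, freqs)        # input to a 1D sub-network
```

A frequency-adaptive variant would replace the random draw of `freqs` with frequencies selected from a previous approximation, which is the role the summary attributes to the DFT step.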

If this is right

TNNs can now represent solutions containing both smooth and oscillatory components in dimensions where direct high-dimensional Fourier analysis is infeasible.
Training dynamics of tensor networks can be steered toward high-frequency modes without changing the underlying tensor decomposition.
The same one-dimensional transform strategy can be reused inside other tensor-based solvers to add frequency awareness at modest extra cost.
Robustness across different multi-scale regimes follows from the separation of frequency extraction from the high-dimensional representation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approach may generalize to other tensor decompositions or to physics-informed networks that already use Fourier features.
It suggests a broader principle: frequency adaptation can be factored along the same low-rank structure used for the spatial representation itself.
One could test whether the extracted frequencies can be updated dynamically during training rather than computed once in advance.
Similar component-wise transforms might help other neural architectures that suffer from the frequency principle in high dimensions.

Load-bearing premise

That the frequency features of a high-dimensional function can be recovered accurately by transforming only its one-dimensional tensor factors rather than the full function.

What would settle it

A controlled numerical test on a known high-dimensional multi-scale function in which the frequency-adaptive TNN produces larger errors than a standard TNN or fails to recover the high-frequency components identified by full-dimensional Fourier analysis.

Figures

Figures reproduced from arXiv: 2508.15198 by Jizu Huang, Rukang You, Yue Qiu.

**Figure 1.** … each component function of these tensor decompositions is approximated by an individual neural network. Moreover, we illustrate how these decompositions can be seamlessly integrated into the PINNs framework [32], enabling efficient approximation of solutions to high-dimensional PDEs. Panels: (a) CP-PINNs, (b) TT-PINNs.

**Figure 2.** The frequency features of the fitting function (2.3) and the convergence of TNNs to their frequency features. We consider the problem of fitting the following two-dimensional function: $f(x, y) = \sum_{i=1}^{3} \left[ \sin(k_i x) + \sin(k_i y) \right]$ (2.3), where $x, y \in [0, 2\pi]$, and $k_1 = 2$, $k_2 = 4$, and $k_3 = 6$ represent three distinct frequency components in both the x- and y-directions. The Fourier coefficients of $f(x, y)$ are illustra…

**Figure 3.** The variables $x_i$ in each dimension are transformed using equation (3.1) before being fed into the corresponding sub-network. The choice between CP-PINNs and TT-PINNs depends on the distinct output structure and computational processing of each subnetwork, which will not be elaborated upon here. Next, we present a simple numerical example to demonstrate that the proposed transformation can partially allev…

**Figure 4.** The relative $L^2$ errors for CP-PINNs, TT-PINNs, CP-PINNs-FF, and TT-PINNs-FF with different values of k. Section 4 (Frequency-adaptive Tensor Neural Networks): It is straightforward to adopt the frequency-adaptive MscaleDNNs proposed in [38] to determine the frequency features for each input dimension $x_i$ within CP-PINNs or TT-PINNs. However, this method relies on the DFT for frequency extraction, which becomes ineffi…

**Figure 5.** The relative $L^2$ errors for CP-PINNs-FF, TT-PINNs-FF, and the frequency-adaptive TNNs algorithm at It = 1 based on CP decomposition (denoted CP-1st) and on TT decomposition (denoted TT-1st). Section 5 (Numerical Examples): In this section, we evaluate the performance of the newly proposed frequency-adaptive TNNs in solving high-dimensional PDEs with multi-scale features. The network weights are initialized using…

**Figure 6.** The distributions of $\langle |\hat{u}^{It}_{1,k}| \rangle_\alpha$ for Poisson equation (3.2) with d = 3. In this test case, the parameter M in equation (4.5) is set to 10, representing the number of selected frequencies. Using Algorithm 2, we construct the frequency feature set $B_1$, which is then used to initialize the updated network $u^1_{\mathrm{net}}(x; \theta)$. After retraining, we obtain the refined approximation $u^1_{\mathrm{net}}(x; \theta^*_1)$. The incorpora…

**Figure 7.** The distributions of $\langle |\hat{u}^{It}_{1,k}| \rangle_\alpha$ for Poisson equation (3.2) with d = 12. Next, we consider a more challenging example involving higher dimensionality and more intricate frequency features. Specifically, we set d = 12, $\Omega = (0, 1)^{12}$ and choose appropriate functions $f(x)$ and $g(x)$ in the Poisson equation (3.2) such that the exact solution is given by $u_{\mathrm{exact}}(x) = \sum_{i=1}^{12} \sin(2k_1\pi x_i) + \sin(2k_2\pi x_i) + 0.1\,$s…

**Figure 8.** Point-wise errors for Poisson equation (3.2) with d = 12. Using Algorithm 2, we construct the frequency feature set $B_1$, which is then used to initialize the updated network $u^1_{\mathrm{net}}(x; \theta)$. After retraining, we obtain the approximation $u^1_{\mathrm{net}}(x; \theta^*_1)$ in the first adaptive step. For CP-PINNs, since the dominant frequencies are selected in $B_1$, the relative $L^2$ error is reduced to 2.088e-04, substantially lower th…

**Figure 9.** Point-wise errors for heat equation (5.1). Following the approach proposed in [50], the initial condition is directly embedded into the neural network architecture, eliminating the need for a separate loss term. This not only ensures more accurate satisfaction of the initial condition but also facilitates more efficient extraction of frequency features. In this framework, the time dimension is treated in t…

**Figure 10.** Point-wise errors for wave equation (5.2).
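The fitting function (2.3) quoted in the Figure 2 caption gives a small, concrete case of component-wise frequency extraction. It is a rank-2 CP function, $g(x)\cdot 1 + 1\cdot g(y)$ with $g(t) = \sum_i \sin(k_i t)$, so a single 1D DFT of $g$ should reveal all three frequencies. The sketch below is illustrative only; the grid size and the peak threshold are assumptions, not values from the paper.

```python
import numpy as np

n = 64                                     # hypothetical grid size per axis
grid = 2 * np.pi * np.arange(n) / n        # periodic grid on [0, 2*pi)
ks = (2, 4, 6)                             # frequencies k1, k2, k3 from (2.3)

# One-dimensional component g(t) = sum_i sin(k_i t).
g = sum(np.sin(k * grid) for k in ks)

# 1D route: one real DFT of length n, then pick the three largest modes.
magnitude = np.abs(np.fft.rfft(g)) / n
dominant = sorted(int(k) for k in np.argsort(magnitude)[-3:])

# 2D check: the full spectrum of f(x, y) = g(x) + g(y) lives on the two axes.
f = g[:, None] + g[None, :]
f_hat = np.abs(np.fft.fft2(f)) / n**2
peaks = {tuple(int(v) for v in idx) for idx in np.argwhere(f_hat > 0.1)}
```

Here `dominant` holds the three frequencies found from the 1D transform, and every entry of `peaks` has one zero index, reflecting that this particular target has no mixed modes.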

Original abstract

Tensor neural networks (TNNs) have demonstrated their superiority in solving high-dimensional problems. However, similar to conventional neural networks, TNNs are also influenced by the Frequency Principle, which limits their ability to accurately capture high-frequency features of the solution. In this work, we analyze the training dynamics of TNNs by Fourier analysis and enhance their expressivity for high-dimensional multi-scale problems by incorporating random Fourier features. Leveraging the inherent tensor structure of TNNs, we further propose a novel approach to extract frequency features of high-dimensional functions by performing the Discrete Fourier Transform to one-dimensional component functions. This strategy effectively mitigates the curse of dimensionality. Building on this idea, we propose a frequency-adaptive TNNs algorithm, which significantly improves the ability of TNNs in solving complex multi-scale problems. Extensive numerical experiments are performed to validate the effectiveness and robustness of the proposed frequency-adaptive TNNs algorithm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives tensor NNs a frequency adaptation via DFT on 1D components, but this risks missing cross-term couplings in non-separable cases.

The main thing here is that they've built a frequency-adaptive version of tensor neural networks by adding random Fourier features and then using the Discrete Fourier Transform on the one-dimensional parts of the tensor decomposition to handle high-dimensional multi-scale problems more effectively. They do a good job with the Fourier analysis of how TNNs train and why they struggle with high frequencies, and the proposal to extract features this way is a smart use of the tensor structure to avoid computational blowup. That part feels like a genuine extension of frequency-aware methods into the tensor domain, and the claim of extensive experiments suggests they've checked it on a range of cases.

The softer part is the handling of non-separable functions. If the multi-scale behavior involves frequencies that couple across dimensions, applying DFT separately to each 1D component could miss those joint effects, and the adaptation might not target the right modes. The abstract also doesn't spell out the quantitative improvements or the exact baselines, which makes it tough to assess how big the practical win is without the full details.

This kind of paper is useful for researchers in scientific machine learning who are already working with tensor networks or neural PDE solvers and want better tools for multi-scale issues. A reader looking for incremental but implementable advances in high-dimensional approximation would find value in the method and the validation experiments.

I'd push for peer review on this. The core construction is clear enough and addresses a real limitation, so referees could help sharpen the experiments and clarify any limitations around separability.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes frequency-adaptive tensor neural networks (TNNs) for high-dimensional multi-scale problems. It analyzes TNN training dynamics via Fourier analysis, augments expressivity with random Fourier features, and introduces a method to extract frequency features of high-dimensional functions by applying the Discrete Fourier Transform to the one-dimensional component functions of the tensor decomposition. This is presented as mitigating the curse of dimensionality, yielding a frequency-adaptive algorithm that significantly improves TNN performance on complex multi-scale problems, supported by extensive numerical experiments.

Significance. If the central frequency-extraction step is shown to recover relevant modes even for non-separable targets, the work could offer a practical advance for scientific machine learning applications involving high-dimensional multi-scale functions, such as PDE solvers. The exploitation of tensor structure to perform only 1D DFTs is a conceptually appealing way to sidestep full high-dimensional Fourier analysis.
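The size of the saving from using only 1D DFTs is easy to quantify with a back-of-the-envelope count. The numbers below are hypothetical (the grid size, dimension, and rank are chosen for illustration and are not taken from the paper's experiments); the sketch simply compares how many samples a full-grid transform would need against the samples needed for the one-dimensional components of a CP-format function.

```python
import math

def full_grid_points(n, d):
    """Samples needed to tabulate a d-dimensional function on an n-point-per-axis grid."""
    return n ** d

def cp_component_points(n, d, rank):
    """Samples needed to tabulate all 1D components of a rank-`rank` CP function."""
    return rank * d * n

n, d, rank = 64, 12, 10                     # hypothetical sizes
full = full_grid_points(n, d)               # 64**12 = 2**72
factored = cp_component_points(n, d, rank)  # 10 * 12 * 64

# FFT work up to constants: N log2 N on the full grid versus
# rank * d transforms of length n on the components.
full_fft_work = full * d * math.log2(n)
factored_fft_work = rank * d * n * math.log2(n)
```

With these sizes the full grid has roughly 4.7e21 points, while the component route touches 7680 samples, which is why the full-dimensional transform is infeasible and the component-wise one is cheap.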

major comments (1)

[frequency feature extraction procedure] The core technical claim (abstract and the description of the frequency-adaptive algorithm): performing the Discrete Fourier Transform on one-dimensional component functions is asserted to extract frequency features of high-dimensional functions while mitigating the curse of dimensionality. For non-separable multi-scale functions whose frequencies arise from cross-dimensional interactions (e.g., modes depending on sums or products of coordinates), the 1D marginal spectra do not contain the joint frequency information; the paper must therefore supply either a theoretical argument or a concrete numerical example showing that the adaptation mechanism still targets the correct high-frequency content.

minor comments (2)

[Abstract] The abstract states that 'extensive numerical experiments' validate the method but provides no information on test functions, baselines, error metrics, or quantitative gains; the main text should include a concise table summarizing these details for reproducibility.
[Methods] Notation for the tensor decomposition and the frequency-adaptation step could be clarified with a small illustrative example in the methods section.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. We address the major comment point by point below, providing clarification on the frequency extraction mechanism while remaining faithful to the paper's content and results.

Referee: The core technical claim (abstract and the description of the frequency-adaptive algorithm): performing the Discrete Fourier Transform on one-dimensional component functions is asserted to extract frequency features of high-dimensional functions while mitigating the curse of dimensionality. For non-separable multi-scale functions whose frequencies arise from cross-dimensional interactions (e.g., modes depending on sums or products of coordinates), the 1D marginal spectra do not contain the joint frequency information; the paper must therefore supply either a theoretical argument or a concrete numerical example showing that the adaptation mechanism still targets the correct high-frequency content.

Authors: We appreciate the referee's highlighting of this subtlety. In the tensor decomposition underlying TNNs (CP or tensor-train format), a high-dimensional function is expressed as a sum or contraction of products of one-dimensional component functions. Cross terms arising from coordinate sums or products (e.g., sin(x+y) expanding into products of trigonometric functions) are therefore encoded in the frequency content of the individual 1D components. Applying the 1D DFT to these components identifies the frequencies that, when combined through the tensor structure, produce the observed high-dimensional multi-scale behavior. This is consistent with the Fourier analysis of TNN training dynamics already developed in the manuscript. Our extensive numerical experiments (Section 5) include non-separable test problems with interacting scales, where the frequency-adaptive algorithm yields clear accuracy gains over baseline TNNs, confirming that the adaptation targets the relevant content in practice. We will insert a short clarifying paragraph in the revised version to make this tensor-structure argument explicit.

Revision: partial
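The sin(x+y) example in this exchange can be written out explicitly. The identities below are standard and are offered as one way to make the exchange concrete, not as the argument the manuscript itself gives; the symbols $g_{r,i}$ and $R$ for the CP components and rank are notation assumed here.

```latex
% A coupled mode written as a rank-2 CP sum of 1D factors:
\sin\bigl(k(x+y)\bigr) = \sin(kx)\cos(ky) + \cos(kx)\sin(ky).

% For a general CP-format function
f(x_1,\dots,x_d) = \sum_{r=1}^{R} \prod_{i=1}^{d} g_{r,i}(x_i),
% the d-dimensional Fourier coefficients factor into 1D coefficients:
\hat f(k_1,\dots,k_d) = \sum_{r=1}^{R} \prod_{i=1}^{d} \hat g_{r,i}(k_i).
```

Each 1D factor of the first identity carries the frequency $k$, so a 1D DFT of the components does see it. What the per-dimension spectra alone do not show is which combinations $(k_1,\dots,k_d)$ are active: that information sits in how the rank-$r$ terms pair the factors, which is the point the referee's comment presses on.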

Circularity Check

0 steps flagged

No significant circularity; frequency-adaptive TNNs defined via explicit construction and externally validated

The paper defines its frequency-adaptive TNNs algorithm through a sequence of explicit constructions: Fourier analysis of TNN training dynamics, incorporation of random Fourier features, and a novel extraction of frequency features via DFT applied separately to the one-dimensional component functions of the tensor decomposition. These steps are presented as a methodological proposal whose effectiveness is then checked against external numerical experiments on multi-scale problems. No derivation reduces a claimed result to a fitted parameter renamed as prediction, no self-citation chain is invoked as load-bearing justification, and the mitigation of the curse of dimensionality follows directly from the stated algorithmic choice rather than from any self-referential definition. The overall chain therefore remains self-contained with independent empirical support.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the domain assumption that TNNs are limited by the frequency principle and on the ad-hoc strategy that 1D DFT on tensor components extracts high-dimensional frequencies without dimensionality curse; no explicit free parameters or invented entities are stated in the abstract.

axioms (2)

domain assumption: Tensor neural networks are influenced by the Frequency Principle in the same way as conventional neural networks.
Directly stated as the starting limitation in the abstract.
ad hoc to paper: Discrete Fourier Transform applied to one-dimensional component functions extracts frequency features of high-dimensional functions while mitigating the curse of dimensionality.
This is the core novel strategy introduced to enable the frequency-adaptive algorithm.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Paper passage: Leveraging the inherent tensor structure of TNNs, we further propose a novel approach to extract frequency features of high-dimensional functions by performing the Discrete Fourier Transform to one-dimensional component functions. This strategy effectively mitigates the curse of dimensionality.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: We define the spectral error ... and simplify it as ... gradients ... $H_x(z)$ and $H_y(z)$ decay exponentially ...

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · 2 internal anchors

[1] A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems 25 (2012).
[2] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al., Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, IEEE Signal Processing Magazine 29 (6) (2012) 82–97.
[3] A. Vaswani, Attention is all you need, Advances in Neural Information Processing Systems (2017).
[4] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: NAACL-HLT 2019, (2019), pp. 4171–4186.
[5] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, Advances in Neural Information Processing Systems 33 (2020) 1877–1901.
[6] J. Han, A. Jentzen, et al., Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations, Communications in Mathematics and Statistics 5 (4) (2017) 349–380.
[7] B. Yu, et al., The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems, Communications in Mathematics and Statistics 6 (1) (2018) 1–12.
[8] Y. Zang, G. Bao, X. Ye, H. Zhou, Weak adversarial networks for high-dimensional partial differential equations, Journal of Computational Physics 411 (2020) 109409.
[9] T. Tran, A. Hamilton, M. Best McKay, B. Quiring, P. S. Vassilevski, DNN approximation of nonlinear finite element equations, CoRR abs/1911.05240 (2019).
[10] J. Han, A. Jentzen, W. E, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences 115 (34) (2018) 8505–8510.
[11] J. Han, L. Zhang, R. Car, et al., Deep potential: A general representation of a many-body potential energy surface, Communications in Computational Physics 23 (3) (2018) 629.
[12] J. He, L. Li, J. Xu, C. Zheng, ReLU deep neural networks and linear finite elements, Journal of Computational Mathematics 38 (3) (2020) 502–527.
[13] Y. L. Ming, et al., Deep Nitsche method: Deep Ritz method with essential boundary conditions, Communications in Computational Physics 29 (5) (2021) 1365–1384.
[14] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707.
[15] C. M. Strofer, J.-L. Wu, H. Xiao, E. Paterson, Data-driven, physics-based feature extraction from fluid flow fields using convolutional neural networks, Communications in Computational Physics 25 (3) (2019) 625–650.
[16] Z. Wang, Z. Zhang, A mesh-free method for interface problems using the deep learning approach, Journal of Computational Physics 400 (2020) 108963.
[17] N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, A. Courville, On the spectral bias of neural networks, in: International Conference on Machine Learning, PMLR, (2019), pp. 5301–5310.
[18] Z. J. Xu, Understanding training and generalization in deep learning by Fourier analysis, CoRR abs/1808.04295 (2018).
[19] Z.-Q. J. Xu, Frequency principle: Fourier analysis sheds light on deep neural networks, Communications in Computational Physics 28 (5) (2020) 1746–1767.
[20] Y. Zhang, Z.-Q. J. Xu, T. Luo, Z. Ma, Explicitizing an implicit bias of the frequency principle in two-layer neural networks, CoRR abs/1905.10264 (2019).
[21] Z.-Q. J. Xu, Y. Zhang, Y. Xiao, Training behavior of deep neural network in frequency domain, in: Neural Information Processing: 26th International Conference, ICONIP 2019, Sydney, NSW, Australia, December 12–15, 2019, Proceedings, Part I 26, Springer, (2019), pp. 264–274.
[22] W. Dahmen, R. DeVore, L. Grasedyck, E. Süli, Tensor-sparsity of solutions to high-dimensional elliptic partial differential equations, Foundations of Computational Mathematics 16 (4) (2016) 813–874.
[23] C. Schwab, C. J. Gittelson, Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs, Acta Numerica 20 (2011) 291–467.
[24] S. Zeng, Z. Zhang, Q. Zou, Adaptive deep neural networks methods for high-dimensional partial differential equations, Journal of Computational Physics 463 (2022) 111232.
[25] F. L. Hitchcock, The expression of a tensor or a polyadic as a sum of products, Journal of Mathematics and Physics 6 (1-4) (1927) 164–189.
[26] L. R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika 31 (3) (1966) 279–311.
[27] I. V. Oseledets, Tensor-train decomposition, SIAM Journal on Scientific Computing 33 (5) (2011) 2295–2317.
[28] A. Cohen, R. Devore, C. Schwab, Analytic regularity and polynomial approximation of parametric and stochastic elliptic PDE's, Analysis and Applications 9 (01) (2011) 11–47.
[29] M. Bachmayr, A. Cohen, Kolmogorov widths and low-rank approximations of parametric elliptic PDEs, Mathematics of Computation 86 (304) (2017) 701–724.
[30] A. Novikov, D. Podoprikhin, A. Osokin, D. P. Vetrov, Tensorizing neural networks, Advances in Neural Information Processing Systems 28 (2015).
[31] P. Jin, S. Meng, L. Lu, MIONet: Learning multiple-input operators via tensor product, SIAM Journal on Scientific Computing 44 (6) (2022) A3490–A3514.
[32] S. K. Vemuri, T. Büchner, J. Niebling, J. Denzler, Functional tensor decompositions for physics-informed neural networks, in: International Conference on Pattern Recognition, Springer, (2025), pp. 32–46.
[33] Y. Wang, H. Xie, P. Jin, Tensor neural network and its numerical integration, Journal of Computational Mathematics 42 (6) (2024) 1714–1742.
[34] Z. Liu, W. Cai, Z.-Q. J. Xu, Multi-scale deep neural network (MscaleDNN) for solving Poisson-Boltzmann equation in complex domains, Communications in Computational Physics 28 (5) (2020) 1970–2001.
[35] L. Zhang, W. Cai, Z.-Q. J. Xu, A correction and comments on "Multi-scale deep neural network (MscaleDNN) for solving Poisson-Boltzmann equation in complex domains, CiCP, 28 (5): 1970–2001, 2020", Communications in Computational Physics 33 (5) (2023) 1509–1513.
[36] M. Tancik, P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. Barron, R. Ng, Fourier features let networks learn high frequency functions in low dimensional domains, Advances in Neural Information Processing Systems 33 (2020) 7537–7547.
[37] S. Wang, H. Wang, P. Perdikaris, On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks, Computer Methods in Applied Mechanics and Engineering 384 (2021) 113938.
[38] J. Huang, R. You, T. Zhou, Frequency-adaptive multi-scale deep neural networks, Computer Methods in Applied Mechanics and Engineering 437 (2025) 117751.
[39] A. Falcó, W. Hackbusch, A. Nouy, On the Dirac–Frenkel variational principle on tensor Banach spaces, Foundations of Computational Mathematics 19 (2019) 159–204.
[40] G. Rozza, D. B. P. Huynh, A. T. Patera, Reduced basis approximation and a posteriori error estimation for parametrized partial differential equations, Archives of Computational Methods in Engineering 15 (3) (2008) 229–275.
[41] B. N. Khoromskij, C. Schwab, Tensor-structured Galerkin approximation of parametric and stochastic elliptic PDEs, SIAM Journal on Scientific Computing 33 (1) (2011) 364–385.
[42] H. Chen, R. Fu, Y. Wang, H. Xie, Solving high-dimensional parametric elliptic equation using tensor neural network, CoRR abs/2402.00040 (2024).
[43] Y. Wang, Z. Lin, Y. Liao, H. Liu, H. Xie, Solving high-dimensional partial differential equations using tensor neural network and a posteriori error estimators, Journal of Scientific Computing 101 (3) (2024) 1–29.
[44] A. Rahimi, B. Recht, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems 20 (2007).
[45] E. D. Zhong, T. Bepler, J. H. Davis, B. Berger, Reconstructing continuous distributions of 3D protein structure from cryo-EM images, in: International Conference on Learning Representations, (2020).
[46] M. Chen, R. Niu, W. Zheng, Adaptive multi-scale neural network with Resnet blocks for solving partial differential equations, Nonlinear Dynamics 111 (7) (2023) 6499–6518.
[47] W. Cai, Z.-Q. J. Xu, Multi-scale deep neural networks for solving high dimensional PDEs, CoRR abs/1910.11710 (2019).
[48] X. Glorot, Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, in: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, JMLR Workshop and Conference Proceedings, (2010), pp. 249–256.
[49] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, in: Proceedings of the 3rd International Conference on Learning Representations (ICLR), (2015).
[50] I. E. Lagaris, A. Likas, D. I. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Transactions on Neural Networks 9 (5) (1998) 987–1000.
[51] S. Wang, S. Sankaran, P. Perdikaris, Respecting causality for training physics-informed neural networks, Computer Methods in Applied Mechanics and Engineering 421 (2024) 116813.
[52] W. Cai, X. Li, L. Liu, A phase shift deep neural network for high frequency approximation and wave problems, SIAM Journal on Scientific Computing 42 (5) (2020) A3285–A3312.
[53] A. D. Jagtap, K. Kawaguchi, G. E. Karniadakis, Adaptive activation functions accelerate convergence in deep and physics-informed neural networks, Journal of Computational Physics 404 (2020) 109136.
[54] S. Liang, L. Lyu, C. Wang, H. Yang, Reproducing activation function for deep learning, CoRR abs/2101.04844 (2021).
[55] J. Chen, X. Chi, Z. Yang, et al., Bridging traditional and machine learning-based algorithms for solving PDEs: The random feature method, J Mach Learn 1 (2022) 268–98.
[56] G. Beylkin, M. J. Mohlenkamp, Numerical operator calculus in higher dimensions, Proceedings of the National Academy of Sciences 99 (16) (2002) 10246–10251.
[57] G. Beylkin, M. J. Mohlenkamp, Algorithms for numerical analysis in high dimensions, SIAM Journal on Scientific Computing 26 (6) (2005) 2133–2159.
[58]

C. Wu, M. Zhu, Q. Tan, Y. Kartha, L. Lu, A comprehensive study of non-adaptive and residual-based adaptive sampling for physics-informed neural networks, Computer Methods in Applied Mechanics and Engineering 403 (2023) 115671

work page 2023
[59]

Z. Gao, L. Yan, T. Zhou, Failure-informed adaptive sampling for PINNs, SIAM Journal on Scientific Computing 45 (4) (2023) A1971–A1994

work page 2023
[60]

K. Tang, X. Wan, C. Yang, DAS-PINNs: A deep adaptive sampling method for solving high-dimensional partial differential equations, Journal of Computational Physics 476 (2023) 111868

work page 2023
[61]

Jacot, F

A. Jacot, F. Gabriel, C. Hongler, Neural tangent kernel: Convergence and generalization in neural networks, Advances in Neural Information Processing Systems 31 (2018)

work page 2018
[62] X.-A. Li, Z.-Q. J. Xu, L. Zhang, Subspace decomposition based DNN algorithm for elliptic type multi-scale PDEs, Journal of Computational Physics 488 (2023) 112242

Appendix A. Computational Details

The gradient of the spectral loss $L(k_x, k_y)$ with respect to $w_{x,j}$ is calculated as:

$$\frac{\partial L(k_x, k_y)}{\partial w_{x,j}} = D(k_x, k_y)\,\frac{\partial D(k_x, k_y)}{\partial w_{x,j}} + D(k_x, k_y)\,\frac{\partial D(k_x, k_y)}{\partial w_{x,j}} = \dots$$

[1] A. Krizhevsky, I. Sutskever, G. E. Hinton, ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems 25 (2012)

[2] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al., Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups, IEEE Signal Processing Magazine 29 (6) (2012) 82–97

[3] A. Vaswani, Attention is all you need, Advances in Neural Information Processing Systems (2017)

[4] J. Devlin, M.-W. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, in: NAACL-HLT 2019, (2019), pp. 4171–4186

[5] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, Advances in Neural Information Processing Systems 33 (2020) 1877–1901

[6] J. Han, A. Jentzen, et al., Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations, Communications in Mathematics and Statistics 5 (4) (2017) 349–380

[7] B. Yu, et al., The deep Ritz method: A deep learning-based numerical algorithm for solving variational problems, Communications in Mathematics and Statistics 6 (1) (2018) 1–12

[8] Y. Zang, G. Bao, X. Ye, H. Zhou, Weak adversarial networks for high-dimensional partial differential equations, Journal of Computational Physics 411 (2020) 109409

[9] T. Tran, A. Hamilton, M. Best McKay, B. Quiring, P. S. Vassilevski, DNN approximation of nonlinear finite element equations, CoRR abs/1911.05240 (2019)

[10] J. Han, A. Jentzen, W. E, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences 115 (34) (2018) 8505–8510

[11] J. Han, L. Zhang, R. Car, et al., Deep potential: A general representation of a many-body potential energy surface, Communications in Computational Physics 23 (3) (2018) 629

[12] J. He, L. Li, J. Xu, C. Zheng, ReLU deep neural networks and linear finite elements, Journal of Computational Mathematics 38 (3) (2020) 502–527

[13] Y. L. Ming, et al., Deep Nitsche method: Deep Ritz method with essential boundary conditions, Communications in Computational Physics 29 (5) (2021) 1365–1384

[14] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707

[15] C. M. Strofer, J.-L. Wu, H. Xiao, E. Paterson, Data-driven, physics-based feature extraction from fluid flow fields using convolutional neural networks, Communications in Computational Physics 25 (3) (2019) 625–650

[16] Z. Wang, Z. Zhang, A mesh-free method for interface problems using the deep learning approach, Journal of Computational Physics 400 (2020) 108963

[17] N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, A. Courville, On the spectral bias of neural networks, in: International Conference on Machine Learning, PMLR, (2019), pp. 5301–5310

[18] Z. J. Xu, Understanding training and generalization in deep learning by Fourier analysis, CoRR abs/1808.04295 (2018)

[19] Z.-Q. J. Xu, Frequency principle: Fourier analysis sheds light on deep neural networks, Communications in Computational Physics 28 (5) (2020) 1746–1767

[20] Y. Zhang, Z.-Q. J. Xu, T. Luo, Z. Ma, Explicitizing an implicit bias of the frequency principle in two-layer neural networks, CoRR abs/1905.10264 (2019)

[21] Z.-Q. J. Xu, Y. Zhang, Y. Xiao, Training behavior of deep neural network in frequency domain, in: Neural Information Processing: 26th International Conference, ICONIP 2019, Sydney, NSW, Australia, December 12–15, 2019, Proceedings, Part I 26, Springer, (2019), pp. 264–274

[22] W. Dahmen, R. DeVore, L. Grasedyck, E. Süli, Tensor-sparsity of solutions to high-dimensional elliptic partial differential equations, Foundations of Computational Mathematics 16 (4) (2016) 813–874

[23] C. Schwab, C. J. Gittelson, Sparse tensor discretizations of high-dimensional parametric and stochastic PDEs, Acta Numerica 20 (2011) 291–467

[24] S. Zeng, Z. Zhang, Q. Zou, Adaptive deep neural networks methods for high-dimensional partial differential equations, Journal of Computational Physics 463 (2022) 111232

[25] F. L. Hitchcock, The expression of a tensor or a polyadic as a sum of products, Journal of Mathematics and Physics 6 (1-4) (1927) 164–189

[26] L. R. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika 31 (3) (1966) 279–311

[27] I. V. Oseledets, Tensor-train decomposition, SIAM Journal on Scientific Computing 33 (5) (2011) 2295–2317

[28] A. Cohen, R. DeVore, C. Schwab, Analytic regularity and polynomial approximation of parametric and stochastic elliptic PDE’s, Analysis and Applications 9 (01) (2011) 11–47

[29] M. Bachmayr, A. Cohen, Kolmogorov widths and low-rank approximations of parametric elliptic PDEs, Mathematics of Computation 86 (304) (2017) 701–724

[30] A. Novikov, D. Podoprikhin, A. Osokin, D. P. Vetrov, Tensorizing neural networks, Advances in Neural Information Processing Systems 28 (2015)

[31] P. Jin, S. Meng, L. Lu, MIONet: Learning multiple-input operators via tensor product, SIAM Journal on Scientific Computing 44 (6) (2022) A3490–A3514

[32] S. K. Vemuri, T. Büchner, J. Niebling, J. Denzler, Functional tensor decompositions for physics-informed neural networks, in: International Conference on Pattern Recognition, Springer, (2025), pp. 32–46

[33] Y. Wang, H. Xie, P. Jin, Tensor neural network and its numerical integration, Journal of Computational Mathematics 42 (6) (2024) 1714–1742

[34] Z. Liu, W. Cai, Z.-Q. J. Xu, Multi-scale deep neural network (MscaleDNN) for solving Poisson-Boltzmann equation in complex domains, Communications in Computational Physics 28 (5) (2020) 1970–2001

[35] L. Zhang, W. Cai, Z.-Q. J. Xu, A correction and comments on "Multi-scale deep neural network (MscaleDNN) for solving Poisson-Boltzmann equation in complex domains, CiCP, 28 (5): 1970–2001, 2020", Communications in Computational Physics 33 (5) (2023) 1509–1513

[36] M. Tancik, P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. Barron, R. Ng, Fourier features let networks learn high frequency functions in low dimensional domains, Advances in Neural Information Processing Systems 33 (2020) 7537–7547

[37] S. Wang, H. Wang, P. Perdikaris, On the eigenvector bias of Fourier feature networks: From regression to solving multi-scale PDEs with physics-informed neural networks, Computer Methods in Applied Mechanics and Engineering 384 (2021) 113938

[38] J. Huang, R. You, T. Zhou, Frequency-adaptive multi-scale deep neural networks, Computer Methods in Applied Mechanics and Engineering 437 (2025) 117751

[39] A. Falcó, W. Hackbusch, A. Nouy, On the Dirac–Frenkel variational principle on tensor Banach spaces, Foundations of Computational Mathematics 19 (2019) 159–204

[40] G. Rozza, D. B. P. Huynh, A. T. Patera, Reduced basis approximation and a posteriori error estimation for parametrized partial differential equations, Archives of Computational Methods in Engineering 15 (3) (2008) 229–275

[41] B. N. Khoromskij, C. Schwab, Tensor-structured Galerkin approximation of parametric and stochastic elliptic PDEs, SIAM Journal on Scientific Computing 33 (1) (2011) 364–385

[42] H. Chen, R. Fu, Y. Wang, H. Xie, Solving high-dimensional parametric elliptic equation using tensor neural network, CoRR abs/2402.00040 (2024)

[43] Y. Wang, Z. Lin, Y. Liao, H. Liu, H. Xie, Solving high-dimensional partial differential equations using tensor neural network and a posteriori error estimators, Journal of Scientific Computing 101 (3) (2024) 1–29

[44] A. Rahimi, B. Recht, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems 20 (2007)

[45] E. D. Zhong, T. Bepler, J. H. Davis, B. Berger, Reconstructing continuous distributions of 3D protein structure from cryo-EM images, in: International Conference on Learning Representations, (2020)

[46] M. Chen, R. Niu, W. Zheng, Adaptive multi-scale neural network with Resnet blocks for solving partial differential equations, Nonlinear Dynamics 111 (7) (2023) 6499–6518
