Walsh-Hadamard Neural Operators for Solving PDEs with Discontinuous Coefficients
Pith reviewed 2026-05-21 19:13 UTC · model grok-4.3
The pith
Walsh-Hadamard Neural Operators handle PDEs with discontinuous coefficients more accurately than Fourier Neural Operators by preserving sharp interfaces, and weighted ensembles of the two cut mean squared error by 35-40 percent.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that Walsh-Hadamard transforms provide a spectral basis better matched to piecewise-constant coefficients than Fourier transforms. A neural operator built on them can therefore learn solution maps for such PDEs with higher fidelity and without spurious oscillations at interfaces. Combining these operators with Fourier Neural Operators in weighted ensembles reduces error further, because the two bases capture distinct aspects of the solution fields.
What carries the argument
Walsh-Hadamard transform layers that convert input fields into sequency-ordered coefficients, followed by learnable linear maps on the lowest-sequency components that feed into the neural operator decoder.
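To make the mechanism concrete, here is a minimal one-dimensional sketch of such a layer in plain Python. It is not the authors' implementation: the function names are hypothetical, the "learnable linear map" is reduced to one scalar weight per retained mode, and there is a single channel. It transforms a field, sorts coefficients by sequency (the number of sign changes in the corresponding rectangular wave), scales the lowest-sequency ones, and transforms back.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform, natural (Hadamard) order."""
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly: combine entries that are h apart into sum and difference.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def sequency(row, n):
    """Count sign changes along Hadamard row `row` of an n-point transform."""
    signs = [bin(row & col).count("1") % 2 for col in range(n)]
    return sum(signs[col] != signs[col + 1] for col in range(n - 1))


def low_sequency_layer(field, weights):
    """Scale the len(weights) lowest-sequency coefficients, zero the rest, invert.

    Assumption: one scalar weight per mode stands in for the learnable map.
    """
    n = len(field)
    coeffs = fwht(field)
    order = sorted(range(n), key=lambda row: sequency(row, n))
    filtered = [0.0] * n
    for weight, row in zip(weights, order):
        filtered[row] = weight * coeffs[row]
    # The transform is its own inverse up to a factor of 1/n.
    return [value / n for value in fwht(filtered)]


# A step with its jump at the midpoint needs only the two lowest-sequency modes.
step_field = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
reconstructed = low_sequency_layer(step_field, weights=[1.0, 1.0])
```

With both weights set to one, the two retained modes reproduce the step field, which is the property the paper's argument relies on. In a trained operator the weights would be learned and applied across channels.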
If this is right
- WHNO can replace or supplement FNO in any operator-learning pipeline where material interfaces dominate the solution structure.
- The observed complementarity implies that hybrid spectral bases become a systematic way to improve accuracy on multi-physics problems containing both jumps and smooth regions.
- The ensemble construction offers a parameter-light route to lower maximum pointwise errors at discontinuities without retraining either network from scratch.
- The same Walsh-Hadamard layer can be inserted into other neural-operator architectures beyond the FNO template.
Where Pith is reading between the lines
- The same basis switch could be tested on hyperbolic conservation laws where discontinuities propagate and interact.
- Extending the ensemble weighting to more than two spectral bases might further reduce error on problems with mixed regularity.
- The method's efficiency on rectangular domains suggests it may scale favorably to high-resolution 3D simulations once fast Walsh-Hadamard transforms are implemented on GPUs.
Load-bearing premise
Walsh-Hadamard rectangular waves are assumed to represent piecewise-constant coefficient fields without creating new interface artifacts that would offset their advantage over Fourier bases.
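A small experiment shows why this premise carries weight. The sketch below (plain Python, hypothetical helper names, a toy 8-point grid chosen for illustration) counts the nonzero Walsh-Hadamard coefficients of a unit step whose interface either sits on a dyadic grid point or is shifted by one cell.

```python
def walsh_coefficients(field):
    """Direct O(n^2) Walsh-Hadamard coefficients in natural order."""
    n = len(field)
    return [
        sum(value * (-1) ** bin(row & col).count("1")
            for col, value in enumerate(field))
        for row in range(n)
    ]


def count_nonzero(coeffs, tol=1e-12):
    """Number of coefficients whose magnitude exceeds tol."""
    return sum(abs(c) > tol for c in coeffs)


def step(n, jump_index):
    """Unit step on n points: 0 before jump_index, 1 from it onward."""
    return [0.0 if i < jump_index else 1.0 for i in range(n)]


aligned = count_nonzero(walsh_coefficients(step(8, 4)))     # jump at the midpoint: 2
misaligned = count_nonzero(walsh_coefficients(step(8, 3)))  # jump one cell earlier: 8
```

On this toy grid the aligned step is captured by two coefficients, while the shifted step spreads over all eight. A low-sequency truncation would then blur the interface, which is the kind of artifact the premise assumes does not outweigh the advantage over Fourier bases.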
What would settle it
Train the reported ensemble on a fresh 2D or 3D PDE with discontinuous coefficients whose interface geometry differs from the training set and check whether mean squared error still drops by at least 30 percent relative to the better single model.
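The pass/fail criterion in this test can be written down explicitly. The sketch below is one possible reading: the function names are hypothetical and the error values in the usage line are made-up placeholders, not results from the paper.

```python
def ensemble_gain(mse_whno, mse_fno, mse_ensemble):
    """Relative MSE reduction of the ensemble versus the better single model."""
    best_single = min(mse_whno, mse_fno)
    return 1.0 - mse_ensemble / best_single


def clears_threshold(mse_whno, mse_fno, mse_ensemble, threshold=0.30):
    """True if the ensemble cuts MSE by at least `threshold` (30 percent by default)."""
    return ensemble_gain(mse_whno, mse_fno, mse_ensemble) >= threshold


# Placeholder numbers for illustration only.
example_gain = ensemble_gain(mse_whno=0.010, mse_fno=0.012, mse_ensemble=0.0065)
```

Measuring against the better single model, rather than the average of the two, keeps the check from rewarding an ensemble that merely repairs a weak member.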
Original abstract
Neural operators have emerged as powerful tools for learning solution operators of partial differential equations (PDEs). However, standard spectral methods based on Fourier transforms struggle with problems involving discontinuous coefficients due to the Gibbs phenomenon and poor representation of sharp interfaces. We introduce the Walsh-Hadamard Neural Operator (WHNO), which leverages Walsh-Hadamard transforms (a spectral basis of rectangular wave functions naturally suited for piecewise constant fields) combined with learnable spectral weights that transform low-sequency Walsh coefficients to capture global dependencies efficiently. We validate WHNO on three problems: steady-state Darcy flow (preliminary validation), heat conduction with discontinuous thermal conductivity, and the 2D Burgers equation with discontinuous initial conditions. In controlled comparisons with Fourier Neural Operators (FNO) under identical conditions, WHNO demonstrates superior accuracy with better preservation of sharp solution features at material interfaces. Critically, we discover that weighted ensemble combinations of WHNO and FNO achieve substantial improvements over either model alone: for both heat conduction and Burgers equation, optimal ensembles reduce mean squared error by 35-40 percent and maximum error by up to 25 percent compared to individual models. This demonstrates that Walsh-Hadamard and Fourier representations capture complementary aspects of discontinuous PDE solutions, with WHNO excelling at sharp interfaces while FNO captures smooth features effectively.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Walsh-Hadamard Neural Operator (WHNO) that replaces the Fourier basis in neural operators with Walsh-Hadamard transforms, which consist of rectangular waves suited to piecewise-constant fields. It reports numerical experiments on steady-state Darcy flow, heat conduction with discontinuous conductivity, and the 2D Burgers equation with discontinuous initial data, claiming that WHNO preserves sharp interfaces better than FNO under identical training conditions and that weighted ensembles of WHNO and FNO reduce MSE by 35-40% and maximum error by up to 25% relative to either model alone.
Significance. If the ensemble gains are obtained without test-set leakage and the experimental protocol is fully reproducible, the work would demonstrate that distinct spectral bases can capture complementary solution features (sharp jumps versus smooth variations) and would supply a practical recipe for improving neural-operator accuracy on heterogeneous-media problems. The absence of error bars, dataset cardinalities, and a clear description of the ensemble-weighting procedure currently limits the strength of this claim.
major comments (2)
- Abstract and results section: the headline claim that 'optimal ensembles' reduce MSE by 35-40% and max error by up to 25% is load-bearing for the paper's central discovery of complementary representations. The manuscript provides no description of how these optimal weights are obtained (validation-set optimization, fixed rule, or direct test-set minimization). If the weights were tuned on the test data, the reported percentage improvements do not constitute evidence of generalization.
- Experimental validation sections: no error bars, standard deviations across random seeds, or exact training/validation/test split sizes are reported for any of the three PDE problems. In addition, the procedure used to generate the discontinuous coefficients (Darcy) and initial conditions (Burgers) is not specified, preventing assessment of whether the observed interface-capturing advantage is robust or an artifact of the chosen discontinuity realizations.
minor comments (2)
- Notation: define 'sequency' explicitly when introducing the Walsh-Hadamard basis and clarify how the learnable low-sequency weights are parameterized and applied inside the operator layers.
- Figures: solution visualizations should include direct side-by-side comparisons of WHNO, FNO, and ensemble predictions together with pointwise error fields at the material interfaces.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. These points highlight important aspects of reproducibility and the strength of our central claims. We address each major comment below and will revise the manuscript to incorporate the requested clarifications and additional experimental details.
- Referee: Abstract and results section: the headline claim that 'optimal ensembles' reduce MSE by 35-40% and max error by up to 25% is load-bearing for the paper's central discovery of complementary representations. The manuscript provides no description of how these optimal weights are obtained (validation-set optimization, fixed rule, or direct test-set minimization). If the weights were tuned on the test data, the reported percentage improvements do not constitute evidence of generalization.
Authors: We agree that a clear description of the ensemble-weighting procedure is essential to support the generalization of the reported gains. The optimal weights were determined by minimizing the validation-set error using a simple grid search over weight combinations in [0,1] with step size 0.05; the test set was never used for this optimization. We will add a dedicated subsection in the revised manuscript that fully documents this procedure, including the validation-set optimization details and confirmation that no test data were involved.
Revision: yes
- Referee: Experimental validation sections: no error bars, standard deviations across random seeds, or exact training/validation/test split sizes are reported for any of the three PDE problems. In addition, the procedure used to generate the discontinuous coefficients (Darcy) and initial conditions (Burgers) is not specified, preventing assessment of whether the observed interface-capturing advantage is robust or an artifact of the chosen discontinuity realizations.
Authors: We concur that statistical variability measures and precise data-generation protocols are necessary for robust evaluation. In the revised manuscript we will report mean and standard deviation of all metrics across five independent random seeds, state the exact cardinalities of the training/validation/test splits for each problem, and provide a complete description of how the discontinuous conductivity fields (Darcy and heat conduction) and initial conditions (Burgers) were sampled, including the underlying random processes and any hyperparameters used to control discontinuity locations and magnitudes.
Revision: yes
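The grid search described in the first response can be sketched as follows. This is an illustration, not the authors' code: it assumes a convex combination with a single weight w on WHNO and 1 - w on FNO, predictions flattened to plain lists, and hypothetical function names. The toy validation arrays are invented for the example.

```python
def mse(prediction, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


def blend(weight, whno_pred, fno_pred):
    """Convex combination: weight on WHNO, (1 - weight) on FNO (an assumption)."""
    return [weight * a + (1.0 - weight) * b for a, b in zip(whno_pred, fno_pred)]


def select_weight(whno_val, fno_val, target_val, step=0.05):
    """Pick the weight in [0, 1] that minimizes MSE on validation data only."""
    steps = round(1.0 / step)
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: mse(blend(w, whno_val, fno_val), target_val))


# Toy validation data: the two models err in opposite directions.
target_val = [1.0, 2.0, 3.0, 4.0]
whno_val = [1.1, 1.9, 3.1, 3.9]
fno_val = [0.7, 2.3, 2.7, 4.3]
best_weight = select_weight(whno_val, fno_val, target_val)
```

The selected weight would then be frozen and applied once to the test predictions. Any tuning against test targets would reintroduce the leakage the referee warns about.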
Circularity Check
No significant circularity; claims rest on empirical validation.
The paper proposes the WHNO architecture motivated by the suitability of Walsh-Hadamard bases for piecewise-constant fields and reports empirical gains from ensembles on heat conduction and Burgers problems. No derivation chain, equations, or self-citations are presented that reduce a claimed prediction or uniqueness result to fitted inputs or prior author work by construction. The 35-40% MSE improvements are stated as outcomes of controlled comparisons rather than tautological fits, and the approach is self-contained against external benchmarks without load-bearing self-referential steps.
Axiom & Free-Parameter Ledger
free parameters (1)
- learnable spectral weights
axioms (1)
- domain assumption: Walsh-Hadamard transforms are naturally suited for piecewise constant fields
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: We introduce the Walsh-Hadamard Neural Operator (WHNO), which leverages Walsh-Hadamard transforms (a spectral basis of rectangular wave functions naturally suited for piecewise constant fields) combined with learnable spectral weights that transform low-sequency Walsh coefficients
- IndisputableMonolith/Foundation/AlexanderDuality.lean, alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: The Hadamard matrix of order n=2^k is defined recursively: H1=[1], H2n=[Hn Hn; Hn -Hn]
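The recursive definition quoted in the passage above is easy to check numerically. The following sketch (plain Python, hypothetical function names) builds the matrix by repeated doubling and verifies that distinct rows are orthogonal, which is what makes it usable as a transform basis.

```python
def hadamard(n):
    """Sylvester construction: H1 = [1], H2n = [[Hn, Hn], [Hn, -Hn]]."""
    if n < 1 or n & (n - 1):
        raise ValueError("order must be a power of two")
    matrix = [[1]]
    while len(matrix) < n:
        top = [row + row for row in matrix]                       # [Hn  Hn]
        bottom = [row + [-v for v in row] for row in matrix]      # [Hn -Hn]
        matrix = top + bottom
    return matrix


def is_orthogonal(matrix):
    """Check H @ H^T == n * I using integer dot products."""
    n = len(matrix)
    return all(
        sum(a * b for a, b in zip(matrix[i], matrix[j])) == (n if i == j else 0)
        for i in range(n)
        for j in range(n)
    )


h4 = hadamard(4)
```

Rows come out in natural (Hadamard) order here. Sorting them by their number of sign changes gives the sequency order used when selecting low-sequency coefficients.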
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.