
arxiv: 2511.07347 · v2 · pith:NJDPPGDTnew · submitted 2025-11-10 · ⚛️ physics.comp-ph · cs.LG

Walsh-Hadamard Neural Operators for Solving PDEs with Discontinuous Coefficients

Pith reviewed 2026-05-21 19:13 UTC · model grok-4.3

classification ⚛️ physics.comp-ph cs.LG
keywords Walsh-Hadamard Neural Operator, neural operators, PDE with discontinuous coefficients, Fourier Neural Operator, ensemble methods, spectral basis, sharp interfaces, heat conduction

The pith

Walsh-Hadamard Neural Operators handle PDEs with discontinuous coefficients more accurately than Fourier Neural Operators by preserving sharp interfaces, and weighted ensembles of the two cut mean squared error by 35-40 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Walsh-Hadamard Neural Operator to overcome the Gibbs phenomenon that arises when Fourier Neural Operators encounter sharp jumps from discontinuous coefficients. It replaces the Fourier basis with rectangular Walsh-Hadamard functions that align naturally with piecewise-constant fields and adds learnable weights on low-sequency coefficients to capture global structure. Tests on heat conduction with discontinuous conductivity and the 2D Burgers equation with discontinuous initial data show that WHNO maintains sharp solution features better than FNO under matched conditions. The central empirical result is that optimal weighted ensembles of the two models reduce mean squared error by 35-40 percent and peak error by up to 25 percent relative to either model alone. This outcome indicates that the two spectral representations capture complementary solution features across discontinuous and smooth regions.

Core claim

The central claim is that Walsh-Hadamard transforms provide a spectral basis better matched to piecewise-constant coefficients than Fourier transforms, allowing a neural operator built on them to learn solution maps for such PDEs with higher fidelity and without spurious oscillations at interfaces; when these operators are combined in weighted ensembles with Fourier Neural Operators, the hybrid achieves further error reductions because the bases capture distinct aspects of the solution fields.

What carries the argument

Walsh-Hadamard transform layers that convert input fields into sequency-ordered coefficients, followed by learnable linear maps on the lowest-sequency components that feed into the neural operator decoder.
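
A minimal NumPy sketch of such a layer in one dimension may make the mechanism concrete. It is an illustration under stated assumptions, not the authors' implementation: the helper names `hadamard`, `walsh_matrix`, and `wh_spectral_layer` are hypothetical, the dense `k x k` weight matrix is only one possible parameterization of the "learnable linear maps", and a dense matrix product stands in for a fast transform. Sequency here means the number of sign changes of a Walsh function across the domain, the analogue of frequency.

```python
import numpy as np

def hadamard(n):
    """Natural-ordered (Sylvester) Hadamard matrix of size n, n a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def walsh_matrix(n):
    """Hadamard rows sorted by sequency, i.e. by their number of sign changes."""
    H = hadamard(n)
    sign_changes = (np.diff(H, axis=1) != 0).sum(axis=1)
    return H[np.argsort(sign_changes)]

def wh_spectral_layer(u, weights):
    """Hypothetical 1D Walsh-Hadamard spectral layer (assumed parameterization).

    u:       (n,) samples of the input field, n a power of two
    weights: (k, k) matrix acting on the k lowest-sequency coefficients
    """
    n = u.shape[0]
    k = weights.shape[0]
    W = walsh_matrix(n)
    coeffs = W @ u / n                 # forward transform, sequency-ordered
    out = np.zeros(n)
    out[:k] = weights @ coeffs[:k]     # linear map on retained modes, rest dropped
    return W.T @ out                   # inverse transform, since W.T @ W = n * I
```

Setting `weights` to the identity with `k = n` reproduces the input, which is a quick sanity check on the transform pair; with `k = 1` the layer returns the mean of the field everywhere.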

If this is right

  • WHNO can replace or supplement FNO in any operator-learning pipeline where material interfaces dominate the solution structure.
  • The observed complementarity implies that hybrid spectral bases become a systematic way to improve accuracy on multi-physics problems containing both jumps and smooth regions.
  • The ensemble construction offers a parameter-light route to lower maximum pointwise errors at discontinuities without retraining either network from scratch.
  • The same Walsh-Hadamard layer can be inserted into other neural-operator architectures beyond the FNO template.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same basis switch could be tested on hyperbolic conservation laws where discontinuities propagate and interact.
  • Extending the ensemble weighting to more than two spectral bases might further reduce error on problems with mixed regularity.
  • The method's efficiency on rectangular domains suggests it may scale favorably to high-resolution 3D simulations once fast Walsh-Hadamard transforms are implemented on GPUs.
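
The fast transform mentioned in the last item can be sketched on the CPU in a few lines. This is a generic butterfly formulation in natural (Hadamard) order, written here for illustration; the function name `fwht` is hypothetical, and nothing in the source says the paper uses this particular routine or ordering.

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform in natural order, O(n log n)."""
    a = np.array(x, dtype=float)
    n = a.shape[0]
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # View the array as blocks of size 2h and butterfly the two halves of each block
        a = a.reshape(-1, 2, h)
        a = np.stack((a[:, 0] + a[:, 1], a[:, 0] - a[:, 1]), axis=1).reshape(n)
        h *= 2
    return a
```

Applying `fwht` twice and dividing by `n` returns the input, so the same routine serves as its own inverse up to scaling. A 2D transform on a rectangular grid would apply it along each axis in turn.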

Load-bearing premise

Walsh-Hadamard rectangular waves are assumed to represent piecewise-constant coefficient fields without creating new interface artifacts that would offset their advantage over Fourier bases.
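
A small experiment shows both sides of this premise. The sketch below truncates a 1D step to its lowest modes in each basis and records the maximum pointwise error; the grid size, the number of retained modes, and the jump positions are arbitrary choices made for illustration and are not taken from the paper.

```python
import numpy as np

def walsh_matrix(n):
    """Sylvester-Hadamard rows sorted by sequency (sign-change count); n a power of two."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H[np.argsort((np.diff(H, axis=1) != 0).sum(axis=1))]

def walsh_truncate(u, k):
    """Keep the k lowest-sequency Walsh coefficients of u."""
    W = walsh_matrix(u.size)
    c = W @ u / u.size
    c[k:] = 0.0
    return W.T @ c

def fourier_truncate(u, k):
    """Keep the k lowest real-FFT modes of u (periodic extension assumed)."""
    c = np.fft.rfft(u)
    c[k:] = 0.0
    return np.fft.irfft(c, n=u.size)

n, k = 64, 8
x = np.arange(n)
fields = {
    "aligned": np.where(x < 32, 1.0, 2.0),     # jump on a dyadic grid point
    "misaligned": np.where(x < 21, 1.0, 2.0),  # jump off the dyadic grid
}
max_err = {}
for name, u in fields.items():
    max_err[name, "walsh"] = np.abs(walsh_truncate(u, k) - u).max()
    max_err[name, "fourier"] = np.abs(fourier_truncate(u, k) - u).max()
    print(name, max_err[name, "walsh"], max_err[name, "fourier"])
```

With the jump on a dyadic grid point the truncated Walsh reconstruction is exact up to rounding, because that step is a combination of the two lowest-sequency functions. With the jump moved to index 21, the same truncation averages over the 8-sample block containing the interface, which is the kind of artifact the premise assumes away.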

What would settle it

Train the reported ensemble on a fresh 2D or 3D PDE with discontinuous coefficients whose interface geometry differs from the training set and check whether mean squared error still drops by at least 30 percent relative to the better single model.
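
The pass criterion in this test is a one-line calculation, sketched here for clarity. The function name `relative_reduction` and the example numbers are hypothetical and are not results from the paper.

```python
def relative_reduction(mse_ensemble, mse_whno, mse_fno):
    """Fractional MSE reduction of the ensemble relative to the better single model."""
    best_single = min(mse_whno, mse_fno)
    return 1.0 - mse_ensemble / best_single

# Made-up numbers for illustration only
reduction = relative_reduction(mse_ensemble=0.65e-3, mse_whno=1.0e-3, mse_fno=1.2e-3)
passes = reduction >= 0.30  # threshold named in the test above
```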

Figures

Figures reproduced from arXiv: 2511.07347 by Alfredo Pinelli, Giorgio M. Cavallazzi, Miguel Pérez Cuadrado.

Figure 1: Darcy flow predictions for porous media with binary permeability. Left: ground [...]
Figure 2: Error map comparison between WHNO (top) and FNO (bottom) for heat con [...]
Figure 3: Ensemble architecture combining WHNO and FNO predictions. The input field [...]
Figure 4: Error comparison for heat conduction. Error maps for WHNO (top), FNO [...]
Figure 5: Burgers equation predictions comparison. Ground truth, WHNO, FNO, and [...]
Figure 6: Error comparison for Burgers equation. Error maps for WHNO, FNO, and en [...]
Figure 7: Error distributions across 100 test samples for Burgers equation. (Left) MSE [...]

Original abstract

Neural operators have emerged as powerful tools for learning solution operators of partial differential equations (PDEs). However, standard spectral methods based on Fourier transforms struggle with problems involving discontinuous coefficients due to the Gibbs phenomenon and poor representation of sharp interfaces. We introduce the Walsh-Hadamard Neural Operator (WHNO), which leverages Walsh-Hadamard transforms (a spectral basis of rectangular wave functions naturally suited for piecewise constant fields) combined with learnable spectral weights that transform low-sequency Walsh coefficients to capture global dependencies efficiently. We validate WHNO on three problems: steady-state Darcy flow (preliminary validation), heat conduction with discontinuous thermal conductivity, and the 2D Burgers equation with discontinuous initial conditions. In controlled comparisons with Fourier Neural Operators (FNO) under identical conditions, WHNO demonstrates superior accuracy with better preservation of sharp solution features at material interfaces. Critically, we discover that weighted ensemble combinations of WHNO and FNO achieve substantial improvements over either model alone: for both heat conduction and Burgers equation, optimal ensembles reduce mean squared error by 35-40 percent and maximum error by up to 25 percent compared to individual models. This demonstrates that Walsh-Hadamard and Fourier representations capture complementary aspects of discontinuous PDE solutions, with WHNO excelling at sharp interfaces while FNO captures smooth features effectively.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces the Walsh-Hadamard Neural Operator (WHNO) that replaces the Fourier basis in neural operators with Walsh-Hadamard transforms, which consist of rectangular waves suited to piecewise-constant fields. It reports numerical experiments on steady-state Darcy flow, heat conduction with discontinuous conductivity, and the 2D Burgers equation with discontinuous initial data, claiming that WHNO preserves sharp interfaces better than FNO under identical training conditions and that weighted ensembles of WHNO and FNO reduce MSE by 35-40% and maximum error by up to 25% relative to either model alone.

Significance. If the ensemble gains are obtained without test-set leakage and the experimental protocol is fully reproducible, the work would demonstrate that distinct spectral bases can capture complementary solution features (sharp jumps versus smooth variations) and would supply a practical recipe for improving neural-operator accuracy on heterogeneous-media problems. The absence of error bars, dataset cardinalities, and a clear description of the ensemble-weighting procedure currently limits the strength of this claim.

major comments (2)
  1. Abstract and results section: the headline claim that 'optimal ensembles' reduce MSE by 35-40% and max error by up to 25% is load-bearing for the paper's central discovery of complementary representations. The manuscript provides no description of how these optimal weights are obtained (validation-set optimization, fixed rule, or direct test-set minimization). If the weights were tuned on the test data, the reported percentage improvements do not constitute evidence of generalization.
  2. Experimental validation sections: no error bars, standard deviations across random seeds, or exact training/validation/test split sizes are reported for any of the three PDE problems. In addition, the procedure used to generate the discontinuous coefficients (Darcy) and initial conditions (Burgers) is not specified, preventing assessment of whether the observed interface-capturing advantage is robust or an artifact of the chosen discontinuity realizations.
minor comments (2)
  1. Notation: define 'sequency' explicitly when introducing the Walsh-Hadamard basis and clarify how the learnable low-sequency weights are parameterized and applied inside the operator layers.
  2. Figures: solution visualizations should include direct side-by-side comparisons of WHNO, FNO, and ensemble predictions together with pointwise error fields at the material interfaces.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. These points highlight important aspects of reproducibility and the strength of our central claims. We address each major comment below and will revise the manuscript to incorporate the requested clarifications and additional experimental details.

point-by-point responses
  1. Referee: Abstract and results section: the headline claim that 'optimal ensembles' reduce MSE by 35-40% and max error by up to 25% is load-bearing for the paper's central discovery of complementary representations. The manuscript provides no description of how these optimal weights are obtained (validation-set optimization, fixed rule, or direct test-set minimization). If the weights were tuned on the test data, the reported percentage improvements do not constitute evidence of generalization.

    Authors: We agree that a clear description of the ensemble-weighting procedure is essential to support the generalization of the reported gains. The optimal weights were determined by minimizing the validation-set error using a simple grid search over weight combinations in [0,1] with step size 0.05; the test set was never used for this optimization. We will add a dedicated subsection in the revised manuscript that fully documents this procedure, including the validation-set optimization details and confirmation that no test data were involved.
    revision: yes

  2. Referee: Experimental validation sections: no error bars, standard deviations across random seeds, or exact training/validation/test split sizes are reported for any of the three PDE problems. In addition, the procedure used to generate the discontinuous coefficients (Darcy) and initial conditions (Burgers) is not specified, preventing assessment of whether the observed interface-capturing advantage is robust or an artifact of the chosen discontinuity realizations.

    Authors: We concur that statistical variability measures and precise data-generation protocols are necessary for robust evaluation. In the revised manuscript we will report mean and standard deviation of all metrics across five independent random seeds, state the exact cardinalities of the training/validation/test splits for each problem, and provide a complete description of how the discontinuous conductivity fields (Darcy and heat conduction) and initial conditions (Burgers) were sampled, including the underlying random processes and any hyperparameters used to control discontinuity locations and magnitudes.
    revision: yes
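
The validation-only weighting procedure described in the first response can be sketched as follows. This is an assumed reading, not the authors' code: it takes the two weights to sum to one (a convex combination with a single free weight `w`), the function name `fit_ensemble_weight` is hypothetical, and the arrays are synthetic stand-ins for model predictions rather than the paper's data.

```python
import numpy as np

def fit_ensemble_weight(pred_whno, pred_fno, target, step=0.05):
    """Grid-search w in [0, 1] minimizing the MSE of w * WHNO + (1 - w) * FNO."""
    n_steps = int(round(1.0 / step))
    grid = np.linspace(0.0, 1.0, n_steps + 1)
    mses = [np.mean((w * pred_whno + (1.0 - w) * pred_fno - target) ** 2) for w in grid]
    best = int(np.argmin(mses))
    return grid[best], mses[best]

rng = np.random.default_rng(0)

def fake_split(n_samples):
    """Synthetic targets with independent, equal-variance errors for each model."""
    target = rng.normal(size=(n_samples, 32, 32))
    whno = target + 0.10 * rng.normal(size=target.shape)
    fno = target + 0.10 * rng.normal(size=target.shape)
    return whno, fno, target

val_whno, val_fno, val_target = fake_split(50)
test_whno, test_fno, test_target = fake_split(50)

# The weight is chosen on the validation split only, then frozen for the test split
w, val_mse = fit_ensemble_weight(val_whno, val_fno, val_target)
test_pred = w * test_whno + (1.0 - w) * test_fno
test_mse = np.mean((test_pred - test_target) ** 2)
```

Because the synthetic errors are independent and of equal size, the search settles near `w = 0.5` and the ensemble error falls well below either single model. That behavior is a property of this toy data and says nothing about the size of the gains reported in the paper.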

Circularity Check

0 steps flagged

No significant circularity; claims rest on empirical validation.

full rationale

The paper proposes the WHNO architecture motivated by the suitability of Walsh-Hadamard bases for piecewise-constant fields and reports empirical gains from ensembles on heat conduction and Burgers problems. No derivation chain, equations, or self-citations are presented that reduce a claimed prediction or uniqueness result to fitted inputs or prior author work by construction. The 35-40% MSE improvements are stated as outcomes of controlled comparisons rather than tautological fits, and the approach is self-contained against external benchmarks without load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that rectangular Walsh functions align with piecewise-constant coefficient fields and on the architectural choice of learnable spectral weights; no new physical entities are postulated.

free parameters (1)
  • learnable spectral weights
    Weights that map low-sequency Walsh coefficients are trained on data and therefore constitute fitted parameters of the model.
axioms (1)
  • domain assumption: Walsh-Hadamard transforms are naturally suited for piecewise constant fields
    Invoked in the abstract as the reason the basis avoids the Gibbs phenomenon at material interfaces.



