arxiv: 2606.03936 · v1 · pith:CKOQF7ZKnew · submitted 2026-06-02 · 💻 cs.LG · physics.geo-ph

Correcting Neural Operator Spectral Bias via Diffusion Posterior Sampling with Sparse Observations

Pith reviewed 2026-06-28 11:08 UTC · model grok-4.3

classification 💻 cs.LG physics.geo-ph
keywords neural operators · spectral bias · diffusion posterior sampling · sparse observations · elastic wavefields · frequency-dependent guidance · posterior sampling

The pith

Frequency-dependent guidance in diffusion posterior sampling reduces neural operator spectral bias to near zero.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Neural operators approximate PDE solutions quickly but systematically attenuate high-frequency content. The paper combines an unconditional diffusion prior with posterior sampling conditioned on sparse sensors and guided by a frozen neural operator. A closed-form spectrally shaped guidance score weights the operator contribution by its frequency-dependent accuracy, derived from the residual's approximate spectral diagonality. A distribution-free bound shows the frequency dependence of the guidance is preserved across the frequency-time plane. Experiments on 3D elastic wavefields at 2% and 5% sensor coverage produce samples with near-zero spectral bias, while the surrogate alone, sensor-only DPS, and isotropic guidance all retain high-frequency attenuation.

Core claim

Treating neural operator predictions as auxiliary observations inside diffusion posterior sampling and applying a spectrally shaped guidance score that weights the surrogate according to its per-frequency accuracy removes the operator's inherent high-frequency attenuation, yielding near-zero spectral bias across all bands even at 2% sensor coverage.
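The claim is stated in terms of per-band spectral bias, so a small measurement sketch may help fix ideas. The definition below is an assumption, not the paper's: it takes the relative difference of location-averaged amplitude spectra within equal-width rFFT bands. The function name `banded_spectral_bias`, the three-band split, and the linear attenuation profile are all hypothetical.

```python
import numpy as np

def banded_spectral_bias(pred, ref, n_bands=3):
    """Relative amplitude-spectrum error per frequency band (assumed definition).

    pred, ref: real arrays of shape (n_locations, n_time).
    Negative values mean the prediction carries less amplitude than the
    reference in that band.
    """
    # Amplitude spectrum along the temporal axis, averaged over locations.
    amp_pred = np.abs(np.fft.rfft(pred, axis=-1)).mean(axis=0)
    amp_ref = np.abs(np.fft.rfft(ref, axis=-1)).mean(axis=0)
    # Equal-width bands over the rFFT bins, skipping the DC bin.
    edges = np.linspace(1, amp_ref.size, n_bands + 1).astype(int)
    bias = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ref_sum = amp_ref[lo:hi].sum()
        bias.append((amp_pred[lo:hi].sum() - ref_sum) / ref_sum)
    return np.array(bias)

# Toy check: a low-pass-filtered copy of white noise mimics an attenuating surrogate.
rng = np.random.default_rng(0)
ref = rng.standard_normal((64, 256))
spec = np.fft.rfft(ref, axis=-1)
gain = np.linspace(1.0, 0.1, spec.shape[-1])  # hypothetical attenuation profile
pred = np.fft.irfft(spec * gain, n=256, axis=-1)
bias = banded_spectral_bias(pred, ref)
```

In this toy setup the bias is negative in every band and grows in magnitude from the low to the high band, which is the attenuation pattern the page attributes to the raw surrogate.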

What carries the argument

Spectrally shaped guidance score that weights the neural operator surrogate by its frequency-dependent accuracy, justified by approximate spectral diagonality of the residual.
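The paper's closed-form guidance score is not reproduced on this page. As a loosely related classical analogue, the sketch below applies a per-frequency linear MMSE (Wiener) weight under an assumed model U_no(k) = H(k) U(k) + noise with known statistics. The names `wiener_weight`, `H`, `sigma2_no`, `P_u`, and the toy data are illustrative assumptions; this is not FreqNO-DPS itself, only an illustration of weighting a surrogate by its frequency-dependent accuracy.

```python
import numpy as np

def wiener_weight(H, sigma2_no, P_u):
    """Per-frequency linear MMSE weight for U given U_no = H * U + noise."""
    return np.conj(H) * P_u / (np.abs(H) ** 2 * P_u + sigma2_no)

rng = np.random.default_rng(0)
n_samples, n = 400, 128
u = rng.standard_normal((n_samples, n))             # stand-in reference fields
U = np.fft.rfft(u, axis=-1)
H = np.linspace(1.0, 0.1, U.shape[-1])              # hypothetical attenuation profile
noise = 0.1 * rng.standard_normal((n_samples, n))
U_no = H * U + np.fft.rfft(noise, axis=-1)          # toy surrogate spectrum

P_u = np.full(U.shape[-1], float(n))                # unit-variance white noise: power n per bin
sigma2_no = np.full(U.shape[-1], n * 0.1 ** 2)      # residual power per bin
w = wiener_weight(H, sigma2_no, P_u)

err_raw = np.mean(np.abs(U_no - U) ** 2)            # surrogate used as-is
err_weighted = np.mean(np.abs(w * U_no - U) ** 2)   # frequency-weighted surrogate
```

In this toy problem the weighted estimate has a lower mean squared spectral error than the raw surrogate, because the weight compensates the attenuation only as far as the residual noise in each band allows.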

If this is right

  • Near-zero spectral bias is achieved across all frequency bands on 3D elastic wavefield prediction at 5% and 2% sensor coverage.
  • Isotropic guidance improves pointwise accuracy but carries the surrogate's spectral bias into the posterior nearly intact.
  • The closed-form guidance requires no denoiser backpropagation and needs only paired surrogate/reference data.
  • The guidance's frequency dependence remains valid under a distribution-free error bound across the frequency-diffusion-time plane.
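The calibration from paired surrogate/reference data can be sketched as follows. The Figure 1 caption names the statistics H(k), σ²_NO(k) and P_u(k) but this page does not give their estimators, so the least-squares transfer-function estimate below is an assumption, written for 1D signals for brevity. The function name `calibrate_spectral_stats` and the toy data are hypothetical.

```python
import numpy as np

def calibrate_spectral_stats(u, u_no):
    """Estimate per-frequency statistics from paired samples (assumed estimators).

    u, u_no: real arrays of shape (n_samples, n_points).
    Returns (H, sigma2_no, P_u), each of length n_points // 2 + 1.
    """
    U = np.fft.rfft(u, axis=-1)
    U_no = np.fft.rfft(u_no, axis=-1)
    P_u = np.mean(np.abs(U) ** 2, axis=0)            # reference power spectrum
    H = np.mean(np.conj(U) * U_no, axis=0) / P_u     # least-squares transfer function
    resid = U_no - H * U                             # surrogate part not explained by H
    sigma2_no = np.mean(np.abs(resid) ** 2, axis=0)  # residual power per frequency
    return H, sigma2_no, P_u

# Toy paired data: the "surrogate" attenuates high frequencies and adds small noise.
rng = np.random.default_rng(0)
n_samples, n = 500, 128
u = rng.standard_normal((n_samples, n))
gain = np.linspace(1.0, 0.1, n // 2 + 1)             # hypothetical attenuation profile
u_no = np.fft.irfft(gain * np.fft.rfft(u, axis=-1), n=n, axis=-1)
u_no += 0.05 * rng.standard_normal((n_samples, n))
H, sigma2_no, P_u = calibrate_spectral_stats(u, u_no)
```

On this toy data the estimated `H` recovers the imposed attenuation profile closely, and the calibration runs once, without touching the denoiser.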

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The provided coherence diagnostic offers a practical check for whether a new surrogate satisfies the spectral-diagonality premise before applying the method.
  • The same weighting idea could be tested on other biased surrogates whenever sparse point measurements exist, even outside wave problems.
  • If the diagonality holds only approximately, the method may still reduce but not fully eliminate bias in regimes with very high sensor sparsity.

Load-bearing premise

The residual between the neural operator and the true solution is approximately diagonal in the frequency domain.
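One way to probe this premise is a magnitude-squared coherence between residual Fourier modes. The sketch below is an assumed form of such a check, not the paper's coherence diagnostic, whose exact definition is not given on this page. The function name `residual_coherence` and both toy residuals are hypothetical.

```python
import numpy as np

def residual_coherence(residual):
    """Magnitude-squared coherence between Fourier modes of a residual.

    residual: real array of shape (n_samples, n_points).
    Returns an (n_freq, n_freq) matrix with ones on the diagonal; small
    off-diagonal entries are consistent with approximate spectral diagonality.
    """
    R = np.fft.rfft(residual, axis=-1)
    S = R.conj().T @ R / R.shape[0]            # empirical cross-spectral matrix
    power = np.real(np.diag(S))                # per-mode residual power
    return np.abs(S) ** 2 / np.outer(power, power)

rng = np.random.default_rng(0)
n_samples, n = 2000, 64
# Stationary residual: Fourier modes are uncorrelated.
stationary = rng.standard_normal((n_samples, n))
# Spatially localized residual: a window couples neighbouring modes.
window = np.exp(-0.5 * ((np.arange(n) - n / 2) / 6.0) ** 2)
localized = window * rng.standard_normal((n_samples, n))

coh_stat = residual_coherence(stationary)
coh_loc = residual_coherence(localized)
off_diag = ~np.eye(coh_stat.shape[0], dtype=bool)
```

The stationary residual gives off-diagonal coherence close to zero, while the localized one shows strong coherence between adjacent modes, the kind of pattern that would call the diagonality premise into question.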

What would settle it

A dataset in which the coherence diagnostic shows strong off-diagonal residual terms yet the frequency-weighted guidance still produces flat spectral error would falsify the necessity of the diagonality assumption.

Figures

Figures reproduced from arXiv: 2606.03936 by Fanny Lehmann, Filippo Gatti, Niccolò Perrone, Stefania Fresca.

Figure 1
Figure 1: Overview of FreqNO-DPS. Top-left (Observations & calibration): the frozen MIFNO surrogate $G_\phi$ predicts surface wavefields $u_{\mathrm{NO}}$ from geology and source inputs; spectral statistics $H(k)$, $\sigma^2_{\mathrm{NO}}(k)$, $P_u(k)$ are estimated once from paired $(u, u_{\mathrm{NO}})$; sparse observations $y$ are obtained from a random sensor mask on the ground truth wavefield $u$. Top-right (Denoiser training): an unconditional denoiser $D_\theta$ is trained on S…
Figure 2
Figure 2: Expected NO guidance magnitude at six noise levels across the diffusion schedule (E–W component).
Figure 3
Figure 3: East–west velocity at t = 2.32 s for a representative test sample. Left column: sensor-independent references (top: SEM simulation, bottom: MIFNO surrogate). Middle column: DPS + NO (iso) at ρ = 5% (top) and ρ = 2% (bottom). Right column: FreqNO-DPS at ρ = 5% (top) and ρ = 2% (bottom). Yellow star: source location; white triangles: sensor positions. DPS + NO (iso) shows the opposite pattern: it has the nar…
Figure 4
Figure 4: Vertical component velocity time histories at a sensor location (…
Figure 5
Figure 5: Ensemble-averaged frequency spectrum at ρ = 5%. For each method, |F(u)(f)| is computed per spatial location along the temporal axis and averaged over all locations, velocity components, and 1,000 test samples. Black dashed: reference simulation; red: MIFNO; green: DPS + NO (iso); orange: FreqNO-DPS (α = 1); blue: FreqNO-DPS. Shaded bands correspond to the low, mid, and high rFFT ranges in …
Figure 6
Figure 6: Regime decomposition of the wavenumber–diffusion-noise plane. The Gaussian moment-matching approximation is partitioned into four regimes by three boundaries derived from the calibrated spectral quantities: $\nu = 1 \Leftrightarrow \sigma_\tau = \sqrt{P_u(k)}$, $\zeta = 1 \Leftrightarrow \sigma_\tau = \sigma_{\mathrm{NO}}(k)/|H(k)|$, and $\gamma(k^*) = 1$, marking the wavenumber above which the surrogate residual dominates the signal and the spectral weight $\tilde{\lambda}_\tau(k)$ tends to 0. Scaling of th…
Figure 7
Figure 7: Transfer function $\tilde{H}(k)$ estimated from N = 2,000 paired ground-truth/MIFNO samples on the held-out calibration split. The magnitude $|\tilde{H}(k)|$ (left) confirms the expected spectral-bias profile: $|H| \approx 0.9$ at low wavenumbers, decreasing monotonically toward zero at high $\|k\|$, with all three components behaving similarly. The phase $\arg \tilde{H}(k)$ (right) remains bounded within ±0.5 rad across the spec…
Figure 8
Figure 8: NO error profile estimated from the calibration split (…
Figure 9
Figure 9: Off-diagonal cross-spectral coherence of the NO residual (E–W component), estimated from …
Figure 10
Figure 10: Sensitivity to $\lambda_{\mathrm{NO}}$ at ρ = 5%. Vertical dashed line: calibrated operating point $\lambda_{\mathrm{NO}} = 0.35$. Left: pointwise metrics (rMAE, rRMSE) are flat across the shaded region $\lambda_{\mathrm{NO}} \in [0.1, 0.7]$. Right: banded spectral biases vary monotonically and cross zero near the calibrated value, with $\mathrm{rFFT}_{\mathrm{high}}$ spanning −0.189 to +0.457 across the sweep.
Original abstract

Neural operator surrogates (NO) approximate PDE solutions orders of magnitude faster than numerical solvers, but suffer from spectral bias: high-frequency content is systematically attenuated, limiting reliability where fine-scale structure matters. Sparse sensor measurements of the field are often available too, offering pointwise accuracy without spectral distortion but covering only a small fraction of the domain. We address this by treating NO predictions as auxiliary observations in a diffusion posterior sampling framework. Our method, FreqNO-DPS (https://github.com/niccoloperrone/FreqNO-DPS), combines an unconditional score-based diffusion prior, trained on high-fidelity simulations, with diffusion posterior sampling (DPS) conditioned on sparse observations and guided by a frozen neural operator. Naive integration reintroduces the surrogate's spectral bias; we resolve this with a closed-form, spectrally shaped guidance score that weights the surrogate by its frequency-dependent accuracy and needs no denoiser backpropagation. A distribution-free analysis bounds the approximation error across the frequency-diffusion-time plane and shows the guidance's frequency dependence is preserved regardless of distributional assumptions. On 3D elastic wavefield prediction at 5% and 2% sensor coverage, the method reaches near-zero spectral bias across all bands, where both the surrogate and sensor-only DPS show systematic high-frequency attenuation. Isotropic guidance, the natural baseline, improves pointwise accuracy but carries the bias into the posterior nearly intact, confirming that frequency-dependent calibration is essential, not merely beneficial. The framework needs only paired surrogate/reference data and exploits no problem-specific structure beyond the residual's approximate spectral diagonality, verifiable for new surrogates via the coherence diagnostic we provide.
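The sparse-observation setting in the abstract can be sketched as a random sensor mask at a target coverage ρ. This is a minimal illustration on a 2D grid under the assumption of noise-free point measurements; the function name `random_sensor_mask` and the grid size are hypothetical and not taken from the paper or its repository.

```python
import numpy as np

def random_sensor_mask(shape, coverage, rng):
    """Boolean mask selecting a random fraction `coverage` of grid points."""
    n_total = int(np.prod(shape))
    n_sensors = max(1, round(coverage * n_total))
    # Sample distinct flat indices so the coverage is met exactly.
    idx = rng.choice(n_total, size=n_sensors, replace=False)
    mask = np.zeros(n_total, dtype=bool)
    mask[idx] = True
    return mask.reshape(shape)

rng = np.random.default_rng(0)
u = rng.standard_normal((32, 32))      # stand-in for one surface wavefield snapshot
mask = random_sensor_mask(u.shape, coverage=0.05, rng=rng)
y = u[mask]                            # sparse observations at the sensor positions
```

At 5% coverage on a 32 × 32 grid this keeps 51 of 1,024 points, which shows how little of the field the sensors constrain directly.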

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper proposes FreqNO-DPS, which embeds a frozen neural operator surrogate as an auxiliary observation within diffusion posterior sampling (DPS) conditioned on sparse pointwise sensor data. A closed-form spectrally shaped guidance score is derived by weighting the surrogate according to its frequency-dependent accuracy, justified by approximate spectral diagonality of the residual; this is claimed to be the sole problem-specific structure and is accompanied by a distribution-free error bound across the frequency-diffusion-time plane. Experiments on 3D elastic wavefield prediction at 5% and 2% sensor coverage report near-zero spectral bias across bands, contrasting with systematic high-frequency attenuation in both the raw surrogate and sensor-only DPS; isotropic guidance is shown to preserve the bias.

Significance. If the distribution-free bounds and the spectral-diagonality assumption hold under the reported conditions, the work offers a practical route to mitigate spectral bias in neural operators for high-dimensional PDEs without retraining or problem-specific tuning beyond a verifiable diagnostic. Explicit credit is due for releasing reproducible code (https://github.com/niccoloperrone/FreqNO-DPS) and for supplying the coherence diagnostic that allows independent verification of the key assumption for new surrogates.

major comments (1)
  1. [Abstract / guidance derivation] Abstract and guidance derivation: the closed-form spectrally shaped guidance score is derived under the assumption that the residual is approximately spectrally diagonal, which is invoked to justify frequency-dependent weighting without denoiser back-propagation. The paper states this is verifiable via the coherence diagnostic and is the only problem-specific structure used; however, no explicit confirmation is provided that diagonality holds at high frequencies for the 3D elastic wavefield residuals (where wave-propagation physics may induce off-diagonal correlations). If diagonality fails, the frequency-dependent correction cannot be guaranteed to eliminate the surrogate's high-frequency attenuation, directly undermining the central claim of near-zero spectral bias at 2–5% coverage.
minor comments (1)
  1. [Abstract] The abstract refers to 'paired surrogate/reference data' but does not specify the exact training split or reference solver used for the 3D elastic experiments; adding a brief statement on data provenance would improve reproducibility.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful review and constructive feedback. The major comment concerns the need for explicit verification of the spectral diagonality assumption at high frequencies for the 3D elastic case. We address this below and will strengthen the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract / guidance derivation] The closed-form guidance rests on approximate spectral diagonality of the residual, yet no explicit confirmation is provided that diagonality holds at high frequencies for the 3D elastic wavefield residuals, where wave-propagation physics may induce off-diagonal correlations. If diagonality fails, the frequency-dependent correction cannot be guaranteed to eliminate the surrogate's high-frequency attenuation, which would undermine the central claim of near-zero spectral bias at 2–5% coverage.

    Authors: We thank the referee for this observation. The manuscript introduces the coherence diagnostic precisely to allow verification of the approximate spectral diagonality assumption and states that it is the sole problem-specific element used. While the experimental success (near-zero bias at 2–5% coverage) is consistent with the assumption holding, we agree that an explicit high-frequency confirmation for the 3D elastic residuals was not separately highlighted. In the revision we will add a dedicated panel (or subsection) displaying the coherence matrices or off-diagonal norms for the elastic surrogate across frequency bands, including the highest frequencies. This will directly confirm that off-diagonal correlations remain small, justifying the closed-form frequency-dependent guidance without back-propagation. The distribution-free error bound across the frequency–diffusion-time plane is independent of the diagonality assumption and continues to hold. We therefore view the addition as a clarification that strengthens rather than alters the central claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity; closed-form guidance derivation is independent of target results

full rationale

The paper presents a closed-form spectrally shaped guidance score derived from frequency-dependent surrogate accuracy, supported by a distribution-free error bound across the frequency-diffusion-time plane. The only problem-specific element invoked is the residual's approximate spectral diagonality, which is explicitly positioned as verifiable via a provided coherence diagnostic rather than assumed or fitted from the 3D elastic experiments. No equations reduce the guidance, bounds, or near-zero spectral bias claim to fitted parameters from the target data, self-citations, or self-definitional loops. The method is described as requiring only paired surrogate/reference data to calibrate the guidance, while the diffusion prior is trained separately on high-fidelity simulations, and the frequency dependence of the guidance is preserved regardless of distributional assumptions. This structure keeps the central derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, axioms, or invented entities; the distribution-free analysis and spectral diagonality assumption are invoked but not detailed enough to enumerate.


