
arxiv: 2607.02101 · v1 · pith:AR2MN5WMnew · submitted 2026-07-02 · 📊 stat.ME · stat.ML

Sequential Structure-Sensitive Residual Diagnostics for PDE Inverse Problems

Pith reviewed 2026-07-03 07:47 UTC · model grok-4.3

classification 📊 stat.ME stat.ML
keywords e-processes · residual diagnostics · PDE inverse problems · sequential testing · model misspecification · structure-sensitive diagnostics · anytime-valid inference

The pith

A portfolio of spatial residual-pattern experts detects structured model errors in PDE inversions with anytime-valid error control.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Residual-norm checks often accept fitted PDE models because structured errors attenuate in observation space, leaving magnitudes below thresholds while coherent patterns persist and bias predictions. The paper proposes a sequential diagnostic that maintains a portfolio of experts, each tuned to a different residual pattern, and updates their likelihood-ratio wealth as data arrive. Rejection occurs when aggregate wealth exceeds a threshold, delivering type-I error control that remains valid at any stopping time. Demonstrations on elliptic diffusion, Stokes flow, and ice-stream inversion show that the method flags misspecifications that Morozov checks accept, does so from a fraction of the observations that batch tests use, and attributes the evidence to specific patterns to guide model correction.

Core claim

The paper introduces a structure-sensitive sequential diagnostic based on e-processes for PDE inverse problems. It uses a portfolio of spatial residual-pattern experts, updates their likelihood-ratio wealth sequentially, and rejects the fitted model when aggregate wealth crosses a threshold, providing anytime-valid type-I error control for a fixed model. In three inverse problems the method detects failures that standard discrepancy checks miss, identifies them from a fraction of the data, and uses expert wealth to point toward corrective residual patterns.

What carries the argument

An e-process built from a portfolio of spatial residual-pattern experts whose likelihood ratios are multiplied into wealth as observations are processed.
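The wealth update described above can be sketched in a few lines. This is an illustrative sketch under assumptions that are not taken from the paper: independent Gaussian noise with known standard deviation `sigma`, experts given as fixed spatial patterns evaluated at the observation locations, a hypothetical fixed bet size `theta`, and an equal-weight average as the aggregate. The function name `run_eprocess` is likewise hypothetical.

```python
import numpy as np

def run_eprocess(residuals, patterns, sigma, theta=0.5, alpha=0.05):
    """Equal-weight mixture e-process over residual-pattern experts (sketch).

    residuals: (n,) residuals in the order they are processed
    patterns:  (K, n) expert patterns evaluated at the same locations
    sigma:     noise standard deviation, assumed known
    theta:     hypothetical fixed amplitude each expert bets on
    Returns (stopping index or None, per-expert log-wealth at stop or end).
    """
    residuals = np.asarray(residuals, dtype=float)
    patterns = np.asarray(patterns, dtype=float)
    n_experts, n_obs = patterns.shape
    log_wealth = np.zeros(n_experts)        # every expert starts with wealth 1
    log_threshold = np.log(1.0 / alpha)
    for i in range(n_obs):
        shift = theta * patterns[:, i]      # mean each expert predicts for r_i
        # Gaussian log likelihood ratio: N(r; shift, sigma^2) vs N(r; 0, sigma^2)
        log_wealth += (shift * residuals[i] - 0.5 * shift**2) / sigma**2
        # Aggregate wealth is the average of the expert wealths (log-domain)
        log_aggregate = np.logaddexp.reduce(log_wealth) - np.log(n_experts)
        if log_aggregate >= log_threshold:
            return i + 1, log_wealth        # reject after i + 1 observations
    return None, log_wealth
```

After a rejection, comparing the entries of the returned `log_wealth` indicates which patterns supplied the evidence, which is the attribution step the summary describes. A fixed `theta` is the simplest choice; adaptive or mixture bets are alternatives not shown here.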

If this is right

  • Standard residual-norm diagnostics accept models that produce materially wrong quantities of interest.
  • The sequential test rejects misspecified fits earlier than fixed-sample or batch projection tests.
  • Expert wealth after rejection identifies which residual patterns supply the evidence.
  • Anytime-valid control allows valid inference even if data collection stops adaptively.
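
The anytime-valid guarantee in the last bullet is the usual consequence of Ville's inequality for nonnegative supermartingales. The notation is assumed here rather than taken from the paper: W_t is the aggregate wealth after t observations with W_0 = 1, and τ is the rejection time.

```latex
\mathbb{P}_{H_0}\!\left( \exists\, t \ge 1 : W_t \ge \tfrac{1}{\alpha} \right) \le \alpha,
\qquad
\tau = \inf\left\{ t \ge 1 : \log W_t \ge \log(1/\alpha) \right\}.
```

For α = 0.05 the log-threshold is log(20) ≈ 3.0, the value marked in Figure 2.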

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same expert-portfolio construction could be applied to inverse problems outside PDEs, such as tomographic reconstruction.
  • Automated selection or expansion of the expert dictionary could reduce reliance on manual pattern choice.
  • The method supplies a natural interface between sequential testing and downstream model refinement loops.

Load-bearing premise

A pre-specified portfolio of spatial residual-pattern experts is sufficient to capture the structured model errors that matter.

What would settle it

A simulation in which a known structured model error biases a quantity of interest yet the e-process never rejects, or rejects a correctly specified model at a rate exceeding the nominal threshold.
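
The second half of that check, the false-rejection rate of a correctly specified model, can be estimated by Monte Carlo. The sketch below is a toy stand-in, not the paper's experiment: it assumes independent Gaussian noise with known `sigma`, a fixed hypothetical bet size `theta`, an equal-weight expert mixture, and a helper name `crossing_rate` invented for illustration.

```python
import numpy as np

def crossing_rate(patterns, sigma, n_trials, alpha=0.05, theta=0.5,
                  seed=0, shift=None):
    """Fraction of simulated runs whose aggregate wealth ever reaches 1/alpha.

    patterns: (K, n) expert patterns; shift: optional (n,) mean added to noise
    (None simulates the null, i.e. a correctly specified model).
    """
    rng = np.random.default_rng(seed)
    patterns = np.asarray(patterns, dtype=float)
    n_experts, n_obs = patterns.shape
    mean = np.zeros(n_obs) if shift is None else np.asarray(shift, dtype=float)
    log_threshold = np.log(1.0 / alpha)
    crossings = 0
    for _ in range(n_trials):
        r = mean + sigma * rng.standard_normal(n_obs)
        bets = theta * patterns                          # (K, n) predicted means
        increments = (bets * r - 0.5 * bets**2) / sigma**2
        log_wealth = np.cumsum(increments, axis=1)       # wealth path per expert
        log_agg = np.logaddexp.reduce(log_wealth, axis=0) - np.log(n_experts)
        crossings += bool(np.any(log_agg >= log_threshold))
    return crossings / n_trials
```

Calling it with `shift=None` estimates the type-I rate, which should not exceed `alpha` beyond Monte Carlo error. Passing a structured `shift` gives the detection rate for that alternative.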

Figures

Figures reproduced from arXiv: 2607.02101 by Ieva Kazlauskaite.

Figure 1: Four misspecification types at λ = 0.15. True interpolated diffusivity (orange) vs. best-fit exponential (blue dashed) and null (gray dotted). The linear case (bottom right, RMS = 0.00186) is nearly indistinguishable; the exponential family approximates it well, leaving minimal residual structure. The bump and three-step cases produce larger mismatches.
Figure 2: E-process under H0 (correct model). 200 trajectories (light blue), median (black). The 5th–95th percentile band stays well below log(1/α) = 3.0 (red dashed); 1 of 200 trajectories crosses at some point, yielding an empirical type-I rate of 0.005. The median drifts to ≈ −3: expert bets systematically lose against pure noise.
Figure 3: Detection rate vs. λ. E-process (blue), Bonferroni (light blue), Morozov (gray). Top axes: model error magnitude ∥μ*∥. Dashed red: α = 0.05.
Figure 4: Diagnostic comparison for the linear case (…
Figure 5: Diagnostic failure on the 2D Stokes inverse problem; representative single runs from …
Figure 6: Diagnostic failure on a glaciological SSA inverse problem (…
Figure 7: Detect–Diagnose–Correct on the linear case (…
Figure 8: Detection power and solution-space model error across the four misspecification types.
Figure 9: E-process median stopping time vs. λ for the four misspecification types, where detection rate > 10%. Stopping time falls monotonically with λ for the bump, piecewise, and three-step cases; the subnoise linear case is non-monotonic and slower. The full comparison with the magnitude-based tests is in …
Original abstract

Computational models in science and engineering are often assessed by checking whether the residual norm is consistent with the assumed noise level. This can be misleading in smoothing inverse problems: structured model errors may be attenuated in observation space, leaving residual magnitudes below practitioner discrepancy thresholds while coherent residual patterns remain. As a result, residual-norm diagnostics can accept fitted models that still give biased parameters, predictions, or quantities of interest. We propose a structure-sensitive sequential diagnostic based on e-processes. The method uses a portfolio of spatial residual-pattern experts, updates their likelihood-ratio wealth as observations are processed, and rejects the fitted model when the aggregate wealth crosses a prescribed threshold, giving anytime-valid type-I error control for a fixed fitted model. We compare the method with Morozov discrepancy checks, fixed-sample residual tests, and batch projection tests. Across three inverse problems (elliptic diffusion, two-dimensional Stokes flow, and a glaciological ice-stream inversion implemented in the community finite-element model icepack) we demonstrate how standard discrepancy checks accept misspecified fits that produce materially wrong quantities of interest. Structure-sensitive batch tests detect these failures using the full dataset, while the e-process detects them earlier from a fraction of the observations. After rejection, the expert wealth attributes the evidence to residual patterns in the chosen dictionary and provides a basis for exploratory model correction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a sequential structure-sensitive residual diagnostic for PDE inverse problems based on e-processes. A portfolio of spatial residual-pattern experts is used to update likelihood-ratio wealth as data arrive; the fitted model is rejected when aggregate wealth crosses a threshold. This construction is claimed to deliver anytime-valid type-I error control for a fixed fitted model. The approach is compared to Morozov discrepancy checks, fixed-sample residual tests, and batch projection tests. Across three inverse problems (elliptic diffusion, 2D Stokes flow, and a glaciological ice-stream inversion in icepack), the paper shows that norm-based diagnostics accept misspecified fits that bias quantities of interest, while the e-process method detects failures earlier and attributes evidence to specific residual patterns in the dictionary.

Significance. If the claims hold, the work would provide a practically useful sequential diagnostic that addresses a known limitation of residual-norm checks in smoothing inverse problems. The anytime-valid control and the attribution of evidence to expert patterns are attractive features for iterative model building. The empirical demonstrations on community finite-element code strengthen the case for applicability. The significance is reduced, however, by the absence of any argument that the pre-specified expert portfolio spans the relevant structured errors.

major comments (2)
  1. [Abstract (method description)] Abstract (method paragraph): The central practical claim—that the diagnostic detects misspecifications that produce biased quantities of interest while residual norms remain acceptable—rests on the assumption that the chosen portfolio of spatial residual-pattern experts is sufficiently rich. No coverage argument, completeness result, or adaptive enlargement procedure is supplied; if a coherent residual pattern lies outside the linear span of the experts, aggregate wealth need not grow and the method can accept a misspecified model. This directly affects the claim that the procedure improves upon standard discrepancy checks.
  2. [Abstract and §3] Abstract (comparison paragraph) and §3 (empirical examples): The reported earlier detection in the three inverse problems is presented as evidence of superiority, yet the manuscript provides no quantitative assessment of power against alternatives outside the expert dictionary. Without such a check (e.g., a synthetic misspecification deliberately constructed to lie in the orthogonal complement of the portfolio), it remains unclear whether the observed advantage generalizes beyond the chosen dictionary.
minor comments (1)
  1. [Abstract] The abstract refers to “a portfolio of spatial residual-pattern experts” without a concise definition or reference to the precise functional forms used; a short explicit list or equation in the main text would improve readability.
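
The out-of-dictionary check requested in major comment 2 needs a perturbation orthogonal to the expert portfolio. One way to construct it is a least-squares projection, sketched below. The helper name `orthogonal_component` and the representation of experts as rows of a `(K, n)` array are assumptions made for illustration, not details from the manuscript.

```python
import numpy as np

def orthogonal_component(candidate, patterns):
    """Return `candidate` minus its least-squares projection onto span(patterns).

    candidate: (n,) perturbation evaluated at the observation locations
    patterns:  (K, n) expert patterns at the same locations
    """
    candidate = np.asarray(candidate, dtype=float)
    basis = np.asarray(patterns, dtype=float).T       # columns span the dictionary
    coeffs, *_ = np.linalg.lstsq(basis, candidate, rcond=None)
    return candidate - basis @ coeffs                 # residual is orthogonal to span
```

Orthogonality here holds over the full set of observation locations. The partial sums seen by a sequential procedure need not vanish, so such a perturbation is a starting point for the check rather than a guarantee that wealth is unaffected at every intermediate time.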

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. The major comments correctly identify that the method's performance depends on the expert portfolio and that the empirical comparisons are limited to cases captured by that portfolio. We respond point by point below.

point-by-point responses
  1. Referee: [Abstract (method description)] Abstract (method paragraph): The central practical claim—that the diagnostic detects misspecifications that produce biased quantities of interest while residual norms remain acceptable—rests on the assumption that the chosen portfolio of spatial residual-pattern experts is sufficiently rich. No coverage argument, completeness result, or adaptive enlargement procedure is supplied; if a coherent residual pattern lies outside the linear span of the experts, aggregate wealth need not grow and the method can accept a misspecified model. This directly affects the claim that the procedure improves upon standard discrepancy checks.

    Authors: We agree that the diagnostic detects misspecifications only when they produce residuals aligned with the pre-specified expert portfolio. The manuscript supplies no coverage argument or completeness result, as establishing such a guarantee for arbitrary PDE misspecifications would require assumptions beyond the scope of the work. The portfolio is instead assembled from standard spatial patterns (gradients, curvatures, localized bumps) that practitioners can tailor to the application. We will revise the abstract to qualify the improvement claim as conditional on expert alignment and add a discussion paragraph explaining how the dictionary can be expanded. This makes the scope explicit without overstating generality. revision: partial

  2. Referee: [Abstract and §3] Abstract (comparison paragraph) and §3 (empirical examples): The reported earlier detection in the three inverse problems is presented as evidence of superiority, yet the manuscript provides no quantitative assessment of power against alternatives outside the expert dictionary. Without such a check (e.g., a synthetic misspecification deliberately constructed to lie in the orthogonal complement of the portfolio), it remains unclear whether the observed advantage generalizes beyond the chosen dictionary.

    Authors: The three inverse-problem examples were selected because the induced misspecifications project onto the expert dictionary and bias quantities of interest while leaving residual norms acceptable. The e-process theory already implies that wealth remains a martingale (and does not cross thresholds) for components orthogonal to the experts. We will add a short synthetic illustration in §3 that constructs an orthogonal perturbation, confirms that aggregate wealth stays bounded, and contrasts this with the in-span cases. This supplies the requested quantitative check on behavior outside the dictionary. revision: yes

standing simulated objections not resolved
  • A general coverage or completeness result establishing that the expert portfolio spans all relevant structured errors for arbitrary PDE inverse problems.

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper proposes a new sequential diagnostic method based on e-processes applied to a pre-specified portfolio of spatial residual-pattern experts. The central construction (wealth updates, aggregate threshold crossing for rejection, anytime-valid type-I control) follows from standard e-process theory applied to the chosen experts; no equations reduce a claimed prediction or result to a fitted parameter or self-defined quantity by construction. The portfolio is explicitly treated as given input rather than derived, and external comparisons (Morozov, batch tests) are presented as benchmarks rather than internal fits. No load-bearing self-citations or ansatz smuggling appear in the provided description. The derivation is therefore self-contained against external statistical machinery.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Ledger constructed from abstract only; full paper would likely add more domain assumptions about PDE solvers and noise models.

axioms (1)
  • domain assumption: Observations admit a noise model under which likelihood ratios for residual patterns can be constructed and updated sequentially.
    Required for the wealth-update step of the e-processes described in the abstract.
invented entities (1)
  • portfolio of spatial residual-pattern experts (no independent evidence)
    purpose: To detect different coherent residual structures that norm-based checks miss
    Introduced as the core mechanism for the structure-sensitive diagnostic.

pith-pipeline@v0.9.1-grok · 5759 in / 1316 out tokens · 31332 ms · 2026-07-03T07:47:17.132002+00:00 · methodology

