
arXiv: 2405.15643 · v4 · submitted 2024-05-24 · stat.ML · cs.LG · cs.NA · math.AP · math.NA · math.PR

An Unconditional Representation of the Conditional Score in Infinite-Dimensional Linear Inverse Problems

Pith reviewed 2026-05-24 01:22 UTC · model grok-4.3

classification: stat.ML, cs.LG, cs.NA, math.AP, math.NA, math.PR
keywords: score-based diffusion models, linear inverse problems, conditional score, unconditional score, infinite-dimensional spaces, computed tomography, discretization invariance

The pith

For linear inverse problems the conditional score equals an exact affine transformation of any trained unconditional score.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Score-based diffusion models sample posteriors in Bayesian inverse problems but typically require repeated forward operator evaluations during generation. This paper establishes that when the forward operator is linear the conditional score can be recovered exactly from a trained unconditional score by simple affine transformations that depend only on the operator. The construction is carried out directly in infinite-dimensional function spaces so that the resulting sampler remains consistent under refinement of any discretization. All forward evaluations are moved to an offline training stage that learns a task-specific unconditional score. The method is demonstrated on high-dimensional CT and deblurring tasks.

Core claim

In infinite-dimensional linear inverse problems the conditional score of the posterior distribution equals an affine transformation of the unconditional score; the transformation is determined solely by the linear forward operator and can be applied without any further model evaluations once the unconditional score has been trained.
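
A finite-dimensional sketch may help fix ideas. It is not the paper's infinite-dimensional, time-dependent construction. It only assumes a prior with Lebesgue density p on R^d, an observation y = Ax + ε with ε ~ N(0, Γ), and an invertible Γ; the symbols p, ε and Γ are introduced here for illustration. At diffusion time zero, Bayes' rule then gives:

```latex
\begin{align*}
p(x \mid y) &\propto p(x)\,\exp\!\Big(-\tfrac12 \big\|\Gamma^{-1/2}(y - Ax)\big\|^2\Big), \\
\nabla_x \log p(x \mid y) &= \nabla_x \log p(x) + A^\top \Gamma^{-1}(y - Ax).
\end{align*}
```

The correction term is affine in x precisely because A is linear. How this carries over to positive diffusion times and to function spaces is the content of the paper and is not reproduced here.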

What carries the argument

Unconditional representation of the conditional score (UCoS): the exact affine map that converts a trained unconditional score into the conditional score for any linear forward operator.

If this is right

  • Posterior sampling requires zero evaluations of the forward operator at inference time.
  • All operator-dependent computation occurs once during offline training of the task-specific unconditional score.
  • The sampler remains consistent under arbitrary refinement of the discretization because the formulation never leaves the infinite-dimensional space.
  • The same trained unconditional score can be reused for any observation by applying the corresponding affine correction.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same affine identity may supply cheap conditional scores for other linear inverse problems outside imaging once an unconditional score network has been trained.
  • When the forward map is only approximately linear the affine correction could still serve as an inexpensive warm-start or regularizer for existing conditional-score methods.
  • Because the construction is operator-specific yet discretization-invariant it suggests a route to pre-compute score networks for entire families of linear operators that share the same singular system.

Load-bearing premise

The forward operator must be exactly linear.

What would settle it

Apply the derived affine map to a nonlinear forward operator (for example a nonlinear tomography problem) and check whether the resulting samples match the true posterior.
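
One ingredient of such a test can be sketched cheaply in finite dimensions. The snippet below does not compare samples against a true posterior. It only checks, under the assumption of Gaussian noise, whether the likelihood part of the score is affine in x for a linear map and for a hypothetical nonlinear map G(x) = tanh(Ax). All names, dimensions and the choice of tanh are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 5, 3
A = rng.standard_normal((m, d))          # illustrative linear forward operator
Gamma_inv = np.eye(m) / 0.1**2           # inverse noise covariance (assumed isotropic)
y = rng.standard_normal(m)               # an arbitrary observation


def likelihood_score_linear(x):
    """Gradient in x of log N(y; A x, Gamma) for the linear map x -> A x."""
    return A.T @ Gamma_inv @ (y - A @ x)


def likelihood_score_nonlinear(x):
    """Same gradient for the hypothetical nonlinear map G(x) = tanh(A x)."""
    g = np.tanh(A @ x)
    jac = (1.0 - g**2)[:, None] * A      # Jacobian of tanh(A x), shape (m, d)
    return jac.T @ Gamma_inv @ (y - g)


def affinity_defect(f, trials=100):
    """Largest violation of f(a*x1 + (1-a)*x2) = a*f(x1) + (1-a)*f(x2)."""
    worst = 0.0
    for _ in range(trials):
        x1, x2 = rng.standard_normal(d), rng.standard_normal(d)
        a = rng.uniform(-1.0, 2.0)
        lhs = f(a * x1 + (1 - a) * x2)
        rhs = a * f(x1) + (1 - a) * f(x2)
        worst = max(worst, float(np.max(np.abs(lhs - rhs))))
    return worst


defect_linear = affinity_defect(likelihood_score_linear)
defect_nonlinear = affinity_defect(likelihood_score_nonlinear)
print(f"linear defect:    {defect_linear:.2e}")
print(f"nonlinear defect: {defect_nonlinear:.2e}")
```

The linear defect should sit at floating-point round-off, while the nonlinear one should not. That is consistent with linearity being the load-bearing premise, though it says nothing by itself about sample quality.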

Figures

Figures reproduced from arXiv: 2405.15643 by Duc-Lam Duong, Fabian Schneider, Maarten V. de Hoop, Matti Lassas, Tapio Helin.

Figure 1: Diagonal of Σ_t for inpainting problem. Time-steps from left to right: t = 0.05, t = 0.1, t = 0.5 and t = 0.9.
Figure 2: Diagonal of Σ_t for CT imaging problem. Time-steps from left to right: t = 0.05, t = 0.1, t = 0.5 and t = 0.9.
Figure 3: Summary of posterior samples for inpainting problem. On the left column, the top image is ...
Figure 4: Summary of posterior samples for CT imaging problem. Top figure uses FNO with 32 nodes per ...
Figure 5: Comparison of the error dependence of the conditional method and UCoS on the parametrization ...
Figure 6: Summary of posterior samples for the Deblurring problem. On the left column, the top image is ...
Figure 7: Our FNO architecture. The scalar multiplication is given by ...
Figure 8: Posterior samples for Inpainting problem. Methods from top to bottom: SDE ALD, DPS, Proj, ...
Figure 9: Posterior samples for CT imaging problem with a FNO architecture that uses 32 nodes per layer.
Figure 10: Posterior samples for CT imaging problem with a FNO architecture that uses 64 nodes per layer.
Figure 11: Posterior samples for CT imaging problem with a FNO architecture that uses 128 nodes per ...
Figure 12: Posterior samples for Deblurring problem. Methods from top to bottom: SDE ALD, DPS, Proj, ...
Figure 13: Comparison of the error dependence of the conditional method and UCoS on the parametrization ...
Original abstract

Score-based diffusion models (SDMs) have emerged as a powerful tool for sampling from the posterior distribution in Bayesian inverse problems. However, existing methods often require multiple evaluations of the forward mapping to generate a single sample, resulting in significant computational costs for large-scale inverse problems. To address this, we propose an unconditional representation of the conditional score function (UCoS) tailored to linear inverse problems, which avoids forward model evaluations during sampling by shifting computational effort to an offline training phase. In this phase, a task-dependent score function is learned based on the linear forward operator. Crucially, we show that the conditional score can be derived exactly from a trained (unconditional) score using affine transformations, eliminating the need for conditional score approximations. Our approach is formulated in infinite-dimensional function spaces, making it inherently discretization-invariant. We support this formulation with a rigorous convergence analysis that justifies UCoS beyond any specific discretization. Finally we validate UCoS through high-dimensional computed tomography (CT) and image deblurring experiments, demonstrating both scalability and accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims an exact unconditional representation of the conditional score (UCoS) for linear inverse problems in infinite-dimensional Hilbert spaces. It asserts that the conditional score equals an affine transformation of a trained task-dependent unconditional score, eliminating the need for conditional approximations or online forward evaluations. The formulation is discretization-invariant, supported by a rigorous convergence analysis, and validated on high-dimensional CT and deblurring tasks.

Significance. If the exact affine relation holds under the necessary operator conditions, the result would substantially reduce sampling costs in score-based diffusion models for Bayesian inverse problems by moving computation to an offline training phase. The infinite-dimensional formulation and convergence analysis are strengths that aim to deliver discretization invariance, which is practically relevant for applications such as CT where grid choices vary.

major comments (2)
  1. [Abstract] The claim that the conditional score 'can be derived exactly' from an unconditional score 'for any linear forward operator' is load-bearing for the central contribution, yet the manuscript supplies no explicit list of the regularity conditions on A (boundedness, closed range) and on the noise covariance (trace-class) that are required for the push-forward measures to admit densities whose scores are well-defined in infinite-dimensional spaces. Without these conditions the exact affine identity does not follow in general.
  2. [Derivation section, likely §3 or §4] The affine transformation identity must be shown to hold only under the operator assumptions that guarantee the relevant Radon-Nikodym derivatives exist; if the proof tacitly invokes stronger properties than those satisfied by typical continuum-limit CT or deblurring operators, the 'exact' claim for general linear operators is not justified.
minor comments (1)
  1. [Abstract] The convergence analysis is described as 'rigorous' in the abstract; a brief statement of the precise mode of convergence (e.g., in total variation or Wasserstein distance to the target posterior) would improve clarity without altering the technical content.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments highlighting the need for explicit regularity conditions in the infinite-dimensional setting. We agree that making these assumptions precise strengthens the manuscript and will incorporate the requested clarifications in the revision.

Point-by-point responses
  1. Referee: [Abstract] The claim that the conditional score 'can be derived exactly' from an unconditional score 'for any linear forward operator' is load-bearing for the central contribution, yet the manuscript supplies no explicit list of the regularity conditions on A (boundedness, closed range) and on the noise covariance (trace-class) that are required for the push-forward measures to admit densities whose scores are well-defined in infinite-dimensional spaces. Without these conditions the exact affine identity does not follow in general.

    Authors: We agree that an explicit enumeration of assumptions is required for rigor. In the revised manuscript we will add a dedicated paragraph (new §2.3) stating the standing assumptions: A is a bounded linear operator between separable Hilbert spaces with closed range; the noise covariance operator is trace-class, positive semi-definite and such that the push-forward measures are absolutely continuous with respect to a common Gaussian reference measure whose Cameron-Martin space contains the range of A. These are the minimal conditions guaranteeing the existence of the relevant Radon-Nikodym derivatives and hence of the scores. The phrase 'any linear forward operator' will be qualified by 'satisfying the above regularity conditions'. revision: yes

  2. Referee: [Derivation section, likely §3 or §4] The affine transformation identity must be shown to hold only under the operator assumptions that guarantee the relevant Radon-Nikodym derivatives exist; if the proof tacitly invokes stronger properties than those satisfied by typical continuum-limit CT or deblurring operators, the 'exact' claim for general linear operators is not justified.

    Authors: The derivation in §3 proceeds from the explicit form of the conditional density under Gaussian noise and uses only the boundedness of A together with the trace-class property of the noise covariance; apart from closed range, no further property of A (such as compactness) is invoked. The convergence analysis in §4 is written precisely to justify passage to the continuum limit for the same class of operators that arise in discretized CT and deblurring (bounded, closed-range, with trace-class noise). We will revise §3 to insert explicit cross-references to the new assumption paragraph at each step where a Radon-Nikodym derivative appears, thereby making the scope of the identity transparent. This does not alter the applicability of the method to the reported experiments. revision: yes
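
The trace-class requirement named in the responses above can be illustrated with a small numerical sketch. It assumes a diagonal covariance operator and compares two hypothetical spectra, λ_k = k^(-2) and λ_k = 1, under successive truncations. Neither spectrum is taken from the paper, and the helper name is made up for this example.

```python
import numpy as np


def trace_partial_sums(eigenvalue_fn, sizes):
    """Trace of the n-mode truncation of a diagonal covariance, for each n in sizes."""
    return [float(np.sum(eigenvalue_fn(np.arange(1, n + 1)))) for n in sizes]


sizes = [10, 100, 1_000, 10_000]

# Assumed spectrum lambda_k = k^(-2): summable, so the operator is trace-class.
smooth = trace_partial_sums(lambda k: k.astype(float) ** -2.0, sizes)

# White noise, lambda_k = 1: the trace equals n and diverges under refinement.
white = trace_partial_sums(lambda k: np.ones(k.shape), sizes)

for n, s, w in zip(sizes, smooth, white):
    print(f"n={n:>6}  trace(k^-2)={s:.6f}  trace(identity)={w:.0f}")
```

The first trace stabilises as the truncation is refined (it approaches π²/6), while the second grows without bound. This is one way to see why an identity covariance does not define a Gaussian measure on an infinite-dimensional Hilbert space, and why a discretization-invariant formulation needs a condition of this kind.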

Circularity Check

0 steps flagged

No circularity: exact affine relation derived from linearity of forward operator

Full rationale

The paper derives the claimed exact relation between conditional and unconditional scores directly from the linearity of the forward operator A in infinite-dimensional Hilbert spaces, together with standard properties of score functions and Gaussian noise. The derivation is presented as a mathematical identity (conditional score = affine transform of unconditional score) supported by a convergence analysis that does not rely on fitting parameters to data or on self-citations whose content is unverified. No step reduces by construction to its own inputs, renames a known empirical pattern, or imports a uniqueness result from the authors' prior work. The result is therefore self-contained and independent of the training procedure for the unconditional score.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on linearity of the forward operator (a domain assumption) and on the existence of a well-defined unconditional score that can be learned to sufficient accuracy (a modeling choice specific to this paper). No free parameters or invented entities are visible from the abstract.

axioms (1)
  • Domain assumption: The forward operator is linear.
    Invoked when the abstract states that the conditional score is derived exactly via affine transformations.


