arxiv: 2502.20086 · v2 · submitted 2025-02-27 · 🧮 math.OC · cs.NA · math.NA

Subspace accelerated measure transport methods for fast and scalable sequential experimental design, with application to photoacoustic imaging

Pith reviewed 2026-05-23 02:37 UTC · model grok-4.3

classification 🧮 math.OC · cs.NA · math.NA
keywords sequential optimal experimental design · incremental expected information gain · measure transport maps · likelihood-informed subspaces · Bayesian inverse problems · photoacoustic imaging · PDE models

The pith

A derivative-based upper bound on incremental expected information gain enables subspace projectors and conditional transport maps for scalable sequential optimal experimental design in high-dimensional Bayesian inverse problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method for sequential optimal experimental design that repeatedly maximizes the expected information gain from prior to posterior in non-Gaussian Bayesian inverse problems. It derives a general derivative-based upper bound on the incremental expected information gain that both selects the next design point and constructs projectors onto likelihood-informed subspaces. These projectors reduce the parameter dimension before conditional measure transport maps are built for each successive posterior. The resulting framework unifies sequential design with amortized inference and is shown to remain practical for high- and infinite-dimensional problems governed by PDEs.

Core claim

A derivative-based upper bound for the incremental expected information gain guides design placement and simultaneously supplies projectors onto likelihood-informed subspaces; when these projectors are paired with conditional measure transport maps for the sequence of posteriors, the combined procedure yields a unified, scalable framework for sequential optimal experimental design and amortized inference that works in high- and infinite-dimensional settings.
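To make the idea of bound-driven design placement concrete, here is a minimal sketch on a linear-Gaussian toy problem, where every quantity has a closed form. It is not the paper's algorithm: the candidate rows, the noise level, and the function name `greedy_linear_design` are assumptions made for illustration. For a scalar observation y = gᵀx + noise with current covariance C, the exact incremental gain is ½ log(1 + gᵀCg/σ²), and a log-Sobolev-type derivative bound reduces to ½ gᵀCg/σ², so in this special case maximizing the bound selects the same design as maximizing the exact gain.

```python
import numpy as np

def greedy_linear_design(candidates, prior_cov, noise_std, n_stages):
    """Greedy sequential design for y = g^T x + noise (toy, linear-Gaussian).

    candidates: (n_candidates, n_param) array, one observation row g per design.
    Returns the chosen indices, (exact gain, bound) per stage, and the final covariance.
    """
    cov = prior_cov.copy()
    chosen, gains = [], []
    for _ in range(n_stages):
        # Signal-to-noise ratio g^T C g / sigma^2 for every candidate design.
        snr = np.einsum("ij,jk,ik->i", candidates, cov, candidates) / noise_std**2
        bound = 0.5 * snr            # derivative-based upper bound (this toy case)
        exact = 0.5 * np.log1p(snr)  # exact incremental expected information gain
        k = int(np.argmax(bound))    # place the design by maximizing the bound
        chosen.append(k)
        gains.append((exact[k], bound[k]))
        # Rank-one Gaussian update of the covariance after observing along g.
        g = candidates[k]
        cov_g = cov @ g
        cov = cov - np.outer(cov_g, cov_g) / (noise_std**2 + g @ cov_g)
    return chosen, gains, cov

rng = np.random.default_rng(0)
candidates = rng.standard_normal((8, 5))  # hypothetical candidate designs
chosen, gains, post_cov = greedy_linear_design(candidates, np.eye(5), 0.5, 3)
print("chosen designs:", chosen)
```

In non-Gaussian settings the exact gain is not available in closed form, which is the situation the paper's bound is meant to address.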

What carries the argument

A derivative-based upper bound on the iEIG, together with likelihood-informed subspace projectors and conditional measure transport maps.
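As a rough illustration of how derivative information can yield a subspace projector, the sketch below builds a common gradient-based diagnostic matrix H ≈ E[JᵀJ]/σ² from Jacobians evaluated at prior samples and keeps its leading eigenvectors. The forward model `tanh(A x)`, the whitened standard-Gaussian prior, and the name `lis_projector` are assumptions for illustration; the paper's own construction from its iEIG bound may differ.

```python
import numpy as np

def lis_projector(jacobians, noise_std, rank):
    """Projector onto the leading eigenvectors of H = mean_i J_i^T J_i / sigma^2.

    jacobians: (n_samples, n_obs, n_param) array of forward-model Jacobians,
    evaluated at samples of a whitened (standard Gaussian) prior.
    """
    n_samples = jacobians.shape[0]
    H = np.einsum("sij,sik->jk", jacobians, jacobians) / (n_samples * noise_std**2)
    eigvals, eigvecs = np.linalg.eigh(H)   # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]      # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    U = eigvecs[:, :rank]                  # basis of the informed subspace
    return U @ U.T, eigvals

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 20))           # hypothetical rank-2 sensitivity
xs = rng.standard_normal((500, 20))        # samples from N(0, I)
# Jacobian of G(x) = tanh(A x) is diag(1 - tanh(A x)^2) A.
jacs = (1.0 - np.tanh(xs @ A.T) ** 2)[:, :, None] * A[None, :, :]
P, spectrum = lis_projector(jacs, noise_std=0.1, rank=2)
print("leading eigenvalues:", spectrum[:4])
```

In this toy the data only ever see the two-dimensional row space of `A`, so the spectrum of H drops to numerical zero after two eigenvalues and the projector recovers that row space.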

If this is right

  • Design selection can be performed by maximizing the bound instead of the intractable iEIG at each stage.
  • Subspace projectors built from the bound reduce the effective parameter dimension before each transport map is trained.
  • Conditional transport maps enable amortized sampling from every posterior that arises during the sequential process.
  • The same framework is intended to scale to high-dimensional discretizations of infinite-dimensional parameters in PDE-governed problems.
  • Numerical tests on two PDE-governed inverse problems confirm that the resulting designs improve information gain.
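The amortization point in the list above can be sketched with the simplest possible conditional transport map: an affine lower-triangular map fitted to joint samples of (data, parameter). This is a stand-in under a Gaussian assumption, not the paper's map parameterization, and the names `fit_gaussian_triangular_map` and `sample_conditional`, the forward matrix `A`, and the noise level are hypothetical. Once fitted, the same map produces posterior samples for any observed data without refitting.

```python
import numpy as np

def fit_gaussian_triangular_map(y_samples, x_samples):
    """Fit an affine lower-triangular map to joint samples ordered as (y, x)."""
    joint = np.hstack([y_samples, x_samples])
    mean = joint.mean(axis=0)
    chol = np.linalg.cholesky(np.cov(joint, rowvar=False))  # lower triangular
    return mean, chol, y_samples.shape[1]

def sample_conditional(mean, chol, n_y, y_obs, n_draws, rng):
    """Draw x | y = y_obs by fixing the data block of the triangular map."""
    L_yy, L_xy, L_xx = chol[:n_y, :n_y], chol[n_y:, :n_y], chol[n_y:, n_y:]
    z_y = np.linalg.solve(L_yy, y_obs - mean[:n_y])  # reference coordinates of y_obs
    z_x = rng.standard_normal((n_draws, L_xx.shape[0]))
    return mean[n_y:] + L_xy @ z_y + z_x @ L_xx.T

rng = np.random.default_rng(1)
A = np.array([[1.0, 0.5], [0.0, 2.0], [1.0, -1.0]])  # hypothetical linear forward model
sigma = 0.5
x_train = rng.standard_normal((100_000, 2))
y_train = x_train @ A.T + sigma * rng.standard_normal((100_000, 3))
mean, chol, n_y = fit_gaussian_triangular_map(y_train, x_train)

y_obs = np.array([0.3, -1.0, 0.8])
draws = sample_conditional(mean, chol, n_y, y_obs, 20_000, rng)
print("conditional mean estimate:", draws.mean(axis=0))
```

For a jointly Gaussian toy like this one the affine map is exact up to sampling error; non-Gaussian posteriors need nonlinear triangular maps, which is where dimension reduction becomes important.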

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same bound-driven projectors might be reused across multiple related inverse problems that share the same forward operator.
  • Transport-map amortization could support real-time sequential design when new data arrive continuously.
  • The approach may reduce the cost of related tasks such as Bayesian optimal design under model discrepancy.

Load-bearing premise

The derivative-based upper bound on iEIG is sufficiently tight to both select good designs and produce projectors that meaningfully reduce the dimension for the transport maps.

What would settle it

On a low-dimensional synthetic inverse problem where the exact iEIG can be computed by quadrature, compare the designs chosen by maximizing the bound against those chosen by maximizing the true iEIG and check whether the selected design sequences coincide.
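A minimal sketch of such a check, for a single design stage, might look as follows. It assumes a one-dimensional parameter with a standard Gaussian prior, a hypothetical forward model G(x, d) = tanh(x − d) with sensor location d, and additive Gaussian noise. The bound used here, EIG ≤ E[(∂G/∂x)²]/(2σ²), is a log-Sobolev-type bound valid under those assumptions; it stands in for the paper's iEIG bound and is not claimed to coincide with it.

```python
import numpy as np

SIGMA = 0.3  # assumed noise standard deviation

def forward(x, d):
    return np.tanh(x - d)

def forward_grad(x, d):
    return 1.0 - np.tanh(x - d) ** 2

def eig_quadrature(d, nx=1401, ny=801):
    """Exact EIG = h(y) - h(y | x), computed by Riemann sums on uniform grids."""
    x = np.linspace(-7.0, 7.0, nx)
    dx = x[1] - x[0]
    prior = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
    y = np.linspace(-1.0 - 7.0 * SIGMA, 1.0 + 7.0 * SIGMA, ny)
    dy = y[1] - y[0]
    # Likelihood p(y | x) on the tensor grid, shape (ny, nx).
    resid = (y[:, None] - forward(x, d)[None, :]) / SIGMA
    lik = np.exp(-0.5 * resid**2) / (SIGMA * np.sqrt(2.0 * np.pi))
    evidence = (lik * prior[None, :]).sum(axis=1) * dx  # marginal density p(y)
    mask = evidence > 0
    h_y = -(evidence[mask] * np.log(evidence[mask])).sum() * dy
    h_y_given_x = 0.5 * np.log(2.0 * np.pi * np.e * SIGMA**2)
    return h_y - h_y_given_x

def derivative_bound(d, nx=1401):
    """Upper bound E_prior[(dG/dx)^2] / (2 sigma^2), by the same quadrature in x."""
    x = np.linspace(-7.0, 7.0, nx)
    dx = x[1] - x[0]
    prior = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
    return 0.5 / SIGMA**2 * (prior * forward_grad(x, d) ** 2).sum() * dx

designs = np.linspace(-3.0, 3.0, 13)
eig = np.array([eig_quadrature(d) for d in designs])
bound = np.array([derivative_bound(d) for d in designs])
print("argmax of exact EIG:", designs[np.argmax(eig)])
print("argmax of bound    :", designs[np.argmax(bound)])
```

Comparing the two argmax values, and the ranking of designs more generally, gives a direct view of how tight the bound needs to be for design selection; a sequential version would repeat the comparison with each posterior as the next prior.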

Original abstract

We propose a novel approach for sequential optimal experimental design (sOED) for Bayesian inverse problems involving expensive models with high-dimensional unknown parameters. This work focuses on designs that maximize the expected information gain (EIG) from prior to posterior, a task that is computationally very challenging in non-Gaussian settings. This challenge is amplified in sOED, as the incremental expected information gain (iEIG) must be repeatedly approximated across distinct stages, with both prior and posterior distributions being intractable. To address this, we derive a general-purpose, derivative-based upper bound for the iEIG, which not only guides design placement but also enables the construction of projectors onto likelihood-informed subspaces, facilitating parameter dimension reduction. By combining this approach with conditional measure transport maps for the sequence of posteriors, we develop a unified sOED and amortized inference framework scalable to high- and infinite-dimensional problems. Numerical experiments for two inverse problems governed by partial differential equations (PDEs) demonstrate the effectiveness of designs by maximizing the proposed bound.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to derive a general-purpose derivative-based upper bound for the incremental expected information gain (iEIG) in sequential optimal experimental design (sOED) for high-dimensional Bayesian inverse problems with expensive PDE models. This bound is asserted to simultaneously guide design selection and enable projectors onto likelihood-informed subspaces for parameter dimension reduction. The approach is combined with conditional measure transport maps to create a unified, scalable framework for sOED and amortized inference. Effectiveness is demonstrated via numerical experiments on two PDE-governed inverse problems.

Significance. If the derivative-based upper bound produces subspaces that reliably preserve the structure of the sequence of posteriors for accurate conditional transport maps, the work would represent a meaningful advance in tractable sOED for non-Gaussian, high- and infinite-dimensional settings. The unification of design optimization with amortized inference via transport maps addresses a genuine computational bottleneck, and the PDE numerical results provide initial evidence of practicality. The contribution would be strengthened by explicit verification that the bound's subspaces remain effective beyond the reported examples.

major comments (2)
  1. [Abstract] Abstract (paragraph on the bound and projectors): the central claim that the derivative-based upper bound 'enables the construction of projectors onto likelihood-informed subspaces' that make transport maps tractable rests on an unproven alignment between the bound and posterior variability. No theorem or analysis is cited showing that directions emphasized by the bound (computed from derivatives) coincide with those carrying the essential non-Gaussian posterior mass in the sequence of posteriors; this is load-bearing for the scalability assertion in high-dimensional non-Gaussian PDE settings.
  2. [Numerical experiments] Numerical experiments section: while the two PDE examples are reported to demonstrate effectiveness of designs obtained by maximizing the bound, the manuscript provides no quantitative assessment (e.g., error in transport map approximation with vs. without the subspace reduction, or comparison of subspace quality against true posterior covariance or samples) that would confirm the subspaces preserve the structure needed for the subsequent transport step. Without such diagnostics, the experiments do not yet substantiate the unified framework's claimed scalability.
minor comments (2)
  1. Notation for the upper bound and the projectors should be introduced with explicit definitions and distinguished from related quantities (e.g., standard EIG bounds) to improve readability.
  2. The manuscript would benefit from a short table summarizing the computational complexity of each component (bound evaluation, projector construction, transport map training) across the reported examples.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and outline the revisions we will make to strengthen the presentation.

Point-by-point responses
  1. Referee: [Abstract] Abstract (paragraph on the bound and projectors): the central claim that the derivative-based upper bound 'enables the construction of projectors onto likelihood-informed subspaces' that make transport maps tractable rests on an unproven alignment between the bound and posterior variability. No theorem or analysis is cited showing that directions emphasized by the bound (computed from derivatives) coincide with those carrying the essential non-Gaussian posterior mass in the sequence of posteriors; this is load-bearing for the scalability assertion in high-dimensional non-Gaussian PDE settings.

    Authors: The upper bound is constructed directly from the same derivative (gradient) information that defines likelihood-informed subspaces in the existing literature on dimension reduction for Bayesian inversion. Its form therefore identifies directions of high sensitivity to the data, which are the directions expected to carry the dominant posterior variability. While the manuscript does not contain a new standalone theorem proving exact equivalence in every non-Gaussian case, the alignment follows from the derivation of the bound itself. In the revision we will add a short subsection (or remark) that explicitly connects the bound to likelihood-informed projectors, cites the relevant prior works on this construction, and states the conditions under which the alignment is expected to hold, thereby making the reasoning transparent. revision: partial

  2. Referee: [Numerical experiments] Numerical experiments section: while the two PDE examples are reported to demonstrate effectiveness of designs obtained by maximizing the bound, the manuscript provides no quantitative assessment (e.g., error in transport map approximation with vs. without the subspace reduction, or comparison of subspace quality against true posterior covariance or samples) that would confirm the subspaces preserve the structure needed for the subsequent transport step. Without such diagnostics, the experiments do not yet substantiate the unified framework's claimed scalability.

    Authors: We agree that additional quantitative diagnostics would strengthen the numerical section. In the revised manuscript we will augment the experiments with (i) tables or plots comparing the transport-map approximation error (e.g., KL divergence or Wasserstein distance to reference samples) obtained with and without the subspace projectors, and (ii) comparisons of the retained subspace directions against the leading eigenvectors of the posterior covariance (where the latter is computable) or against empirical posterior samples. These additions will directly demonstrate that the subspaces preserve the structure required for accurate conditional transport. revision: yes
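One way the subspace comparison in response 2 could be quantified is with principal angles between the retained basis and the leading eigenvectors of an empirical posterior covariance. The sketch below is an assumption about how such a diagnostic might be set up, not something taken from the manuscript; the synthetic samples and the name `principal_angles` are hypothetical.

```python
import numpy as np

def principal_angles(basis_a, basis_b):
    """Principal angles (radians, ascending) between the column spans of two bases."""
    qa, _ = np.linalg.qr(basis_a)   # orthonormalize each basis
    qb, _ = np.linalg.qr(basis_b)
    cosines = np.linalg.svd(qa.T @ qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

rng = np.random.default_rng(0)
true_basis = np.linalg.qr(rng.standard_normal((10, 2)))[0]
# Synthetic "posterior" samples: large variance inside true_basis, small elsewhere.
samples = (3.0 * rng.standard_normal((5000, 2))) @ true_basis.T
samples += 0.1 * rng.standard_normal((5000, 10))

eigvals, eigvecs = np.linalg.eigh(np.cov(samples, rowvar=False))
leading = eigvecs[:, -2:]           # eigh sorts ascending, so take the last two
angles = principal_angles(true_basis, leading)
print("principal angles (rad):", angles)
```

Angles near zero indicate that the two subspaces nearly coincide, while angles near π/2 flag retained directions that the reference samples do not support.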

Circularity Check

0 steps flagged

No circularity detected in derivation chain

Full rationale

The paper derives a new derivative-based upper bound for iEIG that simultaneously guides designs and constructs likelihood-informed projectors for dimension reduction, then combines this with conditional measure transport maps to form a unified sOED/amortized inference framework. No step in the provided abstract or description reduces a claimed prediction or result to a fitted parameter, self-citation, or input quantity by construction; the bound and subspace projectors are presented as newly derived, and the scalability claim rests on this independent combination rather than renaming or re-using prior fitted quantities. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract alone does not enumerate free parameters, axioms, or invented entities; the bound and subspaces are introduced as part of the method without further specification of fitting or additional postulates.

pith-pipeline@v0.9.0 · 5724 in / 1252 out tokens · 57982 ms · 2026-05-23T02:37:31.457868+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Conditional Sampling via Wasserstein Autoencoders and Triangular Transport

    cs.LG 2026-04 unverdicted novelty 7.0

    CWAEs use triangular decoders and latent independence in Wasserstein autoencoders to perform conditional simulation by capturing low-dimensional structure in conditioned and conditioning variables.
