arxiv: 2604.22159 · v1 · submitted 2026-04-24 · 🧮 math.PR · math.OC

Adapted Optimal Transport between Filtered Gaussian Processes

Pith reviewed 2026-05-08 10:24 UTC · model grok-4.3

classification 🧮 math.PR math.OC
keywords adapted optimal transport · filtered Gaussian processes · adapted Wasserstein distance · Procrustes problem · Cholesky factors · bicausal couplings · Gaussian martingales · path space

The pith

The adapted 2-Wasserstein distance between filtered Gaussian processes reduces to a constrained Procrustes problem on Cholesky factors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a space of filtered Gaussian processes in which both the underlying randomness and the filtration are generated by the same Gaussian white noise. Within this space it establishes that the adapted 2-Wasserstein distance admits an explicit variational representation as a constrained orthogonal Procrustes problem between the Cholesky factors of the processes. This representation also identifies the resulting quotient space as the completion of the space of Gaussian measures on path space under the adapted distance. The work further characterizes the projections onto Gaussian martingales, computes the cost of the adapted Brenier coupling, and shows that transport costs of all Gaussian bicausal couplings become asymptotically equivalent as the time horizon grows.

Core claim

On the space of filtered Gaussian processes where both the randomness and the flow of information are driven by a single Gaussian white noise, the adapted 2-Wasserstein distance AW_2 admits a variational representation as a constrained orthogonal Procrustes problem between Cholesky factors. The resulting quotient space is the AW_2-completion of the space of Gaussian distributions on the path space. The paper also characterizes explicitly the AW_2-projections onto the subspaces of Gaussian martingales and analyzes the adapted Brenier coupling as a multivariate generalization of the Knothe-Rosenblatt coupling, while establishing the asymptotic equivalence of transport costs for all Gaussian bicausal couplings.

What carries the argument

Constrained orthogonal Procrustes problem between Cholesky factors, which variationally represents the adapted 2-Wasserstein distance by optimizing an orthogonal matrix subject to causality constraints induced by the filtration.
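As a worked sketch of what such a problem looks like, suppose (an assumption made here for illustration, not a statement taken from the paper) that L and M are Nd × Nd block lower-triangular factors for N time steps in dimension d, and that the admissible U are the orthogonal matrices that are block lower-triangular in time. An orthogonal block lower-triangular matrix is necessarily block-diagonal, so the problem decouples into N ordinary d × d Procrustes problems, each solved by a singular value decomposition. Here \|\cdot\|_* denotes the nuclear norm and (L^\top M)_{tt} the t-th diagonal d × d block.

```latex
\mathcal{U} = \{\operatorname{diag}(U_1,\dots,U_N) : U_t \in O(d)\}
\quad \text{(assumed constraint set)}

\min_{U \in \mathcal{U}} \|L - M U\|_F^2
  = \|L\|_F^2 + \|M\|_F^2
    - 2 \max_{U \in \mathcal{U}} \operatorname{tr}\!\left(L^\top M U\right)

\max_{U_t \in O(d)} \operatorname{tr}\!\left((L^\top M)_{tt}\, U_t\right)
  = \left\|(L^\top M)_{tt}\right\|_*

\min_{U \in \mathcal{U}} \|L - M U\|_F^2
  = \|L\|_F^2 + \|M\|_F^2
    - 2 \sum_{t=1}^{N} \left\|(L^\top M)_{tt}\right\|_*
```

Dropping the constraint (U ranging over all orthogonal matrices) replaces the sum of blockwise nuclear norms by the nuclear norm of the full matrix L^\top M, which gives the classical Bures-Wasserstein value.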

Load-bearing premise

Both the randomness and the information flow in the filtered Gaussian processes are generated by the same single Gaussian white noise, which permits the Cholesky factorization to encode the adapted structure.

What would settle it

For a two-step bivariate Gaussian process, compute the adapted 2-Wasserstein distance by direct minimization over bicausal couplings and compare the value to the minimum of the corresponding constrained Procrustes objective; a mismatch would show the variational representation does not hold.
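The Procrustes side of this check can be sketched numerically. The code below is a minimal illustration under stated assumptions: the admissible U are taken to be block-diagonal orthogonal matrices, the inputs L and M are random lower-triangular matrices rather than examples from the paper, and all function names are hypothetical. It solves the constrained problem blockwise by SVD and confirms that randomly drawn admissible U do not beat it. It does not perform the direct minimization over bicausal couplings, which is the other half of the proposed test.

```python
import numpy as np


def block_diag_from(blocks):
    """Assemble equally sized square blocks into one block-diagonal matrix."""
    d = blocks[0].shape[0]
    out = np.zeros((d * len(blocks), d * len(blocks)))
    for t, blk in enumerate(blocks):
        out[t * d:(t + 1) * d, t * d:(t + 1) * d] = blk
    return out


def constrained_procrustes(L, M, d):
    """Minimize ||L - M U||_F^2 over block-diagonal orthogonal U.

    The block-diagonal constraint is an assumption of this sketch.
    Returns (minimal cost, minimizing U).
    """
    cross = M.T @ L
    n_steps = L.shape[0] // d
    blocks = []
    for t in range(n_steps):
        c_tt = cross[t * d:(t + 1) * d, t * d:(t + 1) * d]
        w, _, vt = np.linalg.svd(c_tt)
        blocks.append(w @ vt)  # maximizes trace(U_t^T c_tt) over O(d)
    U = block_diag_from(blocks)
    return np.linalg.norm(L - M @ U, "fro") ** 2, U


def random_admissible_u(n_steps, d, rng):
    """Random block-diagonal orthogonal matrix (Q factors of Gaussian blocks)."""
    return block_diag_from(
        [np.linalg.qr(rng.standard_normal((d, d)))[0] for _ in range(n_steps)]
    )


rng = np.random.default_rng(0)
n_steps, d = 2, 2  # two time steps, bivariate process
# Hypothetical inputs: random lower-triangular factors, not the paper's data.
L = np.tril(rng.standard_normal((n_steps * d, n_steps * d)))
M = np.tril(rng.standard_normal((n_steps * d, n_steps * d)))

best_cost, best_u = constrained_procrustes(L, M, d)
trial_costs = [
    np.linalg.norm(L - M @ random_admissible_u(n_steps, d, rng), "fro") ** 2
    for _ in range(1000)
]
print("blockwise SVD minimum:", best_cost)
print("best of 1000 random admissible U:", min(trial_costs))
```

The random search only gives an upper bound on the constrained minimum, so agreement between the two printed values is a sanity check on the SVD solution, not a verification of the paper's theorem.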

Figures

Figures reproduced from arXiv: 2604.22159 by Madhu Gunasingam, Ting-Kam Leonard Wong.

Figure 1: Comparison of the transport costs T(·) of the synchronous coupling π I, adapted Brenier coupling π AB, AW_2-optimal coupling π AW, and the Brenier (W_2-optimal) coupling π BW. We plot the ratios along 100 simulated paths of L(N) and M(N) with d = 5; the solid curves show the sample medians and the shaded bands show the inter-quartile ranges.
Original abstract

We continue the study of adapted optimal transport in the discrete-time Gaussian setting. To this end, we introduce a space of filtered Gaussian processes where both the randomness and the flow of information are driven by a Gaussian white noise. On this space, the adapted $2$-Wasserstein distance (${AW}_2$) admits a variational representation as a constrained orthogonal Procrustes problem between Cholesky factors. Furthermore, the resulting quotient space is the ${AW}_2$-completion of the space of Gaussian distributions on the path space. We also characterize explicitly the ${AW}_2$-projections onto the subspaces of Gaussian martingales. Next, we analyze the adapted Brenier coupling -- a multivariate generalization of the Knothe--Rosenblatt coupling that serves as a myopic solution to the adapted transport problem, and compute its transport cost. Utilizing a Gaussian random matrix framework, we investigate the asymptotic behavior of transport costs as the time horizon grows; notably, we establish that the transport costs of all Gaussian bicausal couplings are asymptotically equivalent, whereas the classical Bures--Wasserstein distance is strictly smaller. Finally, we demonstrate that the adapted analogue of Gelbrich's lower bound fails in general, and we identify a sufficient martingale difference condition under which the bound is recovered.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces a space of filtered Gaussian processes in discrete time where both randomness and information flow are generated by a single Gaussian white noise. On this space the adapted 2-Wasserstein distance AW_2 is shown to admit a variational representation as a constrained orthogonal Procrustes problem between Cholesky factors. The resulting quotient space is identified as the AW_2-completion of the space of Gaussian measures on path space. Additional results include explicit AW_2-projections onto Gaussian martingale subspaces, analysis of the adapted Brenier (multivariate Knothe-Rosenblatt) coupling and its cost, asymptotic equivalence of transport costs for all Gaussian bicausal couplings as the horizon grows (with the classical Bures-Wasserstein distance strictly smaller), and a counter-example showing that an adapted analogue of Gelbrich's lower bound fails in general while holding under a martingale-difference condition.

Significance. If the central variational representation is exact, the work supplies a concrete, matrix-based algorithm for computing adapted OT distances between filtered Gaussians, which is otherwise intractable. The asymptotic equivalence result and the explicit counter-example to the lower bound usefully delineate the gap between adapted and classical OT in the Gaussian setting. These findings are likely to be cited in subsequent work on stochastic control, filtering, and path-space learning.

major comments (2)
  1. [§3] §3 (variational representation theorem): The reduction of AW_2 to the constrained Procrustes problem min ||L - M U||_F over filtration-respecting orthogonal U is load-bearing. The argument shows that every such U induces a bicausal coupling, but the converse—that every bicausal coupling is attained or bounded by some admissible U—is not fully detailed; an explicit construction or density argument under the single-white-noise model would close the gap.
  2. [§5] §5 (asymptotics): The claim that transport costs of all Gaussian bicausal couplings become asymptotically equivalent relies on the Gaussian random-matrix framework. While the strict inequality with the Bures-Wasserstein distance is stated, a quantitative rate or a low-dimensional numerical illustration would strengthen the comparison.
minor comments (3)
  1. [Introduction] Notation for the filtration and the driving white noise is introduced late; an early diagram or low-dimensional example would improve readability.
  2. [Throughout] Several displayed equations use inconsistent indexing for the time horizon N; uniform notation throughout would help.
  3. [§6] The statement of the martingale-difference condition for recovery of the lower bound could be accompanied by a brief remark on how often it holds in typical applications.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major point below and outline the revisions we will make.

Point-by-point responses
  1. Referee: [§3] §3 (variational representation theorem): The reduction of AW_2 to the constrained Procrustes problem min ||L - M U||_F over filtration-respecting orthogonal U is load-bearing. The argument shows that every such U induces a bicausal coupling, but the converse—that every bicausal coupling is attained or bounded by some admissible U—is not fully detailed; an explicit construction or density argument under the single-white-noise model would close the gap.

    Authors: We appreciate the referee highlighting the need for greater detail on the converse direction in the proof of the variational representation (Theorem 3.1). The manuscript already shows that every filtration-respecting orthogonal U produces a bicausal coupling via the shared white-noise representation. For the converse, under the single-white-noise model every filtered Gaussian process is generated by the same driving noise, so any bicausal coupling admits a representation as an orthogonal transformation of the underlying white noise that respects the filtration (i.e., the transformation matrix is block-lower-triangular with respect to the filtration). In the revised version we will insert an explicit construction that maps an arbitrary bicausal coupling to such an admissible U, thereby establishing that the minimum is attained and closing the gap. revision: yes

  2. Referee: [§5] §5 (asymptotics): The claim that transport costs of all Gaussian bicausal couplings become asymptotically equivalent relies on the Gaussian random-matrix framework. While the strict inequality with the Bures-Wasserstein distance is stated, a quantitative rate or a low-dimensional numerical illustration would strengthen the comparison.

    Authors: We thank the referee for this helpful suggestion. The asymptotic equivalence of all Gaussian bicausal transport costs is proved via the Gaussian random-matrix framework in §5, and the strict inequality with the Bures-Wasserstein distance follows from the positive probability that the limiting random matrix is non-scalar. To strengthen the exposition we will add a low-dimensional numerical illustration (e.g., two-dimensional processes over horizons N = 5, 10, 20) that visualizes the convergence of the adapted costs and their separation from the classical Bures-Wasserstein value. A quantitative convergence rate lies outside the present scope but could be pursued separately. revision: yes
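A low-dimensional illustration of the kind discussed here might look like the sketch below. It is not the authors' experiment: the random lower-triangular factors are a hypothetical stand-in for the paper's random-matrix model, the block-diagonal constraint on U is an assumption, and the function names are invented for this example. For each horizon it computes the cost ||L - M U||_F^2 for three choices of orthogonal U: the identity (synchronous coupling), the best block-diagonal U, and the best unconstrained U (the squared Bures-Wasserstein distance between L L^T and M M^T).

```python
import numpy as np


def nuclear_norm(a):
    """Sum of the singular values of a."""
    return np.linalg.svd(a, compute_uv=False).sum()


def transport_costs(L, M, d):
    """Squared costs ||L - M U||_F^2 for three choices of orthogonal U."""
    base = np.linalg.norm(L, "fro") ** 2 + np.linalg.norm(M, "fro") ** 2
    cross = M.T @ L
    n_steps = L.shape[0] // d
    blocks = [cross[t * d:(t + 1) * d, t * d:(t + 1) * d] for t in range(n_steps)]
    return {
        # U = identity
        "synchronous": base - 2.0 * np.trace(cross),
        # U block-diagonal orthogonal (assumed form of the causality constraint)
        "block_constrained": base - 2.0 * sum(nuclear_norm(b) for b in blocks),
        # U any orthogonal matrix: squared Bures-Wasserstein distance
        "bures_wasserstein": base - 2.0 * nuclear_norm(cross),
    }


rng = np.random.default_rng(0)
d = 2  # two-dimensional process
results = {}
for n_steps in (5, 10, 20):
    size = n_steps * d
    # Hypothetical random factors; the paper's model may differ.
    L = np.tril(rng.standard_normal((size, size)))
    M = np.tril(rng.standard_normal((size, size)))
    costs = transport_costs(L, M, d)
    results[n_steps] = costs
    print(
        n_steps,
        "BW / constrained:", costs["bures_wasserstein"] / costs["block_constrained"],
        "synchronous / constrained:", costs["synchronous"] / costs["block_constrained"],
    )
```

By construction the three costs are ordered (Bures-Wasserstein at most block-constrained at most synchronous), since each minimizes over a smaller set of U than the previous one. How the ratios behave as the horizon grows depends on the random model, so a single draw like this is only suggestive.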

Circularity Check

0 steps flagged

No circularity: variational representation derived from newly introduced filtered Gaussian space

full rationale

The paper defines a space of filtered Gaussian processes driven by a single white noise, then states that AW_2 on this space admits a constrained Procrustes representation between Cholesky factors. This is presented as a derived property rather than a definitional tautology. No fitted inputs are relabeled as predictions, no self-citation chains justify the core equivalence, and the Cholesky reduction follows from the white-noise assumption without presupposing the target distance. The quotient-space claim and other results (Brenier coupling, asymptotics) are likewise stated as consequences on the defined space. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claims rest on the standard properties of Gaussian white noise, Cholesky factorization of covariance operators, and the definition of adapted couplings; the paper introduces one new object (the filtered process space) whose properties are taken as the foundation for all subsequent derivations.

axioms (2)
  • [domain assumption] Gaussian white noise drives both the process values and the filtration
    Invoked to define the space of filtered Gaussian processes in the abstract.
  • [standard math] Covariance operators admit Cholesky factors
    Standard fact for Gaussian measures used to reduce the transport problem to matrix factors.
invented entities (1)
  • Space of filtered Gaussian processes [no independent evidence]
    purpose: To equip adapted optimal transport with a setting where both randomness and information flow are explicitly modeled by white noise
    Newly introduced object whose definition enables the Procrustes representation and quotient-space claim.
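The two axioms in the ledger can be made concrete with a small example. The sketch below assumes a scalar process with covariance min(s, t), chosen here purely for illustration and not taken from the paper. It factors the covariance as L L^T with L lower triangular, then writes the process as X = L ε for white noise ε. Because L is lower triangular, X_t depends only on ε_1, ..., ε_t, which is the sense in which the Cholesky factor encodes the flow of information.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps = 4  # scalar process observed at times 1..4

# Hypothetical covariance for illustration: cov(X_s, X_t) = min(s, t).
times = np.arange(1, n_steps + 1)
cov = np.minimum.outer(times, times).astype(float)

# Cholesky factor: lower triangular with L @ L.T == cov.
L = np.linalg.cholesky(cov)

# White-noise representation X = L @ eps, one column per simulated path.
eps = rng.standard_normal((n_steps, 100_000))
paths = L @ eps

# The empirical covariance of the simulated paths should be close to cov.
sample_cov = np.cov(paths)
print(L)
print(np.round(sample_cov, 2))
```

For this particular covariance the factor L is the lower-triangular matrix of ones, so X_t = ε_1 + ... + ε_t is a Gaussian random walk.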

pith-pipeline@v0.9.0 · 5521 in / 1520 out tokens · 125265 ms · 2026-05-08T10:24:56.898182+00:00 · methodology

