Adapted Optimal Transport between Filtered Gaussian Processes
Pith reviewed 2026-05-08 10:24 UTC · model grok-4.3
The pith
The adapted 2-Wasserstein distance between filtered Gaussian processes reduces to a constrained Procrustes problem on Cholesky factors.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
On the space of filtered Gaussian processes where both the randomness and the flow of information are driven by a single Gaussian white noise, the adapted 2-Wasserstein distance AW_2 admits a variational representation as a constrained orthogonal Procrustes problem between Cholesky factors. The resulting quotient space is the AW_2-completion of the space of Gaussian distributions on the path space. The paper also explicitly characterizes the AW_2-projections onto the subspaces of Gaussian martingales and analyzes the adapted Brenier coupling as a multivariate generalization of the Knothe-Rosenblatt coupling. It further establishes that the transport costs of all Gaussian bicausal couplings are asymptotically equivalent as the time horizon grows.
What carries the argument
A constrained orthogonal Procrustes problem between Cholesky factors. It gives a variational representation of the adapted 2-Wasserstein distance: an orthogonal matrix is optimized subject to causality constraints induced by the filtration.
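To make the shape of this optimization concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes that the causality constraint restricts U to block-diagonal orthogonal matrices with one d x d block per time step, and that L and M are square Cholesky factors ordered time-major. The function name `constrained_procrustes` and the parameter `d` are hypothetical. Under those assumptions, minimizing ||L - M U||_F decouples into one ordinary orthogonal Procrustes problem per diagonal block of M^T L, each solved by an SVD.

```python
import numpy as np

def constrained_procrustes(L, M, d):
    """Minimize ||L - M U||_F^2 over block-diagonal orthogonal U (d x d blocks).

    ASSUMPTION: the admissible set is block-diagonal; this is an illustration,
    not the constraint set defined in the paper.
    Returns (squared_cost, U).
    """
    n = L.shape[0]
    assert L.shape == M.shape == (n, n) and n % d == 0
    C = M.T @ L                      # U must maximize tr(U^T C)
    U = np.zeros((n, n))
    gain = 0.0
    for s in range(0, n, d):
        # Per-block Procrustes: the maximizer of tr(U_t^T C_tt) is W @ Vt.
        W, sv, Vt = np.linalg.svd(C[s:s + d, s:s + d])
        U[s:s + d, s:s + d] = W @ Vt
        gain += sv.sum()             # optimal block value = sum of singular values
    cost = np.sum(L ** 2) + np.sum(M ** 2) - 2.0 * gain
    return cost, U

# Small demo: d = 2 coordinates, 3 time steps, random lower-triangular factors.
rng = np.random.default_rng(0)
n, d = 6, 2
L = np.tril(rng.standard_normal((n, n))) + 3.0 * np.eye(n)
M = np.tril(rng.standard_normal((n, n))) + 3.0 * np.eye(n)
cost, U = constrained_procrustes(L, M, d)
print("constrained Procrustes cost:", cost)
print("cost with U = I:            ", np.sum((L - M) ** 2))
```

The per-block SVD is used because the objective tr(U^T M^T L) only sees the diagonal blocks of M^T L once U is block-diagonal, so no iterative solver is needed in this simplified setting.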
Load-bearing premise
Both the randomness and the information flow in the filtered Gaussian processes are generated by the same single Gaussian white noise, which permits the Cholesky factorization to encode the adapted structure.
What would settle it
For a two-step bivariate Gaussian process, compute the adapted 2-Wasserstein distance by direct minimization over bicausal couplings and compare the value to the minimum of the corresponding constrained Procrustes objective; a mismatch would show the variational representation does not hold.
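A weaker version of this check can be sketched in Python for a scalar process with two time steps. The sketch does not minimize over all bicausal couplings. It assumes, as an illustration only, that Gaussian bicausal couplings are obtained by correlating the two driving noises step by step, so that E[eps eta^T] = diag(rho) with each rho_i in [-1, 1], and that the admissible Procrustes matrices are the diagonal sign matrices. The covariances A and B and all function names are hypothetical.

```python
import itertools
import numpy as np

def gaussian_bicausal_cost(L, M, rho):
    """E|X - Y|^2 for X = L eps, Y = M eta with E[eps eta^T] = diag(rho).

    ASSUMPTION: this diagonal-correlation family stands in for the
    Gaussian bicausal couplings; it is not taken from the paper.
    """
    P = np.diag(rho)
    return np.trace(L @ L.T) + np.trace(M @ M.T) - 2.0 * np.trace(L @ P @ M.T)

def procrustes_over_signs(L, M):
    """min ||L - M U||_F^2 over diagonal U with entries +1 or -1."""
    n = L.shape[0]
    return min(np.sum((L - M @ np.diag(s)) ** 2)
               for s in itertools.product([-1.0, 1.0], repeat=n))

# Two centered Gaussian laws on R^2 (two time steps, scalar state).
A = np.array([[1.0, 0.6], [0.6, 2.0]])
B = np.array([[1.5, -0.4], [-0.4, 1.0]])
L, M = np.linalg.cholesky(A), np.linalg.cholesky(B)

# Grid search over the step-wise noise correlations, endpoints included.
grid = np.linspace(-1.0, 1.0, 41)
grid_min = min(gaussian_bicausal_cost(L, M, (r1, r2))
               for r1 in grid for r2 in grid)
sign_min = procrustes_over_signs(L, M)
print("grid minimum over diag(rho):", grid_min)
print("Procrustes over sign flips: ", sign_min)
```

Because the coupling cost is linear in rho, the grid minimum sits at a corner of [-1, 1]^2, which is why the two numbers agree in this restricted family. A genuine test of the theorem would replace the grid search with a minimization over all bicausal couplings.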
Original abstract
We continue the study of adapted optimal transport in the discrete-time Gaussian setting. To this end, we introduce a space of filtered Gaussian processes where both the randomness and the flow of information are driven by a Gaussian white noise. On this space, the adapted $2$-Wasserstein distance (${AW}_2$) admits a variational representation as a constrained orthogonal Procrustes problem between Cholesky factors. Furthermore, the resulting quotient space is the ${AW}_2$-completion of the space of Gaussian distributions on the path space. We also characterize explicitly the ${AW}_2$-projections onto the subspaces of Gaussian martingales. Next, we analyze the adapted Brenier coupling -- a multivariate generalization of the Knothe--Rosenblatt coupling that serves as a myopic solution to the adapted transport problem, and compute its transport cost. Utilizing a Gaussian random matrix framework, we investigate the asymptotic behavior of transport costs as the time horizon grows; notably, we establish that the transport costs of all Gaussian bicausal couplings are asymptotically equivalent, whereas the classical Bures--Wasserstein distance is strictly smaller. Finally, we demonstrate that the adapted analogue of Gelbrich's lower bound fails in general, and we identify a sufficient martingale difference condition under which the bound is recovered.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a space of filtered Gaussian processes in discrete time where both randomness and information flow are generated by a single Gaussian white noise. On this space the adapted 2-Wasserstein distance AW_2 is shown to admit a variational representation as a constrained orthogonal Procrustes problem between Cholesky factors. The resulting quotient space is identified as the AW_2-completion of the space of Gaussian measures on path space. Additional results include explicit AW_2-projections onto Gaussian martingale subspaces, analysis of the adapted Brenier (multivariate Knothe-Rosenblatt) coupling and its cost, asymptotic equivalence of transport costs for all Gaussian bicausal couplings as the horizon grows (with the classical Bures-Wasserstein distance strictly smaller), and a counter-example showing that an adapted analogue of Gelbrich's lower bound fails in general while holding under a martingale-difference condition.
Significance. If the central variational representation is exact, the work supplies a concrete, matrix-based algorithm for computing adapted OT distances between filtered Gaussians, which is otherwise intractable. The asymptotic equivalence result and the explicit counter-example to the lower bound usefully delineate the gap between adapted and classical OT in the Gaussian setting. These findings are likely to be cited in subsequent work on stochastic control, filtering, and path-space learning.
major comments (2)
- [§3] §3 (variational representation theorem): The reduction of AW_2 to the constrained Procrustes problem min ||L - M U||_F over filtration-respecting orthogonal U is load-bearing. The argument shows that every such U induces a bicausal coupling, but the converse—that every bicausal coupling is attained or bounded by some admissible U—is not fully detailed; an explicit construction or density argument under the single-white-noise model would close the gap.
- [§5] §5 (asymptotics): The claim that transport costs of all Gaussian bicausal couplings become asymptotically equivalent relies on the Gaussian random-matrix framework. While the strict inequality with the Bures-Wasserstein distance is stated, a quantitative rate or a low-dimensional numerical illustration would strengthen the comparison.
minor comments (3)
- [Introduction] Notation for the filtration and the driving white noise is introduced late; an early diagram or low-dimensional example would improve readability.
- [Throughout] Several displayed equations use inconsistent indexing for the time horizon N; uniform notation throughout would help.
- [§6] The statement of the martingale-difference condition for recovery of the lower bound could be accompanied by a brief remark on how often it holds in typical applications.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address each major point below and outline the revisions we will make.
-
Referee: [§3] §3 (variational representation theorem): The reduction of AW_2 to the constrained Procrustes problem min ||L - M U||_F over filtration-respecting orthogonal U is load-bearing. The argument shows that every such U induces a bicausal coupling, but the converse—that every bicausal coupling is attained or bounded by some admissible U—is not fully detailed; an explicit construction or density argument under the single-white-noise model would close the gap.
Authors: We appreciate the referee highlighting the need for greater detail on the converse direction in the proof of the variational representation (Theorem 3.1). The manuscript already shows that every filtration-respecting orthogonal U produces a bicausal coupling via the shared white-noise representation. For the converse, under the single-white-noise model every filtered Gaussian process is generated by the same driving noise, so any bicausal coupling admits a representation as an orthogonal transformation of the underlying white noise that respects the filtration (i.e., the transformation matrix is block-lower-triangular with respect to the filtration). In the revised version we will insert an explicit construction that maps an arbitrary bicausal coupling to such an admissible U, thereby establishing that the minimum is attained and closing the gap. revision: yes
-
Referee: [§5] §5 (asymptotics): The claim that transport costs of all Gaussian bicausal couplings become asymptotically equivalent relies on the Gaussian random-matrix framework. While the strict inequality with the Bures-Wasserstein distance is stated, a quantitative rate or a low-dimensional numerical illustration would strengthen the comparison.
Authors: We thank the referee for this helpful suggestion. The asymptotic equivalence of all Gaussian bicausal transport costs is proved via the Gaussian random-matrix framework in §5, and the strict inequality with the Bures-Wasserstein distance follows from the positive probability that the limiting random matrix is non-scalar. To strengthen the exposition we will add a low-dimensional numerical illustration (e.g., two-dimensional processes over horizons T=5,10,20) that visualizes the convergence of the adapted costs and their separation from the classical Bures-Wasserstein value. A quantitative convergence rate lies outside the present scope but could be pursued separately. revision: yes
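A numerical illustration of the kind proposed in this response could start from a comparison like the following Python sketch. It is not the authors' experiment. The random covariance model (`random_cholesky`) is a hypothetical choice and is not the Gaussian random matrix framework of §5, and the adapted cost is computed under the assumption that admissible U are block-diagonal orthogonal with d x d blocks. The Bures-Wasserstein value uses the standard identity that the unconstrained minimum of ||L - M U||_F^2 over orthogonal U equals tr(A) + tr(B) - 2 ||M^T L||_*, where ||.||_* is the nuclear norm.

```python
import numpy as np

def bures_wasserstein_sq(L, M):
    """Squared Bures-Wasserstein distance between N(0, L L^T) and N(0, M M^T)."""
    nuc = np.linalg.svd(M.T @ L, compute_uv=False).sum()
    return np.sum(L ** 2) + np.sum(M ** 2) - 2.0 * nuc

def block_diag_procrustes_sq(L, M, d):
    """min ||L - M U||_F^2 over block-diagonal orthogonal U (ASSUMED constraint)."""
    C = M.T @ L
    gain = 0.0
    for s in range(0, L.shape[0], d):
        gain += np.linalg.svd(C[s:s + d, s:s + d], compute_uv=False).sum()
    return np.sum(L ** 2) + np.sum(M ** 2) - 2.0 * gain

def random_cholesky(n, rng):
    """Cholesky factor of a random, well-conditioned covariance (hypothetical model)."""
    G = rng.standard_normal((n, n))
    return np.linalg.cholesky(G @ G.T + n * np.eye(n))

rng = np.random.default_rng(0)
d = 2                                 # two-dimensional process
results = {}
for T in (5, 10, 20):                 # horizons mentioned in the response
    n = d * T
    L, M = random_cholesky(n, rng), random_cholesky(n, rng)
    results[T] = (bures_wasserstein_sq(L, M), block_diag_procrustes_sq(L, M, d))
    print(T, "BW^2 =", results[T][0], " constrained =", results[T][1])
```

The constrained value can never fall below the Bures-Wasserstein value here, since the block-diagonal matrices are a subset of the orthogonal group. How the gap behaves as T grows depends on the covariance model, so this sketch shows only the ordering, not the asymptotic statement.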
Circularity Check
No circularity: variational representation derived from newly introduced filtered Gaussian space
The paper defines a space of filtered Gaussian processes driven by a single white noise, then states that AW_2 on this space admits a constrained Procrustes representation between Cholesky factors. This is presented as a derived property rather than a definitional tautology. No fitted inputs are relabeled as predictions, no self-citation chains justify the core equivalence, and the Cholesky reduction follows from the white-noise assumption without presupposing the target distance. The quotient-space claim and other results (Brenier coupling, asymptotics) are likewise stated as consequences on the defined space. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Gaussian white noise drives both the process values and the filtration
- standard math: Covariance operators admit Cholesky factors
invented entities (1)
- Space of filtered Gaussian processes: no independent evidence
Reference graph
Works this paper leans on
- [1] Beatrice Acciaio, Daniel Bartl, Anne Grass, Songyan Hou, and Gudmund Pammer. The geometry of the adapted Bures–Wasserstein space. arXiv preprint arXiv:2602.00623, 2026.
- [2] Beatrice Acciaio, Songyan Hou, and Gudmund Pammer. Entropic adapted Wasserstein distance on Gaussians. Electronic Communications in Probability, 30:1–14, 2025.
- [3] Beatrice Acciaio, Daniel Kršek, and Gudmund Pammer. Multicausal transport: barycenters and dynamic matching. SIAM Journal on Financial Mathematics, 16(3):1104–1138, 2025.
- [4] Robert J Adler and Jonathan E Taylor. Random Fields and Geometry. Springer, 2007.
- [5] Julio Backhoff-Veraguas, Mathias Beiglböck, Manu Eder, and Alois Pichler. Fundamental properties of process distances. Stochastic Processes and their Applications, 130(9):5575–5591, 2020.
- [6] Julio Backhoff-Veraguas, Mathias Beiglböck, Yiqing Lin, and Anastasiia Zalashko. Causal transport in discrete time and applications. SIAM Journal on Optimization, 27(4):2528–2562, 2017.
- [7] Daniel Bartl, Mathias Beiglböck, and Gudmund Pammer. The Wasserstein space of stochastic processes. Journal of the European Mathematical Society, 2024.
- [8] Mathias Beiglböck, Gudmund Pammer, and Alexander Posch. The Knothe-Rosenblatt distance and its induced topology. arXiv preprint arXiv:2312.16515, 2023.
- [9] Mathias Beiglböck, Gudmund Pammer, and Stefan Schrott. A Brenier theorem on $(\mathcal{P}_2(\ldots\mathcal{P}_2(H)\ldots), W_2)$ and applications to adapted transport. arXiv preprint arXiv:2509.03506, 2025.
- [10] Rajendra Bhatia, Tanvi Jain, and Yongdo Lim. On the Bures–Wasserstein distance between positive definite matrices. Expositiones Mathematicae, 37(2):165–191, 2019.
- [11] Rama Cont and Fang Rui Lim. Causal transport on path space. arXiv preprint arXiv:2412.02948, 2024.
- [12] Matthias Gelbrich. On a formula for the $L^2$ Wasserstein metric between measures on Euclidean and Hilbert spaces. Mathematische Nachrichten, 147(1):185–203, 1990.
- [13] Madhu Gunasingam and Ting-Kam Leonard Wong. Adapted optimal transport between Gaussian processes in discrete time. Electronic Communications in Probability, 30:1–14, 2025.
- [14] Roger A Horn and Charles R Johnson. Matrix Analysis. Cambridge University Press, second edition, 2013.
- [15] Yifan Jiang and Fang Rui Lim. A transfer principle for computing the adapted Wasserstein distance between stochastic processes. arXiv preprint arXiv:2505.21337, 2025.
- [16] Daniel Kuhn, Peyman Mohajerin Esfahani, Viet Anh Nguyen, and Soroosh Shafieezadeh-Abadeh. Wasserstein distributionally robust optimization: Theory and applications in machine learning. In Operations Research & Management Science in the Age of Analytics, pages 130–166. INFORMS, 2019.
- [17] Rémi Lassalle. Causal transport plans and their Monge–Kantorovich problems. Stochastic Analysis and Applications, 36(3):452–484, 2018.
- [18] Rafał Latała. Some estimates of norms of random matrices. Proceedings of the American Mathematical Society, 133(5):1273–1282, 2005.
- [19] Luigi Malagò, Luigi Montrucchio, and Giovanni Pistone. Wasserstein Riemannian geometry of Gaussian densities. Information Geometry, 1:137–179, 2018.
- [20] Alessandro Pinzi and Giuseppe Savaré. Totally convex functions, $L^2$-optimal transport for laws of random measures, and solution to the Monge problem. arXiv preprint arXiv:2509.01768, 2025.
- [21] Asuka Takatsu. Wasserstein geometry of Gaussian measures. Osaka Journal of Mathematics, 48(4):1005–1026, 2011.
- [22] Cédric Villani. Topics in Optimal Transportation. American Mathematical Society, 2003.
- [23] Cédric Villani. Optimal Transport: Old and New. Springer, 2008.
Paper authors: Madhu Gunasingam (madhu.gunasingam@mail.utoronto.ca) and Ting-Kam Leonard Wong (tkl.wong@utoronto.ca), Department of Statistical Sciences, University of Toronto.