Wasserstein Spatial Depth
Pith reviewed 2026-05-23 17:31 UTC · model grok-4.3
The pith
Wasserstein spatial depth extends classical statistical depth to probability distributions equipped with the Wasserstein metric.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We extend the concept of statistical depth to distribution-valued data, introducing the notion of Wasserstein spatial depth. This new measure provides a way to rank and order distributions, enabling the development of order-based clustering techniques and inferential tools. We show that Wasserstein spatial depth (WSD) preserves critical properties of conventional statistical depths, notably, ranging within [0,1], transformation and geodesic invariance, vanishing at infinity, reaching a maximum at the geometric median, and continuity. The population WSD admits a plug-in estimator based on empirical distributions that is consistent and asymptotically normal, and the approach yields a two-sample test for populations of distributions.
What carries the argument
Wasserstein spatial depth, obtained by replacing Euclidean distance in the classical spatial depth formula with the geodesic distance induced by the Wasserstein metric on the space of distributions.
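For orientation, the classical Euclidean spatial depth that this construction starts from can be sketched as follows. This is a minimal illustration and not code from the paper: the function name `spatial_depth` and the convention of skipping sample points that coincide with `x` are assumptions made here.

```python
import math

def spatial_depth(x, sample):
    """Classical spatial depth of a point x with respect to a finite sample in R^d.

    Depth = 1 - || average of unit vectors (x - X_i) / ||x - X_i|| ||.
    Assumption: sample points equal to x contribute a zero vector.
    """
    d = len(x)
    acc = [0.0] * d
    for y in sample:
        diff = [xi - yi for xi, yi in zip(x, y)]
        norm = math.sqrt(sum(c * c for c in diff))
        if norm == 0.0:
            continue  # coincident point: no direction to add
        for k in range(d):
            acc[k] += diff[k] / norm
    n = len(sample)
    return 1.0 - math.sqrt(sum((a / n) ** 2 for a in acc))

# Hypothetical toy sample, symmetric around the origin.
sample = [(-1.0, 0.0), (1.0, 0.0), (0.0, -1.0), (0.0, 1.0)]
print(spatial_depth((0.0, 0.0), sample))    # deepest point of this sample
print(spatial_depth((100.0, 0.0), sample))  # far-away point, depth near 0
```

The unit vectors cancel at the center of a symmetric sample and align far away from it, which is what drives the depth toward 1 and 0 respectively. According to the summary above, the Wasserstein version keeps this shape and swaps the Euclidean ingredients for their Wasserstein counterparts.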
If this is right
- The depth supplies a ranking that directly yields order-based clustering algorithms for populations of distributions.
- A two-sample test for equality of distribution populations follows from comparing depth values or depth regions.
- The empirical depth regions have explicitly characterized breakdown points under contamination.
- The plug-in estimator based on sampled empirical distributions is consistent and asymptotically normal.
- The influence function of the depth can be derived to quantify robustness to perturbations of individual distributions.
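The plug-in estimator and the depth-based ranking in the list above can be sketched in the simplest setting. The sketch assumes one-dimensional distributions, each observed through a sample of the same size, so that the optimal transport map pairs order statistics and W_2 is a root-mean-square gap between sorted samples. The names `w2_1d`, `wsd_1d`, and `rank_by_depth` are hypothetical, and the paper's own estimator for d>1 may differ.

```python
import math

def w2_1d(a, b):
    """W_2 between two 1-D empirical measures with the same number of atoms."""
    a, b = sorted(a), sorted(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def wsd_1d(q, population):
    """Plug-in Wasserstein spatial depth of sample q w.r.t. a list of samples.

    Assumptions: 1-D data, equal sample sizes, quantile (sorted) coupling.
    """
    q_sorted = sorted(q)
    n, m = len(q_sorted), len(population)
    mean_dir = [0.0] * n
    for p in population:
        p_sorted = sorted(p)
        w = w2_1d(q_sorted, p_sorted)
        if w == 0.0:
            continue  # assumption: a coincident distribution contributes zero
        for i in range(n):
            # (x - T(x)) / W_2, evaluated at the i-th atom of q, averaged over P
            mean_dir[i] += (q_sorted[i] - p_sorted[i]) / (w * m)
    return 1.0 - math.sqrt(sum(v * v for v in mean_dir) / n)

def rank_by_depth(candidates, population):
    """Order candidate samples from deepest (most central) to shallowest."""
    return sorted(candidates, key=lambda q: wsd_1d(q, population), reverse=True)

# Hypothetical toy data: two observed distributions and two candidates.
population = [[-1.0, 0.0, 1.0], [1.0, 2.0, 3.0]]
central = [0.0, 1.0, 2.0]
outlying = [10.0, 11.0, 12.0]
print(wsd_1d(central, population), wsd_1d(outlying, population))
print(rank_by_depth([outlying, central], population))
```

Each normalized displacement has unit length in L^2 of the candidate distribution, so their average has length at most 1 and the sketch stays within [0,1]. Depth-ordered lists like the one returned by `rank_by_depth` are the raw material for the order-based clustering and two-sample comparisons mentioned above.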
Where Pith is reading between the lines
- The same construction could be attempted with other optimal-transport metrics provided they induce a geodesic structure compatible with the depth axioms.
- Applications to functional data or to point processes become immediate once those objects are embedded in a Wasserstein-type space.
- One could test whether depth-based ordering improves upon moment-based or kernel-based methods on concrete tasks such as image histogram classification.
Load-bearing premise
The space of distributions under the Wasserstein metric behaves enough like a Riemannian manifold that the Euclidean spatial depth formula extends directly while keeping all listed invariance and extremal properties.
What would settle it
A finite collection of distributions whose empirical Wasserstein geometric median does not maximize the computed Wasserstein spatial depth value, or a sequence of distributions converging in Wasserstein distance whose depth values fail to converge.
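A numerical version of the first check could look like the sketch below. It rests on assumptions not taken from the paper: one-dimensional distributions, each given as a sorted list of equally many atoms, and a Weiszfeld iteration (a standard geometric-median scheme, swapped in here for illustration) run on those sorted vectors. The names `l2_dist`, `depth`, and `weiszfeld_median` are hypothetical.

```python
import math

def l2_dist(a, b):
    """Root-mean-square gap between two sorted samples (1-D empirical W_2)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

def depth(q, population):
    """1-D empirical Wasserstein spatial depth of q (sorted atoms)."""
    n = len(q)
    mean_dir = [0.0] * n
    for p in population:
        w = l2_dist(q, p)
        if w > 0.0:  # assumption: coincident distributions contribute zero
            for i in range(n):
                mean_dir[i] += (q[i] - p[i]) / (w * len(population))
    return 1.0 - math.sqrt(sum(v * v for v in mean_dir) / n)

def weiszfeld_median(population, iters=500, eps=1e-12):
    """Approximate geometric median of sorted-atom vectors via Weiszfeld steps.

    Each step is a convex combination of sorted vectors, so the iterate
    stays sorted and still represents a 1-D distribution.
    """
    n = len(population[0])
    m = [sum(p[i] for p in population) / len(population) for i in range(n)]
    for _ in range(iters):
        weights = [1.0 / max(l2_dist(m, p), eps) for p in population]
        total = sum(weights)
        m = [sum(w * p[i] for w, p in zip(weights, population)) / total
             for i in range(n)]
    return m

# Hypothetical toy population of four two-atom distributions.
population = [[-1.0, 10.0], [1.0, 10.0], [0.0, 9.0], [0.0, 11.0]]
median = weiszfeld_median(population)
print(depth(median, population))
print([depth(p, population) for p in population])
```

A counterexample of the kind described above would show up as a population member, or any other candidate, whose depth exceeds the depth at the computed median by more than numerical tolerance. In one dimension the space is flat, so this sketch can only illustrate the check; the informative cases are the ones with d>1.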
Original abstract
Modeling observations as random distributions embedded within Wasserstein spaces is becoming increasingly popular across scientific fields, as it captures the variability and geometric structure of the data more effectively. However, the distinct geometry and unique properties of Wasserstein spaces pose challenges to the application of conventional statistical tools, which are primarily designed for Euclidean spaces. Consequently, adapting and developing new methodologies for analysis within Wasserstein spaces has become essential. The space of distributions on $\mathbb{R}^d$ with $d>1$ is not linear, and "mimics" the geometry of a Riemannian manifold. In this paper, we extend the concept of statistical depth to distribution-valued data, introducing the notion of Wasserstein spatial depth. This new measure provides a way to rank and order distributions, enabling the development of order-based clustering techniques and inferential tools. We show that Wasserstein spatial depth (WSD) preserves critical properties of conventional statistical depths, notably, ranging within $[0,1]$, transformation and geodesic invariance, vanishing at infinity, reaching a maximum at the geometric median, and continuity. Regarding robustness, we characterize the breakdown points of the empirical depth regions and the influence function of the WSD. Additionally, the population WSD has a straightforward plug-in estimator based on sampled empirical distributions. We establish the estimator's consistency and asymptotic normality. We also provide a two-sample test for populations of distributions based on the WSD. Finally, extensive simulations and a real-data application showcase the practical efficacy of the WSD.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Wasserstein spatial depth (WSD) for distribution-valued data in the Wasserstein space of probability measures on R^d (d>1). It defines WSD by direct analogy to Euclidean spatial depth, leveraging the claim that the Wasserstein metric space 'mimics the geometry of a Riemannian manifold.' The manuscript asserts that WSD satisfies the standard depth axioms (range in [0,1], transformation and geodesic invariance, vanishing at infinity, unique maximum at the geometric median, continuity), characterizes breakdown points of empirical depth regions and the influence function, establishes consistency and asymptotic normality of the plug-in estimator, and proposes a two-sample test, supported by simulations and a real-data example.
Significance. If the geometric construction is made explicit and the claimed invariance and extremal properties are shown to hold without additional assumptions that reduce the result to a fitted quantity, the work would supply a concrete tool for ranking and clustering distributions with robustness and inferential guarantees. The plug-in estimator's asymptotic normality and the two-sample test would be the most immediately usable contributions for applications involving distribution-valued observations.
major comments (3)
- [Abstract] Abstract and opening paragraphs: the definition of WSD is justified by the statement that the Wasserstein space 'mimics the geometry of a Riemannian manifold,' yet no explicit construction (tangent-space projection, exponential map, or handling of non-unique geodesics and cut loci) is supplied. Without this, the transfer of the listed properties (geodesic invariance, unique maximum at the geometric median) does not follow automatically for d>1.
- [Abstract] Abstract: the claims of preservation of depth axioms, breakdown-point calculations, influence function, consistency, and asymptotic normality are asserted without any displayed derivations, explicit formulas, or theorem statements. This prevents verification that the properties hold under the actual Wasserstein geometry rather than by construction.
- [Introduction / Definition] The weakest modeling assumption (that the Wasserstein metric space admits a direct, well-behaved extension of Euclidean spatial depth) is load-bearing for every subsequent claim; if the construction fails to be well-defined everywhere, the robustness, consistency, and test results rest on an unverified foundation.
minor comments (1)
- [Abstract] Notation for the Wasserstein metric and the geometric median should be introduced with a displayed equation at first use to avoid ambiguity when d>1.
Simulated Author's Rebuttal
We thank the referee for the constructive report and the opportunity to clarify the geometric foundation and presentation of our results on Wasserstein spatial depth. We address each major comment below.
- Referee: [Abstract] Abstract and opening paragraphs: the definition of WSD is justified by the statement that the Wasserstein space 'mimics the geometry of a Riemannian manifold,' yet no explicit construction (tangent-space projection, exponential map, or handling of non-unique geodesics and cut loci) is supplied. Without this, the transfer of the listed properties (geodesic invariance, unique maximum at the geometric median) does not follow automatically for d>1.
  Authors: We agree that the phrasing 'mimics the geometry of a Riemannian manifold' is informal and that an explicit metric-space definition is preferable. The WSD is defined directly via the Wasserstein distance in the spatial-depth formula (Section 2), replacing the Euclidean norm with W_2 without invoking tangent spaces or exponential maps. Geodesic invariance follows from the isometry property of the Wasserstein metric under push-forwards, and the unique maximum at the geometric median is proved using the strict convexity of the Wasserstein distance for measures with finite second moments. For d>1 we will add a remark noting that non-uniqueness of geodesics does not affect the distance-based definition. These clarifications and the explicit formula will be inserted in the revised introduction.
  Revision: yes
- Referee: [Abstract] Abstract: the claims of preservation of depth axioms, breakdown-point calculations, influence function, consistency, and asymptotic normality are asserted without any displayed derivations, explicit formulas, or theorem statements. This prevents verification that the properties hold under the actual Wasserstein geometry rather than by construction.
  Authors: The abstract is a high-level summary; the full statements appear as Theorems 3.1 (depth axioms), 4.1–4.2 (breakdown points and influence function), 5.1 (consistency), and 5.2 (asymptotic normality) with proofs in the appendix. To improve readability we will move the theorem statements (without proofs) into a new subsection of the introduction so that the abstract claims are immediately traceable to the stated results.
  Revision: yes
- Referee: [Introduction / Definition] The weakest modeling assumption (that the Wasserstein metric space admits a direct, well-behaved extension of Euclidean spatial depth) is load-bearing for every subsequent claim; if the construction fails to be well-defined everywhere, the robustness, consistency, and test results rest on an unverified foundation.
  Authors: The definition in Section 2 is given for all probability measures with finite second moments (the natural domain of W_2), and well-definedness follows immediately from the triangle inequality and continuity of W_2. All subsequent results (robustness, consistency, two-sample test) are proved from this metric definition using only properties of the Wasserstein space as a complete separable metric space; no additional Riemannian structure is assumed. We will add a short paragraph after the definition explicitly stating the domain and confirming that the extension is well-defined everywhere on this domain.
  Revision: partial
Circularity Check
No circularity: definition of WSD followed by independent proofs of listed properties
The paper defines Wasserstein spatial depth by direct analogy to the Euclidean spatial depth using the Wasserstein metric on the space of distributions. It then states that it shows the measure preserves the standard properties (range [0,1], invariances, vanishing at infinity, maximum at geometric median, continuity). No quoted step reduces any claimed property to a fitted input, self-definition, or self-citation chain; the listed properties are presented as results to be established rather than presupposed in the definition. The modeling choice that the Wasserstein space 'mimics' Riemannian geometry is an explicit modeling assumption used to motivate the definition, not a hidden reduction. This is the normal case of a self-contained construction with subsequent verification.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The space of probability distributions equipped with the Wasserstein metric admits a depth notion that inherits the standard properties of statistical depth (range [0,1], invariance, maximum at geometric median).
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (unclear)
  unclear: relation between the paper passage and the cited Recognition theorem.
  $SD(Q;\mathcal{P}) := 1 - \left( \int \left| \mathbb{E}_{P\sim\mathcal{P}} \left[ \frac{x - T_{Q,P}(x)}{W_2(P,Q)} \right] \right|^2 dQ(x) \right)^{1/2}$ (Definition 3.1); geodesic velocity $v_{Q\to P}^0 = T_{Q,P} - I$
- IndisputableMonolith/Foundation/AlexanderDuality.lean: alexander_duality_circle_linking (unclear)
  unclear: relation between the paper passage and the cited Recognition theorem.
  W_2 endows P_2(R^d) with geodesic metric-space structure; tangent bundle TP(P_2) = closure of gradients in L^2(P)
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean: embed_injective (unclear)
  unclear: relation between the paper passage and the cited Recognition theorem.
  Properties: [0,1]-valued, transformation invariance under isometries of (P_2,W_2), vanishing at infinity, maximality at spatial median
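The definition quoted in the first entry above takes a particularly transparent form in one dimension. The following is a worked specialization under stated assumptions, not a formula quoted from the paper: Q is taken to be atomless on the real line, so that the optimal map is $T_{Q,P} = F_P^{-1} \circ F_Q$, and $\mathcal{P}$ denotes the law of the random distribution P (notation assumed here).

```latex
% Assumption: Q is atomless on the real line, so T_{Q,P} = F_P^{-1} \circ F_Q.
% Substituting u = F_Q(x) turns integrals against Q into integrals over (0,1).
\[
W_2(P,Q) = \Big( \int_0^1 \big( F_Q^{-1}(u) - F_P^{-1}(u) \big)^2 \, du \Big)^{1/2}
         = \big\| F_Q^{-1} - F_P^{-1} \big\|_{L^2(0,1)},
\]
\[
SD(Q;\mathcal{P}) = 1 - \Big\| \, \mathbb{E}_{P \sim \mathcal{P}}
  \Big[ \frac{F_Q^{-1} - F_P^{-1}}{\big\| F_Q^{-1} - F_P^{-1} \big\|_{L^2(0,1)}} \Big]
  \Big\|_{L^2(0,1)} .
\]
```

In this flat case the depth is the classical spatial depth of the quantile function $F_Q^{-1}$ in the Hilbert space $L^2(0,1)$, which is why the range [0,1] is immediate there. The difficulties raised in the referee report concern d>1, where no such linear embedding is available.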
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.