pith. machine review for the scientific record.

arxiv: 2605.13753 · v1 · submitted 2026-05-13 · 💻 cs.LG · cs.CV

Recognition: unknown

Min Generalized Sliced Gromov Wasserstein: A Scalable Path to Gromov Wasserstein


Pith reviewed 2026-05-14 19:40 UTC · model grok-4.3

classification 💻 cs.LG cs.CV
keywords Gromov-Wasserstein · sliced optimal transport · generalized slicers · geometric matching · shape analysis · amortized inference · transport plan

The pith

Min-GSGW learns coupled nonlinear slicers so that monotone 1D matching induces low-cost Gromov-Wasserstein transport plans in the original spaces.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces min-GSGW to make the Gromov-Wasserstein problem scalable by replacing direct optimization over transport plans with optimization over a small set of learned projection functions. These functions map each input measure into one dimension such that sorting the projected points produces a coupling whose cost, when evaluated back in the original metric spaces, is close to the true GW optimum. Because the projections are learned jointly for each pair and the final cost is computed in the original spaces, the method inherits rigid-motion invariance without extra alignment steps. Experiments on mesh matching and part transfer show that the induced plans remain geometrically meaningful while running at a fraction of the cost of standard GW solvers. An amortized version further replaces per-pair optimization with a single forward pass of a trained network.
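The mechanics of that sorting step can be sketched in a few lines of NumPy. The snippet below uses fixed random linear projections as stand-ins for the learned coupled slicers (the paper's actual slicer family and training loop are not reproduced here) and shows how the monotone 1D coupling induces a permutation plan whose GW cost is evaluated with the original distance matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy measures: n points each, in different ambient dimensions.
n = 6
X = rng.normal(size=(n, 2))   # source support
Y = rng.normal(size=(n, 3))   # target support

# Stand-in slicers: fixed linear projections to 1D. min-GSGW instead
# *learns* coupled nonlinear slicers; these are placeholders that only
# illustrate the sorting mechanics.
fx = X @ rng.normal(size=2)   # push-forward values of the source
fy = Y @ rng.normal(size=3)   # push-forward values of the target

# Monotone 1D coupling: pair the k-th smallest source value with the
# k-th smallest target value. With uniform weights this is a permutation.
sigma = np.empty(n, dtype=int)
sigma[np.argsort(fx)] = np.argsort(fy)   # source i -> target sigma[i]

# Lift back: evaluate the squared-loss GW objective of this plan using
# the ORIGINAL pairwise distances, not the projected ones.
Dx = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
Dy = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
gw_cost = ((Dx - Dy[np.ix_(sigma, sigma)]) ** 2).mean()

print(round(float(gw_cost), 4))
# min-GSGW would now adjust the slicer parameters to drive this cost down.
```

Because the final cost is computed from Dx and Dy alone, rigidly rotating X or Y leaves it unchanged for a fixed plan, which is the mechanism behind the invariance claim.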

Core claim

min-GSGW minimizes the Gromov-Wasserstein objective directly in the original spaces by searching over parameters of coupled generalized slicers; the optimal monotone coupling of the resulting one-dimensional push-forwards is lifted back to a transport plan whose GW cost serves as the training signal, yielding both the plan and its objective value at reduced expense.
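In symbols (notation assumed here, not taken from the paper), the discrete squared-loss GW problem and the surrogate min-GSGW restricts it to can be written as:

```latex
\mathrm{GW}(\mu,\nu) \;=\; \min_{T \in \Pi(p,q)} \sum_{i,j,k,l} \bigl| d_X(x_i,x_k) - d_Y(y_j,y_l) \bigr|^2 \, T_{ij}\, T_{kl},
\qquad
\text{min-GSGW:}\;\; \min_{\theta} \sum_{i,j,k,l} \bigl| d_X(x_i,x_k) - d_Y(y_j,y_l) \bigr|^2 \, T^{\theta}_{ij}\, T^{\theta}_{kl},
```

where T^θ is the plan obtained by monotonely coupling the 1D push-forwards of the two measures under the coupled slicers with parameters θ. Because the search runs over a restricted family of plans, the surrogate value upper-bounds the true GW optimum; how tight that bound is, is exactly the load-bearing premise.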

What carries the argument

Coupled nonlinear slicers that produce compatible one-dimensional push-forwards; the monotone coupling of those push-forwards induces a transport plan, and the GW cost of that plan, evaluated in the original spaces, is the quantity being minimized.

If this is right

  • The induced transport plans supply geometrically meaningful correspondences for tasks such as mesh matching and shape interpolation.
  • Rigid-motion invariance holds automatically because the final cost is evaluated with the original distances rather than projected distances.
  • An amortized network version removes the need for per-instance optimization, enabling direct application to new pairs.
  • Computational cost drops because only the slicer parameters are optimized instead of a full coupling matrix.
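The scaling claim in the last bullet can be made concrete with a back-of-envelope count; the point sizes and the two-hidden-layer slicer architecture below are illustrative assumptions, not taken from the paper.

```python
# Decision variables optimized by a standard GW solver vs. by a slicer.
n = 5000                         # points per shape (illustrative)
coupling_vars = n * n            # full transport plan: one entry per pair

# Hypothetical MLP slicer R^3 -> R with two hidden layers of width 64.
widths = [3, 64, 64, 1]
slicer_vars = sum(a * b + b for a, b in zip(widths[:-1], widths[1:]))

print(coupling_vars, slicer_vars)   # ~25 million vs. a few thousand
```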

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The same projection-learning idea could be applied to other quadratic optimal-transport objectives that currently lack fast solvers.
  • If the slicers generalize across datasets, the amortized model might support real-time registration pipelines in robotics or graphics.
  • Because the method decouples projection learning from the final cost evaluation, hybrid schemes that combine it with a small number of full GW refinement steps become feasible.
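The hybrid idea in the last bullet can be sketched as a Frank-Wolfe-style refinement over permutation plans: start from a cheap sliced-style matching and take a few linear-assignment steps on the gradient of the squared-loss GW objective, accepting a step only when it lowers the cost. Everything below (the toy data, the refinement loop, the function names) is an illustration of the concept, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def gw_cost(Dx, Dy, sig):
    """Squared-loss GW objective of the permutation plan i -> sig[i]."""
    return float(((Dx - Dy[np.ix_(sig, sig)]) ** 2).mean())

def refine(Dx, Dy, sig, steps=5):
    """Greedy Frank-Wolfe-style refinement over permutations (a sketch)."""
    for _ in range(steps):
        # grad[i, j] ~ cost of matching i to j, holding the rest of the
        # current permutation fixed.
        grad = ((Dx[:, None, :] - Dy[:, sig][None, :, :]) ** 2).sum(-1)
        _, cand = linear_sum_assignment(grad)
        if gw_cost(Dx, Dy, cand) < gw_cost(Dx, Dy, sig):
            sig = cand                 # accept only improving steps
        else:
            break
    return sig

# Toy instance: the target is the same shape with relabeled points, so
# the optimal GW cost is exactly zero.
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 2))
Dx = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
p = rng.permutation(8)
Dy = Dx[np.ix_(p, p)]

sig0 = np.arange(8)                    # e.g. a cheap sliced-style guess
before = gw_cost(Dx, Dy, sig0)
after = gw_cost(Dx, Dy, refine(Dx, Dy, sig0))
print(before, after)                   # cost never increases
```

By construction the loop only accepts improving permutations, so a few such steps can tighten a sliced initialization without paying for a full GW solve.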

Load-bearing premise

The monotone coupling chosen in the projected one-dimensional domain produces a transport plan whose cost under the original Gromov-Wasserstein objective is close to the global minimum.

What would settle it

On a rigid-motion-invariant shape-matching benchmark, the GW costs or correspondence errors produced by min-GSGW plans are substantially worse than those returned by a converged exact or Sinkhorn GW solver.

Figures

Figures reproduced from arXiv: 2605.13753 by Ashkan Shahbazi, Ping He, Soheil Kolouri, Xinran Liu.

Figure 1: Overview of min-GSGW. (a) Two metric measure spaces µ on X ⊂ R^p and ν on Y ⊂ R^q with q ≥ p. (b) A learned lifting h : X → Y maps source points into the target domain; a shared nonlinear slicer f* : Y → R then assigns push-forward values to both lifted source points h(x_i) (circles) and target points y_j (squares). Level curves of f* induce a monotone ordering on both sets. (c) Sorting the two push-forward …
Figure 3: Horse mesh interpolation from GW couplings. OT barycentric interpolation between consecutive horse meshes at t = 0.33 and t = 0.67. POT achieves lower GW values (8.257 × 10⁻³ for 0→1, 9.714 × 10⁻³ for 1→2), yet our method still produces smooth, geometrically coherent deformations (2.884 × 10⁻² for 0→1, 4.622 × 10⁻² for 1→2).
Figure 4: Qualitative correspondences on ShapeNet. Each column shows one object category, while rows show the source shape, POT GW, GW Sinkhorn, and our method. Source colors are propagated to the target via the estimated coupling. Bottom annotations report part-matching accuracy, and the last row also reports the accuracy difference between our method and POT GW.
Figure 5: Qualitative comparison on four toy datasets. Top: GW correspondences. Bottom: ours. Source points lie on …
Figure 6: Shape correspondence on animal meshes. Landmark correspondences (18 total, 6 shown) computed from geodesic distance matrices. Blue denotes source landmarks, red denotes predicted target landmarks, and black indicates correspondences. (a) POT GW. (b) Ours.
Original abstract

We propose min Generalized Sliced Gromov–Wasserstein (min-GSGW), a sliced formulation for the Gromov–Wasserstein (GW) problem using expressive generalized slicers. The key idea is to learn coupled nonlinear slicers that assign compatible push-forward values to both input measures, so that monotone coupling in the projected domain lifts to a transport plan evaluated against the GW objective in the original spaces. The resulting plan induces a GW objective value, and min-GSGW minimizes this cost directly in the original spaces. We further show that min-GSGW is rigid-motion invariant, a crucial property for geometric matching and shape analysis tasks. Our contributions are threefold: 1) we introduce generalized slicers into the sliced GW framework; 2) we construct an efficient slicing-based GW transport plan; and 3) we develop an amortized variant that replaces per-instance optimization with a learned slicer for unseen input pairs. We perform experiments on animal mesh matching, horse mesh interpolation, and ShapeNet part transfer. Results show that min-GSGW produces meaningful geometric correspondences and GW objective values at substantially lower computational cost than existing GW solvers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes min Generalized Sliced Gromov-Wasserstein (min-GSGW), which learns coupled nonlinear slicers to project input measures, applies monotone 1D coupling in the projected domain, and lifts the resulting plan to evaluate and minimize the Gromov-Wasserstein objective directly in the original spaces. It further establishes rigid-motion invariance and presents an amortized variant that replaces per-pair optimization with a learned slicer. Experiments on animal mesh matching, horse mesh interpolation, and ShapeNet part transfer are reported to demonstrate meaningful correspondences at substantially lower cost than existing GW solvers.

Significance. If the lifted plans reliably achieve GW costs close to the global optimum, the approach would supply a practical, scalable route to GW distances and transport plans for large geometric datasets, addressing a key computational bottleneck in shape analysis and geometric machine learning.

major comments (3)
  1. [Abstract / lifting construction] Abstract and the section describing the lifting procedure: the statement that the monotone 1D coupling 'lifts to a transport plan evaluated against the GW objective in the original spaces' and that min-GSGW 'minimizes this cost directly' is load-bearing for the central efficiency claim, yet no explicit construction of the lift, no derivation that the induced cost equals the GW cost of the lifted plan, and no approximation bound relative to the true GW optimum are supplied.
  2. [Experiments] Experimental section (animal mesh matching and ShapeNet results): while lower wall-clock time is shown, the reported GW objective values are not compared quantitatively to those obtained from exact or high-accuracy GW solvers on the same instances (e.g., relative gap or rank correlation); without such controls the claim that the plans are 'meaningful' and constitute a 'scalable path' cannot be assessed.
  3. [Invariance proof] Section on rigid-motion invariance: the invariance property is asserted for the learned slicers, but the argument must be stated explicitly (including how the nonlinear slicer family interacts with isometries) because invariance alone does not address the approximation gap to the true GW optimum.
minor comments (2)
  1. [Amortized variant] The amortized variant is introduced in the contributions but its training objective and inference procedure are not contrasted clearly with the per-instance optimization; a short algorithmic box would improve readability.
  2. [Preliminaries] Notation for the generalized slicers and the push-forward measures should be introduced once with a single consistent symbol set rather than re-defined inline in multiple places.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for their detailed and constructive comments on our manuscript. We provide point-by-point responses to the major comments below, indicating where revisions will be made.

Point-by-point responses
  1. Referee: [Abstract / lifting construction] Abstract and the section describing the lifting procedure: the statement that the monotone 1D coupling 'lifts to a transport plan evaluated against the GW objective in the original spaces' and that min-GSGW 'minimizes this cost directly' is load-bearing for the central efficiency claim, yet no explicit construction of the lift, no derivation that the induced cost equals the GW cost of the lifted plan, and no approximation bound relative to the true GW optimum are supplied.

    Authors: We appreciate the referee's emphasis on this foundational aspect. In the revised version, we will provide an explicit construction of the lifting procedure from the 1D coupling to the original space transport plan, including the mathematical derivation that the GW cost is correctly computed on the lifted plan. Regarding the approximation bound to the true GW optimum, we note that establishing such a bound is non-trivial and not currently available; we will instead emphasize the empirical evidence and the method's practical utility. revision: partial

  2. Referee: [Experiments] Experimental section (animal mesh matching and ShapeNet results): while lower wall-clock time is shown, the reported GW objective values are not compared quantitatively to those obtained from exact or high-accuracy GW solvers on the same instances (e.g., relative gap or rank correlation); without such controls the claim that the plans are 'meaningful' and constitute a 'scalable path' cannot be assessed.

    Authors: We agree that additional quantitative validation against exact or high-accuracy solvers would strengthen the experimental claims. Since exact GW solvers scale poorly to the dataset sizes in our experiments, we will include new experiments on smaller instances where exact computation is feasible, reporting relative gaps and comparisons to other approximate GW methods to better assess the quality of the obtained transport plans. revision: yes

  3. Referee: [Invariance proof] Section on rigid-motion invariance: the invariance property is asserted for the learned slicers, but the argument must be stated explicitly (including how the nonlinear slicer family interacts with isometries) because invariance alone does not address the approximation gap to the true GW optimum.

    Authors: We will revise the section to include a detailed and explicit proof of rigid-motion invariance. The proof will demonstrate how the family of nonlinear slicers interacts with isometries to ensure the min-GSGW distance remains unchanged under simultaneous rigid transformations of both measures. revision: yes
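A sketch of the standard argument such a proof would build on (Euclidean setting assumed; this is not the paper's proof): for a rigid motion x ↦ Rx + b with RᵀR = I,

```latex
\|(Rx_i + b) - (Rx_k + b)\| \;=\; \|R(x_i - x_k)\| \;=\; \|x_i - x_k\|,
```

so both pairwise distance matrices, and hence the GW cost of any fixed plan, are unchanged. Invariance of the minimized min-GSGW value additionally requires that the set of slicer-induced plans is itself preserved, e.g. that the slicer family is closed under precomposition with rigid motions; that closure property is precisely the slicer-isometry interaction the referee asks to see spelled out.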

standing simulated objections not resolved
  • Establishing a theoretical approximation bound to the global optimum of the Gromov-Wasserstein distance.

Circularity Check

0 steps flagged

No circularity: minimization operates directly on GW objective of induced plans

Full rationale

The paper defines min-GSGW explicitly as learning coupled nonlinear slicers, inducing a transport plan via 1D monotone coupling, then minimizing the GW cost of that plan evaluated in the original spaces. This is a parameterized optimization over a restricted class of plans, with the objective computed directly rather than by algebraic identity or fitted-parameter renaming. No equation reduces the reported GW value to an input by construction, no self-citation is load-bearing for the core claim, and no ansatz or uniqueness theorem is smuggled in. The approximation gap to true GW is a separate empirical question, not a circularity issue.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the learned slicer networks implicitly contain trainable weights, but none are named or quantified here.

pith-pipeline@v0.9.0 · 5514 in / 1139 out tokens · 41758 ms · 2026-05-14T19:40:22.115458+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 1 internal anchor

  1. [1]

    On assignment problems related to Gromov–Wasserstein distances on the real line

    Robert Beinert, Cosmas Heiss, and Gabriele Steidl. On assignment problems related to Gromov–Wasserstein distances on the real line. SIAM Journal on Imaging Sciences, 16(2):1028–1032, 2023

  2. [2]

    ShapeNet: An Information-Rich 3D Model Repository

    Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, Jianxiong Xiao, Li Yi, and Fisher Yu. ShapeNet: An Information-Rich 3D Model Repository. Technical Report arXiv:1512.03012 [cs.GR], Stanford University — Princeton University — Toyota Technological Institute at Chic...

  3. [3]

    Differentiable generalized sliced wasserstein plans

    Laetitia Chapel, Romain Tavenard, and Samuel Vaiter. Differentiable generalized sliced wasserstein plans. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2026

  4. [4]

    Semidefinite relaxations of the Gromov-Wasserstein distance

    Junyu Chen, Binh Nguyen, Shang Hui Koh, and Yong Sheng Soh. Semidefinite relaxations of the Gromov-Wasserstein distance. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024

  5. [5]

    Quantized Gromov-Wasserstein

    Samir Chowdhury, David Miller, and Tom Needham. Quantized Gromov-Wasserstein, 2021

  6. [6]

    Sinkhorn distances: Lightspeed computation of optimal transport

    Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In C.J. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K.Q. Weinberger, editors, Advances in Neural Information Processing Systems, volume 26. Curran Associates, Inc., 2013

  7. [7]

    On the existence of Monge maps for the Gromov–Wasserstein problem

    Théo Dumont, Théo Lacombe, and François-Xavier Vialard. On the existence of Monge maps for the Gromov–Wasserstein problem. Foundations of Computational Mathematics, 25(2):463–510, February 2024

  8. [8]

    POT: Python Optimal Transport

    Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Stanislas Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T.H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong,...

  9. [9]

    Revisiting Frank-Wolfe: Projection-free sparse convex optimization

    Martin Jaggi. Revisiting Frank-Wolfe: Projection-free sparse convex optimization. In Sanjoy Dasgupta and David McAllester, editors, Proceedings of the 30th International Conference on Machine Learning, volume 28 of Proceedings of Machine Learning Research, pages 427–435, Atlanta, Georgia, USA, 17–19 Jun 2013. PMLR

  10. [10]

    Sampled Gromov Wasserstein

    Tanguy Kerdoncuff, Rémi Emonet, and Marc Sebban. Sampled Gromov Wasserstein. Machine Learning, 110(8):2151–2186, 2021

  11. [11]

    Efficient transferable optimal transport via min-sliced transport plans, 2025

    Xinran Liu, Elaheh Akbari, Rocio Diaz Martin, Navid NaderiAlizadeh, and Soheil Kolouri. Efficient transferable optimal transport via min-sliced transport plans, 2025

  12. [12]

    Expected sliced transport plans

    Xinran Liu, Rocio Diaz Martin, Yikun Bai, Ashkan Shahbazi, Matthew Thorpe, Akram Aldroubi, and Soheil Kolouri. Expected sliced transport plans. In Y. Yue, A. Garg, N. Peng, F. Sha, and R. Yu, editors, International Conference on Learning Representations, volume 2025, pages 60948–60971, 2025

  13. [13]

    Fast optimal transport through sliced generalized wasserstein geodesics

    Guillaume Mahey, Laetitia Chapel, Gilles Gasso, Clément Bonet, and Nicolas Courty. Fast optimal transport through sliced generalized wasserstein geodesics. In A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, editors, Advances in Neural Information Processing Systems, volume 36, pages 35350–35385. Curran Associates, Inc., 2023

  14. [14]

    Gromov-Monge quasi-metrics and distance distributions

    Facundo Mémoli and Tom Needham. Gromov-Monge quasi-metrics and distance distributions. arXiv, 2018

  15. [15]

    Gromov-Wasserstein distances and the metric approach to object matching

    Facundo Mémoli. Gromov-Wasserstein distances and the metric approach to object matching. Foundations of Computational Mathematics, 11(4):417–487, 2011

  16. [16]

    Max-min sliced Gromov-Wasserstein

    Wen-Xin Pan, Isabel Haasler, and Hei Victor Cheng. Max-min sliced Gromov-Wasserstein, 2026

  17. [17]

    Gromov-Wasserstein averaging of kernel and distance matrices

    Gabriel Peyré, Marco Cuturi, and Justin Solomon. Gromov-Wasserstein averaging of kernel and distance matrices. In Maria Florina Balcan and Kilian Q. Weinberger, editors, Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 2664–2672, New York, New York, USA, 20–22 Jun 2016. PMLR

  18. [18]

    Orthogonal estimation of Wasserstein distances

    Mark Rowland, Jiri Hron, Alexander G. de G. Matthews, and Zoubin Ghahramani. Orthogonal estimation of Wasserstein distances. In International Conference on Artificial Intelligence and Statistics, volume 89 of Proceedings of Machine Learning Research, pages 186–195, 2019

  19. [19]

    Linear-time gromov Wasserstein distances using low rank couplings and costs

    Meyer Scetbon, Gabriel Peyré, and Marco Cuturi. Linear-time gromov Wasserstein distances using low rank couplings and costs. In Kamalika Chaudhuri, Stefanie Jegelka, Le Song, Csaba Szepesvari, Gang Niu, and Sivan Sabato, editors, Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, pa...

  20. [20]

    Entropic metric alignment for correspondence problems

    Justin Solomon, Gabriel Peyré, Vladimir G. Kim, and Suvrit Sra. Entropic metric alignment for correspondence problems. ACM Trans. Graph., 35(4), July 2016

  21. [21]

    Lapsum - one method to differentiate them all: Ranking, sorting and top-k selection

    Łukasz Struski, Michał B. Bednarczyk, Igor T. Podolak, and Jacek Tabor. Lapsum - one method to differentiate them all: Ranking, sorting and top-k selection. In The International Conference on Machine Learning (ICML) 2025, 2025

  22. [22]

    Deformation transfer for triangle meshes

    Robert W. Sumner and Jovan Popović. Deformation transfer for triangle meshes. ACM Trans. Graph., 23(3):399–405, August 2004

  23. [23]

    Optimal transport for structured data with application on graphs

    Vayer Titouan, Nicolas Courty, Romain Tavenard, Chapel Laetitia, and Rémi Flamary. Optimal transport for structured data with application on graphs. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 6275–6284, Long Beach,...

  24. [24]

    Sliced Gromov-Wasserstein

    Vayer Titouan, Rémi Flamary, Nicolas Courty, Romain Tavenard, and Laetitia Chapel. Sliced Gromov-Wasserstein. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019

  25. [25]

    Scalable gromov-wasserstein learning for graph partitioning and matching

    Hongteng Xu, Dixin Luo, and Lawrence Carin. Scalable gromov-wasserstein learning for graph partitioning and matching. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019

  26. [26]

    Duality and sample complexity for the gromov-wasserstein distance

    Zhengxin Zhang, Ziv Goldfeld, Youssef Mroueh, and Bharath Sriperumbudur. Duality and sample complexity for the Gromov-Wasserstein distance. In NeurIPS 2023 Workshop Optimal Transport and Machine Learning, 2023