min-GSGW: A Scalable Path to Gromov

Abstract

We propose min Generalized Sliced Gromov–Wasserstein (\(\mathrm{min\text{-}GSGW}\)), a sliced formulation for the Gromov–Wasserstein (GW) problem using expressive generalized slicers. The key idea is to learn coupled nonlinear slicers that assign compatible push-forward values to both input measures, so that monotone coupling in the projected domain lifts to a transport plan evaluated against the GW objective in the original spaces.

The resulting plan induces a GW objective value, and \(\mathrm{min\text{-}GSGW}\) minimizes this cost directly in the original spaces. We further show that \(\mathrm{min\text{-}GSGW}\) is rigid-motion invariant, a crucial property for geometric matching and shape analysis tasks.

We validate on animal mesh matching, horse shape interpolation, and ShapeNet part transfer. Results show that \(\mathrm{min\text{-}GSGW}\) produces meaningful geometric correspondences and GW objective values at substantially lower computational cost than existing GW solvers.

Rigid-motion invariant Explicit transport plans \(\mathcal{O}(n \log n)\) plan extraction Amortized inference GW upper bound guarantee

Contributions

01

Generalized slicer framework. We introduce \(\mathrm{min\text{-}GSGW}\) — a coupled nonlinear slicer construction where both spaces share a common scalarization via a lifting map \(h: \mathcal{X} \to \mathcal{Y}\), inducing transport plans by sorting push-forward values. Nonlinearity reaches regions of coupling space inaccessible to linear projections.
02

Structural theory. We establish rigid-motion invariance and show \(\mathrm{min\text{-}GSGW}\) is a valid upper bound on the GW distance. We optimize via a differentiable sorting relaxation (LapSum), annealed toward hard permutations at inference.
03

Amortized matching. An amortized variant replaces per-instance optimization with a shared-weight transformer architecture that predicts push-forward scores in a single forward pass, generalizing to unseen shape pairs without reoptimization.

Method

Figure 1: Overview of \(\mathrm{min\text{-}GSGW}\). (a) Two metric measure spaces \(\mu\) on \(\mathcal{X} \subset \mathbb{R}^p\) and \(\nu\) on \(\mathcal{Y} \subset \mathbb{R}^q\). (b) A learned lifting \(h: \mathcal{X} \to \mathcal{Y}\) maps source points into the target domain; a shared nonlinear slicer \(f^*: \mathcal{Y} \to \mathbb{R}\) assigns push-forward values to both sets. (c) Sorting and monotone matching yields the transport coupling \(\pi^{f,h}\), whose GW objective is minimized.

Transport plan construction proceeds in three stages. By composing a nonlinear rearrangement \(\xi\) into the slicer, the full 1D GW plan is recoverable from monotone matching under mild conditions (Proposition 1).

Step 01

Lift

A learned measurable map \(h: X \to Y\) embeds the lower-dimensional source into the target domain, replacing the rigid zero-padding of prior sliced GW methods.

Step 02

Shared Nonlinear Slicer

A single scalar function \(f: Y \to \mathbb{R}\) assigns push-forward values to both lifted source points and target points, coupling the two orderings by construction.

Step 03

Sort & Evaluate

Sorting the push-forward sequences and applying monotone matching yields a feasible coupling \(\pi^{f,h}\), whose GW loss is evaluated in the original metric spaces and minimized.

\[\mathrm{min\text{-}GSGW}(\mu,\nu) \;:=\; \inf_{f \in \mathcal{F},\; h \in \mathcal{H}} \;\mathcal{L}_{\mathrm{GW}}\!\bigl(\mu,\,\nu;\;\pi^{\mathrm{mon}}_{f,h}\bigr)\]

Core objective — minimizes GW loss over slicer-induced monotone couplings

Upper bound relation to GW

\[\mathrm{GW}^2(\mu,\nu) \;=\; \mathcal{L}_{\mathrm{GW}}(\mu,\nu;\pi^*) \;\leq\; \inf_{f \in \mathcal{F},\; h \in \mathcal{H}} \mathcal{L}_{\mathrm{GW}}\!\bigl(\mu,\nu;\pi^{f,h}\bigr) \;=\; \mathrm{min\text{-}GSGW}(\mu,\nu)\]

Valid upper bound on \(\mathrm{GW}^2\); tight when the restricted family contains an optimal GW plan.

Rigid-motion invariance (Proposition 2)

Under natural stability assumptions on \(\mathcal{F}\) and \(\mathcal{H}\), for all \(g_X \in \mathcal{E}(p)\), \(g_Y \in \mathcal{E}(q)\):

\[\mathrm{min\text{-}GSGW}(g_{X\#}\mu,\; g_{Y\#}\nu) \;=\; \mathrm{min\text{-}GSGW}(\mu,\;\nu)\]

The objective is invariant to rigid motions applied independently to each space.

Computational Complexity

At inference, plan extraction costs \(\mathcal{O}(n \log n)\) via sorting. Dense GW solvers incur cubic per-iteration cost from quadratic GW tensor contractions.

Method	Complexity	Explicit plan?
POT-GW	NP-hard; \(\mathcal{O}(n^4)\) naive · \(\mathcal{O}(n^3)\) iter.	✓
Sinkhorn GW	\(\mathcal{O}(n^4)\) naive · \(\mathcal{O}(n^3)\) iter.	✓
SGW	\(\mathcal{O}(L \cdot n \log n)\)	—
MSGW	\(\mathcal{O}(L^2 \cdot n \log n)\), OOM at \(N{>}1000\)	—
RI-SGW	\(\mathcal{O}(n_{\mathrm{iter}}(Ln(p{+}\log n){+}p^3))\)	—
min-GSGW (Ours)	\(\mathcal{O}(n \log n)\)	✓

Wall-clock runtime · RTX A6000 · \(N=1024\) · avg. 10 runs

POT-GW

886 ms

Sinkhorn (ε=0.05)

843 ms

Sinkhorn (ε=0.5)

822 ms

Sinkhorn (ε=1)

790 ms

min-GSGW (Ours)

9.4 ms

Figure 2: Runtime scaling. Wall-clock time vs. \(N \in \{100,\ldots,5000\}\) on RTX A6000, averaged over 10 runs. MSGW runs out of memory beyond \(N=1000\); \(\mathrm{min\text{-}GSGW}\) scales most favorably.

Results

Realistic Shape Matching — Geodesic Error ↓

Princeton protocol: mean normalized geodesic distance. H = Horse, E = Elephant, C = Cat.

Method	H–H	E–E	C–C	H–E	C–H	C–E	Time (s)
POT	0.112	0.103	0.079	0.181	0.208	0.189	1.82
Sinkhorn	0.178	0.194	0.138	0.247	0.281	0.236	0.41
LR-GW	0.143	0.157	0.103	0.213	0.241	0.207	0.87
SDP-GW	0.128	0.134	0.091	0.196	0.224	0.193	15.7
SaGroW	0.167	0.182	0.124	0.231	0.264	0.221	0.31
min-GSGW (Ours)	0.079	0.091	0.058	0.138	0.162	0.124	0.08

Figure 3: Shape correspondence on animal meshes. 18 landmark correspondences (6 shown) from geodesic distance matrices. Blue = source, red = predicted target, black = correspondence lines. (a) POT GW. (b) Ours.

Horse Mesh Interpolation

OT barycentric interpolation at \(t=0.33\) and \(t=0.67\). Although POT achieves lower GW values, our plans yield plausible dense correspondences and smooth deformations.

Figure 4: Horse mesh interpolation from GW couplings. POT: \(8.26{\times}10^{-3}\) (0→1), \(9.71{\times}10^{-3}\) (1→2). Ours: \(2.88{\times}10^{-2}\) (0→1), \(4.62{\times}10^{-2}\) (1→2). Global GW optimality does not imply geometric usefulness.

Amortized ShapeNet Part Segmentation — Accuracy ↑

Mean part label transfer accuracy. No part labels used during training. Forward time per pair in ms.

Method	Acc. \(N{=}256\)	ms	Acc. \(N{=}512\)	ms	Acc. \(N{=}1024\)	ms
POT GW	68.9%	75.3	69.8%	375.7	73.5%	886.2
Sinkhorn (\(\varepsilon{=}0.05\))	66.1%	164.7	66.6%	361.1	71.3%	843.1
Sinkhorn (\(\varepsilon{=}0.5\))	64.3%	91.7	65.5%	370.2	68.7%	821.9
Sinkhorn (\(\varepsilon{=}1\))	62.8%	72.3	64.7%	303.3	67.4%	790.2
min-GSGW (Ours)	78.2%	2.66	77.5%	4.3	74.9%	9.4

Figure 5: Qualitative correspondences on ShapeNet. Each column = one object category. Rows: source, POT GW, GW Sinkhorn, ours. Source colors propagated via the estimated coupling. Bottom: part-matching accuracy.

Ablation — Slicer Design

Geodesic error \((\downarrow)\). Nonlinearity and coupling each provide an independent gain.

Slicer	Relation	H–H	E–E	C–C	H–E
Linear	Independent	0.162	0.178	0.121	0.238
Linear	Dependent	0.134	0.147	0.098	0.203
Nonlinear	Independent	0.108	0.119	0.081	0.172
Nonlinear	Dependent	0.079	0.091	0.058	0.138

Figure 6: Toy correspondences. Qualitative comparison on four synthetic 2D-to-3D pairs. Top: GW. Bottom: ours. Source points on \(z=0\); targets in \(\mathbb{R}^3\). Our method produces cleaner, more structurally aligned matches.

Min Generalized Sliced Gromov–Wasserstein:
A Scalable Path to Gromov–Wasserstein