arxiv: 2604.07383 · v2 · submitted 2026-04-08 · 💻 cs.LG

Recognition: no theorem link

SCOT: Multi-Source Cross-City Transfer with Optimal-Transport Soft-Correspondence Objective

Yuyao Wang , Min Yang , Meng Chen , Weiming Huang , Yongshun Gong

Authors on Pith no claims yet

Pith reviewed 2026-05-12 01:23 UTC · model grok-4.3

classification 💻 cs.LG

keywords cross-city transferoptimal transportsoft correspondencemulti-source learningentropic OTurban representation learningSinkhorn algorithm

0 comments

The pith

SCOT recovers explicit soft correspondences between mismatched city regions via entropic optimal transport to improve multi-source transfer without ground-truth alignments.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that cross-city transfer becomes feasible even when cities use incompatible region partitions and supply no known matches by framing alignment as a Sinkhorn-based entropic optimal transport problem that produces explicit soft correspondences. SCOT sharpens the resulting representations with an OT-weighted contrastive objective, stabilizes training through cycle reconstruction, and handles multiple sources by routing each city to a shared prototype hub under a target-induced prior. These steps yield higher prediction accuracy and robustness on real urban tasks while exposing the quality of each alignment through the learned transport plans and hub assignments. Readers would care because many practical settings involve moving models between cities whose data grids or zoning schemes simply do not line up.

Core claim

SCOT claims that entropic optimal transport can be used to learn explicit soft correspondences between unequal region sets of different cities, that an OT-weighted contrastive loss plus cycle reconstruction makes the transferred structure both sharper and more stable, and that balanced transport to a shared prototype hub extends the approach to the multi-source case, producing measurable gains in accuracy together with interpretable diagnostics of alignment quality.

What carries the argument

Sinkhorn-based entropic optimal transport that computes soft region correspondences, combined with an OT-weighted contrastive objective, cycle-style reconstruction regularizer, and balanced transport to a shared prototype hub guided by a target-induced prior.

If this is right

Transfer accuracy and robustness improve consistently across real-world cities and prediction tasks.
The learned transport couplings and hub assignments supply explicit diagnostics of alignment quality.
Multi-source transfer is handled by aligning every source and the target to one shared prototype hub via balanced entropic transport.
The OT-weighted contrastive objective sharpens transferable structure beyond what distribution-level alignment achieves.
Cycle-style reconstruction stabilizes optimization under heterogeneous city data.
The approach avoids reliance on heuristic anchor choices or implicit distribution matching.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same soft-correspondence mechanism could be tested on other spatial transfer settings such as climate or traffic sensor networks whose grids do not match.
If the recovered couplings prove reliable, they could serve as an unsupervised way to discover functional similarities between urban zones for planning applications.
The framework suggests that optimal transport may be a general primitive for aligning heterogeneous spatial datasets when supervised matches are unavailable.

Load-bearing premise

Meaningful soft correspondences between incompatible region partitions of different cities exist and can be recovered via entropic optimal transport without any ground-truth region matches.

What would settle it

Running SCOT on city pairs whose region partitions are known to be randomly mismatched and observing that accuracy gains disappear while the learned couplings become uniform or uninterpretable would show the soft-correspondence premise does not hold in practice.

Figures

Figures reproduced from arXiv: 2604.07383 by Meng Chen, Min Yang, Weiming Huang, Yongshun Gong, Yuyao Wang.

**Figure 1.** Figure 1: Illustration of motivation. this makes alignment the central technical bottleneck: modern GNN encoders already produce expressive region embeddings, but without a principled correspondence mechanism, those embeddings cannot be reliably transferred across incompatible partitions. To cope with these challenges, prior work typically aligns cities by matching embedding distributions or by constructing heurist… view at source ↗

**Figure 2.** Figure 2: t-SNE visualization for XA→BJ transfer. over-mix embedding clouds under heterogeneity (Fig. 1b). Conversely, anchor/nearest-neighbor matches can be brittle and prone to hubness, producing many-to-one correspondences (Lei et al., 2022; Tang et al., 2022; Zhao et al., 2023; Bao et al., 2022; Wang et al., 2021; Chen et al.). This is visible in [PITH_FULL_IMAGE:figures/full_fig_p002_2.png] view at source ↗

**Figure 3.** Figure 3: The pipeline of SCOT. city semantic alignment between unequal region sets, without requiring node correspondence. Backbone and intra-city objective. Following Chen et al., for each c ∈ {s, t} we initialize learnable embeddings H (0) c ∈ R nc×d and apply L Graph Attention Network (GAT) Veličković et al. (2017) layers over the spatial adjacency graph Ac — which defines the neighborhood structure for attent… view at source ↗

**Figure 4.** Figure 4: Illustration of multi-source hub alignment. prototype marginal controls prototype capacity and emphasizes target-relevant semantics, improving stability and preventing source domination [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗

**Figure 5.** Figure 5: Radar-chart matrix for cross-city transfer performance. MAE, MAPE, and Avg (min–max normalized within panel; center = lower error). regressor on the source city labels using (Z X, y X) and directly applying it to target embeddings Z Y to predict y Y . We report MAE and MAPE for GDP, population, and CO2 emission prediction; lower values indicate better transfer. 5.2 Experiment Results 5.2.1 Single-source R… view at source ↗

**Figure 7.** Figure 7: Best single-source SCOT (orange) vs. multi [PITH_FULL_IMAGE:figures/full_fig_p008_7.png] view at source ↗

**Figure 6.** Figure 6: Ablation t-SNE of SCOT alignment (XA→BJ). 5.2.2 Multi-source Results [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗

**Figure 11.** Figure 11: Sensitivity to temperature τ (XA→BJ). 2 4 8 16 32 64 128 256 1000 hub size K 160 180 200 220 MAE GDP MAE MAPE 2 4 8 16 32 64 128 256 1000 hub size K 500 600 700 800 Population 2 4 8 16 32 64 128 256 1000 hub size K 140 160 180 200 CO2 2 4 6 8 2 3 4 5 1.5 2.0 2.5 3.0 3.5 MAPE Sensitivity to hub size K (CD+BJ→XA, multi-source) [PITH_FULL_IMAGE:figures/full_fig_p009_11.png] view at source ↗

**Figure 10.** Figure 10: Sensitivity to Sinkhorn regularization ε (XA→BJ). (Appendix H.3), the target-induced prototype prior (Appendix H.2), and balanced vs. unbalanced OT in hub alignment (Appendix H.4). These results support the stability and selectivity benefits of coordinated many-to-hub matching. 5.4 Diagnostics Alignment diagnostics [PITH_FULL_IMAGE:figures/full_fig_p009_10.png] view at source ↗

**Figure 13.** Figure 13: OT coupling diagnostics. Histograms of column entropy H(P:,j ), row entropy H(Pi,:), and column marginals cj = P i Pij (here shown for a representative epoch). H Ablation Study H.1 Ablation Study for Single Source SCOT We conduct an ablation study on the three components of SCOT: the OT alignment loss LOT, the OT-weighted contrastive loss Lcon, and the reconstruction regularizer Lrec [PITH_FULL_IMAGE:fig… view at source ↗

**Figure 14.** Figure 14: Ablation study on cross-city transfer performance. We report MAE (top row) and MAPE (bottom row) for GDP, population, and CO2 prediction across six transfer directions (BJ→XA, XA→BJ, XA→CD, CD→XA, BJ→CD, CD→BJ). H.2 Ablation: Effect of Target-Induced Prototype Prior Equation (19) constructs a target-induced hub marginal b ∈ ∆K−1 by aggregating target–prototype cosine similarity: s¯k = 1 nt Xnt j=1 z˜ t⊤ j… view at source ↗

**Figure 15.** Figure 15: Entropy of prototype marginal bt over training. The uniform prior stays at log K; the adaptive prior decreases steadily, indicating progressive prototype specialization guided by target semantics. H.3 Ablation: Hub vs. Pairwise OT with Global Gating (Multi-source) To isolate the contribution of the shared-prototype hub in our multi-source setting, we compare (i) Ours (Hub): aligning both sources and the t… view at source ↗

**Figure 16.** Figure 16: Comparison of balanced and unbalanced OT in the hub-based alignment (CD,BJ [PITH_FULL_IMAGE:figures/full_fig_p025_16.png] view at source ↗

**Figure 17.** Figure 17: Sensitivity of SCOT to the contrastive weight [PITH_FULL_IMAGE:figures/full_fig_p026_17.png] view at source ↗

**Figure 18.** Figure 18: Effect of contrastive weight η on embedding alignment (XA→BJ). t-SNE visualizations of region embeddings under three values: η = 0 (left), η = 0.5 (middle), and η = 5 (right). Small η leads to weak alignment, moderate η yields well-interleaved yet structured embeddings, while overly large η results in excessive mixing and degraded structure. 26 [PITH_FULL_IMAGE:figures/full_fig_p026_18.png] view at source ↗

**Figure 19.** Figure 19: t-SNE visualization of region embeddings under different alignment weights [PITH_FULL_IMAGE:figures/full_fig_p027_19.png] view at source ↗

**Figure 20.** Figure 20: visualizes how the contrastive temperature τ shapes alignment. With small τ (0.03), the two cities are less interleaved, suggesting weaker correspondence propagation. A moderate τ (0.1) produces clear interleaving while maintaining cluster structure. When τ is large (1), embeddings become overly smoothed and heavily overlapped, consistent with degraded transfer [PITH_FULL_IMAGE:figures/full_fig_p027_20.png] view at source ↗

**Figure 21.** Figure 21: Sensitivity to target-prior temperature τb for multi-source transfer CD,BJ→XA. Solid: MAE. Dashed: MAPE. Lower is better. Hub-usage diagnostics. We quantify how the target city uses the hub prototypes by the OT column marginal pk = P i Πik and report its normalized entropy H(p)/ log K and effective prototype count exp(H(p)). As shown in [PITH_FULL_IMAGE:figures/full_fig_p028_21.png] view at source ↗

**Figure 22.** Figure 22: Target hub usage diagnostics (XA) under varying [PITH_FULL_IMAGE:figures/full_fig_p028_22.png] view at source ↗

read the original abstract

Cross-city transfer improves prediction in label-scarce cities by leveraging labeled data from other cities, but it becomes challenging when cities adopt incompatible partitions and no ground-truth region correspondences exist. Existing approaches either rely on heuristic region matching, which is often sensitive to anchor choices, or perform distribution-level alignment that leaves correspondences implicit and can be unstable under strong heterogeneity. We propose SCOT, a cross-city representation learning framework that learns explicit soft correspondences between unequal region sets via Sinkhorn-based entropic optimal transport. SCOT further sharpens transferable structure with an OT-weighted contrastive objective and stabilizes optimization through a cycle-style reconstruction regularizer. For multi-source transfer, SCOT aligns each source and the target to a shared prototype hub using balanced entropic transport guided by a target-induced prototype prior. Across real-world cities and tasks, SCOT consistently improves transfer accuracy and robustness, while the learned transport couplings and hub assignments provide interpretable diagnostics of alignment quality.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

SCOT gives a concrete OT pipeline for mismatched region partitions in cross-city transfer, but the soft correspondences are only indirectly supported by accuracy gains.

read the letter

SCOT's main move is to replace heuristic region matching or implicit distribution alignment with explicit soft correspondences learned via Sinkhorn entropic OT, then sharpen them with an OT-weighted contrastive loss and stabilize via cycle reconstruction. For multiple sources it adds a target-induced prototype hub that each city aligns to through balanced transport. That combination is new enough to be worth looking at if you work on spatial transfer where zoning schemes differ across cities. It directly tackles a recurring deployment headache in traffic, pollution, or infrastructure prediction. The framework is cleanly described and the interpretability angle via the learned couplings is a reasonable selling point if the transport plans turn out to be faithful. The soft spot is that the central premise—recovering useful correspondences without any ground-truth matches—rests on downstream accuracy improvements rather than any direct test. No synthetic ground-truth check or comparison against known functional similarities appears in the abstract, so it remains possible that the contrastive or cycle terms are doing most of the work. Experiments are asserted to show consistent gains, but without the actual baselines, ablations, or statistical details visible here it is difficult to judge how robust or general the result is. This paper is aimed at people doing urban computing or domain adaptation on spatial data who need to move models to new cities quickly. A reader who already uses OT or contrastive methods in transfer settings will get the most out of it. It is coherent enough and targets a real barrier, so it deserves a serious referee even if the correspondence validation will probably need tightening.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes SCOT, a cross-city transfer learning framework that uses Sinkhorn-based entropic optimal transport to learn explicit soft correspondences between incompatible region partitions across cities without ground-truth matches. It augments the transport plan with an OT-weighted contrastive objective and a cycle-style reconstruction regularizer, and for multi-source settings aligns sources and target to a shared prototype hub via balanced entropic transport with a target-induced prior. The central claim is that this yields consistent accuracy and robustness gains on real-world urban tasks while the learned couplings and hub assignments supply interpretable diagnostics of alignment quality.

Significance. If the soft correspondences are verifiably meaningful rather than optimization artifacts and the reported gains prove robust, SCOT would offer a principled advance in handling heterogeneous partitions for urban transfer learning, moving beyond heuristic matching or implicit distribution alignment. The explicit interpretability of couplings is a potential strength for diagnostics in applied settings.

major comments (2)

[Abstract and §3] Abstract and §3: The load-bearing premise that entropic OT recovers useful soft region correspondences between incompatible partitions (without any ground-truth matches) is asserted but supported only indirectly through downstream transfer accuracy. No direct validation—such as quantitative metrics on coupling fidelity, synthetic ground-truth benchmarks, or ablations isolating the OT component from the contrastive loss and cycle regularizer—is provided, so gains could arise from the auxiliary objectives alone.
[§4] §4 (Experiments): The evaluation reports accuracy improvements but supplies no details on baselines, number of runs, statistical tests, ablation results on the cycle regularizer or hub prior, or robustness checks under varying heterogeneity. This prevents verification of the 'consistent' and 'robust' claims and leaves the interpretability of the learned couplings untested.

minor comments (2)

The notation for the shared prototype hub, transport couplings, and target-induced prior would benefit from a consolidated symbol table or explicit pseudocode to improve readability.
Figure captions and axis labels in the experimental plots could more explicitly indicate which components (e.g., OT vs. contrastive) are ablated or compared.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We address each of the major comments in detail below, and we will incorporate revisions to address the concerns raised.

read point-by-point responses

Referee: [Abstract and §3] Abstract and §3: The load-bearing premise that entropic OT recovers useful soft region correspondences between incompatible partitions (without any ground-truth matches) is asserted but supported only indirectly through downstream transfer accuracy. No direct validation—such as quantitative metrics on coupling fidelity, synthetic ground-truth benchmarks, or ablations isolating the OT component from the contrastive loss and cycle regularizer—is provided, so gains could arise from the auxiliary objectives alone.

Authors: We acknowledge the referee's point that the support for the utility of the entropic OT soft correspondences is primarily through downstream performance. To provide more direct evidence, we will add synthetic experiments with ground-truth correspondences to measure the quality of the recovered soft matches, as well as ablations isolating the OT objective from the contrastive and cycle components. These will be added to the revised manuscript in an expanded §3 and §4. revision: yes
Referee: [§4] §4 (Experiments): The evaluation reports accuracy improvements but supplies no details on baselines, number of runs, statistical tests, ablation results on the cycle regularizer or hub prior, or robustness checks under varying heterogeneity. This prevents verification of the 'consistent' and 'robust' claims and leaves the interpretability of the learned couplings untested.

Authors: We appreciate this feedback on the experimental rigor. We agree that additional details are required to allow verification of the consistency and robustness claims. In the revision, we will: provide a complete description of all baselines; report results averaged over multiple independent runs (with standard deviations); include statistical significance tests; add ablations isolating the cycle regularizer and hub prior; and perform robustness checks under varying degrees of partition heterogeneity. We will also add quantitative metrics to support the interpretability of the learned couplings. These enhancements will be included in the revised §4. revision: yes

Circularity Check

0 steps flagged

No significant circularity: SCOT combines standard OT primitives with new objectives without self-referential reduction

full rationale

The derivation applies Sinkhorn entropic OT to recover soft correspondences between region partitions, then augments it with an OT-weighted contrastive loss and cycle reconstruction regularizer. These steps use established optimal-transport machinery and introduce explicit new loss terms rather than fitting a parameter to data and relabeling the fit as a prediction. No equation reduces a claimed output to an input by construction, no uniqueness theorem is imported from self-citation, and no ansatz is smuggled via prior work. The central claim therefore remains an independent modeling choice whose performance can be evaluated against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entities

Abstract-only review; framework rests on standard entropic OT assumptions plus paper-specific regularizers and a new prototype-hub construct whose independent evidence is not supplied.

axioms (2)

domain assumption Entropic optimal transport produces useful soft correspondences between heterogeneous region partitions
Core premise enabling explicit alignment without ground-truth matches.
ad hoc to paper Cycle-style reconstruction regularizer stabilizes optimization of the transport plan
Introduced specifically to stabilize training in this setting.

invented entities (1)

shared prototype hub no independent evidence
purpose: Serves as common alignment target for multiple source cities and the target city
New construct for multi-source transfer guided by a target-induced prior.

pith-pipeline@v0.9.0 · 5471 in / 1367 out tokens · 47738 ms · 2026-05-12T01:23:48.775171+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Reference graph

Works this paper leans on

26 extracted references · 26 canonical work pages · 1 internal anchor

[1]

Using optimal transport as alignment objective for fine-tuning multilingual contextualized embeddings.arXiv preprint arXiv:2110.02887,

Alqahtani, S., Lalwani, G., Zhang, Y., Romeo, S., and Mansour, S. Using optimal transport as alignment objective for fine-tuning multilingual contextualized embeddings.arXiv preprint arXiv:2110.02887,

work page arXiv
[2]

Storm-gan: spatio-temporal meta-gan for cross-city estimation of human mobility responses to covid-19

Bao, H., Zhou, X., Xie, Y., Li, Y., and Jia, X. Storm-gan: spatio-temporal meta-gan for cross-city estimation of human mobility responses to covid-19. InProceedings of the 2022 IEEE International Conference on Data Mining, ICDM 2022, Orlando, FL, USA, November 28 - December 1, pp. 1–10,

work page 2022
[3]

& Yahav, E

Brody, S., Alon, U., and Yahav, E. How atten- tive are graph attention networks?arXiv preprint arXiv:2105.14491,

work page arXiv
[4]

When transfer learning meets cross-city urban flow prediction: Spatio-temporal adaptation matters

Fang, Z., Wu, D., Pan, L., Chen, L., and Gao, Y. When transfer learning meets cross-city urban flow prediction: Spatio-temporal adaptation matters. InProceedings of the 2022 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022, Sands Expo & Convention Centre, Singapore, May 22-27, volume 22, pp. 2030–2036,

work page 2022
[5]

Transferable graph structure learning for graph-based traffic forecasting across cities

Jin, Y., Chen, K., and Yang, Q. Transferable graph structure learning for graph-based traffic forecasting across cities. InProceedings of the Twenty-Ninth ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023, Long Beach, CA, USA, Au- gust 6-10, pp. 1032–1043,

work page 2023
[6]

and Oh, A

Kim, D. and Oh, A. How to find your friendly neigh- borhood: Graph attention design with self-supervision. arXiv preprint arXiv:2204.04879,

work page arXiv
[7]

Modeling network- level traffic flow transitions on sparse data

Lei, X., Mei, H., Shi, B., and Wei, H. Modeling network- level traffic flow transitions on sparse data. InProceed- ings of the Twenty-Eighth ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2022, Washington DC Convention Center, USA, August 14- 18, pp. 835–845,

work page 2022
[8]

Few-sample traffic pre- diction with graph networks using locale as relational inductive biases.IEEE Transactions on Intelligent Transportation Systems, 24(2):1894–1908,

Li, M., Tang, Y., and Ma, W. Few-sample traffic pre- diction with graph networks using locale as relational inductive biases.IEEE Transactions on Intelligent Transportation Systems, 24(2):1894–1908,

work page 1908
[9]

Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting

Li, Y., Yu, R., Shahabi, C., and Liu, Y. Diffusion convo- lutional recurrent neural network: Data-driven traffic forecasting.arXiv preprint arXiv:1707.01926,

work page Pith review arXiv
[10]

Frequency enhanced pre-training for cross-city few-shot traffic forecasting

Liu, Z., Ding, J., and Zheng, G. Frequency enhanced pre-training for cross-city few-shot traffic forecasting. arXiv preprint arXiv:2406.02614,

work page arXiv
[11]

Spatio-temporal graph few-shot learning with cross- city knowledge transfer

11 SCOT: Multi-Source Cross-City Transfer with Optimal-Transport Soft-Correspondence Objectives Lu, B., Gan, X., Zhang, W., Yao, H., Fu, L., and Wang, X. Spatio-temporal graph few-shot learning with cross- city knowledge transfer. InProceedings of the Twenty- Eighth ACM SIGKDD Conference on Knowledge Dis- covery and Data Mining, KDD 2022, Washington DC Co...

work page 2022
[12]

H., Lam, W

Tang, Y., Qu, A., Chow, A. H., Lam, W. H., Wong, S. C., and Ma, W. Domain adversarial spatial-temporal net- work: A transferable framework for short-term traffic forecasting across cities. InProceedings of the Thirty- First ACM International Conference on Information and Knowledge Management, CIKM 2022, Atlanta, GA, USA, October 17-21, pp. 1905–1915,

work page 2022
[13]

Graph Attention Networks

Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P., and Bengio, Y. Graph attention networks. arXiv preprint arXiv:1710.10903,

work page internal anchor Pith review Pith/arXiv arXiv
[14]

Wang, L., Geng, X., Ma, X., Liu, F., and Yang, Q. Cross- city transfer learning for deep spatio-temporal predic- tion.Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, pp. 1893–1899,

work page 2019
[15]

Areatransfer: A cross-city crowd flow prediction frame- work based on transfer learning

Wei, X., Guo, T., Yu, H., Li, Z., Guo, H., and Li, X. Areatransfer: A cross-city crowd flow prediction frame- work based on transfer learning. InProceedings of the International Conference on Smart Computing and Communications, ICSCC 2021, New York, USA, De- cember 29, pp. 238–253. Springer,

work page 2021
[16]

Transfer knowledge between cities

Wei, Y., Zheng, Y., and Yang, Q. Transfer knowledge between cities. InProceedings of the Twenty-Second ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2016, San Francisco, California, USA, August 13-17, pp. 1905–1914,

work page 2016
[17]

Graph wavenet for deep spatial-temporal graph model- ing.arXiv preprint arXiv:1906.00121,

Wu, Z., Pan, S., Long, G., Jiang, J., and Zhang, C. Graph wavenet for deep spatial-temporal graph model- ing.arXiv preprint arXiv:1906.00121,

work page arXiv 1906
[18]

Can-st: Clustering adaptive normalization for spatio-temporal ood learning

Yang, M., An, Y., Deng, J., Li, X., Xu, B., Zhong, J., Lu, X., and Gong, Y. Can-st: Clustering adaptive normalization for spatio-temporal ood learning. In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, pp. 3543–3551, 2025a. Yang, M., Li, X., Xu, B., Nie, X., Zhao, M., Zhang, C., Zheng, Y., and Gong, Y. Stda: Sp...

work page 2019
[19]

Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting.arXiv preprint arXiv:1709.04875,

Yu, B., Yin, H., and Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting.arXiv preprint arXiv:1709.04875,

work page arXiv
[20]

Cross-city transfer learning tackles data scarcity and high labeling costs in urban computing by transferring knowledge from well-instrumented source cities to label-scarce targets

13 SCOT: Multi-Source Cross-City Transfer with Optimal-Transport Soft-Correspondence Objectives A Related Work A.1 Cross-city transfer. Cross-city transfer learning tackles data scarcity and high labeling costs in urban computing by transferring knowledge from well-instrumented source cities to label-scarce targets. FLORAL demonstrates early cross-city mu...

work page 2016
[21]

and transferable graph structure learning (TransGTR) (Jin et al., 2023), alongside region-level transfer with connectivity/parameter generation (CARPG) (Yang et al.,

work page 2023
[22]

and one-stage embedding-plus-alignment frameworks (CoRE) (Chen et al.). Overall, the key challenge is local and selective alignment across unequal, non-corresponding region sets while preserving city-internal structure and task-relevant semantics, motivating our approach. A.2 Spatio-temporal representation learning. Spatio-temporal representation learning...

work page 2025
[23]

and task-specific self-supervised objectives for traffic forecasting (Ji et al., 2023). However, these methods are mainly developed for single-city or homogeneous node sets and often assume aligned node identities or comparable graph structures; in cross-city settings with heterogeneous partitions, unequal region counts, and no natural correspondence, str...

work page 2023
[24]

and Sinkhorn-type objectives/divergences that are GPU-friendly and balance geometric sensitivity with statistical stability (Genevay et al., 2018; Feydy et al., 2019). OT is also a core tool for distribution alignment in domain adaptation and transfer: OT-based DA aligns source and target by optimizing a coupling with optional structure-preserving regular...

work page 2018
[25]

and SuperGAT (Kim & Oh, 2022), keeping all alignment objectives and training hyperparameters unchanged. Table 10 shows only marginal variation across encoders, indicating that the representational capacity of the backbone is not the performance bottleneck and that the alignment module is the primary driver of cross-city transfer quality. Downstream readou...

work page 2022
[26]

collapse

Figure 19: t-SNE visualization of region embeddings under different alignment weightsλalign (XA→BJ). Blue circles denote source regions and orange triangles denote target regions. I.3 Sensitivity toτ. Figure 20 visualizes how the contrastive temperatureτshapes alignment. With smallτ(0.03), the two cities are less interleaved, suggesting weaker corresponde...

work page 2008