
arxiv: 2605.15888 · v1 · pith:WS5AECQHnew · submitted 2026-05-15 · 💻 cs.LG · cs.AI

CHoE: Cross-Domain Heterogeneous Graph Prompt Learning via Structure-Conditioned Experts

Pith reviewed 2026-05-20 20:10 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords heterogeneous graph · prompt learning · cross-domain · expert network · meta-path · few-shot · routing

The pith

CHoE adapts heterogeneous graph prompt learning to cross-domain settings using structure-conditioned experts and routing.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CHoE to extend heterogeneous graph prompt learning to cross-domain scenarios, where the pre-training and downstream data distributions differ. During pre-training, it trains structure-conditioned experts on meta-path views from the source domain. During prompt tuning, it uses structure-aware expert routing and load balancing to select compatible experts for the target domain, along with a prompt-based semantic fusion module that combines the multi-view representations. This setup aims to prevent the performance degradation seen in prior methods designed only for in-domain use. If the claim holds, pre-trained graph models would become more practical in real-world settings with domain shift and limited labels.

Core claim

The authors claim that CHoE, built on an expert network, trains structure-conditioned experts during pre-training and applies structure-aware expert routing with load balancing during prompt tuning, plus semantic fusion, resulting in consistent improvements for few-shot cross-domain heterogeneous graph prompt learning over baselines.

What carries the argument

Structure-conditioned experts that are trained on specific meta-path views and routed based on structural compatibility to adapt to new domains.
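
The page does not reproduce the paper's routing equations, so the following is only a minimal sketch of what structure-aware routing with a load-balancing term could look like. It assumes each meta-path view is summarized by a structure descriptor vector and each expert by a key vector (both hypothetical), scores compatibility with a dot product, and uses the generic auxiliary loss common in sparse mixture-of-experts models (number of experts times the sum over experts of dispatch fraction times mean routing probability). None of the names below come from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_views(view_descriptors, expert_keys, top_k=1):
    """Route each meta-path view to its top-k most compatible experts.

    view_descriptors and expert_keys are hypothetical stand-ins for
    whatever structural summaries the real method uses.
    """
    probs, chosen = [], []
    for descriptor in view_descriptors:
        # Compatibility score: dot product of view descriptor and expert key.
        scores = [sum(a * b for a, b in zip(descriptor, key)) for key in expert_keys]
        p = softmax(scores)
        order = sorted(range(len(p)), key=lambda i: p[i], reverse=True)
        probs.append(p)
        chosen.append(order[:top_k])
    return probs, chosen


def load_balance_loss(probs, chosen):
    """Generic MoE auxiliary loss: n_experts * sum_i f_i * P_i.

    f_i is the fraction of assignments sent to expert i and P_i is the
    mean routing probability of expert i. The value is 1.0 when usage is
    uniform and grows as routing concentrates on few experts.
    """
    n_experts = len(probs[0])
    n_views = len(probs)
    total_assignments = sum(len(c) for c in chosen)
    f = [sum(c.count(i) for c in chosen) / total_assignments for i in range(n_experts)]
    mean_p = [sum(p[i] for p in probs) / n_views for i in range(n_experts)]
    return n_experts * sum(fi * pi for fi, pi in zip(f, mean_p))
```

In a training loop this loss would typically be added to the task loss with a small weight; whether CHoE weights or defines it this way is not stated on this page.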

If this is right

  • CHoE improves performance in few-shot cross-domain applications.
  • It outperforms all baseline approaches in such settings.
  • The routing mechanism helps handle distribution shifts without major degradation.
  • Semantic fusion integrates representations from multiple views effectively.
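
The semantic fusion named in the last bullet is described only at a high level here. One plausible reading, sketched below under stated assumptions, is an attention-style weighted sum in which a learnable prompt vector scores each view embedding. The function name, the dot-product scoring, and the softmax weighting are illustrative choices, not the paper's confirmed design.

```python
import math


def fuse_views(view_embeddings, prompt):
    """Fuse per-view embeddings into one vector using a prompt as the query.

    Assumption: each view embedding and the prompt share one dimension.
    Returns the fused vector and the per-view attention weights.
    """
    # Score each view by its dot product with the prompt vector.
    scores = [sum(p * h for p, h in zip(prompt, emb)) for emb in view_embeddings]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(prompt)
    # Weighted sum of the view embeddings, one coordinate at a time.
    fused = [
        sum(w * emb[j] for w, emb in zip(weights, view_embeddings))
        for j in range(dim)
    ]
    return fused, weights
```

Under this reading only the prompt needs to be tuned in the target domain, which would fit the few-shot setting, but that is an inference rather than something the page states.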

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The expert routing idea might apply to homogeneous graphs or other data types with structural shifts.
  • It could reduce reliance on collecting target domain data for pre-training.
  • Testing the load balancing in scenarios with many experts would check for scalability.

Load-bearing premise

Structure-conditioned experts from the source domain can be routed to handle shifts in target domains without significant performance loss or expert collapse.
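
Expert collapse in this premise can be made measurable. A simple diagnostic, offered here as a sketch rather than anything the paper reports, is the normalized entropy of expert usage over the routing decisions made on the target domain: values near 0 mean almost all views go to one expert, values near 1 mean usage is spread evenly. The function name and input format are assumptions.

```python
import math
from collections import Counter


def expert_usage_entropy(assignments, n_experts):
    """Normalized entropy of expert usage, in [0, 1].

    assignments: list of expert indices, one per routing decision.
    Returns 0.0 under full collapse and 1.0 under uniform usage.
    """
    if n_experts < 2:
        raise ValueError("need at least two experts")
    if not assignments:
        raise ValueError("need at least one routing decision")
    counts = Counter(assignments)
    total = len(assignments)
    entropy = 0.0
    for i in range(n_experts):
        p = counts.get(i, 0) / total
        if p > 0:
            entropy -= p * math.log(p)
    # Divide by the maximum possible entropy, log(n_experts).
    return entropy / math.log(n_experts)
```

Tracking this value with and without the load-balancing term would show whether balancing is what prevents collapse.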

What would settle it

If on a cross-domain test set the performance of CHoE with structure-aware routing equals that of a version with random expert selection, the value of the conditioning and routing would be called into question.
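
This comparison can be run as a paired test across random seeds. The sketch below assumes one accuracy value per seed for each variant, with seeds matched between the two lists, and uses an exact sign-flip permutation test on the per-seed differences. It is one reasonable protocol, not one taken from the paper, and it enumerates all 2^n sign patterns, so it is meant for a small number of seeds.

```python
from itertools import product


def paired_sign_flip_test(structured_scores, random_scores):
    """Exact two-sided sign-flip test on paired per-seed scores.

    Returns (mean difference, p-value). A large p-value means the data
    cannot distinguish structure-aware routing from random selection.
    """
    if len(structured_scores) != len(random_scores) or not structured_scores:
        raise ValueError("need two non-empty lists of equal length")
    diffs = [s - r for s, r in zip(structured_scores, random_scores)]
    n = len(diffs)
    observed = sum(diffs) / n
    extreme = 0
    total = 0
    # Under the null hypothesis each difference is equally likely to be
    # positive or negative, so every sign pattern is equally probable.
    for signs in product((1, -1), repeat=n):
        flipped = sum(sign * d for sign, d in zip(signs, diffs)) / n
        if abs(flipped) >= abs(observed) - 1e-12:
            extreme += 1
        total += 1
    return observed, extreme / total
```

With n seeds the smallest attainable p-value is 2 / 2^n, so at least five or six seeds are needed before the test can reject at conventional thresholds.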

Figures

Figures reproduced from arXiv: 2605.15888 by Di Jin, Dongxiao He, Jitao Zhao, Peiyuan Li, Weixiong Zhang, Yongqi Huang.

Figure 1: Motivation Experiments. We present the performance …
Figure 2: Overall framework of CHoE. In the pre-training stage, we adopt a generative framework to train an encoder. In the fine-tuning stage, …
Figure 3: Cross-domain node classification over four datasets under different shot settings.
Figure 4: Hyperparameter analysis. Sensitivity of the balancing co…
Original abstract

Heterogeneous Graph Prompt Learning (HGPL)has emerged as a promising paradigm for bridging the gap between the objectives of pre-training foundation models and their downstream applications in heterogeneous graph settings. However, existing HGPL methods are primarily designed for in-domain scenarios, whereas real-world deployments often span multiple domains, and the data used for pre-training and downstream tasks may originate from different distributions. Consequently, the applicability of current HGPL approaches is limited to in-domain settings, and their performance typically degrades when application domains shift. To address this serious limitation, we develop CHoE, a cross-domain HGPL method built upon an expert network. During pre-training, we introduce and train structure-conditioned experts, and during prompt tuning, we adopt a structure-aware expert routing and load balancing mechanism to select structurally compatible experts for each meta-path view. In addition, we design a prompt-based semantic fusion module to integrate representations across multiple views for downstream prediction. Extensive experiments show that CHoE consistently improves performance in few-shot cross-domain applications, outperforming all baseline approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes CHoE, a cross-domain heterogeneous graph prompt learning method. It trains structure-conditioned experts during pre-training on source-domain meta-path views and, during prompt tuning, applies structure-aware expert routing plus load balancing to select compatible experts for target domains, followed by prompt-based semantic fusion across views. The central claim is that this yields consistent performance gains over baselines in few-shot cross-domain heterogeneous graph tasks.

Significance. If the routing mechanism reliably transfers without expert collapse or degradation under domain shift, the work would meaningfully extend heterogeneous graph prompt learning beyond in-domain settings to practical multi-domain deployments, where pre-training and downstream data distributions commonly differ.

major comments (2)
  1. [Abstract] The abstract states that CHoE 'consistently improves performance in few-shot cross-domain applications, outperforming all baseline approaches,' yet supplies no quantitative metrics, dataset names, ablation results, or statistical significance tests. This directly affects assessment of whether the empirical support for the central claim is robust.
  2. [Method (prompt tuning and routing)] The structure-aware expert routing and load-balancing mechanism (described in the prompt-tuning stage) lacks any reported measure or ablation of structural compatibility between source and target meta-paths, such as meta-path type/length overlap or basic graph statistics. Without this, it is unclear whether the no-degradation assumption holds when domains differ substantially, which is load-bearing for the cross-domain transfer claim.
minor comments (1)
  1. [Abstract] The abstract contains a typographical error: 'HGPL)has' should be 'HGPL has'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight opportunities to strengthen the presentation of empirical results and the validation of our cross-domain assumptions. We address each major comment below and outline the revisions we will make.

  1. Referee: [Abstract] The abstract states that CHoE 'consistently improves performance in few-shot cross-domain applications, outperforming all baseline approaches,' yet supplies no quantitative metrics, dataset names, ablation results, or statistical significance tests. This directly affects assessment of whether the empirical support for the central claim is robust.

    Authors: We agree that the abstract would be more informative with specific quantitative details. In the revised manuscript we will update the abstract to include representative performance gains (e.g., average accuracy or AUC improvements), the primary datasets used, and reference to statistical significance testing across the few-shot cross-domain settings. revision: yes

  2. Referee: [Method (prompt tuning and routing)] The structure-aware expert routing and load-balancing mechanism (described in the prompt-tuning stage) lacks any reported measure or ablation of structural compatibility between source and target meta-paths, such as meta-path type/length overlap or basic graph statistics. Without this, it is unclear whether the no-degradation assumption holds when domains differ substantially, which is load-bearing for the cross-domain transfer claim.

    Authors: We acknowledge that direct measures of structural compatibility are not reported in the current version. While our overall experimental results show consistent gains under domain shift, we will add a dedicated analysis subsection that quantifies meta-path type/length overlap and basic graph statistics (e.g., node/edge degree distributions) between the source and target domains used in our experiments, together with an ablation on how these factors correlate with routing behavior and performance. revision: yes
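
The overlap analysis promised in this response could take several forms. As a sketch only: because node types differ across domains, meta-paths can be compared by shape, replacing each node type with the index of its first occurrence so that two paths with the same symmetric pattern match. Jaccard similarity over shapes and over lengths then gives two simple compatibility scores. All function names and the example meta-paths are hypothetical, and length here counts node types rather than edges.

```python
def metapath_shape(metapath):
    """Abstract node-type names away, e.g. ('A', 'P', 'A') -> (0, 1, 0)."""
    index = {}
    shape = []
    for node_type in metapath:
        if node_type not in index:
            index[node_type] = len(index)
        shape.append(index[node_type])
    return tuple(shape)


def jaccard(a, b):
    """Jaccard similarity of two collections; 1.0 if both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def structural_overlap(source_metapaths, target_metapaths):
    """Shape and length overlap between two sets of meta-paths."""
    src_shapes = {metapath_shape(m) for m in source_metapaths}
    tgt_shapes = {metapath_shape(m) for m in target_metapaths}
    src_lengths = {len(m) for m in source_metapaths}
    tgt_lengths = {len(m) for m in target_metapaths}
    return {
        "shape_jaccard": jaccard(src_shapes, tgt_shapes),
        "length_jaccard": jaccard(src_lengths, tgt_lengths),
    }


# Illustrative placeholder meta-paths, not the paper's datasets.
example = structural_overlap(
    [("A", "P", "A"), ("A", "P", "C", "P", "A")],
    [("M", "D", "M")],
)
```

Correlating these scores with per-pair transfer accuracy would address the referee's question of whether the gains hold when source and target structures differ substantially.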

Circularity Check

0 steps flagged

No circularity: CHoE is presented as an independent architectural construction validated by experiments.


The abstract and description frame CHoE as a new method with structure-conditioned experts trained during pre-training and structure-aware routing plus semantic fusion applied during prompt tuning. No equations, derivations, or predictions are described that reduce by construction to fitted inputs or self-citations. The central claims rest on empirical improvements in few-shot cross-domain settings rather than tautological reductions. This matches the default case of a self-contained construction with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no free parameters, axioms, or invented entities are specified in the provided text.

pith-pipeline@v0.9.0 · 5730 in / 983 out tokens · 39109 ms · 2026-05-20T20:10:39.617939+00:00 · methodology


