CHoE: Cross-Domain Heterogeneous Graph Prompt Learning via Structure-Conditioned Experts
Pith reviewed 2026-05-20 20:10 UTC · model grok-4.3
The pith
CHoE adapts heterogeneous graph prompt learning to cross-domain settings using structure-conditioned experts and routing.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that CHoE, a method built on an expert network, yields consistent improvements over baselines in few-shot cross-domain heterogeneous graph prompt learning. It trains structure-conditioned experts during pre-training, applies structure-aware expert routing with load balancing during prompt tuning, and integrates the resulting views through prompt-based semantic fusion.
What carries the argument
Structure-conditioned experts that are trained on specific meta-path views and routed based on structural compatibility to adapt to new domains.
If this is right
- CHoE improves performance in few-shot cross-domain applications.
- It outperforms all baseline approaches in such settings.
- The routing mechanism helps handle distribution shifts without major degradation.
- Semantic fusion integrates representations from multiple views effectively.
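The abstract does not spell out how the prompt-based semantic fusion module combines views, so the following is only a minimal sketch of one common pattern: score each meta-path view embedding against a prompt vector by dot product, softmax the scores, and take the weighted sum. The function name `fuse_views`, the dot-product scoring, and the use of a single prompt vector are all assumptions made for illustration, not the authors' design.

```python
import math
from typing import List, Sequence


def fuse_views(views: Sequence[Sequence[float]], prompt: Sequence[float]) -> List[float]:
    """Softmax-weighted sum of per-view embeddings (hypothetical fusion rule).

    Each view is scored by its dot product with `prompt`; this scoring rule
    is an assumption, not something stated in the paper's abstract.
    """
    if not views:
        raise ValueError("need at least one view")
    dim = len(prompt)
    if any(len(v) != dim for v in views):
        raise ValueError("every view must have the same dimension as the prompt")

    scores = [sum(a * b for a, b in zip(v, prompt)) for v in views]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum over views, one output coordinate at a time.
    return [sum(w * v[d] for w, v in zip(weights, views)) for d in range(dim)]
```

With a zero prompt every view receives the same weight, so the output is the plain average of the views; a learned prompt would shift weight toward the views it aligns with.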
Where Pith is reading between the lines
- The expert routing idea might apply to homogeneous graphs or other data types with structural shifts.
- It could reduce reliance on collecting target domain data for pre-training.
- Testing the load balancing in scenarios with many experts would check for scalability.
Load-bearing premise
Structure-conditioned experts from the source domain can be routed to handle shifts in target domains without significant performance loss or expert collapse.
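Expert collapse can be checked directly from routing logs. The sketch below is a diagnostic and not part of CHoE itself: given the expert index chosen for each routed item, it reports each expert's share of the load, the entropy of that distribution normalized to [0, 1], and the experts that were never selected. The name `load_balance_report` and the input format are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def load_balance_report(assignments: Iterable[int], num_experts: int) -> Dict[str, object]:
    """Summarize how evenly routed items are spread over experts."""
    if num_experts < 1:
        raise ValueError("num_experts must be at least 1")
    counts = Counter(assignments)
    if any(e < 0 or e >= num_experts for e in counts):
        raise ValueError("expert index out of range")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no assignments given")

    shares = [counts.get(e, 0) / total for e in range(num_experts)]
    entropy = -sum(p * math.log(p) for p in shares if p > 0)
    # Normalized entropy: 1.0 means perfectly even load, 0.0 means one expert
    # receives everything. With a single expert the load is trivially even.
    normalized = entropy / math.log(num_experts) if num_experts > 1 else 1.0

    return {
        "shares": shares,
        "normalized_entropy": normalized,
        "unused_experts": [e for e, p in enumerate(shares) if p == 0],
    }
```

A normalized entropy near zero, or a long `unused_experts` list, on target-domain data would be direct evidence against the premise.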
What would settle it
If on a cross-domain test set the performance of CHoE with structure-aware routing equals that of a version with random expert selection, the value of the conditioning and routing would be called into question.
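One way to run this check is a paired comparison over matched runs (same seed, same few-shot split) of the structure-aware router and a random-selection variant. The helper below is a sketch under that assumption; `paired_routing_gap` and its inputs are hypothetical names, and no scores from the paper are used.

```python
from statistics import mean
from typing import Dict, Sequence


def paired_routing_gap(structured: Sequence[float], random_sel: Sequence[float]) -> Dict[str, float]:
    """Compare per-run scores of structure-aware vs. random expert selection.

    Both sequences must be aligned: entry k of each comes from the same run.
    """
    if len(structured) != len(random_sel) or not structured:
        raise ValueError("need two non-empty, equally long score lists")
    diffs = [a - b for a, b in zip(structured, random_sel)]
    return {
        "mean_gap": mean(diffs),                 # > 0 favours structure-aware routing
        "wins": sum(d > 0 for d in diffs),       # runs where routing beats random
        "losses": sum(d < 0 for d in diffs),
        "ties": sum(d == 0 for d in diffs),
    }
```

A mean gap close to zero with wins and losses roughly balanced would be the outcome described above; a significance test over the paired differences would be the natural next step.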
Original abstract
Heterogeneous Graph Prompt Learning (HGPL)has emerged as a promising paradigm for bridging the gap between the objectives of pre-training foundation models and their downstream applications in heterogeneous graph settings. However, existing HGPL methods are primarily designed for in-domain scenarios, whereas real-world deployments often span multiple domains, and the data used for pre-training and downstream tasks may originate from different distributions. Consequently, the applicability of current HGPL approaches is limited to in-domain settings, and their performance typically degrades when application domains shift. To address this serious limitation, we develop CHoE, a cross-domain HGPL method built upon an expert network. During pre-training, we introduce and train structure-conditioned experts, and during prompt tuning, we adopt a structure-aware expert routing and load balancing mechanism to select structurally compatible experts for each meta-path view. In addition, we design a prompt-based semantic fusion module to integrate representations across multiple views for downstream prediction. Extensive experiments show that CHoE consistently improves performance in few-shot cross-domain applications, outperforming all baseline approaches.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes CHoE, a cross-domain heterogeneous graph prompt learning method. It trains structure-conditioned experts during pre-training on source-domain meta-path views and, during prompt tuning, applies structure-aware expert routing plus load balancing to select compatible experts for target domains, followed by prompt-based semantic fusion across views. The central claim is that this yields consistent performance gains over baselines in few-shot cross-domain heterogeneous graph tasks.
Significance. If the routing mechanism reliably transfers without expert collapse or degradation under domain shift, the work would meaningfully extend heterogeneous graph prompt learning beyond in-domain settings to practical multi-domain deployments, where pre-training and downstream data distributions commonly differ.
major comments (2)
- [Abstract] The abstract states that CHoE 'consistently improves performance in few-shot cross-domain applications, outperforming all baseline approaches,' yet supplies no quantitative metrics, dataset names, ablation results, or statistical significance tests. This directly affects assessment of whether the empirical support for the central claim is robust.
- [Method (prompt tuning and routing)] The structure-aware expert routing and load-balancing mechanism (described in the prompt-tuning stage) lacks any reported measure or ablation of structural compatibility between source and target meta-paths, such as meta-path type/length overlap or basic graph statistics. Without this, it is unclear whether the no-degradation assumption holds when domains differ substantially, which is load-bearing for the cross-domain transfer claim.
minor comments (1)
- [Abstract] The abstract contains a typographical error: a space is missing in '(HGPL)has', which should read '(HGPL) has'.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. The comments highlight opportunities to strengthen the presentation of empirical results and the validation of our cross-domain assumptions. We address each major comment below and outline the revisions we will make.
- Referee: [Abstract] The abstract states that CHoE 'consistently improves performance in few-shot cross-domain applications, outperforming all baseline approaches,' yet supplies no quantitative metrics, dataset names, ablation results, or statistical significance tests. This directly affects assessment of whether the empirical support for the central claim is robust.
  Authors: We agree that the abstract would be more informative with specific quantitative details. In the revised manuscript we will update the abstract to include representative performance gains (e.g., average accuracy or AUC improvements), the primary datasets used, and reference to statistical significance testing across the few-shot cross-domain settings.
  Revision: yes
- Referee: [Method (prompt tuning and routing)] The structure-aware expert routing and load-balancing mechanism (described in the prompt-tuning stage) lacks any reported measure or ablation of structural compatibility between source and target meta-paths, such as meta-path type/length overlap or basic graph statistics. Without this, it is unclear whether the no-degradation assumption holds when domains differ substantially, which is load-bearing for the cross-domain transfer claim.
  Authors: We acknowledge that direct measures of structural compatibility are not reported in the current version. While our overall experimental results show consistent gains under domain shift, we will add a dedicated analysis subsection that quantifies meta-path type/length overlap and basic graph statistics (e.g., node/edge degree distributions) between the source and target domains used in our experiments, together with an ablation on how these factors correlate with routing behavior and performance.
  Revision: yes
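The meta-path type/length overlap discussed in this exchange can be approximated with two simple statistics. The sketch below assumes a meta-path is given as a tuple of node-type names; it computes the Jaccard overlap of node types and the histogram intersection of meta-path lengths. The function names and the example schemas are hypothetical and are not taken from the paper's datasets.

```python
from collections import Counter
from typing import Sequence, Tuple

MetaPath = Tuple[str, ...]  # e.g. ("author", "paper", "author")


def type_overlap(source: Sequence[MetaPath], target: Sequence[MetaPath]) -> float:
    """Jaccard overlap of the node types that appear in each meta-path set."""
    src_types = {t for mp in source for t in mp}
    tgt_types = {t for mp in target for t in mp}
    union = src_types | tgt_types
    return len(src_types & tgt_types) / len(union) if union else 0.0


def length_overlap(source: Sequence[MetaPath], target: Sequence[MetaPath]) -> float:
    """Histogram intersection of the two normalized length distributions."""
    if not source or not target:
        return 0.0
    src = Counter(len(mp) for mp in source)
    tgt = Counter(len(mp) for mp in target)
    # Counter returns 0 for lengths that one side never uses.
    return sum(
        min(src[k] / len(source), tgt[k] / len(target))
        for k in src.keys() | tgt.keys()
    )


# Illustrative schemas only (an academic graph vs. a biomedical graph).
source_paths = [("author", "paper", "author"),
                ("author", "paper", "venue", "paper", "author")]
target_paths = [("drug", "gene", "drug"), ("drug", "disease", "drug")]

print(type_overlap(source_paths, target_paths))
print(length_overlap(source_paths, target_paths))
```

Across unrelated domains the type overlap will often be zero because node-type vocabularies differ, which is why length and degree statistics are the more informative part of such an analysis.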
Circularity Check
No circularity: CHoE is presented as an independent architectural construction validated by experiments.
The abstract and description frame CHoE as a new method with structure-conditioned experts trained during pre-training and structure-aware routing plus semantic fusion applied during prompt tuning. No equations, derivations, or predictions are described that reduce by construction to fitted inputs or self-citations. The central claims rest on empirical improvements in few-shot cross-domain settings rather than tautological reductions. This matches the default case of a self-contained construction with no load-bearing circular steps.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
  unclear: Relation between the paper passage and the cited Recognition theorem.
  structure-aware expert routing ... r_i^l = exp(s_i^{+l}/τ) / (exp(s_i^{+l}/τ) + exp(s_i^{-l}/τ)) ... load counter m_i ... prompt-based semantic fusion
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
  unclear: Relation between the paper passage and the cited Recognition theorem.
  structure-conditioned experts ... pre-training on source-domain meta-path views
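The routing expression quoted in the first passage is a two-way softmax over the scores s_i^{+l} and s_i^{-l} with temperature τ, which is algebraically the logistic function of (s_i^{+l} - s_i^{-l})/τ. The sketch below evaluates it in that form so that large scores do not overflow. The passage elides how the two scores and the load counter m_i are defined, so `s_pos` and `s_neg` are simply taken as given numbers and the load counter is not modelled.

```python
import math


def routing_weight(s_pos: float, s_neg: float, tau: float = 1.0) -> float:
    """r = exp(s_pos/tau) / (exp(s_pos/tau) + exp(s_neg/tau)).

    Evaluated as a logistic function of (s_pos - s_neg) / tau, which is the
    same quantity but stays finite for large scores.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    z = (s_pos - s_neg) / tau
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so this cannot overflow
    return e / (1.0 + e)
```

Equal scores give a weight of 0.5, and lowering `tau` sharpens the weight toward 0 or 1 for the same score difference.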
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.