
arxiv: 2512.06356 · v3 · submitted 2025-12-06 · 💻 cs.LG · cs.SI

Mitigating Structural Overfitting: A Distribution-Aware Rectification Framework for Missing Feature Imputation

Pith reviewed 2026-05-17 01:22 UTC · model grok-4.3

classification 💻 cs.LG cs.SI
keywords missing feature imputation · graph neural networks · structural overfitting · masked autoencoding · distribution rectification · inductive learning · node feature completion

The pith

The DART framework mitigates structural overfitting in missing feature imputation for graphs through global augmentation, masked autoencoding rectification, and test-time distribution correction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper claims that diffusion-based methods for imputing missing node features in graphs suffer from structural overfitting, which shows up as degraded results on disjoint graph components, loss of semantic variety through over-smoothing, and distribution mismatch when the method must generalize to new graph structures. DART counters this in three stages: it first applies global structural augmentation to link distant parts of the graph, then trains a semantic rectifier with masked autoencoding to capture the underlying feature manifold, and finally applies distribution rectification at inference time to pull biased imputations back onto the learned manifold. A sympathetic reader would care because incomplete features appear in many deployed graph systems, such as user profiling and cold-start recommendation, and a method that avoids retraining or performance collapse on new graphs could make these systems more usable in practice. The authors also release the Sailing dataset of voyage records, which contains genuinely missing attributes rather than synthetic masks.
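To make the first failure mode concrete, the following is a minimal sketch of a generic diffusion-style imputer on a toy graph with two disjoint components. It is not the paper's code: the function name `propagate_features`, the plain neighbour-mean update (a simplification of normalized-adjacency propagation), and the toy graph are all assumptions chosen for illustration.

```python
def propagate_features(adj, x, known, num_iters=40):
    """Impute a scalar feature by repeated neighbour averaging.

    adj:   dict node -> list of neighbours (undirected graph)
    x:     dict node -> observed value (only read for nodes in `known`)
    known: set of nodes whose feature is observed
    """
    # Unknown entries start at zero; known entries keep their observed value.
    h = {v: (x[v] if v in known else 0.0) for v in adj}
    for _ in range(num_iters):
        new_h = {}
        for v, nbrs in adj.items():
            if v in known:
                new_h[v] = x[v]  # reset observed features at every step
            elif nbrs:
                new_h[v] = sum(h[u] for u in nbrs) / len(nbrs)
            else:
                new_h[v] = h[v]  # isolated node: nothing to diffuse from
        h = new_h
    return h


# Toy graph (hypothetical) with two disjoint components: {0, 1, 2} and {3, 4}.
adj = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
x = {0: 1.0}  # only node 0 has an observed feature
imputed = propagate_features(adj, x, known={0})
```

In this toy run, nodes 1 and 2 drift toward the single observed value, which is the over-smoothing pattern in miniature, while nodes 3 and 4 never leave their zero initialization because no observed feature can reach their component.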

Core claim

DART first applies Global Structural Augmentation to create global correlations that bridge disjoint components and widen diffusion coverage. It then trains a semantic rectifier via masked autoencoding that learns the latent feature manifold and restores natural semantic details. At inference, a test-time distribution rectification step projects structurally biased imputed features back onto the learned manifold, closing the gap between training and unseen graph structures. Experiments show this combination yields higher accuracy than prior methods in both transductive and inductive regimes across six public datasets plus the new Sailing dataset with naturally missing attributes.

What carries the argument

The DART framework, built from Global Structural Augmentation to connect graph components, a masked-autoencoding semantic rectifier that learns the latent feature manifold, and a test-time distribution rectification module that maps biased features back onto that manifold.
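The masked-autoencoding signal behind the semantic rectifier can be sketched as follows: hide a random subset of feature entries, reconstruct them, and score the reconstruction only on the hidden entries. This is a sketch under stated assumptions, not the paper's rectifier: the column-mean "decoder" is a stand-in for a learned model, the mean squared error is an assumed loss (the paper's Eq. (8) is not reproduced here), and the 0.5 mask ratio and toy features are hypothetical.

```python
import random


def random_mask(num_rows, num_cols, mask_ratio, rng):
    """Return a boolean matrix; True marks an entry hidden from the model."""
    return [[rng.random() < mask_ratio for _ in range(num_cols)]
            for _ in range(num_rows)]


def column_mean_reconstruct(features, mask):
    """Stand-in decoder: fill hidden entries with the mean of visible ones."""
    num_rows, num_cols = len(features), len(features[0])
    recon = [row[:] for row in features]
    for j in range(num_cols):
        visible = [features[i][j] for i in range(num_rows) if not mask[i][j]]
        fill = sum(visible) / len(visible) if visible else 0.0
        for i in range(num_rows):
            if mask[i][j]:
                recon[i][j] = fill
    return recon


def masked_mse(features, recon, mask):
    """Mean squared error restricted to the hidden entries."""
    errors = [(features[i][j] - recon[i][j]) ** 2
              for i in range(len(features))
              for j in range(len(features[0]))
              if mask[i][j]]
    return sum(errors) / len(errors) if errors else 0.0


rng = random.Random(0)
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]  # toy data
mask = random_mask(len(features), len(features[0]), mask_ratio=0.5, rng=rng)
recon = column_mean_reconstruct(features, mask)
loss = masked_mse(features, recon, mask)
```

Restricting the loss to hidden entries is the design choice that matters here: the model gets no credit for copying visible values, so it has to use structure in the data to fill the gaps.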

If this is right

  • Performance remains stable rather than degrading when the graph contains disjoint components.
  • Semantic diversity in the imputed features is retained instead of being lost to over-smoothing.
  • Feature distributions stay aligned when the model is applied to new graph structures in inductive settings.
  • Imputation quality holds on data with real sparsity patterns rather than only synthetic masks, as measured on the Sailing dataset.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same test-time rectification idea could be tested on link-prediction or node-classification tasks where structural changes also induce distribution shifts.
  • The framework's separation of manifold learning from inference-time correction might transfer to non-graph settings such as tabular data with missing entries.
  • Benchmarks that rely only on random masking may systematically underestimate the difficulty of real-world missingness, suggesting a need for more datasets like Sailing.

Load-bearing premise

The premise has two parts: the masked autoencoding step must learn a latent feature manifold that truly captures natural semantic details, and the test-time rectification step must reliably map structurally biased imputations back onto this manifold, even for graphs never seen during training.

What would settle it

On a held-out collection of graphs with unseen structures and naturally missing features, if the imputed values produced by DART still show larger distribution shift or lower downstream task accuracy than strong diffusion baselines, the rectification mechanism has not succeeded in bridging the inductive gap.
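One hypothetical way to run the distribution-shift half of this check is to compare per-dimension statistics of imputed features against a reference set of observed features. The sketch below uses an average absolute standardized mean difference; the metric and the name `mean_shift_score` are assumptions for illustration, since the source does not say how shift would be measured.

```python
import statistics


def mean_shift_score(reference, imputed):
    """Average absolute standardized mean difference across feature dimensions.

    reference, imputed: lists of feature rows with the same number of columns.
    """
    num_dims = len(reference[0])
    scores = []
    for j in range(num_dims):
        ref_col = [row[j] for row in reference]
        imp_col = [row[j] for row in imputed]
        ref_std = statistics.pstdev(ref_col)
        gap = abs(statistics.fmean(ref_col) - statistics.fmean(imp_col))
        # Fall back to the raw gap when the reference column is constant.
        scores.append(gap / ref_std if ref_std > 0 else gap)
    return sum(scores) / num_dims


reference = [[0.0, 1.0], [2.0, 3.0]]   # toy observed features
shifted = [[1.0, 2.0], [3.0, 4.0]]     # same shape, every mean moved by 1
score = mean_shift_score(reference, shifted)
```

A score near zero means the imputed features match the reference means; comparing this score for DART and for a diffusion baseline on the same held-out graphs would give one side of the test described above.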

Figures

Figures reproduced from arXiv: 2512.06356 by Fenglin Yu, Jing Tang, Kai Han, Siya Qiu, Xingjian Tao, Yifan Song, Yihong Luo.

Figure 1. The analysis on the challenges of FP-based methods.
Figure 3. Algorithm 2: Co-label Linking
  Input: Graph G = (V, E), features X, labels y, parameters k, τ, candidate size M.
  Output: Graph G̃ = (V, Ẽ).
  Ẽ ← E;
  foreach node v_i ∈ V do
    // Sample candidate set
    Sample a candidate set C_i ⊂ V \ {v_i} of size M;
    // Compute distribution
    foreach v_j ∈ C_i do w_j ← sim(x_i, x_j);
    // Normalization
    Z ← Σ_{v_l ∈ C_i} exp(w_l / τ);
    foreach v_j ∈ C_i do p_j ← exp(w_j / τ) / Z;
    // Sample from cand…
Figure 3. Training and inference process of DDFI. … different distribution of imputation features while the GNN model for downstream tasks is trained based on X̃_tr. The distribution shift of the imputed features on test set will seriously degrade the performance of the FP-based framework on inductive tasks. In fact, designing a model that can adapt to both graph structure changes and different feature distributions t…
Figure 4. Performance under different missing rates for transductive learning (best viewed in color).
Figure 5. Visualization for transductive (Cora) and inductive (Flickr): raw features and features after reconstruction.
Figure 6. The construction of Sailing dataset. Our original Sailing dataset contains information about multiple ships, with each ship represented as a node in a graph. The navigation status of each ship serves as a label, with a total of n labels. Edges between nodes are constructed based on the ships’ longitude and latitude coordinates. We divide the entire ocean into several small areas using these coordinates. I…
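The visible steps of the Co-label Linking listing in the figure captions above (sample a candidate set, score each candidate by similarity, normalize with a temperature) can be sketched as follows. The listing is truncated at its sampling step, so the sketch stops at the probabilities and does not guess how links are drawn or how the labels y and the parameter k are used. Cosine similarity is an assumed choice for `sim`, and the function names are hypothetical.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two feature vectors (assumed form of sim)."""
    dot = sum(p * q for p, q in zip(a, b))
    norm = math.sqrt(sum(p * p for p in a)) * math.sqrt(sum(q * q for q in b))
    return dot / norm if norm > 0 else 0.0


def candidate_link_probabilities(features, i, num_candidates, tau, rng):
    """Return (candidates, probabilities) for node i.

    Mirrors the visible part of the listing: draw up to M candidates from
    V minus {v_i}, compute w_j = sim(x_i, x_j), then p_j = exp(w_j / tau) / Z.
    """
    others = [j for j in range(len(features)) if j != i]
    candidates = rng.sample(others, min(num_candidates, len(others)))
    weights = [cosine(features[i], features[j]) for j in candidates]
    # Subtracting the maximum leaves the softmax unchanged but avoids overflow.
    top = max(weights)
    exps = [math.exp((w - top) / tau) for w in weights]
    z = sum(exps)
    return candidates, [e / z for e in exps]


rng = random.Random(0)
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.5, 0.5]]  # toy data
cands, probs = candidate_link_probabilities(features, 0, 3, 0.5, rng)
```

A smaller τ concentrates probability on the most similar candidates, while a larger τ flattens the distribution toward uniform sampling.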
Original abstract

Incomplete node features are ubiquitous in real-world scenarios such as user profiling and cold-start recommendation, which severely hinders the practical deployment of graph learning systems (e.g., GNNs). Existing solutions typically rely on diffusion-based structural smoothing (e.g., feature propagation) to impute missing values. However, we find that these approaches suffer from structural overfitting, leading to three progressive challenges: 1) performance degradation on disjoint graphs, 2) loss of semantic diversity due to over-smoothing, and 3) feature distribution shift when generalizing to unseen graph structures (inductive tasks). To address these challenges, we introduce the DART framework. It begins by employing Global Structural Augmentation (GSA), which establishes global correlations to bridge disjoint components and extend diffusion coverage. Building upon this, we design a semantic rectifier based on masked autoencoding. This module learns the latent feature manifold to recover natural semantic details. Crucially, we introduce a test-time distribution rectification mechanism that projects structurally biased features back onto the learned manifold during inference, effectively bridging the inductive distribution gap. Furthermore, considering that synthetic masking fails to reflect real-world sparsity, we present a new dataset Sailing collected from voyage records with naturally missing attributes. Extensive experiments on six public datasets and Sailing demonstrate that DART significantly outperforms state-of-the-art methods in both transductive and inductive settings. Our code and dataset are available at https://github.com/yfsong00/DART.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript introduces the DART framework to mitigate structural overfitting in missing node feature imputation for graph learning systems. It proposes three modules: Global Structural Augmentation (GSA) to bridge disjoint graph components via global correlations, a semantic rectifier based on masked autoencoding to recover natural semantic details from a learned latent feature manifold, and a test-time distribution rectification mechanism to project structurally biased features back onto the manifold during inference for inductive generalization. The authors also contribute the Sailing dataset with naturally missing attributes from voyage records and report that DART outperforms state-of-the-art methods on six public datasets plus Sailing in both transductive and inductive settings.

Significance. If the central claims hold after addressing the noted issues, this work would offer a practical advance for real-world graph applications such as user profiling and cold-start recommendation by explicitly targeting distribution shift in inductive scenarios. The introduction of the Sailing dataset with genuine missingness patterns provides a useful benchmark beyond synthetic masking, and the public release of code and dataset supports reproducibility and follow-on research.

major comments (3)
  1. [Section 3.3, Eq. (8)] The semantic rectifier is positioned as learning a latent manifold that captures natural semantic details independent of training-graph artifacts, yet the masked autoencoding objective is defined only on the GSA-augmented training distribution. Without a derivation or bound showing that reconstruction error on held-out disjoint structures remains low, the inductive rectification step risks simply replaying training statistics rather than correcting genuine shift.
  2. [Section 4.3, Table 4, inductive ablation rows] The reported gains for DART on disjoint test graphs are not accompanied by an ablation that removes only the test-time rectification module while keeping GSA and the rectifier fixed. This omission makes it impossible to isolate whether the inductive improvement is attributable to the rectification mechanism or to the earlier augmentation steps.
  3. [Section 5.1] The Sailing dataset is introduced as containing naturally missing attributes, but the paper provides no quantitative comparison of its missingness pattern (e.g., fraction missing per node, correlation with graph structure) against the synthetic masking ratios used on the public datasets. This weakens the claim that results on Sailing specifically validate robustness to real-world sparsity.
minor comments (3)
  1. [Abstract] The abstract states performance gains on 'six public datasets' without naming them; explicitly listing the datasets (e.g., Cora, CiteSeer) would improve readability.
  2. [Figure 3] Figure 3 caption and surrounding text use 'distribution rectification' and 'test-time rectification' interchangeably; consistent terminology would reduce ambiguity.
  3. [Section 4.5] The hyperparameter sensitivity analysis in Section 4.5 reports masking ratios but does not include error bars or statistical significance tests across the five random seeds mentioned in the experimental setup.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and insightful comments on our manuscript. We address each major comment point by point below, providing clarifications and committing to revisions where the feedback identifies gaps in the current presentation or analysis.

  1. Referee: [Section 3.3, Eq. (8)] The semantic rectifier is positioned as learning a latent manifold that captures natural semantic details independent of training-graph artifacts, yet the masked autoencoding objective is defined only on the GSA-augmented training distribution. Without a derivation or bound showing that reconstruction error on held-out disjoint structures remains low, the inductive rectification step risks simply replaying training statistics rather than correcting genuine shift.

    Authors: We appreciate this observation on the theoretical grounding. The masked autoencoding objective is applied to the GSA-augmented training distribution precisely because GSA is designed to introduce global correlations that mitigate local structural artifacts, allowing the rectifier to focus on recovering semantic details from a broader manifold. While the current manuscript does not include a formal derivation or generalization bound, the inductive improvements reported in Tables 3 and 4 provide empirical support that the rectification corrects for distribution shift rather than merely replaying training statistics. In the revision we will expand Section 3.3 with a qualitative discussion of the manifold's generalization properties and include an additional plot of reconstruction error on held-out disjoint components to better substantiate the claim.

    Revision: partial

  2. Referee: [Section 4.3, Table 4, inductive ablation rows] The reported gains for DART on disjoint test graphs are not accompanied by an ablation that removes only the test-time rectification module while keeping GSA and the rectifier fixed. This omission makes it impossible to isolate whether the inductive improvement is attributable to the rectification mechanism or to the earlier augmentation steps.

    Authors: We agree that an ablation isolating the test-time rectification is required to clearly attribute the inductive gains. The current Table 4 reports the full DART model and a version without GSA, but does not include the requested configuration. We will add this ablation (GSA + rectifier only, without test-time rectification) to the inductive rows of Table 4 in the revised manuscript and discuss the resulting performance drop to quantify the module's specific contribution.

    Revision: yes

  3. Referee: [Section 5.1] The Sailing dataset is introduced as containing naturally missing attributes, but the paper provides no quantitative comparison of its missingness pattern (e.g., fraction missing per node, correlation with graph structure) against the synthetic masking ratios used on the public datasets. This weakens the claim that results on Sailing specifically validate robustness to real-world sparsity.

    Authors: We concur that a direct quantitative comparison would strengthen the motivation for the Sailing dataset. In the revised Section 5.1 we will include a new table reporting key missingness statistics for Sailing (average fraction of missing features per node, variance across nodes, and observed correlation with node degree or community structure) alongside the corresponding values for the synthetic masking ratios (e.g., 0.3, 0.5, 0.7) used on the public datasets, thereby clarifying how Sailing reflects real-world sparsity patterns.

    Revision: yes
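The missingness statistics named in the third response (average fraction missing per node, variance across nodes, correlation with node degree) are simple to compute once an observation mask is available. The sketch below is an illustration only: the function names, the input layout, and the toy values are assumptions, and nothing here reflects actual Sailing statistics.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def missingness_summary(observed_mask, degrees):
    """Summarize missingness per node.

    observed_mask[i][j] is True when feature j of node i is present.
    degrees[i] is the degree of node i in the graph.
    """
    frac_missing = [1.0 - sum(row) / len(row) for row in observed_mask]
    return {
        "mean_fraction_missing": statistics.fmean(frac_missing),
        "variance_fraction_missing": statistics.pvariance(frac_missing),
        "corr_with_degree": pearson(frac_missing, degrees),
    }


# Toy example (hypothetical values): lower-degree nodes miss more features.
observed_mask = [[True, True], [True, False], [False, False]]
degrees = [3, 2, 1]
summary = missingness_summary(observed_mask, degrees)
```

Under uniform random masking the correlation with degree should sit near zero, so a clearly non-zero value on a real dataset would be direct evidence that its missingness is structured rather than synthetic.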

Circularity Check

0 steps flagged

New components (GSA, semantic rectifier, test-time rectification) validated via external experiments on public and Sailing datasets; no derivation reduces to fitted inputs or self-citation by construction


The paper introduces GSA for global correlations, a masked-autoencoder semantic rectifier to learn latent manifolds, and test-time distribution rectification for inductive gaps. These are presented as architectural choices whose effectiveness is shown through comparative experiments on six public datasets plus the new Sailing dataset with natural missingness. No equations are given that define a target quantity in terms of itself or rename a fitted parameter as a prediction. Self-citations, if present, are not load-bearing for the central claims; the inductive performance argument rests on empirical results rather than reducing to prior author work by definition. This is the common honest case of a self-contained empirical framework.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 2 invented entities

The central claim rests on empirical effectiveness of newly introduced modules rather than first-principles derivation; several design choices are validated only through experiments on the reported datasets.

free parameters (1)
  • masking ratio and autoencoder hyperparameters
    Hyperparameters of the masked autoencoder rectifier are chosen to learn the latent manifold but their specific values are not derivable from the abstract and must be tuned.
axioms (2)
  • [domain assumption] Masked autoencoding on graph node features can recover a latent manifold that represents natural semantic structure.
    Invoked when the semantic rectifier is introduced to restore details lost to over-smoothing.
  • [domain assumption] Test-time projection onto the learned manifold corrects distribution shift induced by structural bias in unseen graphs.
    Central to the inductive rectification mechanism.
invented entities (2)
  • Global Structural Augmentation (GSA) [no independent evidence]
    purpose: Establish global correlations to bridge disjoint graph components and extend diffusion coverage.
    New augmentation technique introduced to address performance degradation on disjoint graphs.
  • Test-time distribution rectification mechanism [no independent evidence]
    purpose: Project structurally biased imputed features back onto the learned manifold at inference time.
    New mechanism introduced to close the inductive distribution gap.

pith-pipeline@v0.9.0 · 5590 in / 1636 out tokens · 62052 ms · 2026-05-17T01:22:33.003446+00:00 · methodology

