RelPrism: A Multi-Faceted Pre-training Framework with Self-Generated Tasks for Relational Databases

Cheng Yang; Chuan Shi; Hanyang Peng; Jinyu Yang; Junze Chen; Muhan Zhang; Zedi Liu

arxiv: 2605.23241 · v1 · pith:TBVLVXJ6new · submitted 2026-05-22 · 💻 cs.LG

Jinyu Yang, Cheng Yang, Junze Chen, Zedi Liu, Muhan Zhang, Hanyang Peng, Chuan Shi

Pith reviewed 2026-05-25 05:02 UTC · model grok-4.3

classification 💻 cs.LG

keywords: relational databases; self-supervised learning; pre-training; pseudo-tasks; multi-granularity clustering; graph representations; relational deep learning; predictive tasks

The pith

RelPrism pre-trains relational database models on pseudo-tasks drawn from intrinsic, relational, and hybrid attribute perspectives at multiple granularities to support better downstream adaptation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that relational database tasks often need information spanning different perspectives and levels of detail, yet most self-supervised methods supply signals from only one facet. RelPrism therefore builds three distinct attribute views, clusters each at several granularities to create pools of pseudo-tasks, and pre-trains graph-based representations on those pools. The resulting representations are intended to carry a broader base of knowledge that transfers more effectively when the model is later adapted to specific classification or regression targets. Experiments across fourteen tasks on five real databases are presented as evidence that this multi-faceted pre-training produces measurable gains over prior single-facet approaches.

Core claim

RelPrism constructs intrinsic, relational, and hybrid attributes from distinct perspectives, applies multi-granularity clustering to each perspective to form corresponding pseudo-task pools, and pre-trains over these pools to expose representations to broader perspectives and granularity levels, yielding a stronger basis for downstream adaptation.

What carries the argument

Multi-granularity clustering on intrinsic, relational, and hybrid attribute perspectives to generate pseudo-task pools for self-supervised pre-training.
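
As a rough illustration of the mechanism named above, the sketch below clusters three toy attribute views at two granularities and stores the cluster ids as pseudo-labels. It is a minimal sketch under stated assumptions, not the paper's implementation: plain k-means stands in for whatever clustering the authors use, and the names `kmeans`, `build_pseudo_task_pool`, the granularities `(2, 4)`, and all feature values are hypothetical.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster id per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct rows as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels


def build_pseudo_task_pool(views, granularities=(2, 4)):
    """One pseudo-task per (view, k); cluster ids act as pseudo-labels."""
    pool = {}
    for name, features in views.items():
        for k in granularities:
            pool[(name, k)] = kmeans(features, k)
    return pool


# Toy attribute views for six rows (all values are made up for illustration).
views = {
    "intrinsic": [(0.1, 0.2), (0.0, 0.1), (0.9, 1.0),
                  (1.0, 0.8), (0.5, 0.5), (0.4, 0.6)],
    "relational": [(3.0,), (2.0,), (30.0,), (28.0,), (10.0,), (12.0,)],
}
# Assumption: the hybrid view is a simple concatenation of the other two.
views["hybrid"] = [i + r for i, r in zip(views["intrinsic"], views["relational"])]

pool = build_pseudo_task_pool(views)
for (view, k), pseudo_labels in pool.items():
    print(view, k, pseudo_labels)
```

Each `(view, k)` key corresponds to one pseudo-task; a coarse `k` gives broad groupings and a finer `k` gives more detailed ones, which is the sense of "multi-granularity" assumed here.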

If this is right

Representations receive supervision signals from multiple attribute perspectives instead of one facet.
The same pre-trained model can adapt to downstream tasks that emphasize interaction patterns, intrinsic attributes, or their combination.
Performance improves on both classification measured by ROC-AUC and regression measured by MAE across real relational databases.
Self-supervised pre-training becomes feasible without manual labels by converting clustering outputs into pseudo-tasks.
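
For readers less familiar with the two metrics mentioned in the list above, here is a small self-contained sketch of how ROC-AUC and MAE are commonly computed. The function names `roc_auc` and `mae` and the toy inputs are illustrative assumptions; the paper's own evaluation code is not shown in the source.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the chance that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mae(targets, preds):
    """Mean absolute error between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(targets, preds)) / len(targets)


# Toy classification example: three of four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# Toy regression example: absolute errors are 0.5, 0.5, 0.0, 1.0.
print(mae([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0]))  # 0.5
```

Higher is better for ROC-AUC and lower is better for MAE, which is why the abstract reports an improvement for one and a reduction for the other.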

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same perspective-construction and clustering procedure could be tested on other structured data formats that admit multiple attribute views.
Task-specific weighting of the three perspectives during pre-training might further reduce the gap to fully supervised performance.
Scaling the number of granularity levels or the size of the pseudo-task pools could be examined for additional gains on larger databases.

Load-bearing premise

The pseudo-tasks generated by multi-granularity clustering on the three attribute perspectives supply transferable supervision signals that genuinely improve downstream performance rather than reflecting artifacts of the clustering process or data characteristics.

What would settle it

An ablation that removes either the three-perspective construction or the multi-granularity step and observes no drop in the reported performance margins on the fourteen tasks would falsify the claim that those design choices are responsible for the gains.

Figures

Figures reproduced from arXiv: 2605.23241 by Cheng Yang, Chuan Shi, Hanyang Peng, Jinyu Yang, Junze Chen, Muhan Zhang, Zedi Liu.

**Figure 2.** By generating pseudo-tasks that span complementary at…

**Figure 2.** The Overall Framework of RelPrism. (a) We first convert the RDB into a temporal heterogeneous graph. (b) Next, we…

**Figure 3.** 1-Shot and 5-Shot Classification and Regression Performance on 14 Tasks across 5 Datasets. For regression tasks,…

**Figure 4.** Representation Quality Analysis via Alignment and…

**Figure 5.** Pseudo-Task Quality Analysis. Our clustering…

**Figure 6.** Visualization of Item Examples from rel-amazon. For task item-churn, Item A shows strong interactions and does not churn (label=0), while Item B has weak historical engagement and churns (label=1). For task item-ltv, Item A combines high value with active interactions, leading to high LTV, whereas Item B has no future LTV after churn.

… fact-table rows as edges, our construction avoids introducing intermedi…

**Figure 7.** Hyper-Parameter Sensitivity Analysis. We inves…

Original abstract

Relational databases (RDBs) remain the cornerstone of modern data systems and support diverse predictive tasks. Recent relational deep learning (RDL) methods enable end-to-end prediction by converting RDBs into graphs, where rows are represented as nodes and inter-table interactions are represented as edges, and then applying graph-based models for representation learning. Despite the strong capability of RDL, effective self-supervised pre-training for RDBs remains non-trivial. RDB tasks often require multi-faceted information across different perspectives and granularities. For example, user churn classification may rely more on interaction patterns, whereas consumption value prediction requires both user-item behaviors and intrinsic user attributes for fine-grained regression. Such heterogeneous needs challenge RDB representation learning, as pre-training objectives should cover comprehensive information for downstream adaptation. However, existing SSL methods typically derive supervision from a single facet, such as node-level intrinsic attributes or subgraph-level relational structures, providing limited adaptability. To this end, we propose RelPrism, a multi-faceted self-supervised learning framework for RDBs. RelPrism constructs intrinsic, relational, and hybrid attributes from distinct perspectives, and applies multi-granularity clustering to each perspective to form corresponding pseudo-task pools. Pre-training over these pools exposes representations to broader perspectives and granularity levels, yielding a stronger basis for downstream adaptation. Experiments on 14 tasks across 5 real-world datasets show that RelPrism improves ROC-AUC by 4.15% for classification and reduces MAE by 10.75% for regression over state-of-the-art baselines. Our code is available at https://anonymous.4open.science/r/RelPrism.
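
The abstract's description of relational deep learning (rows as nodes, inter-table interactions as edges) can be sketched as follows. This is only an illustration of that generic conversion, not RelPrism's own graph construction, which the source does not specify in detail. The toy schema, the `foreign_keys` mapping, and the function `rdb_to_graph` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy database: two entity tables and one table with foreign keys.
tables = {
    "users": [{"id": 1, "age": 30}, {"id": 2, "age": 41}],
    "items": [{"id": 10, "price": 9.5}],
    "reviews": [
        {"id": 100, "user_id": 1, "item_id": 10, "rating": 5},
        {"id": 101, "user_id": 2, "item_id": 10, "rating": 2},
    ],
}
# For each table: foreign-key column -> referenced table.
foreign_keys = {"reviews": {"user_id": "users", "item_id": "items"}}


def rdb_to_graph(tables, foreign_keys):
    """Turn every row into a typed node and every foreign-key reference into a typed edge."""
    nodes = {}
    edges = defaultdict(list)
    for table, rows in tables.items():
        for row in rows:
            nodes[(table, row["id"])] = row  # node key: (table name, primary key)
    for table, fks in foreign_keys.items():
        for row in tables[table]:
            for column, target in fks.items():
                # Edge type: (source table, foreign-key column, target table).
                edges[(table, column, target)].append(
                    ((table, row["id"]), (target, row[column]))
                )
    return nodes, dict(edges)


nodes, edges = rdb_to_graph(tables, foreign_keys)
print(len(nodes), "nodes")
for edge_type, pairs in edges.items():
    print(edge_type, pairs)
```

Keeping node and edge types explicit gives a heterogeneous graph, on which a graph-based model can then learn row representations.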

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

RelPrism adds multi-perspective attribute construction and multi-granularity clustering for RDB pseudo-tasks, with reported gains that need experimental details to evaluate.

RelPrism builds self-supervised pre-training for relational databases by constructing intrinsic, relational, and hybrid attributes from the data, then running multi-granularity clustering on each to create pools of pseudo-tasks. The idea is that exposing the model to these different views and levels gives representations that adapt better to varied downstream needs like churn classification or value regression. That specific combination of three facets plus the clustering step is the concrete addition over single-facet SSL baselines mentioned in the abstract.

The reported numbers (4.15% ROC-AUC improvement on classification and 10.75% MAE reduction on regression across 14 tasks and 5 real datasets) show the kind of practical lift that would matter if the controls hold. Code release is also a plus for checking the implementation. The soft spot is that the abstract supplies almost no information on the exact baselines, ablation results, statistical tests, or steps taken to avoid leakage when the pseudo-tasks are derived from the same tables used in downstream evaluation. The key assumption, that the clustering outputs supply genuine transferable supervision rather than reflecting data artifacts, therefore sits on unexamined ground until the full methods and results are reviewed.

This work targets people doing representation learning on relational or tabular data who already use graph-based RDL models. A reader looking for new pre-training options in that setting would get direct value from the experiments once the details are filled in. It deserves peer review because the motivation is clear, the method is specified, and the claims are quantitative on real data; the experiments simply require scrutiny on the points above.

Referee Report

2 major / 1 minor

Summary. The paper proposes RelPrism, a multi-faceted self-supervised pre-training framework for relational databases. It constructs intrinsic, relational, and hybrid attributes from distinct perspectives, applies multi-granularity clustering to each to form pseudo-task pools, and pre-trains representations over these pools to improve adaptability for downstream tasks. Experiments on 14 tasks across 5 real-world datasets are reported to yield 4.15% ROC-AUC gains for classification and 10.75% MAE reduction for regression over state-of-the-art baselines.

Significance. If the empirical claims hold under proper controls, the framework could advance self-supervised learning for relational data by addressing multi-perspective and multi-granularity requirements that single-facet SSL methods overlook. The code release is a positive factor for reproducibility.

major comments (2)

[Abstract] Abstract: The reported performance gains (4.15% ROC-AUC, 10.75% MAE) are stated without any identification of the specific baselines, dataset details, statistical significance tests, variance across runs, or ablation studies. These omissions make it impossible to evaluate whether the central claim—that multi-granularity clustering on the three attribute perspectives supplies transferable supervision—is supported by the data.
[Abstract] Abstract (method description): The construction of 'hybrid attributes' and 'pseudo-task pools' via clustering is presented at a high level with no information on how leakage between pseudo-label generation and downstream evaluation is prevented or how the clustering process is validated to produce signals independent of data artifacts. This is load-bearing for the claim of improved downstream adaptation.

minor comments (1)

[Abstract] The anonymous code link is standard for review but should be replaced with a permanent repository upon acceptance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed comments on the abstract. We address each point below and note that while the abstract is intentionally concise, we agree some additional specificity can be incorporated without exceeding length limits.

Referee: [Abstract] Abstract: The reported performance gains (4.15% ROC-AUC, 10.75% MAE) are stated without any identification of the specific baselines, dataset details, statistical significance tests, variance across runs, or ablation studies. These omissions make it impossible to evaluate whether the central claim—that multi-granularity clustering on the three attribute perspectives supplies transferable supervision—is supported by the data.

Authors: The abstract summarizes results at a high level, with full details provided in the Experiments section (including the five datasets, fourteen tasks, specific SOTA baselines, mean/std over five runs, significance tests, and ablations). We agree the abstract could better orient readers and will revise it to name the primary baselines and datasets while retaining conciseness. revision: yes
Referee: [Abstract] Abstract (method description): The construction of 'hybrid attributes' and 'pseudo-task pools' via clustering is presented at a high level with no information on how leakage between pseudo-label generation and downstream evaluation is prevented or how the clustering process is validated to produce signals independent of data artifacts. This is load-bearing for the claim of improved downstream adaptation.

Authors: The abstract is a high-level summary; the full manuscript details hybrid attribute construction (Section 3.2) and multi-granularity clustering (Section 3.3), with explicit statements that pseudo-labels are derived only from pre-training splits and that downstream data is held out. We will add one sentence to the abstract clarifying the separation of pre-training and evaluation data to address this concern directly. revision: yes
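
The separation described in this simulated response (pseudo-labels derived only from pre-training data, with downstream data held out) could look roughly like the sketch below. It rests on several assumptions: a timestamp cutoff defines the split, equal-frequency binning stands in for the paper's clustering to keep the example short, and the names `temporal_split`, `fit_bin_edges`, `assign_bin`, and the column names `ts` and `spend` are hypothetical.

```python
def temporal_split(rows, cutoff):
    """Rows before the cutoff are used for pre-training; the rest are held out."""
    pretrain = [r for r in rows if r["ts"] < cutoff]
    heldout = [r for r in rows if r["ts"] >= cutoff]
    return pretrain, heldout


def fit_bin_edges(values, n_bins):
    """Equal-frequency bin edges, fitted on the given values only."""
    ordered = sorted(values)
    return [ordered[len(ordered) * i // n_bins] for i in range(1, n_bins)]


def assign_bin(value, edges):
    """Pseudo-label: the number of edges the value reaches or exceeds."""
    return sum(value >= e for e in edges)


rows = [
    {"ts": 1, "spend": 10}, {"ts": 2, "spend": 40},
    {"ts": 3, "spend": 20}, {"ts": 4, "spend": 30},
    {"ts": 5, "spend": 99}, {"ts": 6, "spend": 5},
]
pretrain, heldout = temporal_split(rows, cutoff=5)

# Pseudo-labels at two granularities, fitted on pre-training rows only,
# so the held-out rows (ts >= 5) never influence the bin edges.
pseudo = {}
for n_bins in (2, 4):
    edges = fit_bin_edges([r["spend"] for r in pretrain], n_bins)
    pseudo[n_bins] = [assign_bin(r["spend"], edges) for r in pretrain]

print(pseudo)
print(len(heldout), "rows held out for downstream evaluation")
```

The point of the sketch is the ordering of operations: the split happens first, and everything that produces pseudo-labels is fitted afterwards on the pre-training side alone.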

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained with external validation

The paper describes a standard self-supervised construction: attribute perspectives are extracted from the input RDB, multi-granularity clustering produces pseudo-task pools, and representations are pre-trained on those pools before downstream adaptation. No equations, fitted parameters, or self-citations are shown that would make any claimed improvement equivalent to the inputs by construction. Performance gains are reported on 14 external downstream tasks across 5 real-world datasets against independent baselines, satisfying the criterion for non-circular empirical support. The framework does not rename known results or import uniqueness via author self-citation in the supplied text.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The framework rests on the assumption that clustering-derived pseudo-tasks from multiple attribute perspectives supply useful and non-redundant supervision. No numerical free parameters are named in the abstract. The hybrid attribute construction and pseudo-task pools are new entities introduced by the paper.

axioms (1)

domain assumption: Multi-granularity clustering on constructed attributes produces pseudo-labels that provide transferable supervision for downstream RDB tasks.
This premise is required for the pre-training pools to improve adaptation as claimed.

invented entities (2)

hybrid attributes (no independent evidence)
purpose: Capture combined intrinsic and relational information from distinct perspectives.
Newly defined attribute type in the framework.

pseudo-task pools (no independent evidence)
purpose: Provide diverse self-supervised training signals at multiple granularities.
Generated via clustering on the three attribute perspectives.

[1] [1]

Dara Bahri, Heinrich Jiang, Yi Tay, and Donald Metzler. 2021. Scarf: Self- supervised contrastive learning using random feature corruption.arXiv preprint arXiv:2106.15147(2021)

work page arXiv 2021

[2] [2]

2022.The Kaggle Book: Data analysis and machine learning for competitive data science

Konrad Banachewicz and Luca Massaron. 2022.The Kaggle Book: Data analysis and machine learning for competitive data science. Packt Publishing Ltd

work page 2022

[3] [3]

Yoshua Bengio, Aaron Courville, and Pascal Vincent. 2013. Representation learning: A review and new perspectives.IEEE transactions on pattern analysis and machine intelligence35, 8 (2013), 1798–1828

work page 2013

[4] [4]

Vadim Borisov, Tobias Leemann, Kathrin Seßler, Johannes Haug, Martin Pawel- czyk, and Gjergji Kasneci. 2022. Deep neural networks and tabular data: A survey. IEEE transactions on neural networks and learning systems35, 6 (2022), 7499–7519

work page 2022

[5] [5]

Tianqi Chen. 2016. XGBoost: A Scalable Tree Boosting System.Cornell University (2016)

work page 2016

[6] [6]

Tianlang Chen, Charilaos Kanatsoulis, and Jure Leskovec. 2025. Relgnn: Compos- ite message passing for relational deep learning.arXiv preprint arXiv:2502.06784 (2025)

work page arXiv 2025

[7] [7]

Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, and Jia- Bin Huang. 2019. A closer look at few-shot classification.arXiv preprint arXiv:1904.04232(2019)

work page arXiv 2019

[8] [8]

Jillian M Clements, Di Xu, Nooshin Yousefi, and Dmitry Efimov. 2020. Sequential deep learning for credit risk monitoring with tabular financial data.arXiv preprint arXiv:2012.15330(2020)

work page arXiv 2020

[9] [9]

Gabriele Corso, Luca Cavalleri, Dominique Beaini, Pietro Liò, and Petar Veličković

work page

[10] [10]

Principal neighbourhood aggregation for graph nets.Advances in neural information processing systems33 (2020), 13260–13271

work page 2020

[11] [11]

Alexis Cvetkov-Iliev, Alexandre Allauzen, and Gaël Varoquaux. 2023. Relational data embeddings for feature enrichment with background information.Machine Learning112, 2 (2023), 687–720

work page 2023

[12] [12]

Milan Cvitkovic. 2020. Supervised learning on relational databases with graph neural networks.arXiv preprint arXiv:2002.02046(2020)

work page arXiv 2020

[13] [13]

Kaiwen Dong, Padmaja Jonnalagedda, Xiang Gao, Ayan Acharya, Maria Kissa, Mauricio Flores, Nitesh V Chawla, and Kamalika Das. 2025. Transaction Cat- egorization with Relational Deep Learning in QuickBooks. InJoint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 143–160

work page 2025

[14] [14]

Vijay Prakash Dwivedi, Sri Jaladi, Yangyi Shen, Federico López, Charilaos I Kanatsoulis, Rishi Puri, Matthias Fey, and Jure Leskovec. 2025. Relational Graph Transformer.arXiv preprint arXiv:2505.10960(2025)

work page arXiv 2025

[15] [15]

Vijay Prakash Dwivedi, Charilaos Kanatsoulis, Shenyang Huang, and Jure Leskovec. 2025. Relational deep learning: Challenges, foundations and next- generation architectures. InProceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V. 2. 5999–6009

work page 2025

[16] [16]

Michalis Faloutsos, Petros Faloutsos, and Christos Faloutsos. 1999. On power-law relationships of the internet topology.ACM SIGCOMM computer communication review29, 4 (1999), 251–262

work page 1999

[17] [17]

Matthias Fey, Weihua Hu, Kexin Huang, Jan Eric Lenssen, Rishabh Ranjan, Joshua Robinson, Rex Ying, Jiaxuan You, and Jure Leskovec. 2023. Relational deep learning: Graph representation learning on relational databases.arXiv preprint arXiv:2312.04615(2023)

work page arXiv 2023

[18] [18]

Matthias Fey, Vid Kocijan, Federico Lopez, J Lenssen, and Jure Leskovec. 2025. Kumorfm: A foundation model for in-context learning on relational data

work page 2025

[19] [19]

Chelsea Finn, Pieter Abbeel, and Sergey Levine. 2017. Model-agnostic meta- learning for fast adaptation of deep networks. InInternational conference on machine learning. PMLR, 1126–1135

work page 2017

[20] [20]

Jerome H Friedman. 2001. Greedy function approximation: a gradient boosting machine.Annals of statistics(2001), 1189–1232

work page 2001

[21] [21]

Léo Grinsztajn, Klemens Flöge, Oscar Key, Felix Birkel, Philipp Jund, Brendan Roof, Mihir Manium, Shi Bin, Magnus Bühler, Anurag Garg, et al. 2026. TabPFN-3: Technical Report.arXiv preprint arXiv:2605.13986(2026)

work page internal anchor Pith review Pith/arXiv arXiv 2026

[22] [22]

Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs.Advances in neural information processing systems30 (2017)

[23]

Stefan Hegselmann, Alejandro Buendia, Hunter Lang, Monica Agrawal, Xiaoyi Jiang, and David Sontag. 2023. Tabllm: Few-shot classification of tabular data with large language models. In International conference on artificial intelligence and statistics. PMLR, 5549–5581

[24]

Zhenyu Hou, Xiao Liu, Yukuo Cen, Yuxiao Dong, Hongxia Yang, Chunjie Wang, and Jie Tang. 2022. Graphmae: Self-supervised masked graph autoencoders. In Proceedings of the 28th ACM SIGKDD conference on knowledge discovery and data mining. 594–604

[25]

Kyle Hsu, Sergey Levine, and Chelsea Finn. 2018. Unsupervised learning via meta-learning. arXiv preprint arXiv:1810.02334 (2018)

[26]

James Max Kanter and Kalyan Veeramachaneni. 2015. Deep feature synthesis: Towards automating data science endeavors. In 2015 IEEE international conference on data science and advanced analytics (DSAA). IEEE, 1–10

[27]

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. 2017. Lightgbm: A highly efficient gradient boosting decision tree. Advances in neural information processing systems 30 (2017)

[28]

Pengfei Liu, Weizhe Yuan, Jinlan Fu, Zhengbao Jiang, Hiroaki Hayashi, and Graham Neubig. 2023. Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing. ACM computing surveys 55, 9 (2023), 1–35

[29]

Shengchao Liu, David Vazquez, Jian Tang, and Pierre-André Noël. 2023. Flaky performances when pretraining on relational databases (student abstract). In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 37. 16266–16267

[30]

Stuart Lloyd. 1982. Least squares quantization in PCM. IEEE transactions on information theory 28, 2 (1982), 129–137

[31]

Miller McPherson, Lynn Smith-Lovin, and James M Cook. 2001. Birds of a feather: Homophily in social networks. Annual review of sociology 27, 1 (2001), 415–444

[32]

Jaehyun Nam, Jihoon Tack, Kyungmin Lee, Hankook Lee, and Jinwoo Shin. 2023. Stunt: Few-shot tabular learning with self-generated tasks from unlabeled tables. arXiv preprint arXiv:2303.00918 (2023)

[33]

Jennifer Neville and David Jensen. 2000. Iterative classification in relational data. In Proc. AAAI-2000 workshop on learning statistical models from relational data. Austin Texas, TX, 13–20

[34]

Karl Pearson. 1901. LIII. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin philosophical magazine and journal of science 2, 11 (1901), 559–572

[35]

Rishabh Ranjan, Valter Hudovernik, Mark Znidar, Charilaos Kanatsoulis, Roshan Upendra, Mahmoud Mohammadi, Joe Meyer, Tom Palczewski, Carlos Guestrin, and Jure Leskovec. 2025. Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data. arXiv preprint arXiv:2510.06377 (2025)

[36]

Joshua Robinson, Rishabh Ranjan, Weihua Hu, Kexin Huang, Jiaqi Han, Alejandro Dobles, Matthias Fey, Jan Eric Lenssen, Yiwen Yuan, Zecheng Zhang, et al. 2024. Relbench: A benchmark for deep learning on relational databases. Advances in Neural Information Processing Systems 37 (2024), 21330–21341

[37]

Michael Schlichtkrull, Thomas N Kipf, Peter Bloem, Rianne Van Den Berg, Ivan Titov, and Max Welling. 2018. Modeling relational data with graph convolutional networks. In European semantic web conference. Springer, 593–607

[38]

Yunsheng Shi, Zhengjie Huang, Shikun Feng, Hui Zhong, Wenjin Wang, and Yu Sun. 2020. Masked label prediction: Unified message passing model for semi-supervised classification. arXiv preprint arXiv:2009.03509 (2020)

[39]

Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical networks for few-shot learning. Advances in neural information processing systems 30 (2017)

[40]

Gowthami Somepalli, Micah Goldblum, Avi Schwarzschild, C Bayan Bruss, and Tom Goldstein. 2021. Saint: Improved neural networks for tabular data via row attention and contrastive pre-training. arXiv preprint arXiv:2106.01342 (2021)

[41]

Luis Torgo and Joao Gama. 1997. Regression using classification algorithms. Intelligent Data Analysis 1, 4 (1997), 275–292

[42]

Quang Truong, Zhikai Chen, Mingxuan Ju, Tong Zhao, Neil Shah, and Jiliang Tang. 2025. A Pre-training Framework for Relational Data with Information-theoretic Principles. arXiv preprint arXiv:2507.09837 (2025)

[43]

Talip Ucar, Ehsan Hajiramezanali, and Lindsay Edwards. 2021. Subtab: Subsetting features of tabular data for self-supervised representation learning. Advances in Neural Information Processing Systems 34 (2021), 18853–18865

[44]

Dennis Ulmer, Lotta Meijerink, and Giovanni Cinà. 2020. Trust issues: Uncertainty estimation does not enable reliable ood detection on medical tabular data. In Machine Learning for Health. PMLR, 341–354

[45]

Petar Veličković, William Fedus, William L Hamilton, Pietro Liò, Yoshua Bengio, and R Devon Hjelm. 2018. Deep graph infomax. arXiv preprint arXiv:1809.10341 (2018)

[46]

Minjie Wang, Quan Gan, David Wipf, Zheng Zhang, Christos Faloutsos, Weinan Zhang, Muhan Zhang, Zhenkun Cai, Jiahang Li, Zunyao Mao, et al. 2024. 4DBInfer: A 4d benchmarking toolbox for graph-centric predictive modeling on RDBs. Advances in Neural Information Processing Systems 37 (2024), 27236–27273

[47]

Tongzhou Wang and Phillip Isola. 2020. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In International conference on machine learning. PMLR, 9929–9939

[48]

Xiao Wang, Houye Ji, Chuan Shi, Bai Wang, Yanfang Ye, Peng Cui, and Philip S Yu. 2019. Heterogeneous graph attention network. In The world wide web conference. 2022–2032

[50]

Yanbo Wang, Xiyuan Wang, Quan Gan, Minjie Wang, Qibin Yang, David Wipf, and Muhan Zhang. 2025. Griffin: Towards a Graph-Centric Relational Database Foundation Model. arXiv preprint arXiv:2505.05568 (2025)

[51]

Jinsung Yoon, Yao Zhang, James Jordon, and Mihaela Van der Schaar. 2020. Vime: Extending the success of self-and semi-supervised learning to tabular domain. Advances in neural information processing systems 33 (2020), 11033–11043

[52]

Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and Yang Shen. 2020. Graph contrastive learning with augmentations. Advances in neural information processing systems 33 (2020), 5812–5823

A Dataset and Task Statistics

Specific statistics regarding the datasets and task...