RelGT-AC: A Relational Graph Transformer for Autocomplete Tasks in Relational Databases
Pith reviewed 2026-06-28 10:42 UTC · model grok-4.3
The pith
RelGT-AC improves autocomplete predictions on relational databases by masking target columns and adding TF-IDF text encoding to a graph transformer.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RelGT-AC extends RelGT with column masking to prevent trivial solutions by hiding the target column, a unified task head for binary classification, multiclass classification, and regression, and a TF-IDF text encoder that recovers lexical signal from free-text columns, leading to outperformance of the GraphSAGE baseline on all regression autocomplete tasks and gains of up to 10 AUROC points on text-heavy tasks.
What carries the argument
Column masking strategy that hides the target column value from the subgraph encoder to prevent the model from using it directly.
If this is right
- Outperforms GraphSAGE baseline on all three regression autocomplete tasks
- Gains up to 10 AUROC points on text-heavy eligibility tasks
- Handles binary classification, multiclass classification, and regression with one unified task head
- Recovers lexical signal from free-text columns that categorical encoders discard
Where Pith is reading between the lines
- The column masking approach could apply to other graph-based prediction settings to reduce feature leakage.
- Testing the model on relational datasets outside the three RelBench v2 sets would check whether the gains hold more broadly.
- Pairing the TF-IDF component with modern language model embeddings might increase gains on text-rich tables.
Load-bearing premise
Column masking stops trivial solutions and the TF-IDF encoder adds useful lexical signal without introducing biases or hurting performance on non-text tasks.
What would settle it
An ablation experiment showing that removing the TF-IDF encoder produces no drop in performance on the text-heavy eligibility tasks would falsify the claim that the encoder recovers strong lexical signal.
Figures
read the original abstract
Relational databases underpin modern enterprise, scientific, and healthcare systems, yet predictive machine learning on such data remains challenging due to their multi-table, heterogeneous, and temporal structure. Relational Deep Learning (RDL) addresses this by representing databases as heterogeneous graphs and applying graph neural networks (GNNs) directly. RelBench v2 recently introduced autocomplete tasks -- a practically motivated task type where the goal is to predict an existing column value from relational context, analogous to an intelligent form-filling assistant. We propose RelGT-AC (Relational Graph Transformer for Autocomplete), extending the RelGT architecture with three targeted contributions: (1) a column masking strategy that prevents trivial solutions by masking the target column during subgraph encoding; (2) a unified task head supporting binary classification, multiclass classification, and regression autocomplete tasks within a single model; and (3) a TF-IDF text encoder that automatically detects and encodes free-text columns, recovering strong lexical signal that categorical encoders discard. Across 7 tasks spanning 3 RelBench v2 datasets (rel-trial, rel-f1, rel-stack), RelGT-AC outperforms the GraphSAGE baseline on all 3 regression autocomplete tasks and achieves up to +10 AUROC points on text-heavy eligibility tasks via the TF-IDF encoder.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes RelGT-AC as an extension of RelGT for autocomplete tasks on relational databases represented as heterogeneous graphs. It introduces (1) a column masking strategy during subgraph encoding, (2) a unified task head for binary/multiclass classification and regression, and (3) a TF-IDF encoder for free-text columns. On 7 tasks from 3 RelBench v2 datasets (rel-trial, rel-f1, rel-stack), it claims outperformance versus GraphSAGE on all regression tasks and gains of up to +10 AUROC on text-heavy eligibility tasks.
Significance. If the empirical results prove robust, the masking strategy and TF-IDF component could offer targeted improvements for handling heterogeneous columns and preventing trivial predictions in relational deep learning, with direct relevance to form-filling applications in enterprise and scientific databases.
major comments (2)
- [Abstract] Abstract: performance numbers are stated without any information on data splits, statistical testing, hyperparameter search, or leakage checks. This information is load-bearing for the central claim of consistent outperformance and must appear in the methods/results sections with explicit tables or text.
- [Contributions (1) and (3)] The description of the TF-IDF encoder and column masking (listed as contributions 1 and 3): the claim that TF-IDF recovers lexical signal without degrading non-text tasks or introducing bias requires supporting ablation results or controls; without them the weakest assumption in the evaluation cannot be verified.
minor comments (1)
- [Abstract] Ensure all acronyms (RDL, GNN, AUROC) are defined on first use and used consistently.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below, providing clarifications and committing to revisions where the manuscript can be strengthened.
read point-by-point responses
-
Referee: [Abstract] Abstract: performance numbers are stated without any information on data splits, statistical testing, hyperparameter search, or leakage checks. This information is load-bearing for the central claim of consistent outperformance and must appear in the methods/results sections with explicit tables or text.
Authors: The experimental protocol, including RelBench v2 temporal data splits, hyperparameter search ranges, 5-run averaging with standard deviations, and leakage prevention via benchmark-defined splits, is detailed in Section 4 (Experimental Setup) and the results tables of Section 5. We agree the abstract would benefit from an explicit pointer to these sections and will add one sentence referencing the full evaluation details. revision: partial
-
Referee: [Contributions (1) and (3)] The description of the TF-IDF encoder and column masking (listed as contributions 1 and 3): the claim that TF-IDF recovers lexical signal without degrading non-text tasks or introducing bias requires supporting ablation results or controls; without them the weakest assumption in the evaluation cannot be verified.
Authors: The main results already demonstrate that TF-IDF yields up to +10 AUROC gains on text-heavy tasks while the model remains competitive on the three regression tasks (which use non-text targets). Column masking is validated through the overall outperformance versus the unmasked RelGT baseline. We nevertheless agree that targeted ablations would make the claims more robust and will add them in revision: (a) TF-IDF versus categorical encoding on non-text tasks to quantify any degradation or bias, and (b) masked versus unmasked subgraph encoding to isolate the effect on trivial-prediction prevention. revision: yes
Circularity Check
No significant circularity
full rationale
The manuscript is an empirical model proposal and benchmark evaluation. It defines three architectural contributions (column masking, unified task head, TF-IDF encoder) and reports performance on RelBench v2 tasks against the external GraphSAGE baseline. No equations, fitted parameters, or self-citations are shown to reduce any reported result to quantities defined by the same paper's inputs. The central claims rest on external datasets and baselines, satisfying the self-contained criterion.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
BERT: Pre-training of deep bidirectional transformers for language understanding
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In NAACL, 2019
2019
-
[2]
P., Kanatsoulis, C., Huang, S., and Leskovec, J
Dwivedi, V. P., Kanatsoulis, C., Huang, S., and Leskovec, J. Relational deep learning: Challenges, foundations and next-generation architectures. arXiv:2506.16654, 2025
- [3]
-
[4]
RelBench v2: A Large-Scale Benchmark and Repository for Relational Data
Gu, J., Ranjan, R., Kanatsoulis, C., et al. RelBench v2: A large-scale benchmark and repository for relational data. arXiv:2602.12606, 2026
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[5]
L., Ying, R., and Leskovec, J
Hamilton, W. L., Ying, R., and Leskovec, J. Inductive representation learning on large graphs. In NeurIPS, 2017
2017
-
[6]
Accurate predictions on small data with a tabular foundation model
Hollmann, N., Müller, S., Eggensperger, K., and Hutter, F. Accurate predictions on small data with a tabular foundation model. Nature, 2025
2025
-
[7]
GraphMAE: Self-supervised masked graph autoencoders
Hou, Z., Liu, X., Cen, Y., Dong, Y., Yang, H., Wang, C., and Tang, J. GraphMAE: Self-supervised masked graph autoencoders. In KDD, 2022
2022
-
[8]
GraphMAE2: A decoding-enhanced masked self-supervised graph learner
Hou, Z., He, Y., Cen, Y., Liu, X., Dong, Y., Kharlamov, E., and Tang, J. GraphMAE2: A decoding-enhanced masked self-supervised graph learner. In WWW, 2023
2023
-
[9]
Heterogeneous graph transformer
Hu, Z., Dong, Y., Wang, K., and Sun, Y. Heterogeneous graph transformer. In WWW, 2020
2020
-
[10]
KumoRFM-2: Scaling Foundation Models for Relational Learning
Hudovernik, V., López, F., Kocijan, V., et al. KumoRFM-2: Scaling foundation models for relational learning. arXiv:2604.12596, 2026
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[11]
PluRel: Synthetic data unlocks scaling laws for relational foundation models
Kothapalli, V., Ranjan, R., Hudovernik, V., et al. PluRel: Synthetic data unlocks scaling laws for relational foundation models. arXiv:2602.04029, 2026
-
[12]
Peleška, J. and Šír, G. Task-agnostic contrastive pretraining for relational deep learning. arXiv:2506.22530, 2025
-
[13]
P., Luu, A
Rampášek, L., Galkin, M., Dwivedi, V. P., Luu, A. T., Wolf, G., and Beaini, D. Recipe for a general, powerful, scalable graph transformer. In NeurIPS, 2022
2022
-
[14]
Relational transformer: Toward zero-shot foundation models for relational data
Ranjan, R., Hudovernik, V., Znidar, M., et al. Relational transformer: Toward zero-shot foundation models for relational data. In ICLR, 2026
2026
-
[15]
RelBench: A benchmark for deep learning on relational databases
Robinson, J., et al. RelBench: A benchmark for deep learning on relational databases. In NeurIPS Datasets & Benchmarks, 2023
2023
-
[16]
Griffin: Towards a graph-centric relational database foundation model
Wang, Y., Wang, X., Gan, Q., Wang, M., Yang, Q., Wipf, D., and Zhang, M. Griffin: Towards a graph-centric relational database foundation model. In ICML, 2025
2025
-
[17]
Relational In-Context Learning via Synthetic Pre-training with Structural Prior
Wang, Y., You, J., Shi, C., and Zhang, M. Relational in-context learning via synthetic pre-training with structural prior. arXiv:2603.03805, 2026
work page internal anchor Pith review Pith/arXiv arXiv 2026
-
[18]
4DBInfer: A 4D benchmarking toolbox for graph-centric predictive modeling on relational DBs
Wang, M., Gan, Q., Wipf, D., Cai, Z., Li, N., Tang, J., Zhang, Y., Zhang, Z., Mao, Z., Song, Y., Wang, Y., Li, J., Zhang, H., Yang, G., Qin, X., Lei, C., Zhang, M., Zhang, W., Faloutsos, C., and Zhang, Z. 4DBInfer: A 4D benchmarking toolbox for graph-centric predictive modeling on relational DBs. In NeurIPS Datasets & Benchmarks, 2024
2024
-
[19]
Sparck Jones
K. Sparck Jones. A statistical interpretation of term specificity and its application in retrieval. Journal of Documentation, 28 0 (1): 0 11--21, 1972
1972
-
[20]
and Guestrin, C
Chen, T. and Guestrin, C. XGBoost: A scalable tree boosting system. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp.\ 785--794, 2016
2016
-
[21]
RelGNN: Composite message passing for relational deep learning
Chen, T., Kanatsoulis, C., and Leskovec, J. RelGNN: Composite message passing for relational deep learning. In Forty-second International Conference on Machine Learning (ICML), 2025
2025
-
[22]
and Lenssen, J
Fey, M. and Lenssen, J. E. Fast graph representation learning with PyTorch Geometric. In ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019
2019
-
[23]
E., Ranjan, R., Robinson, J., Ying, R., You, J., and Leskovec, J
Fey, M., Hu, W., Huang, K., Lenssen, J. E., Ranjan, R., Robinson, J., Ying, R., You, J., and Leskovec, J. Position: Relational deep learning --- graph representation learning on relational databases. In Forty-first International Conference on Machine Learning (ICML), 2024
2024
-
[24]
S., Riley, P
Gilmer, J., Schoenholz, S. S., Riley, P. F., Vinyals, O., and Dahl, G. E. Neural message passing for quantum chemistry. In International Conference on Machine Learning (ICML), pp.\ 1263--1272, 2017
2017
-
[25]
Revisiting deep learning models for tabular data
Gorishniy, Y., Rubachev, I., Khrulkov, V., and Babenko, A. Revisiting deep learning models for tabular data. In Advances in Neural Information Processing Systems (NeurIPS), volume 34, pp.\ 18932--18943, 2021
2021
-
[26]
LightGBM: A highly efficient gradient boosting decision tree
Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., and Liu, T.-Y. LightGBM: A highly efficient gradient boosting decision tree. In Advances in Neural Information Processing Systems (NeurIPS), volume 30, 2017
2017
-
[27]
Kipf, T. N. and Welling, M. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (ICLR), 2017
2017
-
[28]
SALT: Sales autocompletion linked business tables dataset
Klein, T., Biehl, C., Costa, M., Sres, A., Kolk, J., and Hoffart, J. SALT: Sales autocompletion linked business tables dataset. In NeurIPS 2024 Third Table Representation Learning Workshop, 2024
2024
-
[29]
Pele s ka, J. and S \'ir, G. Transformers meet relational databases. arXiv:2412.05218, 2024
-
[30]
TabICL: A tabular foundation model for in-context learning on large data
Qu, J., Holzm\" u ller, D., Varoquaux, G., and Le Morvan, M. TabICL: A tabular foundation model for in-context learning on large data. In Forty-second International Conference on Machine Learning (ICML), 2025
2025
-
[31]
N., Kaiser, ., and Polosukhin, I
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, ., and Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems (NeurIPS), volume 30, 2017
2017
-
[32]
Graph attention networks
Veli c kovi\' c , P., Cucurull, G., Casanova, A., Romero, A., Li\` o , P., and Bengio, Y. Graph attention networks. In International Conference on Learning Representations (ICLR), 2018
2018
-
[33]
Do transformers really perform badly for graph representation? In Advances in Neural Information Processing Systems (NeurIPS), volume 34, pp.\ 28877--28888, 2021
Ying, C., Cai, T., Luo, S., Zheng, S., Ke, G., He, D., Shen, Y., and Liu, T.-Y. Do transformers really perform badly for graph representation? In Advances in Neural Information Processing Systems (NeurIPS), volume 34, pp.\ 28877--28888, 2021
2021
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.