Pith Number

pith:DKDPDEJZ

pith:2026:DKDPDEJZCJSZ3EZZNK3SS6KSX4
not attested · not anchored · not stored · refs resolved

Towards Foundation Models for Relational Databases with Language Models and Graph Neural Networks

Fabian Leeske, Jingcheng Wu, Lucas Etteldorf, Max Finkenbeiner, Mojtaba Nayyeri, Ratan Bahadur Thapa, Steffen Staab

A hybrid of fine-tuned BART and GraphSAGE on relational entity graphs enriches embeddings and competes with supervised baselines for relational database tasks.

arxiv:2605.16085 v1 · 2026-05-15 · cs.DB · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{DKDPDEJZCJSZ3EZZNK3SS6KSX4}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
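
The bundle description above says a mirror can recompute the same state from the signed events. The page does not specify the merge algorithm, so the following is only a minimal sketch of one way a deterministic merge can work: deduplicate events by content hash, sort them by a total order, then fold them into a state. The event fields (`clock`, `field`, `value`) and the helper names are hypothetical and are not taken from the Pith schema.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content-address an event so identical copies deduplicate (hypothetical scheme)."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def merge(*event_sets: list) -> dict:
    """Fold events into a state; the result does not depend on arrival order."""
    unique = {event_id(e): e for events in event_sets for e in events}
    # Total order: logical clock first, content hash as a tie-breaker.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["clock"], kv[0]))
    state = {}
    for _, event in ordered:
        state[event["field"]] = event["value"]  # later events overwrite earlier ones
    return state


# Two mirrors hold overlapping, differently ordered copies of made-up events.
mirror_a = [
    {"clock": 1, "field": "example_a", "value": "x"},
    {"clock": 2, "field": "example_b", "value": "y"},
]
mirror_b = [
    {"clock": 3, "field": "example_a", "value": "z"},
    {"clock": 1, "field": "example_a", "value": "x"},
]

state_ab = merge(mirror_a, mirror_b)
state_ba = merge(mirror_b, mirror_a)
print(state_ab == state_ba)
```

Sorting by a key that every mirror can compute from the events alone is what makes the result independent of where the bundle is hosted or in which order events arrive.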

Claims

C1 · strongest claim

Experiments on RelBench show that the GNN substantially enriches BART's row embeddings, achieving a ROC-AUC of 67.40 on the driver-dnf task from the rel-f1 dataset. This performance is competitive with supervised baselines such as LightGBM (68.86) and narrows the gap to RDL (72.62) to 5.22 points, though a substantial gap remains to state-of-the-art foundation models such as KumoRFM (82.63).
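
The gaps implied by the claim above can be checked with a few lines of arithmetic. This sketch uses only the four ROC-AUC values quoted in the claim; the dictionary keys are labels chosen here for illustration, and `Decimal` is used so the two-decimal scores subtract exactly.

```python
from decimal import Decimal

# ROC-AUC on driver-dnf (rel-f1), as quoted in the claim.
scores = {
    "BART+GraphSAGE": Decimal("67.40"),
    "LightGBM": Decimal("68.86"),
    "RDL": Decimal("72.62"),
    "KumoRFM": Decimal("82.63"),
}

hybrid = scores["BART+GraphSAGE"]
gaps = {name: score - hybrid for name, score in scores.items() if name != "BART+GraphSAGE"}

for name, gap in gaps.items():
    print(f"{name}: {gap} points ahead of the hybrid")
# LightGBM: 1.46 points ahead of the hybrid
# RDL: 5.22 points ahead of the hybrid
# KumoRFM: 15.23 points ahead of the hybrid
```

The 5.22-point figure in the claim matches the RDL gap; the gap to KumoRFM is roughly three times larger.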

C2 · weakest assumption

That the specific hybrid of fine-tuned BART plus GraphSAGE on relational entity graphs will generalize to arbitrary unseen databases and tasks sufficiently to serve as a foundation model, rather than remaining competitive only on the tested RelBench subset.

C3 · one line summary

A BART-GraphSAGE hybrid achieves ROC-AUC 67.40 on one RelBench task, competitive with LightGBM but still behind specialized relational deep learning and foundation models.

References

34 extracted · 34 resolved · 3 Pith anchors

[1] M. Fey, W. Hu, K. Huang, J. E. Lenssen, R. Ranjan, J. Robinson, R. Ying, J. You, J. Leskovec, Position: Relational deep learning-graph representation learning on relational databases, in: Forty-first 2024
[2] V. P. Dwivedi, C. Kanatsoulis, S. Huang, J. Leskovec, Relational deep learning: Challenges, foundations and next-generation architectures, in: Proceedings of the 31st ACM SIGKDD Conference on Knowle 2025 · doi:10.1145/3711896.3736
[3] Y. Wang, X. Wang, Q. Gan, M. Wang, Q. Yang, D. Wipf, M. Zhang, Griffin: Towards a graph-centric relational database foundation model, in: ICML, volume 267 of Proceedings of Machine Learning Research, P 2025
[4] M. Fey, V. Kocijan, F. Lopez, J. E. Lenssen, J. Leskovec, KumoRFM: A Foundation Model for In-Context Learning on Relational Data, White Paper, Kumo AI, 2025. URL: https://kumo.ai/research /kumo_relat 2025
[5] L. Vogel, B. Hilprecht, C. Binnig, Towards foundation models for relational databases [vision paper], arXiv preprint arXiv:2305.15321 (2023). doi:10.48550/ARXIV.2305.15321 2023 · doi:10.48550/arxiv.2305.15321

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-20T00:01:52.027024Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

1a86f1913912659d93396ab7297952bf2fdfd956303e0c2561b32fe77f3fc658

Aliases

arxiv: 2605.16085 · arxiv_version: 2605.16085v1 · doi: 10.48550/arxiv.2605.16085 · pith_short_12: DKDPDEJZCJSZ · pith_short_16: DKDPDEJZCJSZ3EZZ · pith_short_8: DKDPDEJZ
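
For this record, the 26-character Pith Number coincides with the unpadded Base32 encoding of the first 16 bytes of the canonical hash, and the `pith_short_*` aliases are its prefixes. The sketch below reproduces that relationship from the values shown on this page. It is an observation about this one record, not a documented derivation rule, and the variable names are chosen here for illustration.

```python
import base64

# Canonical hash as shown under "Canonical hash" on this page.
canonical_hash = "1a86f1913912659d93396ab7297952bf2fdfd956303e0c2561b32fe77f3fc658"

# Assumption: take the first 16 bytes (128 bits) and Base32-encode without padding.
first_16_bytes = bytes.fromhex(canonical_hash)[:16]
pith_number = base64.b32encode(first_16_bytes).decode("ascii").rstrip("=")

# The short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}

print(pith_number)  # DKDPDEJZCJSZ3EZZNK3SS6KSX4
print(aliases)
```

Under that reading, a shorter alias trades collision resistance for brevity: each Base32 character carries 5 bits, so `pith_short_8` covers only the first 40 bits of the hash.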
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/DKDPDEJZCJSZ3EZZNK3SS6KSX4 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 1a86f1913912659d93396ab7297952bf2fdfd956303e0c2561b32fe77f3fc658
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "463dda1f1838b2c33c7252b5c50d3bd5fe685730734d9704d311019b476439c5",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.DB",
    "submitted_at": "2026-05-15T15:46:38Z",
    "title_canon_sha256": "398f2b7a9a0b3ab1e31e50c67f8c3826007d592eb003fff7ab817fefde305847"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16085",
    "kind": "arxiv",
    "version": 1
  }
}
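
The hashing step of the verification command above can also be written as a standalone Python function, which makes the canonicalization explicit: keys sorted, no whitespace between separators, non-ASCII characters left unescaped, UTF-8 bytes. The sketch applies it to the canonical record printed above; the function name `canonical_sha256` is introduced here for illustration. Compare the printed digest with the value listed under "Canonical hash" rather than trusting this snippet on its own.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same serialization options as the shell one-liner."""
    blob = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "463dda1f1838b2c33c7252b5c50d3bd5fe685730734d9704d311019b476439c5",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.DB",
        "submitted_at": "2026-05-15T15:46:38Z",
        "title_canon_sha256": "398f2b7a9a0b3ab1e31e50c67f8c3826007d592eb003fff7ab817fefde305847",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.16085", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)
```

Because the keys are sorted before hashing, two mirrors that store the same record with different key order or indentation still compute the same digest.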