Leveraging Graph Structure in Seq2Seq Models for Knowledge Graph Link Prediction

Evgeny Kharlamov; Jingcheng Wu; Luu Huu Phuc; Mojtaba Nayyeri; Ratan Bahadur Thapa; Steffen Staab

arxiv: 2605.18211 · v1 · pith:L4LM5T2Znew · submitted 2026-05-18 · 💻 cs.CL · cs.AI

Luu Huu Phuc, Ratan Bahadur Thapa, Mojtaba Nayyeri, Jingcheng Wu, Evgeny Kharlamov, Steffen Staab

Pith reviewed 2026-05-20 10:40 UTC · model grok-4.3

keywords: knowledge graphs, link prediction, sequence-to-sequence models, graph attention networks, T5, multi-hop patterns, relational embeddings, CoDEx dataset

The pith

GA-S2S improves knowledge graph link prediction by jointly encoding text and full k-hop subgraph topology with RGAT.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Existing sequence-to-sequence models for knowledge graph link prediction rely on textual descriptions or flatten neighborhoods into linear sequences, which discards graph structure. The paper proposes GA-S2S to address this by integrating a T5-small encoder-decoder with a Relational Graph Attention Network. This setup processes both textual features and the complete k-hop subgraph surrounding query entities through relation-aware embeddings. The result is better capture of multi-hop relational patterns. On the CoDEx dataset, the approach yields up to 19 percent relative gains over competitive baselines.

Core claim

GA-S2S jointly encodes both textual features and the full k-hop subgraph topology surrounding the query entity. By integrating raw encoder outputs with RGAT's relation-aware embeddings, the model captures and leverages richer multi-hop relational patterns and textual information to improve link prediction accuracy.
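The source does not say how the encoder outputs and the RGAT embeddings are combined, so the following is only a minimal sketch under stated assumptions: one relation-aware attention step in the spirit of RGAT (simplified to a single head with a softmax over all incoming edges), followed by a residual sum as the fusion. The names `relational_attention` and `fuse`, the toy vectors, and the residual fusion itself are hypothetical and are not taken from the paper.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(z, slope=0.2):
    return z if z > 0 else slope * z

def relational_attention(h, edges, W_rel, a):
    """One simplified relation-aware attention step.

    h:      node -> feature vector
    edges:  (source, relation, target) triples
    W_rel:  relation -> weight matrix (one projection per relation)
    a:      attention vector applied to [target view || source message]
    """
    incoming = {}
    for src, rel, dst in edges:
        q = matvec(W_rel[rel], h[dst])  # relation-specific view of the target
        m = matvec(W_rel[rel], h[src])  # relation-specific message from the source
        score = leaky_relu(sum(ai * v for ai, v in zip(a, q + m)))
        incoming.setdefault(dst, []).append((score, m))

    out = {}
    for dst, items in incoming.items():
        top = max(s for s, _ in items)  # subtract the max for a stable softmax
        weights = [math.exp(s - top) for s, _ in items]
        total = sum(weights)
        dim = len(items[0][1])
        out[dst] = [
            sum(w / total * m[k] for w, (_, m) in zip(weights, items))
            for k in range(dim)
        ]
    return out

def fuse(encoder_vec, graph_vec):
    """Assumed fusion (not confirmed by the source): residual sum."""
    return [e + g for e, g in zip(encoder_vec, graph_vec)]

# Toy data, purely illustrative.
identity = [[1.0, 0.0], [0.0, 1.0]]
h = {"query": [1.0, 0.0], "n1": [0.0, 1.0], "n2": [1.0, 1.0]}
edges = [("n1", "r1", "query"), ("n2", "r2", "query")]
graph_emb = relational_attention(
    h, edges, {"r1": identity, "r2": identity}, a=[0.0, 0.0, 1.0, 0.0]
)
fused = fuse(h["query"], graph_emb["query"])
print(fused)
```

Because each edge carries its own relation projection, two neighbours at the same distance can contribute differently depending on the relation that links them, which is the property a flattened sequence has to relearn from token order.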

What carries the argument

The Graph-Augmented Sequence-to-Sequence (GA-S2S) framework that combines T5 encoder-decoder outputs with Relational Graph Attention Network processing of k-hop subgraphs to retain graph topology.
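How the k-hop subgraph is built is not specified in the material above. As a rough illustration only, one common way to obtain such a neighbourhood is a breadth-first search that ignores edge direction and keeps every triple whose two endpoints fall within k hops of the query entity. The function name `k_hop_subgraph` and the toy triples are hypothetical.

```python
from collections import deque

def k_hop_subgraph(triples, query_entity, k):
    """Return (distances, kept_triples) for the k-hop neighbourhood.

    Edge direction is ignored when measuring distance (an assumption);
    a triple is kept when both of its endpoints lie within k hops.
    """
    neighbors = {}
    for head, _rel, tail in triples:
        neighbors.setdefault(head, set()).add(tail)
        neighbors.setdefault(tail, set()).add(head)

    dist = {query_entity: 0}
    queue = deque([query_entity])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand past the k-th hop
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)

    kept = [(h, r, t) for h, r, t in triples if h in dist and t in dist]
    return dist, kept

# Toy knowledge graph, purely illustrative.
triples = [
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "member_of", "EU"),
    ("EU", "headquartered_in", "Brussels"),
    ("Paris", "capital_of", "France"),
]
dist, kept = k_hop_subgraph(triples, "Berlin", k=2)
print(dist, kept)
```

Unlike a flattened neighbour list, the returned triples still record which entity connects to which, so a graph layer can pass messages along the actual edges.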

If this is right

Link prediction models benefit from explicit multi-hop graph topology instead of flattening neighborhoods into sequences.
Relation-aware embeddings from RGAT add value when combined with textual encoder outputs for structured prediction.
The reported result, a relative gain of up to 19 percent on CoDEx, suggests that hybrid text-graph encoding can raise link prediction accuracy.
Seq2seq architectures for knowledge graphs can be extended to handle full subgraph structure without losing relational connections.
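A relative gain is measured against the baseline score, not in absolute points, and the two are easy to confuse. The short sketch below shows the arithmetic with made-up scores; the values 0.300 and 0.357 are hypothetical and do not come from the paper, which reports only the relative figure here.

```python
def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, as a fraction."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline

# Hypothetical scores, chosen only to illustrate the arithmetic.
baseline_score = 0.300
improved_score = 0.357

absolute = improved_score - baseline_score
relative = relative_gain(baseline_score, improved_score)
print(f"absolute gain: {absolute:.3f}, relative gain: {relative:.1%}")
```

With a low baseline, a 19 percent relative gain can correspond to only a few absolute points, which is one reason the baseline scores matter when reading the headline number.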

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same augmentation strategy could be tested on other graph reasoning tasks such as multi-hop question answering over knowledge graphs.
Larger values of k or alternative graph neural networks might yield further gains or reveal limits in subgraph size.
This hybrid approach points toward combining graph structure with larger language models for more general structured reasoning problems.

Load-bearing premise

That RGAT applied to the k-hop subgraph on top of T5 outputs will extract additional useful relational patterns beyond those already present in text or flattened sequences.

What would settle it

An experiment on CoDEx where a T5 seq2seq model without the RGAT subgraph component achieves equivalent accuracy to GA-S2S would falsify the claim that the graph topology integration drives the gains.

Figures

Figures reproduced from arXiv: 2605.18211 by Evgeny Kharlamov, Jingcheng Wu, Luu Huu Phuc, Mojtaba Nayyeri, Ratan Bahadur Thapa, Steffen Staab.

**Figure 1.** The proposed model architecture integrates an RGAT module between the encoder and decoder of the T5-small model, enabling the fusion of textual and structural information from KGs. All components of the T5 model and the RGAT module are fully trainable and initialized from scratch, and no pre-trained weights are frozen during training. Input Representation. Given a query 𝑞 = (𝑒, 𝑟, ?), the input to our mode…

Original abstract

We introduce Graph-Augmented Sequence-to-Sequence (GA-S2S), a novel framework that integrates a T5-small encoder-decoder with a Relational Graph Attention Network (RGAT) to improve link prediction in knowledge graphs. While existing Seq2Seq models rely solely on surface-level textual descriptions of entities and relations and, at best, flatten the neighborhoods of a query entity into a single linear sequence, thereby discarding the inherent graph structure, GA-S2S jointly encodes both textual features and the full k-hop subgraph topology surrounding the query entity. By integrating raw encoder outputs with RGAT's relation-aware embeddings, our model captures and leverages richer multi-hop relational patterns and textual information. Our preliminary experiments on the CoDEx dataset demonstrate that GA-S2S outperforms competitive Seq2Seq-based baseline models, achieving up to a 19% relative gain in link prediction accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

GA-S2S adds RGAT to a T5 seq2seq setup for KG link prediction and claims a 19% gain on CoDEx, but the fusion details and experimental support remain too thin to judge the source of the improvement.

The main point is that this paper takes a T5 encoder-decoder for knowledge graph link prediction and adds an RGAT component to handle the full k-hop subgraph around the query entity instead of flattening it into a sequence. They report up to a 19% relative gain over competitive baselines on the CoDEx dataset in preliminary runs. The motivation is straightforward: existing seq2seq models lose relational structure when they linearize neighborhoods, so feeding graph topology through relation-aware attention is a logical next step. The work does a clean job laying out that limitation and showing how the combined model can in principle capture both text and multi-hop patterns. The citations to T5 and RGAT are appropriate and the overall framing stays grounded in prior seq2seq KG work.

The soft spots sit in the architecture description and the results. The paper says it integrates raw encoder outputs with RGAT's relation-aware embeddings, yet it does not spell out the precise fusion point, whether RGAT runs on per-token embeddings or pooled entity representations, how the k-hop subgraph is built and aligned, or the depth of message passing. Without that level of detail or an ablation that isolates the graph contribution from capacity or training differences, the 19% number is hard to attribute to the structural encoding. The experiments are labeled preliminary, so there are no error bars, significance tests, or multiple-run statistics.

This is an incremental combination rather than a first-principles advance, which is fine as long as the claims match the evidence. The paper is aimed at researchers already working on language-model approaches to knowledge graph completion who want to add graph structure without abandoning the seq2seq paradigm. A reader looking for practical extensions might find usable ideas once the missing pieces are filled in.

I would send it to peer review after they clarify the integration mechanism and add ablations plus basic statistical reporting; the core idea is coherent enough to be worth referee time.

Referee Report

2 major / 1 minor

Summary. The paper introduces Graph-Augmented Sequence-to-Sequence (GA-S2S), which augments a T5-small encoder-decoder with a Relational Graph Attention Network (RGAT). It claims that prior Seq2Seq models discard graph structure by flattening k-hop neighborhoods into linear sequences, whereas GA-S2S jointly encodes textual features and the full k-hop subgraph topology around the query entity by integrating raw T5 encoder outputs with RGAT's relation-aware embeddings, yielding up to a 19% relative gain in link-prediction accuracy on the CoDEx dataset.

Significance. If the integration mechanism is shown to extract multi-hop relational patterns that linear flattening cannot capture, the work would offer a concrete way to inject graph topology into text-based Seq2Seq models for KG completion. The choice of RGAT is appropriate for relation-aware message passing, and the reported gain, if reproducible with proper controls, would be a useful empirical signal for the community.

major comments (2)

[Abstract] Abstract: the central claim that GA-S2S 'jointly encodes both textual features and the full k-hop subgraph topology' and 'captures richer multi-hop relational patterns' rests on an integration step whose mechanism is never specified. The text states only that the model 'integrates raw encoder outputs with RGAT's relation-aware embeddings' without describing (a) whether RGAT receives per-token T5 embeddings, pooled entity representations, or a separately constructed graph view, (b) how the k-hop subgraph is extracted and aligned with the textual input, or (c) the number of RGAT layers or message-passing hops. This omission is load-bearing for the claim that topology is actually leveraged beyond prior flattening approaches.
[Abstract] Abstract / Experiments section: the reported 'up to a 19% relative gain' is presented without any description of the Seq2Seq baselines, training protocol, evaluation metric (Hits@K, MRR, etc.), statistical significance tests, error bars, or ablation isolating the RGAT component. Because the result is the sole empirical support for the architecture, the absence of these details prevents assessment of whether the gain stems from structural encoding or from capacity/training differences.
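The metrics named in the second comment, MRR and Hits@K, are both computed from the rank that the model assigns to the correct entity for each test query. The sketch below shows the standard definitions; which metric the paper actually reports is not stated above, and the rank list is made up for illustration.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries (ranks start at 1)."""
    if not ranks or any(r < 1 for r in ranks):
        raise ValueError("ranks must be a non-empty list of integers >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)

def hits_at_k(ranks, k):
    """Fraction of test queries whose correct answer is ranked in the top k."""
    if not ranks or any(r < 1 for r in ranks):
        raise ValueError("ranks must be a non-empty list of integers >= 1")
    return sum(1 for r in ranks if r <= k) / len(ranks)

# Hypothetical ranks of the correct tail entity for four test queries.
ranks = [1, 3, 2, 10]
print(mean_reciprocal_rank(ranks), hits_at_k(ranks, 1), hits_at_k(ranks, 3))
```

Since a "19% relative gain" can look very different under Hits@1 than under MRR, stating the metric is part of making the number interpretable.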

minor comments (1)

[Abstract] The abstract refers to 'preliminary experiments' yet supplies no dataset statistics, hyper-parameter settings, or hardware details; these should be added for reproducibility even in a short paper.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and will revise the manuscript to add the missing details on architecture and experiments.

Referee: [Abstract] Abstract: the central claim that GA-S2S 'jointly encodes both textual features and the full k-hop subgraph topology' and 'captures richer multi-hop relational patterns' rests on an integration step whose mechanism is never specified. The text states only that the model 'integrates raw encoder outputs with RGAT's relation-aware embeddings' without describing (a) whether RGAT receives per-token T5 embeddings, pooled entity representations, or a separately constructed graph view, (b) how the k-hop subgraph is extracted and aligned with the textual input, or (c) the number of RGAT layers or message-passing hops. This omission is load-bearing for the claim that topology is actually leveraged beyond prior flattening approaches.

Authors: We agree that the integration mechanism is not described in sufficient detail in the current manuscript. We will revise the abstract and add a dedicated Model Architecture subsection to specify how the T5 encoder outputs are combined with RGAT, how the k-hop subgraph is extracted and aligned with the textual sequence, and the number of RGAT layers used. This will clarify the distinction from prior flattening approaches.

revision: yes

Referee: [Abstract] Abstract / Experiments section: the reported 'up to a 19% relative gain' is presented without any description of the Seq2Seq baselines, training protocol, evaluation metric (Hits@K, MRR, etc.), statistical significance tests, error bars, or ablation isolating the RGAT component. Because the result is the sole empirical support for the architecture, the absence of these details prevents assessment of whether the gain stems from structural encoding or from capacity/training differences.

Authors: We agree that the experimental reporting is incomplete. We will expand the Experiments section to fully describe the Seq2Seq baselines, training protocol, evaluation metrics, statistical significance tests, error bars from multiple runs, and an ablation isolating the RGAT component. These additions will allow proper assessment of the source of the reported gains.

revision: yes
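The multi-run reporting promised in this response could take a form like the following sketch: each configuration is trained with several random seeds and summarized by its mean and sample standard deviation. The scores and the names `full_runs` and `ablated_runs` are hypothetical, and comparing means with their spread is a first look only, not a substitute for a significance test.

```python
import statistics

def summarize_runs(scores):
    """Return (mean, sample standard deviation) over repeated runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate spread")
    return statistics.mean(scores), statistics.stdev(scores)

# Hypothetical scores from three seeds per configuration.
full_runs = [0.350, 0.360, 0.355]     # model with the graph component
ablated_runs = [0.300, 0.310, 0.305]  # same model with the graph component removed

full_mean, full_std = summarize_runs(full_runs)
abl_mean, abl_std = summarize_runs(ablated_runs)
print(f"full:    {full_mean:.3f} +/- {full_std:.3f}")
print(f"ablated: {abl_mean:.3f} +/- {abl_std:.3f}")
print(f"difference in means: {full_mean - abl_mean:.3f}")
```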

Circularity Check

0 steps flagged

No circularity: empirical model comparison on external benchmark

The paper introduces GA-S2S as an architectural integration of T5 encoder outputs with RGAT on k-hop subgraphs and reports relative accuracy gains on the external CoDEx dataset. No derivation chain, equations, or fitted parameters are presented that reduce by construction to the model's own inputs or prior self-citations. The central claim rests on empirical outperformance rather than any self-referential prediction or uniqueness theorem, making the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper introduces no new free parameters, axioms, or invented entities beyond standard neural network training assumptions and the existing T5 and RGAT architectures; it relies on the CoDEx dataset and conventional link prediction evaluation protocols.

axioms (1)

Domain assumption: Standard assumptions in neural network training and evaluation for link prediction tasks hold, including that the chosen dataset and metrics reflect real-world utility.
Invoked implicitly when claiming superiority on CoDEx without discussing potential dataset biases or metric limitations.

pith-pipeline@v0.9.0 · 5701 in / 1259 out tokens · 45317 ms · 2026-05-20T10:40:01.903076+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Passage: "GA-S2S jointly encodes both textual features and the full k-hop subgraph topology surrounding the query entity... By integrating raw encoder outputs with RGAT's relation-aware embeddings"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Passage: "We introduce Graph-Augmented Sequence-to-Sequence (GA-S2S)... outperforms competitive Seq2Seq-based baseline models, achieving up to a 19% relative gain"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · 1 internal anchor

[1] H. Zhou, L. Halilaj, S. Monka, S. Schmid, Y. Zhu, J. Wu, N. Nazer, S. Staab, Seeing and knowing in the wild: Open-domain visual entity recognition with large-scale knowledge graphs via contrastive learning, in: AAAI, AAAI Press, 2026, pp. 13638–13646. doi:10.1609/AAAI.V40I16.38370

[2] Z. Ding, J. Wu, J. Wu, Y. Xia, B. Xiong, V. Tresp, Temporal fact reasoning over hyper-relational knowledge graphs, in: EMNLP (Findings), Findings of ACL, Association for Computational Linguistics, 2024, pp. 355–373. doi:10.18653/V1/2024.FINDINGS-EMNLP.20

[3] Y. Zhu, J. Wu, Y. Wang, H. Zhou, J. Chen, E. Kharlamov, S. Staab, Certainty in uncertainty: Reasoning over uncertain knowledge graphs with statistical guarantees, in: EMNLP, Association for Computational Linguistics, 2025, pp. 8730–8752. doi:10.18653/V1/2025.EMNLP-MAIN.441

[4] X. Dong, E. Gabrilovich, G. Heitz, W. Horn, N. Lao, K. Murphy, T. Strohmann, S. Sun, W. Zhang, Knowledge vault: A web-scale approach to probabilistic knowledge fusion, in: Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, 2014, pp. 601–610

[5] S. Razniewski, F. Suchanek, W. Nutt, But what do we actually know?, in: Proceedings of the 5th Workshop on Automated Knowledge Base Construction, 2016, pp. 40–44

[6] S. M. Kazemi, D. Poole, Simple embedding for link prediction in knowledge graphs, Advances in neural information processing systems 31 (2018)

[7] L. Getoor, B. Taskar, Introduction to statistical relational learning, MIT press, 2007

[8] A. Rossi, D. Barbosa, D. Firmani, A. Matinata, P. Merialdo, Knowledge graph embedding for link prediction: A comparative analysis, ACM Trans. Knowl. Discov. Data 15 (2021). URL: https://doi.org/10.1145/3424672. doi:10.1145/3424672

[9] Z. Ye, Y. J. Kumar, G. O. Sing, F. Song, J. Wang, A comprehensive survey of graph neural networks for knowledge graphs, IEEE Access 10 (2022) 75729–75741. doi:10.1109/ACCESS.2022.3191784

[10] A. Bordes, N. Usunier, A. Garcia-Durán, J. Weston, O. Yakhnenko, Translating embeddings for modeling multi-relational data, in: Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2, NIPS’13, Curran Associates Inc., Red Hook, NY, USA, 2013, p. 2787–2795

[11] M. Nickel, V. Tresp, H.-P. Kriegel, A three-way model for collective learning on multi-relational data, in: Proceedings of the 28th International Conference on International Conference on Machine Learning, ICML’11, Omnipress, Madison, WI, USA, 2011, p. 809–816

[12] T. Trouillon, J. Welbl, S. Riedel, E. Gaussier, G. Bouchard, Complex embeddings for simple link prediction, in: M. F. Balcan, K. Q. Weinberger (Eds.), Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, PMLR, New York, New York, USA, 2016, pp. 2071–2080. URL: https://proceedings.mlr.p...

[13] M. Schlichtkrull, T. N. Kipf, P. Bloem, R. van den Berg, I. Titov, M. Welling, Modeling relational data with graph convolutional networks, in: A. Gangemi, R. Navigli, M.-E. Vidal, P. Hitzler, R. Troncy, L. Hollink, A. Tordai, M. Alam (Eds.), The Semantic Web, Springer International Publishing, Cham, 2018, pp. 593–607

[14] D. Busbridge, D. Sherburn, P. Cavallo, N. Y. Hammerla, Relational graph attention networks, CoRR abs/1904.05811 (2019). URL: http://arxiv.org/abs/1904.05811. arXiv:1904.05811

[15] S. Vashishth, S. Sanyal, V. Nitin, P. Talukdar, Composition-based multi-relational graph convolutional networks, in: International Conference on Learning Representations, 2020. URL: https://openreview.net/forum?id=BylA_C4tPr

[16] F. Lu, P. Cong, X. Huang, Utilizing textual information in knowledge graph embedding: A survey of methods and applications, IEEE Access 8 (2020) 92072–92088. doi:10.1109/ACCESS.2020.2995074

[17] A. Saxena, A. Kochsiek, R. Gemulla, Sequence-to-sequence knowledge graph completion and question answering, in: S. Muresan, P. Nakov, A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Association for Computational Linguistics, Dublin, Ireland, 2022, pp. 2814–2828. URL:... doi:10.18653/v1/2022.acl-long.201

[18] C. Chen, Y. Wang, B. Li, K.-Y. Lam, Knowledge is flat: A Seq2Seq generative framework for various knowledge graph completion, in: N. Calzolari, C.-R. Huang, H. Kim, J. Pustejovsky, L. Wanner, K.-S. Choi, P.-M. Ryu, H.-H. Chen, L. Donatelli, H. Ji, S. Kurohashi, P. Paggio, N. Xue, S. Kim, Y. Hahm, Z. He, T. K. Lee, E. Santus, F. Bond, S.-H. Na (Eds.), Proc...

[19] A. Kochsiek, A. Saxena, I. Nair, R. Gemulla, Friendly neighbors: Contextualized sequence-to-sequence link prediction, in: B. Can, M. Mozes, S. Cahyawijaya, N. Saphra, N. Kassner, S. Ravfogel, A. Ravichander, C. Zhao, I. Augenstein, A. Rogers, K. Cho, E. Grefenstette, L. Voita (Eds.), Proceedings of the 8th Workshop on Representation Learning for NLP (R... doi:10.18653/v1/2023.repl4nlp-1.11

[20] B. Liu, M. Peng, W. Xu, X. Jia, M. Peng, Unilp: Unified topology-aware generative framework for link prediction in knowledge graph, in: Proceedings of the ACM Web Conference 2024, WWW ’24, Association for Computing Machinery, New York, NY, USA, 2024, p. 2170–2180. URL: https://doi.org/10.1145/3589334.3645592. doi:10.1145/3589334.3645592

[21] X. Wang, T. Gao, Z. Zhu, Z. Zhang, Z. Liu, J. Li, J. Tang, KEPLER: A unified model for knowledge embedding and pre-trained language representation, Transactions of the Association for Computational Linguistics 9 (2021) 176–194. URL: https://aclanthology.org/2021.tacl-1.11/. doi:10.1162/tacl_a_00360

[22] W. Hu, M. Fey, H. Ren, M. Nakata, Y. Dong, J. Leskovec, Ogb-lsc: A large-scale challenge for machine learning on graphs, arXiv preprint arXiv:2103.09430 (2021)

[23] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, P. J. Liu, Exploring the limits of transfer learning with a unified text-to-text transformer, J. Mach. Learn. Res. 21 (2020)

[24] Z. Ding, Y. Li, Y. He, A. Norelli, J. Wu, V. Tresp, M. M. Bronstein, Y. Ma, Dygmamba: Efficiently modeling long-term temporal dependency on continuous-time dynamic graphs with state space models, Trans. Mach. Learn. Res. 2025 (2025). URL: https://openreview.net/forum?id=sq5AJvVuha

[25] T. Safavi, D. Koutra, CoDEx: A Comprehensive Knowledge Graph Completion Benchmark, in: B. Webber, T. Cohn, Y. He, Y. Liu (Eds.), Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, Online, 2020, pp. 8328–8350. URL: https://aclanthology.org/2020.emnlp-main.669/. doi: 10....

[26] M. Fey, J. E. Lenssen, Fast graph representation learning with PyTorch Geometric, in: ICLR Workshop on Representation Learning on Graphs and Manifolds, 2019

[27] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jernite, J. Plu, C. Xu, T. Le Scao, S. Gugger, M. Drame, Q. Lhoest, A. Rush, Transformers: State-of-the-art natural language processing, in: Q. Liu, D. Schlangen (Eds.), Proceedings of the 2020 Confere... doi:10.18653/v1/2020.emnlp-demos.6

[28] I. Balazevic, C. Allen, T. Hospedales, TuckER: Tensor factorization for knowledge graph completion, in: K. Inui, J. Jiang, V. Ng, X. Wan (Eds.), Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Association for Computational Lingu... doi:10.18653/v1/d19-1522

[29] J. Yang, Z. Liu, S. Xiao, C. Li, D. Lian, S. Agrawal, A. Singh, G. Sun, X. Xie, Graphformers: Gnn-nested transformers for representation learning on textual graph, in: M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, J. W. Vaughan (Eds.), Advances in Neural Information Processing Systems, volume 34, Curran Associates, Inc., 2021, pp. 28798–28810. URL: h...

[30] J. Liu, Q. Mao, W. Jiang, J. Li, Knowformer: revisiting transformers for knowledge graph reasoning, in: Proceedings of the 41st International Conference on Machine Learning, ICML’24, JMLR.org, 2024

[31] P. Nawrot, S. Tworkowski, M. Tyrolski, L. Kaiser, Y. Wu, C. Szegedy, H. Michalewski, Hierarchical transformers are more efficient language models, in: M. Carpuat, M.-C. de Marneffe, I. V. Meza Ruiz (Eds.), Findings of the Association for Computational Linguistics: NAACL 2022, Association for Computational Linguistics, Seattle, United States, 2022, pp. 155... doi:10.18653/v1/2022.findings-naacl.117
