Transactions of the Association for Computational Linguistics
4 Pith papers cite this work.
citing papers explorer
- EMERGE: A Benchmark for Updating Knowledge Graphs with Emerging Textual Knowledge
  EMERGE is a benchmark dataset of 233K Wikipedia passages paired with 1.45 million Wikidata edit operations across seven yearly snapshots from 2019 to 2025 for evaluating knowledge graph updates from emerging text.
- GS-Quant: Granular Semantic and Generative Structural Quantization for Knowledge Graph Completion
  GS-Quant generates coarse-to-fine discrete codes for KG entities via semantic hierarchy injection and causal sequence reconstruction, enabling LLMs to perform knowledge graph completion by treating the codes as vocabulary tokens.
- Leveraging Graph Structure in Seq2Seq Models for Knowledge Graph Link Prediction
  GA-S2S integrates T5 with RGAT to jointly process text and k-hop subgraph topology for knowledge graph link prediction, reporting up to 19% relative accuracy gain over seq2seq baselines on CoDEx.
- Gyan: An Explainable Neuro-Symbolic Language Model
  Gyan is a novel explainable non-transformer language model that achieves SOTA results on multiple datasets by mimicking human-like compositional context and world models.