EMERGE: A Benchmark for Updating Knowledge Graphs with Emerging Textual Knowledge

Daniel Daza; Edoardo Barba; Ira Assent; Klim Zaporojets; Paul Groth; Roberto Navigli

arxiv: 2507.03617 · v2 · submitted 2025-07-04 · 💻 cs.CL

Klim Zaporojets, Daniel Daza, Edoardo Barba, Ira Assent, Roberto Navigli, Paul Groth

Pith reviewed 2026-05-19 05:59 UTC · model grok-4.3

classification 💻 cs.CL

keywords knowledge graphs · Wikidata · Wikipedia · dataset · benchmark · knowledge update · emerging knowledge · text to graph

The pith

A benchmark dataset pairs 233K Wikipedia passages with 1.45 million Wikidata edits across seven yearly snapshots to test knowledge-graph updates from new text.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs a dataset that aligns evolving Wikipedia text with the precise add, delete, and update operations each passage would trigger on a particular snapshot of Wikidata. Traditional extraction methods pull facts from text without regard to what the graph already contains, so the new resource forces models to decide updates while taking the current graph state into account. If the alignment method works, researchers gain a large-scale testbed for training systems that keep structured knowledge current as facts emerge in unstructured sources. The resulting collection covers 233K passages and 1.45 million edits spanning 2019 to 2025, exposing concrete integration difficulties that static extraction pipelines do not address.

Core claim

The paper introduces a construction method that produces Wikidata snapshots at yearly intervals together with Wikipedia passages paired to the exact edit operations those passages induce on each snapshot. The resulting resource contains 233K aligned passages and 1.45 million edits over seven snapshots from 2019 to 2025 and is released as a public benchmark for the task of state-aware knowledge-graph updating.

What carries the argument

The alignment of each Wikipedia passage to the specific add, delete, or update operations it induces on a fixed Wikidata snapshot at a given year.
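
The idea of labeling edits relative to a fixed snapshot can be made concrete with a small sketch. This is an illustration only, not the authors' pipeline: the `Triple` and `Edit` types, the `derive_edits` function, the `single_valued` set of relations, and the rule that a new value for a single-valued relation counts as an update are all assumptions introduced here, and the entity and relation names are made up.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    relation: str
    obj: str


class Edit(NamedTuple):
    op: str                           # "add", "delete", "update", or "none"
    triple: Triple
    replaces: Optional[Triple] = None


def derive_edits(snapshot, asserted, retracted, single_valued):
    """Label text-derived triples relative to one KG snapshot (hypothetical rules).

    snapshot:      set of Triple already in the graph
    asserted:      triples the passage states as true
    retracted:     triples the passage states are no longer true
    single_valued: relations assumed to hold one object per subject
    """
    edits = []
    for t in asserted:
        if t in snapshot:
            # Already known: detecting it requires no change to the graph.
            edits.append(Edit("none", t))
            continue
        old = None
        if t.relation in single_valued:
            # Look for an existing value that the new triple would supersede.
            old = next((s for s in sorted(snapshot)
                        if s.subject == t.subject and s.relation == t.relation),
                       None)
        if old is not None:
            edits.append(Edit("update", t, replaces=old))
        else:
            edits.append(Edit("add", t))
    for t in retracted:
        if t in snapshot:
            edits.append(Edit("delete", t))
    return edits


# Made-up snapshot and passage content, for illustration only.
snapshot = {
    Triple("city_A", "mayor", "person_X"),
    Triple("city_A", "country", "country_B"),
}
asserted = [
    Triple("city_A", "mayor", "person_Y"),
    Triple("city_A", "country", "country_B"),
    Triple("person_Y", "member_of", "party_Z"),
]
edits = derive_edits(snapshot, asserted, retracted=[], single_valued={"mayor"})
for e in edits:
    print(e.op, e.triple, e.replaces)
```

The same extracted triple gets a different label depending on the snapshot it is compared against, which is the property that separates this task from extraction that ignores the graph state.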

If this is right

Models can now be trained and evaluated on the joint problem of extracting knowledge and deciding how it should modify an existing graph.
The benchmark reveals specific failure modes when new text contradicts or extends the current graph structure.
Yearly snapshots allow temporal studies of how update difficulty changes as the underlying graph grows.
Public release enables direct comparison of update strategies across research groups.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same snapshot-and-alignment technique could be applied to other large KGs to create comparable benchmarks without manual annotation.
Finer time granularity than yearly snapshots might expose short-term update patterns that the current dataset cannot capture.
The resource could support research on detecting when text implies a relation should be removed rather than added or updated.

Load-bearing premise

The edit operations that a Wikipedia passage would induce on a particular KG snapshot can be identified and labeled reliably enough to create aligned training pairs.

What would settle it

A controlled experiment in which models trained on the new dataset produce no higher accuracy or consistency when predicting required edits on held-out text-KG pairs than models trained only on standard information-extraction objectives.

Figures

Figures reproduced from arXiv: 2507.03617 by Daniel Daza, Edoardo Barba, Ira Assent, Klim Zaporojets, Paul Groth, Roberto Navigli.

**Figure 2.** Illustration of EMERGE creation pipeline. First, the …

**Figure 3.** The distribution of the TKGU operations defined in Section 3 in EMERGE. We also include detection of existing triples (X-Triples), which does not entail any operation on KG.

4.4 Dataset extension. EMERGE is an automatically constructed dataset, which we plan to extend using quarterly snapshots of Wikipedia and Wikidata, following the pipeline described in Section 4 and illustrated in …

Original abstract

Knowledge Graphs (KGs) are structured knowledge repositories containing entities and relations between them. In this paper, we study the problem of automatically updating KGs over time in response to evolving knowledge in unstructured textual sources. Addressing this problem requires identifying a wide range of update operations based on the state of an existing KG at a given time and the information extracted from text. This contrasts with traditional information extraction pipelines, which extract knowledge from text independently of the current state of a KG. To address this challenge, we propose a method for construction of a dataset consisting of Wikidata KG snapshots over time and Wikipedia passages paired with the corresponding edit operations that they induce in a particular KG snapshot. The resulting dataset comprises 233K Wikipedia passages aligned with a total of 1.45 million KG edits over 7 different yearly snapshots of Wikidata from 2019 to 2025. Our experimental results highlight key challenges in updating KG snapshots based on emerging textual knowledge, particularly in integrating knowledge expressed in text with the existing KG structure. These findings position the dataset as a valuable benchmark for future research. Our dataset and model implementations are publicly available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

EMERGE gives a sizable new benchmark pairing Wikipedia passages with induced edits on real Wikidata snapshots, but the labeling of those edits lacks reported validation.

The main thing here is a benchmark dataset that aligns 233K Wikipedia passages with 1.45 million specific edit operations across seven yearly Wikidata snapshots from 2019 to 2025. The construction pulls from public sources and focuses on updates that depend on the current state of the KG rather than extracting facts independently of it. That framing is a clear step beyond standard population tasks, and releasing the data plus code supports anyone who wants to test maintenance methods on actual temporal changes. The experiments, even at a high level, usefully flag integration issues when new text meets existing structure. This is the kind of resource that could help groups working on dynamic knowledge bases for search or AI systems.

The softer spot is the core labeling step. The paper needs to show how it extracts candidate facts from text, aligns them to entities and relations, and decides add/delete/update relative to each snapshot. Without precision or recall figures on that process, or some human spot-checks, errors in entity linking or missed implicit updates could run through the whole 1.45 million edits. The abstract and high-level description leave that part thin.

This is worth a serious referee. The scale and public release make it relevant for the subfield even if the construction details require more scrutiny in review. A reader building or evaluating KG update systems would get concrete value from the data once those steps are clearer.

Referee Report

2 major / 2 minor

Summary. The paper introduces EMERGE, a benchmark for updating knowledge graphs with emerging textual knowledge. It proposes a construction method that aligns 233K Wikipedia passages with 1.45 million induced edit operations (add/delete/update) across 7 yearly Wikidata snapshots (2019–2025), and presents experiments that highlight challenges in state-aware integration of textual knowledge with existing KG structure. The dataset and implementations are released publicly.

Significance. If the induced-edit labels prove reliable, the benchmark would be a valuable contribution for research on dynamic KG updating, as it supplies large-scale, temporally aligned text–edit pairs that explicitly condition on KG snapshot state. This goes beyond standard IE and supports evaluation of methods that must decide add/delete/update relative to current KG content. Public release and use of real Wikidata/Wikipedia sources are clear strengths.

major comments (2)

[Dataset construction] Dataset construction (abstract and §3): the procedure that extracts candidate facts from each Wikipedia passage, aligns them to Wikidata entities/relations, and labels the precise update type (add, delete, update) relative to a given yearly snapshot is presented without any precision/recall figures, human validation, or error analysis. Because the 1.45 M labeled edits are the core of the benchmark, lack of validation on this step is load-bearing for the claim that EMERGE is a usable resource.
[Experiments] Experiments (§4): results are described only at a high level as “highlighting key challenges.” Concrete details on the models or baselines tested, the exact metrics (e.g., edit-type accuracy, entity-linking F1), and quantitative evidence for the claimed difficulties would be needed to substantiate that the dataset exposes non-trivial problems.
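
To make the request for exact metrics concrete, here is a minimal sketch of two edit-level scores. It is not taken from the paper: the representation of an edit as an `(op, subject, relation, object)` tuple, the function names `edit_prf` and `edit_type_accuracy`, and the toy gold and predicted edits are all assumptions made for illustration.

```python
def edit_prf(gold, predicted):
    """Micro precision, recall, and F1 over sets of (op, subject, relation, object)."""
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)  # an edit counts only if operation and triple both match
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1


def edit_type_accuracy(gold, predicted):
    """Among triples present in both gold and prediction, share with the correct operation."""
    gold_op = {e[1:]: e[0] for e in gold}
    pred_op = {e[1:]: e[0] for e in predicted}
    shared = gold_op.keys() & pred_op.keys()
    if not shared:
        return 0.0
    return sum(gold_op[k] == pred_op[k] for k in shared) / len(shared)


# Toy edits, invented for this example.
gold = [("add", "a", "r", "b"), ("update", "c", "r", "d"), ("delete", "e", "r", "f")]
pred = [("add", "a", "r", "b"), ("add", "c", "r", "d"), ("add", "g", "r", "h")]

precision, recall, f1 = edit_prf(gold, pred)
accuracy = edit_type_accuracy(gold, pred)
print(precision, recall, f1, accuracy)
```

Reporting the two scores separately distinguishes a system that finds the right triples but mislabels the operation from one that misses the triples altogether.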

minor comments (2)

[Abstract] Abstract: reports dataset size and high-level construction but omits any mention of how edit operations are identified or validated; a single sentence on this point would improve clarity.
[Notation] Notation: ensure consistent terminology for “induced edit,” “update operation,” and “KG snapshot” across sections and figures.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below and indicate the changes we will make in the revision.

Referee: [Dataset construction] Dataset construction (abstract and §3): the procedure that extracts candidate facts from each Wikipedia passage, aligns them to Wikidata entities/relations, and labels the precise update type (add, delete, update) relative to a given yearly snapshot is presented without any precision/recall figures, human validation, or error analysis. Because the 1.45 M labeled edits are the core of the benchmark, lack of validation on this step is load-bearing for the claim that EMERGE is a usable resource.

Authors: We agree that explicit validation of the induced-edit labeling process is necessary to support the benchmark's usability. The construction in §3 relies on automated alignment between Wikipedia passages and Wikidata snapshots, but we did not report precision/recall or human validation in the submitted version. In the revision we will add a new subsection with human evaluation on a sampled subset of the 1.45 M edits, together with precision/recall figures for the fact extraction, alignment, and update-type labeling steps, plus a brief error analysis. revision: yes
Referee: [Experiments] Experiments (§4): results are described only at a high level as “highlighting key challenges.” Concrete details on the models or baselines tested, the exact metrics (e.g., edit-type accuracy, entity-linking F1), and quantitative evidence for the claimed difficulties would be needed to substantiate that the dataset exposes non-trivial problems.

Authors: We accept that the experimental results in §4 are summarized at too high a level. The current text focuses on qualitative observations of integration challenges. In the revised version we will expand this section to specify the models and baselines evaluated, report exact metrics including edit-type accuracy and entity-linking F1, and present quantitative tables and analysis that demonstrate the non-trivial difficulties the dataset reveals. revision: yes
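
The human evaluation promised in the first response could be summarized roughly as follows. This is a sketch under stated assumptions, not the authors' protocol: uniform random sampling, the Wilson score interval as the uncertainty estimate, the helper names `sample_for_review` and `wilson_interval`, and the sample size and count of correct labels are all hypothetical choices made here, not results.

```python
import math
import random


def sample_for_review(edit_ids, k, seed=0):
    """Draw k edit identifiers uniformly at random, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(edit_ids, k)


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = correct / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical setup: 500 of the 1.45 M edits go to annotators.
sampled = sample_for_review(range(1_450_000), k=500)

# Hypothetical outcome, NOT a reported result: annotators accept 460 of 500 labels.
low, high = wilson_interval(correct=460, n=500)
print(len(sampled), low, high)
```

An interval on a random sample bounds the label precision of the full set without annotating every edit. Recall of missed edits would need a separate design, since unlabeled edits never enter the sample.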

Circularity Check

0 steps flagged

No circularity: dataset built from external public sources

full rationale

The paper constructs its benchmark by aligning publicly available Wikidata snapshots (2019-2025) with Wikipedia passages and the edit operations those passages induce on each snapshot. This process draws on external, independently verifiable data rather than any fitted parameters, self-definitional loops, or load-bearing self-citations. No derivation step reduces to its own inputs by construction; the resulting 233K passages and 1.45M edits are outputs of an alignment procedure applied to outside sources, making the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on standard assumptions about the reliability of Wikidata as ground truth and the feasibility of mapping text to discrete edit operations; no free parameters or invented entities are introduced in the abstract.

axioms (2)

domain assumption: Wikidata snapshots accurately capture the state of the knowledge graph at each yearly point.
The dataset construction uses these snapshots as the baseline against which text-induced edits are defined.
domain assumption: Wikipedia passages can be aligned to produce identifiable and labelable edit operations on a given snapshot.
This alignment is the core step that creates the 233K text-edit pairs.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "We propose a method for construction of a dataset consisting of Wikidata KG snapshots over time and Wikipedia passages paired with the corresponding edit operations that they induce in a particular KG snapshot."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "five text-driven knowledge graph updating (TKGU) operations ... Emerging triples (E-Triples), Emerging entities and triples (EE-Triples), ... Deprecated triples (D-Triples)"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

81 extracted references · 81 canonical work pages · 1 internal anchor

[1]

URL: " 'urlintro :=

ENTRY address author booktitle chapter edition editor howpublished institution journal key month note number organization pages publisher school series title type volume year eprint doi pubmed url lastchecked label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block STRINGS urlintro eprinturl eprintpr...

work page
[2]

write newline

" write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...

work page
[3]

Oshin Agarwal, Heming Ge, Siamak Shakeri, and Rami Al-Rfou. 2021. https://doi.org/10.18653/v1/2021.naacl-main.278 Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training . In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technol...

work page doi:10.18653/v1/2021.naacl-main.278 2021
[4]

Jacqueline Aguilar, Charley Beller, Paul McNamee, Benjamin Van Durme, Stephanie Strassel, Zhiyi Song, and Joe Ellis. 2014. https://aclanthology.org/W14-2907.pdf A comparison of the events and relations across ace, ere, tac-kbp, and framenet annotation standards . In Proceedings of the 2nd Workshop on EVENTS: Definition, Detection, Coreference, and Represe...

work page 2014
[5]

Isabelle Augenstein, Mrinal Das, Sebastian Riedel, Lakshmi Vikraman, and Andrew McCallum. 2017. https://doi.org/10.18653/v1/S17-2091 Semeval 2017 task 10: Scienceie-extracting keyphrases and relations from scientific publications . In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 546--555

work page doi:10.18653/v1/s17-2091 2017
[6]

Zhen Bi, Jing Chen, Yinuo Jiang, Feiyu Xiong, Wei Guo, Huajun Chen, and Ningyu Zhang. 2024. https://dl.acm.org/doi/full/10.1145/3641850 Codekgc: Code language model for generative knowledge graph construction . ACM Transactions on Asian and Low-Resource Language Information Processing, 23(3):1--16

work page doi:10.1145/3641850 2024
[7]

Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. https://doi.org/10.1145/1376616.1376746 Freebase: a collaboratively created graph database for structuring human knowledge . In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 1247--1250

work page doi:10.1145/1376616.1376746 2008
[8]

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. https://proceedings.neurips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html Translating embeddings for modeling multi-relational data . In Advances in neural information processing systems, pages 2787--2795

work page 2013
[9]

Elizabeth Boschee, Jennifer Lautenschlager, Sean O’Brien, Steve Shellman, James Starz, and Michael Ward. 2015. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/28075 ICEWS coded event data . Harvard Dataverse, 12

work page doi:10.7910/dvn/28075 2015
[10]

Pere-Llu \' s Huguet Cabot and Roberto Navigli. 2021. https://doi.org/10.18653/v1/2021.findings-emnlp.204 REBEL : Relation extraction by end-to-end language generation . In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2370--2381

work page doi:10.18653/v1/2021.findings-emnlp.204 2021
[11]

Arun Chaganty, Ashwin Paranjape, Percy Liang, and Christopher D Manning. 2017. https://doi.org/10.18653/v1/D17-1109 Importance sampling for unbiased on-demand evaluation of knowledge base population . In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1038--1048

work page doi:10.18653/v1/d17-1109 2017
[12]

Nancy Chinchor and Elaine Marsh. 1998. https://catalog.ldc.upenn.edu/docs/LDC2001T02/guidelines.IEtask42.ps Muc-7 information extraction task definition . In Proceeding of the 1998 Message Understanding Conference (MUC-7), pages 359--367

work page 1998
[13]

John Dagdelen, Alexander Dunn, Sanghoon Lee, Nicholas Walker, Andrew S Rosen, Gerbrand Ceder, Kristin A Persson, and Anubhav Jain. 2024. https://doi.org/https://doi.org/10.1038/s41467-024-45563-x Structured information extraction from scientific text with large language models . Nature Communications, 15(1):1418

work page doi:10.1038/s41467-024-45563-x 2024
[14]

Shib Sankar Dasgupta, Swayambhu Nath Ray, and Partha Talukdar. 2018. https://doi.org/10.18653/v1/D18-1225 Hyte: Hyperplane-based temporally aware knowledge graph embedding . In Proceedings of the 2018 conference on empirical methods in natural language processing, pages 2001--2011

work page doi:10.18653/v1/d18-1225 2018
[15]

Daniel Daza, Michael Cochez, and Paul Groth. 2021. https://doi.org/10.1145/3442381.3450141 Inductive entity representations from text via link prediction . In Proceedings of the Web Conference 2021, pages 798--808

work page doi:10.1145/3442381.3450141 2021
[16]

Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. https://doi.org/https://doi.org/10.1609/aaai.v32i1.11573 Convolutional 2d knowledge graph embeddings . In Proceedings of the AAAI conference on artificial intelligence, volume 32

work page doi:10.1609/aaai.v32i1.11573 2018
[17]

Bhuwan Dhingra, Jeremy R Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, and William W Cohen. 2022. https://doi.org/10.1162/tacl_a_00459 Time-aware language models as temporal knowledge bases . Transactions of the Association for Computational Linguistics, 10:257--273

work page doi:10.1162/tacl_a_00459 2022
[18]

Bayu Distiawan, Gerhard Weikum, Jianzhong Qi, and Rui Zhang. 2019. https://doi.org/10.18653/v1/P19-1023 Neural relation extraction for knowledge base enrichment . In Proceedings of the 2019 Conference of the Association for Computational Linguistics, pages 229--240

work page doi:10.18653/v1/p19-1023 2019
[19]

Hady Elsahar, Pavlos Vougiouklis, Arslen Remaci, Christophe Gravier, Jonathon Hare, Elena Simperl, and Frederique Laforest. 2019. https://aclanthology.org/L18-1544.pdf T-rex: A large scale alignment of natural language with knowledge base triples

work page 2019
[20]

Evgeniy Gabrilovich, Michael Ringgaard, and Amarnag Subramanya. 2013. https://lemurproject.org/clueweb12/FACC1/ Facc1: Freebase annotation of clueweb corpora

work page 2013
[21]

Luis Gal \'a rraga, Geremy Heitz, Kevin Murphy, and Fabian M Suchanek. 2014. https://doi.org/10.1145/2661829.2662073 Canonicalizing open knowledge bases . In Proceedings of the 23rd acm international conference on conference on information and knowledge management, pages 1679--1688

work page doi:10.1145/2661829.2662073 2014
[22]

Tianyu Gao, Xu Han, Hao Zhu, Zhiyuan Liu, Peng Li, Maosong Sun, and Jie Zhou. 2019. https://doi.org/10.18653/v1/D19-1649 Fewrel 2.0: Towards more challenging few-shot relation classification . In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing ...

work page doi:10.18653/v1/d19-1649 2019
[23]

Claire Gardent, Anastasia Shimorina, Shashi Narayan, and Laura Perez-Beltrachini. 2017. https://doi.org/10.18653/v1/W17-3518 The webnlg challenge: Generating text from rdf data . In Proceedings of the 10th International Conference on Natural Language Generation, pages 124--133

work page doi:10.18653/v1/w17-3518 2017
[24]

Saibo Geng, Martin Josifoski, Maxime Peyrard, and Robert West. 2023. https://doi.org/10.18653/v1/2023.emnlp-main.674 Grammar-constrained decoding for structured nlp tasks without finetuning . In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 10932--10952

work page doi:10.18653/v1/2023.emnlp-main.674 2023
[25]

Xu Han, Hao Zhu, Pengfei Yu, Ziyun Wang, Yuan Yao, Zhiyuan Liu, and Maosong Sun. 2018. https://doi.org/10.18653/v1/D18-1514 FewRel : A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), pages 4803--4809

work page doi:10.18653/v1/d18-1514 2018
[26]

Iris Hendrickx, Su Nam Kim, Zornitsa Kozareva, Preslav Nakov, Diarmuid \'O S \'e aghdha, Sebastian Pad \'o , Marco Pennacchiotti, Lorenza Romano, and Stan Szpakowicz. 2010. https://aclanthology.org/S10-1006.pdf Semeval-2010 task 8: Multi-way classification of semantic relations between pairs of nominals . In Proceedings of the 5th International Workshop o...

work page 2010
[27]

Joel Jang, Seonghyeon Ye, Changho Lee, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, and Minjoon Seo. 2022 a . https://doi.org/10.18653/v1/2022.emnlp-main.418 T emporal W iki: A lifelong benchmark for training and evaluating ever-evolving language models . In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, ...

work page doi:10.18653/v1/2022.emnlp-main.418 2022
[28]

Joel Jang, Seonghyeon Ye, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Stanley Jungkyu Choi, and Minjoon Seo. 2022 b . https://openreview.net/forum?id=vfsRB5MImo9 Towards continual knowledge learning of language models . In ICLR

work page 2022
[29]

Heng Ji, Ralph Grishman, Hoa Trang Dang, Kira Griffitt, and Joe Ellis. 2010. https://blender.cs.illinois.edu/paper/kbp2010overview.pdf Overview of the TAC 2010 knowledge base population track . In Proceedings of the 2010 Text Analysis Conference (TAC 2010), pages 1--25

work page 2010
[30]

Pengcheng Jiang, Jiacheng Lin, Zifeng Wang, Jimeng Sun, and Jiawei Han. 2024. https://doi.org/10.18653/v1/2024.naacl-long.155 Genres: Rethinking evaluation for generative relation extraction in the era of large language models . In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Lang...

work page doi:10.18653/v1/2024.naacl-long.155 2024
[31]

Martin Josifoski, Nicola De Cao, Maxime Peyrard, Fabio Petroni, and Robert West. 2022. https://doi.org/10.18653/v1/2022.naacl-main.342 G en IE : Generative information extraction . In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4626--4643, Seattle, Un...

work page doi:10.18653/v1/2022.naacl-main.342 2022
[32]

Martin Josifoski, Marija Sakota, Maxime Peyrard, and Robert West. 2023. https://doi.org/10.18653/v1/2023.emnlp-main.96 Exploiting asymmetry for synthetic training data generation: S ynth IE and the case of information extraction . In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 1555--1574, Singapore. Associ...

work page doi:10.18653/v1/2023.emnlp-main.96 2023
[33]

Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A Smith, Yejin Choi, Kentaro Inui, et al. 2024. https://proceedings.neurips.cc/paper_files/paper/2023/file/9941624ef7f867a502732b5154d30cb7-Paper-Datasets_and_Benchmarks.pdf Realtime qa: What's the answer right now? Advances in Neural Information Processing Systems, 36

work page arXiv 2024
[34]

Jinyoung Kim, Dayoon Ko, and Gunhee Kim. 2024 a . https://doi.org/10.18653/v1/2024.emnlp-main.762 D ynamic ER : Resolving emerging mentions to dynamic entities for RAG . In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 13752--13770, Miami, Florida, USA. Association for Computational Linguistics

work page doi:10.18653/v1/2024.emnlp-main.762 2024
[35]

Yujin Kim, Jaehong Yoon, Seonghyeon Ye, Sangmin Bae, Namgyu Ho, Sung Ju Hwang, and Se-Young Yun. 2024 b . https://doi.org/10.18653/v1/2024.naacl-long.302 Carpe diem: On the evaluation of world knowledge in lifelong language models . In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human ...

work page doi:10.18653/v1/2024.naacl-long.302 2024
[36]

Dayoon Ko, Jinyoung Kim, Hahyeon Choi, and Gunhee Kim. 2024. https://doi.org/10.18653/v1/2024.acl-long.181 G row OVER : How can LLM s adapt to growing real-world knowledge? In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3282--3308, Bangkok, Thailand. Association for Computational L...

work page doi:10.18653/v1/2024.acl-long.181 2024
[37]

Timoth \'e e Lacroix, Guillaume Obozinski, and Nicolas Usunier. 2020. https://openreview.net/forum?id=rke2P1BFwS Tensor decompositions for temporal knowledge base completion . arXiv preprint arXiv:2004.04926

work page arXiv 2020
[38]

Kenton Lee, Luheng He, Mike Lewis, and Luke Zettlemoyer. 2017. https://doi.org/10.18653/v1/D17-1018 End-to-end neural coreference resolution . In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 188--197, Copenhagen, Denmark. Association for Computational Linguistics

work page doi:10.18653/v1/d17-1018 2017
[39]

u ttler, Mike Lewis, Wen-tau Yih, Tim Rockt \

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich K \"u ttler, Mike Lewis, Wen-tau Yih, Tim Rockt \"a schel, et al. 2020. https://proceedings.neurips.cc/paper/2020/hash/6b493230205f780e1bc26945df7481e5-Abstract.html Retrieval-augmented generation for knowledge-intensive nlp tasks . In Proceedings of th...

work page 2020
[40]

Belinda Z Li, Emmy Liu, Alexis Ross, Abbas Zeitoun, Graham Neubig, and Jacob Andreas. 2024. https://arxiv.org/pdf/2406.11830 Language modeling with editable external knowledge . arXiv preprint arXiv:2406.11830

work page arXiv 2024
[41]

Jiao Li, Yueping Sun, Robin J Johnson, Daniela Sciaky, Chih-Hsuan Wei, Robert Leaman, Allan Peter Davis, Carolyn J Mattingly, Thomas C Wiegers, and Zhiyong Lu. 2016. https://doi.org/10.1093/database/baw068 Biocreative v cdr task corpus: a resource for chemical disease relation extraction . Database, 2016

work page doi:10.1093/database/baw068 2016
[42]

Adam Liska, Tomas Kocisky, Elena Gribovskaya, Tayfun Terzi, Eren Sezener, Devang Agrawal, D’Autume Cyprien De Masson, Tim Scholtes, Manzil Zaheer, Susannah Young, et al. 2022. https://proceedings.mlr.press/v162/liska22a.html Streaming QA : A benchmark for adaptation to new knowledge over time in question answering models . In International Conference on M...

work page 2022
[43]

Yi Luan, Luheng He, Mari Ostendorf, and Hannaneh Hajishirzi. 2018. https://doi.org/10.18653/v1/D18-1360 Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3219--3232

work page doi:10.18653/v1/d18-1360 2018
[44]

Ling Luo, Po-Ting Lai, Chih-Hsuan Wei, Cecilia N Arighi, and Zhiyong Lu. 2022. https://doi.org/10.1093/bib/bbac282 Bio RED : a rich biomedical relation extraction dataset . Briefings in Bioinformatics, 23(5):bbac282

work page doi:10.1093/bib/bbac282 2022
[45]

Xin Lv, Yankai Lin, Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, and Jie Zhou. 2022. https://doi.org/10.18653/v1/2022.findings-acl.282 Do pre-trained models benefit knowledge graph completion? a reliable evaluation and a reasonable approach . In Findings of the Association for Computational Linguistics: ACL 2022, pages 3570--3581, Dublin, Ireland....

work page doi:10.18653/v1/2022.findings-acl.282 2022
[46]

Farzaneh Mahdisoltani, Joanna Biega, and Fabian Suchanek. 2014. https://imt.hal.science/hal-01699874/ Yago3: A knowledge base from multilingual wikipedias . In 7th biennial conference on innovative data systems research. CIDR Conference

work page 2014
[47]

Katerina Margatina, Shuai Wang, Yogarshi Vyas, Neha Anna John, Yassine Benajiba, and Miguel Ballesteros. 2023. https://doi.org/10.18653/v1/2023.eacl-main.211 Dynamic benchmarking of masked language models on temporal concept drift with multiple views . In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Lingu...

work page doi:10.18653/v1/2023.eacl-main.211 2023
[48]

Filipe Mesquita, Matteo Cannaviccio, Jordan Schmidek, Paramita Mirza, and Denilson Barbosa. 2019. https://doi.org/10.18653/v1/D19-1069 K nowledge N et: A benchmark dataset for knowledge base population . In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language ...

work page doi:10.18653/v1/d19-1069 2019
[49]

George A Miller. 1995. https://doi.org/10.1145/219717.219748 Wordnet: a lexical database for english . Communications of the ACM, 38(11):39--41

work page doi:10.1145/219717.219748 1995
[50]

Yasumasa Onoe, Michael Zhang, Eunsol Choi, and Greg Durrett. 2022. https://doi.org/10.18653/v1/2022.findings-naacl.52 Entity cloze by date: What LMs know about unseen entities. In Findings of the Association for Computational Linguistics: NAACL 2022, pages 693--702, Seattle, United States. Association for Computational Linguistics

work page doi:10.18653/v1/2022.findings-naacl.52 2022
[51]

Yasumasa Onoe, Michael Zhang, Shankar Padmanabhan, Greg Durrett, and Eunsol Choi. 2023. https://doi.org/10.18653/v1/2023.acl-long.300 Can LMs learn new entities from descriptions? challenges in propagating injected knowledge. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5469--5...

work page doi:10.18653/v1/2023.acl-long.300 2023
[52]

Riccardo Orlando, Pere-Lluís Huguet Cabot, Edoardo Barba, and Roberto Navigli. 2024. https://doi.org/10.18653/v1/2024.findings-acl.839 ReLiK: Retrieve and LinK, fast and accurate entity linking and relation extraction on an academic budget. In Findings of the Association for Computational Linguistics: ACL 2024, pages 14114--14132, Bangkok, Th...

work page doi:10.18653/v1/2024.findings-acl.839 2024
[53]

Heiko Paulheim. 2016. https://doi.org/10.3233/SW-160218 Knowledge graph refinement: A survey of approaches and evaluation methods . Semantic web, 8(3):489--508

work page doi:10.3233/sw-160218 2016
[54]

Simon Razniewski, Hiba Arnaout, Shrestha Ghosh, and Fabian Suchanek. 2024. https://doi.org/10.1145/3639563 Completeness, recall, and negation in open-world knowledge bases: A survey . ACM Computing Surveys, 56(6):1--42

work page doi:10.1145/3639563 2024
[55]

Sebastian Riedel, Limin Yao, and Andrew McCallum. 2010. https://doi.org/10.1007/978-3-642-15939-8_10 Modeling relations and their mentions without labeled text. In Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases, pages 148--163

work page doi:10.1007/978-3-642-15939-8_10 2010
[56]

Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Nandana Mihindukulasooriya, Owen Cornec, and Alfio Massimiliano Gliozzo. 2023. https://doi.org/10.1609/aaai.v37i13.27084 Knowgl: Knowledge generation and linking from text. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 16476--16478

work page doi:10.1609/aaai.v37i13.27084 2023
[57]

Dan Roth and Wen-tau Yih. 2004. https://aclanthology.org/W04-2401.pdf A linear programming formulation for global inference in natural language tasks . Technical report, Illinois Univ at Urbana-Champaign Dept of Computer Science

work page 2004
[58]

Tara Safavi and Danai Koutra. 2020. https://doi.org/10.18653/v1/2020.emnlp-main.669 CoDEx: A Comprehensive Knowledge Graph Completion Benchmark. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8328--8350, Online. Association for Computational Linguistics

work page doi:10.18653/v1/2020.emnlp-main.669 2020
[59]

Tong Shen, Fu Zhang, and Jingwei Cheng. 2022. https://doi.org/10.1016/j.knosys.2022.109597 A comprehensive overview of knowledge graph completion. Knowledge-Based Systems, 255:109597

work page doi:10.1016/j.knosys.2022.109597 2022
[60]

Zhiyi Song, Ann Bies, Stephanie Strassel, Tom Riese, Justin Mott, Joe Ellis, Jonathan Wright, Seth Kulick, Neville Ryant, and Xiaoyi Ma. 2015. https://aclanthology.org/W15-0812.pdf From light to rich ere: annotation of entities, relations, and events. In Proceedings of the 3rd Workshop on EVENTS: Definition, Detection, Coreference, and Representation...

work page 2015
[61]

Budhitama Subagdja, D Shanthoshigaa, Zhaoxia Wang, and Ah-Hwee Tan. 2024. https://doi.org/10.1145/3640313 Machine learning for refining knowledge graphs: A survey . ACM Computing Surveys, 56(6):1--38

work page doi:10.1145/3640313 2024
[62]

TAC-KBP. 2022. https://tac.nist.gov/tracks/index.html TAC-KBP home page

work page 2022
[63]

Kristina Toutanova and Danqi Chen. 2015. https://aclanthology.org/W15-4007.pdf Observed versus latent features for knowledge base and text inference . In Proceedings of the 3rd workshop on continuous vector space models and their compositionality, pages 57--66

work page 2015
[64]

Denny Vrandečić and Markus Krötzsch. 2014. https://doi.org/10.1145/2629489 Wikidata: a free collaborative knowledgebase. Communications of the ACM, 57(10):78--85

work page doi:10.1145/2629489 2014
[65]

Tu Vu, Mohit Iyyer, Xuezhi Wang, Noah Constant, Jerry Wei, Jason Wei, Chris Tar, Yun-Hsuan Sung, Denny Zhou, Quoc Le, and Thang Luong. 2024. https://doi.org/10.18653/v1/2024.findings-acl.813 FreshLLMs: Refreshing large language models with search engine augmentation. In Findings of the Association for Computational Linguistics: ACL 2024, pages 13697--...

work page doi:10.18653/v1/2024.findings-acl.813 2024
[66]

Christopher Walker, Stephanie Strassel, Julie Medero, and Kazuaki Maeda. 2006. Ace 2005 multilingual training corpus. Linguistic Data Consortium, Philadelphia, 57

work page 2006
[67]

Liang Wang, Wei Zhao, Zhuoyu Wei, and Jingming Liu. 2022. https://doi.org/10.18653/v1/2022.acl-long.295 SimKGC: Simple contrastive knowledge graph completion with pre-trained language models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4281--4294, Dublin, Ireland. Associatio...

work page doi:10.18653/v1/2022.acl-long.295 2022
[68]

Xiaozhi Wang, Tianyu Gao, Zhaocheng Zhu, Zhiyuan Liu, Juanzi Li, and Jian Tang. 2021. https://doi.org/10.1162/tacl_a_00360 KEPLER: A unified model for knowledge embedding and pre-trained language representation. Transactions of the Association for Computational Linguistics, 9:176--194

work page doi:10.1162/tacl_a_00360 2021
[69]

Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, et al. 2024. https://openreview.net/forum?id=sKYHBTAxVa Livebench: A challenging, contamination-free llm benchmark . arXiv preprint arXiv:2406.19314

work page arXiv 2024
[70]

Tongtong Wu, Linhao Luo, Yuan-Fang Li, Shirui Pan, Thuy-Trang Vu, and Gholamreza Haffari. 2024a. https://arxiv.org/pdf/2402.01364 Continual learning for large language models: A survey. arXiv preprint arXiv:2402.01364

work page arXiv 2024
[71]

Xiaobao Wu, Liangming Pan, William Yang Wang, and Anh Tuan Luu. 2024b. https://doi.org/10.18653/v1/2024.emnlp-main.843 AKEW: Assessing knowledge editing in the wild. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 15118--15133, Miami, Florida, USA. Association for Computational Linguistics

work page doi:10.18653/v1/2024.emnlp-main.843 2024
[72]

Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, and William Yang Wang. 2024c. https://arxiv.org/pdf/2412.13670 Antileak-bench: Preventing data contamination by automatically constructing benchmarks with updated real-world knowledge. arXiv preprint arXiv:2412.13670

work page arXiv 2024
[73]

Rui Xing, Jie Luo, and Tengwei Song. 2020. https://doi.org/10.1186/s12859-020-03889-5 Biorel: towards large-scale biomedical relation extraction. BMC bioinformatics, 21:1--13

work page doi:10.1186/s12859-020-03889-5 2020
[74]

Wenhan Xiong, Mo Yu, Shiyu Chang, Xiaoxiao Guo, and William Yang Wang. 2018. https://doi.org/10.18653/v1/D18-1223 One-shot relational learning for knowledge graphs . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1980--1990, Brussels, Belgium. Association for Computational Linguistics

work page doi:10.18653/v1/d18-1223 2018
[75]

Derong Xu, Wei Chen, Wenjun Peng, Chao Zhang, Tong Xu, Xiangyu Zhao, Xian Wu, Yefeng Zheng, Yang Wang, and Enhong Chen. 2024. https://doi.org/10.1007/s11704-024-40555-y Large language models for generative information extraction: A survey. Frontiers of Computer Science, 18(6):186357

work page doi:10.1007/s11704-024-40555-y 2024
[76]

Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zhenghao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, and Maosong Sun. 2019. https://aclanthology.org/P19-1074 DocRED: A large-scale document-level relation extraction dataset. In Proceedings of the 2019 Annual Meeting of the Association for Computational Linguistics (ACL 2019), pages 764--777

work page 2019
[77]

Klim Zaporojets, Johannes Deleu, Chris Develder, and Thomas Demeester. 2021. https://doi.org/10.1016/j.ipm.2021.102563 DWIE: An entity-centric dataset for multi-task document-level information extraction. Information Processing & Management, 58(4):102563

work page doi:10.1016/j.ipm.2021.102563 2021
[78]

Urchade Zaratiana, Nadi Tomeh, Pierre Holat, and Thierry Charnois. 2024. https://doi.org/10.1609/aaai.v38i17.29919 An autoregressive text-to-graph framework for joint entity and relation extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 19477--19487

work page doi:10.1609/aaai.v38i17.29919 2024
[79]

Bowen Zhang and Harold Soh. 2024. https://doi.org/10.18653/v1/2024.emnlp-main.548 Extract, define, canonicalize: An LLM-based framework for knowledge graph construction. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 9820--9836, Miami, Florida, USA. Association for Computational Linguistics

work page doi:10.18653/v1/2024.emnlp-main.548 2024
[80]

Yuhao Zhang, Victor Zhong, Danqi Chen, Gabor Angeli, and Christopher D. Manning. 2017. https://doi.org/10.18653/v1/D17-1004 Position-aware attention and supervised data improve slot filling . In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 35--45, Copenhagen, Denmark. Association for Computational Linguistics

work page doi:10.18653/v1/d17-1004 2017

Showing first 80 references.

[3] [3]

Oshin Agarwal, Heming Ge, Siamak Shakeri, and Rami Al-Rfou. 2021. https://doi.org/10.18653/v1/2021.naacl-main.278 Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training . In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technol...

work page doi:10.18653/v1/2021.naacl-main.278 2021

[4] [4]

Jacqueline Aguilar, Charley Beller, Paul McNamee, Benjamin Van Durme, Stephanie Strassel, Zhiyi Song, and Joe Ellis. 2014. https://aclanthology.org/W14-2907.pdf A comparison of the events and relations across ace, ere, tac-kbp, and framenet annotation standards . In Proceedings of the 2nd Workshop on EVENTS: Definition, Detection, Coreference, and Represe...

work page 2014

[5] [5]

Isabelle Augenstein, Mrinal Das, Sebastian Riedel, Lakshmi Vikraman, and Andrew McCallum. 2017. https://doi.org/10.18653/v1/S17-2091 Semeval 2017 task 10: Scienceie-extracting keyphrases and relations from scientific publications . In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 546--555

work page doi:10.18653/v1/s17-2091 2017

[6] [6]

Zhen Bi, Jing Chen, Yinuo Jiang, Feiyu Xiong, Wei Guo, Huajun Chen, and Ningyu Zhang. 2024. https://dl.acm.org/doi/full/10.1145/3641850 Codekgc: Code language model for generative knowledge graph construction . ACM Transactions on Asian and Low-Resource Language Information Processing, 23(3):1--16

work page doi:10.1145/3641850 2024

[7] [7]

Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. https://doi.org/10.1145/1376616.1376746 Freebase: a collaboratively created graph database for structuring human knowledge . In Proceedings of the 2008 ACM SIGMOD international conference on Management of data, pages 1247--1250

work page doi:10.1145/1376616.1376746 2008

[8] [8]

Antoine Bordes, Nicolas Usunier, Alberto Garcia-Duran, Jason Weston, and Oksana Yakhnenko. 2013. https://proceedings.neurips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html Translating embeddings for modeling multi-relational data . In Advances in neural information processing systems, pages 2787--2795

work page 2013

[9] [9]

Elizabeth Boschee, Jennifer Lautenschlager, Sean O’Brien, Steve Shellman, James Starz, and Michael Ward. 2015. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/28075 ICEWS coded event data . Harvard Dataverse, 12

work page doi:10.7910/dvn/28075 2015

[10] [10]

Pere-Lluís Huguet Cabot and Roberto Navigli. 2021. https://doi.org/10.18653/v1/2021.findings-emnlp.204 REBEL: Relation extraction by end-to-end language generation. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 2370--2381

work page doi:10.18653/v1/2021.findings-emnlp.204 2021

[11] [11]

Arun Chaganty, Ashwin Paranjape, Percy Liang, and Christopher D Manning. 2017. https://doi.org/10.18653/v1/D17-1109 Importance sampling for unbiased on-demand evaluation of knowledge base population . In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1038--1048

work page doi:10.18653/v1/d17-1109 2017

[12] [12]

Nancy Chinchor and Elaine Marsh. 1998. https://catalog.ldc.upenn.edu/docs/LDC2001T02/guidelines.IEtask42.ps Muc-7 information extraction task definition . In Proceeding of the 1998 Message Understanding Conference (MUC-7), pages 359--367

work page 1998

[13] [13]

John Dagdelen, Alexander Dunn, Sanghoon Lee, Nicholas Walker, Andrew S Rosen, Gerbrand Ceder, Kristin A Persson, and Anubhav Jain. 2024. https://doi.org/10.1038/s41467-024-45563-x Structured information extraction from scientific text with large language models. Nature Communications, 15(1):1418

work page doi:10.1038/s41467-024-45563-x 2024

[14] [14]

Shib Sankar Dasgupta, Swayambhu Nath Ray, and Partha Talukdar. 2018. https://doi.org/10.18653/v1/D18-1225 Hyte: Hyperplane-based temporally aware knowledge graph embedding . In Proceedings of the 2018 conference on empirical methods in natural language processing, pages 2001--2011

work page doi:10.18653/v1/d18-1225 2018

[15] [15]

Daniel Daza, Michael Cochez, and Paul Groth. 2021. https://doi.org/10.1145/3442381.3450141 Inductive entity representations from text via link prediction . In Proceedings of the Web Conference 2021, pages 798--808

work page doi:10.1145/3442381.3450141 2021

[16] [16]

Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. https://doi.org/10.1609/aaai.v32i1.11573 Convolutional 2d knowledge graph embeddings. In Proceedings of the AAAI conference on artificial intelligence, volume 32

work page doi:10.1609/aaai.v32i1.11573 2018

[17] [17]

Bhuwan Dhingra, Jeremy R Cole, Julian Martin Eisenschlos, Daniel Gillick, Jacob Eisenstein, and William W Cohen. 2022. https://doi.org/10.1162/tacl_a_00459 Time-aware language models as temporal knowledge bases . Transactions of the Association for Computational Linguistics, 10:257--273

work page doi:10.1162/tacl_a_00459 2022

[18] [18]

Bayu Distiawan, Gerhard Weikum, Jianzhong Qi, and Rui Zhang. 2019. https://doi.org/10.18653/v1/P19-1023 Neural relation extraction for knowledge base enrichment . In Proceedings of the 2019 Conference of the Association for Computational Linguistics, pages 229--240

work page doi:10.18653/v1/p19-1023 2019

[19] [19]

Hady Elsahar, Pavlos Vougiouklis, Arslen Remaci, Christophe Gravier, Jonathon Hare, Elena Simperl, and Frederique Laforest. 2019. https://aclanthology.org/L18-1544.pdf T-rex: A large scale alignment of natural language with knowledge base triples

work page 2019

[20] [20]

Evgeniy Gabrilovich, Michael Ringgaard, and Amarnag Subramanya. 2013. https://lemurproject.org/clueweb12/FACC1/ Facc1: Freebase annotation of clueweb corpora

work page 2013

[21] [21]

Luis Galárraga, Geremy Heitz, Kevin Murphy, and Fabian M Suchanek. 2014. https://doi.org/10.1145/2661829.2662073 Canonicalizing open knowledge bases. In Proceedings of the 23rd ACM international conference on information and knowledge management, pages 1679--1688

work page doi:10.1145/2661829.2662073 2014

[22] [22]

Tianyu Gao, Xu Han, Hao Zhu, Zhiyuan Liu, Peng Li, Maosong Sun, and Jie Zhou. 2019. https://doi.org/10.18653/v1/D19-1649 Fewrel 2.0: Towards more challenging few-shot relation classification . In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing ...

work page doi:10.18653/v1/d19-1649 2019

[23] [23]

Claire Gardent, Anastasia Shimorina, Shashi Narayan, and Laura Perez-Beltrachini. 2017. https://doi.org/10.18653/v1/W17-3518 The webnlg challenge: Generating text from rdf data . In Proceedings of the 10th International Conference on Natural Language Generation, pages 124--133

work page doi:10.18653/v1/w17-3518 2017

[24] [24]

Saibo Geng, Martin Josifoski, Maxime Peyrard, and Robert West. 2023. https://doi.org/10.18653/v1/2023.emnlp-main.674 Grammar-constrained decoding for structured nlp tasks without finetuning . In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 10932--10952

work page doi:10.18653/v1/2023.emnlp-main.674 2023

[25] [25]

Xu Han, Hao Zhu, Pengfei Yu, Ziyun Wang, Yuan Yao, Zhiyuan Liu, and Maosong Sun. 2018. https://doi.org/10.18653/v1/D18-1514 FewRel: A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), pages 4803--4809

work page doi:10.18653/v1/d18-1514 2018

[26] [26]

Iris Hendrickx, Su Nam Kim, Zornitsa Kozareva, Preslav Nakov, Diarmuid Ó Séaghdha, Sebastian Padó, Marco Pennacchiotti, Lorenza Romano, and Stan Szpakowicz. 2010. https://aclanthology.org/S10-1006.pdf Semeval-2010 task 8: Multi-way classification of semantic relations between pairs of nominals. In Proceedings of the 5th International Workshop o...

work page 2010

[27] [27]

Joel Jang, Seonghyeon Ye, Changho Lee, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, and Minjoon Seo. 2022a. https://doi.org/10.18653/v1/2022.emnlp-main.418 TemporalWiki: A lifelong benchmark for training and evaluating ever-evolving language models. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, ...

work page doi:10.18653/v1/2022.emnlp-main.418 2022

[28] [28]

Joel Jang, Seonghyeon Ye, Sohee Yang, Joongbo Shin, Janghoon Han, Gyeonghun Kim, Stanley Jungkyu Choi, and Minjoon Seo. 2022b. https://openreview.net/forum?id=vfsRB5MImo9 Towards continual knowledge learning of language models. In ICLR

work page 2022

[29] [29]

Heng Ji, Ralph Grishman, Hoa Trang Dang, Kira Griffitt, and Joe Ellis. 2010. https://blender.cs.illinois.edu/paper/kbp2010overview.pdf Overview of the TAC 2010 knowledge base population track . In Proceedings of the 2010 Text Analysis Conference (TAC 2010), pages 1--25

work page 2010

[30] [30]

Pengcheng Jiang, Jiacheng Lin, Zifeng Wang, Jimeng Sun, and Jiawei Han. 2024. https://doi.org/10.18653/v1/2024.naacl-long.155 Genres: Rethinking evaluation for generative relation extraction in the era of large language models . In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Lang...

work page doi:10.18653/v1/2024.naacl-long.155 2024

[31] [31]

Martin Josifoski, Nicola De Cao, Maxime Peyrard, Fabio Petroni, and Robert West. 2022. https://doi.org/10.18653/v1/2022.naacl-main.342 GenIE: Generative information extraction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4626--4643, Seattle, Un...

work page doi:10.18653/v1/2022.naacl-main.342 2022

[32] [32]

Martin Josifoski, Marija Sakota, Maxime Peyrard, and Robert West. 2023. https://doi.org/10.18653/v1/2023.emnlp-main.96 Exploiting asymmetry for synthetic training data generation: SynthIE and the case of information extraction. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 1555--1574, Singapore. Associ...

work page doi:10.18653/v1/2023.emnlp-main.96 2023

[33] [33]

Jungo Kasai, Keisuke Sakaguchi, Ronan Le Bras, Akari Asai, Xinyan Yu, Dragomir Radev, Noah A Smith, Yejin Choi, Kentaro Inui, et al. 2024. https://proceedings.neurips.cc/paper_files/paper/2023/file/9941624ef7f867a502732b5154d30cb7-Paper-Datasets_and_Benchmarks.pdf Realtime qa: What's the answer right now? Advances in Neural Information Processing Systems, 36

work page arXiv 2024

[34] [34]

Jinyoung Kim, Dayoon Ko, and Gunhee Kim. 2024a. https://doi.org/10.18653/v1/2024.emnlp-main.762 DynamicER: Resolving emerging mentions to dynamic entities for RAG. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 13752--13770, Miami, Florida, USA. Association for Computational Linguistics

work page doi:10.18653/v1/2024.emnlp-main.762 2024

[35] [35]

Yujin Kim, Jaehong Yoon, Seonghyeon Ye, Sangmin Bae, Namgyu Ho, Sung Ju Hwang, and Se-Young Yun. 2024b. https://doi.org/10.18653/v1/2024.naacl-long.302 Carpe diem: On the evaluation of world knowledge in lifelong language models. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human ...

work page doi:10.18653/v1/2024.naacl-long.302 2024

[36] [36]

Dayoon Ko, Jinyoung Kim, Hahyeon Choi, and Gunhee Kim. 2024. https://doi.org/10.18653/v1/2024.acl-long.181 GrowOVER: How can LLMs adapt to growing real-world knowledge? In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3282--3308, Bangkok, Thailand. Association for Computational L...

work page doi:10.18653/v1/2024.acl-long.181 2024

[37] [37]

Timothée Lacroix, Guillaume Obozinski, and Nicolas Usunier. 2020. https://openreview.net/forum?id=rke2P1BFwS Tensor decompositions for temporal knowledge base completion. arXiv preprint arXiv:2004.04926

work page arXiv 2020

[38] [38]

Kenton Lee, Luheng He, Mike Lewis, and Luke Zettlemoyer. 2017. https://doi.org/10.18653/v1/D17-1018 End-to-end neural coreference resolution . In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 188--197, Copenhagen, Denmark. Association for Computational Linguistics

work page doi:10.18653/v1/d17-1018 2017

[39] [39]

Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, et al. 2020. https://proceedings.neurips.cc/paper/2020/hash/6b493230205f780e1bc26945df7481e5-Abstract.html Retrieval-augmented generation for knowledge-intensive nlp tasks. In Proceedings of th...

work page 2020

[40] [40]

Belinda Z Li, Emmy Liu, Alexis Ross, Abbas Zeitoun, Graham Neubig, and Jacob Andreas. 2024. https://arxiv.org/pdf/2406.11830 Language modeling with editable external knowledge . arXiv preprint arXiv:2406.11830

work page arXiv 2024

[41] [41]

Jiao Li, Yueping Sun, Robin J Johnson, Daniela Sciaky, Chih-Hsuan Wei, Robert Leaman, Allan Peter Davis, Carolyn J Mattingly, Thomas C Wiegers, and Zhiyong Lu. 2016. https://doi.org/10.1093/database/baw068 Biocreative v cdr task corpus: a resource for chemical disease relation extraction . Database, 2016

work page doi:10.1093/database/baw068 2016

[42] [42]

Adam Liska, Tomas Kocisky, Elena Gribovskaya, Tayfun Terzi, Eren Sezener, Devang Agrawal, D’Autume Cyprien De Masson, Tim Scholtes, Manzil Zaheer, Susannah Young, et al. 2022. https://proceedings.mlr.press/v162/liska22a.html StreamingQA: A benchmark for adaptation to new knowledge over time in question answering models. In International Conference on M...

work page 2022

[43] [43]

Yi Luan, Luheng He, Mari Ostendorf, and Hannaneh Hajishirzi. 2018. https://doi.org/10.18653/v1/D18-1360 Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction . In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3219--3232

work page doi:10.18653/v1/d18-1360 2018

[44] [44]

Ling Luo, Po-Ting Lai, Chih-Hsuan Wei, Cecilia N Arighi, and Zhiyong Lu. 2022. https://doi.org/10.1093/bib/bbac282 Bio RED : a rich biomedical relation extraction dataset . Briefings in Bioinformatics, 23(5):bbac282

work page doi:10.1093/bib/bbac282 2022

[45] [45]

Xin Lv, Yankai Lin, Yixin Cao, Lei Hou, Juanzi Li, Zhiyuan Liu, Peng Li, and Jie Zhou. 2022. https://doi.org/10.18653/v1/2022.findings-acl.282 Do pre-trained models benefit knowledge graph completion? a reliable evaluation and a reasonable approach . In Findings of the Association for Computational Linguistics: ACL 2022, pages 3570--3581, Dublin, Ireland....

work page doi:10.18653/v1/2022.findings-acl.282 2022

[46] [46]

Farzaneh Mahdisoltani, Joanna Biega, and Fabian Suchanek. 2014. https://imt.hal.science/hal-01699874/ Yago3: A knowledge base from multilingual wikipedias . In 7th biennial conference on innovative data systems research. CIDR Conference

work page 2014

[47] [47]

Katerina Margatina, Shuai Wang, Yogarshi Vyas, Neha Anna John, Yassine Benajiba, and Miguel Ballesteros. 2023. https://doi.org/10.18653/v1/2023.eacl-main.211 Dynamic benchmarking of masked language models on temporal concept drift with multiple views . In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Lingu...

work page doi:10.18653/v1/2023.eacl-main.211 2023

[48] [48]

Filipe Mesquita, Matteo Cannaviccio, Jordan Schmidek, Paramita Mirza, and Denilson Barbosa. 2019. https://doi.org/10.18653/v1/D19-1069 K nowledge N et: A benchmark dataset for knowledge base population . In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language ...

work page doi:10.18653/v1/d19-1069 2019

[49] [49]

George A Miller. 1995. https://doi.org/10.1145/219717.219748 Wordnet: a lexical database for english . Communications of the ACM, 38(11):39--41

work page doi:10.1145/219717.219748 1995

[50] [50]

Yasumasa Onoe, Michael Zhang, Eunsol Choi, and Greg Durrett. 2022. https://doi.org/10.18653/v1/2022.findings-naacl.52 Entity cloze by date: What LM s know about unseen entities . In Findings of the Association for Computational Linguistics: NAACL 2022, pages 693--702, Seattle, United States. Association for Computational Linguistics

work page doi:10.18653/v1/2022.findings-naacl.52 2022

[51] [51]

Yasumasa Onoe, Michael Zhang, Shankar Padmanabhan, Greg Durrett, and Eunsol Choi. 2023. https://doi.org/10.18653/v1/2023.acl-long.300 Can LM s learn new entities from descriptions? challenges in propagating injected knowledge . In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5469--5...

work page doi:10.18653/v1/2023.acl-long.300 2023

[52] [52]

Riccardo Orlando, Pere-Llu \'i s Huguet Cabot, Edoardo Barba, and Roberto Navigli. 2024. https://doi.org/10.18653/v1/2024.findings-acl.839 R e L i K : Retrieve and L in K , fast and accurate entity linking and relation extraction on an academic budget . In Findings of the Association for Computational Linguistics: ACL 2024, pages 14114--14132, Bangkok, Th...

work page doi:10.18653/v1/2024.findings-acl.839 2024

[53] [53]

Heiko Paulheim. 2016. https://doi.org/10.3233/SW-160218 Knowledge graph refinement: A survey of approaches and evaluation methods . Semantic web, 8(3):489--508

work page doi:10.3233/sw-160218 2016

[54] [54]

Simon Razniewski, Hiba Arnaout, Shrestha Ghosh, and Fabian Suchanek. 2024. https://doi.org/10.1145/3639563 Completeness, recall, and negation in open-world knowledge bases: A survey . ACM Computing Surveys, 56(6):1--42

work page doi:10.1145/3639563 2024

[55] [55]

Sebastian Riedel, Limin Yao, and Andrew McCallum. 2010. https://doi.org/https://doi.org/10.1007/978-3-642-15939-8_10 Modeling relations and their mentions without labeled text . In Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases, pages 148--163

work page doi:10.1007/978-3-642-15939-8_10 2010

[56] [56]

Gaetano Rossiello, Md Faisal Mahbub Chowdhury, Nandana Mihindukulasooriya, Owen Cornec, and Alfio Massimiliano Gliozzo. 2023. https://doi.org/https://doi.org/10.1609/aaai.v37i13.27084 Knowgl: Knowledge generation and linking from text . In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 16476--16478

work page doi:10.1609/aaai.v37i13.27084 2023

[57] [57]

Dan Roth and Wen-tau Yih. 2004. https://aclanthology.org/W04-2401.pdf A linear programming formulation for global inference in natural language tasks . Technical report, Illinois Univ at Urbana-Champaign Dept of Computer Science

work page 2004

[58] [58]

Tara Safavi and Danai Koutra. 2020. https://doi.org/10.18653/v1/2020.emnlp-main.669 C o DE x: A C omprehensive K nowledge G raph C ompletion B enchmark . In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8328--8350, Online. Association for Computational Linguistics

work page doi:10.18653/v1/2020.emnlp-main.669 2020

[59] [59]

Tong Shen, Fu Zhang, and Jingwei Cheng. 2022. https://doi.org/https://doi.org/10.1016/j.knosys.2022.109597 A comprehensive overview of knowledge graph completion . Knowledge-Based Systems, 255:109597

work page doi:10.1016/j.knosys.2022.109597 2022

[60] [60]

Zhiyi Song, Ann Bies, Stephanie Strassel, Tom Riese, Justin Mott, Joe Ellis, Jonathan Wright, Seth Kulick, Neville Ryant, and Xiaoyi Ma. 2015. https://aclanthology.org/W15-0812.pdf From light to rich ere: annotation of entities, relations, and events . In Proceedings of the the 3rd Workshop on EVENTS: Definition, Detection, Coreference, and Representation...

work page 2015

[61] [61]

Budhitama Subagdja, D Shanthoshigaa, Zhaoxia Wang, and Ah-Hwee Tan. 2024. https://doi.org/10.1145/3640313 Machine learning for refining knowledge graphs: A survey . ACM Computing Surveys, 56(6):1--38

work page doi:10.1145/3640313 2024

[62] [62]

TAC-KBP. 2022. https://tac.nist.gov/tracks/index.html TAC-KBP home page

work page 2022

[63] [63]

Kristina Toutanova and Danqi Chen. 2015. https://aclanthology.org/W15-4007.pdf Observed versus latent features for knowledge base and text inference . In Proceedings of the 3rd workshop on continuous vector space models and their compositionality, pages 57--66

[64]

Denny Vrandečić and Markus Krötzsch. 2014. https://doi.org/10.1145/2629489 Wikidata: A free collaborative knowledgebase. Communications of the ACM, 57(10):78–85.

[65]

Tu Vu, Mohit Iyyer, Xuezhi Wang, Noah Constant, Jerry Wei, Jason Wei, Chris Tar, Yun-Hsuan Sung, Denny Zhou, Quoc Le, and Thang Luong. 2024. https://doi.org/10.18653/v1/2024.findings-acl.813 FreshLLMs: Refreshing large language models with search engine augmentation. In Findings of the Association for Computational Linguistics: ACL 2024, pages 13697–...

[66]

Christopher Walker, Stephanie Strassel, Julie Medero, and Kazuaki Maeda. 2006. ACE 2005 multilingual training corpus. Linguistic Data Consortium, Philadelphia, 57.

[67]

Liang Wang, Wei Zhao, Zhuoyu Wei, and Jingming Liu. 2022. https://doi.org/10.18653/v1/2022.acl-long.295 SimKGC: Simple contrastive knowledge graph completion with pre-trained language models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4281–4294, Dublin, Ireland. Associatio...

[68]

Xiaozhi Wang, Tianyu Gao, Zhaocheng Zhu, Zhiyuan Liu, Juanzi Li, and Jian Tang. 2021. https://doi.org/10.1162/tacl_a_00360 KEPLER: A unified model for knowledge embedding and pre-trained language representation. Transactions of the Association for Computational Linguistics, 9:176–194.

[69]

Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, et al. 2024. https://openreview.net/forum?id=sKYHBTAxVa LiveBench: A challenging, contamination-free LLM benchmark. arXiv preprint arXiv:2406.19314.

[70]

Tongtong Wu, Linhao Luo, Yuan-Fang Li, Shirui Pan, Thuy-Trang Vu, and Gholamreza Haffari. 2024a. https://arxiv.org/pdf/2402.01364 Continual learning for large language models: A survey. arXiv preprint arXiv:2402.01364.

[71]

Xiaobao Wu, Liangming Pan, William Yang Wang, and Anh Tuan Luu. 2024b. https://doi.org/10.18653/v1/2024.emnlp-main.843 AKEW: Assessing knowledge editing in the wild. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 15118–15133, Miami, Florida, USA. Association for Computational Linguistics.

[72]

Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, and William Yang Wang. 2024c. https://arxiv.org/pdf/2412.13670 AntiLeak-Bench: Preventing data contamination by automatically constructing benchmarks with updated real-world knowledge. arXiv preprint arXiv:2412.13670.

[73]

Rui Xing, Jie Luo, and Tengwei Song. 2020. https://doi.org/10.1186/s12859-020-03889-5 BioRel: Towards large-scale biomedical relation extraction. BMC Bioinformatics, 21:1–13.

[74]

Wenhan Xiong, Mo Yu, Shiyu Chang, Xiaoxiao Guo, and William Yang Wang. 2018. https://doi.org/10.18653/v1/D18-1223 One-shot relational learning for knowledge graphs. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 1980–1990, Brussels, Belgium. Association for Computational Linguistics.

[75]

Derong Xu, Wei Chen, Wenjun Peng, Chao Zhang, Tong Xu, Xiangyu Zhao, Xian Wu, Yefeng Zheng, Yang Wang, and Enhong Chen. 2024. https://doi.org/10.1007/s11704-024-40555-y Large language models for generative information extraction: A survey. Frontiers of Computer Science, 18(6):186357.

[76]

Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zhenghao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, and Maosong Sun. 2019. https://aclanthology.org/P19-1074 DocRED: A large-scale document-level relation extraction dataset. In Proceedings of the 2019 Annual Meeting of the Association for Computational Linguistics (ACL 2019), pages 764–777.

[77]

Klim Zaporojets, Johannes Deleu, Chris Develder, and Thomas Demeester. 2021. https://doi.org/10.1016/j.ipm.2021.102563 DWIE: An entity-centric dataset for multi-task document-level information extraction. Information Processing & Management, 58(4):102563.

[78]

Urchade Zaratiana, Nadi Tomeh, Pierre Holat, and Thierry Charnois. 2024. https://doi.org/10.1609/aaai.v38i17.29919 An autoregressive text-to-graph framework for joint entity and relation extraction. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 19477–19487.

[79]

Bowen Zhang and Harold Soh. 2024. https://doi.org/10.18653/v1/2024.emnlp-main.548 Extract, define, canonicalize: An LLM-based framework for knowledge graph construction. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 9820–9836, Miami, Florida, USA. Association for Computational Linguistics.

[80]

Yuhao Zhang, Victor Zhong, Danqi Chen, Gabor Angeli, and Christopher D. Manning. 2017. https://doi.org/10.18653/v1/D17-1004 Position-aware attention and supervised data improve slot filling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 35–45, Copenhagen, Denmark. Association for Computational Linguistics.
