pith. sign in

arxiv: 2605.17669 · v1 · pith:5YGVHTCGnew · submitted 2026-05-17 · 💻 cs.AI

Multimodal Cultural Heritage Knowledge Graph Extension with Language and Vision Models

Pith reviewed 2026-05-20 12:05 UTC · model grok-4.3

classification 💻 cs.AI
keywords knowledge graph completionmultimodal learningcultural heritagelarge language modelsvision-language modelsWJocondeFrench heritage dataknowledge graph extension
0
0 comments X

The pith

Multimodal text and image data allow reliable extension of cultural heritage knowledge graphs using LLMs and VLMs

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces WJoconde, a multimodal knowledge graph for French cultural heritage that merges textual descriptions with images of entities. It supplies three graph variants plus a benchmark dataset to support evaluation of knowledge graph completion techniques. The authors then describe a framework that applies large language models to text sources and vision-language models to images, followed by an automated extraction step and a validation pipeline that checks new facts against the current graph. If the approach works, cultural heritage collections could grow more quickly while preserving accuracy, aiding digital preservation and public access to historical records.

Core claim

We introduce WJoconde as a multimodal knowledge graph that integrates both textual and image information for French cultural heritage entities. We release three variants of the graph and a benchmark to advance research on knowledge graph completion. We further present a framework that combines LLMs and VLMs for automated extraction from unstructured resources together with a validation pipeline that grounds model outputs against the existing graph, resulting in efficient extensions with high reliability.

What carries the argument

The multimodal extension framework that performs automated data extraction from unstructured text and images via LLMs and VLMs, then applies a special validation pipeline to ground outputs against the existing WJoconde graph.

Load-bearing premise

The validation pipeline can check LLM and VLM outputs against the existing graph without introducing many false triples or missing many true ones.

What would settle it

Measuring the fraction of incorrect triples added and the fraction of known true facts missed when the pipeline processes a new batch of cultural heritage sources would show whether reliability holds.

Figures

Figures reproduced from arXiv: 2605.17669 by Fay\c{c}al Hamdi, Jean-Claude Moissinac, Nada Mimouni, Yang Zhang.

Figure 1
Figure 1. Figure 1: Details about the dataset can be found in Table 4. [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
Figure 1
Figure 1. Figure 1: Example of the WJocondeMM [PITH_FULL_IMAGE:figures/full_fig_p009_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Pipeline of our approaches. effective for the most popular open-source LLMs and VLMs, without requiring any model training. We attribute this to the fact that these LLMs are trained, among other data sources, on Wikidata. At the end of this stage, each model will give a list of predicted entities that serves as a candidate for our query. We also observe that, due to the strong ability of the latest models,… view at source ↗
Figure 3
Figure 3. Figure 3: Prompt example for LLM Similar to the LLM, given the template and an image as input, as shown in Fig.5, the VLM will give the result in a JSON file [PITH_FULL_IMAGE:figures/full_fig_p012_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Output from LLM [PITH_FULL_IMAGE:figures/full_fig_p013_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Prompt example for vision-language model [PITH_FULL_IMAGE:figures/full_fig_p013_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Output of vision-language model 5.2 Validation and evaluation The raw outputs from the LLM and VLM could fall into 3 types: answers already in the KG; answers that are synonyms to the ones in KG or between the models, and answers that are new to the KG. Thus, we perform further post-validation to select the new answers to the KG. 5.2.1 Entity matching. In order to select new entities, we first filter out p… view at source ↗
Figure 7
Figure 7. Figure 7: We also validate the outputs between the two models, in order to avoid synonyms between the outputs of the two models. In the end, we have two sets of answers with entities that are new to the KG and unique for the combined set [PITH_FULL_IMAGE:figures/full_fig_p014_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Validation from the vision-language model. [PITH_FULL_IMAGE:figures/full_fig_p015_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Example of "Saint George and the Dragon": entities in green blocks are already in the dataset, blue and red blocks are newly [PITH_FULL_IMAGE:figures/full_fig_p017_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Olympia by Manet, grey denotes properties, green denotes entities, and pink denotes literal values. [PITH_FULL_IMAGE:figures/full_fig_p023_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: The models predict "candlestick" for both paintings [PITH_FULL_IMAGE:figures/full_fig_p024_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: The models predict "shield" for both paintings [PITH_FULL_IMAGE:figures/full_fig_p024_12.png] view at source ↗
read the original abstract

The preservation and interpretation of cultural heritage increasingly rely on digital technologies, among which Knowledge Graphs (KGs) stand out for their ability to structure vast amounts of data. However, the construction and expansion of these KGs often face challenges due to the diverse and complex nature of cultural heritage information. In this paper, we propose a novel approach for extending KG resources in the domain of cultural heritage, which we applied to French data. First, we introduce a new knowledge graph in the domain of French cultural heritage, WJoconde, which is distinguished by its multimodality as it integrates both textual and image information of the entities. We further introduce three variants of WJoconde to facilitate downstream research, such as Knowledge Graph Completion (KGC). We also built a comprehensive benchmark for KGC methods on our dataset. Second, we propose a new framework for extending cultural heritage KGs using multi-modal approaches leveraging Large Language Models (LLMs) and Vision-Language Models (VLMs), which includes automated data extraction from unstructured resources combined with a special validation pipeline for grounding the output of both models, to further extend WJoconde. Our results show that by integrating the rich text and image information in cultural heritage data, we can efficiently enhance KGs with high reliability. We open-source all code and benchmark datasets with text and images, as well as the original data with an interactive access point

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces WJoconde, a new multimodal knowledge graph for French cultural heritage that integrates textual and image data, along with three variants to support knowledge graph completion research and a corresponding benchmark. It further proposes a framework that uses LLMs and VLMs to extract new facts from unstructured sources, applies a special validation pipeline to ground the outputs against the existing graph, and claims that this multimodal integration enables efficient KG extension with high reliability. All code, benchmarks, and an interactive access point are open-sourced.

Significance. If the validation pipeline demonstrably controls false-positive rates on novel extractions, the work offers a practical, reproducible template for multimodal KG extension in cultural-heritage domains where seed graphs are incomplete by design. The open-sourcing of code, text-image datasets, and the interactive portal is a clear strength that supports downstream reproducibility and benchmarking.

major comments (2)
  1. [Abstract and §4] Abstract and §4 (Results): the claim of 'efficient enhancement with high reliability' is not supported by any reported precision, recall, F1, or error-rate figures for the extracted triples; without these numbers or an ablation of the validation step it is impossible to evaluate whether the central empirical claim holds.
  2. [§3.2] §3.2 (Framework and Validation Pipeline): the description of the 'special validation pipeline' does not specify whether grounding relies on external ground-truth documents, human adjudication, or only intra-graph consistency and model self-consistency. If the latter, the pipeline cannot reliably distinguish true novel facts from hallucinations on incomplete cultural-heritage sources.
minor comments (2)
  1. [§2.1] §2.1: the three variants of WJoconde are introduced but their exact differences in modality handling and how they affect the KGC benchmark are not tabulated.
  2. [Figure 1] Figure 1: the diagram of the extraction-validation loop would benefit from explicit arrows showing where external grounding data (if any) enters the pipeline.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below and describe the planned revisions.

read point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (Results): the claim of 'efficient enhancement with high reliability' is not supported by any reported precision, recall, F1, or error-rate figures for the extracted triples; without these numbers or an ablation of the validation step it is impossible to evaluate whether the central empirical claim holds.

    Authors: We agree that explicit quantitative support is missing from the current version. In the revised manuscript we will add to §4 the precision, recall, and F1 scores for triples produced by the LLM-VLM pipeline, together with an ablation that isolates the contribution of the validation step. These figures will be computed on a held-out set of manually verified extractions. revision: yes

  2. Referee: [§3.2] §3.2 (Framework and Validation Pipeline): the description of the 'special validation pipeline' does not specify whether grounding relies on external ground-truth documents, human adjudication, or only intra-graph consistency and model self-consistency. If the latter, the pipeline cannot reliably distinguish true novel facts from hallucinations on incomplete cultural-heritage sources.

    Authors: The referee correctly identifies an ambiguity in the current text. The pipeline performs grounding via intra-graph consistency checks against existing WJoconde entities and relations plus model self-consistency. We will expand §3.2 with a precise step-by-step description of these checks. Because complete external ground truth is rarely available for novel cultural-heritage facts, we will also add a limitations paragraph and report error rates obtained from human review of a random sample of 200 extractions. revision: partial

Circularity Check

0 steps flagged

No circularity: applied KG construction and benchmarking

full rationale

The paper describes construction of the WJoconde KG, creation of variants and a KGC benchmark, plus an empirical framework that extracts triples via LLMs/VLMs and applies a validation pipeline against the seed graph. No equations, fitted parameters, or predictions are presented that reduce by construction to the inputs. Central claims rest on external data sources and open-sourced artifacts whose correctness is independently checkable, satisfying the self-contained criterion for score 0.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on the abstract alone, the work relies on standard assumptions about the accuracy of LLM and VLM outputs and the existence of unstructured cultural heritage resources, but does not introduce or fit any explicit free parameters, new axioms, or invented entities.

pith-pipeline@v0.9.0 · 5787 in / 1196 out tokens · 30476 ms · 2026-05-20T12:05:28.283901+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · 11 internal anchors

  1. [1]

    Eneko Agirre, Ander Barrena, Oier Lopez de Lacalle, Aitor Soroa, Samuel Fernando, and Mark Stevenson. 2012. Matching Cultural Heritage items to Wikipedia. InProceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12). European Language Resources Association (ELRA), Istanbul, Turkey, 1729–1735. https://aclanthology.org...

  2. [2]

    Marie Al-Ghossein, Talel Abdessalem, and Anthony Barré. 2018. Open data in the hotel industry: leveraging forthcoming events for hotel recommendation.J. of IT & Tourism20, 1-4 (2018), 191–216. https://telecom-paris.hal.science/hal-02287955v1

  3. [3]

    Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary G. Ives. 2007. DBpedia: A Nucleus for a Web of Open Data. InThe Semantic Web, 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC 2007, Busan, Korea, November 11-15, 2007 (Lecture Notes in Computer Science), Karl Aberer, Ke...

  4. [4]

    Ivana Balažević, Carl Allen, and Timothy M Hospedales. 2019. Tucker: Tensor factorization for knowledge graph completion.arXiv preprint arXiv:1901.09590(2019). https://doi.org/10.48550/arXiv.1901.09590

  5. [5]

    Victor Boer, Jan Wielemaker, Judith Gent, Michiel Hildebrand, Antoine Isaac, Jacco Van Ossenbruggen, and Guus Schreiber. 2012. Supporting Linked Data Production for Cultural Heritage Institutes: The Amsterdam Museum Case Study.ACM Journal of Experimental Algorithms - JEA. doi:10.1007/978-3-642-30284-8_56

  6. [6]

    Kurt Bollacker, Colin Evans, Praveen Paritosh, Tim Sturge, and Jamie Taylor. 2008. Freebase: a collaboratively created graph database for structuring human knowledge. InProceedings of the 2008 ACM SIGMOD International Conference on Management of Data(Vancouver, Canada)(SIGMOD ’08). Association for Computing Machinery, New York, NY, USA, 1247–1250. doi:10....

  7. [7]

    Antoine Bordes, Nicolas Usunier, Alberto García-Durán, Jason Weston, and Oksana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data. InAdvances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems

  8. [8]

    Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States, Christopher J. C. Burges, Léon Bottou, Zoubin Ghahramani, and Kilian Q. Weinberger (Eds.). 2787–2795. https://proceedings.neurips.cc/paper/2013/hash/1cecc7a77928ca8133fa24680a88d2f9-Abstract.html

  9. [9]

    Armand Boschin. 2019. WikiDataSets : Standardized sub-graphs from WikiData.CoRRabs/1906.04536 (2019). arXiv:1906.04536 http://arxiv.org/abs/ 1906.04536

  10. [10]

    Judith Brouwer and Harm Nijboer. 2017. Golden Agents. A Web of Linked Biographical Data for the Dutch Golden Age. InProceedings of the Second Conference on Biographical Data in a Digital World 2017, Linz, Austria, November 6-7, 2017 (CEUR Workshop Proceedings), Antske Fokkens, Serge ter Braake, Ronald Sluijter, Paul Longley Arthur, and Eveline Wandl-Vogt ...

  11. [11]

    Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin...

  12. [12]

    Gustavo Candela, Milena Dobreva, Henk Alkemade, Olga Holownia, Mahendra Mahey, Sarah Ames, Karen Renaud, Ines Vodopivec, Benjamin Charles Germain Lee, Thomas Padilla, Steven Claeyssens, Isto Huvila, and Beth Knazook. 2025. A Use Case Lens on Digital Cultural Heritage.CoRR abs/2509.08710 (2025). arXiv:2509.08710 doi:10.48550/ARXIV.2509.08710

  13. [13]

    Costanza Caraffa, Emily Pugh, Tracy Stuber, and Louisa Wood Ruby. 2020. PHAROS: A digital research space for photo archives.Art Libraries Journal45, 1 (2020), 2–11. doi:10.1017/alj.2019.34

  14. [14]

    Valentina Anita Carriero, Aldo Gangemi, Maria Letizia Mancinelli, Ludovica Marinucci, Andrea Giovanni Nuzzolese, Valentina Presutti, and Chiara Veninata. 2019. ArCo: The Italian Cultural Heritage Knowledge Graph. InThe Semantic Web – ISWC 2019. Springer International Publishing, 36–52. https://doi.org/10.48550/arXiv.1905.02840

  15. [15]

    Xiang Chen, Ningyu Zhang, Lei Li, Shumin Deng, Chuanqi Tan, Changliang Xu, Fei Huang, Luo Si, and Huajun Chen. 2022. Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion. InProceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval(, Madrid, Spain,)(SIGIR ’22). Associatio...

  16. [16]

    Marilena Daquino, Francesca Mambelli, Silvio Peroni, Francesca Tomasi, and Fabio Vitali. 2017. Enhancing Semantic Expressivity in the Cultural Heritage Domain: Exposing the Zeri Photo Archive as Linked Open Data.J. Comput. Cult. Herit.10, 4, Article 21 (July 2017), 21 pages. doi:10.1145/ 3051487

  17. [17]

    Tim Dettmers, Pasquale Minervini, Pontus Stenetorp, and Sebastian Riedel. 2018. Convolutional 2d knowledge graph embeddings. InProceedings of the AAAI conference on artificial intelligence, Vol. 32. https://doi.org/10.48550/arXiv.1707.01476 Manuscript submitted to ACM 20 Zhang et al

  18. [18]

    Chris Dijkshoorn, Lizzy Jongma, Lora Aroyo, Jacco Van Ossenbruggen, Guus Schreiber, Wesley Weele, and Jan Wielemaker. 2017. The Rijksmuseum collection as Linked Data.Semantic Web9 (01 2017), 1–10. doi:10.3233/SW-170257

  19. [19]

    Takuma Ebisu and Ryutaro Ichise. 2018. Toruse: Knowledge graph embedding on a lie group. InProceedings of the AAAI conference on artificial intelligence, Vol. 32. https://doi.org/10.48550/arXiv.1711.05435

  20. [20]

    Marco Fiorucci, Marina Khoroshiltseva, Massimiliano Pontil, Arianna Traviglia, Alessio Del Bue, and Stuart James. 2020. Machine Learning for Cultural Heritage: A Survey.Pattern Recognition Letters133 (2020), 102–108. doi:10.1016/j.patrec.2020.02.017

  21. [21]

    Luis Galárraga, Christina Teflioudi, Katja Hose, and Fabian Suchanek. 2013. AMIE: Association rule mining under incomplete evidence in ontological knowledge bases.WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web, 413–422. doi:10.1145/2488388.2488425

  22. [22]

    Nicolas Gonthier, Yann Gousseau, Said Ladjal, and Olivier Bonfait. 2019. Weakly Supervised Object Detection in Artworks. InComputer Vision – ECCV 2018 Workshops, Laura Leal-Taixé and Stefan Roth (Eds.). Springer International Publishing, Cham, 692–709. https://doi.org/10.48550/arXiv.1810.02569

  23. [23]

    Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, and et al. 2024. The Llama 3 Herd of Models. arXiv:2407.21783 [cs.AI] https://doi.org/10. 48550/arXiv.2407.21783

  24. [24]

    Rashid, Lukas Schmelzeisen, Steffen Staab, Eva Blomqvist, Claudia d’Amato, José Emilio Labra Gayo, Sebastian Neumaier, Anisa Rula, Juan Sequeda, and Antoine Zimmermann

    Aidan Hogan, Claudio Gutierrez, Michael Cochez, Gerard de Melo, Sabrina Kirrane, Axel Polleres, Roberto Navigli, Axel-Cyrille Ngonga Ngomo, Sabbir M. Rashid, Lukas Schmelzeisen, Steffen Staab, Eva Blomqvist, Claudia d’Amato, José Emilio Labra Gayo, Sebastian Neumaier, Anisa Rula, Juan Sequeda, and Antoine Zimmermann. 2021.Knowledge Graphs. Morgan & Claypo...

  25. [25]

    Guoliang Ji, Shizhu He, Liheng Xu, Kang Liu, and Jun Zhao. 2015. Knowledge graph embedding via dynamic mapping matrix. InProceedings of the 53rd annual meeting of the association for computational linguistics and the 7th international joint conference on natural language processing (volume 1: Long papers). 687–696. https://aclanthology.org/P15-1067.pdf

  26. [26]

    Timothée Lacroix, Nicolas Usunier, and Guillaume Obozinski. 2018. Canonical tensor decomposition for knowledge base completion. InInternational Conference on Machine Learning. PMLR, 2863–2872. https://doi.org/10.48550/arXiv.1806.07297

  27. [27]

    Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. 2021. Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv:2005.11401 [cs.CL] https://doi.org/10.48550/arXiv.2005.11401

  28. [28]

    Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. 2022. Blip: Bootstrapping language-image pre-training for unified vision-language understanding and generation. InInternational Conference on Machine Learning. PMLR, 12888–12900. https://doi.org/10.48550/arXiv.2201.12086

  29. [29]

    Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, and Xuan Zhu. 2015. Learning Entity and Relation Embeddings for Knowledge Graph Completion. InProceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, January 25-30, 2015, Austin, Texas, USA, Blai Bonet and Sven Koenig (Eds.). AAAI Press, 2181–2187. doi:10.1609/AAAI.V29I1.9491

  30. [30]

    Gengchen Mai, Krzysztof Janowicz, Bo Yan, Rui Zhu, Ling Cai, and Ni Lao. 2019. Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs. InProceedings of the 10th International Conference on Knowledge Capture (K-CAP ’19). Association for Computing Machinery, 171–178. doi:10.1145/3360901.3364432

  31. [31]

    Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. 2013. Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs.CL] https://doi.org/10.48550/arXiv.1301.3781

  32. [32]

    George A. Miller. 1995. WordNet: A Lexical Database for English.Commun. ACM38, 11 (nov 1995), 39–41. doi:10.1145/219717.219748

  33. [33]

    Nada Mimouni and Jean-Claude Moissinac. 2024. Towards Efficient Exploitation of Large Knowledge Bases by Context Graphs. InKnowledge Graphs in the Age of Language Models and Neuro-Symbolic AI - Proceedings of the 20th International Conference on Semantic Systems, 17-19 September 2024, Amsterdam, The Netherlands (Studies on the Semantic Web, Vol. 60). IOS ...

  34. [34]

    Nada Mimouni, Jean-Claude Jc Moissinac, and Anh Tuan Vu. 2019. Knowledge base completion with analogical inference on context graphs. In Semapro 2019. https://telecom-paris.hal.science/hal-02281147v1

  35. [35]

    2020.Semantic Representation of the Joconde Database

    Jean-Claude Moissinac. 2020.Semantic Representation of the Joconde Database. doi:10.5281/zenodo.3986498

  36. [36]

    Jean-Claude Moissinac, Francois Rouze, Piyush Wadhera, and Bastien Germain. 2020. Toward a Semantic Representation of the Joconde Database. InAdvances In Semantic Processing. International Conference. 14TH 2020. (SEMAPRO 2020). 62–67. https://telecom-paris.hal.science/hal-04394579v1/ document

  37. [37]

    Dat Quoc Nguyen. 2017. A survey of embedding models of entities and relationships for knowledge graph completion.arXiv preprint arXiv:1703.08098 (2017). https://doi.org/10.48550/arXiv.1703.08098

  38. [38]

    Phuc Nguyen and Hideaki Takeda. 2022. Wikidata-lite for Knowledge Extraction and Exploration. arXiv:2211.05416 [cs.DB] https://doi.org/10. 48550/arXiv.2211.05416

  39. [39]

    Maximilian Nickel, Lorenzo Rosasco, and Tomaso Poggio. 2016. Holographic embeddings of knowledge graphs. InProceedings of the AAAI conference on artificial intelligence, Vol. 30. https://doi.org/10.48550/arXiv.1510.04935

  40. [40]

    Maximilian Nickel, Volker Tresp, Hans-Peter Kriegel, et al. 2011. A three-way model for collective learning on multi-relational data.. InIcml, Vol. 11. 3104482–3104584. http://www.icml-2011.org/papers/438_icmlpaper.pdf

  41. [41]

    Yiwen Peng, Thomas Bonald, and Mehwish Alam. 2024. Refining wikidata taxonomy using large language models. InProceedings of the 33rd ACM International Conference on Information and Knowledge Management. 5395–5399. https://doi.org/10.48550/arXiv.2409.04056

  42. [42]

    Christian Pentzold, Esther Weltevrede, Michele Mauri, David Laniado, Andreas Kaltenbrunner, and Erik Borra. 2017. Digging Wikipedia: The Online Encyclopedia as a Digital Cultural Heritage Gateway and Site.J. Comput. Cult. Herit.10, 1 (March 2017). doi:10.1145/3012285

  43. [43]

    RKD. [n. d.]. RKD Knowledge Graph. https://rkd.triply.cc/rkd/RKD-Knowledge-Graph. Accessed: 2025-12-02. Manuscript submitted to ACM Multimodal Cultural Heritage Knowledge Graph Extension with Language and Vision Models 21

  44. [44]

    Bin Shang, Yinliang Zhao, Jun Liu, and Di Wang. 2024. LAFA: Multimodal Knowledge Graph Completion with Link Aware Fusion and Aggregation. Proceedings of the AAAI Conference on Artificial Intelligence38, 8 (Mar. 2024), 8957–8965. doi:10.1609/aaai.v38i8.28744

  45. [45]

    Chao Shang, Yun Tang, Jing Huang, Jinbo Bi, Xiaodong He, and Bowen Zhou. 2018. End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion. arXiv:1811.04441 [cs.AI] https://doi.org/10.48550/arXiv.1811.04441

  46. [46]

    Tong Shen, Fu Zhang, and Jingwei Cheng. 2022. A Comprehensive Overview of Knowledge Graph Completion.Know.-Based Syst.255, C (2022), 65 pages. doi:10.1016/j.knosys.2022.109597

  47. [47]

    Sean Szumlanski and Fernando Gomez. 2010. Automatically Acquiring a Semantic Network of Related Concepts. InProceedings of the 19th ACM CIKM. 19–28. https://stars.library.ucf.edu/etd/2585/

  48. [48]

    Mary Ann Tan, Tabea Tietz, Oleksandra Bruns, Jonas Oppenlaender, Danilo Dessì, and Harald Sack. 2021. DDB-KG: The German Bibliographic Heritage in a Knowledge Graph. InProceedings of the 6th International Workshop on Computational History (HistoInformatics 2021) co-located with ACM/IEEE Joint Conference on Digital Libraries 2021 (JCDL 2021), Online event,...

  49. [49]

    Kristina Toutanova and Danqi Chen. 2015. Observed versus latent features for knowledge base and text inference. InProceedings of the 3rd workshop on continuous vector space models and their compositionality. 57–66. https://aclanthology.org/W15-4007.pdf

  50. [50]

    Kristina Toutanova, Danqi Chen, Patrick Pantel, Hoifung Poon, Pallavi Choudhury, and Michael Gamon. 2015. Representing Text for Joint Embedding of Text and Knowledge Bases. InProceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1499–1509. doi:10.18653/v1/D15-1174

  51. [51]

    Théo Trouillon, Johannes Welbl, Sebastian Riedel, Éric Gaussier, and Guillaume Bouchard. 2016. Complex Embeddings for Simple Link Prediction. InProceedings of the 33nd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016 (JMLR Workshop and Conference Proceedings), Maria-Florina Balcan and Kilian Q. Weinberger (...

  52. [52]

    Yale University. 2023. 17 Million Reasons to Love LUX, Yale’s New Collections Search Tool. https://news.yale.edu/2023/06/01/17-million-reasons- love-lux-yales-new-collections-search-tool. Accessed: 2025-02-20

  53. [53]

    Blerta Veseli, Sneha Singhania, Simon Razniewski, and Gerhard Weikum. 2023. Evaluating Language Models for Knowledge Base Completion. In The Semantic Web. Springer Nature Switzerland, Cham, 227–243. https://doi.org/10.48550/arXiv.2303.11082

  54. [54]

    Zhen Wang, Jianwen Zhang, Jianlin Feng, and Zheng Chen. 2014. Knowledge Graph Embedding by Translating on Hyperplanes. InProceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27 -31, 2014, Québec City, Québec, Canada, Carla E. Brodley and Peter Stone (Eds.). AAAI Press, 1112–1119. doi:10.1609/AAAI.V28I1.8870

  55. [55]

    Jason Weston, Antoine Bordes, Oksana Yakhnenko, and Nicolas Usunier. 2013. Connecting Language and Knowledge Bases with Embedding Models for Relation Extraction. InProceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 1366–1371. https://aclanthology.org/D13-1136

  56. [56]

    Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng. 2014. Embedding entities and relations for learning and inference in knowledge bases.arXiv preprint arXiv:1412.6575(2014). https://doi.org/10.48550/arXiv.1412.6575

  57. [57]

    Liang Yao, Jiazhen Peng, Chengsheng Mao, and Yuan Luo. 2024. Exploring Large Language Models for Knowledge Graph Completion. arXiv:2308.13916 [cs.CL] https://doi.org/10.48550/arXiv.2308.13916

  58. [58]

    Neural Generative Question Answering

    Jun Yin, Xin Jiang, Zhengdong Lu, Lifeng Shang, Hang Li, and Xiaoming Li. 2016. Neural Generative Question Answering. InProceedings of IJCAI. AAAI Press, 2972–2978. https://doi.org/10.48550/arXiv.1512.01337 A Performance of embedding models A.1 Embedding models on WJoconde Experimental setup: for the ComplEx_n3 model, we set 𝑟𝑎𝑛𝑘= 500, 𝑙𝑒𝑎𝑟𝑛𝑖𝑛𝑔_𝑟𝑎𝑡𝑒= 5𝑒− ...