
arXiv: 2605.22304 · v1 · pith:LU4NRL6Anew · submitted 2026-05-21 · 💻 cs.AI · cs.DB · cs.LG

Evaluation of Pipelines for Data Integration into Knowledge Graphs

Pith reviewed 2026-05-22 05:11 UTC · model grok-4.3

classification 💻 cs.AI · cs.DB · cs.LG
keywords knowledge graphs · data integration · benchmark · pipelines · evaluation · coverage · correctness · consistency

The pith

A new benchmark evaluates data integration pipelines for knowledge graphs using coverage, correctness, and consistency on movie-domain datasets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces KGI-Bench to assess different workflows that add various input data to an existing knowledge graph. It supplies a seed KG, overlapping input data in three formats, and a reference KG as ground truth, all in the movie domain. Pipelines are judged by how much new information their updates add, how accurate those additions are, and whether the resulting graph remains consistent without contradictions. Testing twelve pipelines demonstrates measurable differences tied to input formats and internal design decisions. The approach supplies a concrete way to identify stronger pipeline options for integration tasks.

Core claim

The central claim is that integration pipelines can be systematically compared by running them on shared benchmark datasets and scoring the resulting updated knowledge graph with the three complementary metrics of coverage, correctness, and consistency. The supplied movie-domain resources include a seed KG, multi-format input data that overlaps with the seed, and a reference KG serving as ground truth, enabling reproducible evaluation of twelve pipelines across formats and design choices.

What carries the argument

KGI-Bench benchmark, which supplies a seed knowledge graph, multi-format overlapping input data, a reference ground-truth KG, and evaluation through the three metrics of coverage, correctness, and consistency on the updated graph.
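
To make the three metrics concrete, here is a minimal sketch of how they could be computed over sets of triples. It is not the paper's definition: the triple representation, the choice to score correctness only on triples added beyond the seed, and the use of functional predicates as the sole consistency constraint are all assumptions made for illustration, as are the function names and the toy data.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object); an assumed representation


def coverage(updated: Set[Triple], reference: Set[Triple]) -> float:
    """Share of reference triples that appear in the updated KG."""
    if not reference:
        return 0.0
    return len(updated & reference) / len(reference)


def correctness(updated: Set[Triple], seed: Set[Triple], reference: Set[Triple]) -> float:
    """Share of newly added triples (those not in the seed) that the reference confirms."""
    added = updated - seed
    if not added:
        return 1.0  # nothing was added, so nothing can be wrong
    return len(added & reference) / len(added)


def consistency(updated: Set[Triple], functional_predicates: Set[str]) -> float:
    """Share of (subject, predicate) pairs on functional predicates that hold one value."""
    values: Dict[Tuple[str, str], Set[str]] = {}
    for s, p, o in updated:
        if p in functional_predicates:
            values.setdefault((s, p), set()).add(o)
    if not values:
        return 1.0
    single_valued = sum(1 for objs in values.values() if len(objs) == 1)
    return single_valued / len(values)


# Toy movie-domain data, invented for this example only.
seed_kg = {("film:1", "title", "Alien"), ("film:1", "year", "1979")}
reference_kg = seed_kg | {
    ("film:1", "director", "person:1"),
    ("person:1", "name", "Ridley Scott"),
}
updated_kg = seed_kg | {
    ("film:1", "director", "person:1"),  # correct addition
    ("film:1", "year", "1980"),          # wrong addition that also conflicts with the seed
}

print(coverage(updated_kg, reference_kg))              # 0.75
print(correctness(updated_kg, seed_kg, reference_kg))  # 0.5
print(consistency(updated_kg, {"title", "year"}))      # 0.5
```

The toy update shows why the metrics are complementary: one wrong year lowers correctness and consistency, while the missing person name lowers only coverage.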

Load-bearing premise

The movie-domain datasets and the three chosen metrics are representative enough to identify the best pipeline choices for general data integration problems across domains and data types.

What would settle it

Re-running the twelve pipelines on a non-movie dataset such as a biology or finance knowledge graph and obtaining a substantially different ranking of which pipelines perform best would indicate the movie resources do not generalize.
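
One way to quantify "a substantially different ranking" is a rank correlation between the per-pipeline scores obtained in two domains. The sketch below implements Kendall's tau (the tau-a variant, where tied pairs count as neither concordant nor discordant) in plain Python. The pipeline names and score values are hypothetical placeholders, not results from the paper, and the choice of Kendall's tau is an assumption of this example rather than something the paper prescribes.

```python
from itertools import combinations
from typing import Dict


def kendall_tau(scores_a: Dict[str, float], scores_b: Dict[str, float]) -> float:
    """Kendall rank correlation (tau-a) between two score tables over the same pipelines."""
    if set(scores_a) != set(scores_b):
        raise ValueError("both score tables must cover the same pipelines")
    names = sorted(scores_a)
    concordant = discordant = 0
    for x, y in combinations(names, 2):
        # The sign of the product tells whether both domains order x and y the same way.
        product = (scores_a[x] - scores_a[y]) * (scores_b[x] - scores_b[y])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    pairs = len(names) * (len(names) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 1.0


# Hypothetical scores for three made-up pipelines in two domains.
movie_scores = {"pipeline_a": 0.9, "pipeline_b": 0.7, "pipeline_c": 0.5}
other_scores = {"pipeline_a": 0.4, "pipeline_b": 0.6, "pipeline_c": 0.8}

print(kendall_tau(movie_scores, movie_scores))  # 1.0, identical ranking
print(kendall_tau(movie_scores, other_scores))  # -1.0, fully reversed ranking
```

A value near 1 would suggest the movie-domain ranking carries over; a value near 0 or below would support the concern that it does not.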

Figures

Figures reproduced from arXiv: 2605.22304 by Erhard Rahm, Marvin Hofer.

Figure 1: Ontology/Schema graph of classes film, person, com...
Figure 2: RDF single-source pipeline layouts used in the eval...
Figure 3: JSON single-source pipeline layouts used in the...
Figure 4: Text single-source pipeline layouts used in the eval...
Figure 5: Number of integrated entities by entity type and expected entities at each source increment (stage) for all pipelines.
Abstract

Integrating new data into knowledge graphs (KGs) typically involves different tasks that are executed within workflows or pipelines. There are many possible pipelines for a specific integration problem, but there is not yet a general approach to evaluate the overall quality and performance of such pipelines to be able to determine the best choices. We therefore propose a new benchmark, KGI-Bench, to evaluate integration pipelines that ingest different kinds of input data into an existing KG. We evaluate pipelines by analyzing their output, i.e., the updated KG, with the three complementary quality metrics coverage, correctness, and consistency. We also provide benchmark datasets (seed KG, overlapping input data in three formats, reference KG as ground truth) for the movie domain. To demonstrate the applicability and usefulness of the proposed benchmark, we comparatively evaluate 12 pipelines and analyze their behavior across different input data formats and design choices.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes KGI-Bench, a benchmark for evaluating data integration pipelines that ingest different input data formats into an existing knowledge graph. Pipelines are assessed by analyzing the updated KG against three complementary quality metrics (coverage, correctness, consistency). The authors supply movie-domain datasets (seed KG, overlapping inputs in three formats, reference KG as ground truth) and demonstrate the benchmark by comparatively evaluating 12 pipelines while analyzing behavior across input formats and design choices.

Significance. If the metrics are rigorously defined and the evaluation results reproducible, the benchmark could address the lack of standardized methods for comparing KG integration pipelines. The provision of concrete datasets and a multi-pipeline demonstration is a positive step toward reproducibility and practical utility in data integration research.

major comments (3)
  1. [Benchmark and metrics description] Section describing the quality metrics: the three metrics (coverage, correctness, consistency) are presented as complementary for evaluating the updated KG, but no explicit definitions, formulas, or computation procedures relative to the reference KG are supplied. This is load-bearing for the central claim that the benchmark enables determination of best pipeline choices.
  2. [Comparative evaluation] Evaluation and demonstration section: the analysis of the 12 pipelines' behavior across input formats and design choices lacks reported quantitative results, tables of metric values, or statistical comparisons, leaving the demonstration without verifiable support for the claimed usefulness.
  3. [Datasets and generalizability] Benchmark datasets section: the movie-domain seed KG and inputs are used to support general conclusions about pipeline selection, yet no cross-domain experiments or sensitivity analysis address whether the metrics and rankings transfer to domains with greater schema heterogeneity or noise (e.g., biomedical data).
minor comments (2)
  1. [Abstract] The abstract mentions 'three formats' for input data but does not name them; adding this detail would improve clarity without altering the contribution.
  2. [Metrics] Notation for the updated KG versus reference KG should be introduced consistently when first describing the metrics.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback on our manuscript. We address each of the major comments below, indicating the changes we will make to strengthen the paper.

  1. Referee: [Benchmark and metrics description] Section describing the quality metrics: the three metrics (coverage, correctness, consistency) are presented as complementary for evaluating the updated KG, but no explicit definitions, formulas, or computation procedures relative to the reference KG are supplied. This is load-bearing for the central claim that the benchmark enables determination of best pipeline choices.

    Authors: We appreciate this observation. The manuscript introduces the metrics in the context of the benchmark but does not provide the explicit formulas or detailed computation procedures. We will revise the section on quality metrics to include precise definitions and formulas. For instance, coverage will be defined as the fraction of reference KG elements covered by the updated KG, correctness as the accuracy of integrated facts against the reference, and consistency as the degree to which the updated KG satisfies predefined constraints, with step-by-step procedures for calculation relative to the reference KG. This will support the claim more rigorously. revision: yes

  2. Referee: [Comparative evaluation] Evaluation and demonstration section: the analysis of the 12 pipelines' behavior across input formats and design choices lacks reported quantitative results, tables of metric values, or statistical comparisons, leaving the demonstration without verifiable support for the claimed usefulness.

    Authors: We acknowledge that while the manuscript provides an analysis of the pipelines' behaviors, it would be improved by including explicit quantitative results. In the revised version, we will add tables presenting the coverage, correctness, and consistency scores for each of the 12 pipelines under different input formats. Additionally, we will include statistical summaries or comparisons where appropriate to provide verifiable support for our observations on design choices and input formats. revision: yes

  3. Referee: [Datasets and generalizability] Benchmark datasets section: the movie-domain seed KG and inputs are used to support general conclusions about pipeline selection, yet no cross-domain experiments or sensitivity analysis address whether the metrics and rankings transfer to domains with greater schema heterogeneity or noise (e.g., biomedical data).

    Authors: The demonstration uses the movie domain to provide a controlled and reproducible example with the supplied datasets. We recognize that this limits the ability to draw broad conclusions about generalizability. In the revision, we will add a discussion on the potential applicability to other domains, including considerations for schema heterogeneity and noise, and perform a basic sensitivity analysis on the existing movie data by simulating varying levels of input noise if feasible. Full cross-domain validation would require additional datasets and is planned for future work. revision: partial
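
The noise-based sensitivity analysis mentioned in the third response could start from a small perturbation helper like the one below. This is a sketch under stated assumptions, not the authors' procedure: the function name `add_noise`, the triple representation, and the corruption strategy (appending a marker to the object value) are all invented for illustration.

```python
import random
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object); an assumed representation


def add_noise(triples: Set[Triple], rate: float, seed: int = 0) -> Set[Triple]:
    """Return a copy in which each triple's object is corrupted with probability `rate`."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = random.Random(seed)  # local generator, so runs are reproducible
    noisy = set()
    for s, p, o in sorted(triples):  # sorted so the same seed corrupts the same triples
        if rng.random() < rate:
            o = o + "_noise"  # crude stand-in for a wrong or misspelled value
        noisy.add((s, p, o))
    return noisy


# Toy input data, invented for this example only.
clean_input = {
    ("film:1", "title", "Alien"),
    ("film:1", "year", "1979"),
    ("film:2", "title", "Blade Runner"),
    ("film:2", "year", "1982"),
}

for noise_rate in (0.0, 0.25, 0.5):
    noisy_input = add_noise(clean_input, noise_rate, seed=42)
    changed = len(noisy_input - clean_input)
    print(f"rate={noise_rate}: {changed} of {len(clean_input)} triples corrupted")
```

In a full analysis, each noisy input would be fed to every pipeline and the three metrics recomputed per noise level; how quickly the scores degrade, and whether the pipeline ranking changes, is what the analysis would report.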

Circularity Check

0 steps flagged

No circularity: benchmark proposal and empirical evaluation stand independently


The paper proposes KGI-Bench as a new evaluation framework for KG integration pipelines, supplies movie-domain seed KG, input data in three formats, and reference KG, then runs an empirical comparison of 12 pipelines using the three metrics coverage, correctness, and consistency. No equations, derivations, fitted parameters, or predictions appear in the provided text. The central demonstration of the benchmark's usefulness is a direct empirical measurement on the supplied datasets rather than a reduction to self-citation, self-definition, or renamed known results. Concerns about domain representativeness affect external validity but do not constitute circularity under the stated criteria.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work introduces a benchmark without new mathematical axioms, free parameters, or invented entities; it builds on standard knowledge-graph concepts and evaluation practices.



