Construction of Historical Knowledge Graphs Based on BERT and Graph Neural Networks

Bartlomiej Brzozka; Ping Li

arxiv: 2606.01747 · v1 · pith:IQLAGWXVnew · submitted 2026-06-01 · 💻 cs.CL · cs.AI

Pith reviewed 2026-06-28 15:00 UTC · model grok-4.3

keywords knowledge graphs · BERT · graph neural networks · historical texts · entity extraction · relation extraction · digital humanities · text mining

The pith

BERT combined with graph neural networks extracts entities and relationships from historical texts more accurately than rule-based methods or other deep learning baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a joint BERT-GNN architecture that converts unstructured historical documents into structured knowledge graphs by extracting entities and relationships. BERT handles semantic context while the GNN models connections between elements, addressing the ambiguities and non-standard grammar common in old texts. The experiments use municipal records, parliamentary documents, and historical correspondence, and the paper claims higher precision, recall, and F1 scores than conventional approaches. A sympathetic reader would care because this could make large-scale digital analysis of historical material feasible by turning it into queryable structured data.

Core claim

What carries the argument

The joint BERT-GNN architecture that uses bidirectional encoding for semantic representations followed by graph-based relational learning to extract entities and relationships.
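
As a rough illustration of the graph-based stage, the sketch below runs one round of mean-aggregation message passing over a three-node graph. It is not the authors' model: the summary gives no layer definitions, so the short vectors merely stand in for BERT embeddings, learned weights and non-linearities are omitted, and the names `message_pass`, `features`, and `edges` are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def message_pass(features: Dict[str, Vector],
                 edges: List[Tuple[str, str]]) -> Dict[str, Vector]:
    """One round of mean-aggregation message passing on an undirected graph.

    Each node's new vector is the mean of its own vector and the vectors of
    its neighbours. A trained GNN layer would also apply learned weights and
    a non-linearity; both are left out of this sketch.
    """
    neighbours: Dict[str, List[str]] = {node: [] for node in features}
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)

    updated: Dict[str, Vector] = {}
    for node, vec in features.items():
        group = [vec] + [features[n] for n in neighbours[node]]
        updated[node] = [sum(v[i] for v in group) / len(group)
                         for i in range(len(vec))]
    return updated


# Toy vectors standing in for contextual embeddings (assumed values).
features = {
    "person": [1.0, 0.0, 0.0],
    "organization": [0.0, 1.0, 0.0],
    "event": [0.0, 0.0, 1.0],
}
# Candidate links between mentions (assumed for illustration).
edges = [("person", "organization"), ("person", "event")]

updated = message_pass(features, edges)
```

After the update, each node's vector carries information from its neighbours, which is the property a relation classifier placed on top of a GNN would rely on.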

If this is right

Historical data extraction becomes more accurate for building knowledge graphs than rule-based or standard deep-learning methods.
Complex nested structures and implicit references in historical writing can be handled with greater thoroughness.
The combination of context-sensitive semantic techniques and relational graph learning enables automatic addition of historical knowledge to repositories.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the performance gain holds, the approach could support scaling analysis to much larger historical corpora than manual methods allow.
The method might extend to other irregular text domains such as early legal or scientific documents.
Integration with other modalities like images could link extracted textual relations to visual historical sources.

Load-bearing premise

That BERT and GNN together can systematically resolve linguistic ambiguities, context-limited references, and lack of established grammatical norms in historical texts with sufficient accuracy to produce reliable knowledge graphs.

What would settle it

A direct comparison on the same collection of municipal records, parliamentary documents, and historical correspondence where the BERT-GNN system fails to show higher precision, recall, and F1-score than the rule-based and deep-learning baselines.
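
The comparison described above comes down to three numbers per system. A minimal sketch of how they could be computed over extracted relation triples, assuming exact-match scoring against a gold set (the summary does not state the paper's matching criterion, and the function name and placeholder triples are hypothetical):

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def precision_recall_f1(predicted: Set[Triple],
                        gold: Set[Triple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over sets of triples."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Placeholder triples, not taken from the paper's data.
gold = {
    ("A", "member_of", "B"),
    ("A", "attended", "C"),
    ("B", "located_in", "D"),
    ("C", "dated", "E"),
}
predicted = {
    ("A", "member_of", "B"),
    ("A", "attended", "C"),
    ("B", "member_of", "D"),  # wrong relation label, counts as an error
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
```

Here two of three predictions are correct (precision 2/3) and two of four gold triples are recovered (recall 1/2). Running the same function on each baseline's output over the same documents would give the like-for-like comparison the test calls for.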

Figures

Figures reproduced from arXiv: 2606.01747 by Bartlomiej Brzozka, Ping Li.

**Figure 1.** System Architecture for Historical Knowledge Graph Construction. Adjacent text from Section 4.1 (Dataset and Experimental Configuration): Approximately 15,000 phrases with more than 31,245 entities and more than 12,524 relationships between them were ultimately gathered based on a substantial number of municipal records, comments in the legislative record, and edited history. appropriately cross-checked the …

**Figure 2.** F1-score and Accuracy Comparison among BERT-GNN and Baseline Models.

**Figure 3.** BERT-GNN Convergence and GNN Layer Depth Sensitivity.

**Figure 4.** Confusion Matrix Analysis for Entity and Relation Extraction. Adjacent text: BERT-GNN demonstrates its efficacy using an example from the UK parliamentary proceedings in 1867. The model accurately identifies the following entities: "Industrial Committee" (organization), "Charles Robinson" (person), and "Session vote, 1867" (event/Dates). Additionally, it can replicate the information about member relationships and partic…
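
To make the Figure 4 example concrete, here is a minimal sketch of how the three extracted entities might be stored as a small typed graph. The entity names and types come from the caption; the `KnowledgeGraph` class, the `member_of` label, and the edge direction are assumptions for illustration, since the caption is truncated and only mentions membership in general terms.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class KnowledgeGraph:
    """Tiny in-memory graph: typed entities plus (head, relation, tail) edges."""
    entity_types: Dict[str, str] = field(default_factory=dict)
    triples: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_entity(self, name: str, entity_type: str) -> None:
        self.entity_types[name] = entity_type

    def add_relation(self, head: str, relation: str, tail: str) -> None:
        # Reject edges whose endpoints were never registered as entities.
        if head not in self.entity_types or tail not in self.entity_types:
            raise KeyError("both endpoints must be known entities")
        self.triples.append((head, relation, tail))

    def neighbours(self, name: str) -> List[Tuple[str, str]]:
        """Return (relation, tail) pairs for edges leaving `name`."""
        return [(r, t) for h, r, t in self.triples if h == name]


kg = KnowledgeGraph()
kg.add_entity("Industrial Committee", "organization")
kg.add_entity("Charles Robinson", "person")
kg.add_entity("Session vote, 1867", "event")
# Hypothetical edge label and direction; the caption only mentions membership.
kg.add_relation("Charles Robinson", "member_of", "Industrial Committee")

links = kg.neighbours("Charles Robinson")
```

Storing edges as plain triples keeps the structure easy to export to a graph database or to score against a gold annotation.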

Original abstract

Through digital humanities research and scale-up historical data analysis, a significant amount of traditional historical text is converted into structured knowledge graphs. This paper provides a high-level architecture that combines bidirectional encoder representations of transformers (BERT) and graph neural networks (GNN) to extract the entities and relationships from various types of historical texts. The texts of traditional history resolve linguistic ambiguities, references limited by context, and a lack of established grammatical norms in a systematic way. This study develops a new image retrieval system based on FastRQNet and pre-trained vision-language model Vilt-qaformer+RoBInet in accordance with the aforementioned recommendations. The experiments make full use of a comprehensive collection of municipal records, parliamentary documents, and historical correspondence. When compared to conventional rule-based techniques and other popular deep-learning baselines, the joint BERT-GNN system obtains greater Precision, Recall, and F1-score (Table 2). Complex nested structures and implicit reference issues can be handled by this structure with sufficient accuracy and thoroughness when creating knowledge graphs. The aforementioned experiments show that combining relational graph learning algorithms with context-sensitive semantic representation techniques can automatically extract historical data to add accumulated wisdom to the knowledge repository.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The abstract's abrupt switch to an unrelated image retrieval system breaks any link between the BERT-GNN method and the claimed results.

The one thing to know is that the abstract inserts a sentence about developing an image retrieval system with FastRQNet and Vilt-qaformer right in the middle of the BERT-GNN description for historical texts. This switch leaves the Table 2 performance numbers without a clear connection to the stated pipeline.

The paper applies standard BERT for semantic context and GNN for relations on actual historical sources such as municipal records, parliamentary documents, and correspondence. Using those collections instead of synthetic data is a small step in the right direction, and the target problem of extracting entities and relations from texts with non-standard grammar is a real one in digital humanities.

The central weakness is the incoherence. The abstract asserts that the joint BERT-GNN system beats rule-based methods and other deep-learning baselines on precision, recall, and F1, yet the inserted image-retrieval claim has no transition or follow-up. Until that is fixed, the experimental results cannot be attributed to the described architecture. The abstract supplies no equations, no method details, and no error analysis, so the gains remain assertions rather than demonstrated outcomes.

This is aimed at digital humanities readers who might want an off-the-shelf pipeline for turning old documents into graphs. Anyone already working with BERT or GNN applications will see nothing new in the approach itself.

I would not bring this to a reading group. It does not deserve peer review in its current form because the basic attribution of results is broken and the thinking is not coherent on its own terms.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a high-level architecture that combines BERT and graph neural networks (GNN) to extract entities and relationships from historical texts (municipal records, parliamentary documents, and historical correspondence) in order to construct knowledge graphs. It claims that the joint BERT-GNN system systematically resolves linguistic ambiguities, context-limited references, and non-standard grammar, and that it obtains higher Precision, Recall, and F1-score than rule-based methods and other deep-learning baselines (Table 2). The abstract also contains an unrelated sentence describing an image retrieval system based on FastRQNet and Vilt-qaformer+RoBInet.

Significance. If the experimental results could be unambiguously attributed to the described BERT-GNN pipeline, the work would address a genuine need in digital humanities for automated structuring of historical texts. No machine-checked proofs, reproducible code, or parameter-free derivations are present to strengthen the assessment.

major comments (2)

[Abstract] Abstract: the sentence 'This study develops a new image retrieval system based on FastRQNet and pre-trained vision-language model Vilt-qaformer+RoBInet in accordance with the aforementioned recommendations' has no connection to the BERT-GNN architecture or to historical-text entity/relation extraction. This severs attribution of the Table 2 performance numbers to the claimed method, rendering the central empirical claim unevaluable.
[Abstract] Abstract and § (methods description): only a high-level architecture is sketched; no model details, training procedure, dataset statistics, hyper-parameters, or error analysis are supplied, so the reported gains in Precision/Recall/F1 cannot be reproduced or scrutinized.

minor comments (1)

[Abstract] Abstract: the clause 'The texts of traditional history resolve linguistic ambiguities...' is grammatically unclear and should be rephrased.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments. We address the major points below and indicate where revisions will be made.

Referee: [Abstract] Abstract: the sentence 'This study develops a new image retrieval system based on FastRQNet and pre-trained vision-language model Vilt-qaformer+RoBInet in accordance with the aforementioned recommendations' has no connection to the BERT-GNN architecture or to historical-text entity/relation extraction. This severs attribution of the Table 2 performance numbers to the claimed method, rendering the central empirical claim unevaluable.

Authors: We agree this sentence is unrelated and was included in error during drafting. It will be removed from the abstract. The Table 2 results derive from the BERT-GNN experiments on municipal records, parliamentary documents, and historical correspondence as described in the methods and experiments sections. revision: yes

Referee: [Abstract] Abstract and § (methods description): only a high-level architecture is sketched; no model details, training procedure, dataset statistics, hyper-parameters, or error analysis are supplied, so the reported gains in Precision/Recall/F1 cannot be reproduced or scrutinized.

Authors: The manuscript presents a high-level architecture to emphasize the overall BERT-GNN pipeline for historical knowledge graph construction. We acknowledge that greater detail is required for reproducibility. In revision we will expand the methods section with BERT and GNN component specifications, training procedures, dataset statistics, hyper-parameters, and error analysis. revision: yes

Circularity Check

0 steps flagged

No derivation chain or equations present; no circularity detectable

The manuscript describes a high-level BERT-GNN architecture for entity/relation extraction from historical texts and reports empirical Precision/Recall/F1 gains versus baselines in Table 2. No equations, derivations, fitted parameters, or self-citations appear in the provided text. The central claim is an empirical performance comparison rather than a mathematical derivation that could reduce to its inputs by construction. None of the enumerated circularity patterns (self-definitional, fitted-input-called-prediction, self-citation load-bearing, etc.) are applicable because no load-bearing derivation steps exist.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, axioms, or invented entities.
