
arxiv: 2507.13625 · v2 · submitted 2025-07-18 · 💻 cs.AI

Bridging Dual Knowledge Graphs for Multi-Hop Question Answering in Construction Safety

Pith reviewed 2026-05-19 05:02 UTC · model grok-4.3

classification 💻 cs.AI
keywords dual knowledge graphs · multi-hop question answering · retrieval-augmented generation · construction safety · compliance checking · large language models · information retrieval · regulatory text

The pith

Dual knowledge graphs enable high-accuracy multi-hop reasoning over complex safety regulations for automated compliance checking.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents BifrostRAG, a system that uses two interconnected knowledge graphs to handle questions about construction safety rules that require linking multiple pieces of information. One graph captures how words and phrases relate in the text, while the other tracks the document's organizational structure like sections and clauses. By combining graph-based traversal with semantic vector search, the system retrieves relevant information for large language models to answer accurately. This approach matters because safety regulations are dense and interconnected, making manual or simple search methods error-prone for compliance tasks in construction projects.

Core claim

BifrostRAG models both linguistic relationships and document structure using dual knowledge graphs. It employs a hybrid retrieval mechanism that combines graph traversal with vector-based semantic search. On a multi-hop question dataset, this yields 92.8% precision, 85.5% recall, and an F1 score of 87.3%, significantly outperforming vector-only and graph-only RAG baselines and serving as a robust knowledge engine for LLM-driven compliance checking.
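The relationship between the three reported figures can be checked with a short sketch. The excerpt under Figure 9 indicates that each model returned answers with reference section IDs, so the sketch scores sets of section IDs per question; the per-question set comparison, the macro-averaging, the function names, and the sample section IDs are all assumptions made for illustration, not details confirmed by the paper. The harmonic mean of the aggregate precision (92.8%) and recall (85.5%) is about 89.0%, not the reported 87.3%, which would be consistent with F1 being averaged over questions rather than derived from the aggregate figures.

```python
def prf(predicted, gold):
    """Precision, recall, and F1 for one question, comparing sets of section IDs."""
    predicted, gold = set(predicted), set(gold)
    hits = len(predicted & gold)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def macro_average(results):
    """Average each of (precision, recall, F1) over all questions."""
    n = len(results)
    return tuple(sum(r[i] for r in results) / n for i in range(3))


def harmonic_mean(p, r):
    """F1 computed directly from a single precision/recall pair."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical (predicted, gold) section IDs for two questions.
questions = [
    (["1926.501(b)(1)", "1926.502(d)"], ["1926.501(b)(1)", "1926.502(d)"]),
    (["1926.451(g)(1)"], ["1926.451(g)(1)", "1926.502(d)"]),
]
per_question = [prf(p, g) for p, g in questions]
macro_p, macro_r, macro_f1 = macro_average(per_question)

# F1 implied by the aggregate precision and recall quoted in the core claim.
implied_f1 = harmonic_mean(0.928, 0.855)
```

In this toy example the macro-averaged F1 is lower than the harmonic mean of the macro-averaged precision and recall, the same direction as the gap between 87.3% and 89.0%.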

What carries the argument

The dual-graph hybrid retrieval mechanism, which integrates linguistic relationship graphs with structural document graphs to support multi-hop synthesis across regulatory clauses.
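A minimal sketch of how a two-stage, dual-graph retrieval of this kind could work is given below. It is not the authors' implementation: the toy three-dimensional embeddings stand in for a real embedding model, the graph contents and links are invented, and the names `ENTITY_GRAPH`, `STRUCTURE_GRAPH`, `seed_sections`, and `expand` are hypothetical. Stage 1 follows the figure excerpt (question entities are matched semantically against the entity graph); stage 2 is an assumption based on the abstract's mention of graph traversal, since the excerpt is cut off before describing it.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Toy entity graph: entity -> (embedding, sections that mention it). Hypothetical data.
ENTITY_GRAPH = {
    "guardrail system": ([1.0, 0.0, 0.1], {"1926.502(b)"}),
    "scaffold": ([0.0, 1.0, 0.1], {"1926.451(g)"}),
}

# Toy structure graph: section -> linked sections (parent or cross-reference). Hypothetical data.
STRUCTURE_GRAPH = {
    "1926.451(g)": {"1926.451", "1926.502(b)"},
    "1926.502(b)": {"1926.502"},
    "1926.451": set(),
    "1926.502": set(),
}


def seed_sections(query_vectors, threshold=0.8):
    """Stage 1: match question-entity embeddings against entity-graph nodes."""
    seeds = set()
    for query in query_vectors:
        for vector, sections in ENTITY_GRAPH.values():
            if cosine(query, vector) >= threshold:
                seeds |= sections
    return seeds


def expand(seeds, max_hops=1):
    """Stage 2: breadth-first traversal of the structure graph from the seed sections."""
    seen = set(seeds)
    frontier = deque((section, 0) for section in seeds)
    while frontier:
        section, depth = frontier.popleft()
        if depth == max_hops:
            continue
        for neighbour in STRUCTURE_GRAPH.get(section, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen


# A question whose single entity embedding lies close to "scaffold".
context = expand(seed_sections([[0.0, 0.9, 0.2]]), max_hops=1)
```

The sections collected in `context` would then be passed to the language model as retrieval context. Raising `max_hops` widens the context at the cost of pulling in less relevant clauses.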

If this is right

  • Supports accurate synthesis of information across interlinked clauses in regulatory texts.
  • Outperforms traditional single-graph or vector-based retrieval methods in precision and recall for compliance queries.
  • Offers a blueprint for applying similar dual-graph approaches to complex technical documents in other engineering fields.
  • Enhances the reliability of LLM-based systems for automated construction compliance checking.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Applying this to other regulatory areas like building codes or environmental standards could yield similar gains in query handling.
  • Future work might test integration with real-time project data to flag compliance issues proactively during construction planning.
  • Scalability to larger regulatory corpora without performance loss would need verification in expanded datasets.

Load-bearing premise

The linguistic and structural relationships captured by the dual knowledge graphs are sufficient to support accurate multi-hop reasoning over the full complexity of regulatory text without additional domain-specific tuning or post-processing.

What would settle it

A test on a new multi-hop dataset, drawn from a different set of construction regulations, in which BifrostRAG's F1 score falls below that of the vector-only or graph-only baseline would indicate that the dual-graph approach does not generalize as claimed.

Figures

Figures reproduced from arXiv: 2507.13625 by Mo Hu, Xi Wang, Yuxin Zhang, and Zhenyu Zhang (Department of Construction Science, College of Architecture, Texas A&M University, College Station, USA).

Figure 1: Structure of OSHA 1926
2.3. Multi-hop Questions. Due to the structural complexity of safety regulations, safety professionals frequently encounter questions that require integrating information from multiple, non-contiguous provisions, a process known as multi-hop QA [23, 11]. This process demands navigation through the relationship types and enabling mechanisms described in …
Figure 2: BiFrostRAG Framework
These complementary graphs power BifrostRAG’s hybrid retrieval mechanism (Figure 2b), which employs a two-stage retrieval process for QA. First, user questions are decomposed into entities and triples, then semantically matched against the ENG. This process identifies semantically relevant sections that share terminology and scope of application through entity and triple matching. Seco…
Figure 3: Prompt Outline of Content Pruning
Entity Extraction: The entity extraction agent integrates the simplified text produced by the content pruning agent into the user prompt as input. Then it requests LLM to extract entities in accordance with Wikipedia definitions, assigning representative labels based on prior knowledge. In addition to entity identification and labeling, the agent also extracts all referen…
Figure 4: Brief Prompt of Entity Extraction
System prompt: Now you are a professional knowledge graph generator. You will be provided with a paragraph from the OSHA construction safety regulations. Extract all relations between the previously defined entities {} from the entire text. 1. Only use previously defined entities: {}. Do not make any change to the previously defined entities name. 2. When describing the relat…
Figure 5: Brief Prompt of Relationship Extraction
… Figure 6a, both the newly extracted entity strings E_new and the relation strings R_new are transformed into vector representations via the embedding engine. The relation vectors are directly collected into the local schema, whereas for each new entity e_new, the iEntities Refiner performs an iterative cosine similarity computation between its embedding and the set of …
Figure 6: Embedding-Based Incremental Entity Refinement
Figure 7: The Construction of Document Navigator Graph
Figure 8: Designed Interface of the OSHA Regulation QA System
Figure 9: Search Result for a Sample User Query
… expert deliberation and iterative refinement, a set of predefined ground truth answers is established for performance evaluation. 6.2. Evaluation Metrics. In the experiment, each model was instructed to generate answers with corresponding reference section IDs. We evaluated model performance in answering multi-hop QA using three objective metrics: (1) Precision rate: …
Figure 10: Query Processing Pipeline of the QA System
Figure 11: Visualization of the Generated Dual Knowledge Graphs in Neo4j
Figure 12: Performance Comparison Across OSHA Subparts
Figure 13: Hierarchical Clustering of Low-Performing Questions
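
The excerpt under Figure 5 describes an embedding-based refinement step in which each newly extracted entity is compared by cosine similarity against a set of existing entities. The excerpt is cut off before it says what happens after the comparison, so the sketch below fills that gap with an assumption: a new entity is mapped onto its most similar known entity when the similarity reaches a threshold, and is registered as new otherwise. The 0.9 threshold, the function name `refine_entities`, and the toy two-dimensional embeddings are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def refine_entities(new_entities, known_entities, threshold=0.9):
    """Map each new entity to a canonical name.

    Both arguments are dicts of entity name -> embedding. A new entity is
    merged into its most similar known entity when the cosine similarity is
    at least `threshold`; otherwise it is added to `known_entities` in place.
    Returns a dict of new name -> canonical name.
    """
    mapping = {}
    for name, vector in new_entities.items():
        best_name, best_sim = None, -1.0
        for known_name, known_vector in known_entities.items():
            sim = cosine(vector, known_vector)
            if sim > best_sim:
                best_name, best_sim = known_name, sim
        if best_name is not None and best_sim >= threshold:
            mapping[name] = best_name          # near-duplicate: reuse the known entity
        else:
            known_entities[name] = vector      # genuinely new: register it
            mapping[name] = name
    return mapping


# Hypothetical entities with toy embeddings.
known = {"personal fall arrest system": [1.0, 0.0]}
new = {"fall arrest system": [0.99, 0.05], "ladder": [0.0, 1.0]}
mapping = refine_entities(new, known)
```

Because entities registered earlier in the loop become candidates for later ones, the result can depend on processing order; a production system would need to decide whether that is acceptable.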
Original abstract

Information retrieval and question answering from safety regulations are essential for automated construction compliance checking but are hindered by the linguistic and structural complexity of regulatory text. Many queries are multi-hop, requiring synthesis across interlinked clauses. To address the challenge, this paper introduces BifrostRAG, a dual-graph retrieval-augmented generation (RAG) system that models both linguistic relationships and document structure. The proposed architecture supports a hybrid retrieval mechanism that combines graph traversal with vector-based semantic search, enabling large language models to reason over both the content and the structure of the text. On a multi-hop question dataset, BifrostRAG achieves 92.8% precision, 85.5% recall, and an F1 score of 87.3%. These results significantly outperform vector-only and graph-only RAG baselines, establishing BifrostRAG as a robust knowledge engine for LLM-driven compliance checking. The dual-graph, hybrid retrieval mechanism presented in this paper offers a transferable blueprint for navigating complex technical documents across knowledge-intensive engineering domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces BifrostRAG, a dual knowledge graph RAG system that models both linguistic relationships and document structure for multi-hop question answering over construction safety regulations. It uses hybrid retrieval combining graph traversal and vector-based search to support LLM reasoning, reporting 92.8% precision, 85.5% recall, and 87.3% F1 on a multi-hop question dataset while outperforming vector-only and graph-only baselines.

Significance. If the empirical results hold under proper validation, the dual-graph hybrid approach offers a practical blueprint for navigating complex regulatory texts in engineering domains. The work directly targets a real-world need in automated compliance checking and could transfer to other knowledge-intensive technical documents.

major comments (2)
  1. Abstract: The central claim of robustness for multi-hop reasoning rests on the reported metrics (92.8% precision, 85.5% recall, 87.3% F1) significantly outperforming baselines, yet no details are supplied on dataset size, construction method (expert annotation vs. synthetic/LLM-generated), clause-type diversity, or whether the test questions were held out from dual-graph construction. This omission is load-bearing because it prevents distinguishing genuine generalization from possible dataset artifacts that exploit the exact linguistic/structural links modeled by the graphs.
  2. Evaluation (implied by abstract claims): Without reported statistical significance tests, error analysis, baseline implementation details, or dataset statistics, it is impossible to assess whether the outperformance demonstrates sufficiency for full regulatory complexity or merely reflects a small or specially crafted test set.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for reviewing our manuscript and providing valuable feedback. We have carefully considered the major comments regarding the transparency of our dataset and evaluation details. We address each point below and have revised the manuscript to include the requested information.

Point-by-point responses
  1. Referee: Abstract: The central claim of robustness for multi-hop reasoning rests on the reported metrics (92.8% precision, 85.5% recall, 87.3% F1) significantly outperforming baselines, yet no details are supplied on dataset size, construction method (expert annotation vs. synthetic/LLM-generated), clause-type diversity, or whether the test questions were held out from dual-graph construction. This omission is load-bearing because it prevents distinguishing genuine generalization from possible dataset artifacts that exploit the exact linguistic/structural links modeled by the graphs.

    Authors: We agree with the referee that these details are crucial for evaluating the generalizability of our results. Although the full manuscript describes the dataset construction in Section 4.1, we recognize that the abstract lacked this information. We have revised the abstract to briefly note the dataset size and expert annotation process. Furthermore, we have added explicit statements confirming that the test questions were held out from the dual-graph construction and included statistics on clause-type diversity in the revised Experiments section.

    Revision: yes

  2. Referee: Evaluation (implied by abstract claims): Without reported statistical significance tests, error analysis, baseline implementation details, or dataset statistics, it is impossible to assess whether the outperformance demonstrates sufficiency for full regulatory complexity or merely reflects a small or specially crafted test set.

    Authors: We acknowledge that the original submission did not include statistical significance tests or a dedicated error analysis. In the revised version, we have added these elements: we now report p-values from appropriate statistical tests showing the significance of the performance improvements. A new error analysis subsection discusses the remaining failure cases and their implications for regulatory complexity. We have also expanded the baseline descriptions with implementation details and added a table with comprehensive dataset statistics to better characterize the test set.

    Revision: yes
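
The rebuttal refers to "appropriate statistical tests" without naming one. As an illustration only, one option for comparing two retrieval systems on the same questions is a paired sign-flip permutation test on per-question scores. The choice of test, the function name `paired_permutation_test`, and the score lists below are assumptions; nothing here is taken from the paper.

```python
import random


def paired_permutation_test(scores_a, scores_b, trials=10_000, seed=0):
    """Two-sided paired sign-flip permutation test on the mean score difference.

    Under the null hypothesis the two systems are exchangeable per question,
    so the sign of each paired difference can be flipped at random.
    Returns an estimated p-value.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must be paired")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    rng = random.Random(seed)
    extreme = 0
    for _ in range(trials):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed - 1e-12:
            extreme += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (extreme + 1) / (trials + 1)


# Hypothetical per-question F1 scores for two systems on the same eight questions.
dual_graph = [1.0, 0.8, 0.9, 1.0, 0.7, 0.9, 1.0, 0.8]
vector_only = [0.6, 0.5, 0.9, 0.7, 0.4, 0.6, 0.8, 0.5]
p_value = paired_permutation_test(dual_graph, vector_only)
```

A paired test is used because both systems answer the same questions; with a small test set, the number of questions bounds how small the p-value can get, which is one reason the referee's request for dataset size matters.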

Circularity Check

0 steps flagged

No circularity detected; results rest on external dataset and baselines

Full rationale

The paper presents BifrostRAG as an architectural system combining dual knowledge graphs with hybrid retrieval for multi-hop QA over regulatory text. Performance is reported as empirical metrics (92.8% precision, 85.5% recall, 87.3% F1) on a multi-hop question dataset, with direct comparison to vector-only and graph-only RAG baselines. No equations, derivations, fitted parameters, or self-citations appear in the provided text that would reduce these outcomes to inputs by construction. The evaluation chain relies on an external test set and standard retrieval baselines, remaining self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, mathematical axioms, or newly postulated entities are stated. The system name BifrostRAG and the dual-graph split are presented as engineering choices rather than derived entities.

pith-pipeline@v0.9.0 · 5741 in / 1202 out tokens · 42064 ms · 2026-05-19T05:02:37.526934+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages · 1 internal anchor

  1. [1]

    Q. Chen, D. Long, C. Yang, H. Xu, Knowledge graph improved dynamic risk analysis method for behavior-based safety management on a construction site, Journal of Management in Engineering 39 (4) (2023) 04023023. doi:10.1061/JMENEA.MEENG-5306

  2. [2]

    Y. Lu, Q. Li, Z. Zhou, Y. Deng, Ontology-based knowledge modeling for automated construction safety checking, Safety Science 79 (2015) 11–18. doi:10.1016/j.ssci.2015.05.008

  3. [3]

    S. Choe, S. Yun, F. Leite, Analysis of the effectiveness of the osha steel erection standard in the construction industry, Safety Science 89 (2016) 190–200. doi:10.1016/j.ssci.2016.06.016

  4. [4]

    P. K. Howard, The Death of Common Sense: How Law Is Suffocating America, Random House Publishing Group, 2011

  5. [5]

    P. Schulte, A. Okun, C. Stephenson, M. Colligan, H. Ahlers, C. Gjessing, G. Loos, R. Niemeier, M. Sweeney, Information dissemination and use: Critical components in occupational safety and health, American Journal of Industrial Medicine 44 (5) (2003) 515–531. doi:10.1002/ajim.10295

  6. [6]

    W. Solihin, C. Eastman, A knowledge representation approach in bim rule requirement analysis using the conceptual graph, Journal of Information Technology in Construction (ITcon) 21 (24) (2016) 370–401

  7. [7]

    Y. Zhou, W. Solihin, J. K. W. Yeoh, Facilitating knowledge transfer during code compliance checking using conceptual graphs, Journal of Computing in Civil Engineering 37 (5) (2023) 05023001. doi:10.1061/JCCEE5.CPENG-4884

  8. [9]

    H. Wu, B. Zhong, H. Li, P. Love, X. Pan, N. Zhao, Combining computer vision with semantic reasoning for on-site safety management in construction, Journal of Building Engineering 42 (2021) 103036. doi:10.1016/j.jobe.2021.103036

  9. [10]

    A. S. Kulinan, M. Park, P. P. W. Aung, G. Cha, S. Park, Advancing construction site workforce safety monitoring through bim and computer vision integration, Automation in Construction 158 (2024) 105227. doi:10.1016/j.autcon.2023.105227

  10. [11]

    D. Cui, S. Xu, S. Wang, K. Zhang, Beyond the images: Comprehensible unsafe behaviour recognition boosted by joint inference graph with multi-hop reasoning, Advanced Engineering Informatics 66 (2025) 103454. doi:10.1016/j.aei.2025.103454

  11. [12]

    J. Zhang, X. Ruan, H. Si, X. Wang, Dynamic hazard analysis on construction sites using knowledge graphs integrated with real-time information, Automation in Construction 170 (2025) 105938

  12. [13]

    J. Lee, S. Ahn, D. Kim, D. Kim, Performance comparison of retrieval-augmented generation and fine-tuned large language models for construction safety management knowledge retrieval, Automation in Construction 168 (2024) 105846

  13. [14]

    Y. Chen, G. Lu, K. Wang, S. Chen, C. Duan, Knowledge graph for safety management standards of water conservancy construction engineering, Automation in Construction 168 (2024) 105873. doi:10.1016/j.autcon.2024.105873

  14. [15]

    C. Wu, W. Ding, Q. Jin, J. Jiang, R. Jiang, Q. Xiao, L. Liao, X. Li, Retrieval augmented generation-driven information retrieval and question answering in construction management, Advanced Engineering Informatics 65 (2025) 103158. doi:10.1016/j.aei.2025.103158

  15. [16]

    L. Guo, F. Yan, T. Li, T. Yang, Y. Lu, An automatic method for constructing machining process knowledge base from knowledge graph, Robotics and Computer-Integrated Manufacturing 73 (2022) 102222. doi:10.1016/j.rcim.2021.102222

  16. [18]

    J. Dong, Q. Zhang, X. Huang, K. Duan, Q. Tan, Z. Jiang, Hierarchy-aware multi-hop question answering over knowledge graphs, in: Proceedings of the ACM Web Conference 2023, 2023, pp. 2519–2527. doi:10.1145/3543507.3583376

  17. [19]

    R. Ren, J. Zhang, Semantic rule-based construction procedural information extraction to guide jobsite sensing and monitoring, Journal of Computing in Civil Engineering 35 (6) (2021) 04021026. doi:10.1061/(ASCE)CP.1943-5487.0000971

  18. [20]

    X. Wang, N. El-Gohary, Deep learning-based named entity recognition and resolution of referential ambiguities for enhanced information extraction from construction safety regulations, Journal of Computing in Civil Engineering 37 (5) (2023) 04023023. doi:10.1061/(ASCE)CP.1943-5487.0001064

  19. [21]

    X. Wang, N. El-Gohary, Deep learning-based relation extraction and knowledge graph-based representation of construction safety requirements, Automation in Construction 147 (2023) 104696. doi:10.1016/j.autcon.2022.104696

  20. [22]

    H. Wang, S. Xu, D. Cui, H. Xu, H. Luo, Information integration of regulation texts and tables for automated construction safety knowledge mapping, Journal of Construction Engineering and Management 150 (5) (2024) 04024034. doi:10.1061/JCEMD4.COENG-14436

  21. [23]

    V. Mavi, A. Jangra, A. Jatowt, Multi-hop question answering, Foundations and Trends® in Information Retrieval 17 (5) (2024) 457–586. doi:10.1561/1500000102

  22. [24]

    L. Ehrlinger, W. Wöß, Towards a definition of knowledge graphs (2016)

  23. [25]

    H. Paulheim, Knowledge graph refinement: A survey of approaches and evaluation methods, Semantic Web 8 (3) (2017) 489–508. doi:10.3233/SW-160218

  24. [26]

    A. Hogan, E. Blomqvist, M. Cochez, C. D’amato, G. De Melo, C. Gutierrez, S. Kirrane, et al., Knowledge graphs, ACM Computing Surveys 54 (4) (2022) 1–37. doi:10.1145/3447772

  25. [27]

    S. Malyshev, M. Krötzsch, L. González, J. Gonsior, A. Bielefeldt, Getting the most out of wikidata: Semantic technology usage in wikipedia’s knowledge graph, in: The Semantic Web – ISWC 2018, 2018, pp. 376–394

  26. [28]

    M. Yahya, J. G. Breslin, M. I. Ali, Semantic web and knowledge graphs for industry 4.0, Applied Sciences 11 (11) (2021) 5110. doi:10.3390/app11115110

  27. [29]

    X. Zou, A survey on application of knowledge graph, Journal of Physics: Conference Series 1487 (1) (2020) 012016. doi:10.1088/1742-6596/1487/1/012016

  28. [30]

    J. Qian, X.-Y. Li, C. Zhang, L. Chen, T. Jung, J. Han, Social network de-anonymization and privacy inference with knowledge graph model, IEEE Transactions on Dependable and Secure Computing 16 (4) (2019) 679–

  29. [31]

    doi:10.1109/TDSC.2017.2697854

  30. [32]

    Z. Wang, T. Chen, J. Ren, W. Yu, H. Cheng, L. Lin, Deep reasoning with knowledge graph for social relationship understanding, arXiv (2018). doi:10.48550/arXiv.1807.00504

  31. [33]

    S. Zhu, J. Zhou, L. Cheng, X. Fu, Y. Wang, K. Dai, Research on a bim model quality compliance checking method based on a knowledge graph, Journal of Computing in Civil Engineering 39 (1) (2025) 04024049. doi:10.1061/JCCEE5.CPENG-5950

  32. [34]

    V. K. Kommineni, B. König-Ries, S. Samuel, From human experts to machines: An llm supported approach to ontology and knowledge graph construction, arXiv (2024). doi:10.48550/arXiv.2403.08345

  33. [35]

    L. Asprino, E. Daga, A. Gangemi, P. Mulholland, Knowledge graph construction with a façade: A unified method to access heterogeneous data sources on the web, ACM Transactions on Internet Technology 23 (1) (2023) 6:1–6:31. doi:10.1145/3555312

  34. [36]

    S. Ji, S. Pan, E. Cambria, P. Marttinen, P. S. Yu, A survey on knowledge graphs: Representation, acquisition, and applications, IEEE Transactions on Neural Networks and Learning Systems 33 (2) (2022) 494–514. doi:10.1109/TNNLS.2021.3070843

  35. [37]

    L. Zhong, J. Wu, Q. Li, H. Peng, X. Wu, A comprehensive survey on automatic knowledge graph construction, ACM Computing Surveys 56 (4) (2023) 94:1–94:62. doi:10.1145/3618295

  36. [38]

    T. Al-Moslmi, M. G. Ocana, A. L. Opdahl, C. Veres, Named entity extraction for knowledge graphs: A literature overview, IEEE Access 8 (2020) 32862–32881. doi:10.1109/ACCESS.2020.2973928

  37. [39]

    K. Nassiri, M. Akhloufi, Transformer models used for text-based question answering systems, Applied Intelligence 53 (9) (2023) 10602–10635. doi:10.1007/s10489-022-04052-8

  38. [40]

    L. Huang, W. Yu, W. Ma, W. Zhong, Z. Feng, H. Wang, Q. Chen, et al., A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions, ACM Transactions on Information Systems 43 (2) (2025) 1–55. doi:10.1145/3703155

  39. [41]

    T. Korbak, H. Elsahar, G. Kruszewski, M. Dymetman, On reinforcement learning and distribution matching for fine-tuning language models with no catastrophic forgetting, Advances in Neural Information Processing Systems 35 (2022) 16203–16220

  40. [42]

    Z. Liu, W. Ping, R. Roy, P. Xu, C. Lee, M. Shoeybi, B. Catanzaro, Chatqa: Surpassing gpt-4 on conversational qa and rag, Advances in Neural Information Processing Systems 37 (2025) 15416–15459

  41. [43]

    T. R. McIntosh, T. Liu, T. Susnjak, P. Watters, A. Ng, M. N. Halgamuge, A culturally sensitive test to evaluate nuanced gpt hallucination, IEEE Transactions on Artificial Intelligence 5 (6) (2024) 2739–2751. doi:10.1109/TAI.2023.3332837

  42. [44]

    S. Pan, L. Luo, Y. Wang, C. Chen, J. Wang, X. Wu, Unifying large language models and knowledge graphs: A roadmap, IEEE Transactions on Knowledge and Data Engineering 36 (7) (2024) 3580–3599. doi:10.1109/TKDE.2024.3352100

  43. [45]

    Y. Zhu, X. Wang, J. Chen, S. Qiao, Y. Ou, Y. Yao, S. Deng, H. Chen, N. Zhang, Llms for knowledge graph construction and reasoning: Recent capabilities and future opportunities, World Wide Web 27 (5) (2024) 58. doi:10.1007/s11280-024-01297-w

  44. [46]

    L.-P. Meyer, J. Frey, F. Brei, N. Arndt, Assessing sparql capabilities of large language models, arXiv (2024). doi:10.48550/arXiv.2409.05925

  45. [47]

    T. Taipalus, Vector database management systems: Fundamental concepts, use-cases, and current challenges, Cognitive Systems Research 85 (2024) 101216. doi:10.1016/j.cogsys.2024.101216

  46. [48]

    Y. Wan, Z. Chen, Y. Liu, C. Chen, M. Packianather, Empowering llms by hybrid retrieval-augmented generation for domain-centric q&a in smart manufacturing, Advanced Engineering Informatics 65 (2025) 103212. doi:10.1016/j.aei.2025.103212

  47. [49]

    X. Pan, W. Zhuang, S. Wen, W. Yu, J. Bao, X. Li, A context-aware kg-llm collaborated conceptual design approach for personalized products: A case in lower limbs rehabilitation assistive devices, Advanced Engineering Informatics 66 (2025) 103422. doi:10.1016/j.aei.2025.103422

  48. [50]

    D. Zhang, G. Ma, T. Qu, X. Wang, W. Zhou, X. Wang, A knowledge graph-enhanced large language model for question answering of hydraulic structure safety management, Advanced Engineering Informatics 66 (2025) 103468. doi:10.1016/j.aei.2025.103468

  49. [51]

    N. Francis, A. Green, P. Guagliardo, L. Libkin, T. Lindaaker, V. Marsault, S. Plantikow, M. Rydberg, P. Selmer, A. Taylor, Cypher: An evolving query language for property graphs, in: Proceedings of the 2018 International Conference on Management of Data, 2018, pp. 1433–1445. doi:10.1145/3183713.3190657

  50. [52]

    F. Sammour, J. Xu, X. Wang, M. Hu, Z. Zhang, Responsible ai in construction safety: Systematic evaluation of large language models and prompt engineering, arXiv (2024). doi:10.48550/arXiv.2411.08320

  51. [53]

    J. Pujara, E. Augustine, L. Getoor, Sparsity and noise: Where knowledge graph embeddings fall short, in: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017, pp. 1751–1756. doi:10.18653/v1/D17-1184

  52. [54]

    R. Omar, O. Mangukiya, P. Kalnis, E. Mansour, Chatgpt versus traditional question answering for knowledge graphs: Current status and future directions towards knowledge graph chatbots, arXiv (2023). doi:10.48550/arXiv.2302.06466

  53. [55]

    Y. Tan, D. Min, Y. Li, W. Li, N. Hu, Y. Chen, G. Qi, Can chatgpt replace traditional kbqa models? an in-depth analysis of the question answering performance of the gpt llm family, in: The Semantic Web - ISWC 2023, 2023, pp. 348–367

  54. [57]

    M. Adil, G. Lee, V. A. Gonzalez, Q. Mei, Using vision language models for safety hazard identification in construction, arXiv (2025). doi:10.48550/arXiv.2504.09083

  55. [58]

    A. Sabir, R. Hussain, A. Pedro, C. Park, Personalized construction safety training system using conversational ai in virtual reality, Automation in Construction 175 (2025) 106207

  56. [59]

    Y. Lairgi, L. Monela, R. Cazabet, K. Benabdeslem, P. Cléau, Text2kg: Incremental knowledge graphs construction using large language models, in: Web Information Systems Engineering – WISE 2024, 2025, pp. 214–229

  57. [60]

    X. Xue, J. Zhang, Y. Chen, Question-answering framework for building codes using fine-tuned and distilled pre-trained transformer models, Automation in Construction 168 (2024) 105730. doi:10.1016/j.autcon.2024.105730

  58. [61]

    K. Jeon, G. Lee, Hybrid large language model approach for prompt and sensitive defect management: A comparative analysis of hybrid, non-hybrid, and graphrag approaches, Advanced Engineering Informatics 64 (2025) 103076. doi:10.1016/j.aei.2024.103076

  59. [62]

    C. Zheng, S. Wong, X. Su, Y. Tang, A. Nawaz, M. Kassem, Automating construction contract review using knowledge graph-enhanced large language models, Automation in Construction 175 (2025) 106179

  60. [63]

    J. Zhang, N. M. El-Gohary, Semantic nlp-based information extraction from construction regulatory documents for automated compliance checking, Journal of Computing in Civil Engineering 30 (2) (2016) 04015014. doi:10.1061/(ASCE)CP.1943-5487.0000346

  61. [64]

    R. Zhang, N. El-Gohary, A deep neural network-based method for deep information extraction using transfer learning strategies to support automated compliance checking, Automation in Construction 132 (2021) 103834. doi:10.1016/j.autcon.2021.103834