Automated Construction of a Knowledge Graph of Nuclear Fusion Energy for Effective Elicitation and Retrieval of Information

Adriano Agnello; Andrea Loreti; Kesi Chen; Robert Firth; Ruby George; Shinnosuke Tanaka

arxiv: 2504.07738 · v3 · pith:CDJXIJKFnew · submitted 2025-04-10 · 💻 cs.CL

Pith reviewed 2026-05-22 20:45 UTC · model grok-4.3

classification 💻 cs.CL

keywords: knowledge · graph · language · large · automated · construction · document · energy

The pith

Pre-trained large language models build the first knowledge graph for nuclear fusion energy and enable retrieval-augmented answers to multi-hop queries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper outlines a multi-step automated pipeline that extracts structured information from large document collections in specialized domains. Applied to nuclear fusion energy, the method uses large language models to identify entities and resolve them into a connected graph, with extraction quality checked against Zipf's law. The resulting graph then supports a retrieval-augmented generation system that answers natural-language questions by traversing entity links, including those that require chaining multiple facts. This matters for a field whose documents are numerous, varied, and difficult to search manually for related concepts.

Core claim

We apply our method to build the first knowledge graph of nuclear fusion energy. We develop a knowledge-graph retrieval-augmented generation system that uses multiple prompts with large language models to provide contextually relevant answers to natural-language queries, including complex multi-hop questions requiring reasoning across interconnected entities.
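To make the multi-hop idea concrete, here is a minimal sketch of answering a question by chaining edges in a small graph. It is not the paper's system, which according to the abstract prompts large language models against the graph. The triples, relation names, and the helpers `build_adjacency` and `find_path` are hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical toy triples (head, relation, tail); not taken from the paper's graph.
TRIPLES = [
    ("tokamak", "HAS_COMPONENT", "divertor"),
    ("divertor", "MADE_OF", "tungsten"),
    ("tungsten", "SOURCE_OF", "impurity"),
    ("tokamak", "CONFINES", "plasma"),
]


def build_adjacency(triples):
    """Map each entity to its outgoing (relation, neighbour) pairs."""
    adjacency = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))
        adjacency.setdefault(tail, [])
    return adjacency


def find_path(adjacency, source, target, max_hops=3):
    """Breadth-first search; returns the shortest chain of triples or None."""
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) == max_hops:
            continue  # do not expand beyond the hop budget
        for relation, neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [(node, relation, neighbour)]))
    return None


adjacency = build_adjacency(TRIPLES)
three_hop = find_path(adjacency, "tokamak", "impurity")
```

A single-hop question touches one triple; linking "tokamak" to "impurity" in this toy graph needs three, which is the kind of chaining the core claim refers to. Lowering `max_hops` to 2 makes the same query return `None`.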

What carries the argument

The multi-step automated pipeline that applies pre-trained large language models to named entity recognition and entity resolution, then feeds the resulting graph into a retrieval-augmented generation system.

If this is right

The first structured knowledge representation specific to nuclear fusion energy is produced from documents.
Complex queries that link multiple fusion concepts become answerable by traversing graph connections.
Large language models are shown capable of handling entity tasks in a high-heterogeneity scientific domain.
The pipeline offers a repeatable way to turn document corpora into queryable graphs for elicitation and retrieval.
Evaluation against Zipf's law supplies a quantitative check on extraction quality.
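One way to picture the Zipf check is a straight-line fit in log-log space, since Zipf's law states that frequency falls off roughly as an inverse power of rank. This page does not say how the paper quantifies the agreement, so the least-squares fit below, the function name `zipf_slope`, and the toy entity list are assumptions made for illustration.

```python
import math
from collections import Counter


def zipf_slope(entities):
    """Fit log(frequency) = intercept + slope * log(rank) by least squares.

    Returns (slope, r_squared). A slope near -1 with high r_squared is
    what an ideal Zipfian rank-frequency curve would give.
    """
    counts = sorted(Counter(entities).values(), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two distinct entities")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(freq) for freq in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination of the fitted log-log line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return slope, r_squared


# Hypothetical toy data: the entity at rank r occurs 120 // r times.
names = ["plasma", "tokamak", "divertor", "tritium", "stellarator"]
toy_entities = [
    name for rank, name in enumerate(names, start=1) for _ in range(120 // rank)
]
slope, r_squared = zipf_slope(toy_entities)
```

The toy frequencies are exactly proportional to 1/rank, so the fitted slope comes out at -1 up to floating-point error. A check of this kind says something about the shape of the distribution only; it does not say whether any individual entity was extracted correctly.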

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same pipeline could be applied to other document-heavy scientific domains that lack existing structured databases.
The graph might reveal previously unnoticed clusters of related fusion technologies or research gaps by inspecting entity connectivity.
Adding numerical data from fusion simulations or experiments could turn the graph into a hybrid text-plus-data resource.
Users outside the field could use the multi-hop capability to trace how one fusion component affects another without reading the source papers.

Load-bearing premise

Pre-trained large language models can perform named entity recognition and entity resolution accurately enough in the heterogeneous nuclear fusion domain without substantial domain adaptation.

What would settle it

A test set of fusion documents where the models produce entity extractions whose frequency distribution deviates sharply from Zipf's law or where the graph-augmented answers show no gain over plain language-model answers on multi-hop questions.

Figures

Figures reproduced from arXiv: 2504.07738 by Adriano Agnello, Andrea Loreti, Kesi Chen, Robert Firth, Ruby George, Shinnosuke Tanaka.

**Figure 1.** Zipf’s law applied to the top-ranked 500 single word entities in a case study of 349 abstracts: (a) before entity resolution, (b) after entity resolution.

**Figure 2.** Workflow for the automated creation of a KG. The first layer ...

**Figure 3.** Example of KG according to the graph architecture used in this ...

**Figure 4.** Workflow of the KG-RAG. The user input is a question processed ...

**Figure 5.** Two examples of LLM-generated Cypher queries, single-hop (left) ...

Original abstract

In this document, we discuss a multi-step approach to automated construction of a knowledge graph, for structuring and representing domain-specific knowledge from large document corpora. We apply our method to build the first knowledge graph of nuclear fusion energy, a highly specialized field characterized by vast scope and heterogeneity. This is an ideal benchmark to test the key features of our pipeline, including automatic named entity recognition and entity resolution. We show how pre-trained large language models can be used to address these challenges and we evaluate their performance against Zipf's law, which characterizes human natural language. Additionally, we develop a knowledge-graph retrieval-augmented generation system that uses multiple prompts with large language models to provide contextually relevant answers to natural-language queries, including complex multi-hop questions requiring reasoning across interconnected entities.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper applies standard LLM pipelines to create a first knowledge graph for nuclear fusion and a multi-hop RAG system, but skips direct accuracy checks on entity extraction.

The main point is that this work takes existing LLM techniques for named entity recognition and entity resolution and applies them to build what the authors call the first knowledge graph in nuclear fusion energy. They then use that graph in a retrieval-augmented generation setup with multiple prompts to handle natural-language and multi-hop queries. The domain choice makes sense as a test case because fusion literature mixes plasma physics, engineering components, and reaction pathways in one heterogeneous corpus. The pipeline description is clear and the RAG layer shows how the graph could support reasoning across entities. That part is straightforward and practical.

The evaluation step is the soft spot. They compare the resulting entity distribution to Zipf's law, which checks whether frequencies follow a power-law pattern similar to ordinary text. This does not measure precision, recall, or linking errors on the actual technical terms. In a specialized field, pre-trained models can easily misidentify or fail to resolve rare but critical items like specific tokamak diagnostics or isotope pathways. Without numbers on those errors or a small manual audit, it is hard to know whether the downstream multi-hop answers rest on a reliable graph. The abstract gives no such figures.

This paper is mainly for people already working on domain-specific knowledge graphs or for fusion researchers who need better structured search. A reader looking for new methods in LLM-based KG construction will not find them here. The thinking is coherent and the authors engage honestly with the domain challenges. It deserves peer review so the authors can add concrete validation of the extraction quality before the claims about effective elicitation and retrieval can be taken at face value.

Referee Report

1 major / 2 minor

Summary. The manuscript presents a multi-step automated pipeline for constructing a knowledge graph from large document corpora, using pre-trained large language models for named entity recognition and entity resolution. The approach is applied to nuclear fusion energy to produce what is described as the first such KG in the domain; performance is evaluated by comparison to Zipf's law, and a KG-augmented RAG system is developed to answer natural-language queries, including complex multi-hop questions.

Significance. If the pipeline yields a high-quality KG with low error rates in entity linking, the work would offer a novel structured resource for the fusion community and illustrate LLM utility for KG construction in heterogeneous technical domains. The KG-RAG component could improve retrieval for multi-hop reasoning. However, the absence of precision/recall metrics or error analysis on the specialized corpus substantially weakens the ability to judge whether these benefits are realized.

major comments (1)

§4 (Evaluation against Zipf's law): The reported comparison provides only frequency-rank statistics and does not include precision, recall, or F1 scores for named entity recognition or entity resolution on the nuclear fusion corpus. This omission is load-bearing because the central claim, that the resulting KG supports reliable multi-hop reasoning, requires evidence that extraction and linking errors are low enough to avoid missing or spurious edges on critical domain terms.
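For reference, the metrics the referee asks for can be computed from two sets of entity mentions, one predicted and one hand-annotated. The sketch below assumes exact-match scoring on `(document_id, entity)` pairs; the function name and the toy annotations are hypothetical, and no gold-standard fusion corpus is implied to exist.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall and F1 between two collections of items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations: (document id, entity surface form).
predicted = {
    ("doc1", "tokamak"),
    ("doc1", "divertor"),
    ("doc1", "energy"),  # spurious extraction
    ("doc2", "tritium"),
}
gold = {
    ("doc1", "tokamak"),
    ("doc1", "divertor"),
    ("doc2", "tritium"),
    ("doc2", "deuterium"),        # missed by the extractor
    ("doc2", "lithium blanket"),  # missed by the extractor
}
precision, recall, f1 = precision_recall_f1(predicted, gold)
```

With three of four predictions correct and three of five gold mentions found, precision is 0.75 and recall is 0.6. Exact matching is the strictest choice; partial-overlap scoring would need span offsets that this sketch does not model.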

minor comments (2)

The abstract and method sections would benefit from explicit statements of the specific LLMs employed, the size and composition of the input document corpus, and any prompt templates used for NER and resolution.
Consider including a summary table of KG statistics (number of entities, relation types, triples) to allow readers to gauge scale and coverage.
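The statistics suggested in the second minor comment are cheap to derive once the graph is available as a list of triples. The sketch below assumes a `(head, relation, tail)` representation; the function name `kg_statistics` and the toy triples are hypothetical and do not reflect the paper's actual schema or counts.

```python
from collections import Counter


def kg_statistics(triples):
    """Count distinct entities, relation types and triples in a triple list."""
    unique = set(triples)  # duplicate triples are counted once
    entities = {head for head, _, _ in unique} | {tail for _, _, tail in unique}
    relation_counts = Counter(relation for _, relation, _ in unique)
    return {
        "entities": len(entities),
        "relation_types": len(relation_counts),
        "triples": len(unique),
        "triples_per_relation": dict(relation_counts),
    }


# Hypothetical toy graph with one repeated triple.
toy_triples = [
    ("tokamak", "HAS_COMPONENT", "divertor"),
    ("tokamak", "HAS_COMPONENT", "blanket"),
    ("divertor", "MADE_OF", "tungsten"),
    ("tokamak", "HAS_COMPONENT", "divertor"),
]
stats = kg_statistics(toy_triples)
```

Reporting the per-relation breakdown alongside the totals would let readers see whether a handful of relation types dominate the graph.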

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their detailed review and constructive feedback. We address the single major comment below and outline planned revisions to strengthen the manuscript.

Referee: §4 (Evaluation against Zipf's law): The reported comparison provides only frequency-rank statistics and does not include precision, recall, or F1 scores for named entity recognition or entity resolution on the nuclear fusion corpus. This omission is load-bearing because the central claim—that the resulting KG supports reliable multi-hop reasoning—requires evidence that extraction and linking errors are low enough to avoid missing or spurious edges on critical domain terms.

Authors: We agree that precision, recall, and F1 scores would constitute stronger direct evidence of extraction quality. However, no gold-standard annotated corpus exists for named entity recognition or entity resolution in the nuclear fusion domain, and creating one would require extensive manual labeling by domain experts, which was beyond the scope and resources of this work. Zipf's law was therefore employed as an indirect, unsupervised validation proxy to confirm that the extracted entity distribution aligns with expected natural-language patterns, indicating broad coverage without gross over- or under-extraction. We will revise §4 to (1) explicitly state this rationale and its limitations, (2) add a qualitative error analysis on a random sample of 200 entities (with examples of correct and incorrect extractions/links), and (3) discuss how the observed Zipf compliance, combined with the RAG results on multi-hop queries, supports the claim of usable KG quality. These changes will make the evaluation section more transparent while remaining honest about the absence of quantitative metrics.

revision: partial
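The audit proposed in item (2) of the rebuttal could be set up along the following lines. This is a sketch under stated assumptions: the sample is drawn uniformly over distinct entities with a fixed seed so that it can be reproduced, and reviewers record a simple correct/incorrect judgement. The function names, the seed, and the placeholder entity names are hypothetical; only the sample size of 200 comes from the rebuttal.

```python
import random


def draw_audit_sample(entities, sample_size=200, seed=0):
    """Draw a reproducible uniform sample of distinct entities for manual review."""
    unique = sorted(set(entities))  # sort first so the draw is deterministic
    if len(unique) <= sample_size:
        return unique
    rng = random.Random(seed)
    return sorted(rng.sample(unique, sample_size))


def audit_accuracy(judgements):
    """Fraction of audited entities that reviewers marked as correct.

    `judgements` maps entity name -> True (correct) or False (incorrect).
    """
    if not judgements:
        raise ValueError("no judgements recorded")
    return sum(judgements.values()) / len(judgements)


# Hypothetical placeholder entities standing in for the extracted vocabulary.
extracted = [f"entity_{i}" for i in range(1000)]
sample = draw_audit_sample(extracted)
example_accuracy = audit_accuracy(
    {"tokamak": True, "divertor": True, "energy": False, "tritium": True}
)
```

Sampling uniformly over distinct entities gives rare terms the same chance of review as frequent ones, which matters if the concern is errors on rare but critical items. Sampling by mention frequency would instead estimate the error rate a typical query is likely to encounter.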

Circularity Check

0 steps flagged

No circularity: standard LLM pipeline with external Zipf comparison

The paper describes a multi-step pipeline that applies pre-trained LLMs to named-entity recognition and entity resolution on a fusion-energy corpus, constructs a knowledge graph, and then uses the graph for RAG. No equations, fitted parameters, or self-citations are presented as load-bearing derivations. The only quantitative reference is an external comparison to Zipf’s law, which is an independent statistical benchmark and does not reduce any claimed output to the method’s own inputs by construction. The central claims therefore remain methodologically self-contained rather than tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Minimal additional assumptions beyond standard LLM capabilities and the applicability of Zipf's law to entity distributions.

axioms (1)

Domain assumption: Pre-trained LLMs are capable of accurate named entity recognition and entity resolution in specialized scientific domains.
Central to the automated construction pipeline described.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

The workflow consists in the following steps: Data Acquisition (DAQ), NER, entity resolution, KG construction and RE... Neo4j

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Challenges and opportunities for AI to help deliver fusion energy
physics.plasm-ph · 2026-03 · unverdicted · novelty 2.0

AI offers opportunities to advance fusion energy R&D but requires responsible practices and expert collaborations to overcome its inherent challenges.

[1] [1]

https://www.iter.org/scientists/iter-technical-reports

work page

[2] [2]

European Research Roadmap to the Realisation of Fusion Energy

EUROfusion, “European Research Roadmap to the Realisation of Fusion Energy.” https://euro-fusion.org/eurofusion/roadmap/

work page

[3] [3]

A FAIR based approach to data sharing in Europe,

P. Strand, D. P. Coster, M. Plociennik, S. de Witt, I. A. Klampanos, J. Decker, F. Imbeaux, J. F. Artaud, B. Bosak, N. Cummings, L. Fleury, A. Ikonomopoulos, S. Konstantopoulos, A. Ludvig-Osipov, P. Maini, J. Morales, and M. Owsiak, “A FAIR based approach to data sharing in Europe,”Plasma Physics and Controlled Fusion, vol. 64, p. 104001, aug 2022. https:...

work page doi:10.1088/1361-6587/ac8618 2022

[4] [4]

The FAIR Guiding Principles for scientific data management and steward- ship

M. Wilkinson, M. Dumontier, I. J. Aalbersberg,et al., “The FAIR guiding principles for scientific data management and stewardship,”Sci. Data, vol. 3, p. 160018, 2016. https://doi.org/10.1038/sdata.2016.18

work page doi:10.1038/sdata.2016.18 2016

[5] [5]

Digital signal processing & data science challenge,

S. McIntosh, “Digital signal processing & data science challenge,” Magnetic and Fusion Diagnostic Data Science, ITER International School Nagoya, Japan, 2024. https://www.iter.org/public/education/ iter-international-school

work page 2024

[6] [6]

The Semantic Web,

T. Berners-Lee, J. Hendler, and O. Lassila, “The Semantic Web,” Scientific American, vol. 284, p. 28, 2001. https://doi.org/10.1038/ scientificamerican0501-34

work page 2001

[7] [7]

Linked data - the story so far,

C. Bizer, T. Heath, and T. Berners-Lee, “Linked data - the story so far,”International Journal on Semantic Web and Information Systems, vol. 5, no. 3, pp. 1–22, 2009. https://www.bibsonomy.org/bibtex/ 25e13b99f0fe4d28c1261158410041c70/mgraube

work page 2009

[8] [8]

RDF V ocabulary Description Language 1.0: RDF Schema W3C Recommendation,

G. R. Brickley D., “RDF V ocabulary Description Language 1.0: RDF Schema W3C Recommendation,”Retrieved June 14 2009, 2004. http: //www.w3.org/TR/rdf-schema/

work page 2009

[9] [9]

OWL Web Ontology Language - W3C Recommendation,

V . H. F. McGuinness D., “OWL Web Ontology Language - W3C Recommendation,”Retrieved June 14 2009, 2004. http://www.w3.org/ TR/owl-features/

work page 2009

[10] [10]

An automatic on- tology generation framework with an organizational perspective,

S. Elnagar, V . Y . Yoon, and M. A. Thomas, “An automatic on- tology generation framework with an organizational perspective,” in Hawaii International Conference on System Sciences, 2020. https: //api.semanticscholar.org/CorpusID:213718548

work page 2020

[11] [11]

Automatically generating extraction patterns from untagged text,

E. Riloff, “Automatically generating extraction patterns from untagged text,” American Association for Artificial Intelligence, Menlo Park, CA (United States), 12 1996. https://www.osti.gov/biblio/430781

work page 1996

[12] [12]

A statistical model for multilingual entity detection and tracking,

R. Florian, H. Hassan, A. Ittycheriah, H. Jing, N. Kambhatla, X. Luo, N. Nicolov, and S. Roukos, “A statistical model for multilingual entity detection and tracking,” inNorth American Chapter of the Association for Computational Linguistics, 2004. https://api.semanticscholar.org/ CorpusID:14831480

work page 2004

[13] [13]

A fully Bayesian approach to unsuper- vised part-of-speech tagging,

S. Goldwater and T. Griffiths, “A fully Bayesian approach to unsuper- vised part-of-speech tagging,” inACL 2007 - Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, ACL 2007 - Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, pp. 744–751, 2007. 45th Annual Meeting of the Associat...

work page 2007

[14] [14]

Open information extraction from the web,

O. Etzioni, M. Banko, S. Soderland, and D. S. Weld, “Open information extraction from the web,”Commun. ACM, vol. 51, p. 68–74, Dec. 2008. https://doi.org/10.1145/1409360.1409378

work page doi:10.1145/1409360.1409378 2008

[15] [15]

Learning to construct knowledge bases from the world wide web,

M. Craven, D. DiPasquo, D. Freitag, A. McCallum, T. Mitchell, K. Nigam, and S. Slattery, “Learning to construct knowledge bases from the world wide web,”Artificial Intelligence, vol. 118, no. 1, pp. 69–113, 2000. https://www.sciencedirect.com/science/article/pii/ S0004370200000047

work page 2000

[16] [16]

Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling,

X. Carreras and L. M `arquez, “Introduction to the CoNLL-2005 Shared Task: Semantic Role Labeling,” inProceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)(I. Dagan and D. Gildea, eds.), (Ann Arbor, Michigan), pp. 152–164, Association for Computational Linguistics, June 2005. ”https://aclanthology.org/ W05-0620/

work page 2005

[17] D. Gildea and D. Jurafsky, “Automatic labeling of semantic roles,” Computational Linguistics, vol. 28, no. 3, pp. 245–288, 2002. https://aclanthology.org/J02-3001/

[18] S. Tanaka, J. Barry, V. Kuruvanthodi, M. Moses, M. J. Giammona, N. Herr, M. Elkaref, and G. de Mel, “Knowledgehub: An end-to-end tool for assisted scientific discovery,” in Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, IJCAI-24 (K. Larson, ed.), pp. 8815–8819, International Joint Conferences on Artificial Inte...

[19] W. Y, J. M, X. J, Z. D, and X. H, “Clinical named entity recognition using deep learning models,” AMIA Annu Symp Proc., pp. 1812–1819, Apr. 2018. https://pmc.ncbi.nlm.nih.gov/articles/PMC5977567/

[20] J. Li, A. Sun, J. Han, and C. Li, “A survey on deep learning for named entity recognition,” IEEE Transactions on Knowledge and Data Engineering, vol. 34, no. 1, pp. 50–70, 2022. https://ieeexplore.ieee.org/document/10184827

[21] M. Agrawal, S. Hegselmann, H. Lang, Y. Kim, and D. Sontag, “Large language models are few-shot clinical information extractors,” in Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022. https://arxiv.org/pdf/2205.12689.pdf

[22] X. Wei, X. Cui, N. Cheng, X. Wang, X. Zhang, S. Huang, P. Xie, J. Xu, Y. Chen, M. Zhang, Y. Jiang, and W. Han, “Zero-shot information extraction via chatting with chatgpt,” ArXiv, vol. abs/2302.10205, 2023. https://api.semanticscholar.org/CorpusID:257050669

[23] S. Hao, B. Tan, K. Tang, B. Ni, X. Shao, H. Zhang, E. P. Xing, and Z. Hu, “Bertnet: Harvesting knowledge graphs with arbitrary relations from pretrained language models,” 2023. https://arxiv.org/abs/2206.14268

[24] B. Li, G. Fang, Y. Yang, Q. Wang, W. Ye, W. Zhao, and S. Zhang, “Evaluating ChatGPT’s information extraction capabilities: An assessment of performance, explainability, calibration, and faithfulness,” ArXiv, vol. abs/2304.11633, 2023. https://api.semanticscholar.org/CorpusID:258297899

[25] D. Ashok and Z. C. Lipton, “PromptNER: Prompting for named entity recognition,” ArXiv, 2023. https://doi.org/10.48550/arXiv.2305.15444

[26] S. Carta, A. Giuliani, L. Piano, A. S. Podda, L. Pompianu, and S. G. Tiddia, “Iterative Zero-Shot LLM Prompting for Knowledge Graph Construction,” 2023. https://arxiv.org/abs/2307.01128

[27] J. Lehmann, R. Isele, M. Jakob, A. Jentzsch, D. Kontokostas, P. N. Mendes, S. Hellmann, M. Morsey, P. van Kleef, S. Auer, and C. Bizer, “DBpedia - A large-scale, multilingual knowledge base extracted from Wikipedia,” Semantic Web, vol. 6, pp. 167–195, 2015. https://api.semanticscholar.org/CorpusID:1181640

[28] J. Saad-Falcon, O. Khattab, C. Potts, and M. Zaharia, “ARES: An automated evaluation framework for retrieval-augmented generation systems,” 2024. https://arxiv.org/abs/2311.09476

[29] H. Yu, A. Gan, K. Zhang, S. Tong, Q. Liu, and Z. Liu, Evaluation of Retrieval-Augmented Generation: A Survey, pp. 102–120. Springer Nature Singapore, 2025. http://dx.doi.org/10.1007/978-981-96-1024-2_8

[30] K. Wu, E. Wu, A. Cassasola, A. Zhang, K. Wei, T. Nguyen, S. Riantawan, P. S. Riantawan, D. E. Ho, and J. Zou, “How well do llms cite relevant medical references? an evaluation framework and analyses,” https://arxiv.org/abs/2402.02008

[32] Z. Ji, N. Lee, R. Frieske, T. Yu, D. Su, Y. Xu, E. Ishii, Y. J. Bang, A. Madotto, and P. Fung, “Survey of hallucination in natural language generation,” ACM Comput. Surv., vol. 55, Mar. 2023. https://doi.org/10.1145/3571730

[33] M. Najjar and B. Khanbabaei, “Effects of carbon impurity on the ignition of deuterium-tritium targets under the relativistic shock waves,” Physics of Plasmas, vol. 26, p. 032709, 03 2019. https://doi.org/10.1063/1.5087298

[34] D. Kang, Y. Hou, Q. Zeng, and J. Dai, “Unified first-principles equations of state of deuterium-tritium mixtures in the global inertial confinement fusion region,” Matter and Radiation at Extremes, vol. 5, p. 055401, 09. https://doi.org/10.1063/5.0008231

[36] S. Lee and T. Kondoh, “Advanced impurity measurement for deuterium–tritium-burning plasmas using pulsed CO2 laser collective Thomson scattering,” Review of Scientific Instruments, vol. 71, pp. 3718–3722, 10 2000. https://doi.org/10.1063/1.1311940

[37] J. Lindl, “Development of the indirect-drive approach to inertial confinement fusion and the target physics basis for ignition and gain,” Phys. Plasmas, vol. 2, 1995. https://doi.org/10.1063/1.871025

[38] S. Hirshman and D. Sigmar, “Neoclassical transport of impurities in tokamak plasmas,” Nuclear Fusion, vol. 21, p. 1079, Sep. 1981. https://dx.doi.org/10.1088/0029-5515/21/9/003

[39] P. C. Liewer, “Measurements of microturbulence in tokamaks and comparisons with theories of turbulence and anomalous transport,” Nuclear Fusion, vol. 25, p. 543, May 1985. https://dx.doi.org/10.1088/0029-5515/25/5/004