Who's Who? LLM-assisted Software Traceability with Architecture Entity Recognition

Anne Koziolek; Dominik Fuchß; Haoyu Liu; Jan Keim; Johannes von Geisau; Sophie Corallo; Tobias Hey

arxiv: 2511.02434 · v2 · submitted 2025-11-04 · 💻 cs.SE

Dominik Fuchß, Haoyu Liu, Sophie Corallo, Tobias Hey, Jan Keim, Johannes von Geisau, Anne Koziolek

Pith reviewed 2026-05-18 01:32 UTC · model grok-4.3

classification 💻 cs.SE

keywords software traceability · LLM entity recognition · architecture documentation · traceability link recovery · software architecture models · source code analysis · entity matching · automated SAM generation

The pith

Large language models can identify architectural entities in documentation and code to automate traceability without manual models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines how large language models can locate architecturally relevant entities inside software architecture documentation and source code. The aim is to remove the bottleneck of manually building software architecture models so that traceability links between high-level descriptions and implementations become easier to establish and maintain. Two methods are developed: ExArch pulls component names directly from the artifacts to form simple models, while ArTEMiS locates entities and matches them semantically to models that may themselves be generated automatically. Experiments show ExArch reaching an F1 score of 0.86 using only documentation and code, close to a strong baseline that still requires hand-crafted models. The results indicate that LLM assistance can make architecture-to-code traceability practical for more teams and projects.

Core claim

The central claim is that LLMs can effectively identify architectural entities in textual artifacts, enabling automated generation of software architecture models (SAMs) and traceability link recovery (TLR). ExArch extracts component names as simple SAMs from software architecture documentation (SAD) and source code and achieves an F1 of 0.86, comparable to TransArC at 0.87, which needs manually created SAMs. ArTEMiS matches entities and performs on par with the heuristic SWATTR at F1 0.81, while the combination of ArTEMiS and ExArch outperforms ArDoCode, the best baseline without manual SAMs.
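The F1 values quoted above combine precision and recall over recovered trace links. As a minimal sketch of that metric, the snippet below scores a set of predicted links against a gold standard. The link representation (a sentence number paired with a code path) and the sample data are assumptions made for illustration, and this summary does not state how the paper aggregates scores across projects.

```python
from typing import Set, Tuple

# Hypothetical representation: (documentation sentence number, code element path).
TraceLink = Tuple[int, str]


def precision_recall_f1(
    predicted: Set[TraceLink], gold: Set[TraceLink]
) -> Tuple[float, float, float]:
    """Score predicted trace links against a gold standard."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example data: three of four predictions are correct,
# and three of four gold links are found.
gold = {(1, "a/Cache.java"), (2, "a/Db.java"), (3, "a/Ui.java"), (4, "a/Ui.java")}
predicted = {(1, "a/Cache.java"), (2, "a/Db.java"), (3, "a/Ui.java"), (5, "a/Log.java")}

p, r, f1 = precision_recall_f1(predicted, gold)
print(p, r, f1)  # 0.75 0.75 0.75
```

Because F1 is a harmonic mean, a method cannot reach a high score by favoring only precision or only recall.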

What carries the argument

ExArch and ArTEMiS, LLM-driven methods that extract component names from SAD and code or perform semantic entity matching to produce or link against SAMs for traceability.
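To make the matching step concrete, here is a rough sketch of linking entity mentions found in documentation to component names in a SAM. It is not the authors' implementation: ArTEMiS relies on LLM-based semantic inference, whereas this sketch plugs in a simple lexical similarity from Python's standard `difflib` as a stand-in. The function names, the threshold of 0.8, and the sample names are all hypothetical.

```python
import re
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional


def normalize(name: str) -> str:
    """Split camelCase, lowercase, and collapse non-alphanumeric characters."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return re.sub(r"[^a-z0-9]+", " ", spaced.lower()).strip()


def lexical_similarity(a: str, b: str) -> float:
    """Stand-in for a semantic matcher: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_entities(
    mentions: List[str],
    components: List[str],
    similarity: Callable[[str, str], float] = lexical_similarity,
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
) -> Dict[str, Optional[str]]:
    """Map each mention to its best-scoring component, or None if below threshold."""
    matches: Dict[str, Optional[str]] = {}
    for mention in mentions:
        best = max(components, key=lambda c: similarity(mention, c), default=None)
        if best is not None and similarity(mention, best) >= threshold:
            matches[mention] = best
        else:
            matches[mention] = None
    return matches


# Made-up mentions and component names for illustration.
result = match_entities(["media access", "the weather"], ["MediaAccess", "UserManagement"])
print(result)
```

Passing the similarity function as a parameter keeps the sketch open to swapping the lexical stand-in for an LLM-backed matcher without changing the surrounding logic.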

Load-bearing premise

The revised benchmark and the manually created SAMs used for evaluation accurately represent real architectural entities across projects and documentation styles.

What would settle it

Applying the same LLM prompts and matching steps to a new collection of projects with independently produced ground-truth SAMs and checking whether F1 scores stay near 0.86 or fall sharply.

Figures

Figures reproduced from arXiv: 2511.02434 by Anne Koziolek, Dominik Fuchß, Haoyu Liu, Jan Keim, Johannes von Geisau, Sophie Corallo, Tobias Hey.

**Figure 2.** Comparison of extracted SAMs for MediaStore using SADs

**Figure 3.** Comparison of extracted SAMs. For JabRef the Code-extracted components in the picture cover only the

Original abstract

Identifying architecturally relevant entities in textual artifacts is crucial for Traceability Link Recovery (TLR) between Software Architecture Documentation (SAD) and source code. While Software Architecture Models (SAMs) can bridge the semantic gap between these artifacts, their manual creation is time-consuming. LLMs offer new capabilities for extracting architectural entities from SAD and source code to construct SAMs automatically or establish direct trace links. This paper extends our ICSA 2025 paper [19], which introduced Extracting Architecture (ExArch) for LLM-based architecture component name extraction. The extension contributes the novel Architecture Traceability with Entity Matching via Semantic inference (ArTEMiS) approach, an extended evaluation with additional LLMs, configurations, a revised benchmark, and a combined evaluation of both approaches. Specifically, this paper presents the following approaches: ExArch extracts component names as simple SAMs from SAD and source code to eliminate the need for manual SAM creation, while ArTEMiS identifies architectural entities in documentation and matches them with (manually or automatically generated) SAM entities. Our evaluation compares against state-of-the-art approaches SWATTR, TransArC and ArDoCode. TransArC achieves strong performance (F1: 0.87) but requires manually created SAMs; ExArch achieves comparable results (F1: 0.86) using only SAD and code. ArTEMiS is on par with the traditional heuristic-based SWATTR (F1: 0.81) and can successfully replace it when integrated with TransArC. The combination of ArTEMiS and ExArch outperforms ArDoCode, the best baseline without manual SAMs. Our results demonstrate that LLMs can effectively identify architectural entities in textual artifacts, enabling automated SAM generation and TLR, making architecture-code traceability more practical and accessible.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The extension adds ArTEMiS for semantic matching and shows LLM F1 scores close to baselines, but the revised benchmark's reliability is the main open question.

The main point is that this work extends their prior ExArch paper with ArTEMiS, which uses LLMs for semantic inference to match architectural entities, and reports F1 scores that put the new methods on par with or near the baselines while cutting down on manual SAM creation in some setups. ExArch reaches 0.86 without manual models, nearly matching TransArC at 0.87, and the combination beats ArDoCode. ArTEMiS hits 0.81 like the heuristic SWATTR. They added more LLMs, configurations, a revised benchmark, and a joint evaluation of both approaches. This gives concrete numbers on automating parts of architecture traceability from docs and code. The comparisons are direct and the scores are stated plainly against named baselines, which makes the practical angle easy to see. It stays tied to the earlier ICSA result without inflating what the new pieces add.

The soft spot is the benchmark and the manual SAMs used for evaluation. The abstract reports the F1 numbers but gives no details on how the benchmark was revised, how entities were defined, who annotated them, or any consistency checks across projects. If those definitions carry project-specific bias or subjectivity, the scores could overstate how well the LLMs would work on other documentation styles. The stress-test concern about ground-truth validity holds up from what's shown, since there's no external validation mentioned. Prompt engineering, data splits, statistical significance, and error analysis are also absent, which limits how far the results can be checked.

This is for researchers in software architecture traceability or LLM use in SE tasks. A reader who wants empirical comparisons with specific scores in this niche would find it useful. It has enough new empirical content and a clear extension of prior work to deserve a serious referee, even with the gaps. Send it to peer review and ask for more on benchmark construction and annotation reliability.

Referee Report

3 major / 2 minor

Summary. The paper extends prior ICSA 2025 work on ExArch (LLM-based extraction of architecture component names from SAD and source code to produce simple SAMs) by introducing ArTEMiS (LLM-based identification of architectural entities in documentation followed by semantic matching to SAM entities for TLR). It reports an extended evaluation using additional LLMs and configurations on a revised benchmark, with ExArch reaching F1 0.86 (comparable to TransArC at 0.87, which requires manual SAMs), ArTEMiS matching SWATTR at F1 0.81, and the ExArch+ArTEMiS combination outperforming ArDoCode; the central claim is that these LLM approaches make automated SAM generation and architecture-code traceability practical.

Significance. If the ground-truth annotations prove reliable, the results indicate that LLMs can deliver performance on par with or better than prior methods while reducing or eliminating manual SAM construction, which would lower barriers to traceability link recovery in software architecture practice.

major comments (3)

[Evaluation / Benchmark description (around §4)] The headline performance claims (ExArch F1 0.86 vs. TransArC 0.87; ArTEMiS F1 0.81 matching SWATTR) rest on the revised benchmark and manually created SAMs accurately representing real-world architectural entities. The manuscript must detail the construction process for these artifacts, including inter-annotator agreement, project selection criteria, and any validation steps, because subjectivity or project-specific bias in entity definitions would directly affect the reported F1 scores and the claim of practical automated TLR.
[Experimental setup / Evaluation methodology] The experimental setup lacks sufficient detail on prompt engineering choices, train/test splits, number of runs, and statistical significance testing for the F1 comparisons against SWATTR, TransArC, and ArDoCode. Without these, it is difficult to assess whether the observed differences (e.g., ExArch nearly matching TransArC) are robust or sensitive to implementation decisions.
[Results and discussion] No error analysis or qualitative breakdown of false positives/negatives is provided for either ExArch or ArTEMiS. Such analysis is needed to substantiate the claim that LLMs can reliably identify architectural entities across documentation styles and to identify remaining limitations before asserting that the approaches make TLR “more practical and accessible.”
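Inter-annotator agreement, which the first major comment asks for, is commonly reported as Cohen's kappa when two annotators label the same items. The sketch below computes it from scratch. The label values and sample annotations are invented for illustration and say nothing about how the paper's benchmark was actually annotated.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item list.")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up annotations: does a phrase name an architectural entity or not?
annotator_1 = ["entity", "entity", "other", "other", "entity", "other", "entity", "other"]
annotator_2 = ["entity", "entity", "other", "entity", "entity", "other", "other", "other"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(kappa)  # 0.5
```

Here the annotators agree on 6 of 8 items (0.75) while chance agreement is 0.5, so kappa is 0.5. Raw agreement alone would overstate reliability.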

minor comments (2)

[Abstract] The abstract states that the work uses “a revised benchmark” but does not summarize what was changed from the prior ICSA version or why the revision was necessary.
[Introduction] Notation for the two approaches (ExArch vs. ArTEMiS) and their outputs (simple SAMs vs. entity matching) should be introduced more explicitly in the contributions paragraph to avoid reader confusion.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating the revisions we will make to improve the manuscript's transparency and rigor.

Referee: The headline performance claims (ExArch F1 0.86 vs. TransArC 0.87; ArTEMiS F1 0.81 matching SWATTR) rest on the revised benchmark and manually created SAMs accurately representing real-world architectural entities. The manuscript must detail the construction process for these artifacts, including inter-annotator agreement, project selection criteria, and any validation steps, because subjectivity or project-specific bias in entity definitions would directly affect the reported F1 scores and the claim of practical automated TLR.

Authors: We agree that a detailed account of benchmark construction is essential to establish the reliability of the ground-truth annotations. In the revised manuscript, we will expand Section 4 to describe the project selection criteria, the annotation guidelines and process for creating SAMs and entity labels, inter-annotator agreement metrics (including Cohen's kappa), and validation procedures. This addition will directly support the validity of the reported F1 scores. revision: yes
Referee: The experimental setup lacks sufficient detail on prompt engineering choices, train/test splits, number of runs, and statistical significance testing for the F1 comparisons against SWATTR, TransArC, and ArDoCode. Without these, it is difficult to assess whether the observed differences (e.g., ExArch nearly matching TransArC) are robust or sensitive to implementation decisions.

Authors: We acknowledge the importance of methodological transparency for evaluating robustness. We will revise the experimental setup section to elaborate on prompt engineering choices and templates, clarify data splits (noting the predominantly zero-shot/few-shot nature of the LLM evaluations), report the number of runs, and add statistical significance testing (e.g., McNemar's test or bootstrap methods) for the F1 comparisons. revision: yes
Referee: No error analysis or qualitative breakdown of false positives/negatives is provided for either ExArch or ArTEMiS. Such analysis is needed to substantiate the claim that LLMs can reliably identify architectural entities across documentation styles and to identify remaining limitations before asserting that the approaches make TLR “more practical and accessible.”

Authors: We agree that a qualitative error analysis would strengthen the discussion of limitations and reliability across documentation styles. In the revised manuscript, we will add a dedicated subsection in the Results and Discussion that provides examples of false positives and negatives for ExArch and ArTEMiS, along with categorization by entity type or documentation characteristics where relevant. revision: yes
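The rebuttal mentions bootstrap methods as one option for significance testing. As a hedged sketch of what that could look like, the snippet below estimates a 95% percentile interval for the F1 difference between two approaches by resampling evaluation units with replacement. The choice of the sentence as the resampling unit, the per-sentence `(tp, fp, fn)` counts, and the sample numbers are all assumptions, since the summary does not specify the actual test design.

```python
import random
from typing import List, Tuple

Counts = Tuple[int, int, int]  # hypothetical per-sentence (tp, fp, fn)


def micro_f1(counts: List[Counts]) -> float:
    """Micro-averaged F1 from summed true positives, false positives, false negatives."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def bootstrap_f1_difference(
    system_a: List[Counts], system_b: List[Counts], samples: int = 2000, seed: int = 0
) -> Tuple[float, float]:
    """95% percentile interval for F1(a) - F1(b) under paired resampling."""
    if len(system_a) != len(system_b) or not system_a:
        raise ValueError("Both systems must be scored on the same non-empty unit list.")
    rng = random.Random(seed)
    n = len(system_a)
    differences = []
    for _ in range(samples):
        # Paired resampling: the same drawn units are used for both systems.
        indices = [rng.randrange(n) for _ in range(n)]
        f1_a = micro_f1([system_a[i] for i in indices])
        f1_b = micro_f1([system_b[i] for i in indices])
        differences.append(f1_a - f1_b)
    differences.sort()
    return differences[int(0.025 * samples)], differences[int(0.975 * samples) - 1]


# Made-up counts for two approaches on the same six sentences.
approach_a = [(2, 0, 0), (1, 1, 0), (0, 0, 1), (3, 0, 1), (1, 0, 0), (2, 1, 0)]
approach_b = [(2, 0, 0), (1, 0, 1), (0, 1, 1), (2, 0, 2), (1, 0, 0), (2, 0, 1)]

lower, upper = bootstrap_f1_difference(approach_a, approach_b)
print(lower, upper)
```

If the interval excludes zero, the difference is unlikely to be a resampling artifact. With scores as close as 0.86 and 0.87, an interval that spans zero would be a plausible outcome.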

Circularity Check

0 steps flagged

Empirical evaluation against external baselines with no derivations or self-referential reductions

The paper performs an empirical comparison of LLM-based methods (ExArch for component extraction and ArTEMiS for entity matching) against independent baselines (SWATTR, TransArC, ArDoCode) using F1 scores on a revised benchmark and manually created SAMs. No equations, first-principles derivations, fitted parameters renamed as predictions, or self-definitional loops are present. The prior ICSA 2025 work [19] introduced ExArch, but the current results rely on new evaluations with additional LLMs and configurations, which are measured against external benchmarks rather than reducing to the method's own inputs by construction. Self-citation is not load-bearing for the central claims. This is a standard empirical performance study that is self-contained and measured against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on standard assumptions about LLM capabilities for named entity recognition in technical text and the existence of consistent architectural entities across documentation and code.

axioms (1)

domain assumption: Large language models can be prompted to reliably extract architecturally relevant entity names from software documentation and source code.
This underpins both ExArch and ArTEMiS extraction steps.

Reference graph

Works this paper leans on

69 extracted references · 69 canonical work pages

[1]

Aakash Ahmad, Muhammad Waseem, Peng Liang, Mahdi Fahmideh, Mst Shamima Aktar, and Tommi Mikkonen. 2023. Towards Human-Bot Collaborative Software Architecting with ChatGPT. In Proceedings of the 27th International Conference on Evaluation and Assessment in Software Engineering (Oulu, Finland) (EASE ’23). Association for Computing Machinery, New York, NY, USA,...

work page doi:10.1145/3593434.3593468 2023
[2]

Amarjeet and Jitender Kumar Chhabra. 2017. Improving modular structure of software system using structural and lexical dependency. Information and Software Technology 82 (2017), 96–120. doi:10.1016/j.infsof.2016.09.011

work page doi:10.1016/j.infsof.2016.09.011 2017
[3]

G. Antoniol, G. Canfora, G. Casazza, A. De Lucia, and E. Merlo. 2002. Recovering Traceability Links between Code and Documentation. IEEE Transactions on Software Engineering 28, 10 (Oct. 2002), 970–983. doi:10.1109/TSE.2002.1041053

work page doi:10.1109/tse.2002.1041053 2002
[4]

Thazin Win Win Aung, Huan Huo, and Yulei Sui. 2020. A Literature Review of Automatic Traceability Links Recovery for Software Change Impact Analysis. In Proceedings of the 28th International Conference on Program Comprehension (Seoul, Republic of Korea) (ICPC ’20). Association for...

work page doi:10.1145/3387904.3389251 2020
[5]

YuXuan Chen, Jianwei Ding, Dashuang Li, and Zhouguo Chen. 2021. Joint BERT Model based Cybersecurity Named Entity Recognition. In Proceedings of the 2021 4th International Conference on Software Engineering and Information Management (Yokohama, Japan) (ICSIM ’21). Association for Computing Machinery, New York, NY, USA, 236–242. doi:10.1145/3451471.3451508

work page doi:10.1145/3451471.3451508 2021
[6]

Choongki Cho, Ki-Seong Lee, Minsoo Lee, and Chan-Gun Lee. 2019. Software Architecture Module-View Recovery Using Cluster Ensembles. IEEE Access 7 (2019), 72872–72884. doi:10.1109/ACCESS.2019.2920427

work page doi:10.1109/access.2019.2920427 2019
[7]

Jane Cleland-Huang, Orlena Gotel, Andrea Zisman, et al. 2012. Software and systems traceability. Vol. 2. Springer. doi:10.1007/978-1-4471-2239-5

work page doi:10.1007/978-1-4471-2239-5 2012
[8]

Anna Corazza, Sergio Di Martino, Valerio Maggio, and Giuseppe Scanniello. 2011. Investigating the use of lexical information for software system clustering. In 2011 15th European Conference on Software Maintenance and Reengineering. 35–44. doi:10.1109/CSMR.2011.8

work page doi:10.1109/csmr.2011.8 2011
[9]

Javier Cámara, Lola Burgueño, and Javier Troya. 2024. Towards standarized benchmarks of LLMs in software modeling tasks: a conceptual framework. Software and Systems Modeling (Sept. 2024). doi:10.1007/s10270-024-01206-9

work page doi:10.1007/s10270-024-01206-9 2024
[10]

Javier Cámara, Javier Troya, Lola Burgueño, and Antonio Vallecillo. 2023. On the assessment of generative AI in modeling tasks: an experience report with ChatGPT and UML. Software and Systems Modeling 22, 3 (May 2023), 781–793. doi:10.1007/s10270-023-01105-5

work page doi:10.1007/s10270-023-01105-5 2023
[11]

Javier Cámara, Javier Troya, Julio Montes-Torres, and Francisco J. Jaime. 2024. Generative AI in the Software Modeling Classroom: An Experience Report With ChatGPT and Unified Modeling Language. IEEE Software 41, 6 (2024), 73–81. doi:10.1109/MS.2024.3385309

work page doi:10.1109/ms.2024.3385309 2024
[12]

Souvick Das, Novarun Deb, Agostino Cortesi, and Nabendu Chaki. 2023. Zero-shot Learning for Named Entity Recognition in Software Specification Documents. In 2023 IEEE 31st International Requirements Engineering Conference (RE). 100–110. doi:10.1109/RE57278.2023.00019

work page doi:10.1109/re57278.2023.00019 2023
[13]

R. Dhar, K. Vaidhyanathan, and V. Varma. 2024. Can LLMs Generate Architectural Design Decisions? - An Exploratory Empirical Study. In 2024 IEEE 21st International Conference on Software Architecture (ICSA). IEEE Computer Society, Los Alamitos, CA, USA, 79–89. doi:10.1109/ICSA59870.2024.00016

work page doi:10.1109/icsa59870.2024 2024
[14]

J. Andrés Díaz-Pace, Antonela Tommasel, and Rafael Capilla. 2024. Helping Novice Architects to Make Quality Design Decisions Using an LLM-Based Assistant. In Software Architecture, Matthias Galster, Patrizia Scandurra, Tommi Mikkonen, Pablo Oliveira Antonino, Elisa Yumi Nakagawa, and Elena Navarro (Eds.). Springer Nature Switzerland, Cham, 324–332

work page 2024
[15]

Tobias Eisenreich, Sandro Speth, and Stefan Wagner. 2024. From Requirements to Architecture: An AI-Based Journey to Semi-Automatically Generate Software Architectures. In Proceedings of the 1st International Workshop on Designing Software (Lisbon, Portugal) (Designing ’24). Association for Computing Machinery, New York, NY, USA, 52–55. doi:10.1145/3643660.3643942

work page doi:10.1145/3643660.3643942 2024
[16]

Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, and Ming Zhou. 2020. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In Findings of the Association for Computational Linguistics: EMNLP 2020. Association for Computational Linguistics, Online, 1536–1547. doi:10.18653/...

work page doi:10.18653/v1/2020.findings-emnlp.139 2020
[17]

Dominik Fuchß, Sophie Corallo, Jan Keim, Janek Speit, and Anne Koziolek. 2023. Establishing a Benchmark Dataset for Traceability Link Recovery Between Software Architecture Documentation and Models. In Software Architecture. ECSA 2022 Tracks and Workshops, Thais Batista, Tomáš Bureš, Claudia Raibulet, and Henry Muccini (Eds.). Springer International Publis...

work page doi:10.1007/978-3-031-36889-9_30 2023
[18]

Dominik Fuchß, Tobias Hey, Jan Keim, Haoyu Liu, Niklas Ewald, Tobias Thirolf, and Anne Koziolek. 2025. LiSSA: Toward Generic Traceability Link Recovery through Retrieval-Augmented Generation. In Proceedings of the IEEE/ACM 47th International Conference on Software Engineering (Ottawa, Canada) (ICSE ’25). Institute of Electrical and Electronics Engineers (IEE...

work page doi:10.1109/icse55347.2025.00186 2025
[19]

Dominik Fuchß, Haoyu Liu, Tobias Hey, Jan Keim, and Anne Koziolek. 2025. Enabling Architecture Traceability by LLM-based Architecture Component Name Extraction. In 2025 IEEE 22nd International Conference on Software Architecture (ICSA). Institute of Electrical and Electronics Engineers (IEEE). doi:10.1109/ICSA65012.2025.00011

work page doi:10.1109/icsa65012.2025.00011 2025
[20]

Dominik Fuchß, Haoyu Liu, Sophie Corallo, Tobias Hey, Jan Keim, Johannes von Geisau, and Anne Koziolek. 2025. Replication Package: Who’s Who? LLM-assisted Software Traceability with Architecture Entity Recognition. https://github.com/ardoco/Replication-Package-TAAS25_LLM-assisted-Software-Traceability-with-Architecture-Entity-Recognition Note: If accepte...

work page 2025
[21]

Dominik Fuchß, Haoyu Liu, Tobias Hey, Jan Keim, and Anne Koziolek. 2024. Replication Package: Enabling Architecture Traceability by LLM-based Architecture Component Name Extraction. doi:10.5281/ZENODO.14506935

work page doi:10.5281/zenodo.14506935 2024
[22]

Hui Gao, Hongyu Kuang, Kexin Sun, Xiaoxing Ma, Alexander Egyed, Patrick Mäder, Guoping Rong, Dong Shao, and He Zhang. 2023. Using Consensual Biterms from Text Structures of Requirements and Code to Improve IR-Based Traceability Recovery. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE ’22). Association fo...

work page doi:10.1145/3551349.3556948 2023
[23]

Joshua Garcia, Daniel Popescu, Chris Mattmann, Nenad Medvidovic, and Yuanfang Cai. 2011. Enhancing architectural recovery using concerns. In 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE 2011). 552–555. doi:10.1109/ASE.2011.6100123

work page doi:10.1109/ase.2011.6100123 2011
[24]

Jameleddine Hassine. 2024. An LLM-based Approach to Recover Traceability Links between Security Requirements and Goal Models. In Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering (Salerno, Italy) (EASE ’24). Association for Computing Machinery, New York, NY, USA, 643–651. doi:10.1145/3661167.3661261

work page doi:10.1145/3661167.3661261 2024
[25]

Jane Huffman Hayes, Alex Dekhtyar, and Senthil Karthikeyan Sundaram. 2006. Advancing Candidate Link Generation for Requirements Tracing: The Study of Methods. IEEE TSE 32, 1 (Jan. 2006), 4–19. doi:10.1109/TSE.2006.3

work page doi:10.1109/tse.2006.3 2006
[26]

Jane Huffman Hayes, Alex Dekhtyar, Senthil Karthikeyan Sundaram, E Ashlee Holbrook, Sravanthi Vadlamudi, and Alain April. 2007. REquirements TRacing On target (RETRO): improving software maintenance through traceability recovery. Innovations in Systems and Software Engineering 3 (2007), 193–202

work page 2007
[27]

Guntur Budi Herwanto, Gerald Quirchmayr, and A. Min Tjoa. 2024. Leveraging NLP Techniques for Privacy Requirements Engineering in User Stories. IEEE Access 12 (2024), 22167–22189. doi:10.1109/ACCESS.2024.3364533

work page doi:10.1109/access.2024.3364533 2024
[28]

Tobias Hey, Fei Chen, Sebastian Weigelt, and Walter F. Tichy. 2021. Improving Traceability Link Recovery Using Fine-grained Requirements-to-Code Relations. In 2021 IEEE International Conference on Software Maintenance and Evolution (ICSME) (2021-09). 12–22. doi:10.1109/ICSME52107.2021.00008

work page doi:10.1109/icsme52107.2021.00008 2021
[29]

Tobias Hey, Dominik Fuchß, Jan Keim, and Anne Koziolek. 2025. Requirements Traceability Link Recovery via Retrieval-Augmented Generation. In Requirements Engineering: Foundation for Software Quality. Springer, Cham. doi:10.1007/978-3-031-88531-0_27

work page doi:10.1007/978-3-031-88531-0_27 2025
[30]

Tobias Hey, Jan Keim, and Sophie Corallo. 2024. Requirements Classification for Traceability Link Recovery. In 2024 IEEE 32nd International Requirements Engineering Conference (RE). 155–167. doi:10.1109/RE59067.2024.00024

work page doi:10.1109/re59067.2024.00024 2024
[31]

Xinyi Hou, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, and Haoyu Wang. 2024. Large Language Models for Software Engineering: A Systematic Literature Review. ACM Trans. Softw. Eng. Methodol. (Sept. 2024). doi:10.1145/3695988 Just Accepted.

work page doi:10.1145/3695988 2024
[32]

Jan Keim, Sophie Corallo, Dominik Fuchß, Tobias Hey, Tobias Telge, and Anne Koziolek. 2024. Recovering Trace Links Between Software Documentation And Code. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (Lisbon, Portugal) (ICSE ’24). Association for Computing Machinery, New York, NY, USA, Article 215, 13 pages. doi:10.11...

work page doi:10.1145/3597503.3639130 2024
[33]

Jan Keim, Sophie Corallo, Dominik Fuchß, and Anne Koziolek. 2023. Detecting Inconsistencies in Software Architecture Documentation Using Traceability Link Recovery. In 2023 IEEE 20th International Conference on Software Architecture (ICSA). 141–152. doi:10.1109/ICSA56044.2023.00021

work page doi:10.1109/icsa56044.2023.00021 2023
[34]

Jan Keim, Sophie Schulz, Dominik Fuchß, Claudius Kocher, Janek Speit, and Anne Koziolek. 2021. Trace Link Recovery for Software Architecture Documentation. In Software Architecture, Stefan Biffl, Elena Navarro, Welf Löwe, Marjan Sirjani, Raffaela Mirandola, and Danny Weyns (Eds.). Springer International Publishing, Cham, 101–116

work page 2021
[35]

Jan Keim, Sophie Schulz, Dominik Fuchß, Claudius Kocher, Janek Speit, and Anne Koziolek. 2021. Trace Link Recovery for Software Architecture Documentation. In Software Architecture, Stefan Biffl, Elena Navarro, Welf Löwe, Marjan Sirjani, Raffaela Mirandola, and Danny Weyns (Eds.). Springer International Publishing, Cham, 101–116. doi:10.1007/978-3-030-86044-8_7

work page doi:10.1007/978-3-030-86044-8_7 2021
[36]

Hongyu Kuang, Patrick Mäder, Hao Hu, Achraf Ghabi, LiGuo Huang, Jian Lü, and Alexander Egyed. 2015. Can method data dependencies support the assessment of traceability between requirements and source code? Journal of Software: Evolution and Process 27, 11 (2015), 838–866. doi:10.1002/smr.1736

work page doi:10.1002/smr.1736 2015
[38]

Vladimir I Levenshtein et al. 1966. Binary codes capable of correcting deletions, insertions, and reversals. In Soviet physics doklady, Vol. 10. Soviet Union, 707–710

work page 1966
[39]

Jinfeng Lin, Yalin Liu, Qingkai Zeng, Meng Jiang, and Jane Cleland-Huang. 2021. Traceability Transformed: Generating more Accurate Links with Pre-Trained BERT Models. In Proceedings of the 43rd International Conference on Software Engineering (ICSE ’21). IEEE Press, Madrid, Spain, 324–335. doi:10.1109/ICSE43902.2021.00040

work page doi:10.1109/icse43902.2021.00040 2021
[40]

Thibaud Lutellier, Devin Chollak, Joshua Garcia, Lin Tan, Derek Rayside, Nenad Medvidović, and Robert Kroeger. 2018. Measuring the Impact of Code Dependencies on Software Architecture Recovery Techniques. IEEE Transactions on Software Engineering 44, 2 (2018), 159–181. doi:10.1109/TSE.2017.2671865

work page arXiv 2018
[41]

Garima Malik, Mucahit Cevik, Swayami Bera, Savas Yildirim, Devang Parikh, and Ayse Basar. [n. d.]. Software requirement specific entity extraction using transformer models

work page
[42]

Niklas Meissner, Sandro Speth, and Steffen Becker. 2024. Automated Programming Exercise Generation in the Era of Large Language Models. In 2024 36th International Conference on Software Engineering Education and Training. 1–5. doi:10.1109/CSEET62301.2024.10662984

work page doi:10.1109/cseet62301.2024.10662984 2024
[44]

Kevin Moran, David N. Palacio, Carlos Bernal-Cárdenas, Daniel McCrystal, Denys Poshyvanyk, Chris Shenefiel, and Jeff Johnson. 2020. Improving the effectiveness of traceability link recovery using hierarchical bayesian networks. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (ICSE ’20). Association for Computing Machine...

work page doi:10.1145/3377811.3380418 2020
[45]

Kazuki Nishikawa, Hironori Washizaki, Yoshiaki Fukazawa, Keishi Oshima, and Ryota Mibe. 2015. Recovering transitive traceability links among software artifacts. In 2015 IEEE International Conference on Software Maintenance and Evolution (ICSME). 576–580. doi:10.1109/ICSM.2015.7332517

work page doi:10.1109/icsm.2015.7332517 2015
[46]

Marc North, Amir Atapour-Abarghouei, and Nelly Bencomo. 2024. Code Gradients: Towards Automated Traceability of LLM-Generated Code. In 2024 IEEE 32nd International Requirements Engineering Conference (RE). 321–329. doi:10.1109/RE59067.2024.00038

work page doi:10.1109/re59067.2024.00038 2024
[47]

A. Panichella, C. McMillan, E. Moritz, D. Palmieri, R. Oliveto, D. Poshyvanyk, and A. De Lucia. 2013. When and How Using Structural Information to Improve IR-Based Traceability Recovery. In 2013 17th European Conference on Software Maintenance and Reengineering. 199–208. doi:10.1109/CSMR.2013.29

work page 2013
[48]

Patrick Rempel and Patrick Mäder. 2017. Preventing Defects: The Impact of Requirements Traceability Completeness on Software Quality. IEEE Transactions on Software Engineering 43, 8 (2017). doi:10.1109/TSE.2016.2622264

work page doi:10.1109/tse.2016.2622264 2017
[49]

Alberto D. Rodriguez, Jane Cleland-Huang, and Davide Falessi. 2021. Leveraging Intermediate Artifacts to Improve Automated Trace Link Retrieval. In 2021 IEEE International Conference on Software Maintenance and Evolution (ICSME). 81–92. doi:10.1109/ICSME52107.2021.00014

work page doi:10.1109/icsme52107.2021.00014 2021
[50]

Alberto D. Rodriguez, Katherine R. Dearstyne, and Jane Cleland-Huang. 2023. Prompts Matter: Insights and Strategies for Prompt Engineering in Automated Software Traceability. In 2023 IEEE 31st International Requirements Engineering Conference Workshops (REW). 455–464. doi:10.1109/REW57809.2023.00087

work page arXiv 2023
[51]

Satrio Adi Rukmono, Lina Ochoa, and Michel Chaudron. 2024. Deductive Software Architecture Recovery via Chain-of-thought Prompting. In Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results (Lisbon, Portugal) (ICSE-NIER). Association for Computing Machinery, New York, NY, USA, 92–96. doi:10.114...

work page doi:10.1145/3639476.3639776 2024
[52]

Per Runeson and Martin Höst. 2008. Guidelines for conducting and reporting case study research in software engineering. Empirical Software Engineering 14, 2 (2008), 131. doi:10.1007/s10664-008-9102-8

[53]

Daniel Russo, Sebastian Baltes, Niels van Berkel, Paris Avgeriou, Fabio Calefato, Beatriz Cabrero-Daniel, Gemma Catolino, Jürgen Cito, Neil Ernst, Thomas Fritz, Hideaki Hata, Reid Holmes, Maliheh Izadi, Foutse Khomh, Mikkel Baun Kjærgaard, Grischa Liebel, Alberto Lluch Lafuente, Stefano Lambiase, Walid Maalej, Gail Murphy, Nils Brede Moe, Gabrielle O’Brie...

work page doi:10.1016/j.jss.2024.112115 2024
[54]

June Sallou, Thomas Durieux, and Annibale Panichella. 2024. Breaking the Silence: the Threats of Using LLMs in Software Engineering. In Proceedings of the 2024 ACM/IEEE 44th International Conference on Software Engineering: New Ideas and Emerging Results (Lisbon, Portugal) (ICSE-NIER’24). Association for Computing Machinery, New York, NY, USA, 102–106. doi:1...

work page doi:10.1145/3639476.3639764 2024
[55]

Zipani Tom Sinkala and Sebastian Herold. 2022. Hierarchical Code-to-Architecture Mapping. In Software Architecture, Patrizia Scandurra, Matthias Galster, Raffaela Mirandola, and Danny Weyns (Eds.). Springer International Publishing, Cham, 86–104

[56]

Skander Soltani and Elias Limouni. 2025. LLM Based Data Annotation and Augmentation for NER and Relationship Extraction Models Enhancement. In Artificial Intelligence for Global Security, Dominique Verdejo and Eunika Mercier-Laurent (Eds.). Springer Nature Switzerland, Cham, 153–160

[57]

Herbert Stachowiak. 1973. Allgemeine Modelltheorie. Springer Verlag, Wien

[58]

Chao Sun, Mingjing Tang, Li Liang, and Wei Zou. 2020. Software Entity Recognition Method Based on BERT Embedding. In Machine Learning for Cyber Security, Xiaofeng Chen, Hongyang Yan, Qiben Yan, and Xiangliang Zhang (Eds.). Springer International Publishing, Cham, 33–47

[59]

Jeniya Tabassum, Mounica Maddela, Wei Xu, and Alan Ritter. 2020. Code and Named Entity Recognition in StackOverflow. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Dan Jurafsky, Joyce Chai, Natalie Schluter, and Joel Tetreault (Eds.). Association for Computational Linguistics, Online, 4913–4926. doi:10.18653/v1/...

work page doi:10.18653/v1/2020.acl-main.443 2020
[60]

Mingjing Tang, Tong Li, Wei Gao, and Yu Xia. 2022. AttenSy-SNER: software knowledge entity extraction with syntactic features and semantic augmentation information. Complex & Intelligent Systems 9, 1 (June 2022), 25–39. doi:10.1007/s40747-022-00742-5

[61]

Fangchao Tian, Tianlu Wang, Peng Liang, Chong Wang, Arif Ali Khan, and Muhammad Ali Babar. 2021. The impact of traceability on software maintenance and evolution: A mapping study. Journal of Software: Evolution and Process 33, 10 (2021), e2374. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1002/smr.2374 doi:10.1002/smr.2374

[62]

MM Tikhomirov, NV Loukachevitch, and BV Dobrov. 2020. Recognizing named entities in specific domain. Lobachevskii Journal of Mathematics 41, 8 (2020), 1591–1602

[63]

Vassilios Tzerpos and Richard C Holt. 2000. ACDC: an algorithm for comprehension-driven clustering. In Proceedings Seventh Working Conference on Reverse Engineering. IEEE, 258–267

[64]

M. Veera Prathap Reddy, P. V. R. D. Prasad, Manjunath Chikkamath, and Sarathchandra Mandadi. 2019. NERSE: Named Entity Recognition in Software Engineering as a Service. In Service Research and Innovation, Ho-Pun Lam and Sajib Mistry (Eds.). Springer International Publishing, Cham, 65–80

[65]

William E Winkler. 1990. String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. (1990)

[66]

Han Wu, Xiaoyong Li, and Yali Gao. 2020. An Effective Approach of Named Entity Recognition for Cyber Threat Intelligence. In 2020 IEEE 4th Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), Vol. 1. 1370–1374. doi:10.1109/ITNEC48623.2020.9085102

[67]

C. Xiao and V. Tzerpos. 2005. Software clustering based on dynamic dependencies. In Ninth European Conference on Software Maintenance and Reengineering. 124–133. doi:10.1109/CSMR.2005.49

[68]

Deheng Ye, Zhenchang Xing, Chee Yong Foo, Zi Qun Ang, Jing Li, and Nachiket Kapre. 2016. Software-Specific Named Entity Recognition in Software Engineering Social Content. In 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER), Vol. 1. 90–101. doi:10.1109/SANER.2016.10

[69]

Yiran Zhang, Zhengzi Xu, Chengwei Liu, Hongxu Chen, Jianwen Sun, Dong Qiu, and Yang Liu. 2023. Software Architecture Recovery with Information Fusion. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (San Francisco, CA, USA) (ESEC/FSE 2023). Association for Computing Machi...

work page doi:10.1145/3611643.3616285 2023
[70]

Zejun Zhang, Zhenchang Xing, Xiaoxue Ren, Qinghua Lu, and Xiwei Xu. 2024. Refactoring to Pythonic Idioms: A Hybrid Knowledge-Driven Approach Leveraging Large Language Models. Proc. ACM Softw. Eng. 1, FSE, Article 50 (July 2024), 22 pages. doi:10.1145/3643776

[71]

Cheng Zhou, Bin Li, and Xiaobing Sun. 2020. Improving software bug-specific named entity recognition with deep neural network. Journal of Systems and Software 165 (2020), 110572. doi:10.1016/j.jss.2020.110572

