A systematic review of 50 studies identifies 69 LLM-assisted tasks in empirical software engineering, concentrated in data processing and analysis with gaps in human-centered integration and reproducibility reporting.
9 Pith papers cite this work.
Citation roles: background 1
Citation polarities: background 1
citing papers explorer
- LLM-Assisted Empirical Software Engineering: Systematic Literature Review and Research Agenda
  A systematic review of 50 studies identifies 69 LLM-assisted tasks in empirical software engineering, concentrated in data processing and analysis with gaps in human-centered integration and reproducibility reporting.
- R2Code: A Self-Reflective LLM Framework for Requirements-to-Code Traceability
  R2Code improves requirement-to-code traceability with a bidirectional alignment network, self-reflective consistency verification, and dynamic context-adaptive retrieval, yielding 7.4% average F1 gain and up to 41.7% lower token use on five datasets.
- Characterizing Datasets for LLM-based Requirements Engineering: A Systematic Mapping Study
  A systematic mapping study of 45 LLM-based RE papers identifies and characterizes 62 public datasets, revealing imbalances in open-science practices, elicitation support, and socio-technical diversity.
- User Reviews as a Source for Usability Requirements: A Precursor Study on Using Large Language Models
  LLMs can detect usability content in user reviews with F-scores comparable to humans, though performance depends strongly on prompt design.
- Evaluating LLM-Based Goal Extraction in Requirements Engineering: Prompting Strategies and Their Limitations
  An LLM pipeline with generation-critic feedback reaches 61% accuracy on low-level goal extraction from requirements documents and outperforms standalone few-shot prompting, yet remains best suited as an accelerator for manual work.
- Towards an Agentic LLM-based Approach to Requirement Formalization from Unstructured Specifications
  An agentic LLM pipeline extracts and translates unstructured requirements into syntactically and semantically aligned formal properties, achieving 77.8% accuracy across three scenarios.
- Design-OS: A Specification-Driven Framework for Engineering System Design with a Control-Systems Design Case
  Design-OS is a specification-driven five-stage framework for engineering system design that maintains traceability from intent to implementation and supports human-AI collaboration, demonstrated on rotary inverted pendulum control cases.
- An empirical study of LoRA-based fine-tuning of large language models for automated test case generation
  LoRA fine-tuning enables open-source LLMs such as Ministral-8B to generate requirement-based test cases at a level comparable to pre-tuned proprietary GPT-4.1 models.
- LLM-Driven Cost-Effective Requirements Change Impact Analysis
  ProReFiCIA uses LLMs with tailored prompts to identify impacted requirements, achieving 85.7% recall on unseen industrial data while requiring review of only 3% of requirements, rising to 95.7% recall with RAG at 3.6% review cost.