Software architecture meets LLMs: A systematic literature review

Schmid, L. · 2025 · arXiv 2505.16697

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Using LLMs in Software Design: An Empirical Study of GitHub and A Practitioner Survey

cs.SE · 2026-05-02 · unverdicted · novelty 7.0

Developers use LLMs like ChatGPT mainly for knowledge acquisition and code generation at the detailed design level, reporting benefits such as better technology selection and early flaw detection alongside limitations like lengthy outputs, incorrect code, and hallucinations.

Benchmarking Requirement-to-Architecture Generation with Hybrid Evaluation

cs.SE · 2026-04-08 · unverdicted · novelty 7.0

The R2ABench benchmark shows that LLMs generate syntactically valid software architectures from requirements but produce structurally fragmented results due to weak relational reasoning.

Architecture Without Architects: How AI Coding Agents Shape Software Architecture

cs.SE · 2026-04-05 · unverdicted · novelty 7.0

AI coding agents perform "vibe architecting": they make prompt-driven architectural choices that produce structurally different systems for identical tasks.

(How) Do Large Language Models Understand High-Level Message Sequence Charts?

cs.SE · 2026-05-13 · conditional · novelty 6.0 · 2 refs

LLMs achieve only modest understanding of HMSC formal semantics (52% accuracy), performing strongly on basic constructs but weakly on abstractions and traces.

CAKE: Cloud Architecture Knowledge Evaluation of Large Language Models

cs.SE · 2026-04-07 · unverdicted · novelty 6.0

The CAKE benchmark shows that multiple-choice (MCQ) accuracy on cloud architecture plateaus near 99% above 3B parameters, while free-response scores improve steadily with model size; reasoning steps help, but tools hurt small models.

Can Large Language Models Assist the Comprehension of ROS2 Software Architectures?

cs.SE · 2026-04-23 · unverdicted · novelty 5.0

LLMs achieve 98.22% accuracy answering factual questions about ROS2 software architectures, with top models reaching 100%.

citing papers explorer

Showing 6 of 6 citing papers.

Using LLMs in Software Design: An Empirical Study of GitHub and A Practitioner Survey
cs.SE · 2026-05-02 · unverdicted · none · ref 36

Benchmarking Requirement-to-Architecture Generation with Hybrid Evaluation
cs.SE · 2026-04-08 · unverdicted · none · ref 25

Architecture Without Architects: How AI Coding Agents Shape Software Architecture
cs.SE · 2026-04-05 · unverdicted · none · ref 14

(How) Do Large Language Models Understand High-Level Message Sequence Charts?
cs.SE · 2026-05-13 · conditional · none · ref 14 · 2 links

CAKE: Cloud Architecture Knowledge Evaluation of Large Language Models
cs.SE · 2026-04-07 · unverdicted · none · ref 24

Can Large Language Models Assist the Comprehension of ROS2 Software Architectures?
cs.SE · 2026-04-23 · unverdicted · none · ref 39
